[Paper Review] CNN for Sentence Classification: trying sentence classification with a CNN.

2022. 3. 16. 23:15 · 🧪 Data Science/Paper review

 

(Screenshot of the paper title)

 

 

The paper I am reviewing today is 'Convolutional Neural Networks for Sentence Classification' by Yoon Kim (New York University). It drew attention for achieving excellent sentence classification performance with a simple CNN. I came across it while studying LSTMs: I had heard that CNNs, which are mostly used on the vision side, also perform well at classification, so I decided to read it.

 

[Source url: https://arxiv.org/abs/1408.5882 , Cornell University] 

[Github url: https://github.com/yoonkim/CNN_sentence]

 

 

 

1. Summary

There has been a great deal of ML and DL research aimed at NLP problems (typically prediction and word/sentence classification). Within that line of work, this paper shows that a CNN on top of static word vectors performs well with almost no tuning. The model also works with 'pre-trained, task-specific' word vectors, and a modification that lets it handle both kinds of vectors at the same time improved performance further.
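As a rough illustration of what "handling both at the same time" can look like, here is a minimal plain-Python sketch with two copies of the same pre-trained vectors: one kept fixed and one updated during training. The same filter is applied to both copies and the results are added. The vocabulary, the numbers, the "gradient", and the helper names (`window_score`, `sgd_step_on_tuned`) are all made up for this post and are not taken from the paper's code. The non-linearity is left out to keep the arithmetic visible.

```python
import copy

# Hypothetical pre-trained vectors for a 3-word vocabulary (made-up numbers).
pretrained = {"good": [0.9, 0.1], "bad": [-0.8, 0.2], "movie": [0.0, 0.5]}

static_channel = copy.deepcopy(pretrained)  # kept fixed during training
tuned_channel = copy.deepcopy(pretrained)   # updated during training


def window_score(words, filt, bias):
    """Apply the same filter to both channels and add the results."""
    total = bias
    for channel in (static_channel, tuned_channel):
        for word, weight_vec in zip(words, filt):
            total += sum(x * w for x, w in zip(channel[word], weight_vec))
    return total


def sgd_step_on_tuned(word, grad, lr=0.1):
    """Gradient step that touches only the task-specific channel."""
    tuned_channel[word] = [x - lr * g for x, g in zip(tuned_channel[word], grad)]


filt = [[1.0, 0.0], [0.0, 1.0]]  # one filter over a window of h=2 words
before = window_score(["good", "movie"], filt, 0.0)

# Pretend gradient for illustration only; it is not computed from a loss.
sgd_step_on_tuned("good", grad=[-1.0, 0.0])
after = window_score(["good", "movie"], filt, 0.0)

print(before, after)
print(static_channel["good"], tuned_channel["good"])
```

After the update, only the tuned copy of "good" has moved, while the static copy still matches the pre-trained vector.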

 

 

The process of running an NLP classification problem through a CNN is as follows.

"The idea is to capture the most important feature--one with the highest value--for each feature map. This pooling scheme naturally deals with variable sentence lengths."
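To make the quoted point concrete, here is a minimal sketch of max-over-time pooling in plain Python. The feature-map values and the helper name `max_over_time` are invented for illustration and do not come from the paper's repository.

```python
def max_over_time(feature_map):
    """Return the single largest activation in one feature map."""
    return max(feature_map)


# Hypothetical feature maps produced by one filter with a 3-word window.
# A sentence of n words gives n - 3 + 1 activations, so a 5-word sentence
# yields 3 values and a 9-word sentence yields 7.
short_map = [0.1, 0.7, -0.2]
long_map = [0.3, -0.5, 0.2, 0.9, 0.0, 0.4, -0.1]

# Each map collapses to one number, so the pooled output has the same
# size no matter how long the sentence was.
pooled = [max_over_time(short_map), max_over_time(long_map)]
print(pooled)
```

Because every filter contributes exactly one number per sentence, the layer after pooling always receives a vector whose length equals the number of filters.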

 

In vision, a CNN works by extracting features from an image and then passing them through max-pooling and softmax to produce probabilities. The same principle carries over to NLP. The word vectors of a sentence are passed through a convolution layer, which extracts features from windows of words. The author uses the expression "one feature is extracted from one filter" and explains that multiple filters are used to obtain multiple features. After the features are extracted and max-pooled, the final softmax layer turns them into a probability for each class.
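The flow described above (filter over word windows, one pooled feature per filter, softmax at the end) can be sketched in plain Python as follows. This is only a toy forward pass under my own assumptions: the word vectors, filter weights, and fully connected weights are made-up numbers, tanh is used as one possible non-linearity, and the function names are hypothetical rather than the paper's implementation.

```python
import math


def conv_feature_map(sentence, filt, bias):
    """Slide one filter over every window of h consecutive word vectors.

    sentence: list of n word vectors (each a list of k floats)
    filt:     list of h weight vectors (each a list of k floats)
    Returns n - h + 1 activations, one per window.
    """
    h = len(filt)
    feature_map = []
    for i in range(len(sentence) - h + 1):
        window = sentence[i:i + h]
        total = bias
        for word_vec, weight_vec in zip(window, filt):
            total += sum(x * w for x, w in zip(word_vec, weight_vec))
        feature_map.append(math.tanh(total))
    return feature_map


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy input: 4 words with 2-dimensional word vectors (made-up numbers).
sentence = [[0.2, -0.1], [0.5, 0.3], [-0.4, 0.8], [0.1, 0.1]]

# Two hypothetical filters with window sizes h=2 and h=3, each with a bias.
filters = [
    ([[0.5, -0.5], [0.25, 0.75]], 0.0),
    ([[0.1, 0.2], [-0.3, 0.4], [0.6, -0.1]], 0.1),
]

# One filter -> one feature map -> one pooled feature.
pooled = [max(conv_feature_map(sentence, f, b)) for f, b in filters]

# Fully connected layer (2 classes x 2 pooled features), then softmax.
fc_weights = [[1.0, -1.0], [-0.5, 0.5]]
scores = [sum(w * p for w, p in zip(row, pooled)) for row in fc_weights]
probs = softmax(scores)

print(pooled, probs)
```

Note that the two filters see different numbers of windows (3 and 2), yet each still contributes exactly one value to `pooled`.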

 

 

 

Experiment

 

Description of the experimental models

 

Model experiment results

 

Conclusion

1) In the experiments built on the CNN model, even a single convolution layer was enough to give good performance.

2) The results suggest that well-trained, unsupervised pre-trained word vectors are important in NLP.

 

 

 

2. What fell short

Because I have studied almost nothing on the vision side, I was kept busy just interpreting the paper and following the underlying principles. If I had already known how CNNs work and had the background knowledge, I could have read the paper far more meaningfully and even gone on to review the code, so I am left with some regret.

Next time I should pick a more familiar algorithm and review the code as well.