📒 Jero's Review

Attention Is All You Need, published in 2017, is the paper that proposed the Transformer model. The model uses no RNN or LSTM and processes sequence data with the attention mechanism alone. Let's take a look at the Transformer, which brought groundbreaking progress to NLP. Attention Mechanism: Attention, the most important concept in the Transformer, captures the relationship between each word and every other word, rather than only the relationships among a few particular words. Because learning is no longer tied to the order of the sequence data, the long-term dependency that was a problem in RNNs, LSTMs, and the like ..
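The idea in the excerpt above, that every word is related to every other word, can be sketched as scaled dot-product self-attention, softmax(QKᵀ/√d)V. This is only a minimal illustration under simplifying assumptions: queries, keys, and values are the raw token vectors (the paper uses learned projections and multiple heads), and the `tokens` embeddings and function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with Q = K = V = vectors.

    Simplifying assumption: single head, no learned projections.
    Returns (outputs, weights), where weights[i][j] is how strongly
    token i attends to token j.
    """
    dim = len(vectors[0])
    weights, outputs = [], []
    for query in vectors:
        # Score this token against every token in the sequence, itself included.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        row = softmax(scores)
        weights.append(row)
        # The output is the attention-weighted average of all value vectors.
        outputs.append(
            [sum(w * v[d] for w, v in zip(row, vectors)) for d in range(dim)]
        )
    return outputs, weights


# Hypothetical 2-dimensional embeddings for a three-token sequence.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` sums to 1 and has a nonzero entry for every position, so no token is cut off from any other by distance in the sequence.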
The paper reviewed this time is Show and Tell: A Neural Image Caption Generator, published at CVPR 2015, which introduces an image captioning model. Image Caption: Image captioning is, as the name suggests, the task of generating a caption for an image. That is, the model's input is image data and its output is text data. The model introduced in the paper is NIC (Neural Image Caption), which uses neural networks (a CNN and an RNN) to generate a caption that adequately describes the image. From here on, how the paper approaches image captioning, and the model architecture it describes ..
The paper reviewed in this post is Sequence to Sequence Learning with Neural Networks, published at NeurIPS 2014. Sequence to Sequence: As the title suggests, the paper introduces a Sequence to Sequence (hereafter Seq2Seq) model built from neural networks. What kind of model is Seq2Seq? As in the figure above, a model that takes some sequence {A,B,C} as input and outputs the corresponding sequence {W,X,Y,Z} is called a Seq2Seq model. The sequence data used here has no fixed length. That is, when a variable-length sentence is fed in as input, another variable-length sentence is ..