
Ensemble 3

[Machine Learning] What Is the Ensemble Technique?

Ensemble techniques. Ensemble learning generates multiple classifiers and combines their predictions to produce a more accurate prediction. Instead of relying on one strong model, it combines several weaker models. Types of ensemble learning: ensemble learning falls into three types, namely Voting, Bagging, and Boosting. Voting: several classifiers vote to decide the final prediction, usually by combining several different algorithms. Voting methods: Hard Voting selects the label predicted by the majority of classifiers as the final result (majority rule). Soft Voting averages the class probabilities predicted by all classifiers and selects the label with the highest probability as the final result..
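To make the difference between the two voting methods concrete, here is a minimal standard-library sketch. The function names `hard_vote` and `soft_vote` and the three probability tables are made up for illustration; they are not taken from the original post or from any library.

```python
from collections import Counter


def hard_vote(predicted_labels):
    """Return the label chosen by the most classifiers (majority rule)."""
    return Counter(predicted_labels).most_common(1)[0][0]


def soft_vote(probabilities):
    """Average per-class probabilities over classifiers, then pick the top label."""
    labels = probabilities[0].keys()
    averaged = {
        label: sum(p[label] for p in probabilities) / len(probabilities)
        for label in labels
    }
    return max(averaged, key=averaged.get), averaged


# Hypothetical class-probability outputs of three classifiers for one sample.
clf_probs = [
    {"A": 0.90, "B": 0.10},
    {"A": 0.40, "B": 0.60},
    {"A": 0.45, "B": 0.55},
]
# Each classifier's hard prediction is its most probable label.
clf_labels = [max(p, key=p.get) for p in clf_probs]

hard_result = hard_vote(clf_labels)
soft_result, soft_avg = soft_vote(clf_probs)
print("hard:", hard_result, "soft:", soft_result)
```

In this made-up case two of the three classifiers lean slightly toward B, so hard voting picks B, while the one confident classifier pulls the averaged probability toward A, so soft voting picks A.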

[Machine Learning] What Is LightGBM? ✔ Explanation, Pros and Cons

📌 Remind. Before getting into LightGBM, let's do a quick review. The GBM in LightGBM stands for Gradient Boosting Machine, a tree-based learning algorithm. Put simply, GBM trains by putting more weight on the parts the model got wrong. In Gradient Boosting, Boosting means building several trees by improving the existing model (tree) a little at a time and summing them at the end, which differs from the Bagging approach used by Random Forest. There are broadly two ways to boost. 1. Like AdaBoost, giving weight to important data (typically the data the model got wrong). 2. Like GBDT, loss fun..
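The first approach, weighting the data the model got wrong, can be sketched as a single reweighting round in the style of discrete AdaBoost. This is an illustrative sketch only: the helper name `adaboost_reweight` and the five-sample example are assumptions, not code from the post.

```python
import math


def adaboost_reweight(weights, is_wrong):
    """One discrete-AdaBoost round: raise weights of misclassified samples.

    Assumes the weights sum to 1 and the weighted error is strictly
    between 0 and 1.
    """
    # Weighted error of the current weak learner.
    err = sum(w for w, wrong in zip(weights, is_wrong) if wrong)
    # Say of this learner in the final weighted vote.
    alpha = 0.5 * math.log((1 - err) / err)
    # Misclassified samples are scaled up, correct ones scaled down.
    updated = [
        w * math.exp(alpha if wrong else -alpha)
        for w, wrong in zip(weights, is_wrong)
    ]
    total = sum(updated)
    return [w / total for w in updated], alpha


# Hypothetical round: five equally weighted samples, the last one misclassified.
weights = [0.2] * 5
is_wrong = [False, False, False, False, True]
new_weights, alpha = adaboost_reweight(weights, is_wrong)
print(new_weights, alpha)
```

After the update, the single misclassified sample carries half of the total weight, so the next weak learner is pushed to get it right.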

Random Forest, Short and to the Point

Ensemble: a technique that connects several machine learning models to build a stronger model. It is effective for both classification and regression. Random forest and gradient boosting both use decision trees as the basic building block of the model. Random forest: a collection of many decision trees, each slightly different from the others. Background of random forests: each tree can predict fairly well but tends to overfit part of the data. So if we build many trees that work well but overfit in different directions and average their results, overfitting can be reduced. This keeps the predictive performance of the tree model while overf..
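The idea of averaging many trees that overfit in different directions can be sketched with a toy 1-D example. Everything here is an assumption made for illustration: the trees differ only through bootstrap resampling, each tree uses a simple median split rather than a best-split search, and the data is synthetic. Per-tree feature sampling, which random forests also use, is omitted.

```python
import random


def fit_tree(points):
    """Grow a 1-D regression tree until each leaf holds one x value (overfit on purpose)."""
    xs = sorted({x for x, _ in points})
    if len(xs) == 1:
        ys = [y for _, y in points]
        return sum(ys) / len(ys)  # leaf: mean target of the points it holds
    mid = len(xs) // 2
    threshold = (xs[mid - 1] + xs[mid]) / 2  # simplified median split
    left = [p for p in points if p[0] <= threshold]
    right = [p for p in points if p[0] > threshold]
    return (threshold, fit_tree(left), fit_tree(right))


def predict(tree, x):
    """Walk from the root to a leaf and return its value."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x <= threshold else right
    return tree


def fit_forest(points, n_trees, rng):
    """Fit each tree on its own bootstrap sample (drawn with replacement)."""
    return [fit_tree(rng.choices(points, k=len(points))) for _ in range(n_trees)]


rng = random.Random(0)
# Synthetic data: y = x^2 plus Gaussian noise.
train = [(i / 10, (i / 10) ** 2 + rng.gauss(0, 0.2)) for i in range(20)]
forest = fit_forest(train, n_trees=50, rng=rng)

x_new, y_true = 0.55, 0.55 ** 2
tree_preds = [predict(t, x_new) for t in forest]
avg_pred = sum(tree_preds) / len(tree_preds)  # the forest's prediction

mean_tree_sq_err = sum((p - y_true) ** 2 for p in tree_preds) / len(tree_preds)
forest_sq_err = (avg_pred - y_true) ** 2
print("mean single-tree squared error:", mean_tree_sq_err)
print("averaged-forest squared error: ", forest_sq_err)
```

The squared error of the averaged prediction can never exceed the average squared error of the individual trees. That follows from the convexity of squared error, and it is the sense in which averaging reduces the damage from each tree's overfitting.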
