
regression 3

Decision Tree: Clear and Concise

Decision tree: because it can be used for both classification and regression, it is also called CART (Classification And Regression Tree). node: each node of the tree holds a question or an answer. root node: the topmost node; the feature tested at the root is the most important feature. leaf node: a final (terminal) node. If the tree keeps growing until every leaf node is a pure node, the model becomes very complex and overfits. Preventing overfitting: stop the tree from growing early, i.e. pre-pruning (set a maximum depth, max_depth); removing nodes that hold little data ..
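
The effect of capping the depth can be sketched in plain Python. This is a minimal illustration, not the implementation used by any library: `sse`, `build_tree`, and `count_leaves` are hypothetical helper names, the data is made up, and the tree is a one-feature regression tree in which a "pure" node means all targets in the node are identical.

```python
def sse(values):
    """Sum of squared errors around the mean of values."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def build_tree(xs, ys, max_depth, depth=0):
    """Grow a 1-D regression tree, stopping early once max_depth is reached."""
    mean = sum(ys) / len(ys)
    # Pre-pruning: stop at the depth limit, or when the node is already pure
    if depth >= max_depth or len(set(ys)) == 1:
        return {"leaf": True, "value": mean}
    best_t, best_score = None, None
    # Candidate thresholds: every distinct x except the smallest,
    # so both sides of the split are always non-empty
    for t in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        score = sse(left) + sse(right)
        if best_score is None or score < best_score:
            best_t, best_score = t, score
    if best_t is None:  # all x values identical, nothing to split on
        return {"leaf": True, "value": mean}
    left = [(x, y) for x, y in zip(xs, ys) if x < best_t]
    right = [(x, y) for x, y in zip(xs, ys) if x >= best_t]
    return {
        "leaf": False,
        "threshold": best_t,
        "left": build_tree([x for x, _ in left], [y for _, y in left],
                           max_depth, depth + 1),
        "right": build_tree([x for x, _ in right], [y for _, y in right],
                            max_depth, depth + 1),
    }


def count_leaves(node):
    """Number of leaf nodes, used here as a rough measure of complexity."""
    if node["leaf"]:
        return 1
    return count_leaves(node["left"]) + count_leaves(node["right"])


# Made-up data: two noisy plateaus
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 1.1, 5.0, 5.3, 4.8, 5.1]

shallow = build_tree(xs, ys, max_depth=1)   # pre-pruned
deep = build_tree(xs, ys, max_depth=10)     # grows until every leaf is pure
```

With the depth capped at 1 the tree keeps only the single split between the two plateaus, while the unrestricted tree ends up with one leaf per sample, which is the overfitting case the excerpt describes.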

Random Forest: Clear and Concise

Ensemble: a technique that combines several machine learning models into one stronger model. It is effective for both classification and regression. Random forest and gradient boosting both use decision trees as their basic building block. Random forest: a collection of many decision trees, each slightly different from the others. Background: each individual tree can predict fairly well, but tends to overfit part of the data. So if we build many trees that work well but overfit in different directions and average their results, the overfitting can be reduced. This keeps the predictive performance of the tree model while overf..
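
The averaging idea can be sketched in plain Python. This is only an illustration under stated assumptions: `grow`, `predict`, and `forest_predict` are hypothetical names, the data is synthetic, and with a single feature the sketch reduces to bagging (bootstrap samples plus averaging). A real random forest usually also randomizes which features each split may use.

```python
import random


def grow(points):
    """Fully grow a 1-D regression tree on (x, y) pairs, with no depth limit."""
    ys = [y for _, y in points]
    xs = sorted({x for x, _ in points})
    if len(xs) == 1:
        return sum(ys) / len(ys)  # leaf: mean target of the samples in it
    best_t, best_sse = None, None
    for t in xs[1:]:
        total = 0.0
        for side in ([y for x, y in points if x < t],
                     [y for x, y in points if x >= t]):
            mean = sum(side) / len(side)
            total += sum((y - mean) ** 2 for y in side)
        if best_sse is None or total < best_sse:
            best_t, best_sse = t, total
    left = [p for p in points if p[0] < best_t]
    right = [p for p in points if p[0] >= best_t]
    # Internal node: (threshold, left subtree, right subtree)
    return (best_t, grow(left), grow(right))


def predict(tree, x):
    """Walk down from the root until a leaf (a plain float) is reached."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x < threshold else right
    return tree


def forest_predict(trees, x):
    """Average the predictions of all trees."""
    return sum(predict(t, x) for t in trees) / len(trees)


rng = random.Random(0)
# Synthetic data: y = 2x plus uniform noise
data = [(x, 2.0 * x + rng.uniform(-1.0, 1.0)) for x in range(20)]

single_tree = grow(data)  # memorizes every training point

trees = []
for _ in range(25):
    # Bootstrap sample: draw len(data) points with replacement,
    # so each tree sees a slightly different dataset
    sample = [rng.choice(data) for _ in data]
    trees.append(grow(sample))
```

Each tree here is grown until it memorizes its own bootstrap sample, so the individual trees disagree on points they did not see; `forest_predict` smooths those disagreements out by averaging.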

Simple Linear Regression / Multiple Linear Regression: Clear and Concise

Simple linear regression: predict the target from a single feature. y = wx + b, where y: predicted value, x: feature, w: weight/coefficient, b: bias (offset). The best-fitting w and b have to be found from the given sample data -> usually this is done with gradient descent. Multiple linear regression: predict the target from several features. y = w0x0 + w1x1 + w2x2 + ... + b. Again the goal is to find the w's and b that minimize the MSE. Problem: the model often overfits => its ability to generalize drops. This is addressed with the Ridge and Lasso methods.
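
Finding w and b by gradient descent on the MSE can be sketched as follows. This is a minimal pure-Python illustration on made-up data. The function name `fit_line`, the learning rate, and the epoch count are assumptions, and the optional `alpha` argument is a hypothetical knob that adds an L2 penalty on w, which is the idea behind Ridge. Lasso's L1 penalty is not shown.

```python
def fit_line(xs, ys, lr=0.05, epochs=2000, alpha=0.0):
    """Fit y = w*x + b by batch gradient descent on MSE + alpha * w**2."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error of the current line on every sample
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of (1/n) * sum(error**2) + alpha * w**2
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs)) + 2.0 * alpha * w
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Made-up samples lying exactly on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_line(xs, ys)                         # plain least squares
w_ridge, b_ridge = fit_line(xs, ys, alpha=1.0)  # with the L2 penalty on w
```

Without the penalty the fit recovers w close to 2 and b close to 1. With `alpha=1.0` the weight is pulled toward zero (about 1.33 on this data), which shows how the penalty trades training fit for a smaller coefficient.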
