Machine Learning A-Z: Part 2 – Regression (Random Forest Regression)

Random Forest Intuition

Ensemble Learning

STEP 1: Pick at random K data points from the Training set.

STEP 2: Build the Decision Tree associated with these K data points.

STEP 3: Choose the number Ntree of trees you want to build and repeat STEPS 1 & 2.

STEP 4: For a new data point, make each one of your Ntree trees predict the value of Y for the data point in question, and assign the new data point the average of all the predicted Y values.

e.g. Guessing the number of jellybeans in a jar: any single wild guess is unreliable, but the average of many wild guesses is usually close to the true count.
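The four steps above can be sketched in plain Python. This is a toy illustration, not the course code: the training data is made up, and each "tree" is a one-split stump standing in for a full decision tree.

```python
import random
import statistics

random.seed(42)

# Hypothetical 1-D training set: Y roughly 2 * X plus noise
X = [float(i) for i in range(100)]
Y = [2.0 * x + random.uniform(-5, 5) for x in X]

K = 30        # STEP 1: number of data points drawn per tree
NTREE = 200   # STEP 3: number of trees to build

def fit_stump(xs, ys):
    """STEP 2: a one-split 'stump' standing in for a full decision tree."""
    split = statistics.mean(xs)
    left = [y for x, y in zip(xs, ys) if x < split]
    right = [y for x, y in zip(xs, ys) if x >= split]
    left_mean = statistics.mean(left) if left else statistics.mean(ys)
    right_mean = statistics.mean(right) if right else statistics.mean(ys)
    return lambda x: left_mean if x < split else right_mean

forest = []
for _ in range(NTREE):
    # STEP 1: pick K data points at random (with replacement)
    idx = [random.randrange(len(X)) for _ in range(K)]
    forest.append(fit_stump([X[i] for i in idx], [Y[i] for i in idx]))

def predict(x):
    # STEP 4: average the predictions of all NTREE trees
    return statistics.mean(tree(x) for tree in forest)
```

The averaging in STEP 4 is exactly the jellybean idea: each stump is a crude guess, but the forest's average tracks the underlying trend.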

Random Forest Regression

Python

R

Machine Learning A-Z: Part 2 – Regression (Decision Tree Regression)

Decision Tree Intuition

CART (Classification and Regression Trees)
– Classification Trees
– Regression Trees

Splitting data into segments.
Split 1: X1 < 20
Split 2: X2 < 200
Split 3: X2 < 170
Split 4: X1 < 40
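The four splits above segment the (X1, X2) plane, and a regression tree predicts the average of Y inside each segment. A minimal sketch, assuming one plausible nesting of the splits; the leaf values (segment averages of Y) are made up for illustration:

```python
def predict(x1, x2):
    """Regression-tree prediction: return the (hypothetical) average
    of Y in whichever segment the point (x1, x2) falls into."""
    if x1 < 20:                              # Split 1
        return 110.0 if x2 < 200 else 65.0   # Split 2
    if x2 < 170:                             # Split 3
        return 40.0 if x1 < 40 else 95.0     # Split 4
    return 300.0
```

Every point in the same segment gets the same prediction, which is why a decision-tree regression curve looks like a staircase.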

Decision Tree Regression

Python

R

Machine Learning A-Z: Part 2 – Regression (SVR)

Support Vector Regression (SVR)

Python

R

Machine Learning A-Z: Part 2 – Regression (Polynomial Regression)

Polynomial Regression

Python

Reset console.

IPython console | Restart kernel

Show summary.

R

Templates

Python

R

Machine Learning A-Z: Part 2 – Regression (Multiple Linear Regression)

Dummy Variable Trap

Dummy variables are linearly dependent: with two categories,
D2 = 1 - D1

Including every dummy therefore duplicates information (the trap). Always omit one dummy per categorical variable: use m - 1 dummies for m categories.
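A minimal sketch of trap-safe encoding, using hypothetical state categories: the first category is dropped and becomes the all-zeros baseline.

```python
def encode(category, categories):
    """One-hot encode, dropping the first category so only
    m - 1 dummies are created for m categories (avoids the trap)."""
    return [1 if category == c else 0 for c in categories[1:]]

states = ["California", "Florida", "New York"]  # hypothetical categories
# encode("California", states) is the all-zeros baseline
```

With only two categories this reproduces the relation above: the single remaining dummy carries all the information, since D2 = 1 - D1.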

Building a model

PDF

1. All-in
=> 2. Backward Elimination
3. Forward Selection
4. Bidirectional Elimination
5. Score Comparison
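The backward-elimination loop (method 2 above) can be sketched as: fit with all predictors, drop the least significant one while it exceeds the significance level, refit, repeat. The p-values below are hypothetical stand-ins; in practice each round refits an OLS model.

```python
SL = 0.05  # significance level required to stay in the model

def fit_pvalues(features):
    """Stand-in for refitting OLS each round; these p-values are made up."""
    table = {
        ("x1", "x2", "x3"): {"x1": 0.001, "x2": 0.62, "x3": 0.04},
        ("x1", "x3"): {"x1": 0.0005, "x3": 0.03},
    }
    return table[tuple(sorted(features))]

def backward_elimination(features, sl=SL):
    features = sorted(features)
    while features:
        pvals = fit_pvalues(features)
        worst, worst_p = max(pvals.items(), key=lambda kv: kv[1])
        if worst_p <= sl:       # every remaining predictor is significant
            break
        features.remove(worst)  # drop the least significant predictor
    return features
```

Starting from all predictors ("All-in") and removing x2 (p = 0.62) leaves a model in which every predictor clears the significance level.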

Akaike information criterion (AIC)

– a measure of the relative quality of statistical models for a given set of data.
– Given a collection of models for the data, AIC estimates the quality of each model relative to each of the others.
– Hence it provides a means for model selection.
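For regression with Gaussian errors, AIC can be computed (up to an additive constant) from the residual sum of squares. A minimal sketch, with hypothetical RSS values for two fits on the same data:

```python
import math

def aic(n, rss, k):
    """Gaussian-error AIC up to an additive constant:
    n * ln(RSS / n) + 2k, where k is the number of parameters."""
    return n * math.log(rss / n) + 2 * k

# Hypothetical fits on the same n = 50 data points:
# model A: 3 parameters, RSS = 120; model B: 5 parameters, RSS = 100
aic_a = aic(50, 120.0, 3)
aic_b = aic(50, 100.0, 5)
```

The lower AIC wins: the 2k term penalizes extra parameters, so a more complex model is preferred only if its fit improves enough to pay for them.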

Multiple Linear Regression

Python

R

Clear the RStudio environment.

RStudio Keyboard Shortcuts
