Day #11 - Data Science Learning Notes - Evaluating a model
R Square - Goodness of Fit Test
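- R Square = 1 - (SSE / SST)
- SST (Sum of Squares Total) - sum of squared differences between each actual value of the dependent variable and its mean, i.e. the total variation in the dependent variable
- SSE (Sum of Squares of Error) - sum of squared differences between actual and predicted values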
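A minimal sketch of the R Square formula in plain Python. The function name `r_square` and the sample numbers are made up for illustration, not taken from any library or dataset.

```python
def r_square(actual, predicted):
    """R Square = 1 - SSE / SST (hypothetical helper for these notes)."""
    mean_actual = sum(actual) / len(actual)
    # SSE: squared differences between actual and predicted values
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # SST: squared differences between actual values and their mean
    sst = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - sse / sst


# Made-up sample values, only to exercise the function
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 7.0, 9.5]
print(r_square(actual, predicted))
```

A value close to 1 means the predictions account for most of the variation in the dependent variable.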
Adjusted R Square
- Adjusted R Square = 1 - ((n-1)/(n-p-1)) * (1 - R Square)
- p - number of independent variables
- n - number of records in the dataset
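A small sketch of the adjusted R Square calculation, assuming the standard form 1 - ((n-1)/(n-p-1)) * (1 - R Square). The function name `adjusted_r_square` and the input values are illustrative assumptions.

```python
def adjusted_r_square(r_square, n, p):
    """Adjusted R Square for n records and p independent variables."""
    if n - p - 1 <= 0:
        raise ValueError("need more records than independent variables + 1")
    return 1 - ((n - 1) / (n - p - 1)) * (1 - r_square)


# Made-up inputs: R Square of 0.9 from 101 records and 5 independent variables
print(adjusted_r_square(0.9, n=101, p=5))
```

For the same R Square, a larger p gives a lower adjusted value, so adding variables that do not improve the fit is penalized.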
RMSE (Root mean square error)
- For every record, compute the error (actual - predicted)
- Square each error, take the mean, then take the square root
- RMSE should be similar for the training and testing datasets
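The RMSE steps can be sketched in a few lines of Python. The function name `rmse` and the sample values are assumptions for illustration only.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    # Error for every record
    errors = [a - p for a, p in zip(actual, predicted)]
    # Square each error and take the mean
    mean_squared_error = sum(e ** 2 for e in errors) / len(errors)
    # Square root brings the result back to the units of the dependent variable
    return math.sqrt(mean_squared_error)


# Made-up sample values
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 7.0, 9.5]
print(rmse(actual, predicted))
```

Running the same function on the training and testing predictions gives the two numbers to compare.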
Bias (Underfit)
- Model can't explain the dataset
- R Square value is very low
- Add more independent variables
Variance (Overfit)
- RMSE is high for the test dataset but low for the training dataset
- Cut down the number of independent variables
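A rough sketch of how the bias and variance checks could be turned into a rule of thumb. The function name `diagnose_fit` and both thresholds (`low_r_square`, `gap_ratio`) are arbitrary assumptions chosen for illustration, not standard cut-offs.

```python
def diagnose_fit(train_rmse, test_rmse, train_r_square,
                 low_r_square=0.5, gap_ratio=1.5):
    """Label a model as underfit, overfit, or neither (illustrative thresholds)."""
    # Bias: the model cannot explain even the training data
    if train_r_square < low_r_square:
        return "high bias (underfit): consider adding independent variables"
    # Variance: test error is much larger than training error
    if test_rmse > gap_ratio * train_rmse:
        return "high variance (overfit): consider removing independent variables"
    return "no obvious bias or variance problem"


# Made-up numbers for the three cases
print(diagnose_fit(train_rmse=2.0, test_rmse=2.1, train_r_square=0.2))
print(diagnose_fit(train_rmse=0.5, test_rmse=2.0, train_r_square=0.95))
print(diagnose_fit(train_rmse=1.0, test_rmse=1.1, train_r_square=0.9))
```

The right thresholds depend on the dataset, so the defaults here are only placeholders.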
Collinearity Problem
- Run a p-value test on each independent variable's coefficient to check whether the null hypothesis (the coefficient is zero) can be rejected
Next Pending Reads
- Subset Selection Technique
- Cross Validation Technique
- Z test / P Test
Happy Learning!!!