I have translated most of the R code in the book into Python. Sometimes the Python output is similar to the R output but not identical. In a few cases, no Python package is equivalent to the one used in R. I have long experience with R but much less with Python, so suggestions for more elegant Python are welcome.
Read more about it in this blog post.
The data are available as CSV files in lmrcsv.zip.
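As a minimal sketch of getting at the data, the following lists the CSV files inside the archive and reads one into a pandas data frame without unpacking it. It assumes lmrcsv.zip has been downloaded to the working directory. The function names are my own, and you should take the member name from the listing rather than guess it.

```python
import zipfile
from pathlib import Path


def list_csv_members(zip_path):
    """Return the sorted names of the CSV files stored in a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.lower().endswith(".csv")
        )


def load_csv(zip_path, member):
    """Read one CSV member of the archive into a pandas DataFrame."""
    import pandas as pd  # imported here so the listing works without pandas

    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as handle:
            return pd.read_csv(handle)


if __name__ == "__main__":
    archive_path = Path("lmrcsv.zip")  # assumed location of the download
    if archive_path.exists():
        print(list_csv_members(archive_path))
```

Once you know a member name from the listing, `load_csv("lmrcsv.zip", name)` returns the corresponding data frame.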
You will commonly need several Python packages including numpy, scipy, pandas, statsmodels, matplotlib, seaborn, scikit-learn and patsy. I recommend the Anaconda distribution of Python which includes these packages.
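If you are not using Anaconda, a quick way to see which of these packages are missing is sketched below. It only checks whether each module can be found, not which version is installed. The helper name `missing_packages` is my own. Note that scikit-learn is imported under the module name `sklearn`.

```python
import importlib.util

# Map each package name to the module name used in `import` statements.
# Only scikit-learn differs: it is imported as `sklearn`.
PACKAGES = {
    "numpy": "numpy",
    "scipy": "scipy",
    "pandas": "pandas",
    "statsmodels": "statsmodels",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "scikit-learn": "sklearn",
    "patsy": "patsy",
}


def missing_packages(packages=PACKAGES):
    """Return the names of packages whose import module cannot be found."""
    return [
        name for name, module in packages.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All packages found.")
```

Anything reported as missing can be installed with conda or pip under the package name shown.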
Introduction notebook and output
Estimation notebook and output
Prediction notebook and output
Explanation notebook and output. Uses match.py.
Diagnostics notebook and output
Problems with the Predictors notebook and output
Problems with the Error notebook and output
Transformation notebook and output
Model Selection notebook and output
Shrinkage Methods notebook and output
Insurance Redlining - A Complete Example notebook and output
Missing Data notebook and output
Categorical Predictors notebook and output
One Factor Models notebook and output
Models with Several Factors notebook and output
Experiments with Blocks notebook and output