Date: August 5th, 2025, from 8:30-11:45AM
Instructor: Suraj Rampure (rampure@umich.edu)
Program Website
In this session, we will review the basics of predictive modeling and approaches to build an accurate and reproducible model, introduce best practices in reporting that will allow others to appropriately interpret and reproduce the results, and discuss guiding principles on how to reproduce others’ results.
There are four Python-based Jupyter Notebooks for this session.
Topics covered include sklearn, Logistic Regression, and Train-Test Splits.
Don’t worry if you’re not familiar with Python; most of the code is already provided for you. Instead, focus on understanding the concepts and the code that’s provided.
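As a small illustration of why train-test splits matter for reproducibility, here is a minimal sketch that uses only the Python standard library. It is not the code from the workshop notebooks, which use sklearn. The function name `train_test_split_indices` and its parameters are hypothetical, chosen for this example. The idea is that fixing the random seed makes the split identical on every run, so anyone rerunning the analysis evaluates the model on the same held-out rows.

```python
import random


def train_test_split_indices(n_rows, test_fraction=0.25, seed=42):
    """Return (train_idx, test_idx): a shuffled, seeded split of range(n_rows).

    Hypothetical helper for illustration only; not taken from the notebooks.
    """
    if n_rows < 2:
        raise ValueError("need at least two rows to split")
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be strictly between 0 and 1")

    # A local generator keeps the split reproducible without touching
    # the global random state used elsewhere in a program.
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)

    # Hold out at least one row, and always leave at least one for training.
    n_test = min(n_rows - 1, max(1, round(n_rows * test_fraction)))
    test_idx = sorted(indices[:n_test])
    train_idx = sorted(indices[n_test:])
    return train_idx, test_idx


train_idx, test_idx = train_test_split_indices(20, test_fraction=0.25, seed=42)
print(len(train_idx), len(test_idx))  # 15 5

# Same seed, same split: this is what makes the evaluation repeatable.
assert (train_idx, test_idx) == train_test_split_indices(20, 0.25, seed=42)
```

Reporting the seed and the test fraction alongside your results is one simple way to let others reproduce the same split.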
There are two ways to access these notebooks. To get the most out of the workshop, I recommend you follow at least one of them, so that you can run code and experiment yourself.
This link uses mybinder.org, a service that allows you to run Jupyter Notebooks in your browser. Some code may not work properly or may take a long time to run, but this should suffice for the workshop.
You can also clone our GitHub repository and run the notebooks locally. This option is suggested if you’ve used Jupyter Notebooks locally before, but if you haven’t, the web-based option is probably easier.
Find the repository here. There is a requirements.txt file in the repository that you can use to install the necessary dependencies.
In your Terminal, run the following commands:
git clone https://github.com/surajrampure/dair3-2025.git
cd dair3-2025
pip install -r requirements.txt
cd files
You can then open the notebooks in your browser by running jupyter notebook in your Terminal.
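If the notebooks fail to start or an import fails, a quick check like the sketch below can show which packages Python cannot find. The helper name `missing_packages` is hypothetical, and the package list is an assumption: replace it with the import names that correspond to the entries in requirements.txt.

```python
import importlib.util


def missing_packages(import_names):
    """Return the top-level import names that Python cannot locate."""
    return [
        name for name in import_names
        if importlib.util.find_spec(name) is None
    ]


# Assumed list for illustration; match it to requirements.txt.
# Note that scikit-learn is imported as "sklearn".
to_check = ["notebook", "sklearn", "numpy", "pandas"]

missing = missing_packages(to_check)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All listed packages were found.")
```

Anything reported as not installed can usually be fixed by rerunning the pip install command from inside the cloned repository.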
If you’d like more detailed steps on how to run Jupyter Notebooks locally, refer to this guide.