Welcome to QTM151 - Introduction to Statistical Computing II! This repository contains all the materials for the course, including lectures, assignments, quizzes, and tutorials.
This course is designed to introduce students to statistical computing techniques using Python and SQL. It builds upon the foundational knowledge from QTM150 and focuses on practical applications of data analysis, reproducible research, and database management.
This repository is organised as follows:
- assignments/: Contains all course assignments
- lectures/: Includes lecture materials and code
- tutorials/: Step-by-step guides for the tools used in the course
- README.md: This file, providing an overview of the course and repository
- syllabus.pdf: Course syllabus in PDF format
The course covers the following topics, with corresponding lecture materials available in the lectures folder:
- Wednesday, August 28: Lecture 01: Welcome to QTM 151 - Introduction. Please make sure to install the necessary software for the course by following the Course Tutorials: How to Install Anaconda, Jupyter, PostgreSQL, VSCode, and Open a Free Educational Account on GitHub.
- Monday, September 02: Labour Day (no class)
- Wednesday, September 04: Lecture 02: GitHub Review, Introduction to Jupyter Notebooks.
  - Extra lecture: Lecture slides. Video recording.
  - Assignment 01: Problem Set 01
- Monday, September 09: Lecture 03: Variables and Lists. Jupyter Notebook, Lecture Slides.
- Wednesday, September 11: Lecture 04: Mathematical Operations, Arrays, and Random Numbers. Jupyter Notebook, Lecture Slides
  - Kahoot Quiz
  - Assignment 01 due (5%)
  - Assignment 02: Problem Set 02
- Monday, September 16: Lecture 05: Boolean Variables and If/Else Statements. Jupyter Notebook, Lecture Slides.
- Wednesday, September 18: Lecture 06: While Loops and For Loops. Jupyter Notebook, Lecture Slides
  - Kahoot Quiz
  - Assignment 02 due (5%)
  - Assignment 03: Problem Set 03
- Monday, September 23: Lecture 07: Applications 1: Simulation Studies. Jupyter Notebook. Lecture Slides.
- Wednesday, September 25: Lecture 08: Functions and Arguments. Jupyter Notebook. Lecture Slides.
  - Kahoot Quiz.
  - Assignment 03 due (5%)
  - Assignment 04: Problem Set 04
- Monday, September 30: Lecture 09: Global and Local Variables. Jupyter Notebook. Lecture Slides.
- Wednesday, October 02: Lecture 10: Quiz 01: Application 02 - Operations over Multiple Datasets (6%)
  - Assignment 05: Problem Set 05
- Friday, October 04: Assignment 04 due (5%)
- Monday, October 07: Lecture 11: Subsetting Data. Jupyter Notebook. Lecture Slides.
- Wednesday, October 09: Lecture 12: Application 03: Linear Models. Jupyter Notebook. Lecture Slides.
  - Kahoot Quiz
  - Assignment 05 due (5%)
  - Assignment 06: Problem Set 06
- Monday, October 14: Fall Break (no class)
- Wednesday, October 16: Lecture 13: Replacing and Recoding Variables. Jupyter Notebook. Lecture Slides.
  - Assignment 06 due (5%)
  - Assignment 07: Problem Set 07
- Monday, October 21: Lecture 14: Quiz 2: Application 4: Random Assignment (6%)
- Wednesday, October 23: Lecture 15: Aggregating Data. Jupyter Notebook. Lecture Slides.
  - Assignment 07 due (5%)
  - Assignment 08: Problem Set 08
- Monday, October 28: Lecture 16: Merging Data. Jupyter Notebook. Lecture Slides.
- Wednesday, October 30: Lecture 17: Introduction to SQL. SQL file. Lecture Slides.
  - Kahoot Quiz
  - Assignment 09: Problem Set 09
  - Instructions for the Final Project: Final Project
- Friday, November 01: Assignment 08 due (5%)
- Monday, November 04: Lecture 18: Quiz 3: Application 5: Practicing Chaining (6%)
- Wednesday, November 06: Lecture 19: Import SQL Data into Python. Jupyter Notebook. Lecture Slides.
  - Assignment 09 due (5%)
  - Assignment 10: Problem Set 10
- Monday, November 11: Lecture 20: Merging Tables in SQL. Jupyter Notebook. Lecture Slides.
- Wednesday, November 13: Lecture 21: Time Series Analysis. Jupyter Notebook. Lecture Slides.
  - Assignment 10 due (5%)
- Monday, November 18: Lecture 22: Quiz 4: Application 6: Practice SQL Queries (6%)
- Wednesday, November 20: Lecture 23: Pivot Tables. Jupyter Notebook. Lecture Slides.
- Monday, November 25: Lecture 24: Manipulating Text Data. Jupyter Notebook. Lecture Slides.
- Wednesday, November 27: Thanksgiving Break (no class)
- Monday, December 02: Lecture 25: Quiz 5: Application 8: Time Data, Panel Data, and Plots (6%)
- Wednesday, December 04: Drop-in session for the final project (no readings).
- Monday, December 09: Final Project due (20%)
Each lecture folder contains an HTML file and a Jupyter notebook (.ipynb) with code examples and explanations, along with any additional resources or datasets used in the lecture.
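If you have cloned the repository and want a quick overview of which notebooks are available, the following is a minimal sketch. It assumes the lecture folders sit directly inside a top-level lectures/ directory; the function name `list_notebooks` and the folder layout are illustrative assumptions, not part of the course materials.

```python
from pathlib import Path


def list_notebooks(lectures_dir="lectures"):
    """Map each lecture folder name to the sorted notebook file names inside it.

    Assumes one sub-folder per lecture (an assumption about the layout).
    """
    root = Path(lectures_dir)
    return {
        folder.name: sorted(p.name for p in folder.glob("*.ipynb"))
        for folder in sorted(root.iterdir())
        if folder.is_dir()
    }


if __name__ == "__main__":
    # Only print if the script is run from the repository root.
    if Path("lectures").is_dir():
        for folder, notebooks in list_notebooks().items():
            print(folder, notebooks)
```

Run it from the repository root, or pass a different path to `list_notebooks` if your copy is organised differently.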
Throughout the course, students will complete various assignments and quizzes to reinforce their learning. These will be posted in the respective assignments/ and quizzes/ folders as the course progresses. We will also announce these in class and on Canvas. Please refer to the syllabus for due dates and submission guidelines.
The tutorials/ folder contains step-by-step guides for various tools and techniques used in the course. These include:
- VSCode and Anaconda Tutorial
- Jupyter Notebook and Markdown Tutorial
- GitHub Tutorial
- PostgreSQL Tutorial
- Prerequisites: None, only willingness to learn and explore new tools 😃
- Software: Anaconda distribution of Python 3.x, VS Code, PostgreSQL, GitHub Desktop
- Assignments: 50%
- Class Quizzes: 30%
- Final Project: 20%
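As a rough illustration of how these weights combine, here is a small Python sketch. It assumes each component is first averaged on a 0-100 scale; the function name `course_grade` and that averaging step are assumptions for illustration only, and the syllabus remains the authoritative source for how grades are computed.

```python
# Component weights taken from the grading breakdown above.
WEIGHTS = {"assignments": 0.50, "quizzes": 0.30, "final_project": 0.20}


def course_grade(assignments, quizzes, final_project):
    """Return the weighted course grade on a 0-100 scale.

    Each argument is the average score for that component, also on 0-100.
    """
    scores = {
        "assignments": assignments,
        "quizzes": quizzes,
        "final_project": final_project,
    }
    return sum(WEIGHTS[name] * score for name, score in scores.items())


if __name__ == "__main__":
    # Hypothetical averages: 90 on assignments, 80 on quizzes, 100 on the project.
    print(round(course_grade(90, 80, 100), 2))
```

The weights match the schedule: ten assignments at 5% each, five quizzes at 6% each, and the final project at 20%.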
For detailed information on course policies, grading criteria, attendance requirements, and academic integrity guidelines, please refer to the syllabus.pdf file in the repository root.
To supplement your learning, you may find the following resources helpful:
- Python for Data Analysis by Wes McKinney
- Python Data Science Handbook by Jake VanderPlas
- Elements of Data Science by Allen Downey
- Automate the Boring Stuff with Python by Al Sweigart
- Python for Everybody by Charles Severance
- SQL for Data Scientists by Renee M. P. Teate
- Coursera: Python for Everybody Specialisation
- edX: Python Basics for Data Science
- Codecademy: Learn Python
- DataCamp: Introduction to SQL
- Coursera: SQL for Data Science
- Official Python Documentation
- NumPy Documentation
- Pandas Documentation
- Matplotlib Documentation
- PostgreSQL Documentation
The syllabus also includes a list of additional readings and resources for each week.
- Instructor: Danilo Freire
- Email: [email protected]
- Office Hours: At your convenience; please schedule an appointment via email.
Students are expected to adhere to the Emory University Honour Code. Any suspected violations will be reported to the Honour Council.
If you require any accommodations for this course, please contact the Department of Accessibility Services and the instructor as soon as possible.
If you encounter any issues with the course materials or have questions about the content, please:
- Check the course syllabus and this README for relevant information
- Review the lecture materials and tutorials in the repository
- Consult with your classmates or post in the course discussion forum
- Attend office hours or schedule an appointment with the instructor
While this repository is primarily maintained by the course instructor, everyone is welcome to contribute. Please feel free to suggest improvements or report issues by opening a GitHub issue, submitting a pull request, creating a discussion post, or contacting the instructor directly.
This course and its materials have been developed with inspiration from previous versions of this course, as well as various open-source communities and educational resources. I am particularly grateful to Alejandro Sánchez Becerra for his teaching materials and guidance. I am also thankful for the contributions of the Python, SQL, and data science communities that make courses like this possible.
This repository is licensed under the MIT License. You are free to use, modify, and distribute the materials as needed, with appropriate attribution to the original source.
We look forward to an engaging and productive semester! Good luck, and happy coding! 😃