Welcome to the intro-level DS class, where we will learn Python basics and how to use Python for exploratory data analysis. Hope you'll enjoy the class and learn something from it.
Note: If you get the following error while cloning from or pushing to GitHub, try setting up a personal access token (PAT) by following these instructions.
Error message:
remote: Support for password authentication was removed on August 13, 2021. Please use a personal access token instead.
remote: Please see https://github.blog/2020-12-15-token-authentication-requirements-for-git-operations/ for more information.
fatal: unable to access 'https://github.com/emma-data-works/ds-class-intro.git/': The requested URL returned error: 403
If you can, try to go through the following readings and set up your local environment before class:
- Anaconda set up
- Git introduction
- Set up your forked repo for commit and push + Homework submission instructions
You can run Python in different settings. For example, you can use a Jupyter notebook for interactive exploration, use the interpreter in the command line by typing `python` in the terminal (you'll see the `>>>` prompt appear), or run a Python script in the command line with `python <your_script>.py`. We will be using notebooks for the class, as they are easy to follow with markdown and easy to interact with.
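As a minimal sketch of the script option, assuming a hypothetical file named `hello.py` (the file name and the `greeting` function are illustrative and not part of the course material), something like the following could be saved and run with `python hello.py`:

```python
# hello.py -- hypothetical file name, used for illustration only
import sys


def greeting(name: str) -> str:
    """Return a short greeting for the given name."""
    return f"Hello, {name}!"


if __name__ == "__main__":
    # Use the first command-line argument if one is given, otherwise a default.
    name = sys.argv[1] if len(sys.argv) > 1 else "world"
    print(greeting(name))
```

The same function could also be pasted into a notebook cell or typed at the `>>>` prompt and then called directly, for example `greeting("class")`.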
0. Environment setup (material in section 0)
1. Assign values to variables and simple arithmetic
2. `print` and simple string manipulation
3. Value comparison and conditions using `if-elif-else`
4. Collections: list, tuple, set, and dictionary
* Git - Committing, Pushing, and Pull Requests
Homework_01 (Exercises 0, 3, 4) is due next class. Please refer to the homework submission instructions for how to open a pull request for submission.
5. Iteration: loops and comprehensions
* HW01 review [delayed]
6. Writing functions
Homework_02 (Exercises 5, 6) is assigned. It is due next Wednesday, 8/11, but we'll start the discussion/review on 8/8. A sample answer is posted for reference.
7. Reading and writing files
8. Intro to code complexity and performance (part 1)
Sample answer for exercise 7 is posted for reference.
9. Coding challenge example using HackerRank and LeetCode
10. [Material Provided] Object-oriented programming
11. A/B testing discussion
class07 Beginning of `pandas` for data analysis
1. Data exploration: Intro to `pandas`
2. Data wrangling basics
2. Data wrangling basics
3. Using `pandas` for exploratory data analysis
Homework_03 (all exercises in the pandas section) is assigned. It is due before the final class, and a sample answer will be posted before the final class.
3. Using `pandas` for exploratory data analysis
4. Plotting in Python
class10 dataset has been uploaded for preview
[5. Advanced EDA topics] -- depends on time
6. Mock take-home case study
https://github.com/tdpetrou/Minimally-Sufficient-Pandas
https://github.com/cmawer/pycon-2017-eda-tutorial/blob/master/EDA-cheat-sheet.md