Welcome to the intro-level DS class, where we will learn Python basics and how to use Python for exploratory data analysis. Hope you'll enjoy the class and learn something from it.
- Anaconda set up
- Git introduction
- Set up your forked repo for commit and push + Homework submission instructions
You can run Python in different settings. For example, you can use a Jupyter notebook for interactive exploration, use the interpreter on the command line by typing `python` in a terminal (you'll see the `>>>` prompt appear), or run a Python script from the command line with `python <your_script>.py`. We will be using notebooks for the class because they are easy to follow with markdown and easy to interact with.
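As a minimal sketch of the script option, assume a hypothetical file named `hello.py` (the file name, the `greeting` function, and its contents are illustrative and not part of the course material). The same lines can also be pasted into a notebook cell or typed at the `>>>` prompt.

```python
# hello.py: hypothetical example script, not part of the course material.
import sys


def greeting(name):
    """Build a greeting string for the given name."""
    return f"Hello, {name}!"


if __name__ == "__main__":
    # Run from a terminal with: python hello.py Ada
    # Falls back to "world" when no name is passed on the command line.
    name = sys.argv[1] if len(sys.argv) > 1 else "world"
    print(greeting(name))
```

The `if __name__ == "__main__":` guard means the printing only happens when the file is run as a script, so `greeting` can still be imported elsewhere without side effects.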
0. Environment set up (material in section 0)
1. Assign values to variables and simple arithmetic
2. `print` and simple string manipulation
3. Value comparison and conditions using `if-elif-else`
4. Collections: list, tuple, set, and dictionary
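For a rough preview of the topics listed above, here is a small sketch. All variable names and values are made up for illustration and do not come from the course notebooks.

```python
# Illustrative values only; none of these come from the course notebooks.
price = 4.5
quantity = 3
total = price * quantity  # simple arithmetic

# print with an f-string and a basic string method
item = "coffee"
print(f"{quantity} x {item.upper()} costs {total}")

# Value comparison with if-elif-else
if total > 20:
    label = "expensive"
elif total > 10:
    label = "moderate"
else:
    label = "cheap"

# The four built-in collections
as_list = [3, 1, 3]    # ordered, mutable, allows duplicates
as_tuple = (3, 1, 3)   # ordered, immutable
as_set = {3, 1, 3}     # unordered, keeps unique elements only
as_dict = {"item": item, "total": total}  # key-value pairs
```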
* Git - Committing, Pushing, and Pull Request
Homework_01 (Exercise 0, 3, 4) is due next class. Please refer to the homework submission instructions for how to open a pull request for submission.
* HW01 review
5. Iteration: loops and comprehensions
6. Writing functions
7. Reading and writing files
Homework_02 (Exercise 5, 6, 7) is assigned; it's due next Wednesday, 5/6. This time, please submit `.py` files for all submissions. As before, once you're done, you can open a PR with these files.
8. Intro to code performance ~~Useful basic modules (numpy, os, datetime)~~
9. Coding challenge examples on HackerRank
1. Intro to `pandas`
2. Data wrangling
3. Using `pandas` for EDA
Homework_03 is assigned; it's due next Wednesday, 5/13, so we can spend some time on discussion.
Please spend some time working on the EDA dataset so we can have a good discussion session next week.
- Pandas Exercise 5-8 in notebook 03
- World Happiness dataset exploration: ask one interesting question and try to answer it using the datasets.
4. Basic plotting + (slightly) advanced EDA topics
* HW3 review
5. Mock Take-home case study
- HackerRank coding test
- Dataset exploration
6. Demo of `pandas-profiling`
7. Discussion on A/B test
An intro on A/B testing: https://towardsdatascience.com/the-math-behind-a-b-testing-with-example-code-part-1-of-2-7be752e1d06f
0. Simple scripting
1. Introduction to `class`
2. Tests and others
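To give a sense of how a class and a test fit together, here is a minimal sketch using the standard-library `unittest` module. `WordTally` and its methods are hypothetical names chosen for illustration, not something defined in the course material.

```python
import unittest


class WordTally:
    """Hypothetical example class: counts how often each word is added."""

    def __init__(self):
        self.counts = {}

    def add(self, word):
        """Record one more occurrence of the given word."""
        self.counts[word] = self.counts.get(word, 0) + 1

    def most_common(self):
        """Return the word seen most often, or None if nothing was added."""
        if not self.counts:
            return None
        return max(self.counts, key=self.counts.get)


class WordTallyTest(unittest.TestCase):
    def test_most_common(self):
        tally = WordTally()
        for word in ["a", "b", "a"]:
            tally.add(word)
        self.assertEqual(tally.most_common(), "a")

    def test_empty_tally(self):
        self.assertIsNone(WordTally().most_common())
```

If this were saved as, say, `test_tally.py` (an assumed file name), the tests could be run from a terminal with `python -m unittest test_tally.py`.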
Resources:
- Tutorial for `class` dev and intro to OOP: https://realpython.com/python-data-classes/
- Intro to testing with different types of tests and tools: https://docs.python-guide.org/writing/tests/
- Please try to follow along and read the provided material so that we can cover more during class.
- Please be respectful of your own time and commit to as many of the assignments as possible :)
- The internet (primarily Stack Overflow) is your friend if you have questions - you won't be the first or the last with this question. Try a quick Google search and see if you can find existing solutions.
https://github.com/tdpetrou/Minimally-Sufficient-Pandas
https://github.com/cmawer/pycon-2017-eda-tutorial/blob/master/EDA-cheat-sheet.md