* A guide to getting started with Data Science and ML *
(Deep Learning not included)
For Data Analysis, knowledge of Statistics is enough, but for building ML models, Calculus, Linear Algebra and Probability also play a huge role.
- Math for Data Science
- Statistics Revision
- Khan Academy Calculus
- Gilbert Strang's linear algebra
- Blog post for all Math resources required for ML
Reading theoretical books might be too involved if your goal is just to build ML models for your applications. But for people who'd like to understand deep learning algorithms and the math behind them, this is a short list of resources.
- How do I learn mathematics for machine learning?
This Quora answer gives a detailed 5-month roadmap (which can and should be extended according to your comfort) for learning the math behind machine learning, as well as the math that every engineer should know in general.
- Maths for Machine Learning
This book brings the mathematical foundations of basic machine learning concepts to the fore and collects the information in a single place. This book is intended to be a guidebook to the vast mathematical literature that forms the foundations of modern machine learning.
Data analysis is a process of inspecting, cleansing, transforming and modeling data with the goal of discovering useful information, informing conclusions and supporting decision-making. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, and is used in different business, science, and social science domains. In today's business world, data analysis plays a role in making decisions more scientific and helping businesses operate more effectively.
NumPy
A very useful library for math and scientific computing
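To give a feel for what "math and scientific computing" means in practice, here is a minimal sketch of three common NumPy operations: vectorized statistics, broadcasting, and solving a small linear system. The `scores` array and the system `A @ x = b` are made-up sample data, not taken from any of the linked resources.

```python
import numpy as np

# Hypothetical sample data: four students (rows), three tests (columns).
scores = np.array([
    [72.0, 85.0, 90.0],
    [60.0, 75.0, 80.0],
    [88.0, 92.0, 95.0],
    [55.0, 65.0, 70.0],
])

# Vectorized statistics: one mean per student (row) and one per test (column).
student_means = scores.mean(axis=1)
test_means = scores.mean(axis=0)

# Broadcasting: subtract each test's mean from every row in one expression.
centered = scores - test_means

# Linear algebra: solve the system A @ x = b for x.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)

print(student_means)
print(centered)
print(x)
```

None of these steps needs an explicit Python loop, which is the main reason NumPy is used as the base layer for most of the libraries listed in this guide.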
Pandas
The most widely used Python library for data analysis
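The inspect, cleanse and transform steps from the data analysis definition above can be sketched in a few lines of pandas. The `raw` table, its column names, and the choice of filling missing values with the median are all assumptions made for illustration; real data usually needs more careful handling.

```python
import pandas as pd

# Hypothetical raw data: inconsistent city names and one missing sales value.
raw = pd.DataFrame({
    "city": ["Mumbai", "mumbai ", "Pune", "Pune", "Delhi"],
    "sales": [120.0, None, 80.0, 100.0, 150.0],
})

# Inspect: count the missing values in the sales column.
missing_count = int(raw["sales"].isna().sum())

# Cleanse: normalize the city names and fill the gap with the column median.
clean = raw.copy()
clean["city"] = clean["city"].str.strip().str.title()
clean["sales"] = clean["sales"].fillna(clean["sales"].median())

# Transform: aggregate to one row per city.
summary = clean.groupby("city")["sales"].agg(["count", "mean"])

print(missing_count)
print(summary)
```

Working on a copy (`clean`) rather than on `raw` keeps the original data available for comparison, which is handy while exploring.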
Data Visualization
SQL
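For readers who have not written SQL before, this is a minimal sketch of a typical analysis query (filter, group, aggregate, sort), run from Python with the standard-library `sqlite3` module so that no database server is needed. The `orders` table and its rows are hypothetical.

```python
import sqlite3

# An in-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")

# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("asha", 50.0), ("asha", 75.0), ("ravi", 40.0), ("meera", 200.0)],
)

# Total spend per customer, keeping only customers above 100, largest first.
query = """
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    HAVING SUM(amount) > 100
    ORDER BY total DESC
"""
rows = conn.execute(query).fetchall()
conn.close()

print(rows)
```

The same `SELECT ... GROUP BY ... HAVING ... ORDER BY` pattern carries over to other SQL databases, although details such as data types differ between engines.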
Big Data refers to data sets so large that they cannot be stored, processed, or analyzed using traditional tools. Big Data analytics is a process used to extract meaningful insights, such as hidden patterns, unknown correlations, market trends, and customer preferences. Big Data analytics provides various advantages: it can be used for better decision making and for preventing fraudulent activities, among other things.
Here are some popular tools used in Big Data analytics:
- Hadoop - helps in storing and analyzing data
- Spark - used for real-time processing and analyzing large amounts of data
- Kafka - a distributed streaming platform that provides fault-tolerant storage of event streams
- Cassandra - a distributed NoSQL database used to handle large amounts of data across many nodes
Practical (More bent towards Programming)
- Intro to Machine-Learning Udacity
- Kaggle Mini-Courses
- Machine Learning A-Z: Hands-On Python & R In Data Science Udemy
Theoretical (More in-depth Math Concepts)
- Machine Learning Andrew Ng (MATLAB)
- Stanford CS229: Machine Learning (Autumn 2018)
- Machine Learning Crash Course by Google
For absolute beginners
- Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython
- Intro to ML with Python
- Hands-On ML with Scikit-Learn and TensorFlow
For intermediates
- Made with ML by Goku Mohandas
- End to end ML by Brandon Rohrer
- A.I. by Google Researchers
- Towards Data Science by Medium
Best Websites to get free datasets
- Clone repo and create a new branch:
$ git clone https://github.com/CSI-SFIT/Data-Science-Resources && cd Data-Science-Resources
$ git checkout -b name_for_new_branch
- Make changes and test.
- Submit Pull Request with comprehensive description of changes.
CSI SFIT Tech Team 2020 - 2021 :
- Joint Tech Head : @kaifkohari
- Tech Executive : @EktaMasrani