Topics to be covered in Notes 1
- Learning Path
- Introduction
- Data Science Specialization
- Workflow of a Data Scientist
Below is the learning path for this course:
1. Statistics
   a) Basic Statistics
   b) Advanced Statistics
2. Analytics
   a) Basic
   b) Advanced
3. Machine Learning (R and Python)
4. Visualization Tools (Tableau)
5. Databases
Data Science uses scientific methods, processes, and systems to extract knowledge and insights from data. This data can be either structured or unstructured. Data Science = Maths + Statistics
Below are a few fields in Data Science:
1. Speech Analysis
2. Text Analysis (Natural Language Processing)
3. Image Processing
4. Video Processing
5. Medicine Simulation
6. Material Simulation
Sample workflow of a Data Scientist
- Understand the Business Problem: ask "why" questions, and use interviews or questionnaires with the client.
- Data Acquisition: acquire the data required to solve the business problem. Sources can include web server logs, database data, online repositories, etc.
- Data Preparation (Data Cleaning): remove inconsistent data types, misspelled attributes, and missing and duplicate values.
- Data Transformation: modify the data using defined mapping rules. Tools like Talend and Informatica can build complex transformations that make the data easier to understand.
- Data Analysis: define and refine the selection of feature variables that will be used in model development.
- Data Modeling: apply ML algorithms, such as Decision Trees, to the data to identify the best fit for the business. Train the models on a training data set, evaluate them on a test data set, and select the best ML model. Python is commonly preferred for modeling the data.
- Visualization and Communication: create powerful reports and dashboards using Tableau, Power BI, and QlikView, and communicate to the customer how the business problem will be solved with the best-fit ML model.
- Deploy, Test, and Maintain the ML Model: deploy and test the ML model, then maintain it.
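
The Data Preparation and Data Modeling steps above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the course's method: the function names (`clean_records`, `split_train_test`, `fit_stump`, `accuracy`) and the record fields (`id`, `age`, `bought`) are hypothetical, and a one-level "decision stump" stands in for a full Decision Tree. In practice, libraries such as pandas and scikit-learn would normally do this work.

```python
import random


def clean_records(records):
    """Data preparation: coerce types, drop rows with missing values, drop duplicates."""
    cleaned, seen = [], set()
    for rec in records:
        if any(rec.get(k) is None for k in ("id", "age", "bought")):
            continue  # missing value
        try:
            age = int(rec["age"])  # inconsistent data type, e.g. "25" vs 25
        except (TypeError, ValueError):
            continue  # unparseable value such as "n/a"
        key = (rec["id"], age, bool(rec["bought"]))
        if key in seen:
            continue  # duplicate row
        seen.add(key)
        cleaned.append({"id": rec["id"], "age": age, "bought": bool(rec["bought"])})
    return cleaned


def split_train_test(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out a fraction as the test set."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = max(1, int(len(rows) * test_fraction))
    return rows[n_test:], rows[:n_test]


def accuracy(rows, threshold):
    """Share of rows where the rule 'age >= threshold' matches the 'bought' label."""
    hits = sum((r["age"] >= threshold) == r["bought"] for r in rows)
    return hits / len(rows)


def fit_stump(train):
    """Pick the age threshold with the best accuracy on the training rows."""
    candidates = sorted({r["age"] for r in train})
    return max(candidates, key=lambda t: accuracy(train, t))


if __name__ == "__main__":
    # Made-up sample data, including one duplicate and one missing value.
    raw = [{"id": i, "age": str(20 + 5 * i), "bought": i >= 4} for i in range(8)]
    raw += [dict(raw[0]), {"id": 99, "age": None, "bought": True}]

    rows = clean_records(raw)
    train, test = split_train_test(rows)
    threshold = fit_stump(train)
    print("rows kept:", len(rows))
    print("chosen threshold:", threshold)
    print("test accuracy:", accuracy(test, threshold))
```

The key point is the separation of concerns: the model is fitted only on the training rows, and the held-out test rows are used only to check how well it generalizes.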