Data Science - Notes 1

Topics to be covered in Notes 1

  • Learning Path
  • Introduction
  • Data Science Specialization
  • Workflow of a Data Scientist

1. Learning Path:

Below is the learning path of this course:

  1. Statistics
     1.a) Basic Statistics
     1.b) Advanced Statistics
  2. Analytics
     2.a) Basic
     2.b) Advanced
  3. Machine Learning (R and Python)
  4. Visualization Tools (Tableau)
  5. Databases

2. Introduction:

Data Science is the use of scientific methods, processes, and systems to extract knowledge and insights from data. This data can be either structured or unstructured. Data Science = Maths + Statistics

3. Data Science Specialization:

Below are a few fields in Data Science:

  1. Speech Analysis
  2. Text Analysis (Natural Language Processing)
  3. Image Processing
  4. Video Processing
  5. Medicine Simulation
  6. Material Simulation

4. Workflow of a Data Scientist:

Sample workflow of a Data Scientist:

  • Understand the Business Problem - Ask "why" questions and use interviews or questionnaires with the client.

  • Data Acquisition - Acquire the data required to solve the business problem. Sources can include web server logs, database data, online repositories, etc.

  • Data Preparation (Data Cleaning) - Remove inconsistent data types, misspelled attributes, and missing and duplicate values.

  • Data Transformation - Modify data using defined mapping rules. Tools like Talend and Informatica can perform complex transformations that make the data easier to understand.

  • Data Analysis - Define and refine the selection of feature variables that will be used in model development.

  • Data Modeling - Apply ML algorithms, such as a Decision Tree, to the data to identify the best fit for the business. Train the models on a training data set, evaluate them on a test data set, and select the best ML model. Python is a commonly preferred language for modeling the data.

  • Visualization and Communication - Create powerful reports and dashboards using Tableau, Power BI, and QlikView, and communicate to the customer how the business problem will be solved with the best-fit ML model.

  • Deploy, Test, and Maintain the ML Model - Deploy and test the ML model, then maintain it over time.
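The data preparation, transformation, and modeling steps above can be sketched in plain Python. This is only a minimal illustration under stated assumptions: the sample records, the `PLAN_CODES` mapping rule, and every function name are hypothetical, and a one-level decision tree (a decision stump) stands in for a full Decision Tree algorithm. Real projects would typically rely on dedicated libraries rather than hand-written code like this.

```python
import random

# Hypothetical raw records, e.g. parsed from web server logs or a database export.
RAW_ROWS = [
    {"age": "25", "plan": "basic", "churned": "no"},
    {"age": "25", "plan": "basic", "churned": "no"},     # duplicate row
    {"age": "47", "plan": "Premium", "churned": "yes"},  # inconsistent casing
    {"age": None, "plan": "basic", "churned": "no"},     # missing value
    {"age": "52", "plan": "premium", "churned": "yes"},
    {"age": "31", "plan": "basic", "churned": "no"},
    {"age": "58", "plan": "premium", "churned": "yes"},
    {"age": "22", "plan": "BASIC", "churned": "no"},
    {"age": "44", "plan": "premium", "churned": "yes"},
    {"age": "36", "plan": "basic", "churned": "no"},
]

# Mapping rule for the transformation step (an assumption for this example).
PLAN_CODES = {"basic": 0, "premium": 1}


def clean(rows):
    """Data cleaning: drop rows with missing values or duplicates, fix types."""
    seen, cleaned = set(), []
    for row in rows:
        if any(value is None for value in row.values()):
            continue  # skip rows with a missing value
        record = (int(row["age"]), row["plan"].strip().lower(), row["churned"] == "yes")
        if record in seen:
            continue  # skip duplicate rows
        seen.add(record)
        cleaned.append(record)
    return cleaned


def transform(cleaned):
    """Data transformation: apply the mapping rule to encode the plan as a number."""
    return [((age, PLAN_CODES[plan]), label) for age, plan, label in cleaned]


def split(samples, test_fraction=0.25, seed=0):
    """Shuffle the samples and split them into training and test sets."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_stump(train):
    """Fit a decision stump: the (feature, threshold) with the best training accuracy."""
    best = None
    for feature in range(len(train[0][0])):
        for threshold in sorted({x[feature] for x, _ in train}):
            correct = sum((x[feature] >= threshold) == y for x, y in train)
            if best is None or correct > best[0]:
                best = (correct, feature, threshold)
    return best[1], best[2]


def predict(stump, x):
    """Predict True when the chosen feature reaches the learned threshold."""
    feature, threshold = stump
    return x[feature] >= threshold


def accuracy(stump, samples):
    """Fraction of samples whose prediction matches the label."""
    return sum(predict(stump, x) == y for x, y in samples) / len(samples)


samples = transform(clean(RAW_ROWS))
train, test = split(samples)
stump = fit_stump(train)
print("train accuracy:", accuracy(stump, train))
print("test accuracy:", accuracy(stump, test))
```

Comparing the training accuracy with the test accuracy is what makes model selection meaningful: a model that only scores well on the data it was trained on is not the best fit for the business problem.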