Commit
Showing 8 changed files with 113 additions and 115 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,50 +1,50 @@ | ||
# 0.0. Course | ||
|
||
## What is this course about? | ||
## What will this course teach you? | ||
|
||
This course is designed to guide you through the process of transforming Python code from simple scripts to a sophisticated, production-ready AI/ML codebase. The curriculum focuses on: | ||
Welcome to our comprehensive course designed to elevate your Python programming from basic notebooks to crafting a sophisticated, production-grade AI/ML codebase. Throughout this journey, you will learn: | ||
|
||
- Constructing and deploying software artifacts suitable for production environments. | ||
- Moving beyond initial prototyping in notebooks to develop organized Python packages. | ||
- Improving code reliability and maintainability using tools for linting and testing. | ||
- Simplifying and automating repetitive tasks, either on your local machine or through CI/CD pipelines. | ||
- Applying industry best practices to create a flexible and robust AI/ML codebase. | ||
- How to build and deploy production-worthy software artifacts. | ||
- Transitioning from prototyping in notebooks to developing structured Python packages. | ||
- Enhancing code reliability and maintenance through linting and testing tools. | ||
- Streamlining repetitive tasks using automation, both locally and via CI/CD pipelines. | ||
- Adopting best practices to develop a versatile and resilient AI/ML codebase. | ||
|
||
## How much does this course cost? | ||
## Is there a fee for this course? | ||
|
||
This course is provided free of charge, under the Creative Commons Attribution 4.0 International license, enabling you to alter, share, and utilize the material commercially, as long as you credit the original creators. | ||
We are delighted to offer this course at no cost, under the Creative Commons Attribution 4.0 International license. This means you can adapt, share, and even use the content for commercial purposes, provided you attribute the original authors. | ||
|
||
We also offer additional support options to enhance your learning experience, including personalized mentoring sessions and access to online assistants. | ||
Additionally, for those seeking a deeper understanding, we provide extra support options, including personal mentoring sessions and access to online assistance. | ||
|
||
## Why should I enroll in this course? | ||
## Why enroll in this course? | ||
|
||
As AI and ML technologies increasingly permeate software applications, the complexity of these projects grows, presenting challenges in model management, dataset handling, and complex code organization. This course seeks to fill the gap between software developers and data scientists, equipping you with the skills necessary to adeptly manage and deliver AI/ML projects. | ||
The intersection of AI and ML with software applications is becoming increasingly complex, necessitating sophisticated management of models, datasets, and code. This course aims to bridge the knowledge gap between software engineers and data scientists, empowering you to efficiently navigate and manage AI/ML projects. | ||
|
||
The course emphasizes the importance of transitioning from the common practice of using notebooks for production, which can lack robust software development practices, to properly structured codebases. This shift can mitigate production challenges, improve collaboration, and increase the maturity of your MLOps practices. | ||
A key focus is the shift from using notebooks for production, which often lack rigorous software development practices, to a structured codebase. This transition is crucial for tackling production challenges, fostering better collaboration, and advancing your MLOps capabilities. | ||
|
||
## What are the course's prerequisites? | ||
## What should you know before starting? | ||
|
||
Before starting, you should have: | ||
To get the most out of this course, you should have: | ||
|
||
1. A basic understanding of Python programming, including constructs like loops, conditionals, functions, and classes. | ||
2. Some familiarity with using the terminal for tasks such as software installation, following README instructions, and starting applications. | ||
3. An introductory level of data science knowledge, encompassing data exploration, feature engineering, model training and tuning, and evaluation. | ||
1. A basic grasp of Python programming—understanding loops, conditionals, functions, and classes. | ||
2. Familiarity with terminal commands for software installation, following README guides, and launching applications. | ||
3. An introductory level of knowledge in data science, including data exploration, feature engineering, model training and tuning, and performance evaluation. | ||
|
||
## What will I learn in this course? | ||
## What skills will you acquire? | ||
|
||
The course is structured into six detailed chapters, each aimed at enhancing different aspects of your coding and project management capabilities: | ||
The course is divided into six in-depth chapters, each focusing on different facets of coding and project management skills: | ||
|
||
1. **Initializing**: Setting up your development environment with essential tools and platforms. | ||
2. **Prototyping**: Starting with notebooks to explore data science projects and identify potential solutions. | ||
3. **Refactoring**: Evolving your prototype into a well-organized Python package, including scripts, configuration files, and documentation. | ||
4. **Validating**: Implementing practices like typing, linting, testing, and logging to improve code quality. | ||
5. **Refining**: Applying advanced software development techniques and tools to further enhance your project. | ||
6. **Collaborating**: Creating a collaborative environment for efficient team contributions and communication. | ||
1. **Initialization**: Equip yourself with the necessary tools and platforms for your development environment. | ||
2. **Prototyping**: Begin with notebooks to dive into data science projects and pinpoint viable solutions. | ||
3. **Refactoring**: Transform your prototype into a neatly organized Python package, complete with scripts, configurations, and documentation. | ||
4. **Validation**: Adopt practices like typing, linting, testing, and logging to refine code quality. | ||
5. **Refinement**: Leverage advanced software development techniques and tools to polish your project. | ||
6. **Collaboration**: Foster a productive team environment for effective contributions and communication. | ||
|
||
## What is not covered in this course? | ||
## What's beyond the scope of this course? | ||
|
||
While this course lays a strong foundation for managing AI/ML projects, it does not cover specific MLOps platforms like SageMaker, Vertex AI, Azure ML, or Databricks. Instead, it focuses on essential principles and practices that are applicable across various platforms, whether on-premises, cloud-based, or a combination of both. | ||
While this course provides a solid grounding in managing AI/ML projects, it does not delve into specific MLOps platforms like SageMaker, Vertex AI, Azure ML, or Databricks as online courses already cover these end-to-end platforms. Instead, this course focuses on core principles and practices that are universally applicable, whether you're working on-premise, cloud-based, or in a hybrid setting. | ||
|
||
## How long does it take to complete this course? | ||
## How much time do you need to complete this course? | ||
|
||
Completion time varies depending on your prior experience and familiarity with the tools and practices covered. If you're already acquainted with some of the tools like git or VS Code, you can navigate the content more swiftly. The course adopts a philosophy of incremental improvement—"make it done, make it right, make it fast"—encouraging you to start with a functional project version and progressively refine it to enhance quality and efficiency. | ||
The time required to complete this course varies based on your prior experience and familiarity with the covered tools and practices. If you're already comfortable with tools like Git or VS Code, you may progress faster. The course philosophy encourages incremental improvement—"make it done, make it right, make it fast"—urging you to begin with a functional project version and steadily refine it for better quality and efficiency. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,25 +1,25 @@ | ||
# 0.1. Projects | ||
# 0.1 Projects | ||
|
||
## What is the default project for this course? | ||
## What's our default learning project? | ||
|
||
The standard project for this course focuses on a forecasting task using the Bike Sharing Dataset. This project aims to predict bike rental numbers based on a range of factors, such as date and time, weather conditions, and historical rental data. | ||
The cornerstone project of this course involves a forecasting task using the Bike Sharing Dataset. The objective is to predict the number of bike rentals based on variables like date and time, weather conditions, and past rental data. | ||
|
||
Forecasting tasks are prevalent in both academic research and real-world applications, employing various machine learning techniques. Projects like these come with specific challenges, including the management of data subsets to avoid data leakage—where future data inadvertently informs past predictions. Through this project, you will gain practical experience in organizing MLOps projects effectively, making it an ideal learning tool. | ||
Forecasting is a critical skill with wide-ranging applications in academia and industry, utilizing diverse machine learning techniques. This project introduces challenges such as managing data subsets to prevent data leakage, where future information could wrongly influence past predictions. Through tackling this project, you'll gain hands-on experience in structuring MLOps projects effectively, offering a solid foundation for your learning journey. | ||
|
||
## Can I choose my own project? | ||
## Is personal project selection possible? | ||
|
||
Yes, you are highly encouraged to work on a project of your own choice. This can be a project you're involved in at work or a personal interest project you're passionate about developing. Using your own project allows you to focus on improving the development process with the domain knowledge you already possess, instead of spending time getting to know a new project's specifics. | ||
Absolutely! We encourage you to dive into a project that resonates with you personally. This could be a project you're currently working on professionally, or a passion project you're eager to develop further. Opting for your own project allows you to apply improvements directly within a familiar domain, streamlining the learning process by removing the need to acquaint yourself with a new project's nuances. | ||
|
||
## Where can I find inspiration? | ||
## How to find project ideas? | ||
|
||
If you're searching for project inspiration, there are several platforms where data science challenges are hosted, complete with datasets and well-defined problems: | ||
Looking for inspiration? There are several online platforms offering data science challenges, complete with datasets and clearly defined objectives: | ||
|
||
- **Kaggle**: Home to a global data science community, Kaggle offers tools and resources to support your data science goals. | ||
- **DrivenData**: This platform runs competitions for data scientists to tackle significant global issues through creative predictive modeling. | ||
- **DataCamp**: Known for its real-world data science competitions, DataCamp allows you to sharpen your skills, win prizes, and showcase your work. | ||
- **Kaggle**: A hub for data scientists worldwide, Kaggle provides the tools and community support needed to pursue your data science aspirations. | ||
- **DrivenData**: Hosts competitions where data scientists can address significant societal challenges through innovative predictive modeling. | ||
- **DataCamp**: Offers real-world data science competitions, allowing participants to hone their skills, win accolades, and present their solutions. | ||
|
||
## Can I work on an LLM project? | ||
## Can you work on a Large Language Model (LLM) project? | ||
|
||
While projects centered around Large Language Models (LLM) and generative AI share some common ground with predictive ML projects—like the necessity for robust model management and code organization—they also possess unique challenges. Evaluating LLMs can be more complex, potentially requiring the use of external LLMs for comprehensive testing. Moreover, training and fine-tuning LLMs usually require specialized hardware, such as high-memory GPUs, and involve different practices compared to typical ML tasks. | ||
Working on projects centered around Large Language Models (LLM) and generative AI does hold similarities with predictive ML projects, particularly in the areas of model management and code structuring. However, LLM projects also present distinct challenges. Evaluating LLMs can be more intricate, sometimes necessitating the use of external LLMs for thorough testing. Additionally, the training and fine-tuning of LLMs typically demand specific hardware, like high-memory GPUs, and adhere to different methodologies compared to conventional ML tasks. | ||
|
||
Given these distinctions, it may be more practical to start with a predictive ML project to familiarize yourself with essential MLOps practices. These foundational skills can later be adapted and applied to LLM projects, facilitating a smoother transition to more specialized tasks. | ||
Therefore, we recommend starting with a predictive ML project to get acquainted with fundamental MLOps practices. These core skills will then be easier to adapt and apply to LLM projects, easing the progression to these more specialized areas. |
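
The data leakage concern raised in the default project description (future information influencing past predictions) is commonly handled by splitting data chronologically rather than at random. The following is a minimal sketch of that idea and is not taken from the course material; the function name `chronological_split`, the record fields `timestamp` and `rentals`, and the sample values are all hypothetical.

```python
from datetime import datetime, timedelta


def chronological_split(records, cutoff):
    """Split time-stamped records so every training row precedes every test row."""
    # Sort first so the split does not depend on the input order.
    ordered = sorted(records, key=lambda r: r["timestamp"])
    train = [r for r in ordered if r["timestamp"] < cutoff]
    test = [r for r in ordered if r["timestamp"] >= cutoff]
    return train, test


# Hypothetical hourly rental counts covering two days (illustrative values only).
start = datetime(2011, 1, 1)
records = [
    {"timestamp": start + timedelta(hours=h), "rentals": 10 + h % 5}
    for h in range(48)
]

train, test = chronological_split(records, cutoff=datetime(2011, 1, 2))

# No training timestamp may be later than the earliest test timestamp.
assert max(r["timestamp"] for r in train) < min(r["timestamp"] for r in test)
```

A random shuffle before splitting would mix later hours into the training set, which is the leakage scenario the project description warns about.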