Links:
Note: check the final result. It differs slightly from what we showed in the videos: we made further small improvements, such as a better README and more readable code.
- Generating data for the project
- Setting up the project
- Implementing the initial version of the RAG flow
- Preparing the README file
- Generating gold standard evaluation data
- Evaluating retrieval
- Finding the best boosting coefficients
- Using LLM-as-a-Judge (type 2)
- Comparing gpt-4o-mini with gpt-4o
- Turning the Jupyter notebook into a script
- Creating the ingestion pipeline
- Creating the API interface with Flask
- Improving README
- Creating a Docker image for our application
- Putting everything in Docker Compose
- Logging all the information for monitoring purposes
- Changes between 7.5 and 7.6 (Postgres logging, Grafana, cli.py, etc.)
- README file improvements
- Total cost of the project (~$2) and how to lower it
- Using generated data for real-life projects
- Different chunking strategies
- Use cases: multiple articles, one article, slide decks
Links:
- https://chatgpt.com/share/a4616f6b-43f4-4225-9d03-bb69c723c210
- https://chatgpt.com/share/74217c02-95e6-46ae-b5a5-ca79f9a07084
- https://chatgpt.com/share/8cf0ebde-c53f-4c6f-82ae-c6cc52b2fd0b
- First link goes here
- Did you take notes? Add them above this line (Send a PR with links to your notes)