This project provides a guide and tools for extracting data from FreeBSD documentation and man pages for use in machine learning and deep learning projects.
This is the workflow this project uses to extract data from the FreeBSD source code.
Additionally, we provide an implementation (example/retrieve) that uses Retrieval-Augmented Generation (RAG) with an embedded vector database, generated by this project, to extend the capabilities of ChatGPT (or any other model). It supplies ChatGPT with contextually relevant text for your questions, leading to more accurate answers. Essentially, it turns ChatGPT into an expert system specialized in the FreeBSD domain. This is one of the
This DL implementation is just one example of what you can achieve with FreeBSD data. Many other implementations and applications are possible.
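The retrieval step behind RAG can be illustrated with a minimal sketch. This is not the code in example/retrieve: it assumes a plain bag-of-words vector in place of a real embedding model and an in-memory list in place of the vector database, and the sample chunks, `retrieve`, and `build_prompt` are hypothetical names introduced only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical sample chunks standing in for text extracted from man pages.
CHUNKS = [
    "pkg install installs binary packages from a remote repository.",
    "ifconfig configures network interface parameters.",
    "zfs snapshot creates a read-only copy of a dataset at a point in time.",
]


def embed(text):
    """Toy 'embedding': word counts. A real setup would use an embedding model."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, chunks, k=2):
    """Prepend the retrieved context to the question before sending it to a model."""
    context = "\n\n".join(retrieve(question, chunks, k))
    return (
        "Answer using only this FreeBSD documentation:\n\n"
        f"{context}\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    print(build_prompt("How do I create a zfs snapshot?", CHUNKS, k=1))
```

The resulting prompt string is what would be passed to ChatGPT or another model. Swapping `embed` for a real embedding model and the list for a vector database changes the quality of the ranking, not the overall shape of the flow.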
This project relies on tools that ship with FreeBSD, including 'mandoc' for rendering man pages. Running the script on FreeBSD is therefore recommended to ensure compatibility and smooth operation.
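As a rough illustration of what rendering a man page with 'mandoc' involves, the sketch below calls it from Python and cleans up the result. It is an assumption about one possible approach, not a description of what data.sh does. The helper names and the `ls.1` path are hypothetical. The `-T utf8` output mode is a real mandoc option, and because terminal output encodes bold and underline with backspaces, the sketch strips those sequences.

```python
import re
import shutil
import subprocess


def mandoc_command(path, output="utf8"):
    """Build the mandoc invocation; -T selects the output format."""
    return ["mandoc", "-T", output, path]


def strip_overstrike(text):
    """Remove backspace-encoded bold/underline, i.e. any character followed by a backspace."""
    return re.sub(r".\x08", "", text)


def render_man_page(path):
    """Render one man page source file to plain text (requires mandoc on PATH)."""
    if shutil.which("mandoc") is None:
        raise RuntimeError("mandoc not found; run this on FreeBSD or install mandoc")
    result = subprocess.run(
        mandoc_command(path), capture_output=True, text=True, check=True
    )
    return strip_overstrike(result.stdout)


if __name__ == "__main__":
    # 'ls.1' is a placeholder path to a man page source file.
    if shutil.which("mandoc") is not None:
        try:
            print(render_man_page("ls.1")[:500])
        except subprocess.CalledProcessError as err:
            print(f"mandoc failed: {err}")
    else:
        print("mandoc is not installed on this system")
```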
Run data.sh.
$ sh data.sh
In November 2023, OpenAI introduced GPTs, customizable versions of ChatGPT. You can upload data to a GPT to build your own expert system tailored to the FreeBSD domain.
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Please make sure to update tests as appropriate.