This repository contains a question answering model built on the BERT (Bidirectional Encoder Representations from Transformers) architecture. It starts from the "deepset/bert-base-cased-squad2" checkpoint, which is optimized for handling complex question formats and extracting answers from longer contexts.
The SQuAD dataset was used for fine-tuning, limited to 1,000 examples. This small-scale run was undertaken primarily for learning purposes, to demonstrate the capabilities of the model and the effectiveness of the fine-tuning process.
- Fine-Tuned for Question Answering Tasks: The model is fine-tuned for extractive question answering, that is, locating the answer span within a given context.
- Utilizes BERT Architecture: BERT is a well-established architecture for many NLP tasks, including QA, and this model builds on its pretrained representations.
- Trained on SQuAD2 Dataset: The base checkpoint was trained on the SQuAD2 dataset, which includes questions that have no answer in the context, so it can handle challenging question formats.
- Consideration of Letter Case: The "cased" attribute indicates that the model distinguishes uppercase from lowercase letters, which can help it capture the intended meaning of case-sensitive words such as proper nouns.
peft.LoraConfig configures LoRA (Low-Rank Adaptation), a technique for fine-tuning large language models more efficiently. Rather than updating every weight, LoRA freezes the pretrained model and trains small low-rank matrices added to selected layers, so far fewer parameters need to be updated and stored. The configuration options, such as the rank of these matrices, let you balance efficiency against accuracy.
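A rough calculation shows why training low-rank matrices is cheaper. The sketch below assumes the standard LoRA formulation, in which a frozen weight matrix of shape d_out × d_in gets a trainable update B·A with B of shape d_out × r and A of shape r × d_in. The rank of 8 and the choice of adapting two projection matrices per layer are assumptions for illustration, not this repository's actual settings; the hidden size and layer count are those of BERT-base.

```python
def lora_trainable_params(d_out, d_in, rank):
    """Parameters in the low-rank factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


HIDDEN = 768            # BERT-base hidden size
LAYERS = 12             # BERT-base encoder layers
RANK = 8                # hypothetical LoRA rank (assumption)
ADAPTED_PER_LAYER = 2   # e.g. query and value projections (assumption)

# Weights in the adapted matrices, which stay frozen under LoRA.
frozen = HIDDEN * HIDDEN * ADAPTED_PER_LAYER * LAYERS
# Weights in the LoRA factors, which are the only ones trained here.
trainable = lora_trainable_params(HIDDEN, HIDDEN, RANK) * ADAPTED_PER_LAYER * LAYERS

print(f"frozen weights in adapted matrices: {frozen:,}")
print(f"trainable LoRA parameters:          {trainable:,}")
print(f"ratio: {trainable / frozen:.2%}")
```

Under these assumptions the adapters add 294,912 trainable parameters, about 2% of the 14,155,776 weights in the matrices they adapt. The count ignores biases and the question answering head.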
These results demonstrate that fine-tuning with peft.LoraConfig is workable even with a limited dataset. Despite using only 1,000 training examples, the run finishes in under three minutes with a training loss of 1.86, which suggests potential for further development and application.
- Global Step: 1000
- Training Loss: 1.86
- Training Runtime: ~163 seconds
- Training Steps per Second: 6.135
- Training Samples per Second: 12.27
- Total FLOPs: 524.41 trillion
- Epochs: 2.0
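The figures above can be cross-checked against each other with a little arithmetic. The sketch below uses only the reported values; the effective batch size is inferred from them, not reported, and assumes that throughput is measured per optimizer step.

```python
# Values copied from the metrics listed above.
global_step = 1000
steps_per_second = 6.135
samples_per_second = 12.27
epochs = 2.0

# Runtime implied by the step count and step throughput.
runtime_seconds = global_step / steps_per_second

# Samples processed per optimizer step (inferred, not reported).
effective_batch_size = samples_per_second / steps_per_second

# Samples seen over the whole run, and per epoch.
total_samples = global_step * effective_batch_size
examples_per_epoch = total_samples / epochs

print(f"implied runtime:        {runtime_seconds:.0f} s")
print(f"effective batch size:   {effective_batch_size:.0f}")
print(f"examples per epoch:     {examples_per_epoch:.0f}")
```

The implied runtime matches the reported ~163 seconds, and the inferred 1,000 examples per epoch is consistent with the 1,000-example training subset described earlier.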
- Clone the repository:
git clone https://github.com/ns9920/LLMfinePEFT.git
- Navigate to the project directory:
cd LLMfinePEFT
- Load the trained model using the provided scripts or integrate it into your NLP pipeline.
- Fine-tune the model further on your own dataset and requirements, if necessary.
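For the loading step, a minimal sketch using the Hugging Face `transformers` question-answering pipeline might look like the following. It loads the base checkpoint named in this README, not this repository's own scripts, whose names are not documented here. The helper `answer_question` is hypothetical.

```python
BASE_CHECKPOINT = "deepset/bert-base-cased-squad2"  # base model named in this README


def answer_question(question, context, model_name=BASE_CHECKPOINT):
    """Run extractive QA and return the pipeline's result dict."""
    # Imported here so this module can be loaded without the heavy dependency.
    from transformers import pipeline

    qa = pipeline("question-answering", model=model_name)
    # The result is a dict with "answer", "score", "start", and "end" keys.
    return qa(question=question, context=context)
```

Calling `answer_question("Who created BERT?", "BERT was created by researchers at Google.")` downloads the checkpoint on first use and returns the predicted span. To try the weights fine-tuned here, pass their location as `model_name`, assuming they were saved in a format `transformers` can load directly; if only LoRA adapter weights were saved, they would first need to be attached to the base model with `peft`.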
Contributions are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request.
Special thanks to the creators of BERT, the SQuAD2 dataset, and peft.LoraConfig for their valuable contributions to the NLP community.