Parameter-Efficient Fine-Tuning (PEFT) with NeMo

In this example, we use NeMo's PEFT methods to show how to adapt a large language model (LLM) to a downstream task, such as financial sentiment prediction.

With a one-line configuration change, you can try different PEFT techniques such as p-tuning, adapters, or LoRA. Each adds a small number of trainable parameters to the LLM that condition the model to produce the desired output for the downstream task.
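To give a sense of what "a small number of trainable parameters" means, here is a minimal sketch of the parameter arithmetic for LoRA. LoRA freezes a weight matrix W of shape (d_out, d_in) and learns a low-rank update B @ A, with A of shape (rank, d_in) and B of shape (d_out, rank). The layer size and rank below are hypothetical values chosen for illustration; they are not taken from the model or configuration used in this example.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Count the parameters added by a rank-`rank` LoRA update.

    The frozen weight W has shape (d_out, d_in). LoRA learns two small
    matrices, A with shape (rank, d_in) and B with shape (d_out, rank),
    and uses W + B @ A in place of W.
    """
    return rank * d_in + d_out * rank


# Hypothetical sizes for illustration only (not the real model's dimensions).
d_in = d_out = 1024
rank = 8

added = lora_trainable_params(d_in, d_out, rank)  # trainable LoRA parameters
frozen = d_in * d_out                             # parameters of the frozen weight

print(f"trainable LoRA parameters: {added}")
print(f"frozen weight parameters:  {frozen}")
print(f"trainable fraction:        {added / frozen:.4f}")
```

For these assumed sizes, the adapter trains 16,384 parameters next to a frozen matrix of 1,048,576, which is why PEFT updates are cheap to train and to exchange between clients.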

For more details, see the PEFT script in NeMo, which we adapt using NVFlare's Lightning client API to run in a federated scenario.

Dependencies

The example was tested with the NeMo 23.10 container. In the following, we assume that this example folder is mounted at /workspace inside the container and that all downloads and other operations are relative to this root path.

Note that the command below mounts both the current directory and the job_templates directory to locations inside the Docker container, so make sure you have cloned the full NVFlare repo.

Start the Docker container from this directory using:

# cd NVFlare/integration/nemo/examples/peft
DOCKER_IMAGE="nvcr.io/nvidia/nemo:23.10"
docker run --runtime=nvidia -it --rm --shm-size=16g -p 8888:8888 -p 6006:6006 --ulimit memlock=-1 --ulimit stack=67108864 \
-v ${PWD}/../../../../job_templates:/job_templates -v ${PWD}:/workspace -w /workspace ${DOCKER_IMAGE}

Next, install NVFlare.

pip install nvflare~=2.5.0rc
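To confirm that the installation succeeded inside the container, a short check like the following may help. It is a sketch that relies only on the Python standard library; the helper name `installed_version` is hypothetical and not part of NVFlare.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report whether the nvflare distribution is visible to this interpreter.
nvflare_version = installed_version("nvflare")
if nvflare_version is None:
    print("nvflare is not installed in this environment")
else:
    print(f"nvflare {nvflare_version} is installed")
```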

Examples

1. Federated PEFT using a 345 million parameter GPT model

We use JupyterLab for this example. To start JupyterLab, run

jupyter lab .

and open peft.ipynb.

Hardware requirement

This example requires a GPU with at least 24 GB of memory to run three clients in parallel on the same GPU.
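Before launching the notebook, it can be useful to check how much memory each GPU reports. The sketch below shells out to `nvidia-smi` and assumes it is on the PATH, as it normally is in a GPU-enabled container. The helper names are hypothetical.

```python
import shutil
import subprocess
from typing import List


def parse_memory_mib(smi_output: str) -> List[int]:
    """Parse one integer (total memory in MiB) per non-empty output line."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]


def query_gpu_memory_mib() -> List[int]:
    """Return the total memory in MiB of each visible GPU.

    Returns an empty list when nvidia-smi is not available.
    """
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_memory_mib(result.stdout)
```

Calling `query_gpu_memory_mib()` returns one value per GPU, which you can compare against roughly 24 * 1024 MiB. Treat that threshold as a rough guide, since the total a driver reports can differ slightly from a card's nominal capacity.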