This API provides text generation and classification capabilities using a pre-trained large language model.
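To run the API, execute the following command. The CUDA_VISIBLE_DEVICES=1 prefix restricts the process to the GPU with index 1; change the value to use a different device: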
CUDA_VISIBLE_DEVICES=1 python llm-chat-llama3.py
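As a rough illustration of what the CUDA_VISIBLE_DEVICES prefix does, the sketch below reads the variable the same way any Python process could and lists the GPU identifiers it exposes. The helper name visible_gpu_ids is hypothetical and is not part of llm-chat-llama3.py; it only inspects the environment and does not touch CUDA itself.

```python
import os


def visible_gpu_ids(env=None):
    """Return the GPU identifiers listed in CUDA_VISIBLE_DEVICES.

    Hypothetical helper for illustration only. Returns None when the
    variable is unset, which means CUDA may use every GPU on the machine.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None
    # The value is a comma-separated list; drop empty entries and whitespace.
    return [part.strip() for part in raw.split(",") if part.strip()]


if __name__ == "__main__":
    # Simulates launching with CUDA_VISIBLE_DEVICES=1, as in the command above.
    print(visible_gpu_ids({"CUDA_VISIBLE_DEVICES": "1"}))  # prints ['1']
```

Note that inside the launched process the single visible GPU is renumbered, so physical GPU 1 appears to the program as device 0.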
The Swagger UI for this API is accessible at the following URL: