This repository contains example code demonstrating how to call the department's self-deployed DeepSeek R1 distillation models via API.
It provides sample code for interacting with locally deployed Large Language Models (LLMs) through both a Python script and a Jupyter notebook. The examples demonstrate several API interaction patterns, including standard requests, streaming, and multi-turn conversations.
The department has deployed the following models, which can be accessed through the API:

- DeepSeek-R1-Distill-Llama-70B (Port: 50000)
- DeepSeek-R1-Distill-Qwen-14B (Port: 50001)
- API key (register to obtain your key; see the Hackathon instructions)
- `openai` Python package
Note: If you want to use JavaScript, also check this link. If you want to use another language, also check the last cell of `example_LLM_API_call.ipynb` and this link.
- Clone this repository:

  ```bash
  git clone https://github.com/Vision-and-Multimodal-Intelligence-Lab/LLM-Local-Deployment.git
  cd LLM-Local-Deployment
  ```
- Create your Python environment (Conda) and install the required dependencies. If you do not plan to use the Jupyter notebook:

  ```bash
  conda create -n local_LLM python=3.12
  conda activate local_LLM
  pip install openai
  ```
Before using the examples, you need to set three key variables:

- `HOST`: the host address of the deployed models
- `PORT`: the port number for the specific model you want to access (`50000` for Llama-70B or `50001` for Qwen-14B)
- `API_KEY`: your personal API key for authentication
The repository provides examples using the OpenAI client library, which offers a convenient interface for interacting with the models.
The examples showcase several important features:
- Standard (Non-streaming) Completions: Get the full response at once
- Streaming Responses: Process the response as it's being generated
- Multi-turn Conversations: Maintain context over multiple exchanges
- Reasoning Content: Access the model's reasoning process separately from its final output
- Python Requests Alternative: Examples using the lower-level `requests` library
- `example_LLM_API_call.py`: a Python script demonstrating API interactions.
- `example_LLM_API_call.ipynb`: a Jupyter notebook with step-by-step instructions.
- Experiment with different system prompts to tailor the model's behavior
- Adjust `temperature` and `top_p` parameters to control response randomness/creativity
- For real-time applications, use streaming to improve the user experience
- The reasoning content can be valuable for debugging or educational applications
For issues or questions specific to the hackathon, please contact the event organizers.
See the LICENSE file for details.