
LLM-Local-Deployment

This repository contains example code demonstrating how to use the department's self-deployed DeepSeek R1 distilled models via API.

Overview

The repository provides sample code for interacting with locally deployed Large Language Models (LLMs), available as both a Python script and a Jupyter Notebook. The examples demonstrate several API interaction patterns, including standard requests, streaming, and multi-turn conversations.

Available Models

The department has deployed the following models, which can be accessed through the API:

  • DeepSeek-R1-Distill-Llama-70B (Port: 50000)
  • DeepSeek-R1-Distill-Qwen-14B (Port: 50001)

Getting Started

Prerequisites

  • API key (register to obtain your key, see Hackathon instructions)
  • openai Python package

Note

If you want to use JavaScript, also check this link.
If you want to use another language, also check the last cell of example_LLM_API_call.ipynb and this link.

Installation

  1. Clone this repository:

    git clone https://github.com/Vision-and-Multimodal-Intelligence-Lab/LLM-Local-Deployment.git
    cd LLM-Local-Deployment
  2. Create a Python environment (Conda) and install the required dependencies. The following is sufficient if you do not plan to use Jupyter Notebook:

    conda create -n local_LLM python=3.12
    conda activate local_LLM
    pip install openai

Configuration

Before using the examples, you need to set three key variables (a setup sketch follows the list):

  • HOST: The host address of the deployed models
  • PORT: The port number for the specific model you want to access (50000 for Llama-70B or 50001 for Qwen-14B)
  • API_KEY: Your personal API key for authentication
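
As a minimal sketch, assuming an OpenAI-compatible server: the host value below is a placeholder, and the /v1 base path may differ on your deployment.

    from openai import OpenAI

    HOST = "llm.example.edu"   # placeholder: use the host address you were given
    PORT = 50000               # 50000 = Llama-70B, 50001 = Qwen-14B
    API_KEY = "your-api-key"   # your personal key from registration

    # Assumes the server exposes the usual OpenAI-compatible /v1 routes
    client = OpenAI(base_url=f"http://{HOST}:{PORT}/v1", api_key=API_KEY)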

Usage Examples

The repository provides examples using the OpenAI client library, which offers a convenient interface for interacting with the models.
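
For example, a minimal non-streaming call using the client above might look like the following sketch; the model name here is an assumption, so check what your server actually reports (e.g. via client.models.list()):

    # Minimal non-streaming request; the model name is an assumption,
    # list the server's models with client.models.list() to confirm it.
    response = client.chat.completions.create(
        model="DeepSeek-R1-Distill-Llama-70B",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Explain what a distilled model is."},
        ],
    )
    print(response.choices[0].message.content)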

Key Features Demonstrated

The examples showcase several important features (rough sketches of some of them follow the list):

  1. Standard (Non-streaming) Completions: Get the full response at once
  2. Streaming Responses: Process the response as it's being generated
  3. Multi-turn Conversations: Maintain context over multiple exchanges
  4. Reasoning Content: Access the model's reasoning process separately from its final output
  5. Python Requests Alternative: Examples using the lower-level requests library
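
Below is a rough sketch of the streaming, multi-turn, and reasoning-content patterns, reusing the client and assumed model name from above. DeepSeek-R1-style servers typically expose the chain of thought as a separate reasoning_content field, but the exact attribute name may differ on your deployment.

    # Stream one turn of a conversation; assumes `client` from Configuration.
    messages = [{"role": "user", "content": "What is 17 * 24?"}]
    stream = client.chat.completions.create(
        model="DeepSeek-R1-Distill-Llama-70B",  # assumed name
        messages=messages,
        stream=True,
    )

    answer_parts = []
    for chunk in stream:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta
        # Reasoning is often streamed separately from the final answer;
        # the `reasoning_content` attribute name is an assumption here.
        reasoning = getattr(delta, "reasoning_content", None)
        if reasoning:
            print(reasoning, end="", flush=True)       # the model's reasoning
        if delta.content:
            answer_parts.append(delta.content)
            print(delta.content, end="", flush=True)   # the final answer

    # Multi-turn: append the assistant's reply, then ask a follow-up.
    messages.append({"role": "assistant", "content": "".join(answer_parts)})
    messages.append({"role": "user", "content": "Now divide that by 6."})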
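
And a sketch of the lower-level requests alternative, assuming the server exposes the standard OpenAI-compatible /v1/chat/completions route with bearer-token authentication (HOST, PORT, and API_KEY as set in Configuration):

    import requests

    # Assumes an OpenAI-compatible HTTP endpoint; adjust the path if yours differs.
    resp = requests.post(
        f"http://{HOST}:{PORT}/v1/chat/completions",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "model": "DeepSeek-R1-Distill-Llama-70B",  # assumed name
            "messages": [{"role": "user", "content": "Hello!"}],
        },
        timeout=60,
    )
    resp.raise_for_status()
    print(resp.json()["choices"][0]["message"]["content"])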

Example Files

The examples are provided both as a Jupyter Notebook (example_LLM_API_call.ipynb) and as a Python script.

Tips

  • Experiment with different system prompts to tailor the model's behavior
  • Adjust the temperature and top_p parameters to control response randomness/creativity (see the sketch after this list)
  • For real-time applications, use streaming to improve user experience
  • The reasoning content can be valuable for debugging or educational applications
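
For instance, these sampling parameters are passed directly to the completion call; the values below are purely illustrative:

    # Lower temperature -> more deterministic; higher -> more creative.
    response = client.chat.completions.create(
        model="DeepSeek-R1-Distill-Llama-70B",  # assumed name
        messages=[{"role": "user", "content": "Write a haiku about GPUs."}],
        temperature=0.7,   # illustrative value
        top_p=0.9,         # illustrative value
    )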

Support

For issues or questions specific to the hackathon, please contact the event organizers.

License

See the LICENSE file for details.
