Setting up DeepSeek-R1 with FastAPI involves several steps, including configuring your environment, installing necessary tools, and integrating the model with FastAPI. Here's a detailed guide:
## Step 1: Set Up Your Development Environment
First, ensure you have Python 3.8 or later installed on your system. You will also need `pip`, Python's package manager, for installing dependencies.
1. **Create a Virtual Environment:**
To keep your project dependencies isolated, create a virtual environment. Open a terminal and run:
```bash
python -m venv ollama_deepseek_env
```
Activate the environment:
- On Windows:
```bash
ollama_deepseek_env\Scripts\activate
```
- On macOS/Linux:
```bash
source ollama_deepseek_env/bin/activate
```
2. **Install Required Packages:**
Install FastAPI, Uvicorn (the ASGI server used to run FastAPI), and Requests (used below to call Ollama's HTTP API); the `ollama` package is Ollama's optional Python client:
```bash
pip install fastapi uvicorn requests ollama
```
## Step 2: Install and Configure Ollama
Ollama is a tool that simplifies the process of downloading and serving large language models like DeepSeek-R1 locally.
1. **Download and Install Ollama:**
Visit the official Ollama website (https://ollama.com) and follow the platform-specific instructions to install it on your machine.
2. **Download the DeepSeek-R1 Model:**
Use Ollama to download the DeepSeek-R1 model. Open a terminal and run:
```bash
ollama run deepseek-r1
```
If your hardware cannot handle the full model, you can specify a smaller variant, such as the 7B-parameter version:
```bash
ollama run deepseek-r1:7b
```
3. **Serve DeepSeek-R1 Locally:**
To keep the model server running so your API can reach it, run:
```bash
ollama serve
```
By default, Ollama listens on `http://localhost:11434`.
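Before wiring up FastAPI, you can confirm that the server is reachable and the model has been pulled by querying Ollama's `/api/tags` endpoint, which lists the locally available models. A minimal check in Python (host, port, and model tag assume the defaults used above):
```python
import requests

# Ask the local Ollama server which models have been pulled.
resp = requests.get("http://localhost:11434/api/tags", timeout=5)
resp.raise_for_status()

models = [m["name"] for m in resp.json().get("models", [])]
print("Available models:", models)

# The DeepSeek-R1 tag should appear here (e.g. "deepseek-r1:latest" or "deepseek-r1:7b").
if not any(name.startswith("deepseek-r1") for name in models):
    print("deepseek-r1 not found -- run `ollama run deepseek-r1` first.")
```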
## Step 3: Integrate DeepSeek-R1 with FastAPI
Now, you'll create a FastAPI application that interacts with the locally served DeepSeek-R1 model.
1. **Create a FastAPI App:**
Create a new Python file (e.g., `main.py`) and import the necessary modules:
```python
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from pydantic import BaseModel
import requests
```
2. **Define the API Endpoint:**
Create an endpoint that forwards a prompt to DeepSeek-R1 and streams the response back. The prompt is accepted as a JSON body, matching the `curl` request used in Step 4:
```python
app = FastAPI()

class ChatRequest(BaseModel):
    prompt: str

@app.post("/api/chat")
async def handle_chat_data(request: ChatRequest):
    url = "http://localhost:11434/api/chat"
    data = {
        "model": "deepseek-r1",
        "messages": [{"role": "user", "content": request.prompt}],
        "stream": True,
    }
    # Forward the prompt to Ollama and relay its streamed chunks to the client.
    response = requests.post(url, json=data, stream=True)
    response.raise_for_status()

    def stream_response():
        for chunk in response.iter_content(chunk_size=1024):
            yield chunk

    return StreamingResponse(stream_response(), media_type="application/x-ndjson")
```
Ollama streams its reply as newline-delimited JSON objects, so the response is relayed with the `application/x-ndjson` media type. (An alternative that calls the `ollama` Python client instead of building the HTTP request by hand is sketched at the end of this step.)
3. **Run the FastAPI Application:**
Use Uvicorn to run your FastAPI app:
```bash
uvicorn main:app --reload
```
By default, the API is available at `http://localhost:8000`.
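As an aside, the `ollama` package installed in Step 1 is a Python client that wraps the same local API, so the endpoint can also be written without hand-building HTTP requests. Below is a minimal, non-streaming sketch; the file name `main_sync.py` and the `/api/chat-sync` route are illustrative, not part of the original setup:
```python
# main_sync.py -- hypothetical alternative using the ollama Python client
import ollama
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    prompt: str

@app.post("/api/chat-sync")
def handle_chat_sync(request: ChatRequest):
    # The client talks to the local Ollama server (http://localhost:11434 by default)
    # and returns the complete reply once generation has finished.
    result = ollama.chat(
        model="deepseek-r1",
        messages=[{"role": "user", "content": request.prompt}],
    )
    return {"response": result["message"]["content"]}
```
Run it the same way, e.g. `uvicorn main_sync:app --reload`.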
## Step 4: Test the Integration
1. **Send a Request:**
Use `curl` (or any HTTP client) to send a request to your FastAPI endpoint. For example:
```bash
curl -X POST -H "Content-Type: application/json" -d '{"prompt": "Explain the concept of AI"}' http://localhost:8000/api/chat
```
2. **Observe the Response:**
You should see DeepSeek-R1's reply streaming back to your client as newline-delimited JSON chunks.
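For a programmatic test, each streamed line is a JSON object in Ollama's chat format: `message.content` carries an incremental piece of text, and the final chunk sets `done` to `true`. The following client sketch assumes the FastAPI endpoint relays Ollama's newline-delimited JSON unchanged, as in the code above:
```python
import json
import requests

# Post a prompt to the FastAPI proxy and consume the streamed reply line by line.
with requests.post(
    "http://localhost:8000/api/chat",
    json={"prompt": "Explain the concept of AI"},
    stream=True,
) as response:
    response.raise_for_status()
    for line in response.iter_lines():
        if not line:
            continue
        chunk = json.loads(line)
        # Print each incremental piece of the assistant's message as it arrives.
        print(chunk.get("message", {}).get("content", ""), end="", flush=True)
        if chunk.get("done"):
            print()  # final chunk signals completion
            break
```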
## Conclusion
By following these steps, you've successfully integrated DeepSeek-R1 with FastAPI, allowing you to leverage the model's capabilities in a scalable and customizable web service. This setup provides privacy, low latency, and flexibility in managing your AI applications locally.