How can I customize the prompts for DeepSeek R1 in FastAPI


Customizing prompts for DeepSeek R1 in a FastAPI application involves several steps, including setting up the environment, crafting effective prompts, and integrating them into your FastAPI service. Here's a detailed guide on how to achieve this:

Setting Up the Environment

1. Install Required Packages: Ensure you have FastAPI, Pydantic, and any necessary AI client libraries installed. For example, if you're using Ollama for managing DeepSeek-R1 models locally, you'll need to install it along with FastAPI.

2. Configure DeepSeek-R1 Model: Use a library like Ollama to manage and serve DeepSeek-R1 models locally. This allows you to integrate the model into your FastAPI application without relying on external services.
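
For example, once the model has been pulled into Ollama (e.g. with `ollama pull deepseek-r1`), you can confirm it is reachable before wiring it into FastAPI. This is a minimal sketch; the model name and local URL are assumptions based on Ollama's defaults:

```python
# Quick connectivity check against a locally served DeepSeek-R1 model.
# Assumes Ollama is running on its default port and exposes an
# OpenAI-compatible API; the api_key value is ignored by Ollama.
from openai import OpenAI

client = OpenAI(api_key="ollama", base_url="http://localhost:11434/v1/")

reply = client.chat.completions.create(
    model="deepseek-r1",
    messages=[{"role": "user", "content": "Reply with OK if you can read this."}],
)
print(reply.choices[0].message.content)
```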

Crafting Effective Prompts

1. Understand Prompt Engineering: DeepSeek R1 benefits from clear, concise, and specific prompts. Avoid system prompts where possible and place all instructions directly in the user message, as DeepSeek R1 is optimized for zero-shot prompting[4][6].

2. Use Markdown Structuring: Organize your prompts with markdown to improve readability and clarity. This can help the model understand the structure of your request better[1] (see the prompt sketch after this list).

3. Temperature Settings: Experiment with different temperature settings to optimize the model's response. Lower temperatures can lead to more deterministic outputs, while higher temperatures introduce more randomness[1].

4. Chain-of-Thought Reasoning: Utilize DeepSeek R1's Chain-of-Thought (CoT) reasoning by breaking down complex tasks into step-by-step logical responses. This can enhance the model's ability to provide detailed and accurate answers[1].
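
Putting these ideas together, a prompt might look like the sketch below. The markdown section headings, the temperature value, and the local Ollama endpoint are illustrative assumptions rather than required settings:

```python
# Illustrative prompt for DeepSeek-R1: markdown-structured, no system message,
# with an explicit step-by-step instruction and a moderate temperature.
from openai import OpenAI

client = OpenAI(api_key="ollama", base_url="http://localhost:11434/v1/")

prompt = """## Task
Explain why a FastAPI endpoint might return HTTP 422 for a POST request.

## Instructions
- Reason through the possible causes step by step.
- Keep the final answer under three sentences.
"""

response = client.chat.completions.create(
    model="deepseek-r1",
    # All instructions live in the user message; no system prompt is used
    messages=[{"role": "user", "content": prompt}],
    temperature=0.6,  # lower values give more deterministic output
)
print(response.choices[0].message.content)
```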

Integrating Prompts into FastAPI

1. Define Streaming Endpoints: Create endpoints in your FastAPI application that can handle streaming responses. This allows you to receive partial outputs in real-time, which is beneficial for long-running tasks or interactive applications[5].

2. Convert User Input to Model Messages: Use a function to convert user input into the format expected by the DeepSeek R1 model. This typically involves creating a list of messages that the model can process[5], as shown in the sketch after this list.

3. Send Requests to the Model: Use the `client.chat.completions.create` method with the `stream=True` parameter to send requests to the model and receive streaming responses[5].

4. Handle Responses: Process the streaming responses in your FastAPI application. You can yield chunks of the response as they are received, allowing the client to receive updates in real-time[5].
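
Before adding the FastAPI wiring, the conversion and streaming steps can be tried in isolation. The sketch below assumes the same local Ollama endpoint as above and simply prints each delta to the console:

```python
# Standalone streaming sketch: convert user input to model messages,
# send the request with stream=True, and handle each chunk as it arrives.
from openai import OpenAI

client = OpenAI(api_key="ollama", base_url="http://localhost:11434/v1/")

user_input = "Summarize what FastAPI's StreamingResponse does."
messages = [{"role": "user", "content": user_input}]

stream = client.chat.completions.create(
    model="deepseek-r1",
    messages=messages,
    stream=True,
)

for chunk in stream:
    for choice in chunk.choices:
        if choice.delta.content:
            print(choice.delta.content, end="", flush=True)
```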

Example Code

Here's an example of how you might integrate DeepSeek R1 into a FastAPI application:

```python
from typing import List

from fastapi import FastAPI, Query
from fastapi.responses import StreamingResponse
from openai import OpenAI
from pydantic import BaseModel

app = FastAPI()

# OpenAI-compatible client pointed at a locally served DeepSeek-R1 model
# (here via Ollama's default endpoint; adjust the base_url for your setup)
client = OpenAI(api_key="your_api_key", base_url="http://localhost:11434/v1/")

class ClientMessage(BaseModel):
    role: str      # "system", "user", or "assistant"
    content: str

class ChatRequest(BaseModel):
    messages: List[ClientMessage]

def convert_to_openai_messages(messages: List[ClientMessage]) -> List[dict]:
    # Convert the request payload into the message format the model API expects
    return [{"role": m.role, "content": m.content} for m in messages]

def stream_text(messages: List[dict], protocol: str = 'data'):
    stream = client.chat.completions.create(
        messages=messages,
        model="deepseek-r1",
        stream=True,
    )

    # Yield each text delta as it arrives so the client receives partial output
    for chunk in stream:
        for choice in chunk.choices:
            if choice.finish_reason == "stop":
                return
            if choice.delta.content:
                yield choice.delta.content

@app.post("/api/chat")
async def handle_chat_data(request: ChatRequest, protocol: str = Query('data')):
    openai_messages = convert_to_openai_messages(request.messages)
    response = StreamingResponse(stream_text(openai_messages, protocol))
    response.headers['x-vercel-ai-data-stream'] = 'v1'
    return response
```

This setup allows you to customize prompts by modifying the `messages` variable and send them to DeepSeek R1 for processing. You can further refine your prompts based on the model's performance and your specific use case.
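
Assuming the application is running locally (for example with `uvicorn main:app --port 8000`), a client can consume the stream like this; the URL and payload shape mirror the example above:

```python
# Simple streaming client for the /api/chat endpoint defined above.
import requests

payload = {
    "messages": [
        {"role": "user", "content": "Explain FastAPI dependency injection in two sentences."}
    ]
}

with requests.post("http://localhost:8000/api/chat", json=payload, stream=True) as resp:
    resp.raise_for_status()
    for chunk in resp.iter_content(chunk_size=None, decode_unicode=True):
        print(chunk, end="", flush=True)
```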

Citations:
[1] https://www.youtube.com/watch?v=kRXfddrtrmM
[2] https://www.youtube.com/watch?v=mtFo6uG1VgY
[3] https://aws.amazon.com/blogs/machine-learning/optimize-reasoning-models-like-deepseek-with-prompt-optimization-on-amazon-bedrock/
[4] https://www.reddit.com/r/LocalLLaMA/comments/1ico38r/why_no_system_prompt_when_utilizing_the/
[5] https://vadim.blog/deepseek-r1-ollama-fastapi
[6] https://www.reddit.com/r/artificial/comments/1ijas54/how_to_prompt_the_deepseekr1_model/
[7] https://rasa.com/blog/self-improving-calm-prompts-using-deepseek-r1/
[8] https://builtin.com/artificial-intelligence/how-implement-deepseek-locally
[9] https://launchdarkly.com/blog/deepseek-ai-configs-get-started-python/
[10] https://apidog.com/blog/deepseek-prompts-coding/