Customizing prompts for DeepSeek R1 in a FastAPI application involves several steps: setting up the environment, crafting effective prompts, and integrating them into your service. Here's a detailed guide on how to achieve this:
Setting Up the Environment
1. Install Required Packages: Ensure you have FastAPI, Pydantic, and an OpenAI-compatible client library installed. If you're using Ollama to run DeepSeek-R1 models locally, install it as well.
2. Configure DeepSeek-R1 Model: Use a tool like Ollama to pull and serve DeepSeek-R1 locally. This lets you integrate the model into your FastAPI application without relying on external services; a quick connectivity check is sketched below.
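Before wiring anything into FastAPI, it helps to confirm the local model server is reachable. The sketch below assumes Ollama's default port (11434) and that you've already pulled the model (for example with `ollama pull deepseek-r1`):

```python
from openai import OpenAI

# Ollama exposes an OpenAI-compatible API; it ignores the API key,
# but the client requires one, so any placeholder value works.
client = OpenAI(api_key="ollama", base_url="http://localhost:11434/v1/")

# List the models the local server currently has available.
print([model.id for model in client.models.list().data])
```

If `deepseek-r1` appears in the output, the server is ready for the steps below.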
Crafting Effective Prompts
1. Understand Prompt Engineering: DeepSeek R1 benefits from clear, concise, and specific prompts. Avoid system prompts where possible and put all instructions directly in the user message, as DeepSeek R1 is optimized for zero-shot prompting[4][6].
2. Use Markdown Structuring: Organize your prompts with markdown to improve readability and clarity. This can help the model understand the structure of your request better[1].
3. Temperature Settings: Experiment with different temperature settings to optimize the model's responses. Lower temperatures produce more deterministic output, while higher temperatures introduce more randomness; DeepSeek's own guidance suggests staying in the 0.5-0.7 range to avoid repetitive or incoherent output[1].
4. Chain-of-Thought Reasoning: Take advantage of DeepSeek R1's Chain-of-Thought (CoT) reasoning by breaking complex tasks into step-by-step requests, which helps the model produce detailed and accurate answers[1]. A prompt-building sketch follows this list.
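To make these ideas concrete, here is a minimal sketch that combines markdown structure, a conservative temperature, and an explicit step-by-step cue. The `build_prompt` helper and its exact wording are illustrative, not part of any DeepSeek API:

```python
from openai import OpenAI

client = OpenAI(api_key="ollama", base_url="http://localhost:11434/v1/")

def build_prompt(task: str, context: str) -> str:
    # Markdown headings separate instructions from data, and the closing
    # line nudges the model toward explicit chain-of-thought reasoning.
    return (
        "## Task\n"
        f"{task}\n\n"
        "## Context\n"
        f"{context}\n\n"
        "## Instructions\n"
        "Think through the problem step by step, then give a concise final answer."
    )

response = client.chat.completions.create(
    model="deepseek-r1",
    # No system message: all instructions live in the user prompt.
    messages=[{"role": "user", "content": build_prompt(
        "Summarize the trade-offs of streaming vs. buffered responses.",
        "We are building a FastAPI chat endpoint backed by DeepSeek R1.",
    )}],
    temperature=0.6,  # lower values make the output more deterministic
)
print(response.choices[0].message.content)
```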
Integrating Prompts into FastAPI
1. Define Streaming Endpoints: Create endpoints in your FastAPI application that can handle streaming responses. This allows you to receive partial outputs in real-time, which is beneficial for long-running tasks or interactive applications[5].
2. Convert User Input to Model Messages: Use a function to convert user input into the format expected by the DeepSeek R1 model. This typically involves creating a list of messages that the model can process[5].
3. Send Requests to the Model: Use the `client.chat.completions.create` method with the `stream=True` parameter to send requests to the model and receive streaming responses[5].
4. Handle Responses: Process the streaming responses in your FastAPI application. You can yield chunks of the response as they are received, allowing the client to receive updates in real-time[5].
Example Code
Here's an example of how you might integrate DeepSeek R1 into a FastAPI application:
```python
import json
from typing import List

from fastapi import FastAPI, Query
from fastapi.responses import StreamingResponse
from openai import OpenAI
from pydantic import BaseModel

app = FastAPI()

# Client configured for DeepSeek R1 served locally via Ollama
# (Ollama ignores the API key, but the client requires one).
client = OpenAI(api_key="your_api_key", base_url="http://localhost:11434/v1/")

class ClientMessage(BaseModel):
    role: str
    content: str

class Request(BaseModel):
    messages: List[ClientMessage]

def convert_to_openai_messages(messages: List[ClientMessage]) -> List[dict]:
    # Convert the incoming Pydantic models into the plain dicts the client expects.
    return [{"role": m.role, "content": m.content} for m in messages]

def stream_text(messages: List[dict], protocol: str = "data"):
    prompt_tokens = completion_tokens = 0
    stream = client.chat.completions.create(
        messages=messages,
        model="deepseek-r1",
        stream=True,
        stream_options={"include_usage": True},
    )
    for chunk in stream:
        if chunk.usage is not None:
            # The final chunk carries token usage instead of content.
            prompt_tokens = chunk.usage.prompt_tokens
            completion_tokens = chunk.usage.completion_tokens
        for choice in chunk.choices:
            if choice.delta.content:
                # Forward each partial text chunk as soon as it arrives.
                yield f"0:{json.dumps(choice.delta.content)}\n"
    # Emit the data-stream completion record, including usage counts.
    finish = {
        "finishReason": "stop",
        "usage": {"promptTokens": prompt_tokens, "completionTokens": completion_tokens},
    }
    yield f"d:{json.dumps(finish)}\n"

@app.post("/api/chat")
async def handle_chat_data(request: Request, protocol: str = Query("data")):
    openai_messages = convert_to_openai_messages(request.messages)
    response = StreamingResponse(stream_text(openai_messages, protocol))
    response.headers["x-vercel-ai-data-stream"] = "v1"
    return response
```
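With the app saved as `main.py` and started via `uvicorn main:app`, you can exercise the endpoint and watch chunks arrive incrementally. This client-side sketch uses `httpx`; the payload shape matches the `Request` model above:

```python
import httpx

payload = {
    "messages": [
        {"role": "user", "content": "Explain FastAPI streaming in two sentences."}
    ]
}

# Stream the response line by line as the server yields chunks.
with httpx.stream("POST", "http://localhost:8000/api/chat",
                  json=payload, timeout=None) as response:
    for line in response.iter_lines():
        print(line)
```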
This setup lets you customize prompts by modifying the `messages` list before it is sent to DeepSeek R1 for processing; one approach is sketched below. You can refine your prompts further based on the model's performance and your specific use case.
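For instance, to apply the prompt-engineering guidance from earlier at the API boundary, you could rewrite the latest user message with a reusable instruction template before it reaches the model. The template text and the `apply_prompt_template` helper are illustrative assumptions, not part of any library:

```python
from typing import Dict, List

PROMPT_TEMPLATE = (
    "## Instructions\n"
    "Reason step by step and keep the final answer concise.\n\n"
    "## Question\n"
    "{question}"
)

def apply_prompt_template(messages: List[Dict[str, str]]) -> List[Dict[str, str]]:
    # Rewrite the most recent user message using the template; no system
    # message is added, in line with DeepSeek R1's zero-shot guidance.
    customized = list(messages)
    for i in range(len(customized) - 1, -1, -1):
        if customized[i]["role"] == "user":
            customized[i] = {
                "role": "user",
                "content": PROMPT_TEMPLATE.format(question=customized[i]["content"]),
            }
            break
    return customized
```

In `handle_chat_data`, you would apply this to `openai_messages` before passing them to `stream_text`.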
Citations:
[1] https://www.youtube.com/watch?v=kRXfddrtrmM
[2] https://www.youtube.com/watch?v=mtFo6uG1VgY
[3] https://aws.amazon.com/blogs/machine-learning/optimize-reasoning-models-like-deepseek-with-prompt-optimization-on-amazon-bedrock/
[4] https://www.reddit.com/r/LocalLLaMA/comments/1ico38r/why_no_system_prompt_when_utilizing_the/
[5] https://vadim.blog/deepseek-r1-ollama-fastapi
[6] https://www.reddit.com/r/artificial/comments/1ijas54/how_to_prompt_the_deepseekr1_model/
[7] https://rasa.com/blog/self-improving-calm-prompts-using-deepseek-r1/
[8] https://builtin.com/artificial-intelligence/how-implement-deepseek-locally
[9] https://launchdarkly.com/blog/deepseek-ai-configs-get-started-python/
[10] https://apidog.com/blog/deepseek-prompts-coding/