

What is the cost of customizing DeepSeek-R1 models


Customizing DeepSeek-R1 models involves several cost components, particularly when deploying them on platforms like Amazon Web Services (AWS). Here's a detailed breakdown of the costs associated with customizing and deploying DeepSeek-R1 models:

1. Training Costs: The initial training cost of DeepSeek-R1 is significantly lower than that of comparable models such as ChatGPT o1. DeepSeek-R1's training run is estimated to have cost around $6 million, roughly 16x to 33x less than ChatGPT o1[3]. Note, however, that this figure covers the model's initial development and training, not customization.

2. Customization and Deployment on AWS: When customizing DeepSeek-R1 models using Amazon Bedrock's Custom Model Import feature, there are several costs to consider:
- Inference Costs: These are based on the number of active model copies and their duration of activity. The pricing per Custom Model Unit (CMU) varies depending on factors like architecture and region. For example, the price per CMU per minute is $0.0785[4].
- Storage Costs: There is a monthly storage cost per CMU, which is $1.95[4]. The number of CMUs required depends on the model's architecture and context length.

3. Example Cost Calculation: For a DeepSeek-R1-Distill-Llama-8B model requiring 2 CMUs, active for 1 hour per day, the daily inference cost is approximately $9.42 (2 CMUs × $0.0785/minute × 60 minutes). Over a 30-day month, this comes to $282.60 for inference plus $3.90 for storage, for a total estimated monthly cost of $286.50[4].
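The calculation above can be reproduced with a short script. This is a minimal sketch using the cited per-CMU rates; actual prices vary by region and model architecture, and the function name and structure here are illustrative, not part of any AWS API.

```python
# Illustrative estimate of Bedrock Custom Model Import costs,
# using the cited rates (assumptions; check current AWS pricing).
CMU_PRICE_PER_MINUTE = 0.0785   # USD per Custom Model Unit per minute
CMU_STORAGE_PER_MONTH = 1.95    # USD per CMU per month

def monthly_cost(num_cmus: int, active_hours_per_day: float, days: int = 30) -> dict:
    """Estimate monthly inference + storage cost for an imported model."""
    daily_inference = num_cmus * CMU_PRICE_PER_MINUTE * active_hours_per_day * 60
    inference = daily_inference * days
    storage = num_cmus * CMU_STORAGE_PER_MONTH
    return {
        "daily_inference": round(daily_inference, 2),
        "monthly_inference": round(inference, 2),
        "monthly_storage": round(storage, 2),
        "total": round(inference + storage, 2),
    }

# DeepSeek-R1-Distill-Llama-8B example: 2 CMUs, active 1 hour/day
print(monthly_cost(2, 1))
# → {'daily_inference': 9.42, 'monthly_inference': 282.6, 'monthly_storage': 3.9, 'total': 286.5}
```

Changing `active_hours_per_day` or `num_cmus` shows how quickly the inference component dominates the fixed storage charge.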

4. On-Demand Pricing for DeepSeek Models: On AWS, on-demand pricing for DeepSeek-R1 is $0.00135 per 1,000 input tokens and $0.0054 per 1,000 output tokens[8]. This pricing model suits applications that need flexible usage without long-term commitments.
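Under token-based pricing, cost scales directly with usage. A minimal sketch using the cited per-1,000-token rates (the workload figures in the example are illustrative assumptions):

```python
# On-demand token pricing for DeepSeek-R1 on AWS, per the cited
# rates (assumptions; rates may change).
INPUT_PRICE_PER_1K = 0.00135   # USD per 1,000 input tokens
OUTPUT_PRICE_PER_1K = 0.0054   # USD per 1,000 output tokens

def on_demand_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated on-demand cost in USD for one workload."""
    return (input_tokens / 1000) * INPUT_PRICE_PER_1K \
         + (output_tokens / 1000) * OUTPUT_PRICE_PER_1K

# Hypothetical workload: 1M input tokens, 250k output tokens
print(round(on_demand_cost(1_000_000, 250_000), 2))  # → 2.7
```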

In summary, customizing DeepSeek-R1 models involves costs related to model deployment and usage on platforms like AWS, with specific charges for inference and storage based on Custom Model Units. The initial training cost of the model is significantly lower than comparable models, making it a cost-efficient option for AI applications.
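The two AWS pricing models above can be compared directly for a given workload. This sketch uses only the cited rates; the monthly token volumes are illustrative assumptions, not benchmarks, and the crossover point depends entirely on actual traffic.

```python
# Comparing Bedrock Custom Model Import (provisioned CMUs) with
# on-demand token pricing, using the cited rates (assumptions).
CMU_PRICE_PER_MINUTE = 0.0785   # USD per CMU per minute
CMU_STORAGE_PER_MONTH = 1.95    # USD per CMU per month
INPUT_PRICE_PER_1K = 0.00135    # USD per 1,000 input tokens
OUTPUT_PRICE_PER_1K = 0.0054    # USD per 1,000 output tokens

def provisioned_monthly(num_cmus: int, active_hours_per_day: float, days: int = 30) -> float:
    """Monthly cost of a Custom Model Import deployment."""
    return num_cmus * (CMU_PRICE_PER_MINUTE * active_hours_per_day * 60 * days
                       + CMU_STORAGE_PER_MONTH)

def on_demand_monthly(input_tokens: int, output_tokens: int) -> float:
    """Monthly cost under per-token on-demand pricing."""
    return (input_tokens * INPUT_PRICE_PER_1K
            + output_tokens * OUTPUT_PRICE_PER_1K) / 1000

# Hypothetical: 5M input + 1M output tokens/month vs. a 2-CMU
# import active one hour per day.
cmi = provisioned_monthly(2, 1)                    # ≈ $286.50
tok = on_demand_monthly(5_000_000, 1_000_000)      # $6.75 + $5.40 = $12.15
print(f"Custom Model Import: ${cmi:.2f}/mo, on-demand: ${tok:.2f}/mo")
```

At low or bursty volumes, on-demand pricing is far cheaper; provisioned CMUs only pay off at sustained high throughput or when a customized (fine-tuned) model must be hosted, since on-demand rates apply to the stock model.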

Citations:
[1] https://api-docs.deepseek.com/quick_start/pricing
[2] https://www.together.ai/models/deepseek-r1
[3] https://www.creolestudios.com/deepseek-vs-chatgpt-cost-comparison/
[4] https://repost.aws/questions/QU-hcixrtFSaSoKH8GL-KogA/pricing-model-of-deepseek-r1-distilled-llama-models-with-amazon-bedrock-custom-model-import
[5] https://community.intel.com/t5/Blogs/Tech-Innovation/Artificial-Intelligence-AI/Discover-the-Power-of-DeepSeek-R1-A-Cost-Efficient-AI-Model/post/1665557
[6] https://team-gpt.com/blog/deepseek-pricing/
[7] https://www.reddit.com/r/MachineLearning/comments/1icfbll/d_deepseek_distillation_and_training_costs/
[8] https://aws.amazon.com/bedrock/pricing/