Droomjobs
Company Description
DeepSeek’s First-generation Reasoning Models

DeepSeek’s first-generation reasoning models achieve performance comparable to OpenAI-o1 across math, code, and reasoning tasks.
Models
DeepSeek-R1
Distilled models
The DeepSeek team has shown that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance than the reasoning patterns discovered through RL on small models.
Below are the models created by fine-tuning several dense models widely used in the research community on reasoning data generated by DeepSeek-R1. The evaluation results demonstrate that the distilled smaller dense models perform exceptionally well on benchmarks.
DeepSeek-R1-Distill-Qwen-1.5B
DeepSeek-R1-Distill-Qwen-7B
DeepSeek-R1-Distill-Llama-8B
DeepSeek-R1-Distill-Qwen-14B
DeepSeek-R1-Distill-Qwen-32B
DeepSeek-R1-Distill-Llama-70B
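
To make the distillation idea above more concrete, here is a minimal sketch of how reasoning traces from a larger model might be turned into fine-tuning examples for a smaller one. It is an illustration only, not DeepSeek's actual pipeline: the names TeacherSample, to_sft_record, and build_sft_dataset are hypothetical, and wrapping the reasoning in <think> delimiters is an assumed formatting convention.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class TeacherSample:
    """One reasoning trace produced by the larger (teacher) model. Hypothetical type."""
    prompt: str
    reasoning: str
    answer: str


def to_sft_record(sample: TeacherSample) -> Dict[str, str]:
    """Turn a teacher trace into a prompt/completion pair for supervised fine-tuning.

    The <think> ... </think> wrapper is an assumption made for this sketch.
    """
    completion = (
        f"<think>\n{sample.reasoning.strip()}\n</think>\n{sample.answer.strip()}"
    )
    return {"prompt": sample.prompt.strip(), "completion": completion}


def build_sft_dataset(samples: Iterable[TeacherSample]) -> List[Dict[str, str]]:
    """Keep only traces that contain both reasoning and a final answer."""
    return [
        to_sft_record(s)
        for s in samples
        if s.reasoning.strip() and s.answer.strip()
    ]


if __name__ == "__main__":
    traces = [
        TeacherSample("What is 2 + 3?", "Add 2 and 3 to get 5.", "5"),
        TeacherSample("What is 7 * 6?", "", "42"),  # dropped: no reasoning
    ]
    for record in build_sft_dataset(traces):
        print(record)
```

The smaller model would then be fine-tuned on these prompt/completion pairs with an ordinary supervised objective, so it learns to reproduce the larger model's reasoning style rather than discovering it through RL.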

License
The models are licensed under the MIT License. The DeepSeek-R1 series supports commercial use and permits any modifications and derivative works, including, but not limited to, distillation for training other LLMs.

