
Droomjobs
Sectors Construction / Facilities
Company Description
DeepSeek’s First-generation Reasoning Models
DeepSeek’s first-generation reasoning models achieve performance comparable to OpenAI-o1 across math, code, and reasoning tasks.
Models
DeepSeek-R1
Distilled models
The DeepSeek team has shown that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance than the reasoning patterns discovered through RL on small models.
Below are the models created by fine-tuning several dense models widely used in the research community on reasoning data generated by DeepSeek-R1. The evaluation results demonstrate that the distilled smaller dense models perform exceptionally well on benchmarks.
DeepSeek-R1-Distill-Qwen-1.5B
DeepSeek-R1-Distill-Qwen-7B
DeepSeek-R1-Distill-Llama-8B
DeepSeek-R1-Distill-Qwen-14B
DeepSeek-R1-Distill-Qwen-32B
DeepSeek-R1-Distill-Llama-70B
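
The distillation described above amounts to supervised fine-tuning of a smaller model on outputs generated by a larger one. The sketch below shows only the data-preparation step under simplifying assumptions: `SftExample`, `build_distillation_set`, and `toy_teacher` are hypothetical names invented for illustration, the teacher is a hard-coded stub rather than a real model, and the rule of skipping samples with an empty answer is an assumption, not DeepSeek's documented pipeline.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class SftExample:
    prompt: str
    completion: str  # teacher reasoning followed by the final answer


def build_distillation_set(
    prompts: Iterable[str],
    teacher: Callable[[str], tuple[str, str]],
) -> list[SftExample]:
    """Collect teacher outputs as supervised fine-tuning targets for a student."""
    examples = []
    for prompt in prompts:
        reasoning, answer = teacher(prompt)
        if not answer.strip():
            continue  # assumed filter: skip samples with no final answer
        examples.append(SftExample(prompt, f"{reasoning}\nAnswer: {answer}"))
    return examples


def toy_teacher(prompt: str) -> tuple[str, str]:
    """Hypothetical stand-in for a large reasoning model; not a real API."""
    if prompt == "What is 2 + 3?":
        return ("Add the two numbers: 2 + 3 = 5.", "5")
    return ("", "")


dataset = build_distillation_set(["What is 2 + 3?", "Unknown question"], toy_teacher)
```

In a real setup the resulting prompt and completion pairs would be passed to a fine-tuning routine for the smaller dense model; that step is omitted here.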
License
The model weights are licensed under the MIT License. The DeepSeek-R1 series supports commercial use and permits any modifications and derivative works, including, but not limited to, distillation for training other LLMs.