Overview

  • Sectors Construction / Facilities

Company Description

DeepSeek’s First-generation Reasoning Models

DeepSeek’s first-generation reasoning models achieve performance comparable to OpenAI-o1 across math, code, and reasoning tasks.

Models

DeepSeek-R1

Distilled models

The DeepSeek team has shown that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance than the reasoning patterns discovered through RL on small models.

Below are the models created by fine-tuning several dense models widely used in the research community on reasoning data generated by DeepSeek-R1. The evaluation results demonstrate that the distilled smaller dense models perform exceptionally well on benchmarks.

DeepSeek-R1-Distill-Qwen-1.5B

DeepSeek-R1-Distill-Qwen-7B

DeepSeek-R1-Distill-Llama-8B

DeepSeek-R1-Distill-Qwen-14B

DeepSeek-R1-Distill-Qwen-32B

DeepSeek-R1-Distill-Llama-70B
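
The distillation idea described above, where a larger model generates reasoning data and a smaller dense model is fine-tuned on it, can be illustrated with a minimal sketch of the data-preparation step. Everything here is an assumption made for illustration: `toy_teacher` is a hypothetical stand-in for a large reasoning model, and the `SFTExample` record and the "Reasoning: ... Answer: ..." layout are not taken from DeepSeek's actual pipeline.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SFTExample:
    """One supervised fine-tuning record: a prompt and the target text."""
    prompt: str
    completion: str


def toy_teacher(question: str) -> str:
    # Hypothetical stand-in for a large reasoning model. It only handles
    # questions of the form "a+b" and returns a reasoning trace plus answer.
    a, b = (int(term) for term in question.split("+"))
    return f"Reasoning: {a} plus {b} equals {a + b}. Answer: {a + b}"


def build_distillation_set(
    questions: List[str], teacher: Callable[[str], str]
) -> List[SFTExample]:
    # The teacher's full output (reasoning and answer) becomes the training
    # target, so the student learns to reproduce the reasoning pattern.
    return [SFTExample(prompt=q, completion=teacher(q)) for q in questions]


dataset = build_distillation_set(["2+3", "10+5"], toy_teacher)
for example in dataset:
    print(example.prompt, "->", example.completion)
```

In a real setting the resulting records would be passed to a standard supervised fine-tuning loop for the smaller model; that step is omitted here because the source does not describe it.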

License

The model weights are licensed under the MIT License. The DeepSeek-R1 series supports commercial use and allows any modifications and derivative works, including, but not limited to, distillation for training other LLMs.