🚀 Introducing DeepSeek-V3.2-Exp — our latest experimental model! ✨ Built on V3.1-Terminus, it debuts DeepSeek Sparse Attention (DSA) for faster, more efficient training & inference on long contexts. 👉 Now live on App, Web, and API. 💰 API prices cut by 50%+!
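The post doesn't spell out how DSA selects tokens, so as a rough illustration of the general idea behind sparse attention, here is a minimal top-k sketch in PyTorch. The function name and the fixed top-k selection rule are assumptions for illustration, not DSA's actual mechanism:

```python
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, keep=64):
    """Toy sparse attention: each query attends only to its top-`keep` keys.

    q, k, v: (batch, heads, seq_len, head_dim). Illustrative only; a real
    sparse kernel would never materialize the dense score matrix.
    """
    scores = (q @ k.transpose(-2, -1)) / q.shape[-1] ** 0.5
    keep = min(keep, scores.shape[-1])
    top = scores.topk(keep, dim=-1)
    sparse = torch.full_like(scores, float("-inf"))
    sparse.scatter_(-1, top.indices, top.values)  # mask out all but top-k scores
    return F.softmax(sparse, dim=-1) @ v

out = topk_sparse_attention(torch.randn(1, 8, 1024, 64),
                            torch.randn(1, 8, 1024, 64),
                            torch.randn(1, 8, 1024, 64))
```

The point of the sketch is only that each query attends to a small subset of keys, which is where the long-context speedups come from.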
About us
DeepSeek (深度求索), founded in 2023, is a Chinese company dedicated to making AGI a reality. Unravel the mystery of AGI with curiosity. Answer the essential question with long-termism. 🐋
- Website: https://www.deepseek.com
- Industry: Technology, Information and Internet
- Company size: 51-200 employees
- Headquarters: Hangzhou
- Type: Privately held
- Founded: 2023

Locations
- Primary: Hangzhou, CN
DeepSeek AI employees
- Karl Zhao, PhD
  AI & Robotics Innovation Leader | Bridging Deep Learning & Real World Applications | Strategic Vision for AI, Robots & Human Work Together | Speaker…
- Gredi Nikollaj (DSN) M.Sc. / M.A - Mr. Crypto 🇦🇱🇩🇪
  President & CEO, DeepSeek AG Sylt
- Claudio Cotar
  SISCOT.SITE Hosting Web - E-commerce - Mail Marketing - Tech Coaching - CM
- Charles Xun
  AGI Researcher
Posts
- Introducing DeepSeek-V3.1: our first step toward the agent era!
  🧠 Hybrid inference: Think & Non-Think — one model, two modes
  ⚡️ Faster thinking: DeepSeek-V3.1-Think reaches answers in less time vs. DeepSeek-R1-0528
  🔨 Stronger agent skills: post-training boosts tool use and multi-step agent tasks
  Try it now — toggle Think/Non-Think via the "DeepThink" button: https://chat.deepseek.com
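For API users, the two modes map to two model endpoints. A minimal sketch assuming DeepSeek's OpenAI-compatible API at https://api.deepseek.com, with the documented deepseek-chat (Non-Think) and deepseek-reasoner (Think) model names:

```python
from openai import OpenAI

# DeepSeek's API is OpenAI-compatible; only base_url and model names differ.
client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.deepseek.com")

# Non-Think mode: answers directly.
fast = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Summarize pipeline parallelism in one sentence."}],
)

# Think mode: reasons step by step before answering.
deep = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
)

print(fast.choices[0].message.content)
print(deep.choices[0].message.content)
```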
- 🚀 DeepSeek-R1-0528 is here!
  🔹 Improved benchmark performance
  🔹 Enhanced front-end capabilities
  🔹 Reduced hallucinations
  🔹 Supports JSON output & function calling
  ✅ Try it now: https://chat.deepseek.com
  🔌 No change to API usage — docs here: https://lnkd.in/ge9wJdVV
  🔗 Open-source weights: https://lnkd.in/gM9uKSeQ
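Function calling here follows the OpenAI tool-calling convention. A sketch under that assumption; get_weather is a hypothetical tool for illustration, not part of any DeepSeek API:

```python
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.deepseek.com")

# Hypothetical tool definition, for illustration only.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "What's the weather in Hangzhou?"}],
    tools=tools,
)
print(resp.choices[0].message.tool_calls)  # the model may emit a get_weather call
```

For strict JSON output, the same convention passes response_format={"type": "json_object"} and states the expected keys in the prompt; see the linked docs for the authoritative details.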
- 🚀 DeepSeek-V3-0324 is out now!
  - Major boost in reasoning performance
  - Stronger front-end development skills
  - Smarter tool-use capabilities
  ✅ For non-complex reasoning tasks, we recommend using V3 — just turn off “DeepThink”
  🧲 API usage remains unchanged
  📚 Models are now released under the MIT License, just like DeepSeek-R1!
  🔗 Open-source weights: https://lnkd.in/gTT5_fhW
- 🚀 Day 6 of #OpenSourceWeek: One More Thing – DeepSeek-V3/R1 Inference System Overview
  Optimized throughput and latency via:
  🧲 Cross-node EP-powered batch scaling
  🔀 Computation-communication overlap
  ⚖️ Load balancing
  Statistics of DeepSeek's Online Service:
  ⚡ 73.7k/14.8k input/output tokens per second per H800 node
  📊 Cost profit margin 545%
  💡 We hope this week's insights offer value to the community and contribute to our shared AGI goals.
  📚 Deep Dive: https://bit.ly/4ihZUiO
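Two back-of-envelope checks on the quoted numbers. Both rest on assumptions not in the post: 8 GPUs per H800 node, and reading the 545% figure as (revenue − cost) / cost:

```python
# Per-GPU throughput implied by the per-node figures (8 GPUs/node is an assumption).
input_tps, output_tps, gpus_per_node = 73_700, 14_800, 8
print(f"input:  {input_tps / gpus_per_node:,.0f} tok/s per GPU")   # ~9,212
print(f"output: {output_tps / gpus_per_node:,.0f} tok/s per GPU")  # 1,850

# A 545% cost profit margin means revenue is about 6.45x serving cost.
margin = 5.45
print(f"revenue/cost ratio: {1 + margin:.2f}x")
```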
- 🚀 Day 5 of #OpenSourceWeek: 3FS, Thruster for All DeepSeek Data Access
  Fire-Flyer File System (3FS) - a parallel file system that utilizes the full bandwidth of modern SSDs and RDMA networks.
  ⚡ 6.6 TiB/s aggregate read throughput in a 180-node cluster
  ⚡ 3.66 TiB/min throughput on the GraySort benchmark in a 25-node cluster
  ⚡ 40+ GiB/s peak throughput per client node for KVCache lookup
  📍 Disaggregated architecture with strong consistency semantics
  ✅ Used for training-data preprocessing, dataset loading, checkpoint saving/reloading, embedding vector search & KVCache lookups for inference in V3/R1
  🔗 3FS → https://lnkd.in/e-VgDCQ8
  🔗 Smallpond - data processing framework on 3FS → https://lnkd.in/ggsC8Ye5
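For scale, the aggregates imply roughly these per-node rates; a quick sanity check using nothing beyond the quoted numbers:

```python
# Per-node rates implied by the quoted cluster aggregates.
read_tib_s, read_nodes = 6.6, 180
print(f"~{read_tib_s * 1024 / read_nodes:.1f} GiB/s sustained read per node")   # ~37.5

sort_tib_min, sort_nodes = 3.66, 25
print(f"~{sort_tib_min * 1024 / sort_nodes:.0f} GiB/min per node on GraySort")  # ~150
```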
- Day 4 of #OpenSourceWeek: Optimized Parallelism Strategies
  ✅ DualPipe - a bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.
  🔗 https://lnkd.in/gurXyTVe
  ✅ EPLB - an expert-parallel load balancer for V3/R1.
  🔗 https://lnkd.in/giPt92JG
  📊 Analyze computation-communication overlap in V3/R1.
  🔗 https://lnkd.in/gubSQfMP
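As a toy stand-in for the core idea of an expert-parallel load balancer, here is a greedy longest-processing-time assignment in Python. The function and the sample loads are illustrative, not EPLB's actual algorithm, which among other things also replicates hot experts:

```python
import heapq

def balance_experts(expert_loads, num_gpus):
    """Greedy LPT: place the heaviest remaining expert on the least-loaded GPU."""
    heap = [(0.0, gpu, []) for gpu in range(num_gpus)]
    heapq.heapify(heap)
    # Heaviest experts first, so the large loads can still be evened out.
    for expert, load in sorted(enumerate(expert_loads), key=lambda x: -x[1]):
        total, gpu, assigned = heapq.heappop(heap)
        assigned.append(expert)
        heapq.heappush(heap, (total + load, gpu, assigned))
    return sorted(heap, key=lambda x: x[1])

for total, gpu, experts in balance_experts([9, 7, 6, 5, 4, 3, 2, 1], num_gpus=3):
    print(f"GPU {gpu}: experts {experts}, total load {total}")
```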
- 🚀 Day 3 of #OpenSourceWeek: DeepGEMM
  Introducing DeepGEMM - an FP8 GEMM library that supports both dense and MoE GEMMs, powering V3/R1 training and inference.
  ✅ Up to 1350+ FP8 TFLOPS on Hopper GPUs
  ✅ No heavy dependencies, as clean as a tutorial
  ✅ Fully Just-In-Time compiled
  ✅ Core logic at ~300 lines - yet outperforms expert-tuned kernels across most matrix sizes
  ✅ Supports dense layout and two MoE layouts
  🔗 GitHub: https://lnkd.in/d2S4Sxww
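FP8 GEMMs hinge on scaling inputs into FP8's narrow dynamic range. Below is a conceptual sketch of per-tensor e4m3 quantization in PyTorch (float8_e4m3fn needs PyTorch ≥ 2.1); it emulates the scaling idea only and is not DeepGEMM's API, which runs the matmul in JIT-compiled Hopper kernels:

```python
import torch

E4M3_MAX = 448.0  # largest normal value representable in float8 e4m3

def quantize_e4m3(x):
    """Per-tensor FP8 quantization: scale into range, cast, keep the scale."""
    scale = x.abs().max().clamp(min=1e-12) / E4M3_MAX
    return (x / scale).to(torch.float8_e4m3fn), scale

a, b = torch.randn(256, 512), torch.randn(512, 128)
a8, sa = quantize_e4m3(a)
b8, sb = quantize_e4m3(b)

# Emulate a scaled FP8 GEMM: dequantize and accumulate in FP32.
c = (a8.float() * sa) @ (b8.float() * sb)
print("max abs error vs FP32:", (c - a @ b).abs().max().item())
```

Production FP8 kernels generally use finer-grained (per-tile or per-block) scales than this per-tensor sketch to preserve accuracy across outliers.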