I. Introduction to DeepSeek
Among China's seven large-model startups, DeepSeek is the quietest, yet it keeps attracting attention. A year ago it was known for the tens of thousands of A100 chips stockpiled by its backer, High-Flyer (Huan Fang). A year later it became famous for releasing DeepSeek V2, the open-source model that set off a price war. The model's low inference cost earned it the nickname "the Pinduoduo of the AI industry" and pushed the major players to cut prices. Unlike the big tech companies, which burn money on subsidies, DeepSeek is profitable. That profitability stems from its innovations in model architecture: it proposed the MLA architecture to reduce memory usage and created the DeepSeekMoESparse structure to reduce computation.
II. Interviews Related to the Price War
The cause of the price war
Liang Wenfeng, the founder of DeepSeek, said he never intended to trigger a price war. The company simply worked at its own pace, calculated its costs, and set a price on the principle of neither losing money nor making excessive profits. Zhipu AI followed, and then ByteDance became the first big tech company to cut the price of its flagship model to DeepSeek's level. That pushed the other big tech companies to cut prices as well, even though their model costs are higher.
Liang Wenfeng said that grabbing users was not the main purpose. There were two reasons for the price cut: first, exploring the next-generation model structure had lowered costs; second, the company believes that APIs and AI should be affordable and accessible to everyone.
The reason for model structure innovation
The goal is AGI, which requires research into new model structures that achieve stronger capabilities with limited resources. The Llama structure lags behind leading foreign models in training efficiency and inference cost. More broadly, domestic models trail the best foreign models in model structure, training dynamics, and data efficiency, and that gap needs to be narrowed.
The reason for focusing only on research and exploration
DeepSeek believes that now is the time to take part in the global wave of innovation, to break the habit of Chinese companies doing only application-level innovation, and to move to the forefront of technology in order to push the whole ecosystem forward.
III. Discussion on Innovation
The significance and challenges of innovation
DeepSeek V2 surprised Silicon Valley because a Chinese company had joined the field as an innovator. Innovation is expensive, but what China lacks is not capital. It lacks confidence and the ability to organize talent. Chinese companies have been accustomed to borrowing existing technology and have neglected innovation, which requires curiosity and creativity.
On the risk of its technology being copied, Liang Wenfeng believes that the real moat is the growth of the team and the formation of an innovative culture. Open source, in his view, is a cultural act that gives a company its own appeal.
Comparison with market views
Zhu Xiaohu's market-first view suits companies that want to make money quickly. Yet many of the most profitable companies in the United States are high-tech companies built on long accumulation. The gap between Chinese and American AI is the gap between imitation and originality, and someone needs to stand at the forefront of technology to push the ecosystem forward.
The choice of open source
DeepSeek will not go closed source. It believes that building a strong technical ecosystem comes first. It has no short-term financing plans, and the problem it faces is the embargo on high-end chips.
IV. Company Development Strategy
The reason for not doing applications
The industry is currently in a period of technological innovation. In the long run, DeepSeek hopes that others will build on its technology so that a full upstream and downstream industry forms around it. It can build applications if necessary, but research and innovation remain the top priority.
Views on competition with big tech companies
Liang Wenfeng does not worry much about competing with big tech companies. They have users, but they also carry baggage, and DeepSeek's goal is AGI. He believes that two or three large-model startups may survive, and that those with a clear sense of their own positioning and more disciplined operations have the better chance.
Innovative organizational structure and talent
DeepSeek V2 was built by young people, including fresh graduates from domestic universities. The MLA innovation grew out of the personal interest of a young researcher, and the path from idea to implementation was long. The company is managed bottom-up: there is no division of labor fixed in advance, anyone can freely call on GPUs and colleagues as needed, and the hiring criteria are passion and curiosity.
V. Views on the Industry
The realization and roadmap of AGI
Liang Wenfeng believes that AGI may be achieved in 2, 5, or 10 years. The company is betting on three directions: mathematics and code, multimodality, and natural language itself.
Views on the endgame of big models
There will be specialized companies that provide foundation models and foundation services, with a professional division of labor along the chain, and many more people will build on top of them to meet society's needs.
Views on industry changes
Liang Wenfeng admires Wang Huiwen's choice, while his own focus remains on researching the next generation of large models. He believes that discussing AI profit models with the logic of the Internet business is like marking the side of a boat to find a sword dropped overboard: the frame no longer fits the situation. He is optimistic about hard-core innovation and believes that, as the economy adjusts, it will rely increasingly on hard-core technological innovation.