Ascend Boosts DeepSeek Growth
The rapid rise of DeepSeek, an advanced AI model, has caused significant upheaval in the industry, leading to what many describe as a "server busy" epidemic: users attempting to access the platform are often met with the frustrating message to "please try again later." The demand underscores DeepSeek's startling success, with daily active users leaping from 347,000 to an astonishing 119 million within just one month, fueled by innovative algorithmic enhancements and an open-source strategy that promotes its adoption in niche markets.
As DeepSeek ignites discussions about computational power, the pressure on tech companies to keep pace has intensified. Prominent players in the field, ranging from Ascend and TianShu Zhixin to more recent entrants like Moole and Huaran Technology, have all announced compatibility with DeepSeek, indicating broad industry recognition of the urgent need to optimize hardware to support sophisticated AI operations. However, experts caution that merely achieving compatibility is a preliminary step. Fully harnessing DeepSeek's algorithms requires substantial investment in areas such as mixed-precision training (for example, FP8), balancing energy consumption across multiple scenarios, and deep collaborative optimization between software and hardware.
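To make the FP8 point concrete, here is a minimal NumPy sketch of the idea behind low-precision matrix multiplication with per-tensor scaling: operands are rounded to an E4M3-like grid and the product is accumulated in float32. The helper names and rounding scheme are illustrative assumptions, not DeepSeek's or any vendor's actual FP8 kernels, which run in hardware.

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest magnitude representable in E4M3


def round_to_e4m3(x: np.ndarray) -> np.ndarray:
    """Round float32 values to an E4M3-like grid (coarse mantissa, clipped range).

    This is a software approximation for illustration only; real FP8 casts are
    done by hardware and handle subnormals and rounding modes properly.
    """
    m, e = np.frexp(x)              # x = m * 2**e, with |m| in [0.5, 1)
    m = np.round(m * 16) / 16       # keep roughly 3 mantissa bits
    return np.clip(np.ldexp(m, e), -FP8_E4M3_MAX, FP8_E4M3_MAX)


def fp8_matmul(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Scale operands into the FP8 range, quantize, accumulate back in float32."""
    sa = FP8_E4M3_MAX / max(np.abs(a).max(), 1e-12)
    sb = FP8_E4M3_MAX / max(np.abs(b).max(), 1e-12)
    qa = round_to_e4m3(a * sa)
    qb = round_to_e4m3(b * sb)
    return (qa @ qb) / (sa * sb)    # accumulation kept in higher precision


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a, b = rng.standard_normal((64, 128)), rng.standard_normal((128, 32))
    err = np.abs(fp8_matmul(a, b) - a @ b).mean()
    print(f"mean abs error vs. float32 matmul: {err:.4f}")
```

The point of the per-tensor scale factors is that most of the accuracy loss comes from dynamic range, not from the matrix multiply itself, which is why mixed-precision schemes keep accumulation in a wider format.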
The emergence of DeepSeek has stimulated a bifurcation in computational power, which experts describe as a dual trajectory of technological sophistication and engineering innovation. As a result, demand for computational resources is expected to grow even further. Industry leaders are doubling down on investment in pre-trained foundational models, aiming to stay aligned with the Scaling Law while simultaneously pursuing the lofty goal of artificial general intelligence (AGI). They are prioritizing the development of efficient, stable, and open infrastructure, along with robust AI clusters and expansive ecosystems.
To illustrate this trend, Meta has raised its AI investment from $40 billion to $65 billion, while Google has increased its own from $52.5 billion to $75 billion.
Furthermore, model iteration and technological upgrades are accelerating, as evidenced by the release of Qwen 2.5-Max by Alibaba's Qianwen team and the Gemini 2.0 series by Google.
On the engineering front, new paradigms have emerged, lowering the entry barriers for post-training and distillation processes, leading to what some refer to as a resurgence of "a hundred models, a thousand variations." Companies are now focusing on user-friendly, affordable platforms that balance cost and performance in their distillation and fine-tuning approaches, as well as emphasizing quick deployment and agile business rollouts.
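As a rough illustration of the distillation step such platforms package up, the following PyTorch sketch blends a soft KL term against a teacher model's logits with a hard cross-entropy term on the true labels. The temperature and weighting defaults are illustrative assumptions, not any particular vendor's recipe.

```python
import torch
import torch.nn.functional as F


def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      temperature: float = 2.0,
                      alpha: float = 0.5) -> torch.Tensor:
    """Knowledge-distillation loss sketch.

    student_logits, teacher_logits: (batch, vocab); labels: (batch,) token ids.
    alpha balances imitation of the teacher against the ordinary hard-label loss.
    """
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2            # standard T^2 scaling of the soft term
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```

In practice the teacher's logits are precomputed or served separately, so the student can be fine-tuned cheaply on commodity or lower-end accelerators.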
In the B2B sector, many enterprises are rapidly integrating DeepSeek to capitalize on the traffic it generates; within just 20 days of the R1 release, more than 160 businesses worldwide had connected to DeepSeek. On the consumer side, the user base has experienced explosive growth, giving rise to super apps that are accelerating the widespread adoption of large language models (LLMs). DeepSeek's extraordinary performance has played a crucial role in elevating societal awareness of LLMs, paving the way for new business models and promoting a virtuous cycle of commercial activity.
To cater to the divergent needs presented by this landscape, computational power structures must evolve to support varied demands. First, model architectures need optimization so that larger models can run on existing hardware, enhancing both scale and performance. Next, communication between computational units must be optimized to improve utilization and reduce training time, allowing companies to carry out complex AI tasks more effectively. Additionally, optimizations during post-training are essential to minimize labeled-data requirements and lower data costs, while techniques such as reinforcement learning can significantly enhance model performance.
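To show how reinforcement learning can reduce reliance on labeled data during post-training, here is a deliberately simplified policy-gradient sketch: responses are sampled from the current model, scored by an automatic reward function, and the model's log-likelihood is reweighted by the resulting advantages. The `reward_fn` hook and Hugging Face-style interfaces are assumptions for illustration, not DeepSeek's or Ascend's actual training loop.

```python
import torch
import torch.nn.functional as F


def reinforce_step(model, tokenizer, prompts, reward_fn, optimizer, max_new_tokens=128):
    """One simplified policy-gradient update: sample, score, reweight log-likelihood.

    Assumes a Hugging Face-style causal LM and tokenizer with a pad token set.
    `reward_fn(text) -> float` is a hypothetical automatic scorer (e.g. a rule-based
    verifier), so no human-labeled targets are needed.
    """
    model.train()
    enc = tokenizer(prompts, return_tensors="pt", padding=True)

    # Sample candidate responses from the current policy.
    with torch.no_grad():
        out = model.generate(**enc, do_sample=True, max_new_tokens=max_new_tokens)

    rewards = torch.tensor(
        [reward_fn(tokenizer.decode(o, skip_special_tokens=True)) for o in out],
        dtype=torch.float,
    )
    advantages = rewards - rewards.mean()   # simple baseline for variance reduction

    # Re-score the sampled sequences to get per-token log-probabilities.
    # (For brevity, prompt and padding tokens are not masked out here.)
    logits = model(out).logits[:, :-1, :]
    targets = out[:, 1:]
    logp = torch.gather(F.log_softmax(logits, dim=-1), 2, targets.unsqueeze(-1)).squeeze(-1)
    loss = -(advantages.unsqueeze(1) * logp).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item(), rewards.mean().item()
```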
In terms of inference optimization, supporting the prediction of multiple tokens per step can substantially increase inference efficiency, providing businesses with quicker, more effective AI applications.
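A minimal sketch of the multi-token idea follows, assuming a set of extra output heads that each predict one of the next k tokens from the same hidden state. The sizes and the draft-then-verify note are illustrative assumptions; production systems such as DeepSeek-V3's multi-token prediction modules are considerably more involved.

```python
import torch
import torch.nn as nn


class MultiTokenHead(nn.Module):
    """Illustrative multi-token prediction: k heads, each predicting one future token."""

    def __init__(self, hidden_size: int, vocab_size: int, k: int = 2):
        super().__init__()
        self.heads = nn.ModuleList(
            [nn.Linear(hidden_size, vocab_size) for _ in range(k)]
        )

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, hidden_size) for the last position.
        # Returns (batch, k, vocab_size): one distribution per future token.
        return torch.stack([head(hidden) for head in self.heads], dim=1)


if __name__ == "__main__":
    head = MultiTokenHead(hidden_size=512, vocab_size=32000, k=2)
    h = torch.randn(4, 512)
    draft = head(h).argmax(dim=-1)   # (4, 2): two draft tokens per sequence
    # At inference time the draft tokens would be verified (or resampled) by the
    # main model in a single forward pass, cutting the number of decode steps.
    print(draft.shape)
```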
The reality is that most AI practitioners must rely on sufficiently robust foundational computational resources and comprehensive solutions to achieve effective training and inference. A stable, reliable computational platform not only reduces trial-and-error costs but also enables companies to focus on optimizing their models.
After the launch of DeepSeek V3, Huawei promptly initiated an internal analysis and technology-adaptation effort, finding a strong match between DeepSeek's technical framework and its Ascend products. For instance, the MoE architecture aligns with Huawei's earlier predictions about the direction of large models, demonstrating its proactive approach in this domain. The Ascend platform also offers substantial capabilities for simplifying the reinforcement learning process, making it easier for developers.
Remarkably, Ascend is noted as the industry's first chip platform to complete full-scale adaptation of DeepSeek's core algorithms, supporting the pre-training and fine-tuning of all DeepSeek models. It includes advanced features such as support for DualPipe, cross-node All2All, and high-bandwidth communication, which align well with DeepSeek's pipeline parallelism and other innovations. Furthermore, Ascend stands out as the only AI training platform that adapts comprehensively from pre-training through fine-tuning for DeepSeek. With the industry shifting from supervised fine-tuning (SFT) toward reinforcement learning based training, Ascend is positioned to provide DeepSeek R1 models alongside reinforcement learning algorithms, coupled with prompt engineering and data-sampling techniques to generate high-quality synthetic data.
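As a sketch of how data sampling can yield synthetic training data, the snippet below applies simple rejection sampling: several candidate responses are generated per prompt and only those that clear a verifier threshold are kept. The `generate` and `score` callables are hypothetical stand-ins, not functions described in the article.

```python
from typing import Callable, Iterable


def build_synthetic_dataset(generate: Callable[[str, int], list[str]],
                            score: Callable[[str, str], float],
                            prompts: Iterable[str],
                            samples_per_prompt: int = 8,
                            threshold: float = 0.8) -> list[dict]:
    """Rejection-sampling sketch for synthetic fine-tuning data.

    `generate(prompt, n)` wraps a strong teacher model (e.g. an R1-class model)
    and `score(prompt, response)` is an automatic verifier or reward model; both
    are assumed interfaces. Only the best response per prompt is kept, and only
    if it clears the quality threshold.
    """
    dataset = []
    for prompt in prompts:
        candidates = generate(prompt, samples_per_prompt)
        best = max(candidates, key=lambda r: score(prompt, r))
        if score(prompt, best) >= threshold:
            dataset.append({"prompt": prompt, "response": best})
    return dataset
```

The resulting prompt/response pairs can then feed a conventional fine-tuning or distillation run, which is why sampling quality and verifier design matter as much as raw compute.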
Through collaboration with partners and clients, Ascend has rolled out various product forms, including integrated machines, cloud services, and hardware plus open-source community platforms, to expedite enterprise deployment. Coverage spans sectors including internet services, finance, telecommunications, government, and education.