The Impact of DeepSeek on GPU Consumption and Business Implications
There is considerable discussion about the impact of DeepSeek's recent announcements. Although technical in nature, they suggest far-reaching impacts on AI infrastructure and on other participants in the AI supply chain that is being developed, with substantial investment, worldwide.
Understanding the meaning of these announcements requires assessing the technological innovations presented within the broader context of the purpose and goals of AI in business and public-sector use cases.
I have enjoyed the discussion of DeepSeek's innovations and their implications for our GenAI field, but as with many new services, considerable information is hidden within the deep recesses of the supporting materials, which have only recently been published.
I agree with the basic premise that DeepSeek's introduction affects the environment we all work in, but I am not convinced it will produce the dramatic reduction in the use of GPUs and other AI/ML/GenAI services being portrayed in the marketplace today. I have seen cynical responses suggesting that shorting stocks was part of the release process, but NVIDIA's implied volatility at the moment (https://www.barchart.com/stocks/quotes/NVDA/put-call-ratios) is within 2% of its historical volatility, and the short-interest chart (https://www.marketbeat.com/stocks/NASDAQ/NVDA/short-interest/) does not show any unusual activity in the stock itself.
As we mature in our use and understanding of Generative AI, there will be many announcements that set expectations of a marked change in how we view, consume, and support AI offerings across multiple markets. Each day brings new offerings, opportunities, challenges, and products within this sphere. Our role is to understand what each change means and how we can harness the knowledge it shares with us. We have a ringside seat to a technology that is constantly changing, evolving, and challenging the way we work and interact with technology. I personally took this as an opportunity to look past the hype that DeepSeek generated and to understand more about how GenAI works and where we can gain leverage across the technology stack. Yesterday I would have spoken of LLMs, SLMs, and custom language models to reduce token costs; today I have expanded that to include floating-point precision, speculative decoding, and auxiliary-loss-free load balancing. I am always learning, and I encourage everyone to do the same.
On that note, here is my evaluation of DeepSeek and its impact on the world in which we are all active participants.
Introduction
DeepSeek's advancements in large language models (LLMs) have positioned it as a key player in the evolution of AI. DeepSeek-V3 currently boasts 671 billion parameters with highly optimized training and inference strategies. The release of DeepSeek, and its reported development costs, have raised questions about the impact on GPU consumption. While DeepSeek's innovations reduce inefficiencies, the broader trend of scaling model sizes and expanding applications suggests an overall increase in GPU demand. Understanding this duality is crucial for businesses investing in AI infrastructure and services.
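To make the parameter count concrete, a back-of-envelope sketch of raw weight storage at different numeric precisions shows why FP8 matters at this scale. The byte sizes per parameter are standard for each format; activations, optimizer state, and the KV cache are deliberately ignored, so real training footprints are substantially larger.

```python
# Back-of-envelope: raw weight storage for a 671-billion-parameter model
# at common numeric precisions. Everything beyond the weights themselves
# (activations, optimizer state, KV cache) is ignored for simplicity.

def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Raw weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

PARAMS = 671e9  # DeepSeek-V3's published total parameter count

for name, nbytes in [("FP32", 4), ("BF16", 2), ("FP8", 1)]:
    print(f"{name}: {weight_memory_gb(PARAMS, nbytes):,.0f} GB of weights")
```

Halving the bytes per parameter halves the weight footprint, which is one reason FP8 training is attractive at this scale; it does not, by itself, shrink the aggregate GPU fleet an organization needs.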
Why Expectations of GPU Reduction Emerged
The initial perception of GPU reduction emerged from the efficiency improvements DeepSeek introduced. Innovations such as FP8 mixed-precision training, speculative decoding, and auxiliary-loss-free load balancing delivered significant reductions in memory usage and computational overhead for specific tasks. Marketing and research papers often emphasized these gains without fully accounting for the counterbalancing effects of scaling model sizes and datasets. This created a narrative that DeepSeek would universally reduce GPU demand, overshadowing the reality that its advanced capabilities and applications drive up total consumption. These efficiencies reduce the cost per task, but total GPU utilization increases as more tasks, larger models, and broader applications are adopted.
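The per-task-versus-aggregate dynamic can be made concrete with a toy calculation. All numbers below are hypothetical, chosen only to illustrate how total GPU-hours can rise even as the cost per task falls sharply:

```python
# Toy illustration: a 60% per-task efficiency gain is assumed, alongside a
# hypothetical 4x growth in workload as cheaper inference attracts more use.

def total_gpu_hours(tasks: int, gpu_hours_per_task: float) -> float:
    """Aggregate GPU consumption for a workload."""
    return tasks * gpu_hours_per_task

before = total_gpu_hours(tasks=1_000, gpu_hours_per_task=10.0)
after = total_gpu_hours(tasks=4_000, gpu_hours_per_task=4.0)

print(before, after)  # 10000.0 16000.0 -> aggregate demand rises 60%
```

This is the familiar rebound effect: whenever demand grows faster than per-task cost falls, fleet-level consumption increases despite genuine efficiency gains.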
Innovations Driving GPU Efficiency
DeepSeek incorporates several innovations aimed at optimizing GPU utilization, which contributed to the early perception of falling GPU demand. Techniques such as FP8 mixed-precision training, speculative decoding, and auxiliary-loss-free load balancing delivered tangible cost savings for specific tasks, but considered in a broader context, scaling these models reveals a more complex reality.
These advancements created excitement about their potential to significantly reduce computational costs, but they often led to an incomplete understanding of their broader implications, particularly when scaled for enterprise use.
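For readers new to speculative decoding, the sketch below shows the core idea in miniature: a cheap draft model proposes a run of tokens, and the expensive target model verifies them, keeping the longest agreeing prefix. Both "models" here are toy stand-in functions, and the greedy accept/reject rule is a simplification of the probabilistic acceptance used in real systems.

```python
import random

def draft_model(context):
    # Hypothetical cheap model: fast, but wrong some of the time.
    if random.random() < 0.8:
        return (sum(context) + 1) % 10
    return random.randrange(10)

def target_model(context):
    # Hypothetical expensive model: treated as ground truth here.
    return (sum(context) + 1) % 10

def speculative_step(context, k=4):
    """Draft k tokens cheaply, then keep the prefix the target model agrees with."""
    drafted, ctx = [], list(context)
    for _ in range(k):
        tok = draft_model(ctx)
        drafted.append(tok)
        ctx.append(tok)
    accepted, ctx = [], list(context)
    for tok in drafted:
        if tok == target_model(ctx):          # one target check per drafted
            accepted.append(tok)              # token, batched on real hardware
            ctx.append(tok)
        else:
            accepted.append(target_model(ctx))  # fall back to the target's token
            break
    return accepted

random.seed(0)
print(speculative_step([3, 1, 4], k=4))
```

The payoff is that the target model's verification of several drafted tokens can run as one batched pass, so when the draft model guesses well, several tokens are produced for roughly the cost of one target-model step.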
Scaling Practices Increasing GPU Demand
The initial perception that GPU demand would decrease under DeepSeek's innovations stemmed from the impressive efficiency improvements highlighted during its development. We have already discussed advances such as FP8 mixed-precision training, speculative decoding, and auxiliary-loss-free load balancing, and their implications for memory consumption and computational overhead on individual tasks. This expectation fed a narrative of widespread reductions in overall GPU usage; that narrative is not correct, because it does not account for the implications of scaling DeepSeek's models and applications, which ultimately drive up aggregate GPU consumption.
Despite these efficiencies, several factors make it unlikely that DeepSeek's overall GPU requirements will decline: larger models, more complex tasks, and broader adoption across industries all push aggregate consumption upward. While efficiencies improve individual task performance, they enable larger, more complex models and applications that increase overall demand, which is why the initial expectation of reduced GPU usage was overly optimistic.
Validating the Business Value of DeepSeek
Validating the business value of DeepSeek involves examining the real-world applications, cost reductions, and performance gains enabled by its models. The following points highlight the tangible benefits:
- Cost efficiency
- Increased productivity in AI workflows
- Expansion into new markets
- Enhanced accessibility
- Competitive advantage
Business Implications
Initial Expectations of GPU Reduction
The market's initial expectation that DeepSeek would reduce GPU consumption stemmed from the efficiency improvements the system highlighted, including FP8 mixed-precision training, auxiliary-loss-free load balancing, and speculative decoding. These advancements significantly reduce GPU requirements for individual tasks, leading to a perception that overall GPU usage would decline. However, this expectation overlooked the broader implications of scaling DeepSeek models and applications. By enabling larger models, more complex tasks, and broader adoption across industries, DeepSeek ultimately drives higher total GPU demand. This tension between per-task efficiency gains and aggregate GPU usage illustrates how initial impressions can evolve with a fuller understanding of real-world scaling practices and business adoption.
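Of the techniques named above, auxiliary-loss-free load balancing is perhaps the least familiar. The toy sketch below captures the spirit of the approach described in the DeepSeek-V3 technical report: each expert in a mixture-of-experts router carries a bias that is nudged down when the expert is overloaded and up when it is underloaded, so routing balances without adding a balancing term to the training loss. The constants, update rule, and random gating projection here are illustrative assumptions, not the paper's exact formulation.

```python
import random

# Toy auxiliary-loss-free load balancing: the bias influences routing only,
# never the expert outputs, so no extra term is added to the training loss.

random.seed(0)
NUM_EXPERTS, TOP_K, DIM, STEP = 8, 2, 16, 0.02
gate_w = [[random.gauss(0, 1) for _ in range(NUM_EXPERTS)] for _ in range(DIM)]
bias = [0.0] * NUM_EXPERTS

def route(token):
    """Score every expert, add its routing bias, and keep the top-k."""
    scores = [sum(token[d] * gate_w[d][e] for d in range(DIM)) + bias[e]
              for e in range(NUM_EXPERTS)]
    return sorted(range(NUM_EXPERTS), key=lambda e: -scores[e])[:TOP_K]

def batch_load(n_tokens=128):
    """Route a batch of random tokens and count assignments per expert."""
    load = [0] * NUM_EXPERTS
    for _ in range(n_tokens):
        token = [random.gauss(0, 1) for _ in range(DIM)]
        for e in route(token):
            load[e] += 1
    return load

for _ in range(200):                  # simulated training steps
    load = batch_load()
    target = sum(load) / NUM_EXPERTS  # ideal per-expert share
    for e in range(NUM_EXPERTS):      # overloaded expert -> lower bias
        bias[e] -= STEP * (1 if load[e] > target else -1)

print(batch_load())  # per-expert loads should sit closer to uniform than at the start
```

The design choice worth noting is that balance is achieved through a routing-side correction rather than a gradient penalty, so the model's quality objective is never traded off against a balancing objective.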
These dynamics have broader implications for businesses reliant on AI infrastructure, particularly around strategic investment in advanced infrastructure and cloud capacity.
Conclusion
While DeepSeek’s innovations improve GPU efficiency per task, the overall scale and ambition of its models ensure that total GPU consumption will rise. For businesses, this trend underscores the need for strategic investments in advanced infrastructure and cloud solutions. Organizations that adapt to these demands will be well-positioned to leverage cutting-edge AI capabilities, while those that lag may face challenges in keeping pace with technological advancements.