Chinese company DeepSeek’s new model surprised markets this week, triggering an initial sell-off across the tech sector.
While this shift towards potentially disruptive forms of model scaling hasn’t necessarily taken everyone in Silicon Valley aback, DeepSeek has demonstrated a different type of optimisation, one that requires far less initial processing yet can produce extraordinary results.
Jamie Mills O’Brien, manager of abrdn Global Innovation Equity Fund, has shared his thoughts as the dust settles, suggesting why, despite the initial de-rating of many tech stocks across the AI semi supply chain, demand could remain robust.
He commented:
“DeepSeek’s approach suggests that training AI models does not require the most advanced Nvidia GPUs or a brute-force approach. That belief had been driving capex spending by US hyperscalers seeking to build a competitive advantage. It is too early to say what the exact fallout of the development will be, but there is a risk these companies will now need to re-evaluate the justification for any future spending and how best to compete.
Overall, AI tools remain solutions in search of a problem. The focus has largely been on increasing the number of GPUs in advanced clusters to train the most powerful models possible. DeepSeek has demonstrated an alternative approach to building complex models that requires far less initial processing. This could enable greater competition by allowing more participants to develop custom AI models, potentially accelerating the commoditisation of AI as market participants compete through product differentiation and specialisation.
Despite the initial de-rating of many tech stocks across the AI semi supply chain, demand could stay robust, and we remain relatively bullish on the sector’s prospects. The increased democratisation of AI can drive greater uptake among end users, sustaining demand for processing power in the operational phase of the AI model lifecycle. That continued demand supports a broadly constructive outlook for players in the semiconductor, data centre and utility space.
What is likely not true is the $6m number, and DeepSeek are clear about that in their own release, in which they say the costs stated cover only the final training run. But distillation and the way they have trained the model (test-time compute) likely do come in at lower cost than, for example, OpenAI’s methods.”
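For context on why distillation cuts costs: a smaller “student” model is trained to match the output distribution of a larger “teacher” model, rather than learning from scratch on raw data. Below is a minimal sketch in PyTorch of a standard distillation loss; the temperature value and toy tensor shapes are illustrative assumptions, not details of DeepSeek’s actual training pipeline.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Soften both distributions with a temperature, then push the student's
    # predictions towards the teacher's (the standard knowledge-distillation
    # loss from Hinton et al.; temperature here is an illustrative choice).
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # kl_div expects log-probabilities as input and probabilities as target;
    # the T^2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * temperature ** 2

# Toy usage: a batch of 4 examples over a 10-token vocabulary.
teacher_logits = torch.randn(4, 10)                      # frozen teacher outputs
student_logits = torch.randn(4, 10, requires_grad=True)  # trainable student
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()  # gradients flow only into the student

Because the student learns from the teacher’s already-computed outputs, the expensive part of training is done once by the teacher, which is one reason a distilled model can be produced for a fraction of the original training bill.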