$30 DeepSeek dupe? US scientists claim to duplicate AI model for peanuts

by Pelican Press
3 minutes read


A group of researchers at the University of California, Berkeley, claims to have reproduced the core technology behind DeepSeek’s headline-grabbing AI at a total cost of roughly $30.

The news is another twist in a quickly developing narrative about whether building state-of-the-art AI demands colossal budgets or if far more affordable alternatives have been overlooked by tech’s biggest players.

DeepSeek made waves recently by introducing R1, an AI model that the company says replicates the functions of ChatGPT and other costly systems at just a fraction of the training expense typically seen in Silicon Valley.

The Berkeley team’s response? To do it even more cheaply. Led by PhD candidate Jiayi Pan, the researchers created a smaller-scale version, dubbed “TinyZero,” and released it on GitHub for public experimentation. Though it lacks the massive 671-billion-parameter heft of DeepSeek’s main offering, Pan says TinyZero captures the core behaviors seen in DeepSeek’s so-called “R1-Zero” model.

Pan’s approach centers on reinforcement learning, a technique in which the AI, starting with almost random guesses, gradually refines its answers by revising and searching through possible solutions. In a post describing the project, he highlighted the Countdown game, a British TV puzzle in which players combine given numbers to reach a target value. “The results: it just works!” Pan wrote, adding that although the AI initially spat out “dummy outputs,” it ultimately figured out how to correct its mistakes.
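What makes the Countdown game well suited to this kind of reinforcement learning is that the reward can be checked automatically: a proposed expression either reaches the target using the allowed numbers or it doesn’t. The sketch below is a hypothetical reward checker for illustration only (it is not TinyZero’s actual code; the function name and design are the author’s assumptions):

```python
import ast

def countdown_reward(expression: str, numbers: list[int], target: int) -> float:
    """Hypothetical Countdown reward: 1.0 if `expression` is valid
    arithmetic that uses only the given numbers (each at most once)
    and evaluates to `target`, else 0.0. Illustrative sketch only."""
    try:
        tree = ast.parse(expression, mode="eval")
    except SyntaxError:
        return 0.0  # malformed output gets no reward
    used = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Constant):
            if not isinstance(node.value, int):
                return 0.0  # only integer literals allowed
            used.append(node.value)
        elif not isinstance(node, (ast.Expression, ast.BinOp,
                                   ast.Add, ast.Sub, ast.Mult, ast.Div)):
            return 0.0  # reject anything beyond + - * /
    pool = list(numbers)
    for n in used:
        if n in pool:
            pool.remove(n)  # each given number may be used once
        else:
            return 0.0
    try:
        value = eval(compile(tree, "<expr>", "eval"))
    except ZeroDivisionError:
        return 0.0
    return 1.0 if value == target else 0.0
```

A reward signal like this is all the training loop needs: the model samples candidate expressions, the checker scores them, and the policy is updated to make high-scoring outputs more likely.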

The bigger revelation

The idea that a few days of work and a handful of dollars could replicate such a core AI capability is an eye-opener for many industry watchers. It flies in the face of conventional wisdom that big breakthroughs in AI require entire data centers, power-hungry GPUs, and millions or even billions of dollars in spending.

DeepSeek shook the tech world by asserting that training its main model cost merely a few million dollars, substantially less than many U.S. firms spend on AI. According to Pan and his team, a small-scale version can be done for a mere $30.

Still, skeptics have urged caution. Critics point out that DeepSeek’s claimed affordability numbers may not give the complete picture, as the company might be benefiting from alternate resources or distillation techniques from other proprietary models.

While Pan’s “TinyZero” shows that advanced reinforcement learning can be done on a budget, it doesn’t necessarily address the depth or breadth of tasks the larger DeepSeek system can handle. TinyZero may be more akin to a simplified proof-of-concept than a fully fledged challenger.

Yet the demonstration hints at a deeper shift in the AI scene. If open-source developers can replicate sophisticated functionalities with scant resources, it raises questions about why major players like OpenAI, Google, or Microsoft pour vast sums into their platforms. On one hand, scale and advanced capabilities do come at a price. On the other, the possibility of cost inflation within the industry emerges. After all, open-source initiatives could undercut these tech giants by operating on leaner budgets.

“TinyZero” and DeepSeek’s R1 indicate a growing appetite for compact, resource-friendly AI models. Many had assumed that cutting-edge breakthroughs required billions in expenditure. Now, it seems a bright graduate student or a scrappy startup might surprise the world with innovation on the cheap. Whether this ultimately reshapes the future of AI infrastructure or remains an exciting anomaly, the conversation around affordable, powerful AI is only beginning.

For Pan and his team, there’s a clear aim: “We hope this project helps to demystify the emerging RL scaling research,” he wrote. Judging by the global reaction, they’ve already made their mark.


