DeepSeek Fails Researchers’ Safety Tests

Chinese AI firm DeepSeek is making headlines with its low-cost and high-performance chatbot, but it may have an AI safety problem.

Cisco’s research team used algorithmic jailbreaking techniques to test DeepSeek R1 “against 50 random prompts from the HarmBench dataset,” covering six categories of harmful behaviors including cybercrime, misinformation, illegal activities, and general harm.

“The results were alarming: DeepSeek R1 exhibited a 100% attack success rate, meaning it failed to block a single harmful prompt,” Cisco says. “This contrasts starkly with other leading models, which demonstrated at least partial resistance.”

Other frontier models, such as OpenAI’s o1, blocked a majority of adversarial attacks with their model guardrails, according to Cisco.

Attack success rate on popular LLMs (Credit: Cisco)

As Wired notes, security firm Adversa AI reached similar conclusions.

Cisco’s researchers point to DeepSeek’s much lower budget compared with rivals as a potential reason for these failings, saying its cheap development came at a “different cost: safety and security.” DeepSeek claims its model took just $6 million to develop, while a six-month training run for OpenAI’s yet-to-be-released GPT-5 “can cost around half a billion dollars in computing costs alone,” The Wall Street Journal reports.

Though DeepSeek may be easier to fool with the right know-how, it’s been shown to have strong content restrictions, at least when it comes to China-related political content. We tested it on controversial topics, such as the Chinese government’s treatment of Uyghurs, a Muslim minority group that the UN claims is being persecuted. DeepSeek replied: “Sorry, that’s beyond my current scope. Let’s talk about something else.”

The chatbot also refused to answer questions about the Tiananmen Square Massacre, the 1989 crackdown on student demonstrations in Beijing in which protesters were gunned down. But it remains to be seen whether AI safety or censorship issues will have any impact on DeepSeek’s skyrocketing popularity.

According to web traffic tracking tool Similarweb, the LLM has gone from receiving just 300,000 visitors a day at launch to 6 million. Meanwhile, US tech firms like Microsoft and Perplexity are rapidly incorporating DeepSeek, which uses an open-source model.


About Will McCurdy

Contributor

Will McCurdy

I’m a reporter covering weekend news. Before joining PCMag in 2024, I picked up bylines in BBC News, The Guardian, The Times of London, The Daily Beast, Vice, Slate, Fast Company, The Evening Standard, The i, TechRadar, and Decrypt Media.

I’ve been a PC gamer since you had to install games from multiple CD-ROMs by hand. As a reporter, I’m passionate about the intersection of tech and human lives. I’ve covered everything from crypto scandals to the art world, as well as conspiracy theories, UK politics, and Russia and foreign affairs.

