Startup DreamersStartup Dreamers
  • Home
  • Startup
  • Money & Finance
  • Starting a Business
    • Branding
    • Business Ideas
    • Business Models
    • Business Plans
    • Fundraising
  • Growing a Business
  • More
    • Innovation
    • Leadership
Trending

An FBI ‘Asset’ Helped Run a Dark Web Site That Sold Fentanyl-Laced Drugs for Years

February 26, 2026

Solving The Data Bottleneck For Physical AI

February 26, 2026

Supreme Court Rules Most of Donald Trump’s Tariffs Are Illegal

February 25, 2026
Facebook Twitter Instagram
  • Newsletter
  • Submit Articles
  • Privacy
  • Advertise
  • Contact
Facebook Twitter Instagram
Startup DreamersStartup Dreamers
  • Home
  • Startup
  • Money & Finance
  • Starting a Business
    • Branding
    • Business Ideas
    • Business Models
    • Business Plans
    • Fundraising
  • Growing a Business
  • More
    • Innovation
    • Leadership
Subscribe for Alerts
Startup DreamersStartup Dreamers
Home » A New Attack Impacts ChatGPT—and No One Knows How to Stop It
Startup

A New Attack Impacts ChatGPT—and No One Knows How to Stop It

adminBy adminAugust 2, 20230 ViewsNo Comments4 Mins Read
Facebook Twitter Pinterest LinkedIn Tumblr Email

“Making models more resistant to prompt injection and other adversarial ‘jailbreaking’ measures is an area of active research,” says Michael Sellitto, interim head of policy and societal impacts at Anthropic. “We are experimenting with ways to strengthen base model guardrails to make them more ‘harmless,’ while also investigating additional layers of defense.”

ChatGPT and its brethren are built atop large language models, enormously large neural network algorithms geared toward using language that has been fed vast amounts of human text, and which predict the characters that should follow a given input string.

These algorithms are very good at making such predictions, which makes them adept at generating output that seems to tap into real intelligence and knowledge. But these language models are also prone to fabricating information, repeating social biases, and producing strange responses as answers prove more difficult to predict.

Adversarial attacks exploit the way that machine learning picks up on patterns in data to produce aberrant behaviors. Imperceptible changes to images can, for instance, cause image classifiers to misidentify an object, or make speech recognition systems respond to inaudible messages.

Developing such an attack typically involves looking at how a model responds to a given input and then tweaking it until a problematic prompt is discovered. In one well-known experiment, from 2018, researchers added stickers to stop signs to bamboozle a computer vision system similar to the ones used in many vehicle safety systems. There are ways to protect machine learning algorithms from such attacks, by giving the models additional training, but these methods do not eliminate the possibility of further attacks.

Armando Solar-Lezama, a professor in MIT’s college of computing, says it makes sense that adversarial attacks exist in language models, given that they affect many other machine learning models. But he says it is “extremely surprising” that an attack developed on a generic open source model should work so well on several different proprietary systems.

Solar-Lezama says the issue may be that all large language models are trained on similar corpora of text data, much of it downloaded from the same websites. “I think a lot of it has to do with the fact that there’s only so much data out there in the world,” he says. He adds that the main method used to fine-tune models to get them to behave, which involves having human testers provide feedback, may not, in fact, adjust their behavior that much.

Solar-Lezama adds that the CMU study highlights the importance of open source models to open study of AI systems and their weaknesses. In May, a powerful language model developed by Meta was leaked, and the model has since been put to many uses by outside researchers.

The outputs produced by the CMU researchers are fairly generic and do not seem harmful. But companies are rushing to use large models and chatbots in many ways. Matt Fredrikson, another associate professor at CMU involved with the study, says that a bot capable of taking actions on the web, like booking a flight or communicating with a contact, could perhaps be goaded into doing something harmful in the future with an adversarial attack.

To some AI researchers, the attack primarily points to the importance of accepting that language models and chatbots will be misused. “Keeping AI capabilities out of the hands of bad actors is a horse that’s already fled the barn,” says Arvind Narayanan, a computer science professor at Princeton University.

Narayanan says he hopes that the CMU work will nudge those who work on AI safety to focus less on trying to “align” models themselves and more on trying to protect systems that are likely to come under attack, such as social networks that are likely to experience a rise in AI-generative disinformation.

Solar-Lezama of MIT says the work is also a reminder to those who are giddy with the potential of ChatGPT and similar AI programs. “Any decision that is important should not be made by a [language] model on its own,” he says. “In a way, it’s just common sense.”

Read the full article here

Share. Facebook Twitter Pinterest LinkedIn Tumblr Email

Related Articles

An FBI ‘Asset’ Helped Run a Dark Web Site That Sold Fentanyl-Laced Drugs for Years

Startup February 26, 2026

Supreme Court Rules Most of Donald Trump’s Tariffs Are Illegal

Startup February 25, 2026

Mark Zuckerberg Tries to Play It Safe in Social Media Addiction Trial Testimony

Startup February 24, 2026

Inside the Rolling Layoffs at Jack Dorsey’s Block

Startup February 23, 2026

Code Metal Raises $125 Million to Rewrite the Defense Industry’s Code With AI

Startup February 22, 2026

Senators Urge Top Regulator to Stay Out of Prediction Market Lawsuits

Startup February 20, 2026
Add A Comment

Leave A Reply Cancel Reply

Editors Picks

An FBI ‘Asset’ Helped Run a Dark Web Site That Sold Fentanyl-Laced Drugs for Years

February 26, 2026

Solving The Data Bottleneck For Physical AI

February 26, 2026

Supreme Court Rules Most of Donald Trump’s Tariffs Are Illegal

February 25, 2026

Mark Zuckerberg Tries to Play It Safe in Social Media Addiction Trial Testimony

February 24, 2026

Inside the Rolling Layoffs at Jack Dorsey’s Block

February 23, 2026

Latest Posts

Senators Urge Top Regulator to Stay Out of Prediction Market Lawsuits

February 20, 2026

Zillow Has Gone Wild—for AI

February 19, 2026

OpenAI’s President Gave Millions to Trump. He Says It’s for Humanity

February 18, 2026

Meta Goes to Trial in a New Mexico Child Safety Case. Here’s What’s at Stake

February 16, 2026

Salesforce Workers Circulate Open Letter Urging CEO Marc Benioff to Denounce ICE

February 15, 2026
Advertisement
Demo

Startup Dreamers is your one-stop website for the latest news and updates about how to start a business, follow us now to get the news that matters to you.

Facebook Twitter Instagram Pinterest YouTube
Sections
  • Growing a Business
  • Innovation
  • Leadership
  • Money & Finance
  • Starting a Business
Trending Topics
  • Branding
  • Business Ideas
  • Business Models
  • Business Plans
  • Fundraising

Subscribe to Updates

Get the latest business and startup news and updates directly to your inbox.

© 2026 Startup Dreamers. All Rights Reserved.
  • Privacy Policy
  • Terms of use
  • Press Release
  • Advertise
  • Contact

Type above and press Enter to search. Press Esc to cancel.

GET $5000 NO CREDIT