Startup Dreamers

A New Attack Impacts ChatGPT—and No One Knows How to Stop It

By admin | August 2, 2023 | 4 Mins Read

“Making models more resistant to prompt injection and other adversarial ‘jailbreaking’ measures is an area of active research,” says Michael Sellitto, interim head of policy and societal impacts at Anthropic. “We are experimenting with ways to strengthen base model guardrails to make them more ‘harmless,’ while also investigating additional layers of defense.”

ChatGPT and its brethren are built atop large language models, enormously large neural network algorithms geared toward using language that have been fed vast amounts of human text and that predict the characters that should follow a given input string.
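
To make the idea of predicting what follows an input string concrete, here is a minimal sketch. It is a toy character-level counting model, not how ChatGPT or any real large language model is implemented; the function names and the tiny corpus are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each character, which characters follow it in the text."""
    counts = defaultdict(Counter)
    for current, following in zip(text, text[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, prompt):
    """Return the most frequent follower of the prompt's last character.

    Returns None when the prompt is empty or its last character was never seen.
    """
    if not prompt or prompt[-1] not in counts:
        return None
    return counts[prompt[-1]].most_common(1)[0][0]


def generate(counts, prompt, length):
    """Extend the prompt one predicted character at a time."""
    out = prompt
    for _ in range(length):
        nxt = predict_next(counts, out)
        if nxt is None:
            break
        out += nxt
    return out


# Hypothetical toy corpus; real models are trained on vastly more text.
corpus = "the cat sat on the mat. the cat ate."
model = train_bigram(corpus)
```

Real models replace the counting table with a neural network and look at far more context than one character, but the interface is the same in spirit: text in, a prediction of what comes next out.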

These algorithms are very good at making such predictions, which makes them adept at generating output that seems to tap into real intelligence and knowledge. But these language models are also prone to fabricating information, repeating social biases, and producing strange responses as answers prove more difficult to predict.

Adversarial attacks exploit the way that machine learning picks up on patterns in data to produce aberrant behaviors. Imperceptible changes to an input can, for instance, cause image classifiers to misidentify an object or make speech recognition systems respond to inaudible messages.

Developing such an attack typically involves looking at how a model responds to a given input and then tweaking it until a problematic prompt is discovered. In one well-known experiment, from 2018, researchers added stickers to stop signs to bamboozle a computer vision system similar to the ones used in many vehicle safety systems. There are ways to protect machine learning algorithms from such attacks, by giving the models additional training, but these methods do not eliminate the possibility of further attacks.
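
The observe-and-tweak loop described above can be sketched in a few lines. This is an illustrative toy, assuming a hypothetical linear classifier with made-up weights; it is not the method used by the CMU researchers or in the stop sign experiment. The attacker only queries the model's score and keeps whichever small nudge pushes it furthest toward the opposite label.

```python
def score(weights, bias, x):
    """Linear score; the toy classifier predicts class 1 when it is positive."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def predict(weights, bias, x):
    return 1 if score(weights, bias, x) > 0 else 0


def find_adversarial(weights, bias, x, step=0.05, max_iters=1000):
    """Greedily nudge one feature at a time until the predicted label flips.

    Returns the perturbed input, or None if no flip occurs within max_iters.
    """
    original = predict(weights, bias, x)
    # Push the score down if the original label is 1, up if it is 0.
    sign = -1 if original == 1 else 1
    current = list(x)
    for _ in range(max_iters):
        best, best_score = None, None
        # Try every single-feature nudge and keep the most effective one.
        for i in range(len(current)):
            for delta in (step, -step):
                candidate = current.copy()
                candidate[i] += delta
                s = sign * score(weights, bias, candidate)
                if best_score is None or s > best_score:
                    best, best_score = candidate, s
        current = best
        if predict(weights, bias, current) != original:
            return current
    return None


# Hypothetical model parameters and input, for illustration only.
weights, bias = [1.0, -2.0, 0.5], 0.1
x = [0.5, 0.1, 0.2]
adversarial = find_adversarial(weights, bias, x)
```

Retraining the model on inputs found this way is one form of the additional training mentioned above. It can block the specific perturbations found, but a fresh search against the retrained model may still turn up new ones.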

Armando Solar-Lezama, a professor in MIT’s college of computing, says it makes sense that adversarial attacks exist in language models, given that they affect many other machine learning models. But he says it is “extremely surprising” that an attack developed on a generic open source model should work so well on several different proprietary systems.

Solar-Lezama says the issue may be that all large language models are trained on similar corpora of text data, much of it downloaded from the same websites. “I think a lot of it has to do with the fact that there’s only so much data out there in the world,” he says. He adds that the main method used to fine-tune models to get them to behave, which involves having human testers provide feedback, may not, in fact, adjust their behavior that much.

Solar-Lezama adds that the CMU study highlights the importance of open source models to open study of AI systems and their weaknesses. In May, a powerful language model developed by Meta was leaked, and the model has since been put to many uses by outside researchers.

The outputs produced by the CMU researchers are fairly generic and do not seem harmful. But companies are rushing to use large models and chatbots in many ways. Matt Fredrikson, another associate professor at CMU involved with the study, says that a bot capable of taking actions on the web, like booking a flight or communicating with a contact, could perhaps be goaded into doing something harmful in the future with an adversarial attack.

To some AI researchers, the attack primarily points to the importance of accepting that language models and chatbots will be misused. “Keeping AI capabilities out of the hands of bad actors is a horse that’s already fled the barn,” says Arvind Narayanan, a computer science professor at Princeton University.

Narayanan says he hopes that the CMU work will nudge those who work on AI safety to focus less on trying to “align” models themselves and more on trying to protect systems that are likely to come under attack, such as social networks that are likely to experience a rise in AI-generated disinformation.

Solar-Lezama of MIT says the work is also a reminder to those who are giddy with the potential of ChatGPT and similar AI programs. “Any decision that is important should not be made by a [language] model on its own,” he says. “In a way, it’s just common sense.”
