Today I heard from MIT research scientist Andrei Barbu about working with LLMs, and how to prevent certain kinds of problems related to data leaks.
“We’re interested in studying language in particular,” he said of his work, turning to some of the goals of research teams in this area.
Thinking about the duality of human and computer cognition, he pointed out some differences. Humans, he said, teach one another, for example. Another thing humans can do is keep secrets. That might be challenging for our digital counterparts.
“There’s a problem with LLMs,” he noted, explaining their potential for leaks. “They can’t keep a secret.”
In outlining how to identify the issue, Barbu talked about prompt injection attacks as a prime example. We heard something a little like this from Adam Chipala just prior, where he mentioned verifiable software principles as a potential solution.
Barbu brought up something that I found interesting, though: a kind of catch-22 for systems that are not very good at sealing in data.
“Models are as sensitive as the most sensitive piece of data put inside that model,” he explained. “People can interrogate the model. … (on the other hand, models are) as weak and vulnerable to attack as the least sensitive piece that you put in.”
People can poison your model, he suggested, in some cases, fairly easily.
The solution: Barbu talked about a customized model, with fine tuning, and how that may work.
Specifically, he was talking about something called low-rank adaptation or LORA, a fine-tuning method pioneered at Microsoft in 2021.
When I looked into it, experts cite two things that LORA does in a sort of proprietary way: it tracks changes to weights rather than updating weights directly, and it makes the large matrix of weight changes into smaller matrices to isolate parameters.
In describing some related methods, Barbu talked about extracting what’s needed from a library of components, and explained that there are lots of ways to approach this. Using a Venn diagram, he noted the difference between adaptive and selective types of methods, and others.
You can think, he suggested, about strategies like turning English into SQL to problem-solve and expedite solutions. At the end of the day, though, the challenge remains.
“Security is binary,” he noted. “You succeed or fail.”
Now, this part of the presentation caught my ear: Barbu talked about some kinds of potential AI tools that could end up taking a lot of the labor out of information security. He described a situation where the tools sit on top of a network, looking for sensitive information, and actually provide input.
This, he suggested, could solve some rampant HIPAA problems with leaks of protected health data.
“We just don’t have a good solution to this so far,” he said. But with the right tools, we might!
Another good idea that I heard from his talk was around labeling ‘informed perplexity’ and ‘uninformed perplexity’ to see where a problem lies.
Then, he said, we can also improve the secret-keeping capabilities of our LLMs.
“We can build secure LLMs,” he added, “models that are completely immune to any kind of attack, because we simply don’t connect the parameters that the user should have access to, so there’s literally nothing that they can possibly do…”
Look, there was a lot more in this demo – for example, Barbu talked a good bit about the scenario of keeping data on alien landings in a database – just as an example! So that part was notable, too. You’ll have to delve into the video to really see all of it in detail. But I’m going to keep bringing the highlights from this conference to you, almost in real time!
Read the full article here