Video: from drawing unicorns to – well, how high will AI fly?
We’ve heard a lot about large language models in general, and much of it has been elucidated at this conference, but many of the speakers have great personal takes on how this type of process works, and what it can do!
For example, here we have Yoon Kim talking about language models as statistical objects, and about how neural networks (transformer-based neural networks in particular) put next-word prediction to versatile use. He uses the example of the location of MIT:
“You might have a sentence like: ‘the Massachusetts Institute of Technology is a private land grant research university’ … and then you train this language model (around it),” he says. “Again, (it takes) a large neural network to predict the next word, which, in this case, is ‘Cambridge.’ And in some sense, to be able to accurately predict the next word, it does require this language model to store knowledge of the world, for example, (it) must store factoid knowledge, like the fact that MIT is in Cambridge. And it must store … linguistic knowledge. For example, to be able to pick the word ‘Cambridge,’ it must know what the subject, the verb and the object of the preceding or the current sentence is. But these are, in some sense, fancy autocomplete systems.”
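To make that prediction step concrete, here is a minimal sketch of next-word prediction with the Hugging Face transformers library, using the small GPT-2 checkpoint as a stand-in for the much larger models Kim is describing. This is an illustration of the mechanism rather than the talk’s own code: the prompt is lightly extended with “located in” so that a place name is the natural next token, and whether “Cambridge” actually tops the list depends on the model you load.

```python
# Minimal next-word prediction sketch (assumes `torch` and `transformers` are installed).
# GPT-2 is only a small stand-in here; larger models are far more reliable at this.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = ("The Massachusetts Institute of Technology is a private "
          "land-grant research university located in")

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits        # shape: (1, sequence_length, vocab_size)

next_token_logits = logits[0, -1]          # distribution over the very next token
top_ids = torch.topk(next_token_logits, k=5).indices.tolist()
print([tokenizer.decode(i) for i in top_ids])  # ideally " Cambridge" appears near the top
```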
Pairing knowledge of the world with linguistic knowledge, he says, is important for these systems.
Check out the part of the video where he talks about a New Yorker cartoon illustrating some of the worst-case scenarios around this type of learning.
As for capabilities, Kim talks about asking the model to draw a unicorn with TikZ:
“You know, it’s not a good unicorn, but it’s doing something,” he notes. “And this sort of made my jaw drop, the fact that you could get a model to produce something that connotes some sort of sophisticated real-world knowledge, just through being able to predict next-word tokens.”
Later, Kim goes over some of the limits and capabilities of systems trained purely on surface-form text.
“We’re looking at this through the lens of what we’re calling counterfactual evaluation of language models. In particular, our hypothesis is that if a language model is really able to solve a task … then it should be able to solve the same task under a counterfactual world that essentially describes the same task of (the) same difficulty.”
Using this counterfactual evaluation, he shows how the system tries to accommodate requests, but starts to lose coherence when asked to rotate an image 90 degrees.
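The shape of that experiment can be sketched in a few lines of Python. This is only a rough rendering of the recipe as Kim describes it, not his benchmark code, and the query_model argument is a hypothetical stand-in for however you would call the model under test.

```python
# Rough sketch of the counterfactual-evaluation recipe: score the model on a task
# in the default world and in a counterfactual world of comparable difficulty,
# then look at the gap. The model interface here is a hypothetical placeholder.
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (prompt, expected answer substring)

def accuracy(query_model: Callable[[str], str], examples: List[Example]) -> float:
    hits = sum(expected in query_model(prompt) for prompt, expected in examples)
    return hits / len(examples)

def counterfactual_gap(query_model: Callable[[str], str],
                       default_world: List[Example],
                       counterfactual_world: List[Example]) -> float:
    # A model that has genuinely learned the task should show a small gap;
    # a large drop suggests it has overfit to the surface form of the default world.
    return accuracy(query_model, default_world) - accuracy(query_model, counterfactual_world)
```

The ThonPy case described next is exactly the kind of default/counterfactual pair such a harness would take.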
Another big failure, he suggests, shows up when the system is tested on a hypothetical language called “ThonPy” that is the same as Python, but uses one-based indexing, as opposed to the conventional zero-based indexing.
“It turns out, these models fail spectacularly at the ThonPy language,” Kim says, using that result as a bellwether for an AI’s capacity to learn other systems.
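To picture what changes between the two worlds, here is a tiny illustration with the ThonPy convention simulated by hand in ordinary Python; the list and values are made up for the example.

```python
# "ThonPy" simulated by hand: same expression, but the index convention shifts by one.
xs = [10, 20, 30]

python_answer = xs[1]       # real Python, zero-based: index 1 is the second element, 20
thonpy_answer = xs[1 - 1]   # ThonPy convention, one-based: index 1 means the first element, 10

print("Python says xs[1] ==", python_answer)   # 20
print("ThonPy says xs[1] ==", thonpy_answer)   # 10
```

A model that has genuinely learned how indexing works should handle both conventions; one that has overfit to ordinary Python text tends to keep answering as though the indexing were zero-based, which is the kind of spectacular failure Kim is pointing to.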
You can watch this part of the presentation to see more about why these systems may not be able to do some of the higher-level tasks that they are asked to do.
“We see the performance of this sort of extensible ability to (perform tasks) decrease dramatically in this counterfactual world,” he says. “So we’re currently working on a suite of benchmarks across math, code, programming, linguistics, even music, to see how much these models’ capabilities are due to their deep reasoning capabilities, or in some sense, sort of overfitting to the surface form text. So … when we were doing these experiments, often we were very amused by the failures, but also incredibly impressed by these capabilities.”
Training, he says, can distill a lot of this knowledge, and what we need, he suggests, is a catalog of the limits and capabilities of these systems as they become the components of new AI applications.
“I think what’s been incredibly surprising to me is … the fact that you can train these simple parameterized learners on large corpora, and get a system that is, in some sense, a general purpose language learner. So it turns out surface form prediction, in particular, next-word prediction, is in some sense, a viable path towards distilling really civilization-level knowledge that’s embedded in the text – that exists out there, into a learner. But these models are, of course, not so perfect. They’re still prone to memorization. And I think there’s still a lot of work to be done in cataloguing the limits and capabilities of these systems.”
It’s a good set of ideas for pursuing the next forms of AI that will astound us humans!
Read the full article here