Open Source Vs. Proprietary Models

Generative AI, dominated by proprietary models locked inside big tech companies, is being disrupted by a new wave of open-source models.

Advocates argue that open sourcing brings vital benefits, such as enabling wider access, fostering innovation, and promoting transparency, and many expect open source to win in the marketplace.

But that conclusion is not obvious.

Open-sourcing generative AI is fundamentally different from the open-source movement that has given us tools like TensorFlow, MySQL or Kubernetes. Open-source dominated those arenas because the investment required – time and brain power – could be crowdsourced. But generative AI requires data and energy, both of which are increasingly expensive and out-of-reach for most open-source players.

A handful of big players are dumping billions into generative AI models, and they have largely cornered the GPU market. And as these proprietary players compete, they could offer their models at cost or low margin to build market share. In the near term, the price of remaining GPU capacity could escalate to a point where it’s not competitive for open-source users to run large models.

Furthermore, the value created by proprietary models and open-source models could be asymmetric. There is work on reducing generative AI models so that they can compute on edge devices such as smartphones or autonomous vehicles. But the greatest value will be in models – or the orchestration of multiple models – with agency, that can reason, control other software and act in the real world.

That is likely to remain the purview of well-resourced private companies. Already, companies are building agents that leverage models like GPT-4 for reasoning and can manipulate tools – something that open-source generative AI models cannot fully execute today. Incentives are high to keep these higher-performing models proprietary.

Meta is the exception with its decision to open-source its LLaMA series of models. But there may yet be a backlash against open-sourcing such powerful models because of the potential for misuse.

While Meta argues that generative AI should be open source because otherwise too much power is concentrated in the hands of a few, there is an argument for keeping generative AI locked up – particularly once large models develop agency, which they most likely will.

At a recent closed-door Senate forum convened by Chuck Schumer (D-NY), Meta’s founder, Mark Zuckerberg, was challenged about the safety of open-sourcing such powerful technology; his challenger cited LLaMA 2’s ability to give detailed instructions for creating anthrax, a deadly biological agent.

While the conversation moved on, that argument hung in the air. A team of researchers at Collaborations Pharmaceuticals recently asked a proprietary generative AI system called MegaSyn to create toxic molecules and it generated a large number, including some that were similar to known nerve agents.

Open-source models allow access to parameters – the weights between artificial neurons that influence predictions – so that anyone can experiment with the models without constraint.

While the creators of open-source models build in guardrails, researchers say it takes only a few days to remove that fine-tuning so the model will do whatever someone wants it to do. Given that, it may be hard for open source to prevail against proprietary models without some scientific breakthrough – competitive performance with fewer parameters, for example – or a commitment by powerful private players such as Meta to open-source future generations of models.

As difficult as it is to accept big tech controlling generative AI, it may be the only path forward without such breakthroughs or a shift in funding. But things are moving fast…
