In the dizzying race to build generative A.I. systems, the tech industry’s mantra has been bigger is better, no matter the price tag.
Now tech companies are starting to embrace smaller A.I. technologies that are not as powerful but cost a lot less. And for many customers, that may be a good trade-off.
On Tuesday, Microsoft introduced three smaller A.I. models that are part of a technology family the company has named Phi-3. The company said even the smallest of the three performed almost as well as GPT-3.5, the much larger system that underpinned OpenAI’s ChatGPT chatbot when it stunned the world upon its release in late 2022.
The smallest Phi-3 model can fit on a smartphone, so it can be used even if it’s not connected to the internet. And it can run on the kinds of chips that power regular computers, rather than more expensive processors made by Nvidia.
Because the smaller models require less processing, big tech providers can charge customers less to use them. They hope that means more customers can apply A.I. in places where the bigger, more advanced models have been too expensive to use. Though Microsoft said using the new models would be “substantially cheaper” than using larger models like GPT-4, it did not offer specifics.
The smaller systems are less powerful, which means they can be less accurate or sound more awkward. But Microsoft and other tech companies are betting that customers will be willing to forgo some performance if it means they can finally afford A.I.
Customers imagine many ways to use A.I., but with the biggest systems “they’re like, ‘Oh, but you know, they can get kind of expensive,’” said Eric Boyd, a Microsoft executive. Smaller models, almost by definition, are cheaper to deploy, he said.
Mr. Boyd said some customers, like doctors or tax preparers, could justify the costs of the larger, more precise A.I. systems because their time was so valuable. But many tasks may not need the same level of accuracy. Online advertisers, for example, believe they can better target ads with A.I., but they need lower costs to be able to use the systems regularly.
“I want my doctor to get things right,” Mr. Boyd said. “Other situations, where I am summarizing online user reviews, if it’s a little bit off, it’s not the end of the world.”
Chatbots are driven by large language models, or L.L.M.s, mathematical systems that spend weeks analyzing digital books, Wikipedia articles, news articles, chat logs and other text culled from across the internet. By pinpointing patterns in all that text, they learn to generate text on their own.
But L.L.M.s store so much information that retrieving what is needed for each chat requires considerable computing power. And that is expensive.
While tech giants and start-ups like OpenAI and Anthropic have been focused on improving the largest A.I. systems, they are also competing to develop smaller models that offer lower prices. Meta and Google, for instance, have released smaller models over the past year.
Meta and Google have also “open sourced” these models, meaning anyone can use and modify them free of charge. This is a common way for companies to get outside help improving their software and to encourage the larger industry to use their technologies. Microsoft is open sourcing its new Phi-3 models, too.
(The New York Times sued OpenAI and Microsoft in December for copyright infringement of news content related to A.I. systems.)
After OpenAI released ChatGPT, Sam Altman, the company’s chief executive, said the cost of each chat was “single-digits cents” — an enormous expense considering what popular web services like Wikipedia are serving up for tiny fractions of a cent.
Now, researchers say their smaller models can at least approach the performance of leading chatbots like ChatGPT and Google Gemini. Essentially, the systems can still analyze large amounts of data but store the patterns they identify in a smaller package that can be served with less processing power.
Building these models is a trade-off between power and size. Sébastien Bubeck, a researcher and vice president at Microsoft, said the company built its new smaller models by refining the data that was pumped into them, working to ensure that the models learned from higher-quality text.
Part of this text was generated by the A.I. itself — what is known as “synthetic data.” Then human curators worked to separate the sharpest text from the rest.
Microsoft has built three different small models: Phi-3-mini, Phi-3-small and Phi-3-medium. Phi-3-mini, which will be available on Tuesday, is the smallest (and cheapest) but the least powerful. Phi-3-medium, which is not yet available, is the most powerful but the largest and most expensive.
Making systems small enough to go directly on a phone or personal computer “will make them a lot faster and order of magnitudes less expensive,” said Gil Luria, an analyst at the investment bank D.A. Davidson.