Big Benefits of Small AI Models for Tech Giants

Updated on November 20, 2024 · 3 min read


In the pursuit of replicating human intelligence, the artificial intelligence arms race initially centered on building massive models trained on vast datasets. Recently, however, tech giants and startups alike have shifted their attention toward leaner, more specialized AI software that is cheaper and faster to run.

These small and medium-sized language models, designed for specific tasks and trained on less data, have gained significant traction. Unlike their larger counterparts, they can be developed for under $10 million and use fewer than 10 billion parameters. For comparison, OpenAI's GPT-4, one of the largest models, reportedly cost more than $100 million to build and uses over one trillion parameters. The smaller size translates into lower computing requirements and a lower price per query.

For instance, Microsoft has placed particular emphasis on its Phi family of small models. CEO Satya Nadella says these models are 1/100th the size of the model behind OpenAI's ChatGPT, yet they handle many tasks with comparable efficiency. Yusuf Mehdi, Microsoft's Chief Commercial Officer, notes that running large models has cost more than anticipated, reinforcing the case for distinct models for different tasks. Microsoft has also introduced AI laptops that run numerous AI models for search and image generation directly on the device, rather than relying on the massive cloud supercomputers that ChatGPT depends on.

Other corporations such as Google and AI startups like Mistral, Anthropic, and Cohere have also released smaller models. Additionally, Apple has unveiled plans to integrate small models to enhance the speed and security of AI operations on phones.

OpenAI, long an advocate of large models, has launched a more affordable version of its flagship model and intends to focus more on smaller models in the future. For tasks like document summarization or image generation, a large model is overkill, like taking a tank on a grocery run. Smaller models can offer comparable performance at a significantly lower cost and are often tailored to specific tasks such as managing legal documents or internal communications. Yoav Shoham of AI21 Labs argues that small models are far more economical for widespread use, answering questions at a fraction of what a large model would cost.

Businesses are readily adopting these smaller models to improve efficiency and reduce costs. For example, Experian switched to smaller models for its AI chatbots and achieved performance similar to larger models at a reduced expense. Salesforce's Clara Shih highlights the practicality of smaller models, noting that large models often bring excessive costs and latency issues.

Since OpenAI released GPT-4, there have been no significant leaps in large-model capability, and progress has plateaued. Efforts have consequently been redirected toward improving the efficiency of smaller models. Sébastien Bubeck of Microsoft observes a current pause in large-model development and encourages work on efficiency instead.

Despite this shift, large models still hold value for advanced tasks. Companies like Apple and Microsoft continue to incorporate large models such as ChatGPT into their products, although these integrations typically represent only a fraction of their overall AI initiatives. This progression marks the transformation of AI from futuristic demonstrations into practical commercial products.


Master Data Science and AI with Code Labs Academy’s industry-driven bootcamp, designed to teach you the skills top employers demand.