Microsoft Secretly Develops Its First Hundred-Billion-Parameter-Scale Large Model to Rival OpenAI
In a recent announcement, Microsoft revealed that it is developing a new AI model at the 100 billion parameter level, less than two weeks after the release of the Phi-3 Mini model.
For the first time since investing more than $10 billion in OpenAI in exchange for the right to use its AI models, Microsoft has begun to develop in-house an AI model large enough to compete with state-of-the-art models from Google, Anthropic, and OpenAI.
The new model, known internally as MAI-1, is being overseen by Mustafa Suleyman, a former Google AI leader and former CEO of AI startup Inflection. MAI-1 will be significantly larger than any of the smaller open-source models that Microsoft has trained previously, such as Phi-3, according to sources familiar with the matter. It will therefore cost more to build, because it needs more computing power and training data.
At the same time, Microsoft's move suggests that it is now pursuing a "dual track" in AI, with the goal of developing "small language models" that can be cheaply built into apps and run on mobile devices, as well as larger, state-of-the-art AI models. Apple is also pursuing a similar strategy, having released eight small AI language models for use on devices.
MAI-1 is expected to have approximately 500 billion parameters, the settings that are adjusted during training and that determine what the model learns. By comparison, OpenAI's GPT-4 is reported to have more than 1 trillion parameters, while smaller open-source models released by companies such as Meta and Mistral have around 70 billion parameters. This suggests that MAI-1 could be positioned between GPT-3 and GPT-4, with response accuracy well above that of open-source models such as Llama and Mistral, but at or below that of OpenAI's flagship LLM. To train the model, Microsoft has been allocating a significant number of servers with Nvidia GPUs and compiling training data from a variety of sources, including text generated by OpenAI's GPT-4 and public Internet data.
The precise application of MAI-1 remains undetermined, even within Microsoft, and its best use will depend on how well it performs. A model with 500 billion parameters would be too large to run on consumer devices. This implies that Microsoft will likely run MAI-1 in its data centres, where the large language model can be integrated into services such as Bing and Azure.
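To make the size argument concrete, the short Python sketch below estimates the memory needed just to hold a model's weights. It is a back-of-envelope illustration, not a description of how MAI-1 is stored or served: the bytes-per-parameter figures are standard numeric formats chosen here as assumptions, the function and variable names are hypothetical, and the estimate ignores activations, caches, and other runtime overhead.

```python
# Rough estimate of the memory needed to hold model weights only.
# Assumption: every parameter is stored at the same numeric precision.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Return weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


# Parameter counts taken from the article; the labels are illustrative.
MODELS = {
    "500B (reported MAI-1 scale)": 500e9,
    "70B (open-source scale)": 70e9,
}

if __name__ == "__main__":
    for name, num_params in MODELS.items():
        for precision in BYTES_PER_PARAM:
            size = weight_memory_gb(num_params, precision)
            print(f"{name:30s} {precision:5s} {size:8.0f} GB")
```

Under these assumptions, 500 billion parameters at 16-bit precision come to roughly 1,000 GB of weights, and still about 250 GB at 4-bit precision, far more than the memory available on phones or typical PCs.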
Inflection was previously a competitor to OpenAI, but its business focus has shifted from its chatbot, Pi, to selling AI software to enterprises, according to a statement on Inflection's official website. Sean White, who has held a variety of technical roles, has joined the company as its new CEO. In March, Microsoft took on most of the startup's employees and its intellectual property in a $650 million deal and hired Suleyman to lead a new consumer AI division. The division brings together consumer-facing products, including Microsoft's Copilot, Bing, Edge and GenAI, into a team called Microsoft AI, which reports directly to Microsoft CEO Satya Nadella.
The new division represents a significant organisational change for Microsoft: Mikhail Parakhin, who leads advertising and web services, now reports to Suleyman along with his entire team. It is also one of Microsoft's latest moves to capitalise on the generative AI boom. Microsoft's development of the MAI-1 large model further demonstrates its willingness to pursue AI development independently of vendors such as OpenAI.
Microsoft has been working to roll out AI assistants across Windows, its Office software, and its cybersecurity tools, among other products. So far, however, it has mostly done so by partnering with outside firms. Microsoft has invested about $13 billion in OpenAI, the developer of ChatGPT, and is rapidly integrating its technology into products and digital experiences. OpenAI's technology now underpins many of Microsoft's generative AI capabilities, including those in Azure, Copilot, and the chatbots built into Windows.
This could change in the future, as Microsoft considers using its own large models across its products. The division led by Suleyman is said to be taking over projects such as integrating the Copilot AI assistant into the Windows operating system and expanding the use of generative AI in the Bing search engine.