Databricks shocked the big data world last week when it announced plans to acquire MosaicML for a cool $1.3 billion. With just $1 million in revenue at the end of 2022 and $20 million so far this year, some speculated that Databricks wildly overpaid. But according to Databricks CEO Ali Ghodsi, buying MosaicML will be a great deal that will pan out his company and its customers.
MosaicML is a generative AI startup co-founded in early 2021 by Naveen Rao, a respected technology executive who also was the founder of Nervana Systems, a deep learning software startup that Intel bought in 2016 for a reported $400 million.
MosaicML’s goal is pretty straightforward: It wants to make generative AI easy and to bring it to a larger number of people. To that end, MosaicML developed its own platform that allows customers to train their own large language model (LLM) on their own data, and to run it wherever they want.
Instead of forcing users to become AI experts who can train, tune, deploy, scale, and monitor this new type of AI product, MosaicML handles those nitty gritty details for its customers. Users get the choice of using any of the available open source LLMs. Alternatively they can use the company’s LLMs, including MPT-7B, a 7 billion parameter open source LLM, or the newly released MPT-30B, a 30 billion parameter LLM. It also has access to GPUs, which are at a premium these days.
MosaicML’s timing couldn’t be much better. While the AI and machine learning community has been aware of emerging capabilites in LLMs built on Google’s groundbreaking Transformer architecture for the last several years, the world at large didn’t find out about the developments until about eight months ago–to be more specific, November 30, the day OpenAI released ChatGPT to the world.
While ChatGPT has done wonders for garnering attention to progress in AI, not everyone is thrilled with the emerging business model (some are also afraid of what AI will do to society, but that’s another story). Many companies will be happy to tap into the power of OpenAI’s pre-trained GPT models via its API, but many others are vehemently against sending private data over the wire to OpenAI, let alone to Google or Microsoft, which has partnered with OpenAI and is integrating LLMs deeply across its product set.
And therein lies the first big reason that MosaicML landed on Databricks’ radar: MosaicML lets companies build their own LLMs while keeping their data private.
“In the last six to seven months, every large enterprise that we’ve talked to, I think every meeting that I’ve had with a customer, always ends up going towards generative AI,” Ghodsi said during a press conference last week. “Every conversation, even when it’s about something else, at the end of the day, it goes towards generative AI.
“And the number one thing that everyone is saying is we want to own our own intellectual property, we want to build our own machine learning models, and we want to be able to compete in my market that I’m in,” he continued.
When the Databricks executives asked around to see what companies were using to build their own LLMs, there was one name that kept popping up. “And the name we kept hearing in meetings was MosaicML, MosaicML, Naveen at Mosaic,” Ghodsi said.
Databricks reached out to MosaicML with the possibility of establishing a “close partnership.” But as the talks progressed, a barrier soon presented itself.
“We realized that if we really want to make it seamless, if we want to help customer build their own [LLM], keep their data private, and push down the cost and make it really cost effective to do this cheaply themselves, the best way is to join forces,” Ghodsi said.
The integration between Databricks and MosaicML hasn’t begun–regulators will get to have their say in the acquisition. Once it does, Ghodsi is convinced that the resulting platform will enable customers to churn out generative AI apps much more quickly than using alternative methods.
“It’s actually very, very hard to build them today,” he said. “Very few companies, very few researchers can do it. It’s a magic art. You have to know how to do everything right. And Mosiac just works out of the box. We actually tried it while we were doing our due diligence. You set some parameters and it just works out of the box. They have the GPUs and you get the model that you can own as your own IP on your own data. So it’s perfect.”
In just a few quick years, Databricks has grown from being keepers of Apache Spark to running one of the premier platforms for AI. It has already amassed 10,000 paying customers who can tap into Databricks’ solutions for data storage, SQL analytics, machine learning, and streaming data. The acquisition of MosaicML will add generative AI to the list of things customers can do with their data in Delta Lake.
This isn’t Databricks’ first foray into generative AI. Earlier this year, amid the ChatGPT hype, the company launched Dolly, a 6 billion parameter, open source LLM that customers can use to build their own generative AI applications. It followed that up in April with Dolly 2, a 12 billion parameter model that was also built on EleutherAI
While the tech giants focus on massive, proprietary LLMs that contain hundreds of billions of parameters and are trained on huge tracks of diverse data harvested from the public Internet, Databricks has espoused the benefits of using smaller, open source models trained on much more targeted datasets. Cost is a big reason for this approach, and with MosaicML, customers will be able to dial the cost up and down, depending on their needs.
“You can get something useable for $1,000, and if you go all the way up to $1 million, you can get something very high quality, extremely high quality,” Ghodsi said. “It’s far away from what the press is talking about…when they’re talking about OpenAI spending billions of dollars, or Google DeepMind spending billions, which they are. But we’re not talking about anything at all like that to be able to produce very valuable, very good machine learning models that can bring value to these organizations.”
Rao, who was a 2017 Datanami Person to Watch, also discussed the pending acquisition during a keynote address last week at the Databricks Data + AI Summit. Specifically, he talked about how the folks behind Replit, which develops a browser-based IDE for AI software, tapped MosaicML and Databricks to build new tools. According to Rao, it took just three days for Replit to get a working prototype after plugging in their data housed in Databricks.
“It kind of worked on the first try. That’s quite amazing. We love those use cases,” Rao said. “Really what we focus on from the very start was respecting data privacy. We believe that’s critical to scaling generative AI capabilities and democratizing them. If you don’t actually have technologies that can enforce data privacy, you end up with misaligned incentives int the market. So really giving people the ability to build solutions on their unique data, creating their own IP, is super important.”
MosaicML has scaled itself to address this generative AI frenzy. The company, which raised $37 million, hired more than 60 employees as it sought to grow the business. Ghodsi said the MosaicML team shares the same DNA as Databricks, which will make integrating the two companies easier.
Databricks is not buying MosaicML to gain access to its LLMs. As Ghodsi noted, LLMs will come and they will go. There will be thousands to choose from. Instead, MosaicML appealed to him and the Databricks executive team because of its capability to help users churn out their own custom LLMs and generative AI apps.
“I’m buying the factory that can create this,” he said. “I’m not buying Tesla Model X. If the Tesla Model X has some malfunction or something, it’s okay. I’m buying the factory that can produce those Tesla cars and can produce more and more of them together.”
But there’s one more thing that helped to seal the deal. “I would also say they have access to a lot of GPUs,” Ghodsi said. “That’s nice icing on the cake, because of the scarcity in GPUs right now.”