These are still early days for AI, but the trajectory from things like ChatGPT makes it pretty clear to Pure Storage: The need to store and serve huge amounts of data to train AI models on GPUs will almost certainly require large amounts of speedy, next-gen all-Flash arrays.
Large language models (LLMs) have given the world a taste of what AI can do, but there’s much more work to do, says Pure Storage Vice President of R&D Shawn Rosemarin.
“The challenge here is that the majority of businesses are looking to glean information from data sets that are not available in the open Internet, some of which is highly confidential, highly secure, highly restricted,” Rosemarin tells Datanami. “And all of that data requires training on its own in order to actually be useful.”
AI models like ChatGPT have given us a sort of reasoning engine, which is wonderful, Rosemarin says. Instead of requiring a human to absorb a bunch of data in order to make sense of it and be able to ask questions about it, pre-trained transformer models like ChatGPT have given us another path.
The next step is applying the same techniques to companies’ private data, such as radiology records, trading records, or oil reserves, he says. That requires a significant increase in storage and compute.
“It puts tremendous pressure on storage. Because tape, where a lot of this is held, isn’t fast enough, can’t be parallelized. Hard drives aren’t fast enough, can’t be parallelized,” Rosemarin says. “Customers are very clearly seeing that storage is the bottleneck for them to get full utilization out of their GPUs. These things command a ton of power, but they also command a ton of storage, not just in terms of IOPS but in terms of parallel storage performance.”
Companies that originally considered Flash as their performance storage tier may need to rethink their approach, and move to Flash as their primary data store, he says. Flash arrays will be better able to keep GPUs fed with training data and handle all of the other data tasks required to train AI models.
“We have to think about this concept of training as being very data-intensive. We have to take very large data sets. We have to break those data sets into chunks of relevant information, specifically relevant by that I mean labeled, ideally labeled information,” Rosemarin says. “And then feed it to these GPUs…that can then go and train the model.”
Not only do large data sets require more storage, but training LLMs on large data sets also requires more performance and more IOPS. All of this points to a future where super-fast Flash arrays become the standard for training AI models.
“More parameters means I need to have more IOPs, because I have more IOs per second so that I can actually train these models,” he says. “Performance becomes essential because the GPUs will consume as much data as I throw at it and in most cases, there is a major issue actually getting enough storage to the GPUs. And then there’s the parallelization of all these data services. I have potentially thousands of GPUs all starving for storage. They all want to be fed with storage in a very quick amount of time, and nobody wants to wait for anybody else to finish.”
Rosemarin, naturally, thinks Pure Storage has an inside track to fill this looming demand for fast storage for AI training. He points to the fact that the company makes its own disks, or DirectFlash Modules (DFMs), from raw NAND sourced from suppliers, which he says gives Pure Storage more control. He also points out that the company develops its own operating system, Purity, which gives it further control.
In terms of capacity, Pure Storage also has a lead, Rosemarin says. Pure Storage’s roadmap calls for a 300 TB DFM by 2025, while other flash providers’ roadmaps only go out to 60 TB, he says.
Pure Storage has worked with some of the largest AI companies in the world, including Facebook parent Meta, where it supplies storage for Meta AI’s Research Super Cluster (AI RSC), one of the largest AI supercomputers in the world. Pure worked with Nvidia to devise its AI-Ready Infrastructure (AIRI) solution, which is built on the Nvidia DGX BasePOD reference architecture for AI and includes the latest FlashBlade//S storage.
This week at its Pure//Accelerate 2023 user conference, Pure Storage made several announcements, including the unveiling of new additions to its FlashArray//X and FlashArray//C R4 models, as well as ransomware protection for its Evergreen//One storage-as-a-service offerings.
Pure says the FlashArray//C R4 models deliver up to a 40% performance boost, an 80% increase in memory speeds, and a 30% increase in inline compression. The FlashArray//C line will include the 75TB QLC DFMs, while the FlashArray//X line will ship with the 36TB TLC DFMs, the company says.
The new service level agreement (SLA) for the Evergreen//One storage service, meanwhile, gives customers certain guarantees following a ransomware attack. Specifically, the company states that it will ship clean storage arrays the day following an attack at the latest, and that it will work with the customer to finalize a recovery plan within 48 hours.
Editor’s note: This article was corrected. Pure Storage expects 300TB DFMs by 2025, not 2026. Datanami regrets the error.