Storage and archiving technology deficiencies could thwart AI initiatives, preventing artificial intelligence technology from achieving its full potential. While the industry is focused on other potential roadblocks—including the inordinate amounts of energy consumed by AI data centers, the lack of power availability, the scarcity of GPUs and high-powered CPUs, or the lack of data center capacity in key markets—the inefficiency of many archiving solutions is being given less attention.
This topic is the subject of a recent report from the Active Archive Alliance (AAA), “How Active Archives Support Modern AI Strategies.”
“While much of the focus of AI adopters has been on the front end of data processing and analytics, the sustainability of AI workflows must now address the long-term retention and protection of what will be massive and persistent volumes of data,” said Rich Gadomski, Head of Tape Evangelism for FUJIFILM North America and AAA co-chair. “A modern strategy is needed to manage the growth and volume of data, and this can be provided by a sensible active archive implementation coupled with intelligent data management.”
Consider the size of many large language models (LLMs). All the data they analyze needs to be stored somewhere. Companies with unlimited budgets can afford to keep a lot of it in memory and the rest on solid-state drives (SSDs), but for most organizations, the price makes that out of the question. Even keeping all the data on spinning disks is an expensive way to go when you are dealing with the vastness of AI data repositories.
According to Furthur Market Research, storage capacity surpassed one zettabyte (ZB) in 2016, reached 4.8 ZB in 2022, and is expected to reach 50 ZB by 2035. Much of that data sits on media that draw power around the clock, so the energy consumed by AI storage is destined to become a real challenge.
That’s where an active archive comes in. It provides organizations with an intelligent data management layer that can move data where it belongs based on activity, cost, and performance. Data needed by AI applications is shifted to where it can be rapidly analyzed. Otherwise, it sits in a lower storage tier such as hard disks, optical disks, or tape. The data layer ensures there are never long delays in waiting for data to be made available. Automated tiering takes care of data movement. This contains costs and provides eco-friendly long-term storage or performance storage as needed.
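The tiering idea described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular active archive product works: the tier names, the `DataSet` class, the `choose_tier` and `rebalance` functions, and the 7-day and 90-day thresholds are all assumptions made for the example. It also decides on access recency alone, whereas a real data management layer would weigh cost and performance as well.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DataSet:
    """A hypothetical catalog entry tracked by the data management layer."""
    name: str
    last_access: datetime
    tier: str = "tape"  # assumed tiers: "ssd", "hdd", "tape"


def choose_tier(ds, now, hot=timedelta(days=7), warm=timedelta(days=90)):
    """Pick a tier from how recently the data set was accessed.

    The 7-day and 90-day cutoffs are illustrative assumptions.
    """
    idle = now - ds.last_access
    if idle <= hot:
        return "ssd"   # recently used: keep on fast storage
    if idle <= warm:
        return "hdd"   # occasionally used: keep on disk
    return "tape"      # rarely used: keep on the lowest-cost tier


def rebalance(datasets, now):
    """Move each data set to its chosen tier and return the moves made."""
    moves = []
    for ds in datasets:
        target = choose_tier(ds, now)
        if target != ds.tier:
            moves.append((ds.name, ds.tier, target))
            ds.tier = target
    return moves


# Example run with made-up data sets.
now = datetime(2024, 6, 1)
catalog = [
    DataSet("training-corpus", now - timedelta(days=2)),
    DataSet("2019-logs", now - timedelta(days=400), tier="hdd"),
]
moves = rebalance(catalog, now)
print(moves)
```

In this run the recently accessed data set is promoted to the fast tier while the idle one is demoted to tape, which is the automated movement the archive layer is meant to handle without manual intervention.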
As organizations build LLMs and develop AI applications, the need for more storage and efficient archiving will only increase. The topic of AI and energy consumption is not going away, and that will gradually bring the discussion of storage and archiving to the fore.
“AI will accelerate demand for active archive storage,” said Mark Pastor, Director, Platform Product Management at Western Digital. “Archived data will have more value than ever before and will therefore need to be stored for a long time and actively accessed during its life.”