The Gen AI industry has spurred massive growth in the GPU market, with NVIDIA, the most valuable company in the world, at the forefront of this explosion. Several factors shape GPU consumption models: GPU supply (shortage vs. glut), cost vs. performance amid the explosion of LLM models, complexity in the underlying technology choices, and the need for enterprises to experiment (run PoCs) rather than make long-term commitments.

This lack of certainty means enterprises and startups prefer to “rent” GPUs rather than buy dedicated hardware. It has created a new industry of “GPU-as-a-Service” or “AI Cloud” providers who rent GPUs to customers – often as bare metal, but sometimes integrated with sophisticated software and services packaged for customers. This nascent GPU-as-a-Service market is forecast to grow 16x to $80B over the next decade.

To date, a small set of hyperscalers, crypto mining companies, and startups offer GPU-as-a-Service (GPUaaS). Moving forward, that number is expected to grow massively. Sequoia Capital, the world's leading venture capital firm, recently published a blog post that likened the GPU capex buildout to the railroad industry of old – “build the railroad and hope they will come”.

If you have decided to be in the GPUaaS business, you are looking at a great opportunity. However, as with any attractive business, there is no free lunch. As of July 2024, there are 600 new competitors, intense margin pressure, and a complex tech stack to deal with. Bare metal GPU instances are already down to $2.30 per hour, and with supply pressure easing, prices are likely to drop further.

Offering only bare metal-as-a-service is not a prudent ROI option for most AI Cloud providers. While it works for very large, long-term workloads such as training LLMs, the majority of the market does not need such large capacities locked up for extended durations. Inferencing, fine-tuning, and training of smaller deep learning models account for a much larger market with “bursty” and dynamic requirements.

Given these rapid shifts in the GPU market, what should an AI Cloud provider do? If your customers are startups or enterprises building products that require training smaller models, or inferencing and fine-tuning existing off-the-shelf models, how should you serve them? These are some of the questions participants in the AI value chain are seeking answers to.

The lack of clarity in this emerging industry has given us at Aarna.ml an opportunity to offer an independent point of view to GPUaaS providers. Over the next few weeks we will publish a series of posts on the GPU-as-a-Service industry. We hope enterprises, startups, data center companies, and AI Cloud providers will find our observations and opinions useful.

About us: Aarna.ml is an NVIDIA- and venture-backed startup building software that helps GPU-as-a-Service companies build hyperscaler-grade cloud services with multi-tenancy and isolation.