Top 10 Cloud GPU Providers in 2023


GPU procurement has become more complex as more providers offer cloud GPU options. AIMultiple analyzed GPU cloud providers across the most relevant dimensions to facilitate cloud GPU procurement.

While listing pros and cons for each provider, we relied on user reviews on G2, other online reviews, and our own assessment.

Amazon Web Services (AWS)

AWS is the largest cloud platform provider and a leading cloud GPU provider.1 Amazon EC2 (Elastic Compute Cloud) offers GPU-powered virtual machine instances facilitating accelerated computations for deep learning tasks.

Pros

Offers seamless integration with other popular AWS solutions like:

  • SageMaker, used for building, training, and deploying ML models at scale
  • Simple Storage Service (Amazon S3), Amazon RDS (Relational Database Services) or other AWS storage services, which can serve as a storage solution for training data

Cons

  • AWS offers fewer GPU options than some other players like Azure.
  • Users find the UI complex, as reflected in reviews of AWS EC2 on G2 (references 2 and 3)
  • On-demand hourly pricing is higher than that of other big cloud providers. Like them, AWS offers volume discounts.

Microsoft Azure

Microsoft Azure, the second largest cloud provider, offers a cloud GPU service known as Azure N-Series Virtual Machines, which, like other providers’ offerings, uses NVIDIA GPUs to deliver high-performance computing capabilities.4 This service is particularly suited to demanding applications such as deep learning, simulations, rendering, and the training of AI models.

Pros

  • Azure offers a larger set of GPU options than most other providers
  • Free plan offers 12 months of access to some services
  • Azure’s intuitive user interface is praised for its ease of use

Cons

  • Some users find that certain advanced features within Azure require a high level of technical expertise to configure and manage effectively (review of Azure Virtual Machines on G2, reference 5)
  • Some users find Azure’s pricing structure complex to navigate and stress the importance of careful planning to avoid unexpected costs

Google Cloud Platform (GCP)

Google Cloud Platform (GCP) is the third biggest cloud platform.6 GCP offers GPUs that can be attached to existing virtual machines (VMs) or included in a new VM setup.

Pros

  • UI is easier to use than those of other common platforms such as AWS
  • Offers limited free GPU options for Kaggle and Colab users
  • Customers can use 20+ products for free, up to monthly usage limits

Cons

  • GPUs must be attached to standard VMs, making pricing confusing
  • Like AWS, GCP offers fewer GPU options than some players like Azure

NVIDIA DGX Cloud

NVIDIA is the leader in the GPU hardware market. NVIDIA launched its GPU cloud offering, DGX Cloud, by leasing space in the data centers of leading cloud providers (e.g. OCI, Azure, and GCP).

DGX Cloud offers NVIDIA Base Command™, NVIDIA AI Enterprise and NVIDIA networking platforms. DGX Cloud instances featured 8 NVIDIA H100 or A100 80GB Tensor Core GPUs at launch.

The research team at Amgen, an early customer, claims 3x faster training of protein LLMs with BioNeMo and up to 100x faster post-training analysis with NVIDIA RAPIDS.8

Oracle Cloud Infrastructure (OCI)

Oracle ramped up its GPU offering after formalizing its partnership with NVIDIA.9

Oracle provides GPU instances in both bare-metal and virtual machine formats for quick, cost-effective, and high-efficiency computing. Oracle’s bare-metal instances let customers run workloads in non-virtualized environments. These instances are available in regions such as the United States, Germany, and the United Kingdom, under both on-demand and interruptible pricing models.

Pros

  • Wide range of cloud products and services. Among the tech giants’ cloud services, only OCI offers bare-metal GPUs.10 It is also the only one of them to offer RoCE v2 for its GPU cluster technology.11
  • Cost-effective compared to other major cloud providers
  • Offers a free trial period and some free-forever products

Cons

  • Users perceive the user interface as clunky and slow (review of OCI Compute, reference 12)
  • Some users find the documentation difficult to understand (review of OCI Compute, reference 13)
  • The process of starting to use Oracle Cloud compute services was viewed as bureaucratic, complicated, and time-consuming by some users

CoreWeave

CoreWeave is a specialized GPU cloud provider. NVIDIA is one of CoreWeave’s investors. CoreWeave claims to have 45,000 GPUs and to have been selected by NVIDIA as its first Elite-level cloud services provider.14

Jarvis Labs

Jarvis Labs, established in 2019 and based in India, specializes in facilitating swift and straightforward training of deep learning models on GPU compute instances. With its data centers located in India, Jarvis Labs is recognized for its user-friendly setup that enables users to start operations promptly.

Pros

  • No credit card required to register
  • A simple interface for beginners

Cons

  • Although it is gaining momentum, Jarvis Labs is not a good option for enterprise-level or time-consuming tasks

Lambda Labs

Originally, Lambda Labs was a hardware company offering GPU desktop assembly and server hardware solutions. Since 2018, Lambda Labs has offered Lambda Cloud as a GPU platform. Its virtual machines come pre-installed with the major deep learning frameworks, CUDA drivers, and a dedicated Jupyter notebook. Users can connect to these instances through the web terminal in the cloud dashboard or directly over SSH using the provided keys.

Pros

  • Purely GPU-focused offering

Paperspace CORE

Paperspace is a cloud computing platform that offers GPU-accelerated virtual machines, among other services. The company is well-regarded for its focus on GPU-intensive workloads and provides a cloud platform for developing, training, and deploying machine learning models.

Pros

  • Offers a wide range of GPUs compared to other providers
  • Users find the prices fair for the computing power provided
  • Users find the customer service to be friendly and responsive

Cons

  • Some users complain about machine availability, noting that free virtual machines and specific machine types are not available in all regions (review of Paperspace Core, reference 15)
  • The integrated Jupyter interface is criticized for lacking some keyboard shortcuts, although a native Jupyter Notebook interface is also offered (review of Paperspace Core, reference 16)
  • Longer loading or creation times for machines
  • Monthly subscription fee on top of machine costs can be a downside, and multi-GPU training can be expensive

  1. “Big Three Dominate the Global Cloud Market”. Statista. Retrieved July 19, 2023.
  2. https://www.g2.com/products/amazon-ec2/reviews/amazon-ec2-review-8154729
  3. https://www.g2.com/products/aws-cloud/reviews/aws-cloud-review-8271023
  4. Same Statista source as reference 1.
  5. https://www.g2.com/products/azure-virtual-machines/reviews/azure-virtual-machines-review-8145738
  6. Same Statista source as reference 1.
  7. “NVIDIA Launches DGX Cloud, Giving Every Enterprise Instant Access to AI Supercomputer From a Browser”. NVIDIA. March 21, 2023. Retrieved September 26, 2023.

NVIDIA DGX Cloud is enterprise focused, with the list price of DGX Cloud instances starting at $36,999 per instance per month at launch.

Pros

  • Support from NVIDIA engineers

Cons

  • The offering is not suitable for firms with limited GPU needs
  • The service is provided on top of cloud providers’ physical infrastructure. Therefore, the buyer pays the margins of both the cloud provider and NVIDIA.
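
To put the DGX Cloud list price in perspective, it can be converted into an approximate hourly cost per GPU. The sketch below assumes 8 GPUs per instance (the launch configuration described above) and an average month of 730 hours. The function name and the 730-hour figure are illustrative assumptions, not NVIDIA's pricing method.

```python
# Assumption: an average month has 365 * 24 / 12 = 730 hours.
HOURS_PER_MONTH = 730


def per_gpu_hourly_cost(monthly_price, gpus_per_instance,
                        hours_per_month=HOURS_PER_MONTH):
    """Approximate hourly cost of one GPU, given a flat monthly instance price.

    Assumes the instance runs continuously for the whole month.
    """
    if gpus_per_instance <= 0 or hours_per_month <= 0:
        raise ValueError("gpus_per_instance and hours_per_month must be positive")
    return monthly_price / (gpus_per_instance * hours_per_month)


# DGX Cloud launch list price: $36,999 per instance per month, 8 GPUs each.
cost = per_gpu_hourly_cost(36_999, 8)
print(f"${cost:.2f} per GPU-hour")  # prints: $6.34 per GPU-hour
```

Because the price is a flat monthly fee, running the instance for fewer hours raises the effective cost per GPU-hour, which is one reason the offering suits firms with sustained GPU needs.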

IBM Cloud

IBM Cloud’s GPU offering allows flexible server selection and integrates seamlessly with the architecture, applications, and APIs of IBM Cloud. It is delivered through a globally distributed network of interconnected data centers.

Pros

  • Powerful integration with IBM Cloud architecture and applications
  • Globally distributed data centers increase data protection

Cons




This post originally appeared on TechToday.