This course provides a comprehensive guide to deploying, managing, and optimizing AI and high-performance computing (HPC) workloads on Google Cloud. Through a series of lessons and practical demonstrations, you’ll explore diverse deployment strategies, ranging from highly customizable environments using Google Compute Engine (GCE) to managed solutions like Google Kubernetes Engine (GKE). Specifically, you’ll learn how to create GPU-accelerated clusters on GCE and GKE and how to deploy AI inference workloads on GKE.

What you'll learn
Describe the process of creating a GPU-accelerated cluster.
Identify how to provision a GPU-accelerated cluster on GCE.
Identify how to provision a GPU-accelerated cluster on GKE.
Identify how to deploy AI inference workloads on GKE.
Skills you'll gain
- Model Deployment
- Kubernetes
- Distributed Computing
- System Configuration
- Containerization
- Cloud Infrastructure
- Network Performance Management
- Network Planning And Design
- AI Orchestration
- Application Deployment
- Cloud Engineering
- Cloud Deployment
- Infrastructure As A Service (IaaS)
- Google Cloud Platform
- Performance Tuning
Details to know

December 2025
4 assignments
There are 6 modules in this course
This module offers an overview of the course and outlines the learning objectives.
What's included
1 plugin
This module details the AI Hypercomputer cluster creation process. It covers the key decisions required, including choosing a machine type, consumption option, deployment option, orchestrator, and cluster image.
What's included
1 assignment, 6 plugins
This module identifies key configuration options and optimization techniques for deploying an AI Hypercomputer cluster on Google Compute Engine (GCE). It covers selecting machine types, accelerator OS images, deployment options, and strategies for optimizing network performance.
What's included
1 assignment, 4 plugins
This module identifies configuration options for deploying an AI Hypercomputer cluster on Google Kubernetes Engine (GKE). It covers containerization, GKE modes of operation, networking configurations, and workload optimization techniques like distributed training and GPU sharing.
What's included
1 assignment, 4 plugins
This module examines optimization techniques for architecting an inference workload on GKE. It covers the GKE inference workflow and key optimizations at both the infrastructure level and the model level.
What's included
1 assignment, 4 plugins
Student PDF links to all modules
What's included
1 reading
Frequently asked questions
Yes, you can preview the first video and view the syllabus before you enroll. You must purchase the course to access content not included in the preview.
If you decide to enroll in the course before the session start date, you will have access to all of the lecture videos and readings for the course. You’ll be able to submit assignments once the session starts.
Once you enroll and your session begins, you will have access to all videos and other resources, including reading items and the course discussion forum. You’ll be able to view and submit practice assessments, and complete required graded assignments to earn a grade and a Course Certificate.