Custom container requirements for inference | Gemini Enterprise Agent Platform | Google Cloud Documentation
