Skip to main content
Technology areas
close
AI and ML
Application development
Application hosting
Compute
Data analytics and pipelines
Databases
Distributed, hybrid, and multicloud
Industry solutions
Migration
Networking
Observability and monitoring
Security
Storage
Cross-product tools
close
Access and resources management
Costs and usage management
Infrastructure as code
SDK, languages, frameworks, and tools
/
Console
English
Deutsch
Español – América Latina
Français
Indonesia
Italiano
Português – Brasil
中文 – 简体
中文 – 繁體
日本語
한국어
Sign in
Gemini Enterprise Agent Platform
Start free
Overview
Studio
Agents
Models
Notebooks
Pricing
Agent Platform
Generative AI
Engineering Blog
Technology areas
More
Overview
Studio
Agents
Models
Notebooks
Pricing
More
Engineering Blog
Cross-product tools
More
Console
Overview
Beginner's guide
Get started
Get started with Agent Platform
Develop Gemini API code with the Gen AI SDK
Get an API key
Configure application default credentials
Migrate from Google AI Studio to Agent Platform
Get started with Gemini 3
Gemini 3 prompting guide
Google GenAI libraries
Generative AI cookbook
Access Gemini models using OpenAI libraries
Express mode
Overview
Console tutorial
API tutorial
Select models
Model Garden
Overview of Model Garden
Use models in Model Garden
Test model capabilities
Google Models
All Google models
Gemini
Migrate to the latest Gemini models
Pro
Gemini 3.1 Pro
Gemini 3 Pro
Gemini 3 Pro Image
Gemini 3 Pro Image Preview
Gemini 2.5 Pro
Flash
Gemini 3.5 Flash
Gemini 3.1 Flash Image
Gemini 3.1 Flash Image Preview
Gemini 3 Flash
Gemini 2.5 Flash
Gemini 2.5 Flash Image
Gemini 2.5 Flash Live API
Flash-Lite
Gemini 3.1 Flash-Lite
Gemini 2.5 Flash-Lite
Embedding
Gemini Embedding 2
Veo
Veo 2
Veo 3
Veo 3.1
Lyria
Lyria 2
Lyria 3
Model versions
Partner Models
Partner models overview
Claude
Overview
Request predictions
Quotas for Anthropic Claude models
Batch predictions
Structured outputs
Prompt caching
Count tokens
Web search
Safety classifiers
Model details
Claude Fable 5
Claude Opus 4.8
Claude Opus 4.7
Claude Sonnet 4.6
Claude Opus 4.6
Claude Opus 4.5
Claude Sonnet 4.5
Claude Opus 4.1
Claude Haiku 4.5
Claude Opus 4
Claude Sonnet 4
Grok
Overview
Responses API
Function calling
Structured output
Reasoning
Model details
Grok 4.1 Fast
Grok 4.20
Grok 4.3
Mistral AI
Overview
Model details
Mistral Medium 3
Mistral OCR (25.05)
Mistral Small 3.1 (25.03)
Codestral 2
Deploy partner models from Model Garden
Model deprecations (MaaS)
Open Models
Overview
DeepSeek
Overview
DeepSeek-V3.2
DeepSeek-V3.1
DeepSeek-R1-0528
DeepSeek-OCR
Embedding (e5)
Multilingual E5 Small
Multilingual E5 Large
Google Gemma
Model-as-a-Service (MaaS)
Gemma-4-26B-A4B-IT MaaS
Use Gemma
Tutorial: Deploy and inference Gemma (GPU)
Tutorial: Deploy and inference Gemma (TPU)
Kimi
Overview
Kimi K2 Thinking
Llama
Overview
Request predictions
Model details
Llama 4 Maverick
Llama 4 Scout
Llama 3.3
MiniMax
Overview
MiniMax M2
OpenAI
Overview
OpenAI gpt-oss-120b
OpenAI gpt-oss-20b
Qwen
Overview
Qwen 3 Next Instruct 80B
Qwen 3 Next Thinking 80B
Qwen 3 Coder
Qwen 3 235B
ZAI.org
Overview
GLM 5
GLM 4.7
Managed open models (MaaS)
Overview
Use open models via Model as a Service (MaaS)
Grant access to open models
API
Call MaaS APIs for open models
Function calling
Thinking
Structured output
Batch prediction
Self-deployed open models
Overview
Deploy open models
Deploy open models from Model Garden
Deploy open models with prebuilt containers
Deploy open models with a custom vLLM container
Deploy models with custom weights
Use Hugging Face Models
Tutorials
Optimize model performance with advanced features in Model Garden
Hex-LLM
Comprehensive guide to vLLM for Text and Multimodal LLM Serving (GPU)
vLLM TPU
xDiT
Deploy Llamma 3 models with SpotVM and Reservations
Build
Prompt design
Introduction to prompting
Prompting strategies
Overview
Give clear and specific instructions
Use system instructions
Include few-shot examples
Add contextual information
Structure prompts
Compare prompts
Instruct the model to explain its reasoning
Break down complex tasks
Experiment with parameter values
Prompt iteration strategies
Task-specific prompt guidance
Design multimodal prompts
Design chat prompts
Capabilities
Safety
Overview
Responsible AI
System instructions for safety
Configure content filters
Gemini for safety filtering and content moderation
Abuse monitoring
Process blocked responses
Content Credentials
AI Content Detection API
Text and code generation
Text generation
System instructions
Function calling
Structured output
Content generation parameters
Code execution
Image generation
Generate images with Gemini
Generate images from video with Gemini
Edit images with Gemini
Gemini image generation best practices
Gemini image generation limitations
Responsible AI and usage for Gemini image generation
Imagen documentation
Video generation
Introduction to Veo
Text to video
First frame image to video
First and last frames to video
Ingredients to videos with image references
Extend videos
Insert objects
Remove objects
Veo prompt guide
Veo best practices
Turn off Veo's prompt rewriter
Responsible AI for Veo
Music generation
Introduction to Lyria
Generate music using Lyria
Lyria prompt guide
Media analysis
Image understanding
Video understanding
Audio understanding
Document understanding
Bounding box detection
Grounding
Overview
Grounding with Google Search
Grounding with Google Maps
Grounding with Agent Search
Grounding with your search API
Grounding responses using RAG
Grounding with Elasticsearch
Grounding with Parallel web search
Grounding with Exa web search
Web Grounding for Enterprise
URL context
Thinking
Overview
Thought signatures
Computer Use
Live API
Overview
Get started
Get started using the Gen AI SDK
Get started using WebSockets
Get started using ADK
Start and manage live sessions
Send audio and video streams
Configure language and voice
Configure Gemini capabilities
Asynchronous function calling
Best practices with Live API
Troubleshooting Live API
Demo apps and resources
Embeddings
Overview
Text embeddings
Get text embeddings
Choose an embeddings task type
Get multimodal embeddings
Get batch embeddings inferences
Translation
Generate speech from text
Transcribe speech
Development tools
Use AI-powered prompt writing tools
Overview
Optimize prompts
Overview
Zero-shot optimizer
Few-shot optimizer
Data-driven optimizer
Use prompt templates
Model tuning
Introduction to tuning
Tuning Gemini models
Supervised fine-tuning
About supervised fine-tuning
Prepare your data
Use supervised fine-tuning
Supported modalities
Text tuning
Document tuning
Image tuning
Audio tuning
Video tuning
Tune function calling
Preference tuning
About preference tuning
Prepare your data
Use preference tuning
Use tuning checkpoints
Use continuous tuning
Tuning recommendations with LoRA and QLoRA
Distillation
Open models
Supervised and distillation fine-tuning
Embeddings models
Tune text embeddings models
Translation models
About supervised fine-tuning
Prepare your data
Use supervised fine-tuning
Migrate
Call Agent Platform models using OpenAI libraries
Overview
Authenticate
Examples
Migrate from OpenAI SDK
Evaluate
Overview
Tutorial: Perform evaluation using the console
Perform evaluation using the GenAI Client in Agent Platform SDK
Tutorial: Evaluate models using the GenAI Client in Agent Platform SDK
Define your evaluation metrics
Define your evaluation metrics
Details for managed rubric-based metrics
Prepare your evaluation dataset
Run an evaluation
View and interpret evaluation results
Evaluate agents
Alternative evaluation methods
Evaluate using the evaluation module in Agent Platform SDK
Tutorial: Perform evaluation using the evaluation module in Agent Platform SDK
Define your evaluation metrics
Prepare your evaluation dataset
Run an evaluation
Interpret evaluation results
Templates for model-based metrics
Evaluate agents
Evaluate a judge model
Configure a judge model
Run AutoSxS pipeline
Run a computation-based evaluation pipeline
Deploy
Consumption options overview
Provisioned Throughput
Provisioned Throughput overview
Supported models
Calculate Provisioned Throughput requirements
Provisioned Throughput for Live API
Provisioned Throughput for Gemini 3 (Nano Banana) models
Provisioned Throughput for Veo 3 models
Single Zone Provisioned Throughput
Purchase Provisioned Throughput
Use Provisioned Throughput
PayGo
Standard PayGo
Priority PayGo
Flex PayGo
Batch inference
Overview
Create batch job from Cloud Storage
Create batch job from BigQuery
Resume an incomplete batch job
Quotas and system limits
Cache reused prompt context
Overview
Create a context cache
Use a context cache
Get context cache information
Update a context cache
Delete a context cache
Context cache for fine-tuned Gemini models
Deploy generative AI models
Troubleshooting error code 429
Retry strategy
Administer
Access control
Networking
Security controls
Control access to Model Garden models
Enable Data Access audit logs
Save and share prompts
Monitor models
Monitor cost using custom metadata labels
Request-response logging
Secure a gen AI app by using IAP
Overview
Set up your project and source repository
Create a Cloud Run service
Create a load balancer
Configure IAP
Test your IAP-secured app
Clean up your project
Build your own model
Overview
MLOps on Agent Platform
Interfaces for Agent Platform
Agent Platform beginner's guides
Train an AutoML model
Train a custom model
Get inferences from a custom model
Train a model using Agent Platform and the Python SDK
Introduction
Prerequisites
Create a notebook
Create a dataset
Create a training script
Train a model
Make an inference
Integrated ML frameworks
PyTorch
TensorFlow
Agent Platform for BigQuery users
Glossary
Get started
Set up a project and a development environment
Install the Agent Platform SDK for Python
Authenticate to Agent Platform
Choose a training method
Try a tutorial
Tutorials overview
AutoML tutorials
Hello image data
Overview
Set up your project and environment
Create a dataset and import images
Train an AutoML image classification model
Evaluate and analyze model performance
Deploy a model to an endpoint and make an inference
Clean up your project
Hello tabular data
Overview
Set up your project and environment
Create a dataset and train an AutoML classification model
Deploy a model and request an inference
Clean up your project
Custom training tutorials
Train a custom tabular model
Train a TensorFlow Keras image classification model
Overview
Set up your project and environment
Train a custom image classification model
Serve predictions from a custom image classification model
Clean up your project
Fine-tune an image classification model with custom data
Use Agent Platform development tools
Development tools overview
Use the Agent Platform SDK
Overview
Introduction to the Agent Platform SDK for Python
Agent Platform SDK for Python classes
Agent Platform SDK classes overview
Data classes
Training classes
Model classes
Prediction classes
Tracking classes
Terraform support for Agent Platform
Agent Platform Training
Overview
Agent Platform serverless training
Overview of serverless training in Agent Platform
Load and prepare data
Data preparation overview
Use Cloud Storage as a mounted file system
Mount an NFS share for serverless training
Use managed datasets
Prepare training application
Understand the serverless training service
Prepare training code
Use prebuilt containers
Create a Python training application for a prebuilt container
Prebuilt containers for serverless training
Use custom containers
Custom containers for serverless training
Create a custom container
Containerize and run training code locally
Train on a persistent resource
Overview
Create persistent resource
Run training jobs on a persistent resource
Get persistent resource information
Reboot a persistent resource
Delete a persistent resource
Configure training job
Choose a custom training method
Configure container settings for training
Configure compute resources for training
Use reservations with training
Use Spot VMs with training
Submit training job
Create custom jobs
Hyperparameter tuning
Hyperparameter tuning overview
Use hyperparameter tuning
Create training pipelines
Schedule jobs based on resource availability
Use distributed training
Training with Cloud TPU VMs
Use private IP for custom training
Use Private Service Connect interface for training (recommended)
Monitor and debug
Monitor and debug training using an interactive shell
Profile model training performance
Tutorial: Build a pipeline for continuous training
Create custom organization policy constraints
Vertex AI training clusters
Overview
Get started with training clusters
Deployment considerations
Compute resources
Networking
Storage
Orchestration
Create and manage clusters
Create cluster
Manage cluster
Manage accounts and job scheduling on a cluster
Cluster resiliency
Feature guides
Using