Data Analytics

Introducing the Open Knowledge Format

Fri, 12 Jun 2026 13:00:00 +0000

As foundation models continue to improve, the lack of relevant context often limits what they can do, especially as they are used to build agentic systems. While these models can help you write code, summarize documents, or analyze a dataset, they still need the right information to produce accurate and actionable results.

That’s why today, we’re introducing the Open Knowledge Format (OKF), an open specification that formalizes the LLM-wiki pattern into a portable, interoperable format. This is a vendor-neutral, agent- and human-friendly standard for representing the metadata, context, and curated knowledge that modern AI systems need.

As published, OKF v0.1 represents knowledge as a directory of markdown files with YAML frontmatter, with a small set of agreed-upon conventions that let wikis written by different producers be consumed by different agents without translation.

That's it. No complex compression scheme, no new runtime, no required SDK. A bundle of OKF documents is:

Just markdown — readable in any editor, renderable on GitHub, indexable by any search tool
Just files — shippable as a tarball, hostable in any git repo, mountable on any filesystem
Just YAML frontmatter — for the small set of structured fields that need to be queryable: type, title, description, resource, tags, and timestamp

If you've used Obsidian, Notion, Hugo, or any of the LLM wiki patterns that have emerged over the past year, the shape will feel familiar. OKF formalizes the small set of conventions needed to make these patterns interoperable.

Let’s take a look at the problem that OKF can solve for your organization, how it works, how to get started with it, and what’s next.

A fragmented context landscape

In most organizations, the information that foundation models use is overwhelmingly internal knowledge: the schema of a table, your business's definition of a metric, the runbook for an incident, the join paths between two systems, the deprecation notice for an old API, and so on.

Today, these atoms of knowledge live in a variety of highly fragmented systems:

Metadata catalogs with their own APIs
Wikis, third-party systems, or shared drives
Code comments, docstrings, or notebook cells
The heads of a few senior engineers

When an AI agent needs to answer "How do I compute weekly active users from our event stream?" it has to assemble the answer from these scattered, mutually incompatible surfaces. Every vendor offers its own catalog, its own SDK, its own knowledge-graph schema, and none of the knowledge is easily portable across products or organizations.

The result: Every agent builder is solving the same context-assembly problem from scratch, every catalog vendor is reinventing the same data models, and the knowledge itself is locked behind whichever surface created it.

Knowledge as a living wiki

Developer teams are changing how they build AI agents. Instead of using models to search the same documents for the same facts over and over, you can give your agents a shared markdown library that grows more useful over time. This lets your agents take on the drudgery of reading and updating their own files, while your team curates the content and manages it like code.

Andrej Karpathy, the prominent AI researcher and educator, articulates this idea most crisply in his LLM Wiki gist. "LLMs don't get bored, don't forget to update a cross-reference, and can touch 15 files in one pass," he writes. The bookkeeping that causes humans to abandon personal wikis is exactly what LLMs are good at.

A similar knowledge-as-wiki pattern keeps reappearing under different names: Obsidian vaults wired to coding agents, the AGENTS.md / CLAUDE.md family of convention files, repos full of index.md and log.md artifacts that agents consult before doing real work, and "metadata as code" repositories inside data teams.

The pattern is compelling and powerful, but each instance is bespoke. Karpathy's wiki and your team's wiki and a vendor's catalog export may all look alike (markdown, frontmatter, cross-links), but none of them are intentionally designed to cooperate. There is no agreed-upon answer to what fields every document should carry, or what filenames mean what. As a result, the knowledge encoded in wikis remains siloed within the original teams, leading to redundant effort whenever a new agent is built.

What's missing is a format, not another service

The answer to this problem isn’t another knowledge service. You need a format, a way to represent knowledge that:

Anyone can produce, without an SDK
Anyone can consume, without an integration
Survives moving between systems, organizations, and tools
Lives in version control alongside the code it describes
Is readable by humans and parseable by agents: the same file, no translation layer

By design, OKF is that format.

How OKF works: The design in one screen

An OKF bundle is a directory of markdown files representing concepts: anything you want to capture, including tables, datasets, metrics, playbooks, runbooks, and APIs. Each concept is one file. The file path is the concept's identity:

```
sales/
├── index.md
├── datasets/
│   ├── index.md
│   └── orders_db.md
├── tables/
│   ├── index.md
│   ├── orders.md
│   └── customers.md
└── metrics/
    ├── index.md
    └── weekly_active_users.md
```

Each concept document has a small block of YAML front matter for structured fields and a markdown body for everything else:

```markdown
---
type: BigQuery Table
title: Orders
description: One row per completed customer order.
resource: https://console.cloud.google.com/bigquery?p=acme&d=sales&t=orders
tags: [sales, revenue]
timestamp: 2026-05-28T14:30:00Z
---

# Schema

| Column        | Type   | Description                              |
|---------------|--------|------------------------------------------|
| `order_id`    | STRING | Globally unique order identifier.        |
| `customer_id` | STRING | FK to [customers](/tables/customers.md). |

# Joins

Joined with [customers](/tables/customers.md) on `customer_id`.
```

Concepts link to each other with normal markdown links, turning the directory into a graph of relationships that is richer than the parent/child links implied by the file system. Bundles can optionally include index.md files (for progressive disclosure as agents navigate the hierarchy) and log.md files (for chronological history of changes).
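To make the "directory as a graph" idea concrete, here is a minimal sketch of how a consumer might extract links from concept bodies and build an adjacency map. It is an illustration, not part of the spec: the names `LINK_RE`, `resolve`, and `link_graph` are hypothetical. It also assumes that a link target starting with `/` is relative to the bundle root and any other target is relative to the linking file's directory; the actual cross-linking rules are defined in the specification.

```python
import re
from pathlib import PurePosixPath

# Matches markdown links whose target ends in .md, with an optional #anchor.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s#]+\.md)(?:#[^)]*)?\)")


def resolve(source: str, target: str) -> str:
    """Turn a link target into a bundle-relative path.

    Assumption (not taken from the spec): a leading "/" means the bundle
    root; anything else is relative to the directory of the linking file.
    """
    if target.startswith("/"):
        return target.lstrip("/")
    parts: list[str] = []
    for part in (PurePosixPath(source).parent / target).parts:
        if part == "..":
            if parts:
                parts.pop()
        elif part != ".":
            parts.append(part)
    return "/".join(parts)


def link_graph(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each concept path to the set of concept paths it links to."""
    graph: dict[str, set[str]] = {}
    for path, body in docs.items():
        targets = LINK_RE.findall(body)
        # Skip external URLs; only in-bundle links become graph edges.
        graph[path] = {resolve(path, t) for t in targets if "://" not in t}
    return graph


# Hypothetical concept bodies, keyed by their path inside the bundle.
docs = {
    "tables/orders.md": "Joined with [customers](/tables/customers.md) on `customer_id`.",
    "tables/customers.md": "See [orders](orders.md).",
    "metrics/weekly_active_users.md": "Derived from [orders](../tables/orders.md#schema).",
}
graph = link_graph(docs)
```

Because the edges come from ordinary markdown links, the same files stay readable in any editor while an agent can still traverse relationships that cut across the folder hierarchy.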

The full v0.1 specification (including conformance criteria, cross-linking rules, and the small number of reserved filenames) fits on a single page.

Three principles behind the design

1. Minimally opinionated. OKF requires exactly one thing of every concept: a type field. Everything else (e.g., what types exist, what other fields to include, what sections the body has) is left to the producer. The spec defines the interoperability surface, not the content model.

2. Producer/consumer independence. OKF cleanly separates who writes the knowledge from who consumes it. A bundle hand-authored by a human can be consumed by an AI agent. A bundle generated by a metadata export pipeline can be browsed in a visualizer. A bundle synthesized by one LLM can be queried by another. The format is the contract; the tooling at each end is independently swappable.

3. Format, not platform. OKF is not tied to any specific cloud, database, model provider, or agent framework. It will never require a proprietary account or SDK to read, write, or serve. We're publishing it as an open standard because the value of a knowledge format comes from how many parties speak it, not from who owns it.
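The first principle lends itself to a very small check. The sketch below walks a bundle directory and reports concept files that lack a `type` field. It rests on stated assumptions: the flat `key: value` parser is a simplification (a real consumer would use a proper YAML parser), the function names are hypothetical, and treating `index.md` and `log.md` as exempt from the `type` requirement is a guess on our part, not something this post confirms.

```python
import tempfile
from pathlib import Path

# Assumption: index.md and log.md are navigation/history files, not concepts.
NON_CONCEPT_FILES = {"index.md", "log.md"}


def front_matter(text: str) -> dict[str, str]:
    """Parse flat 'key: value' pairs between the leading '---' lines.

    Simplified on purpose: values are kept as raw strings and nested YAML
    is not handled.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return {}  # the front matter block was never closed


def missing_type(bundle: Path) -> list[str]:
    """Return bundle-relative paths of concept files with no 'type' field."""
    problems = []
    for path in sorted(bundle.rglob("*.md")):
        if path.name in NON_CONCEPT_FILES:
            continue
        if not front_matter(path.read_text(encoding="utf-8")).get("type"):
            problems.append(path.relative_to(bundle).as_posix())
    return problems


def demo() -> list[str]:
    """Build a throwaway bundle with one valid and one invalid concept."""
    with tempfile.TemporaryDirectory() as tmp:
        bundle = Path(tmp)
        (bundle / "tables").mkdir()
        (bundle / "index.md").write_text("# Sales bundle\n", encoding="utf-8")
        (bundle / "tables" / "orders.md").write_text(
            "---\ntype: BigQuery Table\ntitle: Orders\n---\n\n# Schema\n",
            encoding="utf-8",
        )
        (bundle / "tables" / "customers.md").write_text(
            "# Customers\n", encoding="utf-8"
        )
        return missing_type(bundle)
```

Calling `demo()` flags only `tables/customers.md`, the one concept file written without front matter.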

What we're shipping with the spec

To make the format concrete, we're publishing reference implementations at both the producer and consumer ends:

An enrichment agent that walks a BigQuery dataset, drafts an OKF concept document for every table and view, then runs a second LLM pass that crawls authoritative documentation and enriches each concept with citations, schemas, and join paths.
A static HTML visualizer that turns any OKF bundle into an interactive graph view in a single self-contained file; no backend, no install on the viewing side, no data leaves the page.
Three ready-to-browse sample bundles: GA4 e-commerce, Stack Overflow, and Bitcoin public datasets, produced by the reference agent and committed to the repo as living examples of conformant OKF.

These are proofs of concept, deliberately. The agent demonstrates one way to produce OKF; nothing about the format requires a specific agent framework or LLM. The visualizer demonstrates one way to consume it; nothing about the format requires HTML or a graph view. We expect (and want!) the ecosystem of producers and consumers to grow far beyond what we've shipped.

Where we go from here

OKF v0.1 is a starting point, not a finished standard. The format will evolve as more producers and consumers emerge and as we collectively learn what knowledge representations agents actually need in practice.

We're publishing in the open from day one because that's the only way a knowledge format earns its name, whether you're building a knowledge catalog, an enrichment pipeline, a wiki tailored to AI agents, or anything in the AI knowledge domain.

From here, we encourage you to:

Read the spec (it's short!)
Write a producer for your source system, your database, your documentation site
Write a consumer: a viewer, a search index, an agent that reasons over bundles
Try the reference implementation against your own data
File issues, send PRs, or propose extensions: The spec is versioned and explicitly designed for backward-compatible growth

The repo, the spec, and the sample bundles are available on GitHub. We have also updated Google Cloud’s Knowledge Catalog so that it can ingest Open Knowledge Format and serve it to our agents. You can find the relevant code and examples here.

The format itself is the contribution. The tools we've shipped exist to make it real, and to lower the cost of trying it out. Whatever shape your knowledge takes today, OKF is designed to be the lingua franca it can be exchanged for tomorrow.

Published by the Google Cloud Data Cloud team. Open Knowledge Format is an open specification; contributions, alternative implementations, and adoption beyond Google products are all explicitly welcomed.

In addition to the authors, this work came together thanks to key ideas from many others at Google, and we thank them for their contributions.

Transform dashboards into interactive data experiences with Looker agents

Thu, 11 Jun 2026 16:00:00 +0000

Dashboards have long served as a primary way for organizations to extract insights from data, but they can fall short in agile environments: Dashboards aren’t interactive and don’t allow you to ask follow-up questions. This forces users to step outside their workflows or turn to data analysts to get the answers they need. Today, we are introducing Looker dashboard agents in preview, embedding intelligent, conversational data agents directly within dashboards and empowering users to explore their business intelligence (BI) data using natural language.

Start a conversation with a Looker dashboard agent

Interactive agent-led investigations

Traditionally, dashboards have presented a static view of data. With dashboard agents in Looker, users can explore their data directly within the dashboard interface. Users can start a conversation by clicking the Gemini icon and asking natural-language questions to receive contextual insights.

The accuracy of a data agent depends on the business context it is provided, and its ability to map appropriate metrics and dimensions to users’ inquiries. The Looker dashboard agent has direct context about the user’s applied filters, cross-filters, and pre-curated tiles, helping it to generate highly relevant and accurate answers to complex business questions.

Should a query require more data, the agent can access underlying Explores to uncover additional information. These insights are paired with relevant charts and natural language explanations to simplify data exploration.

Explore data beyond the dashboard to uncover deeper insights

Tailor the agent to your business

Data analysts curate dashboards to provide business users with precise perspectives on organizational data. To maintain this kind of consistent and reliable analytical environment, the Looker dashboard agent is highly configurable. Analysts can add context on top of the Looker semantic layer by providing natural-language instructions directly to the agent. This way, they can define exactly how the agent interprets unique business logic and tailors responses for the target audience. By enabling self-serve data analysis, dashboard agents help analyst teams scale to meet the increasing data demands of the business.

Configure Looker dashboard agents

Inherited trust and transparency

For users to adopt an AI-based system, they must trust the information it provides. When generating an insight, the Looker dashboard agent explicitly shows its work by displaying intermediate reasoning, referenced dashboard tiles, and applied filters. Administrators, in turn, need to trust that users can access only the data and insights they are authorized to see. The dashboard agent is backed by Looker’s governance model, managed through standard permissions.

We are actively working on additional capabilities for the Looker dashboard agent, including support for iframe embedding, allowing organizations to bring dashboard agents alongside Looker dashboards into any essential portal or application.

Enable dashboard agents today

With Looker version 26.08.11 and later, administrators can activate the dashboard agent capability by toggling "Enable Chat with Dashboard" within the Gemini in Looker settings. Once enabled, authorized users will see the Gemini icon and can begin chatting with their dashboard data immediately. Please explore our support documentation for more detailed information.

Deep dive: How Lightning Engine delivers 4.9x faster Apache Spark performance

Wed, 10 Jun 2026 17:00:00 +0000

From foundational ETL and analytics to the frontier of generative AI, Apache Spark serves as the architectural backbone for global data processing. However, as data volumes scale, the trade-off between performance and infrastructure costs can be a limiting factor for growth. In the agentic era, where autonomous agents can trigger thousands of concurrent, multi-hop queries, this performance bottleneck directly dictates your unit economics.

We are excited to announce the general availability of Lightning Engine for Managed Service for Apache Spark, available across both our serverless and managed clusters deployment modes. Designed to address these scaling challenges directly, it is fully compatible with modern Spark workloads and requires zero changes to your existing data pipelines.

Whether you choose the zero-ops simplicity of our serverless deployment mode or the fine-grained infrastructure control of our managed clusters deployment mode, Lightning Engine serves as the unified performance engine to supercharge your job execution. By validating Lightning Engine across more than one million real-world workloads, we have fine-tuned it for industrial-grade stability as well as reliable performance gains.

With this general availability release, Lightning Engine delivers:

Up to 4.9x faster performance than standard open-source Spark
2x the price-performance over the leading high-speed Spark alternative

Let’s take a closer look at how Managed Service for Apache Spark achieves these results.

Under the hood: Vectorized native execution

Traditional Spark execution is often bottlenecked by JVM execution overhead and garbage collection pauses. Lightning Engine bypasses these limitations by compiling Spark physical query plans into native C++ instructions optimized for Single Instruction, Multiple Data (SIMD) vectorization.

Built on the open-source Gluten and Velox runtimes with specialized Google-engineered enhancements, this native execution layer accelerates your most demanding data processing tasks with:

Vectorized sort: Accelerates sorting operations by processing data columnarly in native memory, significantly reducing CPU cycle overhead.
Accelerated window functions: Speeds up calculations performed across sets of rows (such as moving averages, aggregations, and deduplication) by executing them directly within the native C++ layer.
Smart fallback: If a query contains an operator or custom Java UDF that is not natively supported, the engine's intelligent push-down layer automatically and gracefully transitions that specific sub-tree back to the JVM, avoiding unnecessary data format conversions and preserving overall execution stability.

Optimized Cloud Storage and BigQuery connectors

High-performance compute is useless if the engine is starved for data. With Lightning Engine, we’ve optimized our storage connectors to ensure that reading data from Cloud Storage and BigQuery isn’t the bottleneck. Optimizations include:

Direct path connection: Bypasses multiple node hops and uses bi-directional streaming with Cloud Storage. This allows seek operations and vectorized readV APIs to run without reopening streams, accelerating scan times for complex, deeply nested Parquet or ORC files.
Metadata call reduction: Managing large-scale partitioned tables often comes with a hidden performance tax: the time spent simply listing files. Lightning Engine utilizes lexicographic listing in the driver to collect metadata and transmit it directly to executors, eliminating redundant Cloud Storage API calls and dramatically reducing Cloud Storage metadata costs.
Native BigQuery connector: Directly consumes BigQuery data in Arrow format. By avoiding the expensive conversion from Arrow to JVM UnsafeRow, the engine eliminates serialization overhead to accelerate scan times.

Broadcast joins and advanced query optimization

Lightning Engine incorporates an advanced, cost-based query optimizer inspired by Google's F1 and Spanner query engines, and introduces several custom optimization rules. Examples include:

Single HashTable caching: In standard broadcast joins, Spark builds join hash tables repeatedly across tasks. Lightning Engine builds the hash table once per executor and caches it, eliminating redundant CPU cycles and reducing the executor's memory footprint.
Aggregation pushdown: Automatically pushes partial aggregations below join shuffles. This minimizes the volume of data that must be transferred across the network, drastically reducing expensive shuffle stages.
Auto shuffle partitioning: Dynamically and adaptively determines the optimal number of shuffle partitions for each individual query stage based on runtime statistics, preventing out-of-memory (OOM) spills without over-partitioning.
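The aggregation pushdown rule above can be illustrated with a toy example in plain Python. This is not Lightning Engine code and says nothing about its internals. It only shows why pre-aggregating before a join reduces the number of rows that must cross a shuffle while leaving the result unchanged for a decomposable aggregate such as SUM. The data, the function names, and the idea of counting list length as "shuffled rows" are all illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical inputs: (customer_id, amount) order rows and a customer lookup.
orders = [("c1", 10), ("c1", 5), ("c2", 7), ("c1", 3), ("c3", 4), ("c2", 1)]
region_of = {"c1": "EMEA", "c2": "APAC", "c3": "EMEA"}


def join_then_aggregate(orders, region_of):
    """Baseline: every order row is shuffled to the join, then summed."""
    shuffled = list(orders)  # all rows cross the (simulated) shuffle
    totals = defaultdict(int)
    for customer, amount in shuffled:
        totals[region_of[customer]] += amount
    return dict(totals), len(shuffled)


def aggregate_then_join(orders, region_of):
    """Pushdown: sum per join key first, then shuffle one row per key."""
    partial = defaultdict(int)
    for customer, amount in orders:
        partial[customer] += amount  # partial aggregation below the join
    shuffled = list(partial.items())  # one row per customer crosses the shuffle
    totals = defaultdict(int)
    for customer, subtotal in shuffled:
        totals[region_of[customer]] += subtotal
    return dict(totals), len(shuffled)


baseline_totals, baseline_rows = join_then_aggregate(orders, region_of)
pushdown_totals, pushdown_rows = aggregate_then_join(orders, region_of)
```

Both plans produce the same per-region totals, but the pushdown plan moves three rows instead of six. On real workloads the saving depends on how many input rows share each join key.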

Learn more technical details and hear Lowe’s experience with Lightning Engine from Google Cloud Next ’26

Getting started

These updates are live and ready to use today! You can enable Lightning Engine directly through the Google Cloud console or via the gcloud CLI.

To submit a serverless batch job with Lightning Engine enabled, specify the premium tier in your Spark properties:

```shell
gcloud dataproc batches submit pyspark my_script.py \
    --region=us-central1 \
    --properties=dataproc:dataproc.tier=premium \
    --properties=spark:spark.dataproc.lightningEngine.runtime=native
```

To spin up a new managed cluster with Lightning Engine and Native Query Execution (NQE) enabled, run the following command in your terminal:

```shell
gcloud dataproc clusters create my-optimized-cluster \
    --region=us-central1 \
    --image-version=2.3 \
    --engine=lightning \
    --enable-component-gateway \
    --properties=spark:spark.dataproc.lightningEngine.runtime=native
```

Alternatively, navigate to the Managed Service for Apache Spark page in the Google Cloud console, click Create Cluster, select Cluster on Compute Engine, and choose Lightning Engine under the cluster configuration settings to automatically activate query acceleration for your workloads.

Modernizing Healthcare: How Alcidion achieved greater stability and performance with AlloyDB

Mon, 08 Jun 2026 16:00:00 +0000

In clinical informatics, every second counts. For Alcidion, a global leader in smart health solutions, the mission is simple but critical: use technology to reduce cognitive load for clinicians and present the right information at the right time to save lives.

Whether it’s managing patient flow in an emergency department or ensuring a patient is in the correct ward to avoid adverse outcomes, Alcidion’s flagship platform, Miya Precision, serves as a dynamic intelligent care platform for modern hospitals. To power this mission, the platform recently underwent a major architectural transformation, migrating from a legacy Microsoft SQL Server environment to Google Cloud’s AlloyDB for PostgreSQL.

The challenge: overcoming performance bottlenecks

Operating in an industry where data integrity and uptime are non-negotiable, Alcidion faced several technical and operational hurdles with its previous setup:

Operational overhead: Managing persistent backends for SQL Server required significant manual effort. The team had to manually balance database loads between elastic pools to maintain performance while trying to optimize costs. They also had to constantly manage the gap between allocated and used space to prevent shared pools from being consumed by excessive slack space.
Performance latency: Complex JSON data processing, critical for modern health informatics, was taking up to 30 minutes for certain jobs.
Stability concerns: The team sought a more stable Kubernetes environment and a persistent backend that could scale without constant administrative intervention.

The solution: a smooth migration to AlloyDB

Alcidion used the Database Migration Service (DMS) to move from SQL Server to AlloyDB, achieving a remarkably efficient cutover. The total learning and migration process took under one month, with the core database move completed in only one and a half weeks.

By creating custom synchronization tools and using Google Cloud’s managed services, the team reduced the final transition window to just 15 minutes. Alcidion achieved this by spinning up a new Google Cloud instance synchronized to the active one, with both accessible via unique fully qualified domain names. The new environment remained in read-only mode for customer validation.

During the final cutover, the old instance was set to read-only, synchronization was halted, and external integration links were toggled to the new environment. This streamlined process allowed users to log into the new instance and resume work within minutes, with the primary delay being DNS record updates.

Alcidion chose a fully managed AlloyDB service to eliminate control plane tasks and administrative overhead. This shift allows their engineering team to focus on clinical innovation and product development rather than "managing the container" or the underlying database infrastructure.

Being able to cut over to AlloyDB in about 15 minutes had our users back to work almost immediately. For a system clinicians rely on around the clock, that kind of smooth transition gave Alcidion real confidence.

The results: impact by the numbers

The shift to AlloyDB and Google’s Agentic Data Cloud has delivered immediate, quantifiable improvements for Alcidion and its healthcare customers:

Faster data processing: Data processing that previously relied on SQL Server stored procedures — a process that became increasingly time-consuming as data volumes grew — has been transformed. By migrating to AlloyDB and using BigQuery and Dataflow for processing, Alcidion has seen jobs that once took 30 minutes now complete in just 5 to 60 seconds.
Enhanced stability: The migration has delivered a step-change in reliability. In the previous environment, the team faced monthly disruptions, ranging from failed scheduled maintenance to connectivity issues that required manual intervention. In contrast, AlloyDB and Google Cloud’s compute services have proven exceptionally stable, allowing the team to move away from the "firefighting" mode associated with frequent infrastructure crashes.
Reduced cognitive load: By simplifying their backend and clinical dashboards, Alcidion’s SREs have significantly reduced their administrative burden. This shift has freed the team to focus on high-value innovation, such as refining predictive analytics and generative AI that empower clinicians to make informed clinical decisions faster.

Future vision: AI and beyond

Alcidion isn't stopping at database modernization. The move to AlloyDB is a foundational step for their next phase of growth:

AlloyDB columnar engine: The team is exploring the columnar engine for a second round of query optimization and real-time analytics.
Generative AI apps: Alcidion is actively working with Google to use AlloyDB’s Gemini Enterprise Agent Platform integration to perform concept analysis and pick out critical clinical insights from vast datasets.

By moving to AlloyDB, Alcidion has improved its stability and performance and built a strong foundation to keep delivering smarter, safer care to hospitals worldwide.

Ready to modernize your database? Learn more about how AlloyDB can transform your operational workloads.

What's new for Managed Service for Apache Spark clusters

Thu, 04 Jun 2026 16:00:00 +0000

At Google Cloud, our goal is to let you run large-scale analytical and data science workloads, including big data pipelines, machine learning, and ETL tasks, with maximum efficiency.

We recently announced that the Dataproc service is now Managed Service for Apache Spark, reflecting our deep integration with the Agentic Data Cloud.

To support the diverse architectural needs of today’s modern data teams, we offer the service in two distinct deployment modes: serverless and managed clusters. The serverless deployment mode completely abstracts infrastructure management for ephemeral or ad-hoc jobs, while the managed clusters deployment mode is designed for teams that require fine-grained infrastructure customization, persistent environments, long-running stateful processing, or native integration with custom Compute Engine hardware configurations.

When it comes to managed cluster deployments, we’ve re-imagined the experience from the ground up, focusing on three core pillars: making Spark faster by supercharging execution speeds, easier to run by maximizing resource obtainability and reducing operational overhead, and smarter by embedding AI directly into the development and operational lifecycle.

This blog post focuses specifically on what we announced at Google Cloud Next ’26 for the Managed Spark clusters deployment mode: enhanced flexibility to fine-tune performance and cost through a native execution engine, smarter scaling policies, and Gemini-powered extensions. For the latest on the serverless deployment mode, check out this blog.

Faster, with the Lightning Engine native execution engine

Arguably the biggest update for Managed Spark clusters is Lightning Engine, which introduces massive performance gains for Spark DataFrame/Dataset APIs and heavy Spark SQL queries. Powered by a native, C++ vectorized execution engine built on Velox and Gluten, with specialized internal enhancements, Lightning Engine bypasses JVM execution bottlenecks by compiling query plans into native instructions optimized for SIMD (Single Instruction, Multiple Data) vectorization.

This native execution engine delivers:

Up to 4.9x faster performance than standard open-source Spark
Up to 2x the price-performance of the leading high-speed Spark alternative

Crucially, taking advantage of these performance gains doesn’t require any code changes to your existing Spark applications. Because your jobs complete faster, you directly reduce your aggregate Compute Engine runtime hours and overall spend.

To enable Lightning Engine on your managed clusters, simply specify the Lightning Engine option when you’re creating a cluster.

Learn technical details and hear Lowe’s experience with Lightning Engine

Easier: Maximize resource obtainability via Flexible VMs

Temporary localized shortages of a specific machine type can stall cluster creation or interrupt autoscaling. To dramatically improve cluster resilience against capacity constraints, Flexible VMs for Managed Spark clusters are now generally available.

Flexible VMs allow you to define up to ten ranked machine types for your master, primary, and secondary worker nodes. Managed Service for Apache Spark pairs this preference with automated regional zone placement, dynamically scanning the entire region to fulfill your capacity requests using the best available hardware layout. This helps ensure your pipelines spin up predictably, drastically reducing resource availability errors, and maximizing your ability to capture cost-effective Spot VM capacity during periods of peak demand.
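The ranked-preference idea can be sketched in a few lines. This post does not describe the service's actual placement logic, so the following is only a hypothetical model: the function name, the capacity numbers, and the "first zone with enough capacity wins" rule are all assumptions made for illustration.

```python
MAX_RANKED_TYPES = 10  # the post allows up to ten ranked machine types


def pick_placement(ranked_types, capacity, needed):
    """Return (machine_type, zone) for the best-ranked type that fits.

    ranked_types: machine types in preference order, best first.
    capacity: {zone: {machine_type: available_vm_count}} (made-up data).
    needed: number of VMs requested.
    Returns None when no zone can satisfy any ranked type.
    """
    if len(ranked_types) > MAX_RANKED_TYPES:
        raise ValueError("at most ten ranked machine types are allowed")
    for machine_type in ranked_types:  # highest preference first
        for zone in sorted(capacity):  # scan every zone in the region
            if capacity[zone].get(machine_type, 0) >= needed:
                return machine_type, zone
    return None


# Hypothetical regional snapshot: the top choice is short on capacity.
ranked = ["n2-standard-8", "n2d-standard-8", "e2-standard-8"]
capacity = {
    "us-central1-a": {"e2-standard-8": 20},
    "us-central1-b": {"n2d-standard-8": 12, "e2-standard-8": 50},
    "us-central1-c": {"n2-standard-8": 4},
}
placement = pick_placement(ranked, capacity, needed=10)
```

In this snapshot the first-choice type has only four VMs available anywhere in the region, so the request falls through to the second-ranked type in `us-central1-b` instead of failing outright.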

Easier: Zero-scale clusters and scheduled stops

To give you better fiscal control over persistent and developmental environments, we recently announced the general availability of two highly requested FinOps features: zero-scale clusters and cluster scheduled stops.

Zero-scale clusters: You can now provision environments that use exclusively secondary workers (Spot VMs), enabling the cluster to automatically scale down to absolutely zero worker nodes when no processing is active, leaving only the master node online to preserve metadata.
Cluster scheduled stops: This feature lets you configure automated cluster shutdown policies based on specific idle-time limits or a precise future timestamp.

Because these features are natively integrated, they reduce the operational friction of deleting and reconstructing your environment, and they let you stop paying for idle compute during nights and weekends.

Smarter: Managed Service for Apache Spark MCP Server

To bridge the gap between generative AI and data engineering, we launched the Model Context Protocol (MCP) server for Managed Service for Apache Spark. This open-standard integration allows LLMs and AI assistants to securely and dynamically interact with your Managed Spark clusters using natural language.

By utilizing the MCP server, your AI agents can securely connect to your data platform under existing IAM permissions. This allows agents to perform cluster-based operations, such as creating a cluster, submitting a job, or adjusting an autoscaling policy, directly from your AI application.

Smarter: Accelerating AI with the Data Agent Kit

The Google Cloud Data Agent Kit extension allows data scientists, engineers, and developers to manage their entire data workload lifecycle directly within their preferred development environment. We rolled out native support for this extension on Managed Spark clusters, enabling teams to seamlessly build and deploy specialized Data Agents for code generation and data wrangling.

Developers can choose to use Antigravity 2.0, Google's standalone agentic development platform, or bring these agentic capabilities into their preferred IDE, including VS Code, Claude Code, or Codex, via the Data Agent Kit extensions and plugins. By pairing this streamlined workflow with the raw processing power of managed clusters, these intelligent agents can securely execute complex workflows directly over petabyte-scale data lakes. Specifically, the Data Agent Kit enables developers to:

Build and orchestrate pipelines: Author multi-node data pipelines and generate comprehensive code documentation using natural language.
Perform real-time debugging: Leverage Gemini Cloud Assist to sift through executor logs, pinpoint root causes of job failures, and recommend actionable fixes.
Easily connect to Spark resources: Instantly attach to serverless Spark runtimes or managed clusters without manual network configuration or local Spark installations.
Streamline Git and CI/CD management: Commit, merge, and deploy code directly from your IDE of choice, triggering automated testing and deployment pipelines without friction.

Smarter: Next-generation Lakehouse

We recently launched Lakehouse, which delivers read/write interoperability between engines like Managed Service for Apache Spark and BigQuery. By leveraging the Lakehouse runtime catalog as a unified, serverless metadata layer, it removes data silos and the need for complex translation layers. This agentic-first approach allows organizations to process open formats directly from Google Cloud Storage, or even query remote AWS datasets using the newly introduced cross-cloud Lakehouse, all while maintaining a single source of truth for security and governance.

For customers utilizing Managed Spark clusters, this integration unlocks several powerful new capabilities. Data teams can now accelerate their most demanding ETL and data science workloads by up to 4.9x using the optimized Lightning Engine.

Next-gen runtimes: Cluster Image 3.0 with Spark 4.1

Keeping pace with the open-source ecosystem, we rolled out Cluster Image 3.0 in preview, built with Apache Spark 4.1 and featuring an upgraded default Java runtime, Java 21. Spark 4.1 introduces a set of core open-source capabilities, including real-time mode for structured streaming. This enables your Spark environment to support real-time streaming with continuous, sub-second latency processing.

Get started today

These updates are live and ready to use today in Managed Spark clusters! You can enable these new features directly through the Google Cloud console or via the gcloud CLI.

To spin up a new Managed Cluster and natively unlock the performance of Lightning Engine, run the following command in your terminal:

```sh
gcloud dataproc clusters create my-optimized-cluster \
  --region=us-central1 \
  --image-version=2.3 \
  --engine=lightning
```

Alternatively, navigate to the Managed Service for Apache Spark page in the console, click Create cluster, and select ‘Enable Lightning Engine’ under the cluster configuration settings to automatically activate Lightning Engine for your Spark jobs.

We look forward to hearing about the environments you build and run as Managed Service for Apache Spark clusters!

What’s new with Google Data Cloud

Thu, 04 Jun 2026 16:00:00 +0000

June 1 - June 5

Beyond the Query: Powering AI Agents with Bigtable, Firestore & Memorystore
Discover the latest advancements in Google Cloud's NoSQL Database portfolio, including Bigtable, Firestore, and Memorystore. This series is designed for a broad audience: whether you are exploring these databases for the first time or are an existing user looking to leverage the new capabilities announced at Next '26.

Register here to secure your spot!

Cloud Engineer's AI Toolkit Workshops: Solve data-driven challenges with BigQuery, AlloyDB, Gemini and more. Hosted by Google Cloud Labs, this highly technical event is built specifically for Platform Engineers, SREs, and cloud infrastructure teams ready to bridge the gap between AI prototypes and production-grade deployments. Look out for more locations coming soon.

Toronto - June 25 (Data Cloud) | RSVP Here
Chicago - June 30 (Data Cloud) | RSVP Here

Start a 10-day Bigtable free trial with a 1-node SSD cluster and up to 500 GB of storage capacity. With no credit card required to start, you can easily ingest and manage workloads that require low-latency, high-throughput, and predictable access. Plus, new Google Cloud customers get $300 in free credits on signup.

May 11 - May 15

Managed Service for Apache Airflow has launched a wave of new features, including the general availability of Airflow 3.1, AI-powered agentic troubleshooting, a new managed Airflow MCP Server for custom agent integration, and declarative YAML-based orchestration pipelines—discover all the details in the full blog post.

April 20 - April 24

Google-built ODBC Driver for BigQuery is now available in Preview
We are excited to announce the launch of the new, Google-built ODBC driver for BigQuery. This new open-source driver provides a direct, high-performance connection for applications to BigQuery and is developed entirely in-house by Google. Download the new driver and connect your application to BigQuery.

April 13 - April 17

We announced we are reintroducing Data Studio to play a significant role in the AI era, expanding from data visualizations and reports to host BigQuery conversational agents and data apps built in Colab notebooks.
We announced BigQuery Graph is now available in preview, offering an easy-to-use, highly scalable graph analytics solution, empowering data professionals to model, analyze and visualize massive-scale relationships in an entirely new way.

April 6 - April 10

We introduced Conversational Analytics for Looker Embedded environments, enabling users to add natural language experiences to their own custom data-driven applications, powered by Gemini.
We expanded Looker’s capabilities for faster ad-hoc analysis, with the introduction of self-service Explores, enabling you to bring your own data to Looker’s semantic layer and gain instant access to insights in a governed data environment.

March 23 - March 27

We showed you how you can scale your reads with Cloud SQL autoscaling read pools. This feature allows you to provision multiple read replicas that are accessible via a single read endpoint and to dynamically adjust your read capability based on real-time application needs.
Our customers are leveraging the full power of Conversational Analytics and Looker to drive major business and technical breakthroughs in the AI era. Companies like Telenor, Pet Circle, Fluent Commerce, Lighthouse Intelligence, Wego, and ROLLER are turning data into insights and actions, grounded by Looker’s semantic layer.

March 16 - March 20

We introduced an enhanced Gemini assistant in BigQuery Studio, transforming the agent from a code assistant into a fully context-aware analytics partner.

February 23 - February 27

We introduced managed and remote MCP support for Google Cloud databases, including AlloyDB, Spanner, Cloud SQL, Bigtable and Firestore, to power the next generation of agents. This announcement extends the ability for AI models to plan, build, and solve complex problems, connecting to the database tools our customers leverage daily as the backbone of their work environment.
We outlined how you can build a conversational agent in BigQuery using the Conversational Analytics API to help you build context-aware agents that can understand natural language, query your BigQuery data, and deliver answers in text, tables, and visual charts.

February 16 - February 20

Our customers are leveraging the full power of Looker to drive major business and technical breakthroughs. Companies like Arrive, Audika, Carousell, Framebridge, GumGum, Intel, Overdose Digital, Ocean Network Express, Subskribe and Promevo are leveraging Looker’s newest AI-driven capabilities, including Conversational Analytics, to transform data to insights and actions, and empower their entire organization with a single source of truth, powered by Looker’s semantic layer.

February 2 - February 6

Join us on March 4 for our webinar, Win Your AI Strategy with Cloud SQL Enterprise Plus, to learn how to power your generative AI workloads with 3x higher performance and 99.99% availability. Register today to discover how to build a scalable, enterprise-grade foundation for your most demanding AI applications.

January 26 - January 30

We introduced Conversational Analytics in BigQuery, which allows users to analyze data using natural language. Conversational Analytics in BigQuery is an intelligent agent that generates, executes and visualizes answers grounded in your business context directly in BigQuery Studio, making data insights for data professionals more conversational.
We outlined how data products have become the foundation for AI agents, providing the context needed to make autonomous agents reliable and trusted for real business use, backed by organized business logic and semantic understanding.
We highlighted how you can supercharge data analytics workflows, and outlined Google Cloud’s AI agent offerings for data engineering, data science, and development tools, so you can integrate agentic workflows in your applications, empower your teams and speed discovery.

January 19 - January 23

We have fundamentally reimagined Firestore with pipeline operations for Enterprise edition. Experience a powerful new engine featuring over a hundred new query features, index-less queries, new index types, and observability tooling to improve query performance. Seamlessly migrate using built-in tools and leverage Firestore’s existing differentiated serverless foundation, virtually unlimited scale, and industry-leading SLA. Join a community of 600K developers to craft expressive applications that maximize the benefits of rich queryability, real-time listen queries, robust offline caching, and cutting-edge AI-assistive coding integrations.
Introducing Google Cloud SQL on MSSQLTips: We are highlighting a new technical guide published on MSSQLTips titled "Introducing Google Cloud SQL." This article serves as an essential resource for SQL Server administrators and developers exploring Google Cloud's fully managed database service. It provides a detailed overview of Cloud SQL capabilities, including high availability, security integration, and the seamless transition of on-premises SQL Server workloads to the cloud, making it an ideal resource for those planning their migration strategy.
We are excited to announce the Public Preview of Microsoft Entra ID (formerly Azure Active Directory) integration with Cloud SQL for SQL Server. Designed to tackle the challenge of identity sprawl in multi-cloud environments, this integration allows organizations to govern database access using their existing Microsoft identity infrastructure. Key benefits include centralized identity management, enhanced security features like Multi-Factor Authentication (MFA), and simplified user administration through direct group mapping. This feature is available for SQL Server 2022 and supports both public and private IP configurations.

January 12 - January 16

Google-built JDBC Driver for BigQuery is now available in Preview
We are excited to announce the launch of the new, Google-built JDBC driver for BigQuery. This new open-source driver provides a direct, high-performance connection for Java applications to BigQuery and is developed entirely in-house by Google. Download the new driver and connect your Java application to BigQuery.
Troubleshoot Airflow tasks instantly with Gemini Cloud Assist investigations: Cloud Composer just got smarter. We are excited to announce that Gemini Cloud Assist investigations are now available directly within Cloud Composer 3. Instead of manually sifting through raw logs, you can now simply click "Investigate" on a failed Airflow task. Gemini analyzes logs and task metadata to identify failure patterns—such as resource exhaustion or timeouts—and provides actionable recommendations driven by Gemini Cloud Assist to resolve the issue. This integration shifts the debugging experience from manual toil to automated root cause analysis, significantly reducing the time required to restore your pipelines. Learn more about AI-assisted troubleshooting.

What’s new in serverless Managed Service for Apache Spark

Wed, 03 Jun 2026 16:00:00 +0000

Whether you use it for data preparation, real-time interactive queries, AI model training, or something entirely different, running Apache Spark at scale is demanding — you shouldn’t have to manage the underlying infrastructure too.

Late last year, we announced the general availability (GA) of our serverless Managed Service for Apache Spark runtime version 3.0, prioritizing speed, simplicity, and reliability. Since then, customer use of Managed Service for Apache Spark for data science has nearly doubled year over year. This is a testament to our belief that Google Cloud is the easier, smarter, and faster place to run your Apache Spark workloads.

In this blog, let’s dive into a few key features that make our serverless Apache Spark offering a great fit for a wide range of workflows, including feature engineering, GPU-accelerated model training and tuning, semantic search, RAG, building AI agents and applications, and more.

Zero-setup onboarding

The most significant barrier to entry for a cloud service is often the "time to magic moment" — the interval between creating a project and running your first workload. Previously, with serverless Spark, you still needed to manually configure IAM roles, VPC networking, and firewall rules before submitting a single job.

In the serverless Spark 3.0 runtime version, zero-setup onboarding significantly reduces the time to launch your first workload on serverless Spark. It does so by automating the following steps:

Permissions: Necessary IAM roles and permissions are automatically provisioned to the appropriate service accounts.
Networking: Private Google Access is auto-enabled on subnets, and system firewall policies are configured automatically.
API management: Enabling APIs is now more efficient; you can just enable the Managed Service for Apache Spark API instead of manually having to enable several different APIs, as you did previously.

Fast startup for SLA-sensitive workloads

Latency matters, especially for interactive data science and SLA-sensitive batch pipelines. Historically, serverless Spark startup times could take several minutes. With the 3.0 runtime, we’ve dropped startup times by 75% across both standard and premium tiers, delivered automatically without any code or configuration changes and at no additional cost.

This massive improvement qualifies serverless Spark for a much broader range of SLA-sensitive workloads, and we’re always looking to optimize startup times even further.

"Serverless Spark allowed us to quickly reap benefits by removing the need for fine-grain machine management. This drove faster model development and significantly reduced our data processing costs." - César Narnajo, Principal Engineer, Moloco

Better GPU obtainability

Support for Dynamic Workload Scheduler (DWS) Flex Start Mode in the serverless 3.0 runtime version allows serverless Spark to queue customer requests for a configurable duration when GPUs are unavailable. This feature addresses the obtainability challenges for high-demand accelerators like NVIDIA A100 and L4, which are subject to frequent regional shortages. By pausing workloads until the necessary GPU capacity becomes accessible with DWS, you can dramatically increase obtainability and reliability for your latency-sensitive AI/ML workloads.

First-class support for Apache Spark 4.x

The serverless Spark 3.0 runtime version supports current and upcoming Apache Spark 4.x innovations, including Spark Connect, which supports a decoupled client-server architecture that enables remote connectivity from any client.

Enhanced multi-zonal support

To protect global enterprise workloads from zonal outages or hardware stockouts, the serverless Spark 3.0 runtime introduces enhanced multi-zonal support by default. The service can now automatically allocate execution nodes across multiple zones within a single region to help ensure obtainability.

Crucially, we do not charge for cross-zonal network traffic between nodes in a region, providing high availability without the traditional multi-zone tax. This is another benefit that you can realize by bringing your global Apache Spark workloads to Google Cloud.

Looking ahead

In addition to the above, we’re also continuing to innovate and push the boundaries of ease of use in areas such as history-based autotuning and goal-based autoscaling.

Get started today

You can take advantage of these features today by specifying runtime_version: 3.0 in your batch workloads or interactive sessions. To run your first workload on serverless Spark, perform the following simple steps:

Enable the Managed Service for Apache Spark API.
If you aren’t the project owner, ask your project admin for the serverless Managed Service for Apache Spark Editor (roles/dataproc.serverlessEditor) role on the project.

Now you’re ready to start running your workloads on the Serverless 3.0 runtime version. For more details, visit our updated documentation and access serverless Managed Service for Apache Spark in the Google Cloud console.
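As a rough illustration of pinning a workload to the 3.0 runtime, the Python sketch below builds the JSON body for a PySpark batch. The field names (`pysparkBatch`, `mainPythonFileUri`, `runtimeConfig.version`) follow the Dataproc v1 `batches` REST resource as we understand it, and the bucket path is a placeholder; treat both as assumptions and check them against the current API reference before use.

```python
import json


def build_batch_request(main_python_file_uri: str, runtime_version: str = "3.0") -> dict:
    """Build a request body for a PySpark batch pinned to a runtime version.

    Field names are assumed from the Dataproc v1 batches resource.
    """
    return {
        "pysparkBatch": {"mainPythonFileUri": main_python_file_uri},
        "runtimeConfig": {"version": runtime_version},
    }


# Hypothetical job location; replace with your own script in Cloud Storage.
body = build_batch_request("gs://example-bucket/jobs/etl.py")
print(json.dumps(body, indent=2))
```

Keeping the version as an explicit argument makes it easy to test a workload on a new runtime before switching the default.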

Accelerating data lakes: Optimizing Apache Iceberg and Spark with gcs-analytics-core

Tue, 02 Jun 2026 16:00:00 +0000

Many data engineers spend significant time managing compatibility and getting the best performance across multiple analytics engines. To help solve this pain point, we are excited to announce gcs-analytics-core, a new open-source Java library designed to centralize and accelerate analytics optimizations for Google Cloud Storage (GCS).

With this, you get the flexibility to select your preferred analytics engine while achieving high performance on GCS. The gcs-analytics-core library provides optimizations for analytics engines that you use today on GCS, such as the Iceberg Spark engine, and we plan to expand it to other analytics engines by the end of this year.

Built to be shared across major data processing frameworks like Apache Spark, this library consolidates and improves performance for analytics workloads on GCS. Available natively in the Apache Iceberg Java runtime starting from version 1.11.0, this library improves read operations for columnar formats like Parquet.

What is the gcs-analytics-core library?

The gcs-analytics-core library is a centralized optimization layer that sits between your analytics engines — such as Apache Spark, Trino, and Apache Hive — and the underlying GCS Java SDK. It intercepts read calls and injects performance enhancements, providing a consistent experience without requiring framework-specific tuning.

For Apache Iceberg users, it integrates into the GCSFileIO implementation, replacing traditional sequential reads with parallelized strategies to minimize latency and maximize throughput.
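The difference between sequential and parallelized range reads can be sketched in a few lines of Python. This is not the library's implementation (which is Java); `read_range` is a hypothetical stand-in for one ranged request to object storage, and the in-memory `blob` replaces a real GCS object.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Range = Tuple[int, int]  # (offset, length)


def read_range(blob: bytes, offset: int, length: int) -> bytes:
    """Stand-in for one ranged read against an object store."""
    return blob[offset:offset + length]


def read_sequential(blob: bytes, ranges: List[Range]) -> List[bytes]:
    """Issue one read after another; total latency is the sum of all reads."""
    return [read_range(blob, off, ln) for off, ln in ranges]


def read_parallel(blob: bytes, ranges: List[Range], max_workers: int = 4) -> List[bytes]:
    """Issue all reads concurrently; results keep the order of `ranges`."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(read_range, blob, off, ln) for off, ln in ranges]
        return [f.result() for f in futures]


blob = bytes(range(256)) * 4
ranges = [(0, 16), (300, 8), (1000, 24)]
assert read_parallel(blob, ranges) == read_sequential(blob, ranges)
```

With an in-memory buffer the two functions take about the same time. The benefit appears when each read is a network round trip, because the waits overlap instead of adding up.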

Key technical optimizations

The library introduces specific optimizations designed to reduce time spent on I/O and end-to-end execution time:

Vectored I/O (threaded): This feature improves read performance by fetching multiple data ranges in parallel within a single operation, reducing the overhead of GCS calls. Without this feature, the system needs to issue a separate call for each data range, increasing both the number of operations and open file latency for each request.
Smart Parquet prefetching: When reading Parquet data, analytics engines typically perform an initial read of the file’s footer, which contains the data structure and information about where specific data ranges are located. The library automatically prefetches this footer data in a single chunk (typically 50KB–100KB), avoiding the multiple network calls that often occur when engines repeatedly seek backward to fetch metadata.
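To see why a single tail read saves round trips, recall that a Parquet file ends with a 4-byte little-endian footer length followed by the magic bytes `PAR1`. The Python sketch below compares a two-read footer fetch with a prefetched one. It illustrates the idea only and is not the library's code: `CountingObject`, `make_fake_parquet`, and the 64 KB window are hypothetical, and the "file" is a byte string with a fake footer.

```python
import struct

MAGIC = b"PAR1"


class CountingObject:
    """In-memory stand-in for a remote object that counts read calls."""

    def __init__(self, data: bytes):
        self.data = data
        self.calls = 0

    def size(self) -> int:
        return len(self.data)

    def read(self, offset: int, length: int) -> bytes:
        self.calls += 1
        return self.data[offset:offset + length]


def footer_two_reads(obj: CountingObject) -> bytes:
    """Read the 8-byte trailer first, then seek backward for the footer."""
    size = obj.size()
    trailer = obj.read(size - 8, 8)
    if trailer[4:] != MAGIC:
        raise ValueError("not a Parquet file")
    (footer_len,) = struct.unpack("<I", trailer[:4])
    return obj.read(size - 8 - footer_len, footer_len)


def footer_prefetched(obj: CountingObject, prefetch_bytes: int = 64 * 1024) -> bytes:
    """Read one tail chunk and slice the footer out of it."""
    size = obj.size()
    start = max(0, size - prefetch_bytes)
    tail = obj.read(start, size - start)
    if tail[-4:] != MAGIC:
        raise ValueError("not a Parquet file")
    (footer_len,) = struct.unpack("<I", tail[-8:-4])
    if footer_len + 8 > len(tail):
        # Footer is larger than the prefetch window: fall back to a second read.
        return obj.read(size - 8 - footer_len, footer_len)
    return tail[len(tail) - 8 - footer_len:len(tail) - 8]


def make_fake_parquet(footer: bytes, body_size: int = 200_000) -> bytes:
    """Build bytes with a Parquet-style trailer around placeholder data."""
    return MAGIC + bytes(body_size) + footer + struct.pack("<I", len(footer)) + MAGIC


naive = CountingObject(make_fake_parquet(b"schema-and-row-group-metadata"))
fast = CountingObject(naive.data)
assert footer_two_reads(naive) == footer_prefetched(fast)
print(naive.calls, fast.calls)
```

The sketch assumes the object size is already known from metadata, so `size()` is not counted as a read.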

Spotlight: Apache Iceberg integration

We delivered the first major integration of this library into Apache Iceberg. With Iceberg 1.11.0 or later, analytics engines utilizing Iceberg’s GCSFileIO can leverage these performance enhancements. To adopt the library in your environment, verify your Iceberg catalog is configured to use the native GCS FileIO:

```properties
# Spark configuration example
spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.gcp.gcs.GCSFileIO
```

Because the core optimizations are embedded within the updated Iceberg runtime and the GCS connector architecture, you automatically benefit from Parquet footer prefetching and multi-threaded vectored reads — with no complex custom tuning required.

You can follow the specific integration details in Apache Iceberg Issue #14326.

Catalog compatibility

The gcs-analytics-core library is compatible with all Iceberg catalogs including the REST catalog, Hive, and other metadata management systems. By decoupling the performance optimizations from the catalog management layer, the library provides consistent read improvements without requiring adjustments to your existing infrastructure setup so you can scale across diverse data lake architectures.

TPC-DS Performance Benchmarks using Spark

To validate these improvements, end-to-end benchmarking was performed using an open source Apache Spark cluster with an Iceberg catalog configured to use GCSFileIO along with the gcs-analytics-core library.

The benchmark leveraged the industry-standard TPC-DS schema across varying dataset sizes (from 1GB up to 10TB), specifically comparing the new library's optimizations against the default GCSFileIO implementation, which uses sequential vectored reads.

By alleviating the I/O bottleneck at the storage layer, compute engines spend less time waiting for network responses (scan time) and more time processing data (execution time).

Here are the end-to-end TPC-DS benchmark results showcasing the percentage improvement when enabling gcs-analytics-core:

TPC-DS schema size	Scan time improvement	Execution time improvement
1 GB	71.51%	32.61%
10 GB	48.48%	18.94%
100 GB	40.98%	10.95%
1 TB	35.86%	3.38%
10 TB	18.40%	1.58%

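As the data shows, scan time and execution time improve at every dataset size. The gains are largest on small datasets (71.51% scan, 32.61% execution at 1 GB) and narrow as data grows (18.40% scan, 1.58% execution at 10 TB). The library is effective for the complex query patterns in TPC-DS, delivering scan time reductions that directly lower overall query execution time.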
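If you prefer to think in speedup factors, the percentages in the table can be converted with a short Python sketch. It assumes each "improvement" is a percentage reduction in time relative to the baseline, which the benchmark description implies but does not state explicitly; under that assumption, a reduction of p% corresponds to a speedup of 1 / (1 - p/100).

```python
# (scan time improvement %, execution time improvement %) from the table above.
improvements = {
    "1 GB": (71.51, 32.61),
    "10 GB": (48.48, 18.94),
    "100 GB": (40.98, 10.95),
    "1 TB": (35.86, 3.38),
    "10 TB": (18.40, 1.58),
}


def speedup(percent_reduction: float) -> float:
    """Convert a percentage reduction in time into a speedup factor."""
    if not 0 <= percent_reduction < 100:
        raise ValueError("reduction must be in [0, 100)")
    return 1.0 / (1.0 - percent_reduction / 100.0)


for size, (scan, execution) in improvements.items():
    print(f"{size:>6}: scan {speedup(scan):.2f}x, execution {speedup(execution):.2f}x")
```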

Get started

Before running your Spark workloads, confirm that the following requirements and configurations are met:

Use Apache Iceberg Spark runtime 1.11.0+ and the iceberg-gcp-bundle 1.11.0+.
Configure your catalog to use GCSFileIO.
Enable the gcs-analytics-core optimization flag (spark.sql.catalog.$CATALOG_NAME.gcs.analytics-core.enabled=true).
Enable vectorized I/O (spark.sql.iceberg.vectorization.enabled=true) to get the full read performance benefit.

```sh
spark-submit \
  --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.11.0,org.apache.iceberg:iceberg-gcp-bundle:1.11.0 \
  --conf spark.sql.catalog.$CATALOG_NAME=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.$CATALOG_NAME.io-impl=org.apache.iceberg.gcp.gcs.GCSFileIO \
  --conf spark.sql.catalog.$CATALOG_NAME.gcs.analytics-core.enabled=true \
  --conf spark.sql.iceberg.vectorization.enabled=true \
  <your-application-jar-or-script>
```
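If you manage Spark settings from Python, the same configuration can be generated per catalog rather than typed by hand. The sketch below only assembles the keys listed in the requirements above; the helper names `iceberg_gcs_conf` and `to_submit_args` and the catalog name `lake` are hypothetical.

```python
from typing import Dict, List


def iceberg_gcs_conf(catalog: str) -> Dict[str, str]:
    """Return the Spark settings from the checklist for one catalog name."""
    prefix = f"spark.sql.catalog.{catalog}"
    return {
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.io-impl": "org.apache.iceberg.gcp.gcs.GCSFileIO",
        f"{prefix}.gcs.analytics-core.enabled": "true",
        "spark.sql.iceberg.vectorization.enabled": "true",
    }


def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Render settings as spark-submit `--conf key=value` arguments."""
    args: List[str] = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


print(" ".join(to_submit_args(iceberg_gcs_conf("lake"))))
```

The same dictionary could be applied key by key to a `SparkSession` builder in PySpark, which keeps job submission and notebook sessions on identical settings.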

The gcs-analytics-core library is open source and available for developers to contribute to the project and explore the source code. Our implementation and micro-benchmark configurations are part of the repository and can be referenced for your contributions or validations.

GitHub repository: GoogleCloudPlatform/gcs-analytics-core
Documentation: Review the design document for deep architectural details.

We want to hear about your experience. If you test this on your own datasets, please feel free to open an issue on GitHub or share your results with the community. We look forward to seeing how you utilize these optimizations in your data lakes.

The fully-managed Remote MCP Server for AlloyDB is now Generally Available

Mon, 01 Jun 2026 16:00:00 +0000

AI agents possess incredible reasoning capabilities and can perform increasingly complex actions. But the reliability of agentic outcomes depends entirely on the quality of the context they can access — context that is frequently locked away in operational databases.

To bridge this gap, we are excited to announce that the Remote Model Context Protocol (MCP) Server for AlloyDB is now generally available.

The Model Context Protocol (MCP) is an open-source standard that gives LLMs a secure, consistent way to connect to external data sources. As part of Google Cloud’s recent rollout of 50+ Google-managed MCP servers, this new integration makes it easier than ever for both interactive and autonomous agents to securely harness the full power of your enterprise data. For example, you can now ask an AI agent for an up-to-the-millisecond view of your delivery fleet by connecting it to your real-time logistics data in AlloyDB, avoiding inaccuracies due to stale data and reducing the need for manual reporting.

Why AlloyDB is a strong foundation for agentic apps

By connecting MCP to AlloyDB, your agents get access to the premier database built for enterprise-grade AI. AlloyDB delivers the scale, speed, and intelligence required for the most demanding agentic workloads:

Supercharged vector performance: Scale to over 10 billion vectors at up to 6x the speed of standard PostgreSQL for vector queries (and up to 10x faster for filtered queries) with the ScaNN index.
Advanced search and reranking: Power multimodal applications with hybrid search via RUM (in Preview) and intelligent reranking through Reciprocal Rank Fusion (RRF) or Gemini Enterprise Platform models.
Real-time intelligence: Efficiently generate millions of embeddings using built-in AI Functions to facilitate low-latency, real-time agentic experiences.
Unified data access: Give agents a single PostgreSQL interface to seamlessly join operational data in AlloyDB with analytical data in BigQuery or archived data in Iceberg tables via Lakehouse Federation.
Enterprise-grade scale: Rest easy with a 99.99% SLA, autopilot database optimizations, and auto-scaling read pools with up to 20 nodes.

Why Remote MCP matters for AlloyDB

Local MCP servers are great for local development, but communicating over standard input/output (stdio) streams becomes difficult when you scale to production workloads. It is both architecturally complex and administratively burdensome to provision and manage all of the infrastructure and security guardrails you need to run agents for high-value use cases that interact with sensitive operational data.

The Remote MCP Server for AlloyDB runs on fully-managed Google Cloud infrastructure and exposes an HTTP endpoint that connects your AI applications to your data. This solves key challenges for teams building agents on PostgreSQL:

Centralized discovery: Find, secure, and manage your database's MCP server using Agent Registry.
Fully-managed HTTP endpoints: No need to deploy or maintain the infrastructure required for connectivity. Configure your agent to use the endpoint to get started.
Fine-grained authorization: Instead of using shared database passwords or API keys, you use Identity and Access Management (IAM) to restrict agents to specific tables, schemas, or views. With the read-only execute SQL tool, you can prevent your agent from making accidental changes to, or deletions from, your database.
Operational instance management: The AlloyDB toolset gives agents the ability to do more than run queries. Agents can update instances, export and import data, create backups, and restore clusters.
Model Armor protection: Model Armor provides optional prompt and response security to screen and filter data, defending against prompt injections or accidental data exfiltration.
Audit logging: Every query, action, and tool call goes to Cloud Audit Logs, giving security teams a full audit trail.

Let's see it in action: A quick demo

Getting started with the AlloyDB Remote MCP server is a straightforward process. To see it in action in your own environment, you can follow our new Codelab, which guides you through these essential steps:

API & environment prep: Enable the AlloyDB, Compute Engine, and Gemini Enterprise APIs in your Google Cloud project.
Provision your database: Deploy your AlloyDB cluster, create your database, and import your sample data.
Enable data access API: Permit the Data Access API on your AlloyDB instance.
Connect the agent: Configure your MCP client by providing the remote endpoint (https://alloydb.googleapis.com/mcp). Pass your Google Cloud IAM credentials using an OAuth 2.0 bearer token in the HTTP Authorization header.
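Most MCP clients handle the connection for you, but it can help to see what the last step amounts to on the wire. The Python sketch below builds, without sending, an MCP JSON-RPC 2.0 request for the endpoint above with a bearer token in the Authorization header. The helper name `build_mcp_request` and the token placeholder are hypothetical, and a real session would normally begin with an `initialize` request before calling `tools/list`.

```python
import json
import urllib.request
from typing import Optional

MCP_ENDPOINT = "https://alloydb.googleapis.com/mcp"  # endpoint from the step above


def build_mcp_request(
    method: str,
    access_token: str,
    request_id: int = 1,
    params: Optional[dict] = None,
) -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC 2.0 request for the MCP endpoint."""
    payload = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        payload["params"] = params
    return urllib.request.Request(
        MCP_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # OAuth 2.0 bearer token carrying your Google Cloud IAM identity.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
            # MCP over HTTP may answer with JSON or an event stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


# Placeholder token; in practice, obtain one from your Google Cloud credentials.
request = build_mcp_request("tools/list", access_token="YOUR_ACCESS_TOKEN")
print(request.get_method(), request.full_url)
```

Because the token is passed per request, access is governed by the IAM permissions of whichever identity issued it.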

Once the connection is established, your agent can provide reliable, grounded answers to complex business questions using your real-time operational data. By performing introspection queries, the agent automatically understands your database schema – including tables and columns – enabling it to construct sophisticated joins and queries to fulfill user requests accurately.

Once your agent has access to the AlloyDB toolset, it can execute queries, analyze operational trends, and dynamically rank text data using AlloyDB AI functions like AI.RANK().

Security remains paramount: the Remote MCP Server for AlloyDB integrates seamlessly with Model Armor. This provides protection against sensitive data leaks, even if the agent’s service account possesses broad access permissions within the database.

What's next

By enabling agents to interact securely with transactional data, we are embracing an architecture where AI agents can reliably access and act upon your enterprise’s single source of truth.

Ready to build? Discover AlloyDB with a 30-day free trial, and dive into the Remote MCP for AlloyDB Codelab to start powering your enterprise agentic applications today.

Modeling a digital twin of a food supply chain using BigQuery Graph

Mon, 01 Jun 2026 16:00:00 +0000

The example of a growing restaurant

Imagine you are running a restaurant chain. You can't physically see and touch everything to know how your business is operating. You need tools, and a digital replica of your business, to sense its health for you.

The friction of growth

Growth creates a unique kind of friction that spreadsheets simply weren't built to solve:

The bullwhip effect: Small downstream demand shifts swell into upstream inventory tidal waves.
SOP drift: Tiny departures from standard prep work eventually erode the entire brand vibe.
The food safety blast radius: One contaminated ingredient creates a messy, complex map of risk across the network.
Maverick spend: The "million-dollar leak" caused by local managers purchasing ingredients off-contract.

The digital twin

Digital models empower us to ask more insightful questions about the world, but they also force a critical choice in how we structure data. While traditional relational tables have been the standard, we must ask: are they still the right tool for everything? Given that our world is inherently interconnected, perhaps shifting to graph-based models is the natural evolution for capturing reality.

When managing thousands of assets, complex supply chains, or global logistics networks, traditional relational databases require massive, resource-intensive SQL joins to trace dependencies. This architecture creates a latency gap between physical events and operational awareness.

Modeling with BigQuery Graph

BigQuery Graph allows you to build a digital twin of your entire supply chain within your existing data platform. By turning your physical world—items, recipes, and locations—into a searchable map of nodes and edges, you gain a new level of clarity.

1. Defining the Semantic Layer

Instead of moving data to a new database, you create a Graph View over your existing tables. This tells BigQuery exactly how your tables relate to one another.

Query Language:

    -- Build the graph nodes and edges
    CREATE OR REPLACE PROPERTY GRAPH `restaurant.bombod`
    NODE TABLES (
      `restaurant.item` LABEL item PROPERTIES ALL COLUMNS,
      `restaurant.location` LABEL location PROPERTIES ALL COLUMNS,
      `restaurant.itemlocation` LABEL itemlocation PROPERTIES ALL COLUMNS
    )
    EDGE TABLES (
      `restaurant.bom`
        KEY (bomKey)
        SOURCE KEY (childItemLocation) REFERENCES `restaurant.itemlocation`(itemLocationKey)
        DESTINATION KEY (parentItemLocation) REFERENCES `restaurant.itemlocation`(itemLocationKey)
        LABEL consists_of PROPERTIES ALL COLUMNS
    );

Image of a fictitious restaurant supply chain modeled using BigQuery Graph

Precision in practice

How does this change daily operations? It moves the business from panic to precision.

Surgical recalls: If a supplier reports a Listeria outbreak, you walk the graph forward to find exactly which menu items in which specific restaurants are affected.
Weather risk analysis: When a hurricane threatens a distribution center, you don't see a list of stores; you see the blast radius. You identify the locations critically dependent on that hub and reroute supplies.
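
The surgical-recall walk can be sketched in a few lines of plain Python. This only illustrates the traversal idea and is not BigQuery Graph's API: the edge list, the `affected_nodes` helper, and the item and location names are all hypothetical. Edges point from a child item-location to the parent that consumes it, mirroring the direction of the `consists_of` edge.

```python
from collections import defaultdict, deque

# Hypothetical bill-of-materials edges: (child item@location, parent item@location).
# Direction mirrors the consists_of edge: the child flows into the parent.
EDGES = [
    ("raw_chicken@dc_east", "raw_chicken@store_1"),
    ("raw_chicken@dc_east", "raw_chicken@store_2"),
    ("raw_chicken@store_1", "chicken_salad@store_1"),
    ("raw_chicken@store_2", "chicken_wrap@store_2"),
    ("lettuce@dc_west", "chicken_salad@store_1"),
    ("lettuce@dc_west", "garden_salad@store_3"),
]


def affected_nodes(edges, source):
    """Return every node reachable downstream of `source` (breadth-first)."""
    children_to_parents = defaultdict(list)
    for child, parent in edges:
        children_to_parents[child].append(parent)

    seen = set()
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for parent in children_to_parents[node]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


recall = affected_nodes(EDGES, "raw_chicken@dc_east")
print(sorted(recall))
```

In this toy network the garden salad at store 3 is untouched, because no path connects it to the contaminated chicken.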

2. Executing the search

Graph queries give modelers and data scientists a new way to query their data: they simplify complex, multi-domain data concepts and let a query mirror the way the problem is naturally stated. For example, to investigate a specific complaint or risk, such as finding every location that handles chicken, you run a search on the model using graph query language, as shown below.

Graph Query Language

    -- Navigate to the source of a specific ingredient issue
    GRAPH restaurant.bombod
    MATCH (a:itemlocation)-[c:consists_of]->(b:itemlocation)
    WHERE b.itemKey LIKE '%Chicken%'
    RETURN TO_JSON([TO_JSON(a), TO_JSON(c), TO_JSON(b)]) AS result

Source of a foul odor - modeled as a graph

Building for the future

To get the most out of your digital twin, follow these guiding principles:

Focus on structure: Use graphs for relationships and dependencies; keep daily sales totals in relational tables.
Clean your keys: Spend time on data engineering; a graph is only as strong as its connections.
Capture edge properties: Store metadata like lead times or shipping costs directly on the edges to increase the model's utility.
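
To make the edge-property principle concrete, here is a small Python sketch that stores a lead time on each edge and sums it along a supply path. The data, the `lead_time_days` and `shipping_cost` property names, and the `path_lead_time` helper are assumptions made up for this example. In BigQuery Graph, the equivalent values would live in columns of the edge table.

```python
# Hypothetical edges keyed by (source, destination), each carrying properties.
EDGE_PROPERTIES = {
    ("farm", "dc_east"): {"lead_time_days": 3, "shipping_cost": 120.0},
    ("dc_east", "store_1"): {"lead_time_days": 1, "shipping_cost": 35.5},
    ("dc_east", "store_2"): {"lead_time_days": 2, "shipping_cost": 42.0},
}


def path_lead_time(path, edge_properties):
    """Sum the lead_time_days property over consecutive hops of `path`."""
    total = 0
    for source, destination in zip(path, path[1:]):
        total += edge_properties[(source, destination)]["lead_time_days"]
    return total


print(path_lead_time(["farm", "dc_east", "store_2"], EDGE_PROPERTIES))
```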

Conclusion

The restaurant industry has outgrown the relational way of treating business data only as a list. By building inter-domain relationships as a digital twin with BigQuery Graph, you move from reactive problem solving to proactive modeling. It’s time to stop managing your network with a list and start seeing the connections in seconds.

Get started today

Check out the tutorial here
Visit the BigQuery documentation: find overview and quickstart guide.
Share your feedback: join our community, and get your questions answered via bq-graph-preview-support@google.com.
Related blog: Introducing BigQuery Graph

Cool stuff Google Cloud customers built, May edition: Agentic algorithms for supply chains; virtual try-on APIs; robotic camera operators & more

Fri, 29 May 2026 16:00:00 +0000

AI and cloud technology are reshaping every corner of every industry around the world. Without our customers, who are building the future on our platform, there would be no Google Cloud. In this regular round-up, we dive into some of the exciting projects redefining businesses, shaping industries, and creating new categories.

For our latest edition, we learn how Urban Outfitters sped up its order management; BASF uses AlphaEvolve algorithms to map global supply chains; the unification strategy for UKG’s workforce intelligence; WPP’s secrets to training humanoid robot camera operators; how Breuninger piloted Virtual Try-On APIs; creating automated video clips with Glance; and Movix improves the production of dental aligners.

Be sure to check back next month to see how more industry leaders and exciting startups are putting Google Cloud technologies to use. And if you haven’t already, please peruse our list of 1,302 real-world gen AI use cases from our customers.

Urban Outfitters saves big by migrating order management

Who: Urban Outfitters, Inc. (URBN), the popular clothing and home goods retailer, relies on IBM Sterling OMS as the nerve center of its global ecommerce operations. However, the foundation of this critical system — a massive 11TB Oracle database — was increasingly becoming a bottleneck.

What they did: URBN completed a major infrastructure upgrade, migrating its IBM Sterling OMS from an Oracle database to Google Cloud's AlloyDB for PostgreSQL. To enhance performance and provide high availability and scalability, the AlloyDB deployment architecture includes two read replicas, providing low-latency access to data for reporting and analytics. Google Cloud and IBM teams also assisted URBN in a rigorous, iterative switchover testing strategy.

Why it matters: The migration to AlloyDB has fundamentally reshaped URBN’s data strategy, delivering a more favorable total cost of ownership through an optimized storage and compute architecture, without sacrificing performance or reliability. Furthermore, the shift to a PostgreSQL-compatible database gave URBN the flexibility of an open-source ecosystem, providing freedom from vendor lock-in, as well as significant speed improvements that enhanced responsiveness.

Learn from us: "URBN’s successful migration serves as a blueprint for organizations looking to modernize their mission-critical infrastructure and future-proof their environment for AI expansion. This journey proves that even the most complex, mission-critical migrations can be achieved through deep cross-organizational partnership and a phased, risk-mitigated approach." – Rob Frieman, CIO, Urban Outfitters & Raj Pai, VP, Product Management, Databases, Google Cloud

BASF manages supply chain decisions with AlphaEvolve

Who: BASF Agricultural Solutions manages a complex network of 180 production sites with more than 5,000 distinct value chains. Currently, human planners make thousands of local decisions every day on what to produce, when to produce it, and how much safety stock to hold.

What they did: To understand how local decisions ripple across their entire global network, BASF turned to AlphaEvolve on Google Cloud to build a digital twin of their supply chain. In collaboration with Google Cloud and prognostica GmbH, BASF fed the model three years of historical data and then generated variations of the code, mutating the logic to see if it could simulate a supply chain that matched the real-world historical data.

Why it matters: By running thousands of experiments, AlphaEvolve developed a clear, human-readable algorithm that explains how the BASF network truly operates. The final algorithm successfully mirrored the actual historical performance of the supply chain, significantly reducing the error rates compared to the initial seed model. It automatically discovered factually correct, domain-specific supply chain rules, providing a clear foundation for optimizing asset utilization globally.

Learn from us: “We had several attempts to build a digital twin. … By using AlphaEvolve, we can not only map the complex network based on system data, but at the same time understand and copy the human decisions that drive our daily operations.” – Dr. Goetz Krabbe, vice president for global supply chain at BASF

UKG unlocks real-time workforce intelligence at scale

Who: UKG is one of the leading providers of human capital management (HCM) and workforce management (WFM) solutions, but years of growth led to backend sprawl. They have 126 application teams, dozens of tech stacks, and more than 12,000 database instances.

What they did: To bring the full UKG suite onto one real-time foundation, the company built People Fabric, a new data and intelligence platform powered by AlloyDB for PostgreSQL and the just-announced Agentic Data Cloud. They created a custom change data capture (CDC) framework to extract changes from existing operational databases, and for larger analytical workloads, the same data flows into BigQuery, while Cloud SQL holds the metadata and tenancy context.

Why it matters: People Fabric gives UKG a complete and consistent view of people, work, pay, and culture data that’s updated continuously and ready for AI to use in real time. For engineering teams, People Fabric acts as a database-as-a-service that accelerates development and supports modernization without customer disruption. Additionally, migrating core person and employment data off their on-prem monolith has generated cost savings significant enough to fund half of People Fabric.

Learn from us: “As we continue expanding People Fabric, we’re laying the groundwork for deeper agentic automation, more responsive analytics, and a growing set of AI-driven capabilities — all on a trusted, scalable foundation built for what’s next.” – Radhi Chagarlamudi, Group Vice President, Product Engineering, UKG & Heather White, Cloud Data Architect, Google Cloud

WPP accelerates humanoid robot training 10x with G4 VMs

Who: WPP is one of the world’s largest marketing organizations, handling $70 billion of media for enterprise clients. They work on some of the most complex commercial film shoots and were eager to test the viability of robotic cameras to capture more footage, but this required complex training of physical AI models.

What they did: WPP used the new G4 VM instance powered by NVIDIA RTX PRO 6000 Blackwell on Google Cloud to tackle the unique challenges of training physical AI for robotics in videography settings. After capturing human motion with the OptiTrack mocap system, they undertook reinforcement learning using the AI Hypercomputer together with the NVIDIA Isaac Sim image. MuJoCo, an open source physics engine by Google DeepMind, was a critical piece of simulation software that validated accuracy continuously, in real-time.

Why it matters: WPP was able to utilize a P2P topology that moves data directly between GPUs without the bottleneck of central processing. They saw speed increases in excess of 10x, taking training times down to less than one hour. Through high-volume simulation, the humanoid robots learned how to respond to small changes and bridge the tough "sim-to-real" gap, helping ensure the robot's simulated adaptability translated to safety and stability in the real world.

Learn from us: "Our process for mastering complex, natural movement on a film set can be replicated across industries to overcome the massive computational complexity of training robots." – Perry Nightingale, SVP of Creative AI, WPP

Breuninger boosted sales with its "be your own model" AI

Who: Breuninger, a fashion and lifestyle company based in Germany, thought emerging generative media models could be a good fit to answer the question every online fashion shopper asks: "How will this look on me?"

What they did: Working with Google Cloud, they built a virtual try-on experience that lets shoppers see high-end fashion on their own bodies using a simple selfie. Using the Virtual Try-On (VTO) API, Breuninger’s data team worked directly with Google’s engineers to test and refine the technology in three stages, ultimately moving from pre-selected models to a user-first, selfie-based approach. The project was also part of Breuninger’s move to a Flutter-based platform, which helped the team move from its vision to a live launch in only three months.

Why it matters: During a six-week A/B test over Black Week and the holiday season, the team found that shoppers who used the virtual try-on converted purchases at a higher rate than those who didn't. Customer surveys reinforced the numbers: shoppers responded well to the high image quality and the personalized experience.

Learn from us: “Breuninger continues to refine the experience based on how customers actually use virtual try-on in everyday shopping — the same user-first approach that shaped the project from the start.” – Daniel Rascher, Senior Product Owner, Breuninger & Dr. Michael Menzel, Customer AI Specialist, Google Cloud

Glance turns hours of video into mobile-ready clips

Who: Glance, a mobile-first content platform, processes 1-2 hour videos from sources like podcasts, news reports, movies, and web series, and transforms them into 30 to 180-second vertical clips optimized for mobile lock screens.

What they did: The goal was to create a complete pipeline that takes a long-form landscape video (16:9) and outputs multiple ready-to-publish short-form portrait videos (9:16). The final technical solution uses Google Cloud Speech-to-Text v2, Gemini, and the Google Vision API, combined with custom video manipulation using Samurai (an open-source object tracking tool), OpenCV and MoviePy. The process involves audio extraction, speech-to-text transcription, and using Gemini 2.5 Flash to analyze transcript text and identify optimal start and end timestamps for short video clips.

Why it matters: With daily volume projected to grow from 3,500 to over 10,000 videos per day, manual editing wasn’t a realistic path forward. Glance’s video pipeline demonstrates what becomes possible when AI handles the repetitive, judgement-intensive work of video editing. The system transforms thousands of long-form videos into mobile-ready clips each day, preserving narrative context while optimizing for vertical viewing. Rather than choosing between scale and quality, automated pipelines can deliver both.

Learn from us: “Glance’s video pipeline demonstrates what becomes possible when AI handles the repetitive, judgement-intensive work of video editing. … The approach offers a template for any organization sitting on long-form video archives. Rather than choosing between scale and quality, automated pipelines can deliver both.” – Himanshu Aggarwal, Machine Learning Engineer, Glance & Sharmila Devi, AI Consulting Lead, Google Cloud

Movix fills a gap in dental skills with specialized agentic AI

Who: Movix is building one of the first agentic AI solutions for dental appliance manufacturers and dental labs, to help solve a serious shortage of skilled dental technicians in aligner manufacturing.

What they did: Movix developed custom models for deep learning, computer vision, and 3D mesh analysis over a five-month period, using Google Cloud infrastructure. Once defects are detected, they use the Gemini Enterprise Agent Platform to generate client-facing feedback that reads as if it came directly from a human technician. Their 3D models use Cloud Run with L4 GPUs for the massive compute power required, and they use Compute Engine VMs to run experiments and train models.

Why it matters: Movix’s agentic solutions automate data entry and quality control, which are traditionally manual, time-consuming, and error-prone tasks. The automation and higher level of accuracy the QC agent delivers can save $300 per remake for an aligner manufacturer, and speed up the appliance manufacturing process with quicker turnaround times.

Learn from us: “We plan to build hybrid solutions … designing an architecture that connects our cloud-based AI agents with older, on-premises software that many conservative labs still use — through lightweight local connectors and standardized APIs. This will allow us to access a large market segment that has not yet migrated to the cloud.” – Marina Domracheva, CEO, Movix & Bakit Dzhumagulov, CTO, Movix

From petabytes to predictions: Easy BigQuery insights in Google Sheets

Fri, 29 May 2026 16:00:00 +0000

Many organizations’ single source of truth is data that resides in BigQuery, Google’s governed, secure and petabyte-scale data platform. However, the "last mile" of ad-hoc analysis, modeling, and reporting often happens where business users are most comfortable: Google Sheets.

Bridging this gap usually involves exporting data as CSVs. But this is inefficient, creating data silos, version control problems, and security and governance risks. Connected Sheets helps to eliminate this trade-off, turning the familiar Google Sheets interface into a direct, live window into your BigQuery data platform, letting you analyze petabytes of data quickly, securely, and easily.

In this post, we’ll do a quick overview of Connected Sheets, walk through real-world use cases, and show you how to perform enterprise-grade data analysis using BigQuery directly in Google Sheets.

A live window into the single source of truth

Business users often wait days or weeks for simple reports. Connected Sheets solves this by letting you analyze your critical data via a secure, direct connection to billions of rows of live data, with no SQL required.

For data admins, this architecture is appealing because it maintains a strong security and governance posture. They can provision access to specific tables or views, confident that the underlying data cannot be altered from a Connected Sheet. Admins can also take advantage of Google Workspace’s enterprise data protections to control reading, sharing, and copying data throughout its lifecycle.

For end users, the benefit is immediate agility and ease of use. They can use familiar tools like pivot tables, charts, calculated columns, and formulas to analyze billions of rows of live data as if it were a local file, balancing centralized control with the business's demand for speed. End users don’t have to learn technical concepts like databases, schemas, tables, and query languages like SQL to access, analyze, and visualize the data.

Key use cases and core journeys

We consistently hear about three primary use cases for Connected Sheets from customers across industries.

1. Self-service exploratory analysis: Data teams provide access to curated tables and datasets in BigQuery. Business Analysts in sales, operations, finance, or marketing can then build their own pivot tables or charts that run over the entire live data source directly from Sheets, then filter data to answer day-to-day questions, freeing the data team from a constant backlog of ad-hoc requests.

Example: Deep-dive investigation

Scenario: A sales manager analyzes millions of global transactions to review quarterly performance.
Action: Using Connected Sheets, they quickly create a pivot table to summarize revenue by region and product line. When they spot an anomaly, such as an unexpected revenue spike in EMEA, they simply double-click the summarized value to drill down into exactly what led to that value.
Outcome: Connected Sheets instantly queries and retrieves the precise, granular transaction rows behind that summary value, making it easy and fast to find the root cause.
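
The summarize-then-drill-down flow in this example can be mimicked in plain Python to make the logic concrete. This is a conceptual sketch only: Connected Sheets runs the equivalent work in BigQuery, and the transaction rows, field names, and the `pivot_revenue` and `drill_down` helpers below are invented for illustration.

```python
from collections import defaultdict

# Hypothetical transaction rows standing in for a BigQuery table.
TRANSACTIONS = [
    {"id": 1, "region": "EMEA", "product_line": "Apparel", "revenue": 1200.0},
    {"id": 2, "region": "EMEA", "product_line": "Apparel", "revenue": 9800.0},
    {"id": 3, "region": "EMEA", "product_line": "Home", "revenue": 400.0},
    {"id": 4, "region": "AMER", "product_line": "Apparel", "revenue": 1500.0},
]


def pivot_revenue(rows):
    """Summarize revenue by (region, product_line), like a pivot table."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row["region"], row["product_line"])] += row["revenue"]
    return dict(totals)


def drill_down(rows, region, product_line):
    """Return the detail rows behind one summarized pivot cell."""
    return [
        row for row in rows
        if row["region"] == region and row["product_line"] == product_line
    ]


summary = pivot_revenue(TRANSACTIONS)
details = drill_down(TRANSACTIONS, "EMEA", "Apparel")
print(summary[("EMEA", "Apparel")], [row["id"] for row in details])
```

Here the EMEA Apparel cell stands out, and the drill-down shows that transaction 2 accounts for most of it.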

2. Operational reporting: Business users can create live, refreshable, and easy-to-understand dashboard-like views of their data that their partner teams can rely on and share with executives and leads.

Example: Automated executive summary

Scenario: An operations lead provides weekly updates on sales invoices to their leadership, based on a BigQuery dataset with millions of rows.
Action: The operations lead creates their Connected Sheet and builds a series of charts to visualize invoice trends over time. They then configure the sheet to automatically refresh on a schedule every Monday morning, so it’s always ready ahead of their executive review.
Outcome: The manual routine of exporting data and pasting it into workbooks is completely eliminated. Leadership gets a reliable report and analysis powered by the latest warehouse data.

3. Hybrid data modeling: Data practitioners often need to blend governed warehouse data with real-time manual inputs and annotations. For example, a finance team might pull revenue data from BigQuery and combine it with manual procurement entries from their ERP system in a separate tab, using VLOOKUP to create a consolidated view for month-end reporting.
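
The lookup-style blend described here amounts to a keyed join between two sources. The Python sketch below shows that idea under stated assumptions: the `cost_center` key, the sample figures, and the `consolidate` helper are hypothetical, and a real sheet would use VLOOKUP against the manual tab rather than this code.

```python
# Hypothetical revenue rows, as if pulled from BigQuery.
REVENUE = [
    {"cost_center": "CC-100", "revenue": 50000.0},
    {"cost_center": "CC-200", "revenue": 32000.0},
    {"cost_center": "CC-300", "revenue": 8000.0},
]

# Hypothetical manual procurement entries, as if typed into a separate tab.
PROCUREMENT = {"CC-100": 12000.0, "CC-200": 9500.0}


def consolidate(revenue_rows, procurement_by_center):
    """Attach procurement spend to each revenue row by key.

    A missing key yields None, loosely analogous to a lookup with no match.
    """
    consolidated = []
    for row in revenue_rows:
        spend = procurement_by_center.get(row["cost_center"])
        consolidated.append({**row, "procurement": spend})
    return consolidated


for line in consolidate(REVENUE, PROCUREMENT):
    print(line)
```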

Example: Custom business metrics

Scenario: A financial analyst calculates custom commission payouts based on live sales data from the company's CRM system. The commission tier logic changes frequently and isn't modeled in the central data warehouse.
Action: Instead of requesting a new data pipeline from their data team, the analyst can add a calculated column directly within the Connected Sheet. They use standard spreadsheet formulas (like IF or IFS) to apply custom business logic directly against the BigQuery data.
Outcome: The analyst retains the flexibility to model scenarios and calculate metrics quickly, while maintaining governed BigQuery data as their single source of truth.
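
As an illustration of the tiered logic such a calculated column might hold, here is a Python sketch of a first-match tier rule, which is how an IFS formula evaluates its conditions. The thresholds, rates, and the `commission` function are hypothetical and not taken from any real commission plan.

```python
# Hypothetical commission tiers: (minimum sale amount, commission rate).
# Ordered from highest threshold to lowest, like conditions in an IFS formula.
TIERS = [
    (100_000, 0.10),
    (50_000, 0.07),
    (0, 0.05),
]


def commission(sale_amount, tiers=TIERS):
    """Return the payout using the first tier whose threshold is met."""
    for minimum, rate in tiers:
        if sale_amount >= minimum:
            return round(sale_amount * rate, 2)
    raise ValueError("sale_amount must be non-negative")


print(commission(120_000), commission(60_000), commission(10_000))
```

Because the tiers live in one small table, changing the commission plan means editing a few numbers rather than requesting a new pipeline.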

Getting started

Connecting Google Sheets to BigQuery is straightforward and requires only a Google Workspace account and a billing-enabled Google Cloud project. There are two primary ways to establish a connection and create a Connected Sheet.

Path 1: Starting from Sheets
This is the typical workflow for users who work primarily within spreadsheets.

Open a new Google Sheet.
Navigate to Data > Data Connectors > Connect to BigQuery.
Select your billing-enabled Google Cloud project.
Browse available datasets, select a Saved Query to connect right away, or input a custom SQL query.
Click Connect.

Path 2: Starting from BigQuery
This workflow is common for data analysts starting from the Google Cloud console.

Navigate to the BigQuery UI in the console.
In the Explorer pane, locate the table or query result you wish to analyze.
Click the Export menu (or the three-dot action menu) next to the asset.
Select Open in > Connected Sheets.

From petabytes to predictions with Connected Sheets

We designed Connected Sheets to help you bridge the gap between the scalability of the cloud and the flexibility of the spreadsheet. With Connected Sheets, we’re making it easier than ever for organizations to put data into the hands of the people who need it.

To explore these features, connect your BigQuery data to Google Sheets today. For more technical details, visit the Connected Sheets documentation.

Evolving Dataflow to process massive datasets for machine learning

Thu, 28 May 2026 16:00:00 +0000

Google created MapReduce more than 20 years ago to solve the scaling problems in data processing that the then-young company was running into. The AI era that we are in now demands efficient, large-scale data processing for everything from training frontier models like Gemini by Google DeepMind to powering fully autonomous vehicles like Waymo.

Many aspects of machine learning, including data ingestion, transformation, and feature extraction, rely heavily on processing massive datasets. To meet the astronomical scale required by efforts across Google, we evolved our data platform, Flume, the successor to the original MapReduce, with innovations focused on scalability, efficiency, and a better developer experience. Many of those innovations are available as part of Dataflow, our fully managed batch and streaming platform built on the same core technology Google uses to power its most demanding internal workloads. In this blog, we provide an overview of the many innovations in the Flume platform, and a glimpse into how Google Cloud customers are putting those features into action with Dataflow.

Addressing massive scalability

The scale of data processing at Google has exploded over the last 20 years and continues to drive innovation. To tackle the challenges of immense scale, we introduced several features within Google's data processing platform, which are also available in Dataflow:

Liquid sharding dynamically splits work units (shards) during execution for on-the-fly rebalancing. This helps pipelines with uneven data distribution and stragglers to maximize worker efficiency as data grows.
Global compute enables enormous scaling by dynamically scheduling workloads across Google's global infrastructure. The system automatically determines the optimal location based on factors like data locality and resource availability.
Automatic pipeline optimization fuses consecutive operations into a single stage. This reduces I/O and stage-transition overhead, allowing large-scale execution to scale more gracefully.
Rate-limiting external API calls manages load on external services. This is essential for modern ML pipelines that frequently call external APIs for tasks like model evaluation, preventing high data volumes from overloading systems.
Tandem pools facilitate serverless remote inference. This feature helps overcome scalability limitations often found in remote inference systems by efficiently hosting, sharing, managing, and autoscaling external model servers.
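
To give a feel for the rate-limiting item in the list above, here is a generic token-bucket limiter in Python. This is a well-known general technique and not a description of how Dataflow implements rate limiting. The `TokenBucket` class, its parameters, and the fake clock are assumptions for this sketch.

```python
import time


class TokenBucket:
    """Allow `rate` calls per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.last_refill = clock()

    def try_acquire(self):
        """Consume one token if available; return whether the call may proceed."""
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Usage with a fake clock so the behavior is deterministic.
fake_now = [0.0]
bucket = TokenBucket(rate=2, capacity=2, clock=lambda: fake_now[0])
first_burst = [bucket.try_acquire() for _ in range(3)]
fake_now[0] = 0.5  # half a second later, one token has been refilled
after_wait = bucket.try_acquire()
print(first_burst, after_wait)
```

A pipeline stage would check `try_acquire()` before each external API call and back off or queue the request when it returns `False`.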

Boosting efficiency with accelerators

Doing more with less isn't just a constraint; it fuels our progress. By finding ways to run more efficiently, we create the space and capacity needed for rapid innovation. This is particularly evident for teams that use accelerators like TPUs for their workloads. To improve utilization and cost efficiency, our engineers devised several novel features for our platform, now part of Dataflow:

Heterogeneous worker pools allow developers to specify custom resource requirements for different pipeline stages. For example, TPU-intensive work runs on TPU-equipped workers, while other stages use standard CPU workers. This ensures optimal resource allocation.
TPU-aware autoscaling prevents excessive initial assignment of TPU workers and improves efficiency during subsequent autoscaling events.
Duty-cycle policy enforcement automatically scales down TPU workloads when the accelerator's duty cycle (the fraction of time it is active) is low, scaling back up only when utilization improves.
TPU fungibility: By working with other infrastructure teams, we developed optimizations to encourage scheduling jobs to the most suitable TPU version and cell location based on quota and resource availability.
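
The duty-cycle idea in the list above can be sketched as a simple threshold policy. The code below is a hypothetical illustration, not Dataflow's actual policy: the `next_worker_count` function, the 0.3 and 0.7 thresholds, the halving and doubling steps, and the worker limits are all assumptions.

```python
def next_worker_count(current_workers, duty_cycle, *, low=0.3, high=0.7,
                      min_workers=1, max_workers=16):
    """Pick a new worker count from the observed accelerator duty cycle.

    duty_cycle is the fraction of time the accelerator was busy (0.0 to 1.0).
    Thresholds and limits are illustrative defaults, not Dataflow settings.
    """
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0.0 and 1.0")
    if duty_cycle < low:
        # Accelerators are mostly idle: halve the pool, but keep a minimum.
        return max(min_workers, current_workers // 2)
    if duty_cycle > high:
        # Accelerators are saturated: double the pool, up to the cap.
        return min(max_workers, current_workers * 2)
    return current_workers


print(next_worker_count(8, 0.1), next_worker_count(8, 0.5), next_worker_count(8, 0.9))
```

The gap between the two thresholds keeps the pool from oscillating when utilization hovers around a single cutoff.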

Enhancing the developer experience

Considering the wide mix of backgrounds and tools across Google, rapid prototyping, iteration, and reliable production operations are extremely important. Google has invested in significant capabilities to enhance the overall user experience:

Language flexibility is provided through a versatile SDK with a simple API in C++ (internal to Google), Java, Python, and Go (with SQL support). This allows users to build batch, ML, and streaming pipelines.
Integration with ML frameworks like JAX is available, along with native support for LLM-specific optimizations. The underlying platform also provides building blocks for robust agentic inference pipelines and supports simple transitions between bulk and streaming paradigms.
Unified batch and streaming enables users to use the same code for both historical batch and live streaming data. This simplifies the architecture, which traditionally would have required separate pipelines for batch and streaming data processing.
Observability for production pipelines is available through the monitoring UI, which offers comprehensive control and essential diagnostic data. Detailed performance metrics, such as stage-level TPU utilization graphs, provide transparency for troubleshooting and optimization tasks.
Advanced developer workflows for quicker day 0 and day 2 operations include features like sampling and dry-run to help ensure code accuracy. Users can also test pipelines on small in-memory collections, and even pause and resume production pipelines.
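
Two of the ideas above, reusing the same code for batch and streaming and testing on small in-memory collections, can be illustrated without any pipeline SDK. The plain-Python sketch below is an analogy only and uses no Dataflow or Apache Beam APIs. The `word_counts` and `final_counts` functions and the sample lines are made up for the example.

```python
def word_counts(lines):
    """Yield (word, running_count) pairs for any iterable of text lines."""
    counts = {}
    for line in lines:
        for word in line.lower().split():
            counts[word] = counts.get(word, 0) + 1
            yield word, counts[word]


def final_counts(lines):
    """Collapse the running counts into final totals."""
    totals = {}
    for word, count in word_counts(lines):
        totals[word] = count
    return totals


# Batch: a small in-memory collection, handy for testing.
batch_input = ["spark flume beam", "beam dataflow beam"]


# Streaming stand-in: a generator that produces lines one at a time.
def stream_input():
    yield "spark flume beam"
    yield "beam dataflow beam"


print(final_counts(batch_input) == final_counts(stream_input()))
```

Because the transform only assumes an iterable, the same logic runs unchanged over a finished list or a source that is still producing data.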

Dataflow brings innovation from Google's internal platform to Google Cloud

Dataflow is built upon Google's internal platform, sharing many core components, including the execution engine and the Apache Beam SDK (which originated from Flume’s APIs). This close relationship means that the cutting-edge solutions we build to handle Google’s internal data processing challenges, like pipelines that process hundreds of billions of documents, directly benefit Dataflow users. In fact, unique Dataflow features like vertical scaling, right fitting, dynamic sharding, and straggler detection all resulted from solutions developed for Google’s internal workloads.

This is one of the reasons many Google Cloud customers rely on Dataflow for critical ML applications: Spotify uses Dataflow for large-scale generation of ML podcast previews. Etsy leverages Dataflow for data preparation and ETL for its ML workloads. And Moloco uses Dataflow to process terabytes of data a day to update its prediction model for real-time ad bidding.

The momentum continues: Last quarter we launched support for TPUs in Dataflow, in addition to supporting GPUs. Looking ahead, we are working on an advanced reliability feature called speculative execution and are enhancing the developer experience with features like failure isolation and replay and pause/resume, which are coming soon. To learn more or get started with Dataflow, visit https://docs.cloud.google.com/dataflow/docs/get-started.

The future of agentic development: Redefining the data practitioner lifecycle with Data Agent Kit

Tue, 19 May 2026 17:45:00 +0000

The modern software development landscape isn’t happening just on one surface — it’s happening across an entire ecosystem of agentic tools. Agents are being developed at an unprecedented scale, and these agents require direct access to enterprise data for context and grounding.

However, the current tooling for building agents and managing data is heavily fragmented. This fragmentation can make data difficult to access, increase security risks, and break developer experiences in ways that hinder innovation.

To address this challenge, we recently launched Data Agent Kit, a unified, open-source collection of data engineering and data science skills, tools and plugins that integrate directly into the environments practitioners already use, such as VS Code, Claude Code, Codex, Gemini CLI and the Antigravity CLI. By seamlessly bringing together these core tools and skills with your enterprise data, the Data Agent Kit effectively serves as a comprehensive harness for agentic context, memory, and personalization. It provides:

Agentic skills: Pre-codified pathways for interacting with your data estate, covering query optimization, ML best practices, data validation, data drift checks, governance, and troubleshooting.
Model Context Protocol (MCP) tools: Secure connections between agentic workflows and cloud data platforms like BigQuery, AlloyDB, and Google Cloud Storage. Developers can now configure connection parameters for their cloud datasets and data processing engines without having to manage complex, manual pipeline code.
Plugins and extensions: Native IDE integrations that enable rich, context-aware developer interactions.

Together, these Data Agent Kit capabilities help data practitioners go from manually writing code to intent-driven data science and engineering: defining the desired business outcomes, constraints, and success criteria, and allowing the AI-augmented system to figure out how to execute it. This shift is critical because today, when building agentic applications that navigate complex data architectures, there’s often a "context window tax": developers have to manually paste vast amounts of schema metadata into prompts, eating up token limits and increasing latency. Meanwhile, data practitioners often lack guidance about how to efficiently query, optimize, and troubleshoot cloud data, while specialized, fragmented development environments cannot see across your entire data estate. Data Agent Kit helps with these challenges and others, providing the foundational capabilities data practitioners need for a new agentic way of working.

Read on for an overview of Data Agent Kit’s features and benefits, how to install it and connect your local environment to your data estate, and an intent-driven engineering example.

A unified hub for your data estate and lifecycle

Data Agent Kit makes your entire data estate available in a single view. This goes beyond providing a simple catalog for databases such as BigQuery, AlloyDB and Spanner; rather, it integrates data engineering and science tasks, orchestration pipelines, and jobs into a single interface. This allows practitioners to manage their entire data workflow — from discovery to production — without context switching. Data Agent Kit’s intelligent routing automatically chooses the optimal compute engine for your task — whether that’s BigQuery for SQL-native analytics and ELT, or Spark for custom Python transformations and distributed ML training.

Unified Hub of your entire data estate and lifecycle

Ecosystem-led intelligence: Codified agentic skills

Data Agent Kit offers a library of predefined agentic skills (e.g., ML best practices, ELT, building data apps) based on Google Cloud’s data engineering and science expertise. Rather than relying on generic LLM prompts, it codifies prescriptive guidelines into your workflow. This allows you to inject enterprise-grade data intelligence directly into your IDE or CLI.

Browsing a predefined list of agentic data engineering and science skills

Transforming data exploration through natural language

Grounded in this unified data, Data Agent Kit delivers native conversational analytics directly within your workspace, making it easy to explore your data. Powered by the same Gemini natural language to SQL technology found in our first-party agents (e.g., Conversational BigQuery and Looker), Data Agent Kit lets you run natural language queries to profile, search, and visualize your datasets.

Within Data Agent Kit, you can use Conversational Analytics to explore your data

A practical walkthrough: Unifying data and building models

To see how Data Agent Kit’s skills and MCP tools work together, consider a financial services scenario: Your company is facing rising fraud claims. With your transaction data stored in Cloud Storage, you need to build a high-confidence fraud detection model and schedule orchestration pipelines. Traditionally, this involves hours of data wrangling across multiple consoles. With the Data Agent Kit, you can complete this in minutes, directly within your IDE or CLI. Let’s see how.

Onboarding: The one-minute setup

You can get started with the Data Agent Kit in under a minute through an integrated setup process.

To do so, search for "Google Cloud Data Agent Kit" in your IDE’s marketplace (VS Code) or via the GitHub repo in your CLI (Gemini, Antigravity, Claude, Codex) from the links in the “Get started today” section below. Data Agent Kit automatically configures dependencies and checks your Google Cloud login status.

Click the Google Cloud icon in your activity bar to authenticate via IAM. Once logged in, your Cloud Storage, databases, and catalog assets appear instantly in your workspace.

Use the settings menu to set project IDs and regions, and to verify MCP status so that all backend services are authorized. Data Agent Kit also includes a quick-start guide on using the tools and skills.

An intent-driven data engineering example

With Data Agent Kit installed, you can skip the manual ETL boilerplate, and directly describe your high-level goal to your coding assistant (e.g., Claude Code, GitHub Copilot) in natural language. The assistant leverages Data Agent Kit’s skills to plan and execute the workflow.

Prompt:

I have the raw transaction logs landing in the GCS bucket gs://fin-clearing-raw/.

First, create a Spark notebook and (1) ingest these logs into an Iceberg table in BigQuery.

Second, create a dbt project to (2) deduplicate them, (3) remove the transactions with invalid transaction id and store them in a separate Iceberg table, (4) standardize the timestamps and perform any other necessary cleanup tasks (5) sync the output to another Iceberg table (6) join this output table with tables that have payer and payees identities and write the output to a final Iceberg table.

Third, I would like you to train an ML model on Spark using a notebook to detect fraudulent transactions in the output table. I am thinking about a LightGBM model but I am open to any suggestions you might have. Use the relevant datasets in the project.

Finally, create an inferencing step using Spark notebook to the above pipeline to perform batch inferencing and write flagged transactions to a Spanner table.

Create an orchestration pipeline that first runs the ingestion then the dbt and next the inference notebook.

Under the hood: Data pipeline steps

Behind the scenes, Data Agent Kit plans a robust multi-step orchestration of the entire data lifecycle, from exploration to inference.

Step 1: Notebook creation, ingestion and initial storage

Find your bronze data — raw, unfiltered data on financial transactions — and bring it into an Iceberg table before doing the transformations.

Automatically create a Notebook to ingest the raw logs from Cloud Storage.
Write the necessary SQL, and store the ingested data into an Iceberg table in BigQuery.

Ingestion into a bronze table

Step 2: Transformation (dbt Project)

Now, clean the bronze data into silver and gold tables:

Data preparation: Deduplicate the transaction logs.
Filter invalid IDs: Identify transactions with invalid IDs and store them in a separate Iceberg table.
Clean and standardize: Standardize timestamps and perform other necessary cleanup tasks.
Sync: Output the cleaned data to another Iceberg table, leveraging the BigQuery MCP server.
Enrichment: Join the cleaned table with payer and payee identity tables.
Final output: Write the joined dataset to a final Iceberg table.

Data transformation to create silver and gold tables
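
To make the transformation logic concrete, the deduplicate, filter, and standardize steps above can be sketched in plain Python. This is only an illustration, not the dbt project the agent generates: the field names (`txn_id`, `ts`), the `TXN-` ID format, and the sample rows are all assumptions.

```python
import re
from datetime import datetime, timezone

# Hypothetical rule: a valid transaction ID is "TXN-" followed by digits.
VALID_ID = re.compile(r"^TXN-\d+$")


def to_utc_iso(ts):
    """Standardize an ISO-8601 timestamp that carries an offset to UTC."""
    return datetime.fromisoformat(ts).astimezone(timezone.utc).isoformat()


def transform(rows):
    """Deduplicate, split out invalid IDs, and standardize timestamps."""
    seen, clean, invalid = set(), [], []
    for row in rows:
        key = (row["txn_id"], row["ts"])
        if key in seen:  # drop exact duplicates
            continue
        seen.add(key)
        if not VALID_ID.match(row["txn_id"] or ""):
            invalid.append(row)  # quarantine rows with invalid IDs
            continue
        clean.append({**row, "ts": to_utc_iso(row["ts"])})
    return clean, invalid


rows = [
    {"txn_id": "TXN-1", "ts": "2026-05-01T10:00:00+02:00"},
    {"txn_id": "TXN-1", "ts": "2026-05-01T10:00:00+02:00"},  # duplicate
    {"txn_id": "", "ts": "2026-05-01T11:00:00+00:00"},  # invalid ID
]
clean, invalid = transform(rows)
```

In the walkthrough, `clean` and `invalid` correspond to the two separate Iceberg tables: one for cleaned transactions and one for the rows with invalid IDs.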

Step 3: Machine learning and inferencing

With your gold table minted, it’s time for some data science: model training and inferencing. Here, the agent hands the clean data from the previous step to the model to spot fraudulent patterns.

Training: Use a Spark notebook to train an ML model.
Inference: Create a Spark notebook inferencing step for batch processing.
Storage: Write all flagged fraudulent transactions to a Spanner table by leveraging the Spanner MCP.

Machine learning and inference

Step 4: Orchestration and execution

Finally, you’re ready to move to production and schedule the whole orchestration pipeline: Ingestion -> Transformation -> Inference.

Orchestration pipelines and scheduling runs
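
The ordering constraint in this pipeline (ingestion, then transformation, then inference) is a small dependency graph. As a minimal sketch, assuming placeholder step names and no-op actions in place of the real notebooks and dbt project, Python's standard `graphlib` module can resolve the run order:

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it depends on.
# Step names are illustrative placeholders, not Data Agent Kit identifiers.
PIPELINE = {
    "ingestion": set(),
    "dbt_transformation": {"ingestion"},
    "inference": {"dbt_transformation"},
}


def run_pipeline(steps, actions):
    """Run each step after its dependencies and return the execution order."""
    order = list(TopologicalSorter(steps).static_order())
    for name in order:
        actions[name]()
    return order


log = []
actions = {name: (lambda n=name: log.append(n)) for name in PIPELINE}
order = run_pipeline(PIPELINE, actions)
```

A real orchestrator adds scheduling, retries, and failure handling on top of this ordering; the sketch only shows why the inference notebook cannot start before the dbt step has finished.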

When things go sideways: Agentic incident management and intelligent recovery

If an orchestration pipeline fails, not to worry, Data Agent Kit streamlines resolution using its intelligent incident management capabilities:

Intelligent diagnosis: Automatically conducts root cause analysis to pinpoint failure sources
Autonomous remediation: Drafts and tests fixes, bypassing manual debugging
Automated recovery: Validates and deploys fixes via automated Git workflows

Issue diagnosis and remediation

And there you have it: You’ve gone from raw discovery to a fully automated, fraud-catching machine in a matter of minutes, all from within the same UX. There’s no need to hop between multiple browser tabs and IDE interfaces, or to learn data engineering and science best practices. Data Agent Kit orchestrates a clean end-to-end flow leveraging various MCP tools and codified skills. Ultimately, this approach helps you achieve what matters most: shipping innovative, high-performance data applications at scale.

Get started today

Data Agent Kit is available today in preview. Start by installing it in your favorite IDE or CLI:

Then visit the documentation to learn more and get started.

Beyond the Query: 5 Scenarios Laying the Foundation for the Agentic Era

Mon, 18 May 2026 16:00:00 +0000

Accessing enterprise data is shifting from static reports to dynamic use by autonomous systems. To keep up, organizations must route fragmented data from SaaS, IoT, and legacy sources into secure, scalable endpoints.

However, moving to AI-driven exposure requires more than just connecting an LLM to a database. It requires a fundamental architectural shift to manage security, costs, and semantic accuracy.

What we’ll cover

This article explores the technical evolution of data exposure through five architectural patterns: from manual SQL development to autonomous workflows standardized by the Model Context Protocol (MCP).

While the examples use BigQuery and mocked CRM data, the patterns apply to most enterprise data assets transitioning into an agentic workflow.

The 5 Scenarios of Data Evolution

The transition from static reports to agentic insights is defined by two factors: trust and complexity.

Trust dictates autonomy: Low-trust environments (like external client-facing apps) require deterministic, hard-coded logic to prevent errors. High-trust environments (like internal tools for power users) allow for probabilistic LLM reasoning, where there is more tolerance for non-deterministic outputs.

Complexity defines utility: Simple lookups need fast, cached responses. In contrast, complex, cross-functional problems require an agent to orchestrate multiple tools and data sources.

To navigate this shift, we will examine five technical scenarios, starting with the baseline of the static API.

Scenario 1: The Static API Contract

Focus: Maximum stability and deterministic execution

Scenario 1 represents the traditional model of data exposure. A developer acts as the intermediary, translating specific business requirements—such as "Show me top-selling products by category"—into optimized, hard-coded SQL queries.

Isolation and Predictability

This approach provides the highest level of security and performance:

Low logic risk: Because the SQL is pre-written and vetted, there is no risk of a user (or an agent) crafting a query that accesses unauthorized data.
Secure by design: Using parameterized queries instead of string concatenation provides a hard barrier against SQL injection.
Reliability: The output is deterministic. If the development lifecycle is robust, the user is guaranteed to receive exactly what they requested, with predictable execution costs and performance.
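
The difference between parameter binding and string concatenation is easy to demonstrate locally. The sketch below uses Python's built-in `sqlite3` as a stand-in for BigQuery (an assumption made purely so it runs without a cloud project), with a made-up `products` table and a classic injection payload.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER, name TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(1, "Jeans", "Pants"), (2, "Tee", "Tops"), (3, "Hoodie", "Tops")],
)

malicious = "Tops' OR '1'='1"

# Unsafe: the input is spliced into the SQL text and rewrites the WHERE clause.
unsafe_sql = f"SELECT id FROM products WHERE category = '{malicious}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the input is bound as a value and is only ever compared as a string.
safe_rows = conn.execute(
    "SELECT id FROM products WHERE category = ?", (malicious,)
).fetchall()
legit_rows = conn.execute(
    "SELECT id FROM products WHERE category = ?", ("Tops",)
).fetchall()
```

The concatenated query returns every row in the table, while the bound version returns nothing for the malicious input and only the two matching rows for a legitimate one.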

Implementation example

This snippet demonstrates the baseline for data exposure: a direct, static API contract. It offers maximum predictability by using parameterized queries to prevent SQL injection and ensure consistent performance.

A note on the code examples: To prioritize architectural clarity, these examples are provided as conceptual blueprints rather than production-ready code. They are designed for pedagogical purposes and intentionally omit "industrial" requirements such as persistent session state, IAM/Auth protocols, and comprehensive exception handling. Use these only as a logic guide before implementing your own hardened and production-ready solution.

    from google.cloud import bigquery


    def fetch_products(limit=10):
        client = bigquery.Client()
        # Use named parameters to ensure security and prevent SQL injection
        sql = """
            SELECT id, name
            FROM `bigquery-public-data.thelook_ecommerce.products`
            LIMIT @limit
        """
        job_config = bigquery.QueryJobConfig(
            query_parameters=[
                bigquery.ScalarQueryParameter("limit", "INT64", limit)
            ]
        )
        return client.query(sql, job_config=job_config).to_dataframe()

Analysis

Parameter	Rating	Impact
Flexibility	Low	Users cannot change the query logic or filters without code changes.
Cost Control	High	Query plans are static; costs are predictable and easy to budget.
Latency	Low	Response times are low, helped for example by BigQuery's query cache.
Maintenance	High	Every new business question requires a developer and a deployment.

When to use Scenario 1?

This approach is the benchmark for external-facing applications, customer portals, and high-traffic production dashboards. It is the best choice when your requirements include:

Strict auditability: You need a version-controlled (Git-based) history of every query executed against your data warehouse.
Performance at scale: You require sub-second latency, leveraging BigQuery’s result caching for high-concurrency workloads.
Deterministic logic: You must guarantee that specific inputs always produce the exact same output, with no room for AI-driven variability.
External multi-tenancy: You are exposing data to third parties and need absolute assurance against data cross-contamination.

Scenario 2: Custom Agent with SQL Generation

Focus: User flexibility and managed autonomy.

To resolve the development bottleneck of manual SQL authoring, Scenario 2 introduces an LLM agent (via the Agent Platform SDK) to act as a dynamic translator. In this model, the developer stops writing individual queries and starts focusing on metadata documentation.

From Query Writing to Metadata Curation

Using the Agent Platform SDK (for Python, for example), developers implement a reasoning engine that maps natural language to schema metadata. Rather than "guessing" the SQL, the agent follows a structured reasoning loop:

Analyze: It parses the natural language intent (e.g., "Which region had the highest growth?").
Retrieve: It looks up the relevant schema metadata provided in the system context.
Construct: It generates a syntactically correct, BigQuery-compatible statement.

For the LLM to generate accurate queries, it must "see" the data structures. You provide this through system instructions that include table names, column types, and—crucially—semantic descriptions (e.g., "created_at: The timestamp when the user first registered"). By curating this metadata space, you define the boundaries of what the agent can explore and execute.
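
One way to keep this metadata curated is to store it as data and render the system instruction from it, rather than hand-editing a prompt string. The following is a minimal sketch under that assumption; the `SCHEMA` dictionary layout and the `build_system_instruction` helper are hypothetical, not part of any SDK.

```python
# Hypothetical metadata for one table; in practice it would come from a catalog.
SCHEMA = {
    "table": "bigquery-public-data.thelook_ecommerce.users",
    "columns": [
        {"name": "id", "type": "INT64", "description": "Unique user identifier"},
        {
            "name": "created_at",
            "type": "TIMESTAMP",
            "description": "The timestamp when the user first registered",
        },
    ],
}


def build_system_instruction(schema):
    """Render table and column metadata into a system instruction string."""
    lines = [
        "You are a BigQuery SQL expert. Output ONLY raw SQL.",
        f"Table: `{schema['table']}`",
        "Columns:",
    ]
    for col in schema["columns"]:
        lines.append(f"- {col['name']} ({col['type']}): {col['description']}")
    return "\n".join(lines)


instruction = build_system_instruction(SCHEMA)
```

Because only the tables and columns listed in the metadata reach the prompt, this file effectively becomes the reviewable definition of what the agent is told about.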

Access control relies entirely on underlying database permissions (like RLS). Because the agent passes generated SQL dynamically, data boundaries must be enforced at the database level.

Implementation example

This marks the first step into agentic workflows, where an LLM acts as a translator between natural language and structured schema.

    from google.cloud import bigquery
    from vertexai.generative_models import GenerativeModel


    def ai_query(user_prompt):
        # Initialize the model
        model = GenerativeModel("YOUR_LLM_MODEL")

        # SYSTEM CONTEXT: Grounding the model with schema metadata
        # This prevents the AI from guessing table names or column types.
        system_instruction = (
            "You are a BigQuery SQL expert. Output ONLY raw SQL code without markdown backticks. "
            "Context: The 'products' table in 'bigquery-public-data.thelook_ecommerce' "
            "contains: id (INT), name (STRING), and category (STRING)."
        )

        full_prompt = f"{system_instruction}\n\nUser request: {user_prompt}"

        # Generate the SQL string
        response = model.generate_content(full_prompt)
        # Remove any markdown fences first, then trim the whitespace left behind
        sql_code = response.text.replace("```sql", "").replace("```", "").strip()

        # Execute the AI-generated query
        client = bigquery.Client()
        return client.query(sql_code).to_dataframe()

Analysis

Parameter	Rating	Impact
Flexibility	High	Users can ask virtually any question in plain English.
Cost Control	Low	LLMs may generate unoptimized queries (e.g., missing partitions).
Latency	Medium	Includes LLM "thinking" time.
Maintenance	Medium	Developers manage "prompt schemas" rather than SQL code.

When to use Scenario 2?

Scenario 2 is best suited for internal data discovery and analyst-led exploration. It bridges the gap between raw data and business users when you require:

High-variability querying: When the range of potential business questions is too broad (the "infinite question space") to be efficiently covered by pre-built, static APIs.
Rapid prototyping: When analysts need to quickly explore datasets and validate hypotheses before committing to the development of formal, production-grade dashboards.
Semantic interpretation: When you need an agent to resolve natural language ambiguities—such as mapping "last quarter" or "active users"—into specific, technical filter criteria automatically.
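
To illustrate what resolving "last quarter" into technical filter criteria involves, here is a small sketch of the date arithmetic. It assumes calendar quarters (January to March, and so on); a company with a different fiscal calendar would need a different rule, and the `last_quarter` helper is hypothetical.

```python
from datetime import date, timedelta


def last_quarter(today):
    """Return the first and last day of the calendar quarter before `today`."""
    current_start_month = 3 * ((today.month - 1) // 3) + 1
    current_start = date(today.year, current_start_month, 1)
    end = current_start - timedelta(days=1)  # last day of the previous quarter
    start = date(end.year, end.month - 2, 1)  # first day of that quarter
    return start, end


start, end = last_quarter(date(2026, 5, 18))
```

The resulting dates would typically be bound as query parameters in a filter on a timestamp column, rather than interpolated into the SQL text.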

Scenario 3: Conversational Analytics

Focus: Managed reasoning and verified logic.

Scenario 3 shifts the responsibility from a self-managed custom agent to a specialized, platform-native reasoning engine. By leveraging the Conversational Analytics API (currently in Pre-GA), you can deploy Data Agents: intelligent, governed layers that use enterprise-specific metadata and verified SQL to keep the LLM within strictly defined guardrails. This API translates natural language into precise queries across BigQuery, Looker, and Data Studio, while extending support to Google Cloud’s primary database solutions. We’ll consider BigQuery as our primary example for exploring these conversational insights.

The Power of Verified Queries

Unlike generic LLM prompts that guess the SQL structure, these agents are grounded in your organization’s source of truth:

Verified queries: You provide a library of verified queries (vetted, high-quality SQL examples) that the agent uses as a reference for complex joins and business logic. This ensures the agent follows your established coding patterns.
Managed context: The platform handles the retrieval of schema information and documentation, reducing the prompt bloat that often leads to hallucinations in custom-built agents.
Aligned outputs: By grounding the model in existing production SQL, the system ensures that AI-generated insights remain consistent with your official reporting metrics.
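
The idea of grounding answers in a library of vetted SQL can be pictured with a toy lookup. This is not how the Conversational Analytics API retrieves verified queries internally; it is a hedged illustration that uses naive word overlap, and the library entries and table names are invented.

```python
import re

# Hypothetical library of vetted SQL, keyed by the question each one answers.
VERIFIED_QUERIES = [
    {
        "question": "total revenue by product category",
        "sql": "SELECT category, SUM(sale_price) AS revenue "
               "FROM order_items_enriched GROUP BY category",
    },
    {
        "question": "number of new users per month",
        "sql": "SELECT DATE_TRUNC(DATE(created_at), MONTH) AS month, "
               "COUNT(*) AS users FROM users GROUP BY month",
    },
]


def words(text):
    """Lowercase a string and split it into a set of words."""
    return set(re.findall(r"\w+", text.lower()))


def closest_verified_query(user_question, library=VERIFIED_QUERIES):
    """Return the entry sharing the most words with the question, or None."""
    asked = words(user_question)
    best = max(library, key=lambda entry: len(asked & words(entry["question"])))
    return best if asked & words(best["question"]) else None


match = closest_verified_query("What is the revenue by category?")
```

A question with no overlap returns `None`, which is the point where a grounded system should decline or ask for clarification instead of improvising SQL.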

This solution inherits existing BigQuery IAM permissions and provides a view of the reasoning and SQL behind every answer.

Can all of this be done with enough work on a fully customized agent? Yes. Is the custom approach practical, and time/cost efficient? Maybe not.

Implementation example

This approach leverages a specialized reasoning engine to handle intent discovery and data grounding. The developer no longer manages the translation logic: they simply call the managed agent.

    from google.cloud import geminidataanalytics_v1beta as gda


    def chat_data(user_query):
        # Initialize the client for the Data Agent service
        client = gda.DataAgentServiceClient()
        # Path to your pre-configured Data Agent resource
        agent_path = "projects/YOUR_PROJECT_ID/locations/us/dataAgents/YOUR_AGENT_ID"
        # Execute: The agent uses its "Verified Queries" and metadata to find the answer
        # Conceptual request and method names: verify them against the current
        # client library reference before use.
        request = gda.ExecuteDataAgentRequest(name=agent_path, query=user_query)
        response = client.execute_data_agent(request=request)

        # The agent returns both the natural language answer and the supporting data
        return response.answer

Analysis

Parameter	Rating	Impact
Flexibility	Medium	High for the data sources it knows, but restricted by its Verified instructions and metadata scope.
Cost Control	Medium	Grounded queries are typically more efficient than raw LLM generation.
Latency	Medium	Higher than static queries, due to the multi-stage reasoning and summarization process.
Maintenance	Low	Managed by Google; analysts focus on coaching the agent through metadata and verified SQL.

When to use Scenario 3?

Scenario 3 is the ideal path for BigQuery-centric analysis where accuracy is non-negotiable. Choose this when you require:

Governed trust: Business logic (e.g., "Revenue") must follow pre-vetted verified queries every time.
Native intelligence: Users need to perform complex tasks like forecasting or anomaly detection via BigQuery AI using natural language.
Auditability: Stakeholders require a transparent reasoning path to see exactly how the AI arrived at its numbers.

While Scenario 2 requires building a custom reasoning engine from scratch, Scenario 3 provides a platform-native experience that prioritizes verified logic over raw LLM generation.

The limitation: This data companion is ultimately confined to the BigQuery or Google Cloud ecosystem. To scale an agentic workforce across heterogeneous platforms and tools, we must look toward vendor-agnostic standards like the Model Context Protocol (MCP).

Scenario 4: Managed MCP Tools

Focus: Standardized connectivity and decoupled architecture.

Scenario 4 introduces the Model Context Protocol (MCP)—an open-source standard designed to normalize how AI applications interact with data and tools. While previous scenarios rely on custom SDKs or platform-specific APIs, MCP provides a universal interface that separates the reasoning layer from the tool execution layer.

Standardized Abstraction

MCP enables tool discovery by exposing a manifest of capabilities that any compliant agent can ingest. This allows for a modular system where the data logic is "externalized" from the agent itself.

The MCP client: The reasoning engine (the LLM) that identifies the user's intent. Because it uses a standardized protocol, the client can connect to any MCP server and instantly discover what it can do without needing new integration code.
The MCP server: The domain-specific service that exposes data and logic. The managed BigQuery MCP server doesn't just pass queries: it encapsulates the logic required to navigate Google Cloud’s infrastructure safely. It exposes tools such as:

list_dataset_ids: Context-aware discovery of the data environment.
get_dataset_info: Metadata retrieval for semantic grounding.
execute_sql: Controlled execution of data retrieval.

(see https://docs.cloud.google.com/bigquery/docs/reference/mcp for the updated toolset).
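
Under the protocol, discovery and invocation are JSON-RPC 2.0 messages: `tools/list` to enumerate tools and `tools/call` to run one. The sketch below only builds those message envelopes, without any transport or authentication, and the `query` argument name passed to `execute_sql` is a hypothetical placeholder, since the real argument names come from the input schema the server returns.

```python
import json
from itertools import count

_ids = count(1)


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request envelope, the message format MCP uses."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# 1. Discovery: ask the server which tools it exposes.
list_tools = jsonrpc_request("tools/list")

# 2. Invocation: call one of the discovered tools by name.
# The argument name "query" is an assumption for illustration only.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "execute_sql", "arguments": {"query": "SELECT 1"}},
)

print(json.dumps(call_tool, indent=2))
```

In a real session an MCP client library sends these messages for you, after an initialization handshake, so application code rarely builds them by hand.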

Access control is managed via standard IAM service accounts; there are no programmatic logic checks on top of it.

This decoupling future-proofs your AI stack. You can swap your LLM provider or upgrade your agent's reasoning model without rewriting the data access logic, because the interface between them remains consistent and governed.

Implementation example

In an MCP-based architecture, connecting an AI agent to a data source is reduced to a simple configuration handshake. Instead of writing custom integration logic, you provide an MCP-compliant client (such as the Gemini CLI or a modern IDE) with a manifest defining the server’s location and security requirements.

The following manifest allows the client to connect to Google’s managed BigQuery MCP server, enabling it to dynamically discover and execute data tools without a single line of custom code:

    {
      "mcpServers": {
        "bigquery": {
          "httpUrl": "https://bigquery.googleapis.com/mcp",
          "authProviderType": "google_credentials",
          "oauth": {
            "scopes": [
              "https://www.googleapis.com/auth/bigquery"
            ]
          }
        }
      }
    }

Analysis

Parameter	Rating	Impact
Flexibility	High	Agents can contextually explore any table the MCP server exposes.
Cost Control	Medium	Tools are standardized, but a curious agent can still trigger large scans.
Latency	Medium	Includes standard overhead for the protocol handshake and tool-calling.
Maintenance	Low	Uses a managed MCP server that requires no maintenance on your side. The only work is on the MCP client.

When to use Scenario 4?

Scenario 4 is the architectural choice for multi-agent environments that require standardized data connectivity with minimal maintenance overhead. It is the ideal path when you require:

Managed infrastructure: You want to offload the security, execution, and maintenance of your toolset by consuming a managed BigQuery MCP server rather than building and patching custom data-retrieval code.
LLM portability: You need an open-standard interface, allowing you to use the same tools across different LLMs or agent frameworks without rewriting proprietary function calls.
Autonomous discovery: Your agents must navigate and inspect complex datasets dynamically. MCP’s standardized endpoints allow agents to crawl metadata and schema information autonomously to determine the best path for a query.

Scenario 5: Custom Hosted MCP Servers

Focus: Architectural extensibility and custom tool definition.

Scenario 5 takes the standardized connectivity of Scenario 4 and adds complete control by replacing the managed service with a custom-built MCP server. Such a server is typically hosted on scalable infrastructure like Cloud Run, and you can build it on open source solutions such as MCP toolbox. This approach removes the guardrails of managed offerings, granting engineering teams full freedom to define specialized tools, integrate disparate third-party APIs, and implement proprietary execution logic within the protocol.

Architectural Advantages of Custom MCP

Shifting to a custom-hosted MCP server moves operational complexity from the LLM prompt to the server-side logic, unlocking three critical capabilities:

Deterministic tool tailoring: Instead of forcing an agent to navigate raw, sprawling schemas, developers define high-level functions with specific data shapes. This replaces probabilistic SQL generation with deterministic execution, virtually eliminating schema-based hallucinations.
Unified source orchestration: A custom MCP server acts as a consolidated gateway. Within a single tool execution, the server can orchestrate calls across BigQuery, external SaaS APIs, and legacy on-premises systems. The agent receives a pre-processed, unified response, abstracting away the multi-source complexity.
Programmable governance: This scenario enables code-level security that is difficult to implement in managed environments. You can implement granular controls directly within the protocol layer, such as:

Dynamic PII masking: Automatically redacting sensitive data before it reaches the LLM.
Custom authentication: Injecting enterprise-specific middleware.
Contextual rate limiting: Throttling tool usage based on the end-user’s identity or cost center.
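
Two of these controls are simple enough to sketch in a few lines. The example below is a simplified illustration rather than production governance code: the email regex is deliberately basic, the `CallBudget` class has no time window, and all names are hypothetical.

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def mask_pii(rows):
    """Redact email addresses in every string field before rows reach the LLM."""
    return [
        {
            key: EMAIL.sub("[REDACTED_EMAIL]", value) if isinstance(value, str) else value
            for key, value in row.items()
        }
        for row in rows
    ]


class CallBudget:
    """Naive per-user cap on tool calls (no time window, for illustration)."""

    def __init__(self, max_calls):
        self.max_calls = max_calls
        self.calls = {}

    def allow(self, user_id):
        used = self.calls.get(user_id, 0)
        if used >= self.max_calls:
            return False
        self.calls[user_id] = used + 1
        return True


rows = [{"user_id": 7, "note": "Contact jane.doe@example.com for refund"}]
masked = mask_pii(rows)

budget = CallBudget(max_calls=2)
decisions = [budget.allow("analyst-1") for _ in range(3)]
```

In a custom MCP server, checks like these would wrap each tool handler, so that masking and throttling happen before any result is returned to the agent.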

Implementation example

In this scenario, when using MCP toolbox, you use a declarative tools.yaml file to define the interface of your custom MCP server. This file acts as the absolute boundary for your agent—it defines the BigQuery connection, enables safe discovery for schema inspection, and wraps complex, multi-table joins into a single, parameterized tool.

    # ----------------------------------------------------------------------
    # Minimal Configuration
    # Dataset: bigquery-public-data.thelook_ecommerce
    # ----------------------------------------------------------------------

    sources:
      bq-thelook-ecommerce:
        kind: "bigquery"
        project: "${PROJECT_ID}"
        location: "${BQ_LOCATION}"

    tools:
      # 1. Discovery Tool: Helps the agent understand the database schema
      bigquery_get_table_info:
        kind: bigquery-get-table-info
        source: bq-thelook-ecommerce
        description: Retrieves table metadata and schema details. Run this before executing custom queries.

      # 2. Execution Tool: Parameterized SQL for safe, repeatable data fetches
      thelook_get_user_orders_summary:
        kind: bigquery-sql
        source: bq-thelook-ecommerce
        statement: |
          SELECT
            orders.user_id,
            COUNT(DISTINCT orders.order_id) AS count_of_orders,
            COUNT(order_items.id) AS count_of_items,
            SAFE_DIVIDE(COUNT(order_items.id), COUNT(DISTINCT orders.order_id)) AS avg_items_per_order
          FROM `bigquery-public-data.thelook_ecommerce.orders` AS orders
          INNER JOIN `bigquery-public-data.thelook_ecommerce.order_items` AS order_items
            ON orders.order_id = order_items.order_id
            AND orders.user_id = order_items.user_id
          WHERE orders.status = "Complete"
            AND orders.user_id = @user_id
          GROUP BY orders.user_id;
        description: Retrieves an order summary for a specific user ID, including total completed orders, items purchased, and average items per order.
        parameters:
          - name: user_id
            type: integer
            description: The unique identifier of the user.

    toolsets:
      # Binds the tools together for agent use
      thelook_core_insights_toolset:
        - bigquery_get_table_info
        - thelook_get_user_orders_summary

Analysis

Parameter	Rating	Impact
Flexibility	High	Supports cross-domain orchestration (e.g., BigQuery + legacy APIs) and unlimited custom tool definitions.
Cost Control	High	Allows developers to inject programmatic query cost estimation and budget thresholds prior to execution.
Latency	High	Custom multi-hop orchestration, network transit, and container cold-starts introduce latency.
Maintenance	High	Requires full ownership of the application lifecycle, including CI/CD, dependency patching, and container scaling.
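
The cost-control row above mentions estimating query cost and enforcing a budget before execution. A minimal sketch of that guard follows. The `run_if_within_budget` helper and the stand-in estimator are hypothetical; `bigquery_dry_run_bytes` shows how a BigQuery dry run could supply the estimate, but it is not exercised here because it needs credentials and the `google-cloud-bigquery` package.

```python
def run_if_within_budget(sql, estimate_bytes, execute, max_bytes):
    """Estimate the scan size first and execute only if it fits the budget."""
    estimated = estimate_bytes(sql)
    if estimated > max_bytes:
        raise ValueError(
            f"Query would scan {estimated} bytes, over the {max_bytes}-byte budget"
        )
    return execute(sql)


def bigquery_dry_run_bytes(sql):
    """Estimate bytes scanned with a BigQuery dry run (the query is not run)."""
    from google.cloud import bigquery  # imported lazily; requires credentials

    client = bigquery.Client()
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    return client.query(sql, job_config=config).total_bytes_processed


# Stand-in estimator and executor so the sketch runs without a cloud project.
fake_estimates = {"SELECT 1": 0, "SELECT * FROM big_table": 5 * 10**12}
result = run_if_within_budget(
    "SELECT 1", fake_estimates.get, lambda sql: "ok", max_bytes=10**9
)
```

Passing `bigquery_dry_run_bytes` as the estimator in a custom tool handler would reject oversized scans before they incur cost, which is the kind of control the managed server in Scenario 4 leaves to IAM alone.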

When to use Scenario 5?

This architecture is the power user choice, essential for highly regulated environments and hybrid infrastructures where managed services fall short. Implement this approach when your design requires:

Secure hybrid orchestration: You must bridge BigQuery with private on-premises systems or restricted APIs, returning a pre-processed, consolidated payload that the agent can use immediately without navigating the raw network gap.
Hardened business logic: You need to move complex, non-negotiable calculations off the LLM and into a controlled code environment, exposing only high-level "expert" tools to guarantee absolute accuracy.
Centralized enterprise tooling: You want to maintain a single, governed source of truth for your proprietary tools that can be served uniformly across different LLM providers or internal frameworks without vendor lock-in.

Conclusion: The Foundation of the Agentic Era

The journey from Scenario 1 to Scenario 5 traces a clear technical evolution: we are moving away from rigid, hard-coded data silos and toward a world of autonomous discovery and standardized connectivity. By adopting frameworks like the Model Context Protocol (MCP), organizations can decouple their data logic from their AI models, ensuring that as LLMs evolve, their access to the enterprise "brain" remains seamless, scalable, and vendor-agnostic.

However, increased autonomy does not mean decreased oversight. While we haven’t touched on these points in depth in this article, we must adhere to a fundamental truth: data access must be governed and controlled using governance and security tools. Regardless of the access scenario—more or less agentic depending on the use case—security, credentials, quality management, and standardized governance are absolutely essential.

On a more lighthearted note, it’s worth remembering that the golden rule of computing still applies: "Garbage In, Garbage Out". You can build the most sophisticated, autonomous agentic layer in the world, but if you feed it messy, uncurated data, you’ll simply get "garbage" answers at a much faster and more confident pace. Sophisticated AI doesn't fix bad data: it just makes it more visible. Maintaining high data quality is not just a legacy requirement—it is the fuel that makes the agentic engine actually work.

What we announced in streaming AI at Next ‘26

Mon, 18 May 2026 16:00:00 +0000

Every device, user, and microservice generates data. Ingesting this data, extracting meaning and insights, and driving business decisions in real time has the potential to deliver transformational business value. The rise of agentic AI represents an opportunity for users to overcome the challenges inherent in real-time analytics. But while agentic AI has the potential to accelerate adoption, users face a new set of challenges with effectively leveraging real-time data:

Real-time context is hard to implement. Teams will choose to incorporate data from batch-oriented approaches, like periodic database syncs and scheduled refreshes. Agents have to either rely on stale data or require memory-intensive context windows. This “context lag” makes them ineffective for real-time agentic tasks like fraud detection, dynamic e-commerce recommendations, or autonomous supply chain adjustments.
Real-time systems are inflexible. Agentic tools lack the modularity to adapt to customer-specific requirements, forcing organizations to make difficult architectural choices. Data practitioners need a platform to meet them where they are, where they are free to make the tradeoff between latency, accuracy, and cost.

Google Cloud provides a tightly integrated, unified streaming data platform that combines fully managed, Google Cloud-native services with open-source-compatible services to power large-scale AI training and inference. The platform comprises five key services:

Pub/Sub: Highly reliable, serverless, and fully managed service for messaging and event streaming that’s integrated with BigQuery, Dataflow, and Cloud Storage. Pub/Sub is utilized by organizations like Anthropic.
Dataflow: A serverless engine for batch, streaming, and now agentic AI. Leading enterprise organizations like Palo Alto Networks use Dataflow, as do Google services like Waymo and Google Maps. For instance, Waymo cars use Dataflow to help them “see” the world, plan their routes, and predict obstacles. Before a car hits the actual pavement, it “drives” millions of miles in a simulator, with Dataflow generating training datasets and validating the models that are used for autonomous driving.
Managed Service for Apache Kafka: The fully managed way to run the popular open source streaming storage and data integration system on Google Cloud that’s highly reliable, secure, and cost efficient. Across the largest enterprises and startups, Apache Kafka serves as a staging location for critical training data and real time updates to AI agent context.
BigQuery: A unified platform for real-time ingestion and analysis. The Storage Write API provides high-throughput streaming into BigQuery and Lakehouse for Apache Iceberg tables with exactly-once delivery semantics and stream-level transactions. Additionally, BigQuery continuous queries enable real-time AI inference directly within the data pipeline by calling generative functions like AI.GENERATE_TEXT, allowing for immediate insights as data is ingested.
Bigtable: Google’s NoSQL real-time database for processing streaming data from Pub/Sub and Dataflow automatically using continuous materialized views, delivering results in seconds that are ready for low-latency serving using Bigtable’s in-memory tier.

Moving from insight to autonomous action

At Google Cloud Next, we announced a set of streaming AI capabilities for the Agentic Data Cloud that give autonomous agents instant, real-time context and enable them to act on it.

For instance, imagine a supply chain agent that doesn't just monitor IoT data, but autonomously reroutes a shipment around bad weather, confirms new delivery windows with the receiving warehouse, and updates the customer's portal — all before a human supervisor is even aware of the problem. Consider a financial services agent that identifies a fraudulent transaction pattern, instantly freezes the account, communicates with the customer via their preferred channel, and initiates a new card shipment — all within seconds of the suspicious activity. Whether you’re creating embeddings on streaming data to power search, or building a sophisticated multi-agent fraud detection system, these new capabilities add powerful tools to your toolbox.

Let’s take a closer look at these new capabilities.

New streaming AI capabilities

At Next ‘26, we launched tightly integrated capabilities to our platform across three key areas:

1. Providing real-time, enriched context for agents

1.1. Pub/Sub AI Inference SMT (GA): You can now run inference on messages streamed through Pub/Sub. Data practitioners can choose any model available on Gemini Enterprise Agent Platform. Pub/Sub makes the inference call and appends the result to each message before sending it downstream, bringing Pub/Sub’s simplicity together with Gemini Enterprise’s fully managed tools.

1.2. Pub/Sub Bigtable subscriptions (Preview): Stream Pub/Sub data directly to Bigtable. Pub/Sub Bigtable subscriptions directly materialize event data from a Pub/Sub topic into a Bigtable table, eliminating the need for custom pipelines and dramatically simplifying your streaming architecture. For instance, you can easily ingest vector embeddings into Bigtable to power semantic search workloads.

1.3. BigQuery continuous queries stateful data processing (Preview): BigQuery continuous queries can now perform complex correlations between multiple data streams using JOINs and calculate metrics over consistent time intervals with tumbling window aggregations. This enables sophisticated analysis, such as calculating 30-minute averages or correlating events across different streams, directly as data is ingested into BigQuery. Furthermore, you can integrate AI directly into your data pipelines by calling generative functions like AI.GENERATE_TEXT, as well as materialize continuous query SQL results into BigQuery tables or export them to operational sinks like Bigtable, Spanner, and Pub/Sub for real-time reverse ETL.
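
To make the tumbling window idea concrete, here is a minimal plain-Python sketch of a 30-minute average over fixed, non-overlapping windows. It is only a model of the concept, not BigQuery's continuous query syntax; the function names and the sample events are illustrative assumptions.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(minutes=30)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def window_start(ts, size=WINDOW):
    """Return the start of the fixed, non-overlapping window containing ts."""
    return EPOCH + ((ts - EPOCH) // size) * size


def tumbling_average(events, size=WINDOW):
    """events: iterable of (timestamp, value). Returns {window_start: mean}."""
    sums = defaultdict(lambda: [0.0, 0])
    for ts, value in events:
        bucket = sums[window_start(ts, size)]
        bucket[0] += value
        bucket[1] += 1
    return {start: total / count for start, (total, count) in sorted(sums.items())}


# Hypothetical sensor readings: two fall in the 16:00 window, one in the 16:30 window.
t0 = datetime(2026, 5, 18, 16, 0, tzinfo=timezone.utc)
events = [
    (t0 + timedelta(minutes=5), 10.0),
    (t0 + timedelta(minutes=20), 20.0),
    (t0 + timedelta(minutes=31), 40.0),
]
averages = tumbling_average(events)
for start, mean in averages.items():
    print(start.isoformat(), mean)
```

Each event belongs to exactly one window, which is what distinguishes a tumbling window from a sliding or hopping one.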

2. Direct agents to manage your resources

2.1. Model Context Protocol (MCP) support for Pub/Sub, Managed Service for Apache Kafka, Bigtable, and BigQuery (GA): Your agents can manage Pub/Sub, Managed Service for Apache Kafka, and BigQuery using fully managed MCP endpoints. Agents can also publish messages to Pub/Sub.

2.2. ADK integration (GA): Your agents can interact with your real-time data stored in Pub/Sub, Bigtable, BigQuery, or other Google Cloud services using pre-built ADK integrations. Developers can build agents acting on real-time context without having to implement complex configurations or plumbing.

3. Combine multi-agent systems with your data processing

3.1. Event-driven autonomous agents: As agents become core to our workflows, real-time data pipelines must evolve to incorporate them directly into the stream. We have enabled this capability by treating agentic logic as a first-class citizen within the Dataflow pipeline. You can now incorporate your agent code using the Agent Development Kit (ADK) and deploy it as a specialized node using the RunInference transform and the new ADKAgentModelHandler. Key advantages of this approach include:

- Massive scalability: Leverage Dataflow’s architecture to process high-velocity events upstream and keep hundreds of agent sessions active simultaneously, each driven by specific incoming events.
- Pre-processing power: Dataflow handles the heavy lifting of complex data enrichment, delivering a “ready-to-act” context directly to the agent so it can focus on reasoning.

3.2. Dataflow Unified embeddings Sinks: We are introducing unified embedding generation directly within the data stream to eliminate “context lag”. You can now transform incoming data into high-dimensional vectors at low latency using Dataflow. These real-time embeddings are then materialized into our expanded suite of high-throughput vector sinks, which now includes Cloud Spanner (featuring its new built-in vector search) and AlloyDB. This gives you an up-to-date vector database for semantic search, and gives your autonomous agents making RAG calls an instantly searchable, synchronized long-term memory. This feature works with both remote and local models, for example Gemma.

As we continue to build out the platform, customers can expect to see even tighter integrations and more powerful capabilities. We look forward to seeing what you build with these new capabilities.

The power of LLMs on your data, more than two orders of magnitude faster and cheaper

Wed, 13 May 2026 16:00:00 +0000

Databases have introduced new AI-powered SQL functions which take natural language instructions as input and are evaluated using LLMs. They leverage the power of LLMs to answer new kinds of queries: Which product reviews are negative about durability? Which customer support tickets have been resolved by providing a workaround?

These new AI functions push the boundaries of what is possible in a SQL query engine by bringing the semantic understanding of LLMs to your data, thus enabling previously impossible analyses and applications. But their cost and performance have limited their applicability. LLM invocations add 10-100x to the overall query latency and ~1000x to the cost. This is much too slow for operational databases. In analytics, a medium-sized query on 10-100 million rows would consume an amount of tokens that is prohibitively expensive for some applications.

Google Cloud has published a new paper at SIGMOD where we show how to accelerate and reduce the cost of LLM-powered AI functions by using proxy models. Proxy models are cost-optimized ultra-lightweight models tailored to a specific query (aka prompt) and tuned for your data. They replace the majority of LLM calls during query execution (thus the name proxy model) and can be trained on-the-fly or ahead of time. The fundamental ideas behind proxy models were proposed in Universal Query Engine (UQE) at NeurIPS 2024 by Google DeepMind.

Our paper shows that proxy models are automatically applicable in many (but not all) cases, sometimes with no loss of quality, sometimes with minor quality loss and a few times with a gain of quality. BigQuery and AlloyDB already implement this optimization under the optimized mode feature for AI.IF (BigQuery docs, AlloyDB docs) and AI.CLASSIFY (BigQuery docs). This article is a tl;dr of the SIGMOD paper and provides the key intuitions on three questions:

Why do proxy models work so accurately for so many cases, even though they are so much more performant than LLMs?
How do they work?
In which use cases do they deliver accurate answers, and in which cases do they fail, so that accuracy requires LLMs?

Why Do Proxy Models Work Accurately at Ultra-Low Latency and Cost?

How can an ultra-lightweight proxy model, such as the logistic regression currently in use at BigQuery and AlloyDB, have the semantic understanding power of LLMs, which is required for accurate question answering? The key intuition is that these proxy models input rich embeddings of the data that they query. By default, we are using the Gemini embedding generators, which do the heavy lifting of bringing semantics to your data when the embeddings are generated.

Then the ultra low latency and cost are easy to see: Since embeddings are generated once and used many times, the cost of bringing semantics to your data is amortized; it now happens once as opposed to happening for each query. Furthermore, the proxy models run fast in the CPU — no need for dedicated hardware.

We hope that we gave you good intuitions for why proxy models work. But a word of caution is also needed: Proxy models are fundamentally an approximation technique, and they are more limited than LLMs. Think of them as specializing the model to your query and your data. They perform well on some prompts but may be inferior to LLMs on others; for example, they might break down on problems that require reasoning to connect multiple semantic concepts. Case in point, the SIGMOD26 paper shows that the ratio of proxy to LLM predictive performance (as measured by F1) ranged from 90% to 116% across 11 benchmarks.

The good news is that the query processors automatically check the effectiveness and feasibility of implementing AI Functions by proxies. Let’s see how they do it.

How Do Proxy Models Work?

Let’s go through a simple example of a semantic filter (AI.IF). Our taste in movies is very particular: We like movies with an interesting plot and great cinematography. The query below processes IMDB reviews to find such movies.

```sql
SELECT
  DISTINCT t.primary_title
FROM
  bigquery-public-data.imdb.reviews r,
  bigquery-public-data.imdb.title_basics t
WHERE TRUE
  AND r.movie_id = t.tconst
  AND AI.IF("Is the plot interesting? Review: " || r.review,
            embeddings => r.review_embedded)
  AND AI.IF("Does the review praise the cinematography? Review: " || r.review,
            embeddings => r.review_embedded)
```

The column review contains the free-form text of the review. The column review_embedded contains Gemini embeddings of the review text. When you run this query in BigQuery, the query engine will:

1. For the first AI.IF, create a training sample set consisting of about one thousand rows of the input relation, the imdb.reviews table.
2. Use an LLM to label the training sample set, marking each review as either TRUE (yes, the plot is interesting) or FALSE (no, the plot is not interesting).
3. Train a proxy model for the first AI.IF using the labels computed in the previous step.
4. Create a test sample set of rows for the first AI.IF and evaluate the quality of the proxy model on this test set.
5. Based on the evaluation results, adaptively decide to either perform inference using the proxy model or fall back to LLM inference for the first AI.IF.
6. Repeat the above steps for the second AI.IF.
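
The steps above can be sketched end to end in a few lines of Python. This is a minimal illustration under stated assumptions, not the BigQuery or AlloyDB implementation: the embeddings are synthetic 3-dimensional unit vectors, `llm_label` is a hypothetical stand-in for the LLM call, the held-out test sample and the `F1_THRESHOLD` value of 0.9 are assumptions, and the logistic regression is trained with plain gradient descent.

```python
import math
import random


def llm_label(row):
    """Hypothetical stand-in for the expensive LLM call: a fixed rule."""
    return 1 if row[0] > 0 else 0


def make_rows(n, rng):
    """Synthetic unit-length 'embeddings' (illustrative only)."""
    rows = []
    for _ in range(n):
        sign = 1 if rng.random() < 0.5 else -1
        v = [sign * (0.5 + rng.random()), rng.gauss(0, 0.3), rng.gauss(0, 0.3)]
        norm = math.sqrt(sum(x * x for x in v))
        rows.append([x / norm for x in v])
    return rows


def predict_proba(weights, bias, row):
    z = bias + sum(w * x for w, x in zip(weights, row))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, epochs=100, lr=1.0):
    """Full-batch gradient descent on the logistic loss."""
    weights, bias = [0.0] * len(rows[0]), 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for row, y in zip(rows, labels):
            err = predict_proba(weights, bias, row) - y
            grad_b += err
            for i, x in enumerate(row):
                grad_w[i] += err * x
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def f1_score(predicted, actual):
    tp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    fp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 0)
    fn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 1)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


rng = random.Random(0)
table = make_rows(5000, rng)
indices = list(range(len(table)))
rng.shuffle(indices)

# Steps 1-2: sample about one thousand rows and label them with the LLM stand-in.
train = [table[i] for i in indices[:1000]]
train_labels = [llm_label(r) for r in train]

# Step 3: train the proxy model on the labeled sample.
weights, bias = train_logistic(train, train_labels)

# Step 4: evaluate on a separate labeled test sample (held out here by assumption).
test = [table[i] for i in indices[1000:1300]]
test_labels = [llm_label(r) for r in test]
predictions = [1 if predict_proba(weights, bias, r) >= 0.5 else 0 for r in test]
f1 = f1_score(predictions, test_labels)

# Step 5: use the proxy only if it is good enough, otherwise fall back to the LLM.
F1_THRESHOLD = 0.9  # assumed cut-off, not taken from the paper
decision = "proxy" if f1 >= F1_THRESHOLD else "llm"
print(decision)
```

In this sketch the LLM stand-in is called only on the 1,300 sampled rows; if the proxy is accepted, the remaining rows are classified without further LLM calls.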

In BigQuery, all steps happen on-the-fly during query execution. AlloyDB, being an operational database that targets sub-second latencies, avoids the online proxy model training and the online evaluation. Rather, the query’s proxy models are computed ahead of time in a PREPARE statement, thus moving the cost of sampling, labelling and training out of the critical query path. This enables the offline creation of a big pool of PREPARE statements, while the application chooses the proper PREPARE statement and executes it in the online path.

Let’s take a step back and look at what is really happening at step #3. The proxy model uses each dimension of the review embeddings (from review_embedded) as its features. Modern dense embedding models like Gecko or Gemini capture a myriad of semantic notions. In our example with movie reviews, at a high level of abstraction, relevant notions would include: “aesthetic”, “thought-provoking plot”, “underwhelming plot”, or perhaps “boring movie”. We stress the “high level of abstraction” because, in the binary “language” of foundation models, all these notions (and many more) are spread across the numbers of the dense embedding. Do not expect to spot a dimension that corresponds directly to cinematography. Importantly, the embedding space contains many more notions that are irrelevant to our task. The training of the proxy model essentially gives heavy weight to relevant notions and discards irrelevant ones.

A proxy model (green plane) isolating relevant semantic notions by cutting the embedding space (blue sphere)

Now, let’s enter the details of the particular proxy model, which is used by our current version: logistic regression. To visualize what is happening, think of embeddings as unit vectors forming a (hyper)sphere. For a binary classification task, the proxy model essentially cuts the sphere in two halves. In our example “aesthetic” and “thought-provoking plot” would fall on one side of the plane, whereas “underwhelming plot” and “boring movie” would be on the other side. Conceptually, the orientation of the plane determines which semantic notions are more relevant.
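
As a small illustration of the geometry, the sketch below classifies unit vectors by which side of a plane they fall on. The 2-dimensional coordinates, the plane's normal vector, and the bias are all hypothetical values chosen for the example; real embeddings have many more dimensions and no single dimension maps to a notion.

```python
import math


def unit(v):
    """Scale a vector to unit length so it lies on the (hyper)sphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


# Hypothetical 2-D stand-ins for the embeddings of four semantic notions.
notions = {
    "aesthetic": unit([0.9, 0.4]),
    "thought-provoking plot": unit([0.7, 0.7]),
    "underwhelming plot": unit([-0.8, 0.5]),
    "boring movie": unit([-0.9, -0.3]),
}

# Hypothetical learned parameters: the plane is {x : normal . x + bias = 0}.
normal = unit([1.0, 0.2])
bias = 0.0


def score(vec):
    return sum(w * x for w, x in zip(normal, vec)) + bias


def probability(vec):
    """Logistic regression turns the signed score into a probability."""
    return 1.0 / (1.0 + math.exp(-score(vec)))


sides = {name: score(vec) > 0 for name, vec in notions.items()}
for name, positive in sides.items():
    print(f"{name}: {'TRUE side' if positive else 'FALSE side'}")
```

Rotating `normal` changes which notions land on the TRUE side, which is the sense in which the plane's orientation encodes relevance.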

Importantly, the proxy model is tuned for your data and your question: The training of the proxy used a high quality LLM to label a sample from your data for the particular question.

Revisiting when Proxy Models Work

We can now see more clearly what distinguishes cases where proxy models work from cases where they don’t: proxy models work well for prompts that can be decided by detecting semantic notions in the embedding space. They will fail for complex prompts that require forms of reasoning that go beyond detecting patterns in the embedding space.

The good news is that, in practice, we have observed that proxy models work for a large class of AI+SQL queries. The SIGMOD26 paper provides a comprehensive evaluation, showing that proxies worked in 11 benchmarks. Specifically, in 10 benchmarks the ratio of proxy F1 to LLM F1 ranged from 90% to 102% and in the 11th benchmark (Amazon Reviews) it was 116%. Notice that the proxy may even deliver better accuracy because it benefits from being trained on many samples, whereas the LLM addresses each row as a new problem.

There is currently a second limitation: extreme selectivities. Notice that Step 1 collects samples. It needs to collect many examples for TRUE and many examples for FALSE. Multiple sophisticated techniques are employed to achieve this, even when the TRUEs far outnumber the FALSEs or vice versa. However, no purely sampling-based technique can handle cases of extreme selectivity, i.e., cases of very few TRUEs or very few FALSEs. This is why proxies are not employed in such cases. However, notice that this problem is fundamentally addressable by various techniques.

Why isn’t Vector Search Enough?

Proxy models appear suspiciously close to vector search. After all, they also take vector embeddings as input. So why not just use vector search? There are two reasons why vector search is not enough. The obvious one is that proxies are not rankers; they are classifiers: multiclass classifiers (AI.CLASSIFY) or binary classifiers (AI.IF). The second is that, even if you narrow down to just AI.IF, an attempt to simulate AI.IF with vector search will be both hard to set up and give suboptimal results: while proxy models are tailored to your data and your prompts, vector search is based on generic distance functions (such as cosine).

Experimental Results

We present here a subset of characteristic benchmarks from the SIGMOD26 paper. We compare the accuracy of proxy models with using LLM inference on all rows. In terms of quality, the relative accuracy varies from 0.92 (lowest) to 1.16 (highest), which means that for some tasks, proxy models perform slightly better than straight LLM inference.

Dataset	Prompt	F1 (Proxy)	F1 (LLM)	Relative (Proxy/LLM)
Amazon Reviews 10k	Review is {sentiment label}	0.860	0.739	1.163
Banking77	Is intent {intent label}? Think step-by-step: {CoT instructions}	0.700	0.707	0.990
California Housing	Location in Latitude & Longitude belongs to Southern California	0.953	0.953	1.0
FEVER	Is the claim supported by the text?	0.782	0.853	0.917
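
The last column is simply the proxy F1 divided by the LLM F1. The short check below recomputes it from the rounded F1 values in the table; because the inputs are rounded to three decimals, the recomputed ratios can differ from the published column in the last digit.

```python
# (F1 proxy, F1 LLM) as printed in the table above.
results = {
    "Amazon Reviews 10k": (0.860, 0.739),
    "Banking77": (0.700, 0.707),
    "California Housing": (0.953, 0.953),
    "FEVER": (0.782, 0.853),
}

relative = {name: proxy / llm for name, (proxy, llm) in results.items()}

for name, ratio in relative.items():
    verdict = "proxy >= LLM" if ratio >= 1.0 else "proxy < LLM"
    print(f"{name}: {ratio:.3f} ({verdict})")
```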

In terms of scalability and costs, the architectural differences between BigQuery and AlloyDB lead to slightly different results for each system. At a high level, proxy models move parts of the computation from specialized hardware used by LLM inference services to ordinary database workers. This results in a large reduction in costs and in query latency. In the online training case, employed by BigQuery, for a typical one million row query, proxy models consume about 400x fewer tokens, and the latency goes down by 30x-100x. In AlloyDB’s case the LLM costs of PREPARE, which are similar to BigQuery’s, can be amortized over arbitrarily many runs of the prepared statements that invoke proxy models.

The cost reduction (tokens consumed) and latency improvement (query speedup) for various table sizes.

Conclusion

AI functions calling LLMs are becoming commonplace in databases. Choosing the proper model for each AI function is an active area of academic research (e.g. BARGAIN). The key intuition is right-sizing models: Performant cheap models for “easy” problems, powerful reasoning models for the hard problems. Our work builds on the same principles, but while academic research has only used LLMs to navigate the performance spectrum, non-LLM proxy models push performance much further using ultra-lightweight and highly specialized models that deliver surprisingly good quality for many problems. Yet, we should not be surprised: After all, the proxy models feed on the rich semantics that foundation models bring to embeddings and they also feed on being trained by LLMs. As embedding models improve and extract increasingly richer and finer semantics from text and multimodal data (image, video), we suspect that non-linear classifiers will be useful to identify even more complex semantic patterns, further extend the applicability of proxy models (e.g. to AI joins also) and explore additional points on the performance/quality Pareto.

If you would like to learn more, our full paper dives into the differences between online vs. offline training, and compares the performance of different embedding models as well as various proxy models (linear regression, SVM, XGB).

You can try proxy models today in BigQuery (docs) and AlloyDB (docs) to dramatically speed up the AI functions in your SQL queries and reduce their token consumption.

We would like to thank Bo Dai, Yuchen Zhuang, Xingchen Wan, and Dale Schuurmans from Google DeepMind for developing the fundamental principles on proxy models in UQE and for their continuous guidance & support along our journey to bring them to Cloud customers. We also thank Yeounoh Chung and Fatma Özcan, our partners in the System Research Group, as well as the AlloyDB and BigQuery engineering teams.

Cloud Storage Rapid: Turbocharged object storage for AI and analytics

Mon, 11 May 2026 17:00:00 +0000

At Google Cloud Next ’26 we announced Cloud Storage Rapid, a family of object storage capabilities for data-intensive workloads like AI and analytics. Out of the gate, Cloud Storage Rapid consists of Rapid Bucket (formerly Rapid Storage), a high-performance zonal object storage offering, and Rapid Cache (formerly Anywhere Cache), which accelerates reads on-demand and colocates compute and data for workloads in existing buckets.

Cloud Storage Rapid is our response to the generational shift in how organizations build with AI. Teams are training trillion-parameter models, deploying inference at global scale, and building autonomous agents that reason over vast amounts of enterprise data. While accelerators like GPUs and TPUs often get the spotlight, they have a critical dependency: storage.

Storage is the engine that feeds accelerators during training, and the fast-access layer that makes real-time inference responsive. But as models scale, storage performance can be a bottleneck. Every time an AI/ML cluster waits on a data read or a checkpoint write stalls, you are paying for expensive compute cycles that aren't doing useful work.

Historically, AI/ML practitioners have had to choose between the specialized performance of a niche, zonal storage system, and the reliability and scale of a global object store like Google Cloud Storage. Many developers value Cloud Storage for its simplicity, scalability, reliability, and cost-effectiveness, but as the AI era has progressed, they’ve been throwing hotter and hotter workloads at it, running training and inference workloads with thousands of GPUs and TPUs. We’ve reached a performance tipping point that traditional object storage was never meant to handle. The Rapid family provides multiple options for co-locating compute workloads directly with high-performance zonal storage. It minimizes I/O bottlenecks that can block accelerators, so that your GPUs and TPUs stay fully saturated and productive. In this blog, let’s take a closer look at Cloud Storage Rapid’s capabilities.

Rapid Bucket

Rapid Bucket (GA) helps Cloud Storage meet the evolving demands of massive-scale generative AI, analytics, and other high-performance workloads. It does so by leveraging Colossus, the Google distributed storage system that powers Gemini and YouTube, to provide massive read/write performance and ultra-low latency in a dedicated object storage zonal bucket.

Lightning-fast performance
By combining the sub-millisecond latency of block-like storage, the throughput of a parallel filesystem, and the scalability and ease of use of object storage, Rapid Bucket provides high performance from the same Cloud Storage that you know and love.

Highlights include:

Ultra-low latency: Achieve up to 20 million queries per second and sub-millisecond latency.
Massive scalability: Rapid Bucket delivers 15+ TB/s of aggregate read throughput from a single Rapid zonal bucket.
New semantics: Enable higher performance with new capabilities such as native appends, unlimited readers (while writing!), and vectored reads.

Optimized for AI and analytics
You can use Rapid Bucket for a variety of demanding scenarios, including AI/ML data preparation, training, checkpointing, batch and streaming analytics processing, and optimizing distributed database architectures.

Key benefits include:

Optimized accelerator utilization: With Rapid Bucket, we observed a 50% reduction in blocked GPU time and up to 2.5x faster data loading for multi-modal training runs.
Faster checkpointing: Rapid Bucket makes checkpoint restores up to 5x faster and writes 3.2x faster compared to traditional object storage. This ensures faster recovery from workload interruptions, minimizes wasted accelerator time, and increases overall efficiency.

>5x faster checkpoint restores with Rapid Bucket

>3.2x faster checkpoint writes with Rapid Bucket

You can get started with Rapid Bucket here.

Rapid Cache

Originally announced at Cloud Next ‘25, Rapid Cache accelerates bandwidth for AI/ML workloads like data prep, training, and bursty model loading for inference, delivering an aggregate read throughput of 2.5 TB/s for your existing buckets — with no code changes. For inference workloads, we’ve observed that Rapid Cache provides up to 2.1x (114%) accelerated model load, resulting in 47% TCO savings.

When combined with multi-region buckets, customers can flexibly access GPUs and TPUs distributed across regions in a geo, while maintaining a single bucket namespace. This eliminates the need for manually orchestrated data movements between buckets, while benefitting from zonally co-located high performance.

New: Rapid Cache ingest on write
Customers at some of the world’s largest frontier AI/ML labs told us that they were looking for ways to accelerate reads immediately after a write, such as checkpoint restore workloads or a data prep pipeline that then feeds training. Before, caching the data required an initial read to trigger ingestion, which was served directly from the bucket at standard performance.

Rapid Cache’s new ingest on write feature solves this by simultaneously writing data to the Rapid Cache as it is being written to a Cloud Storage bucket. This proactive approach eliminates the initial cache-miss penalty, and helps workloads benefit from an immediate cache hit on the very first read. This provides up to 2.2x faster checkpoint restore times, allowing training clusters to recover faster from interruption.
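
The difference between read-triggered ingestion and ingest on write can be shown with a toy model. The classes below are hypothetical and are not the Cloud Storage API; they only count cache hits and misses for a write followed by two reads.

```python
class ToyBucket:
    """Toy stand-in for a bucket: a plain dict of object name -> bytes."""

    def __init__(self):
        self.objects = {}


class ToyCache:
    """Toy read cache in front of a bucket, with optional ingest on write."""

    def __init__(self, bucket, ingest_on_write=False):
        self.bucket = bucket
        self.ingest_on_write = ingest_on_write
        self.cached = {}
        self.hits = 0
        self.misses = 0

    def write(self, name, data):
        self.bucket.objects[name] = data
        if self.ingest_on_write:
            # Populate the cache at write time, so the first read is a hit.
            self.cached[name] = data

    def read(self, name):
        if name in self.cached:
            self.hits += 1
            return self.cached[name]
        # Cache miss: serve from the bucket, then ingest for later reads.
        self.misses += 1
        data = self.bucket.objects[name]
        self.cached[name] = data
        return data


def checkpoint_round_trip(ingest_on_write):
    cache = ToyCache(ToyBucket(), ingest_on_write=ingest_on_write)
    cache.write("checkpoint-0001", b"model weights")
    cache.read("checkpoint-0001")  # restore right after the write
    cache.read("checkpoint-0001")  # a second reader
    return cache.hits, cache.misses


read_triggered = checkpoint_round_trip(ingest_on_write=False)
on_write = checkpoint_round_trip(ingest_on_write=True)
print("read-triggered (hits, misses):", read_triggered)
print("ingest on write (hits, misses):", on_write)
```

In the toy model, the only difference between the two modes is whether the first read after a write pays the miss.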

To enable ingest on write, simply modify the ingestion criteria of your existing Rapid Cache.

Rapid Cache’s simplicity and performance have resulted in explosive adoption. In just one year since General Availability, customers have deployed thousands of Rapid Caches, with 20x growth in caches deployed. In fact, Rapid Cache serves up to 20% of Cloud Storage’s global egress. Cutting-edge AI/ML customers deploy their workloads on Rapid Cache, including Anthropic, which uses Rapid Cache to improve the resilience of its cloud workload by co-locating data with TPUs in a single zone and providing dynamically scalable read throughput up to 2.5TB/s.

Case study: Thinking Machines Lab
Thinking Machines Lab is an artificial intelligence research and product company. Its mission is to make AI systems that are adaptable and customizable, building a future where everyone has access to the knowledge and tools to make AI work for their unique needs and goals.

At Next ‘26, James Sun, Member of Technical Staff at Thinking Machines Lab, spoke at our session, Cloud Storage Rapid: Turbocharged object storage for AI & Analytics, where he described the need for high-performance storage at scale for the data-hungry AI/ML workloads that Thinking Machines Lab runs.

Thinking Machines runs diverse workflows: data processing in Dataflow, Kafka, and Spark, multi-model training, and serving Tinker — a flexible API for fine-tuning open source models. Thinking Machines' workloads run on Google Cloud Storage, Sun explained. Running these data-intensive AI/ML workloads at such a large scale introduces significant infrastructure challenges.

The first is managing a hub and spoke data architecture, where data processing hubs are located in one primary region while training GPUs are spread across multiple regions. Historically, this has made manual data movement and lifecycle management a major operational pain point. Furthermore, Thinking Machines Lab's workloads such as data prep and pretraining workflows, which rely on massive-scale Spark workloads to prepare their multi-modal datasets, often spike from cold to hot instantly. Previously, these surges led to disruptive 429 errors, which stalled data processing and loading, and interrupted critical training cycles.

To minimize these bottlenecks, Thinking Machines Lab integrated Rapid Cache across their AI/ML pipeline, to positive results.

“Rapid Cache has become a core foundation of our AI/ML data infrastructure, supporting our critical workflows, from data prep and pretraining to training and model loading. By acting as a crucial bandwidth shield and booster, it enables us to scale our data-intensive workloads across our entire fleet without compromise, providing us with the on-demand high bandwidth and consistent stability that we need to innovate at speed.” - James Sun, Member of Technical Staff, Thinking Machines Lab

In short, Cloud Storage and Rapid Cache provide Thinking Machines Lab with:

Easy, instant, scalable, on demand bandwidth: The team now achieves stable read throughput peaks of over 1.8TB/s.
Enhanced stability: Rapid Cache has greatly reduced tail-end latencies and 429 errors, providing the consistent performance needed for multi-modal training.
Fleet-wide scalability: Combined with multi-region buckets, they can now scale data-intensive workloads across their entire fleet, meeting the demands of a rapidly growing compute scale without the hassle of manual data movement while benefiting from zonally colocated storage for high performance.
Operational efficiency: The use of Hierarchical Namespace (HNS) has optimized their massive Spark workloads for data preparation, by supporting fast directory renames, along with providing the ability to ramp QPS more quickly as they scale out clusters. Rapid Cache’s "ingest on write" capability helps ensure immediate cache hits for checkpoint restores.

Choose your rocket ship

Whether you are running data preparation, massive-scale training, or low-latency inference, Cloud Storage Rapid delivers high performance together with the reliability and scalability that Cloud Storage is known for.

Rapid Bucket delivers the highest Cloud Storage throughput and queries per second as well as the lowest latency for read/write use cases, such as analytics, AI training, checkpointing, and model serving. This helps to reduce storage bottlenecks and increase compute utilization.
Rapid Cache provides higher read bandwidth and tail latency stabilization in existing buckets, without code changes. Key use cases include AI training, checkpoint restores, and serving, as well as accelerator optionality via multi-region buckets.

Get started with the Cloud Storage Rapid family today!

How BASF manages thousands of supply chain decisions with AlphaEvolve’s agentic algorithms

Thu, 07 May 2026 16:00:00 +0000

The agricultural and crop protection supply chain is one of the most intricate networks in the world. It takes up to two years to turn active ingredients into the final products farmers need, and a single change in weather or regulations can disrupt everything. Planners at BASF Agricultural Solutions navigate this reality daily across 180 production sites. To understand how local decisions ripple across their entire global network, BASF turned to AlphaEvolve on Google Cloud to build a digital twin of their supply chain.

Planning across a two-year lead time

BASF Agricultural Solutions manages a network with over 5,000 distinct value chains. Creating a single end product requires a bill of materials that can be over 30 levels deep, moving across different production sites and regions.

Currently, human planners make thousands of local decisions every day. They decide what to produce, when to produce it, and how much safety stock to hold. Because the network is so large, a planner can’t easily see how a localized decision affects the rest of the global supply chain.

This scale can tie up additional working capital and inventory, or cause production imbalances. Traditional mathematical models struggle to capture the dynamic reality of the network, which planners navigate based on years of experience.

Building a foundation for decision support

AlphaEvolve is an evolutionary coding agent that generates and refines algorithms autonomously. In collaboration with Google Cloud and prognostica GmbH, BASF’s objective was not to replace human decision-making, but to establish a new model for decision support that helps planners handle the real-world complexity of the production network.

The team gave AlphaEvolve a foundational "seed" program. This initial code established a standard planning logic that translated demand forecasts into production schedules, serving as a functional baseline before introducing dynamic, network-wide coordination. From there, they fed the model three years of historical data, including inventory levels, market demand, and actual production outputs. AlphaEvolve then generated variations of the code, mutating the logic to see if it could simulate a supply chain that matched the real-world historical data.

Measuring what good looks like in initial tests

For AlphaEvolve to improve, it needed a specific goal. The evaluation function scored every new piece of generated code on one primary metric: how closely the simulated inventory levels and production decisions matched the actual historical reality recorded by BASF.
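The scoring idea can be illustrated with a small sketch. It is not BASF's or AlphaEvolve's actual code: AlphaEvolve mutates program logic, whereas this stand-in mutates only two numeric parameters of a toy planning rule. The function names, the reorder rule, and the mean-absolute-error metric are all assumptions chosen for illustration.

```python
import random


def mean_absolute_error(simulated, historical):
    """Average absolute gap between simulated and recorded values."""
    if len(simulated) != len(historical):
        raise ValueError("series must have the same length")
    return sum(abs(s - h) for s, h in zip(simulated, historical)) / len(historical)


def simulate_inventory(demand, start_inventory, batch_size, reorder_point):
    """Toy planning logic (hypothetical): produce one batch whenever
    inventory is below the reorder point, then serve the period's demand."""
    inventory = start_inventory
    levels = []
    for d in demand:
        if inventory < reorder_point:
            inventory += batch_size
        inventory = max(inventory - d, 0.0)
        levels.append(inventory)
    return levels


def evaluate(params, demand, historical_inventory, start_inventory):
    """Score a candidate: higher is better, 0 means a perfect match."""
    batch_size, reorder_point = params
    simulated = simulate_inventory(demand, start_inventory, batch_size, reorder_point)
    return -mean_absolute_error(simulated, historical_inventory)


def evolve(seed, demand, historical_inventory, start_inventory,
           generations=200, rng=None):
    """Simple hill climbing: mutate the best candidate, keep improvements."""
    rng = rng or random.Random(0)
    best = seed
    best_score = evaluate(best, demand, historical_inventory, start_inventory)
    for _ in range(generations):
        candidate = tuple(max(0.0, p + rng.gauss(0, 5)) for p in best)
        score = evaluate(candidate, demand, historical_inventory, start_inventory)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score
```

Because a candidate replaces the current best only when its score is strictly higher, the returned score can never be worse than the seed's. That mirrors the role of the seed program as a functional baseline.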

The latest AlphaEvolve runs delivered more than 80% relative improvement in accuracy compared to the initial seed model. With further adjustments, the team expects to push performance even higher — bringing the model to a level of accuracy not achieved with other approaches and making it actionable for operational use.

The results

The evolved planning logic delivered immediate, measurable improvements over the initial seed model. The final algorithm successfully mirrored the actual historical performance of the supply chain, significantly reducing the error rate compared to the initial seed.

“We had several attempts to build a digital twin for our complex supply network using deterministic models, and all of them failed,” said Dr. Goetz Krabbe, vice president for global supply chain at BASF. “By using AlphaEvolve, we can not only map the complex network based on system data, but at the same time understand and copy the human decisions that drive our daily operations. This gives us a highly accurate and easy-to-maintain, data-driven digital twin of the entire network. Using it we can optimize our inventory levels and respond to market volatility with confidence while avoiding stockouts.”

What the evolved algorithm actually does

By running thousands of experiments, AlphaEvolve developed a clear, human-readable algorithm that explains how the BASF network truly operates. It automatically discovered factually correct, domain-specific supply chain rules that explain the observed production outputs and inventory levels for the tested product value chain:

Production consolidation: The algorithm learned to group production amounts together, accurately mapping how planners optimize plant time.
Dynamic safety stocks: It introduced safety stock parameters to handle volatile and seasonal demand patterns, helping to strictly manage capital costs while preventing out-of-stock situations.
Network-wide coordination: The model successfully mapped the dependencies between different production tiers, providing a clear foundation for optimizing asset utilization globally.
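To make the first two rules concrete, here is a minimal sketch under stated assumptions. The evolved BASF algorithm is not published in this post, so both functions are generic stand-ins. `consolidate` is a hypothetical batching rule, and `safety_stock` uses the common textbook formula (service factor times demand standard deviation times the square root of lead time), not the parameters AlphaEvolve discovered.

```python
import math
import statistics


def consolidate(orders, min_batch):
    """Defer small per-period orders until the accumulated quantity
    reaches min_batch, then produce it all in one run.

    Returns the production quantity for each period (0.0 where deferred).
    """
    plan = []
    pending = 0.0
    for qty in orders:
        pending += qty
        if pending >= min_batch:
            plan.append(pending)
            pending = 0.0
        else:
            plan.append(0.0)
    if pending > 0 and plan:
        plan[-1] += pending  # flush any remainder in the final period
    return plan


def safety_stock(demand_history, lead_time_periods, z=1.65):
    """Textbook safety stock: z * sigma(demand) * sqrt(lead time).

    z=1.65 corresponds to roughly a 95% service level if demand is
    normally distributed (an assumption, not a BASF parameter).
    """
    sigma = statistics.stdev(demand_history)
    return z * sigma * math.sqrt(lead_time_periods)


plan = consolidate([30, 20, 60, 10, 10], min_batch=50)
print(plan)  # [0.0, 50.0, 60.0, 0.0, 20.0]
```

In this formula, steady demand needs no buffer, while a seasonal series with the same average demand needs a larger one. This is the sense in which the stock level is dynamic.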

What's next

The initial simulations showed that evolutionary AI can accurately model large-scale, dynamic supply chains. BASF’s objective is to create a digital twin of their entire global production network as a new foundation for simulation, decision support, scenario forecasting and optimization. This will allow the team to continuously simulate operations, identify hidden bottlenecks before they affect throughput, and optimize asset utilization across all global facilities.

This project was a collaboration between the BASF SE team, including Benjamin Priese, Michael Arlt, Debora Morgenstern and Tobias Hausen, as well as Manuel Doerr and Thomas Christ from Prognostica GmbH Würzburg, and the AI for Science team at Google Cloud, including (but not limited to) Kartik Sanu, Laurynas Tamulevičius, Nicolas Stroppa, Chris Page, Srikanth Soma, John Semerdjian, Skandar Hannachi, Vishal Agarwal and Anant Nawalgaria, as well as Christoph Tittelbach from the Google account team and partners at Google DeepMind.

Scaling data and AI with Managed Service for Apache Airflow

Mon, 04 May 2026 18:00:00 +0000

Orchestration is no longer just about moving data; it is about governing enterprise intelligence. As we shared earlier, Cloud Composer is now officially Managed Service for Apache Airflow, a name that reflects our deep commitment to open-source software.

We announced a massive leap forward in our orchestration capabilities, fundamentally reimagining how data teams operate in the AI era. With four major launches, we are embedding AI directly into your workflows to democratize access, accelerate productivity, and power your most demanding MLOps.

1. Apache Airflow 3.1 is now Generally Available

We announced Apache Airflow 3.1 in General Availability to power your most demanding AI and MLOps workloads. This release combines the significant foundation of Airflow 3.0 with the recent community innovations of 3.1.

Key capabilities include:

Decoupled architecture: A robust separation between the entire Airflow system and the execution layer for better scalability and enhanced security.
DAG versioning: Native support for automated DAG versioning, retaining the historical structure and run history.
Powerful managed backfills: A redesigned backfill system that is now a first-class citizen, fully managed by the scheduler.
Event-driven scheduling and data assets: Enhanced capabilities for triggering workflows based on assets as well as external events, like messages arriving in a message queue.
Human-in-the-Loop (HITL) and deadline alerts: Pause execution for human decision-making via the UI, and set proactive time-based thresholds for critical pipelines.
And many more…

2. Agentic troubleshooting with Data Engineering Agents

Managing complex pipelines just got significantly easier. The Data Engineering Agent is now embedded directly in your Managed Airflow dashboard to quickly analyze logs, identify root causes, and suggest fixes.

Rapid resolution: By integrating Gemini Cloud Assist Investigations¹, you can leverage AI to troubleshoot DAG Run failures and receive personalized fix proposals directly in the console.
Reduced MTTR: This agentic approach helps minimize Mean Time to Repair (MTTR) by eliminating manual log parsing. Furthermore, troubleshooting is now elevated to the DAG execution level—rather than just the task level—providing a holistic view of pipeline health.

3. Orchestration pipelines and deployment automation framework

You no longer need to be an Apache Airflow expert to harness its power. Orchestration pipelines are a core component of our new cross-product Deployment Automation Framework, allowing you to create end-to-end data pipelines efficiently.

Declarative orchestration: Define your entire pipeline—including the orchestration logic, infrastructure configuration, and dependencies—in simple, human-readable YAML files.
Cross-product bundles: These YAML definitions are easily deployed as a complete bundle to the cloud. For example, without knowing Airflow syntax, a user can quickly create and deploy a comprehensive data integration pipeline across dbt, Spark, DTS, and more.
Unified IDE experience: Alongside automated validation and deployment via GitHub Actions, the Google Data Cloud extension makes agentic authoring and troubleshooting the centerpiece of your workflow. You can now rely on powerful AI agents to build and debug pipelines directly in your IDE, with the ability to visually inspect the agent-generated DAGs for complete oversight.

Crucially, this declarative approach breaks down the traditional silos between advanced Python developers and data analysts. By shifting to human-readable YAML, we are fostering a more inclusive data culture where a wider range of practitioners can independently author, understand, and manage critical data workflows.
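As a rough illustration of why a declarative definition is enough to drive orchestration, the sketch below turns a pipeline description into an execution order. The schema is hypothetical: the `tasks`, `type`, and `depends_on` keys are invented for this example and are not the actual orchestration pipelines YAML format. The dictionary stands in for what a YAML file might look like once parsed.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline definition, as it might look after parsing YAML.
PIPELINE = {
    "name": "daily_sales",
    "tasks": {
        "ingest": {"type": "transfer", "depends_on": []},
        "transform": {"type": "dbt", "depends_on": ["ingest"]},
        "aggregate": {"type": "spark", "depends_on": ["transform"]},
        "publish": {"type": "sql", "depends_on": ["aggregate", "transform"]},
    },
}


def execution_order(pipeline):
    """Validate task references and return a dependency-respecting order."""
    tasks = pipeline["tasks"]
    graph = {}
    for name, spec in tasks.items():
        deps = spec.get("depends_on", [])
        unknown = [d for d in deps if d not in tasks]
        if unknown:
            raise ValueError(f"task {name!r} depends on unknown tasks: {unknown}")
        graph[name] = set(deps)
    # static_order() raises graphlib.CycleError if the dependencies loop.
    return list(TopologicalSorter(graph).static_order())


print(execution_order(PIPELINE))
# ['ingest', 'transform', 'aggregate', 'publish']
```

The author only states what depends on what. Ordering, validation, and cycle detection fall out of the graph, which is the part a framework can automate.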

4. MCP Server for Managed Airflow (Public Preview)

To further bridge the gap between AI and orchestration, we are launching the Managed Airflow MCP Server in Public Preview.

Agentic tooling: This server provides tools like list_environments, get_dag_run, and get_task_instance to fetch critical information about your environments.
Seamless integration & reduced context-switching: Both humans and agents can use these tools to simplify task management. Most importantly, this drastically reduces the context-switching developers face when debugging complex DAGs. By bringing environment and task data directly into your preferred interfaces, you can troubleshoot faster without constantly pivoting between different consoles.
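For readers new to MCP, the sketch below builds the kind of JSON-RPC 2.0 message an MCP client sends to invoke a tool. The `tools/call` method and the `name`/`arguments` parameters come from the Model Context Protocol, and `get_dag_run` is one of the tools named above. The argument names (`environment`, `dag_id`, `dag_run_id`) and their values are assumptions made for illustration; the server's published tool schema is the authority on what each tool accepts.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id per call


def tool_call_request(tool_name, arguments):
    """Build an MCP `tools/call` request as a plain dictionary."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Argument names and values below are hypothetical placeholders.
request = tool_call_request(
    "get_dag_run",
    {
        "environment": "example-environment",
        "dag_id": "daily_sales",
        "dag_run_id": "scheduled__2026-05-04",
    },
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library or agent framework builds and sends these messages for you, so the sketch only shows what travels over the wire.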

Embrace the future of data orchestration

With these launches, we are fundamentally lowering the barrier to entry for orchestration while simultaneously raising the ceiling for what power users can achieve. By taking away the infrastructure burden and providing native, agentic tooling, data teams can stop wrestling with boilerplate code and start focusing primarily on deriving insights and driving business value.

Whether you are a seasoned Data Engineer building dynamic Python DAGs or a Data Analyst defining straightforward YAML pipelines, Managed Service for Apache Airflow is built for you.

Get started today

Ready to experience the next generation of data pipeline orchestration? Create a new environment in the Google Cloud Console, explore the Google Cloud Data Agent Kit extension, and start building your agentic future today.

1. Availability might be limited.
