I'm an experienced Data Architect & AI Agent Developer with a passion for designing end-to-end data solutions that drive business value. I specialize in bridging traditional data engineering with cutting-edge AI agents, combining deep expertise in cloud architecture, data analytics, and machine learning engineering.
Currently diving deep into AI/Agentic systems, particularly data agents that leverage my extensive background in data engineering and cloud solutions.
I'm actively exploring the intersection of data engineering and AI agents, developing customized data science agents that combine traditional data processing with intelligent automation. My work focuses on:
- Data Agents Development: Creating intelligent agents for data processing and analysis
- Agentic Workflows: Designing autonomous systems for data pipeline management
- AI-Powered Data Solutions: Integrating LLMs with traditional data engineering patterns
Current Project: Developing custom data agents based on ADK samples, combining my data expertise with agentic AI capabilities.
I actively contribute to the developer community through speaking engagements and knowledge sharing on data analytics and AI.
Google Cloud Next Extended Singapore 2025 - June 14, 2025
Talk: "Metadata: The Key to Unlocking Data Analytics in the Agentic Era"
Presented insights on Google Cloud's latest data analytics innovations from Next '25, focusing on AI integration with BigQuery and the crucial role of metadata in enabling AI agents. Covered specialized AI agents for various user roles, AI-assisted notebooks, and the BigQuery AI Query Engine's capabilities with both structured and unstructured data.
Key Topics: BigQuery metadata, AI agents, data governance, query optimization, autonomous data processing
GDG Monthly Meetup #10 - October 24, 2024
Talk: "Harnessing Real-Time Insights: LLM Inference for Streaming Data with SQL"
Explored practical techniques for performing real-time inference on streaming data using large language models (LLMs) and SQL. Demonstrated seamless integration of LLMs into existing application workflows, enabling real-time insights, predictions, and classifications directly within familiar SQL environments.
Key Topics: Real-time data processing, LLM integration, streaming analytics, SQL-based AI inference
mcp-cr - Model Context Protocol Server
A comprehensive tutorial for deploying MCP (Model Context Protocol) servers to Google Cloud Run, featuring a zoo animal database with interactive tools. Demonstrates modern AI integration patterns with cloud-native deployment.
Tech Stack: Python, FastMCP, Google Cloud Run, Docker
Key Features: MCP server implementation, Cloud Run deployment, Interactive AI tools, RESTful API
gemini-cli-1c - One-Click Gemini CLI Setup
Automated installation script for complete development environment setup with NVM, Node.js, and Google's Gemini CLI. Streamlines developer onboarding for AI-powered workflows.
Tech Stack: Shell, Node.js, NVM, Gemini CLI
Key Features: One-command installation, Environment configuration, Developer productivity tools
spark-hybrid-compute - Advanced Spark Integration
Comprehensive solution for Spark integration with BigLake Metastore and Apache Iceberg, supporting both Dataproc and Docker-based deployments. Demonstrates hybrid cloud computing patterns for modern data lakes.
Tech Stack: Apache Spark, BigLake, Apache Iceberg, Dataproc, Docker, Jupyter
Key Features: Hybrid cloud architecture, Iceberg table management, BigQuery integration, Multiple deployment options
bigquery-antipattern-recognition - BigQuery SQL Optimization Tool
Enhanced fork of Google Cloud Platform's utility for identifying and rewriting common anti-patterns in BigQuery SQL syntax. Added advanced features including query grouping functionality and clustering optimization patterns to improve query performance analysis.
Tech Stack: Java, BigQuery, Maven, Docker, Cloud Run, Vertex AI
Key Features: 15+ antipattern detections, AI-powered SQL rewriting, Query grouping analysis, Remote UDF deployment, CI/CD integration
cf-bq-rf-gemini - BigQuery AI Integration
BigQuery Remote Function implementation that integrates Gemini AI directly into SQL queries, enabling AI-powered data processing at scale within BigQuery workflows.
Tech Stack: Go, BigQuery Remote Functions, Gemini AI, Cloud Functions
Key Features: SQL-native AI integration, Scalable processing, Custom model configurations and proper error handling
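For readers unfamiliar with BigQuery Remote Functions, here is a minimal sketch of the request/response contract such a function has to honor. It is written in Python for brevity and is not taken from the repository, which is implemented in Go. BigQuery posts a JSON body with a `calls` array (one argument list per row) and expects a `replies` array of the same length, or an `errorMessage` on failure. The names `handle_remote_function` and `fake_model` are hypothetical; `fake_model` is a stand-in for a real Gemini call.

```python
import json
from typing import Any, Callable


def handle_remote_function(body: dict, fn: Callable[..., Any]) -> tuple[dict, int]:
    """Apply fn to every row of a BigQuery remote function request.

    Returns a (response_body, http_status) pair. A successful response
    carries one reply per call, in the same order as the calls.
    """
    calls = body.get("calls")
    if not isinstance(calls, list):
        return {"errorMessage": "request body has no 'calls' array"}, 400
    try:
        # Each element of "calls" is the argument list for one SQL row.
        replies = [fn(*args) for args in calls]
    except Exception as exc:
        # Report the failure back to BigQuery instead of a partial batch.
        return {"errorMessage": f"{type(exc).__name__}: {exc}"}, 400
    return {"replies": replies}, 200


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a Gemini call; it only upper-cases the prompt."""
    return prompt.upper()


if __name__ == "__main__":
    request = {"calls": [["summarize order 1"], ["summarize order 2"]]}
    response, status = handle_remote_function(request, fake_model)
    print(status, json.dumps(response))
```

Keeping the batch handling separate from the model call makes the contract easy to test without any cloud credentials; an HTTP framework would only need to pass the parsed JSON body in and serialize the returned pair.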
audio-transcribe-go - Speech-to-Text Service
Production-ready speech-to-text application designed for BigQuery Remote Functions, enabling audio transcription directly within SQL workflows. Supports multiple deployment options and comprehensive testing.
Tech Stack: Go, Google Speech-to-Text API, BigQuery Remote Functions, Cloud Functions
Key Features: BigQuery integration, Multiple deployment options, Comprehensive testing, Production-ready
cf-pubsub-to-bq - Real-Time Data Ingestion
Complete real-time data pipeline solution from Pub/Sub to BigQuery using Cloud Run Functions. Includes data generation, streaming processing, and automated table management.
Tech Stack: Go, Pub/Sub, BigQuery, Cloud Run Functions, Dataflow
Key Features: Real-time processing, Automated data generation, Partitioned tables, End-to-end pipeline
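As a rough illustration of the ingestion step in a pipeline like this, the sketch below decodes a Pub/Sub push envelope into a dictionary that could be inserted as a BigQuery row. It is written in Python rather than the repository's Go and is not the project's actual code. The envelope shape (`message.data` as base64, plus `messageId` and `publishTime`) follows the Pub/Sub push format. The JSON payload and the output column names (`message_id`, `publish_time`, `payload`) are assumptions made for the example.

```python
import base64
import json


def envelope_to_row(envelope: dict) -> dict:
    """Turn a Pub/Sub push envelope into a flat, row-like dictionary.

    Assumes the publisher sent a UTF-8 JSON document as the message data.
    """
    message = envelope["message"]
    # Pub/Sub delivers the message body base64-encoded in "data".
    payload = json.loads(base64.b64decode(message["data"]).decode("utf-8"))
    return {
        "message_id": message.get("messageId"),
        "publish_time": message.get("publishTime"),
        "payload": payload,
    }


if __name__ == "__main__":
    data = base64.b64encode(json.dumps({"sensor": "a1", "value": 3}).encode()).decode()
    envelope = {
        "message": {
            "data": data,
            "messageId": "123",
            "publishTime": "2024-01-01T00:00:00Z",
        },
        "subscription": "projects/example/subscriptions/example-sub",
    }
    print(envelope_to_row(envelope))
```

Carrying `publish_time` through to the row is what would let a time-partitioned destination table be filtered by ingestion period, though the actual schema is up to the pipeline.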
- Big Data Processing: Apache Spark, Dataproc, distributed computing
- Data Warehousing: BigQuery, data modeling, partitioning strategies
- Real-Time Streaming: Pub/Sub, Kafka, event-driven architectures
- Database Technologies: PostgreSQL, MySQL, Redis, Cassandra
- AI Agents: Data science agents, agentic workflows, autonomous systems
- LLM Integration: Gemini AI, prompt engineering, AI-powered data processing
- ML Engineering: Model deployment, MLOps, production ML systems
- Google Cloud Platform: Comprehensive expertise across data, AI, and compute services
- Microsoft Azure: Enterprise cloud solutions and data platforms
- Serverless Computing: Cloud Functions, Cloud Run, event-driven architectures
- Infrastructure as Code: Terraform, deployment automation
- Google Cloud Platform (Current): Specialized in data platforms, serverless architectures, and AI/ML services
- Microsoft Azure (Previous): Enterprise cloud solutions and data platform design
- Serverless-First Approach: Designing scalable, cost-effective cloud-native solutions
- X (Twitter): @johanesalxd
- LinkedIn: johanesalxd


