Variant ratified in Apache Parquet community for open data

Databricks

Exciting news for the open data community: Variant, the native data type for semi-structured data, is now ratified in the Apache Parquet™ community, with support across Delta Lake, Apache Iceberg™, and Apache Spark™. Variant brings a unified, open standard to how the lakehouse stores and queries flexible data, making it faster, simpler, and more consistent across formats and engines. https://lnkd.in/gESP97Nm

Awadelrahman Ahmed

Data & AI Architect – Strategy, Platforms & Solutions | Databricks MVP | Databricks Technical Council Member | MLflow Ambassador

1w

Thanks to Variant, they just made Delta, Iceberg, and Spark agree on something🙈

Gaurav Harpale

Senior Data Engineer @ NTT Data | Snowflake | Databricks | Gen-AI | Cloud Data Migration | Azure | ADF | DWH | PySpark | Teradata | Salesforce

1w

This is a huge step forward for the open data ecosystem! Standardizing Variant means true interoperability for semi-structured data. Faster queries, simpler pipelines, and consistent handling of flexible data types; this will unlock massive value for lakehouse architectures and AI-driven analytics!

Sagar Shinde

Director @ CloudEthiX

1w

Exciting progress toward true interoperability in the open data ecosystem!

We’ve been using this feature for some time at ziggiz, and it’s been a game changer. As a managed Cyber Lakehouse, we’re thrilled to see the open ecosystem moving so quickly — Variant gives us incredible flexibility when working with variable fields across diverse security and IT telemetry sources. Exciting times ahead for truly open, analytical data models.

Fantastic progress for open standards and the lakehouse community! Excited to see how Variant improves flexibility and performance. 🚀

Fantastic milestone for the open data ecosystem and for everyone building smarter data architectures! At Kobai we’re extending the power of the Lakehouse with a semantic layer that understands Variant natively, connecting structured and semi-structured data into meaningful knowledge graphs. The result?

* Faster context-aware insights
* Unified data semantics across Delta, Iceberg, and Spark
* A foundation for truly intelligent, explainable AI

Learn how Kobai makes your Lakehouse even smarter: kobai.io #Databricks #KnowledgeGraph #Lakehouse #AI #DataEngineering #OpenData

"Unified open standard for semi-structured data" — words that shouldn't be this exciting, but here we are 😄 Seriously though, this fixes a real problem. Nice work!

Amit Gupta

Senior Data Architect | Cloud Data Migration Consultant | Databricks Engineer | 2 x Azure | 1 x GCP Data Practitioner

1w

It's amazing 👏

Gaive Gandhi

Sr. Director - Strategy, Consulting and Delivery | Data and AI | Building Delivery Teams | I Am Here To Learn

1w

This is awesome. I have a few questions though: 1. Is the Variant data type supported with custom Serializers like Kryo? 2. Since there is interoperability amongst Delta, Hudi and Iceberg, I am presuming Variant data written using one open table format can be read and processed by another.

Esdras Rocha

Engenheiro de Dados Sênior | Databricks, ADF, PySpark | Multi-Cloud (AWS & Azure) | Arquitetura Lakehouse | ETL & Governança de Dados

1w

Huge step forward for the Lakehouse world! Variant is a game changer for handling semi-structured data seamlessly across Delta, Iceberg, and Spark
