Data as an essential element of the AI-driven digital economy

Dr. Peter Katko

Digital and AI Law Leader, EY Law

Published Sep 24, 2025

In today’s rapidly evolving digital economy, data is no longer just a byproduct of business operations; it is the lifeblood of innovation, strategy, and competitive advantage. AI’s growing impact amplifies data’s value: the more data it consumes, the smarter it becomes. Businesses that leverage and safeguard their data will dominate the AI era. As AI integrates deeper into operations, data quality, governance, and innovation (such as synthetic data) grow more critical. Understanding the economic gravity of data requires examining how it is reshaping industries and redefining value creation across sectors, as the sections below explore.

Why data matters more than ever: the economic gravity of data

According to the International Data Center Authority’s 2025 Global Digital Economy Report, the digital economy now accounts for 15% of global GDP, or roughly $16 trillion, with data-intensive services driving most of that growth. This transformation is not confined to the tech giants. Today, every industry, whether consciously or not, is becoming a data industry.

In the retail sector, businesses are using data to better understand the commercial habits of their customers, informing decisions on everything from product placement to personalized offers and demand forecasting. In healthcare, data enables earlier diagnoses and more targeted treatments. In finance, it powers fraud detection, credit scoring and real-time risk analysis. These examples highlight a deeper truth: it’s not just the technology that’s driving progress; it’s the data itself. Across every sector, data is the connective tissue of innovation, the invisible infrastructure that links insight to action and turns potential into performance.

AI is only as good as the data behind it

Artificial intelligence is now deeply embedded in nearly every business function. But regardless of how advanced the model is, its effectiveness hinges on the integrity of the data it’s built on. When data is incomplete, inconsistent, or biased, the outcomes of AI can be equally flawed, leading to poor decisions, skewed insights, and reputational risk. These issues are not just technical; they are strategic, affecting trust, compliance, and performance.
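To make the three failure modes named above more concrete, the following is a minimal sketch in Python of how incompleteness, inconsistency, and skewed representation might be measured on a small dataset. The records, field names, and helper functions are hypothetical and chosen purely for illustration; real data quality programs rely on far richer checks and domain-specific thresholds.

```python
from collections import Counter

# Hypothetical toy records; the field names are illustrative assumptions.
records = [
    {"id": 1, "region": "north", "income": 52000},
    {"id": 2, "region": "north", "income": None},   # missing value
    {"id": 3, "region": "south", "income": 48000},
    {"id": 3, "region": "south", "income": 61000},  # same id, conflicting value
    {"id": 4, "region": "north", "income": 57000},
]


def missing_rate(rows, field):
    """Share of rows where `field` is absent or None (incompleteness)."""
    missing = sum(1 for r in rows if r.get(field) is None)
    return missing / len(rows)


def conflicting_keys(rows, key, field):
    """Keys that appear with more than one distinct value of `field` (inconsistency)."""
    seen = {}
    for r in rows:
        seen.setdefault(r[key], set()).add(r[field])
    return sorted(k for k, values in seen.items() if len(values) > 1)


def group_shares(rows, field):
    """Share of rows per group; a lopsided split is one crude signal of sampling bias."""
    counts = Counter(r[field] for r in rows)
    return {group: n / len(rows) for group, n in counts.items()}


print(missing_rate(records, "income"))
print(conflicting_keys(records, "id", "income"))
print(group_shares(records, "region"))
```

Checks like these do not prove a dataset is fit for purpose, but they give governance teams simple, repeatable numbers to track before a model is trained on the data.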

With 100 million weekly users submitting 10 million daily queries, ChatGPT’s reach is a testament to its global relevance. But what truly powers this capability is data. The model was trained on a massive 570GB corpus of text, including web pages, books, and other sources, equivalent to roughly 300 billion tokens. This immense volume of high-quality, diverse data is what enables ChatGPT to understand context, generate nuanced responses, and solve complex problems across domains. The takeaway is clear: the smarter the AI, the deeper its data roots. Without vast and well-curated datasets, even the most sophisticated models would fall short of delivering meaningful impact. Yet even with this scale, ChatGPT is not the pinnacle of intelligence. It is a reflection of its training data, reminding us that in AI, progress is proportional to the depth, diversity, and integrity of the data behind it.

EY’s strategic guidance highlights a fundamental principle: AI is only as good as the data it’s trained on. As outlined in the article 7 Steps to Leveraging Data Effectively in the AI Era, the firm recommends investing $20 in data for every $1 in AI, emphasizing that data quality, lineage, and bias assessment are essential to achieving meaningful and trustworthy outcomes. Without high-quality, well-governed data, even the most sophisticated AI models are unlikely to deliver reliable insights at scale.

Data: the core currency of the AI economy

In the AI digital economy, data shapes how algorithms learn, how decisions are made, and how value is created. But the true power of data lies not in its volume, but in how thoughtfully it is sourced, structured, and scrutinized.

Organizations must move beyond merely collecting data and begin to critically examine its strategic value. To truly unlock its potential, they need to ask important questions: Where does our data come from, and how trustworthy is it? Is it representative, free from bias, and ethically sourced? And perhaps most importantly, are we using this data in ways that reflect our core values and support our long-term objectives?

A recent Spiceworks analysis highlights a critical challenge for AI development: we may exhaust high-quality training data for large language models (LLMs) as early as 2026. As noted above, models like ChatGPT already consume hundreds of billions of tokens, and more recent models are trained on trillions. The supply of publicly available, human-generated text data is finite, and the AI industry is rapidly consuming this limited resource. This scarcity is driving organizations to explore alternatives such as synthetic data generation and proprietary datasets, though these approaches introduce new challenges around quality, bias, and intellectual property rights. The impending shortage fundamentally changes the economics of AI, making existing data assets increasingly valuable and necessitating more strategic approaches to data acquisition and management. As AI models become increasingly commoditized, proprietary, high-quality data becomes the true differentiator.

Data as a defensible competitive advantage

In the AI-driven digital economy, data is more than just valuable; it is defensible. Proprietary data, unique to a company, is emerging as a new form of intellectual property and a powerful competitive advantage.

Furthermore, synthetic data is redefining what it means to own a competitive data advantage. Unlike scarce or privacy-restricted traditional data, it lets organizations create tailored datasets on demand while retaining full ownership. When engineered properly, it can model rare scenarios, preserve anonymity, and even enhance AI training quality. But the real value lies in disciplined curation: top enterprises now blend synthetic and real data into hybrid assets, rigorously validating for bias and compliance to build scalable, defensible AI advantages. The key is quality at scale: synthetic data must replicate real-world complexity to be effective. When done right, it doesn’t just add quantity; it strategically expands the 'learning curriculum' for an organization’s AI systems.
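As a rough illustration of the hybrid approach described above, the Python sketch below generates synthetic values from a simple statistical model of real data, validates the synthetic batch against the real distribution, and caps the synthetic share of the blended dataset. Everything here is an assumption made for illustration: the Gaussian generator, the mean-based validation rule, the tolerance, and the 50% cap are hypothetical choices, not a description of how any particular enterprise builds its data assets.

```python
import random
import statistics


def fit_gaussian(values):
    """Summarize a numeric column by its mean and sample standard deviation."""
    return statistics.mean(values), statistics.stdev(values)


def generate_synthetic(real, n, seed=0):
    """Draw n synthetic values from a Gaussian fitted to the real data (a deliberately simple generator)."""
    rng = random.Random(seed)
    mu, sigma = fit_gaussian(real)
    return [rng.gauss(mu, sigma) for _ in range(n)]


def passes_validation(real, synthetic, tolerance=0.25):
    """Accept a synthetic batch only if its mean lies within tolerance * stdev of the real mean."""
    mu, sigma = fit_gaussian(real)
    return abs(statistics.mean(synthetic) - mu) <= tolerance * sigma


def blend(real, synthetic, max_synthetic_share=0.5):
    """Combine real and synthetic values, capping the synthetic share of the result."""
    if not 0 <= max_synthetic_share < 1:
        raise ValueError("max_synthetic_share must be in [0, 1)")
    # Largest k such that k / (len(real) + k) <= max_synthetic_share.
    limit = int(max_synthetic_share * len(real) / (1 - max_synthetic_share))
    kept = synthetic[:limit]
    # Label every value with its provenance so lineage survives the blend.
    return [(v, "real") for v in real] + [(v, "synthetic") for v in kept]


# Hypothetical real measurements.
real_values = [10.0, 12.0, 11.0, 13.0, 9.0, 10.5, 11.5, 12.5]
candidate = generate_synthetic(real_values, n=20)

if passes_validation(real_values, candidate):
    hybrid = blend(real_values, candidate, max_synthetic_share=0.5)
else:
    hybrid = [(v, "real") for v in real_values]  # fall back to real data only

print(len(hybrid))
```

Keeping a provenance label on every record is the design choice worth noting: it lets bias and compliance reviews treat real and synthetic data differently even after the two have been combined.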

In my experience, leading organizations understand this shift. They are investing in robust data governance, ethical AI practices, and strategic partnerships to build rich data ecosystems. By treating data as a core business asset, with board-level oversight and cross-functional collaboration, they are not just unlocking insights, but securing long-term, hard-to-replicate advantages in a rapidly evolving landscape.

The cost of poor data strategy

In the AI-powered digital economy, data is not just a support function; it is the foundation of value creation. Without a clear and mature data strategy, organizations risk falling behind in a landscape where speed, intelligence, and adaptability are driven by data.

The EY AI Pulse Survey, which gathered insights from 500 US senior business leaders, reveals that an impressive 97% of those whose organizations are investing in AI are already realizing measurable value across business functions. Notably, organizations allocating 5% or more of their total budget to AI initiatives report significantly greater impact. Findings from the EY report AI Investments Remain Strong in 2025 Amid Rising Risks show that companies are achieving higher performance across key areas: competitive advantage (80%), operational efficiency (84%), employee productivity (83%), and product innovation (75%). However, 83% of these leaders also acknowledge that stronger data infrastructure would accelerate AI adoption, and two-thirds admit their current limitations are actively holding them back.

In contrast, it seems likely that companies without a strong data foundation face a mounting disadvantage. Siloed systems prevent insights from flowing across teams, limiting collaboration and innovation. Poor data quality leads to unreliable AI outputs, undermining performance and decision-making. And without proper governance, organizations expose themselves to compliance risks and a loss of stakeholder trust.

The future belongs to those who see data not just as a resource, but as a responsibility, one that demands critical thinking, collaboration, and a commitment to trust. As AI scales and the digital economy accelerates, data is becoming both the most valuable and the most scrutinized asset.

With regulation expanding, from the EU AI Act to global privacy laws, compliance is essential, but trust is the true differentiator. That is precisely the purpose of these frameworks: to foster trust and protect individuals in an increasingly data-driven world, while simultaneously promoting innovation and enabling organizations to unlock the full value of their data.

In this landscape, the EU Data Act marks a pivotal step toward a more open, fair, and innovation-driven data economy. By establishing clear rules for data sharing between businesses, consumers, and public institutions, the regulation aims to unlock the value of data while safeguarding privacy, security, and competition. It is not just a technical framework; it is a strategic blueprint that challenges companies to rethink how they collect, manage, and leverage data across their operations. The Data Act encourages organizations to treat data not only as a commercial asset but as a shared resource that must be handled responsibly and transparently.

Meeting these ambitions requires a multidisciplinary approach: data governance, legal, cybersecurity, AI, and operational teams must work together to make sure that data is managed ethically, securely, and strategically across its entire lifecycle.

What sets leading organizations apart is their ability to bring these capabilities together and to do so swiftly and effectively, turning multidisciplinary collaboration into a true competitive advantage.

The views reflected in this article are the views of the authors and do not necessarily reflect the views of Ernst & Young LLP or other members of the global EY organization.

Authors:

David de Falguera Llobet

Gabrielle Ellis

Melissa A. Ellis

Senior Executive Assistant to the Chairman & CEO

Very timely and well written!

Karl Ricker

Engineering and Quality Assurance at Innovative Solutions, Inc.

1mo

I believe AI is coming and will be impactful to everyone, whether they are ready or not. Quibbling on legal matters of data governance will not speed a country’s adoption and use of it. I would emphasize more the education and training of citizenry.

Oliver M. Habel

Managing partner of tecLEGAL Habel RAe, specialised in corporate law and international commercial law

1mo

I found these lines interesting and agree with them completely. Great job! I learnt this week that the Data Act and the data economy do not seem to be important issues for German companies, at least in legal departments, where they do not appear to be significant problems. I get this impression from feedback from companies.

Guido Reinke, Ph.D., LL.M., CIPP/E, CISA CRISC, CFE

1mo

I fully agree. As the latest guidance from the EDPB shows, AI should also adhere to privacy by design and by default. And it should comply with IP laws.

Gayanthi Gunawardhana, LL.M.

Principal Consultant @ Libra Sentinel | Data Privacy & Consent Architecture | AI Governance & AI Literacy | FinTech Law

1mo

Excellent framing Dr. Peter Katko. The real leadership litmus test isn’t more models, it’s whether an organization can prove lineage and single-owner accountability for one critical dataset this quarter. If you can, you’ve built defence. If not, you’ve bought technology, not trust.
