This Stanford study examined how six major AI companies (Anthropic, OpenAI, Google, Meta, Microsoft, and Amazon) handle user data from chatbot conversations. Here are the main privacy concerns.

👀 All six companies use chat data for training by default, though some allow opt-out
👀 Data retention is often indefinite, with personal information stored long-term
👀 Cross-platform data merging occurs at multi-product companies (Google, Meta, Microsoft, Amazon)
👀 Children's data is handled inconsistently, with most companies not adequately protecting minors
👀 Transparency is limited: privacy policies are complex, hard to understand, and often lack crucial details about actual practices

Practical takeaways for nonprofits' acceptable use policies and training on generative AI:

✅ Assume anything you share will be used for training - sensitive information, uploaded files, health details, biometric data, etc.
✅ Opt out when possible - proactively disable data collection for training (Meta is the one company where you cannot)
✅ Information cascades through ecosystems - your inputs can lead to inferences that affect ads, recommendations, and potentially insurance or other third parties
✅ Special concern for children's data - age verification and consent protections are inconsistent

Some questions to consider in acceptable use policies and to incorporate into any training:

❓ What types of sensitive information might your nonprofit staff share with generative AI?
❓ Does your nonprofit currently identify what counts as “sensitive information” (beyond PID) that should not be shared with generative AI? Is this incorporated into training?
❓ Are you working with children, people with health conditions, or others whose data could be particularly harmful if leaked or misused?
❓ What would be the consequences if sensitive information or strategic organizational data ended up being used to train AI models? How might this affect trust, compliance, or your mission? How is this communicated in training and policy?

Across the board, the Stanford research points out that developers’ privacy policies lack essential information about their practices. The researchers recommend that policymakers and developers address the data privacy challenges posed by LLM-powered chatbots through comprehensive federal privacy regulation, affirmative opt-in for model training, and filtering personal information from chat inputs by default. “We need to promote innovation in privacy-preserving AI, so that user privacy isn’t an afterthought.”

How are you advocating for privacy-preserving AI? How are you educating your staff to navigate this challenge? https://lnkd.in/g3RmbEwD
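For staff training, the recommendation to filter personal information from chat inputs by default can be illustrated with a small sketch. This is only an illustration under stated assumptions: the `PATTERNS` table and the `redact` function are hypothetical names, the two regular expressions catch only simple email addresses and phone numbers, and nothing here reflects how any AI provider actually filters data.

```python
import re

# Hypothetical, deliberately simple patterns; real PII detection needs far more coverage.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def redact(text: str) -> str:
    """Replace each match of a known pattern with a bracketed label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


if __name__ == "__main__":
    prompt = "Email jane.doe@example.org or call +1 (555) 010-9999 today"
    print(redact(prompt))  # Email [EMAIL] or call [PHONE] today
```

A filter like this would run before a prompt leaves the organization. It will miss names, addresses, and health details, so it complements rather than replaces a policy on what staff may share.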
Data Privacy Issues With AI
Explore top LinkedIn content from expert professionals.
-
NEWS 21/10/25: Department of Homeland Security obtains first-known warrant targeting OpenAI for user prompts in ChatGPT

According to a recent article by Forbes, the U.S. Department of Homeland Security (DHS) has secured a federal search warrant ordering OpenAI to identify a user of ChatGPT and to produce the user’s prompts, as part of a child-exploitation investigation. https://lnkd.in/eatmK3zv?

Key details:
- The warrant was filed by child-exploitation investigators within DHS.
- It specifically targets “two prompts” submitted to ChatGPT by an anonymous user. The warrant asks OpenAI for the user’s identifying information and associated prompt history.
- This is described as the first known federal search warrant compelling ChatGPT prompt-level data from OpenAI.

What this means for privacy:
- Prompts are treated as evidence. What users have assumed to be ephemeral or private entries in a chat session with an AI service may now be subject to law-enforcement production.
- Scope of data retention and access must be reconsidered. If prompt history can be identified and requested, both users and providers should evaluate how long prompts are stored, under what identifiers, and how anonymised they truly are.
- Implications for user trust and provider responsibility. AI companies may face growing legal obligations to disclose user-generated content and metadata, which may affect how the services present themselves (privacy guarantees, terms of service) and how users engage with them.
- International context and legal cross-overs. For users in jurisdictions with strong data-protection regimes (for example, the General Data Protection Regulation in the UK/EU), the fact that prompt data can be subject to a U.S. warrant may raise questions about extraterritorial access and data flow compliance.

In short: this isn’t just another law-enforcement request. It marks the first known time a generative-AI provider has been legally compelled to unmask a user and disclose their prompt history.

============
↳ I track how stories like this shape the ethics and governance of AI. You can find deeper analysis at discarded.ai.

#AISafety #AIRegulation #Privacy #Governance #Ethics
-
The next big data privacy scandal in 2026 is not surveillance. It is surveillance pricing.

Two people can buy the same thing on the same day and pay different prices because their data told the system they would tolerate it.

This is the part more people need to understand. The next privacy battle is not only about: “Who has my data?” It is also about: “What are they doing with it?”

Because once companies know your location, device type, browsing behaviour, repeat visits, urgency signals, and purchase history, privacy becomes a pricing issue.

We are already seeing signals of this. Uber openly calls it surge pricing. Airbnb has Smart Pricing. Amazon lets sellers automate price changes in real time. Hotels and airlines have used dynamic pricing for years. In 2025, India’s consumer affairs ministry sent notices to Ola and Uber after allegations that identical rides were being priced differently on Apple and Android phones.

So, what changes the privacy conversation is when dynamic pricing stops reacting only to market demand and starts learning from the customer in front of it.

This is why I think the most important privacy question in 2026 is no longer: “Was my data leaked?” It is: “Is my data being used to influence the price, urgency, ranking, or offer I see?”

Think about everyday Indian internet behaviour:
You check a flight 4 times from the same laptop.
You open a hotel app from a premium phone.
You try booking a cab during rain, from a high-income pin code, late at night.
You revisit the same product after showing clear buying intent.

You may still call it convenience. But increasingly, it can also become behavioural exploitation. Because the moment customers feel the system knows them well enough to charge them more, trust collapses. And once trust collapses, growth gets expensive.

My view is simple: Data privacy in 2026 is not just about protecting people from theft. It is about protecting people from invisible disadvantage. That is the conversation more founders, platforms, and regulators need to have now.

What surveillance pricing have you run into yourself?

Seqrite

#DataPrivacy #DynamicPricing #AI #ConsumerRights #DigitalEconomy #Privacy #TechPolicy #StartupIndia #CyberSecurity #TrustInTechnology
-
𝗧𝗵𝗲 𝗦𝘂𝗿𝘃𝗲𝗶𝗹𝗹𝗮𝗻𝗰𝗲 𝗧𝗿𝗮𝗽: 𝗠𝗼𝗻𝗶𝘁𝗼𝗿𝗶𝗻𝗴 𝗕𝗼𝗼𝘀𝘁𝘀 𝗩𝗶𝘀𝗶𝗯𝗶𝗹𝗶𝘁𝘆, 𝗲𝗿𝗼𝗱𝗲𝘀 𝘁𝗿𝘂𝘀𝘁. Over the past few months, more companies have quietly rolled out new monitoring systems — tracking mouse movements, keystrokes, websites, “idle time,” and even screenshots. 𝗧𝗵𝗲 𝗶𝗻𝘁𝗲𝗻𝘁? Improve productivity, tighten accountability, optimise workflows. 𝗧𝗵𝗲 𝗼𝘂𝘁𝗰𝗼𝗺𝗲? A workplace culture that feels more watched than supported. Here’s the paradox leaders are missing: 𝙈𝙤𝙣𝙞𝙩𝙤𝙧𝙞𝙣𝙜 𝙗𝙤𝙤𝙨𝙩𝙨 𝙫𝙞𝙨𝙞𝙗𝙞𝙡𝙞𝙩𝙮 — 𝙣𝙤𝙩 𝙩𝙧𝙪𝙨𝙩. Employees may be online longer, but they’re not necessarily more engaged. Surveillance signals a lack of confidence, and people respond by doing only what gets measured. 𝙏𝙧𝙖𝙘𝙠𝙞𝙣𝙜 𝙖𝙘𝙩𝙞𝙫𝙞𝙩𝙮 𝙙𝙤𝙚𝙨 𝙣𝙤𝙩 𝙣𝙚𝙘𝙚𝙨𝙨𝙖𝙧𝙞𝙡𝙮 𝙢𝙚𝙖𝙣 𝙩𝙧𝙖𝙘𝙠𝙞𝙣𝙜 𝙞𝙢𝙥𝙖𝙘𝙩. A green dot on Teams does not equal performance. When companies measure time-at-keyboard more than outcomes, employees shift from value-creation to “visibility theatre.” 𝙏𝙝𝙚 𝙚𝙢𝙤𝙩𝙞𝙤𝙣𝙖𝙡 𝙘𝙤𝙨𝙩 𝙞𝙨 𝙧𝙚𝙖𝙡. Workers report: • feeling micromanaged • reduced autonomy • lower morale • rising anxiety and distrust Ironically, the very tools meant to improve productivity may be undermining it. Modern work isn’t defined by minutes of activity — it’s defined by: • problem-solving • creativity • judgment • ownership • outcomes These can’t be captured by keystroke logs. 𝗧𝗵𝗲 𝗰𝗼𝗺𝗽𝗮𝗻𝗶𝗲𝘀 𝘁𝗵𝗮𝘁 𝘄𝗶𝗹𝗹 𝘄𝗶𝗻 𝗮𝗿𝗲𝗻’𝘁 𝘁𝗵𝗲 𝗼𝗻𝗲𝘀 𝘁𝗿𝗮𝗰𝗸𝗶𝗻𝗴 𝗲𝗺𝗽𝗹𝗼𝘆𝗲𝗲𝘀… 𝗧𝗵𝗲𝘆’𝗿𝗲 𝘁𝗵𝗲 𝗼𝗻𝗲𝘀 𝗲𝗺𝗽𝗼𝘄𝗲𝗿𝗶𝗻𝗴 𝘁𝗵𝗲𝗺.
-
Can You Trust Your Data the Way You Trust Your Best Team Member?

Do you know the feeling when you walk into a meeting and rely on that colleague who always has the correct information? You trust them to steer the conversation, to answer tough questions, and to keep everyone on track. What if data could be the same way: reliable, trustworthy, always there when you need it?

In business, we often talk about data being "the new oil," but let’s be honest: without proper management, it’s more like a messy garage full of random bits and pieces. It’s easy to forget how essential data trust is until something goes wrong: decisions are based on faulty numbers, reports are incomplete, and suddenly you’re stuck cleaning up a mess.

So, how do we ensure data is as trustworthy as that colleague you rely on? It starts with building a solid foundation through these nine pillars:

➤ Master Data Management (MDM): Consider MDM the colleague who always keeps the big picture in check, ensuring everything aligns and everyone is on the same page.
➤ Reference Data Management (RDM): Have you ever been in a meeting where everyone uses a different term for the same thing? RDM removes the confusion by standardising key data categories across your business.
➤ Metadata Management: Metadata is like the notes and context we keep on a project. It tracks how, when, and why decisions were made, so you can always refer to them later.
➤ Data Catalog: Imagine a digital filing cabinet that’s not only organised but searchable, easy to navigate, and quick to find exactly what you need.
➤ Data Lineage: This is your project’s timeline, tracking each step of the data’s journey so you always know where it has been and where it is going.
➤ Data Versioning: Data evolves, just as project plans get updated. Versioning keeps track of every change so you can revisit previous versions or understand shifts when needed.
➤ Data Provenance: Provenance is the backstory: understanding where your data originated helps you assess its trustworthiness and quality.
➤ Data Lifecycle Management: Data doesn’t last forever, just as projects have deadlines. Lifecycle management ensures your data is used and protected appropriately throughout its life.
➤ Data Profiling: Consider profiling a health check for your data, spotting potential errors or inconsistencies before they affect business decisions.

When we get these pillars right, data goes from being just a tool to being a trusted ally, one you can count on to help make decisions, drive strategies, and ultimately support growth.

So, what pillar would you focus on to make your data more trustworthy?

Cheers! Deepak Bhardwaj
-
📌 The Modern Data Quality Framework for BI

Every company wants better dashboards, better insights, better AI. But very few stop to ask the one question that actually matters: Can we trust the data we’re using in the first place?

Because the hard truth is this: Most data issues don’t come from tools. They come from unreliable foundations that nobody notices until something breaks in production.

When I look at the teams that consistently ship trustworthy data, there’s always the same pattern behind the scenes. Let me walk you through my reasoning.

1️⃣ 𝐓𝐡𝐞 5 𝐏𝐢𝐥𝐥𝐚𝐫𝐬 𝐀𝐫𝐞 𝐒𝐭𝐢𝐥𝐥 𝐭𝐡𝐞 𝐒𝐭𝐚𝐫𝐭𝐢𝐧𝐠 𝐏𝐨𝐢𝐧𝐭
Accuracy, completeness, consistency, timeliness, and validity. We all know them. But most teams still treat these as “definitions.” The best teams, by contrast, treat them as operational targets. It’s a completely different mindset. Accuracy isn’t “nice to have.” It’s whether your revenue aligns with reality. Completeness isn’t a rule. It’s whether you trust the KPI enough to act on it. Everything changes once you start thinking this way.

2️⃣ 𝐓𝐞𝐜𝐡𝐧𝐢𝐜𝐚𝐥 𝐂𝐡𝐞𝐜𝐤𝐬 𝐌𝐚𝐤𝐞 𝐨𝐫 𝐁𝐫𝐞𝐚𝐤 𝐑𝐞𝐥𝐢𝐚𝐛𝐢𝐥𝐢𝐭𝐲
This is where issues hide. I can’t count the number of times I’ve seen dashboards fail not because the model was wrong but because nobody noticed:
→ A column changed type
→ A pipeline skipped 2% of rows
→ A source table silently dropped a field
→ A null explosion went undetected for weeks
This layer is invisible to most of the business, yet it’s the one that protects trust. If you don’t have anomaly detection or CI/CD tests, you’re relying on luck. And luck is not a data strategy.

3️⃣ 𝐆𝐨𝐯𝐞𝐫𝐧𝐚𝐧𝐜𝐞 𝐌𝐚𝐤𝐞𝐬 𝐄𝐯𝐞𝐫𝐲𝐭𝐡𝐢𝐧𝐠 𝐖𝐨𝐫𝐤
Data catalogs, lineage, ownership, contracts. People talk about them like buzzwords, but the impact is very real. Lineage isn’t a diagram. It’s how you debug issues in minutes instead of days. Contracts aren’t bureaucracy. They’re how producers guarantee stability for downstream teams. Stewardship isn’t a title. It’s accountability. What I’ve learned from my experience is simple: When governance is strong, you don’t spend your life firefighting.

4️⃣ 𝐀𝐭 𝐭𝐡𝐞 𝐂𝐞𝐧𝐭𝐞𝐫 𝐨𝐟 𝐄𝐯𝐞𝐫𝐲𝐭𝐡𝐢𝐧𝐠: 𝐃𝐚𝐭𝐚 𝐓𝐫𝐮𝐬𝐭
This is the part people underestimate. Trust is not something you “announce” on a slide. It’s something you earn, build, and protect over time. It shows up in adoption. It shows up in business confidence. It shows up in how quickly you can respond when an anomaly hits. Trust is the real KPI. And when it’s strong, everything else becomes easier. Executives stop asking, “Where did this number come from?”

Why does this matter so much? Because a lot of companies are scaling GenAI without first fixing data quality. And when AI learns from unreliable data, it becomes unreliable itself.

If you want to improve decision-making, data quality is not a side topic. Everything else is built on top of it.
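The four silent failures listed under the technical checks (a changed column type, skipped rows, a dropped field, a null explosion) can each be caught by a small automated test. Below is a minimal sketch in plain Python, assuming rows arrive as dictionaries; `check_batch`, its parameters, and the thresholds are hypothetical names and values chosen for illustration, not part of any particular tool.

```python
def check_batch(rows, expected_types, expected_rows,
                max_row_loss=0.01, max_null_rate=0.05):
    """Return a list of human-readable issues found in one batch of rows."""
    issues = []

    # Skipped rows: compare the batch size against the expected row count.
    if expected_rows > 0:
        loss = (expected_rows - len(rows)) / expected_rows
        if loss > max_row_loss:
            issues.append(f"row count dropped: {len(rows)} of {expected_rows}")

    for column, expected in expected_types.items():
        # Dropped field: the column appears in no row at all.
        if not any(column in row for row in rows):
            issues.append(f"missing column: {column}")
            continue

        values = [row.get(column) for row in rows]

        # Null explosion: share of missing values above the threshold.
        nulls = sum(1 for v in values if v is None)
        if nulls / len(values) > max_null_rate:
            issues.append(f"null rate too high: {column}")

        # Type change: any non-null value of an unexpected type.
        if any(v is not None and not isinstance(v, expected) for v in values):
            issues.append(f"type changed: {column}")

    return issues


if __name__ == "__main__":
    batch = [
        {"id": 1, "amount": 10.0},
        {"id": 2, "amount": None},
        {"id": "3", "amount": 5.0},
    ]
    schema = {"id": int, "amount": float, "region": str}
    for issue in check_batch(batch, schema, expected_rows=4):
        print(issue)
```

Run in a CI/CD pipeline or scheduled job, a check like this fails loudly instead of letting a dashboard drift. The thresholds are assumptions to tune per dataset.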
-
Digital is the operating system of every charity.

SCVO’s latest call to action is clear: AI, data, cyber and digital confidence are governance issues, not technical side projects. For boards and leaders, there are some uncomfortable truths in here:

1. If digital isn’t on the board agenda, you don’t have a digital strategy 🚩 Delegating this to a single staff member is like delegating finance to Excel.
2. Tools don’t transform organisations - leadership does ⚡ Buying software without investing in skills and culture just creates more expensive problems.
3. User-centred design is governance 🧑‍🦳 Boards should be asking: who gets excluded by our digital choices? Not just what platform should we buy?
4. Data is an ethical responsibility, not just a dashboard 📊 What you collect, how you protect it, and what you do with it is about trust, not KPIs.
5. AI needs curiosity with guardrails ✨ Small, values-led experiments beat hype-driven gambles every time.
6. Cyber risk sits with the board - whether the board feels ready or not 🔑

Digital confidence is about leaders asking better questions, creating permission to learn, and connecting technology to mission. If you’re a trustee or CEO and digital still feels like “someone else’s job” - that’s the risk right there.
-
AI reaches a milestone: privacy by design at scale

Google AI and DeepMind have announced VaultGemma, a 1B parameter, open-weight model trained entirely with differential privacy (DP).

Why does this matter?
Most large LLMs carry inherent privacy risks: they can memorise and reproduce fragments of their training data. That is a serious issue if the fragment is a patient record, bank detail, or private correspondence. VaultGemma's training method - DP-SGD, which limits how much influence any datapoint has and adds noise to blur details - is designed to ensure that no single piece of personal data included in training can later be exposed. The result: a mathematical guarantee of privacy, the strongest ever achieved at this scale.

The opportunities
In healthcare, finance, and government, the implications are immediate:
🔸 Hospitals can analyse patient data without risking disclosure.
🔸 Banks can detect fraud or assess credit risk within GDPR rules.
🔸 Governments can train models on citizen data while meeting privacy-by-design requirements.
In each case, sensitive data shifts from a liability to an asset that can drive innovation.

The challenges
1️⃣ Performance: VaultGemma is less accurate than the frontier LLMs, closer to the performance of non-private models from roughly five years ago, such as GPT-2. This is the cost of stronger privacy: trading short-term capability for long-term protection.
2️⃣ Jurisdiction: The model guarantees privacy, but not sovereignty. Built by an American provider, it remains subject to U.S. law. Under the CLOUD Act, American authorities can compel access even to data hosted abroad.

How this compares
💠 Gemini has strong capability and multimodality, but privacy protections rest on corporate policy.
💠 ChatGPT-5 leads in performance, but is closed and under U.S. jurisdiction.
💠 Claude is positioned as “safety-first,” yet its privacy controls are policy-based, not mathematical.
By contrast, VaultGemma offers provable privacy. The trade-off is weaker performance and continued U.S. jurisdiction - but it moves the conversation from “trust us” to “prove it.”

Leaders now have a wider choice for adopting AI:
✔️ Privacy-first model: trade accuracy for provable privacy. Suited for highly regulated sectors and SMEs needing compliance. Lower cost, limited customisation, under U.S. law.
✔️ Frontier LLMs: cutting-edge capability at scale. Privacy rests on policy, with jurisdiction split - U.S., Chinese, or EU law. Highest-priced via usage-based APIs, but with the broadest ecosystems and integrations.
✔️ Sovereign alternatives: slower today, but with greater control of data and law. Could adopt privacy-by-design methods like VaultGemma, though requiring heavy upfront investment. Higher initial cost, offset by customisation and long-term resilience.

AI has reached a milestone: privacy by design is possible at scale. Leaders need to balance trust, compliance, performance, and control in their choices.

#AI #ResponsibleAI #DataPrivacy #DigitalSovereignty #Boardroom
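To make the description of DP-SGD in this post concrete (limit each datapoint's influence, then add noise), here is a minimal sketch of a single update step in plain Python. It is an illustration under stated assumptions, not VaultGemma's training code: `clip_gradient` and `dp_sgd_step` are hypothetical names, gradients are plain lists of floats, and the sketch does not track the privacy budget (epsilon) that a real DP guarantee depends on.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Scale one example's gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(weights, per_example_grads, lr, max_norm, noise_multiplier, rng):
    """One DP-SGD update: clip per example, sum, add Gaussian noise, average."""
    dim = len(weights)
    # Clipping bounds how much any single example can move the model.
    clipped = [clip_gradient(g, max_norm) for g in per_example_grads]
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    # Noise scaled to the clipping bound blurs each example's contribution.
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    batch_size = len(per_example_grads)
    return [w - lr * n / batch_size for w, n in zip(weights, noisy)]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    grads = [[3.0, 4.0], [0.3, 0.4]]  # the first example would otherwise dominate
    print(dp_sgd_step([0.0, 0.0], grads, lr=1.0, max_norm=1.0,
                      noise_multiplier=1.0, rng=rng))
```

The clipping bound and noise multiplier are the two knobs behind the accuracy trade-off discussed in the post: tighter clipping and more noise strengthen privacy but weaken the learning signal.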
-
The Model That Knows Too Much: How AI Can Leak What It Learned What if your AI isn’t just generating answers… it’s recalling secrets? We tend to treat AI models like secure engines: data goes in, predictions come out, and whatever happens inside stays inside. But model inversion attacks challenge that assumption. With carefully crafted queries, an attacker doesn’t need to breach your database; they can interrogate your model until it begins to reveal fragments of the very data it was trained on. No firewall alerts. No ransomware screen. Just seemingly normal interactions… producing not-so-normal disclosures. As organizations rush to embed AI into healthcare, finance, legal systems, and enterprise platforms, the question is no longer just “Is the model accurate?” It’s “What does the model remember?” Because in the age of foundation models, sensitive data doesn’t have to be stolen from storage; it can be reconstructed from memory. And that turns your AI system into something far more than a tool. It becomes a potential leak. #Cybersecurity #ArtificialIntelligence #AIsecurity #ModelInversionAttacks #DataPrivacy #MachineLearning #AISafety #Infosec #DigitalRisk #FoundationModels #DataProtection #CyberRisk