How to Remove Your Personal Information from AI Datasets

February 12, 2026 · 6 min read

Privacy & Technology

The Uncomfortable Truth Nobody Told You

Your name. Your face. Your old blog posts from 2015. Your forum rant about a pizza delivery.

It's all probably inside an AI model now.

Companies are scraping information from across the internet and using it to train AI systems without your consent. Meta has confirmed that all text and photos publicly posted by adult Facebook and Instagram users since 2007 have been scraped and fed into the company's AI models.

That's nearly two decades of your digital life. Fed into a machine you never agreed to feed.

The question isn't whether your data is there. The question is: what can you actually do about it?

This guide gives you the answer and a step-by-step plan to take back as much control as possible over your personal information.

Why Your Data Ended Up in AI Models

Many tech companies creating AI products today do not disclose their sources of training data. Through tools like "Have I Been Trained," we have learned that many AI products are trained on datasets containing personal information scraped from popular websites. These websites include media platforms, video and photography sites, and online encyclopedias.

According to one study, AI products often produce outputs that contain people's names and some form of contact information. One artist even discovered that her private medical records were inside LAION, a training dataset used by many AI products.

Academics have identified at least 12 privacy risks from AI. These risks include AI collecting large amounts of data, increasing surveillance risks, revealing sensitive information, and making sensitive information more accessible.

The Hard Truth: Full Removal Is Nearly Impossible

It is currently technically impossible to remove data from training datasets without influencing deep learning models in unpredictable ways.

The landscape is shifting fast:

  • California's AB 1008, which took effect in 2025, requires AI developers to honor deletion requests for information embedded in models.

  • Researchers at UC Riverside have developed a method that compels AI models to forget selected information while maintaining functionality.

  • The European Data Protection Board has made the right to erasure its enforcement priority for 2025.

The tools are coming. Right now, your power lies in opt-outs, deletion requests, and controlling future exposure.

Your Action Plan: 6 Steps to Reclaim Your Data

Step 1: Audit Where Your Data Might Be

Before you can remove anything, you need to know what's there.

  • Visit haveibeentrained.com and search for your images in AI training datasets.

  • Search your name and email address, each in quotes, on Google to see what's publicly indexed.

  • Review the privacy policies and terms of service of platforms where you've shared content.
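
If you want to run the quoted-search step the same way each time you recheck, a tiny script can build the queries for you. This is only a sketch: the function names are made up for illustration, the name and email are placeholders, and it assumes the usual `https://www.google.com/search?q=` URL pattern.

```python
from urllib.parse import quote_plus


def build_audit_queries(full_name: str, email: str) -> list[str]:
    """Return exact-match search strings for a name and an email address."""
    # Wrapping a term in double quotes asks the search engine for an exact match.
    return [
        f'"{full_name}"',
        f'"{email}"',
        f'"{full_name}" "{email}"',
    ]


def to_search_url(query: str) -> str:
    """URL-encode a query so it can be pasted into a browser address bar."""
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    # Placeholder identity; replace with your own details.
    for query in build_audit_queries("Jane Example", "jane@example.com"):
        print(to_search_url(query))
```

Open each printed link in a browser and note which sites show up. Those are the places to start with removal requests.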

Step 2: Opt Out of AI Training on Major Platforms

ChatGPT (OpenAI)

You can opt out of training through OpenAI's privacy portal by clicking "Do not train on my content." Once you opt out, new conversations will not be used to train their models. You can also use Temporary Chat mode; these chats won't appear in history, use or create memories, or be used to train models (source: OpenAI Help Center).

Google Gemini

Open Gemini in your browser, click on Activity, and select the Turn Off drop-down menu. You can turn off Gemini Apps Activity, or opt out and delete your conversation data (source: Build Fast with AI).

LinkedIn

Log in to your LinkedIn account → Click on "Me" in the upper toolbar → Select "Settings & Privacy" → Under Data Privacy → Select "Data for Generative AI Improvement" → Move the slider to Off (source: Transparency Coalition).

Microsoft 365 (Word, Excel, Outlook, PowerPoint)

Everyone using Microsoft 365 tools is automatically opted in unless they take steps to opt out. You'll need to opt out in the Microsoft "connected experiences" portal and separately disable Copilot to keep your work truly private (source: Transparency Coalition).

Facebook & Instagram (Meta)

For U.S. users, Meta does not provide a direct opt-out option. The most effective solution is to make your Facebook and Instagram accounts private; Meta has said it will not use data from private accounts going forward (source: PIRG). EU users have additional rights under GDPR and can submit a formal objection through Meta's privacy form.

X (formerly Twitter)

X rolled out terms of service stating that by continuing to post, users automatically consent to X using their data to train its AI models, including Grok (source: PIRG). Your options are limited to deleting past posts or deactivating your account.

Step 3: Submit Opt-Out Requests to Data Brokers

Data brokers compile profiles on individuals and sell this information. Removing your data from these brokers is a good first move.

Priority brokers to contact:

  • Spokeo, WhitePages, BeenVerified, Intelius, Acxiom.

  • Search "data broker opt-out" on each company's site for their removal form.

  • This process is ongoing. New data brokers emerge and some may re-add your information over time, so regular checks are recommended.
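
Because brokers can re-add your information, it helps to log when each request went out and when to look again. Here is a minimal sketch of such a log. The `OptOutRequest` class and the 90-day interval are assumptions chosen to match a roughly quarterly recheck, not anything the brokers require.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: "quarterly" is approximated as 90 days.
RECHECK_INTERVAL = timedelta(days=90)


@dataclass
class OptOutRequest:
    """One opt-out request sent to a data broker."""
    broker: str
    submitted_on: date

    @property
    def recheck_on(self) -> date:
        # The date on which the broker's listing should be looked at again.
        return self.submitted_on + RECHECK_INTERVAL


def due_for_recheck(requests: list[OptOutRequest], today: date) -> list[str]:
    """Return the brokers whose recheck date has arrived or passed."""
    return [r.broker for r in requests if r.recheck_on <= today]


if __name__ == "__main__":
    brokers = ["Spokeo", "WhitePages", "BeenVerified", "Intelius", "Acxiom"]
    # Placeholder submission date; record the real date you sent each request.
    log = [OptOutRequest(name, date(2026, 2, 12)) for name in brokers]
    print(due_for_recheck(log, date.today()))
```

A spreadsheet does the same job. What matters is recording the submission date so the recheck actually happens.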

Step 4: Submit a Direct Privacy Request to AI Developers

OpenAI provides options through its Privacy Portal.

For other AI companies, visit their official privacy policy pages and look for "Data Subject Access Request" (DSAR) forms.

Step 5: Leverage Your Legal Rights

The right to be forgotten empowers individuals to request the deletion of their data.

When contacting data brokers or AI developers, mention the specific privacy law that grants you the right to data deletion.

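Know which laws protect you. The ones referenced in this guide are:

  • GDPR (EU): the right to erasure, plus the right to object to processing.

  • CCPA (California): the right to request deletion of your personal information.

  • AB 1008 (California): extends deletion requests to information embedded in AI models.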
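
Naming the specific law in every request is easier if you start from a reusable draft. The sketch below fills in a short message. The wording is illustrative rather than legally vetted, the function and field names are hypothetical, and the company and identity shown are placeholders.

```python
from datetime import date
from string import Template
from typing import Optional

# Illustrative wording only; adapt it to the law that applies to you.
REQUEST_TEMPLATE = Template(
    "Subject: Personal data deletion request under $law\n"
    "\n"
    "To the privacy team at $company,\n"
    "\n"
    "I am writing to request deletion of my personal data under $law.\n"
    "Please delete the personal information you hold about me, including\n"
    "any used to train AI models, and confirm in writing once this is done.\n"
    "\n"
    "Identifying details: $full_name, $email\n"
    "Date of request: $today\n"
)


def draft_deletion_request(
    company: str,
    law: str,
    full_name: str,
    email: str,
    today: Optional[date] = None,
) -> str:
    """Fill the template; the request date defaults to the current date."""
    sent_on = (today or date.today()).isoformat()
    return REQUEST_TEMPLATE.substitute(
        company=company, law=law, full_name=full_name, email=email, today=sent_on
    )


if __name__ == "__main__":
    # Placeholder company and identity.
    print(draft_deletion_request("Example AI Inc.", "GDPR", "Jane Example", "jane@example.com"))
```

Keep a copy of every message you send, with its date. That record is useful if a company is slow to respond.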

Step 6: Reduce Your Future Exposure

Audit your social media profiles and privatize or delete old posts. Add robots.txt blocks on your website to prevent AI crawlers from scraping your content. Use privacy-focused tools that minimize the personal data footprint you leave behind.
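
For the robots.txt step, here is a rough sketch that generates the blocking rules and then checks them with Python's built-in `urllib.robotparser`. GPTBot is the crawler named in this guide's checklist. The other user-agent tokens in the list are assumptions to confirm against each vendor's current crawler documentation. Keep in mind that robots.txt is a request that well-behaved crawlers honor, not an enforcement mechanism.

```python
from urllib.robotparser import RobotFileParser

# GPTBot comes from this guide's checklist. The remaining tokens are
# assumptions; verify current names in each vendor's crawler documentation.
AI_CRAWLERS = ["GPTBot", "Google-Extended", "CCBot", "ClaudeBot"]


def build_robots_txt(user_agents: list[str]) -> str:
    """Return robots.txt text that disallows the whole site for each agent."""
    blocks = [f"User-agent: {ua}\nDisallow: /\n" for ua in user_agents]
    # A blank line between blocks separates the per-agent rule groups.
    return "\n".join(blocks)


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check how a compliant crawler would interpret the rules for one URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = build_robots_txt(AI_CRAWLERS)
    print(rules)
    # example.com is a placeholder domain.
    print(is_blocked(rules, "GPTBot", "https://example.com/blog/post"))
```

Save the generated text as `robots.txt` at the root of your site. Crawlers that are not listed, such as ordinary search engine bots, are unaffected by these rules.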

The Honest Bottom Line

Privacy in the AI era is not something you can fix with one click. It is an ongoing effort that requires your attention.

The companies that collect your data have engineers working for them. You, on the other hand, have to fill out opt-out forms and file GDPR requests. It is clear that the advantage is not on your side.

However, small actions that you take consistently add up over time. Every time you submit an opt-out form, make your account private, or request that a data broker remove your information, it limits what they can do with your data. As global regulations become stricter, you gain control bit by bit.

Researchers at the University of California, Riverside have said that people should be able to know that their data can be completely erased from machine learning models, not just in theory but in ways that can be proven to work (source: University of California).

That future is being created now. In the meantime, you know what actions to take to protect your privacy.

Your Quick-Action Checklist

  • Search your name on HaveIBeenTrained.com

  • Opt out of ChatGPT training (OpenAI Privacy Portal)

  • Opt out of Gemini training (Activity settings)

  • Turn off LinkedIn Generative AI data setting

  • Set Facebook & Instagram accounts to Private

  • Submit removal requests to top 5 data brokers

  • Add a GPTBot block to your website's robots.txt

  • File a GDPR/CCPA request with AI companies that hold your data

  • Set a quarterly calendar reminder to recheck


Share this post if someone you know deserves to understand their data rights. The more people who opt out, the stronger the signal to the industry.

At Engage AI, our specialty is cutting through the noise and helping businesses like yours put AI to work in ways that deliver real, measurable results. Learn more about our services and book a consultation today.

Lance Blitzer

At Engage AI, we are a team of dedicated professionals committed to revolutionizing the way businesses operate through advanced automation solutions. With years of experience in the industry, we specialize in helping companies streamline their workflows, integrate tools seamlessly, and achieve greater efficiency with our user-friendly automation software. Our mission is to empower businesses to focus on growth and innovation, while we handle the repetitive tasks that slow them down.
