Meta purportedly trained its AI on more than 80TB of pirated content and then open-sourced Llama for the greater good

Meta is facing a class-action lawsuit alleging copyright infringement and unfair competition over the training of its AI model, Llama.

According to court documents released by vx-underground, Meta allegedly downloaded nearly 82TB of pirated books from shadow libraries such as Anna’s Archive, Z-Library, and LibGen to train its AI systems.

Internal discussions reveal that some employees raised ethical concerns as early as 2022, with one researcher explicitly stating, “I don’t think we should use pirated material,” while another said, “Using pirated material should be beyond our ethical threshold.”

Meta made efforts to avoid detection

Despite these concerns, Meta appears to have not only ploughed on but also taken steps to avoid detection. In April 2023, an employee warned against using corporate IP addresses to access pirated content, while another said that “torrenting from a corporate laptop doesn’t feel right,” adding a laughing emoji.

Meta employees also reportedly discussed ways to prevent Meta’s infrastructure from being directly linked to the downloads, raising questions about whether the company knowingly bypassed copyright laws.

In January 2023, Meta CEO Mark Zuckerberg reportedly attended a meeting where he pushed for AI implementation at the company despite internal objections.

Meta isn't alone in facing legal challenges over AI training. OpenAI has been sued multiple times for allegedly using copyrighted books without permission, including a case filed by The New York Times in December 2023.

Nvidia is also under legal scrutiny for training its NeMo model on nearly 200,000 books, and a former employee disclosed that the company scraped over 426,000 hours of video daily for AI development.

And in case you missed it, OpenAI recently claimed that DeepSeek unlawfully obtained data from its models, highlighting the ongoing ethical and legal dilemmas surrounding AI training practices.

Via Tom's Hardware
