Insidr.ai

Raising Concerns: Tech Giants Utilize a Repository of 183,000 Books for AI Training

A recent report by The Atlantic revealed that some leading technology companies have used a library of roughly 183,000 books to train generative AI models. The collection spans a wide range of genres, from erotic fiction to prose poetry, and includes works by celebrated authors such as J.K. Rowling, Amitav Ghosh, Rupi Kaur, and Neil Gaiman. The authors themselves, however, were unaware of it.

The books in question, collectively referred to as “Books3,” are not only a subject of fascination but also a matter of legal dispute. They serve as training data for generative AI systems, helping these models learn how to communicate information. Their use, however, has raised numerous ethical and copyright concerns.

The utilization of Books3 in AI training

CNN’s investigative report notes that some AI training text can be sourced from publicly available online articles. The use of Books3, however, has already triggered a flurry of lawsuits, and Meta and other companies that used the dataset for AI development now find themselves in the legal crosshairs.

Authors who discovered their copyrighted novels in the dataset expressed their frustration and disbelief on social media, sharing screenshots to confirm the presence of their works. Mary H.K. Choi, the author of “Emergency Contact” and a New York Times bestseller, took to social media to voice her anguish, saying, “I’m completely gutted and whipsawed. I am outraged and at the same time feel utterly helpless.”

In an interview with CNN, Choi articulated her concern further:

"A book encapsulates infinite choices, boundless permutations, and even shortcomings of the author at the time. To think that all this life can be chucked into a vast churning pool to be extruded into a giant algorithmic, generative sausage machine reduces so much so swiftly. Not just financially for the authors but it beggars booksellers, librarians, and readers from so many intimacies."

Min Jin Lee, the acclaimed author of “Pachinko” and “Free Food for Millionaires,” shared her disappointment, describing the use of her books as “theft.” She said, “I spent three decades of my life to write my books. The AI large language models did not ‘ingest’ or ‘scrape’ ‘data.’ AI companies stole my work, time, and creativity. They stole my stories. They stole a part of me.”

While some authors are distressed by this unanticipated use of their work, author James Chappel holds a different perspective. He welcomes the opportunity for his book to be read and to educate, showing a contrasting viewpoint among authors affected by this AI training practice.

In response to the growing concerns, a spokesperson for Bloomberg stated that the company had employed various data sources, including Books3, to train its initial BloombergGPT model, designed for the financial industry. However, the spokesperson assured that Books3 would not be included among the data sources used for future commercial versions of BloombergGPT. This marks a step towards addressing the apprehensions surrounding the use of copyrighted material in AI training.
