Sunday, April 20, 2025

Google Unveils Google-Extended: A New Tool Empowering Website Publishers to Control Data Use for AI Training

Google introduces Google-Extended, allowing publishers to opt out of data utilization for AI model training while staying accessible on Google Search.

In a move to give website publishers more control over their data, Google has announced the launch of “Google-Extended,” a novel tool designed to enable publishers to manage their data’s usage in training the company’s AI models. The new feature allows websites to continue being crawled and indexed by Googlebot while avoiding their data being incorporated into the development of AI models.

Enhanced Control Over AI Training Data

Google-Extended provides website publishers with the ability to decide whether their sites contribute to improving Bard and Vertex AI generative APIs, offering a unique level of control over their content’s accessibility on the web. This development comes in response to growing concerns regarding the use of publicly available data scraped from the web to train AI systems, particularly after Google confirmed its use for training its AI chatbot, Bard.

The tool’s implementation is facilitated through the robots.txt file, a widely used text document that instructs web crawlers about which parts of a website can be accessed. By using Google-Extended, publishers can now manage their data preferences more efficiently.
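
As a rough illustration of the robots.txt mechanism described above, the sketch below embeds a sample rule set that disallows the Google-Extended token while leaving every other user agent, including Googlebot, unrestricted. It then checks both with Python's standard-library robots.txt parser. The URL, the sample rules, and the helper name `is_allowed` are assumptions made for this example. The parser only shows how a rule-respecting client would read the file, not how Google itself processes it.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt (an assumption for illustration): opt out of
# Google-Extended while leaving all other user agents unrestricted.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical page on a placeholder domain.
    url = "https://example.com/news/story.html"
    for agent in ("Googlebot", "Google-Extended"):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, agent, url) else "blocked"
        print(f"{agent}: {verdict}")
```

With these rules, a publisher stays crawlable for Search while signalling that the content should not be used for the AI products named above. Narrowing `Disallow: /` to a specific path would limit the opt-out to that part of the site.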

Adapting to Evolving AI Landscape

Google acknowledges the expanding landscape of AI applications and pledges to explore additional machine-readable approaches to empower web publishers with even more choices and control. The company assures that it will provide further updates in the near future.

Navigating the Complex Web of Data Usage

The introduction of Google-Extended reflects a broader trend among website publishers who are increasingly concerned about the use of their data for AI training. Many prominent sites, including The New York Times, CNN, Reuters, and Medium, have already taken measures to block web crawlers used by organizations like OpenAI for data scraping and AI model training.

However, distinguishing Google from other web crawlers presents a unique challenge. Complete blocking of Google’s crawlers is not a viable option for many websites, as it would result in them being excluded from Google Search results. To address this issue, some sites, like The New York Times, have resorted to legal measures by updating their terms of service to prohibit companies from using their content for AI training.

Anup
https://techrefreshing.com/
Anup is a passionate tech enthusiast and the creator of TechRefreshing.com. With expertise in Crypto, Linux, AI, and emerging technologies, Anup shares insights, tutorials, and tips to keep readers informed and ahead in the ever-evolving tech world. When not writing, Anup explores the latest gadgets and innovations shaping the future.
