Bluesky, Open API and data scraping
Bluesky dataset for AI training removed from Hugging Face
On 26 November, Daniel van Strien, a machine learning librarian at Hugging Face, uploaded a dataset of 1m public posts and accompanying metadata taken from Bluesky’s firehose API. The dataset card explained it was “intended for machine learning research and experimentation with social media data”.
Bluesky’s Open API Makes User Data Vulnerable to Scraping
Bluesky is facing its first major controversy over data scraping after a dataset containing one million public posts appeared on the AI platform Hugging Face, 404 Media reports. Compiled by Daniel van Strien,
Bluesky Open API: Data Scrapers May Access Firehose for AI Training, as Demoed by Hugging Face
Bluesky's Firehose is an open API, which is also its weakness: anyone can scrape its data for purposes such as AI training.
One million public Bluesky posts scraped for AI training
Bluesky is already facing its first major AI scrape, despite the stance of its owners that it will never train generative AI on user data.
Hugging Face’s Dataset Release Exposes 1M Bluesky Posts for Research
A Hugging Face librarian released and later removed a 1 million Bluesky posts dataset, sparking concerns over data transparency and consent. Daniel van
Bluesky Faces Backlash Over User Data Scraping for AI Training
Bluesky, the social media platform often seen as a rival to Twitter, is at the center of a controversy after one million of its public posts... (News Track)
Twitter rival Bluesky’s user posts scraped for training AI
Bluesky user posts and user information were scraped by an AI researcher, built into a dataset, and published on the open platform Hugging Face.
Bluesky: No Generative AI Training from User Data, But External Use May Persist
Although Bluesky itself doesn’t train AI models on user data, it doesn’t prevent others from using its data for training purposes.
Bluesky’s open API means anyone can scrape your data for AI training
Bluesky might not be training AI systems on user content as other social networks are doing, but there’s little stopping third parties from doing so. Per a report by 404 Media, a machine learning librarian at AI firm Hugging Face pulled 1 million public posts from Bluesky via its Firehose API for machine learning research,
Werd I/O on MSN
23h
Bluesky, AI, and the battle for consent on the open web
Daniel van Strien, a machine learning librarian at Hugging Face, took a million Bluesky posts and turned them into a dataset ...
Daily Tech News Show
18h
Bluesky Scraper – DTNS 4905
President of SCE Worldwide Studios, Shuhei Yoshida, steps down after 38 years. Scott shares his thoughts on Yoshida’s ...