Discern Report

Researcher Releases AI Training Dataset Based on Posts to Leftist Echo Chamber Bluesky

by Breitbart
December 4, 2024
in Aggregated, News

A machine learning researcher has released a massive dataset containing one million public posts from leftist social media echo chamber Bluesky, raising questions about data privacy and consent. The data could potentially be used to train AI to be even more woke than notoriously left-leaning AI chatbots like ChatGPT.

404 Media reports that Daniel van Strien, a machine learning librarian at AI community platform Hugging Face, released a dataset of one million Bluesky posts, a move that has raised concerns about user privacy and consent. The dataset, intended for machine learning research, includes the text content of each post along with metadata such as the time of posting and the user’s decentralized identifier (DID).



Van Strien announced the dataset on Bluesky last week, stating, “This dataset contains 1 million public posts collected from Bluesky Social’s firehose API, intended for machine learning research and experimentation with social media data. Each post contains text content, metadata, and information about media attachments and reply relationships.”

Although the data was collected from Bluesky’s public firehose API, which aggregates all public data updates on the platform in real time, the inclusion of user DIDs has raised privacy concerns. The dataset is not anonymous, and van Strien also created a search tool for finding users based on their DID, which he published on Hugging Face.

A quick review of the dataset reveals that it contains a wide range of content, from political discussions and concert chatter to pornography. Notably, the dataset is a snapshot of Bluesky at a specific point in time, meaning it may include posts that have since been deleted by users. […]

— Read More: www.breitbart.com

Site Operated By JD Rucker.

© 2023 America First Report.
