
A new bill seeks to reveal what is hidden in the training data

Schiff's bill would require AI companies to disclose the copyrighted materials in their training data

Developers of AI models claim their models are trained on publicly available data, but the sheer amount of information means they don't know specifically which data is copyrighted. The companies also argue that their use of copyrighted materials qualifies as fair use. Many of them have begun offering legal cover to customers who are sued for using copyrighted works.

Under the bill, companies would have to submit a report at least 30 days before the public can see the model. The bill would not apply retroactively to existing AI platforms unless changes are made to their training datasets after it becomes law.

Schiff's bill garnered support from industry groups like the Writers Guild of America (WGA), the Recording Industry Association of America (RIAA), the Directors Guild of America (DGA), the Screen Actors Guild – American Federation of Television and Radio Artists (SAG-AFTRA), and the Authors Guild. Notably absent is the Motion Picture Association, which usually backs moves to protect copyrighted work from piracy. (Disclosure: The Verge's editorial staff is unionized with the Writers Guild of America, East.)

What you can and can't do to keep your data out of generative AI training: the case of Facebook, Google, and X

As the lawsuits and investigations around generative AI and its opaque data practices pile up, there have been some small moves to give people more control over what happens to the data they post online. Some companies now let customers opt out of having their content used for training. Here's what you can and can't do.

Mireshghallah explains that companies can make it complicated to opt out of having data used for AI training, and even where it is possible, many people don't have a "clear idea" about the permissions they've agreed to or how their data is being used. Where opt-outs do exist, they are often driven by regulation, such as the EU's strong privacy laws. Facebook, Google, X, and other companies have written into their privacy policies that they may use your data to train AI.

While there are various technical ways AI systems could have data removed from them or "unlearn" it, Mireshghallah says, very little is known about the processes that are actually in place. The options can be buried or labor-intensive, and getting posts removed from data that has already been used for training is likely to be futile. When companies do offer opt-outs for future data sharing, users are almost always opted in by default.

The Electronic Frontier Foundation says companies add this friction because they know most people won't go looking for the setting. "Opt-in would be a purposeful action, as opposed to opting out, where you have to know it's there."