Existing federal law, through copyright, provides authors of original works of authorship, as defined, with certain rights and protections. Existing federal law generally gives the owner of the copyright the right to reproduce the work in copies or phonorecords and the right to distribute copies or phonorecords of the work to the public. Existing federal law provides that sound recordings fixed before February 15, 1972, are not subject to copyright, but are subject to similar rights and protections under the Classics Protection and Access Act.
Existing law requires a developer of a generative artificial intelligence system or service, as defined, to post on the developer's internet website documentation, as specified, regarding the data used to train the system or service. Existing law requires the developer to do so on or before January 1, 2026, and before each time thereafter that a generative artificial intelligence system or service, or a substantial modification to one, released on or after January 1, 2022, is made available to Californians for use, regardless of whether the terms of that use include compensation.
This bill would require a developer of a generative artificial intelligence model to, among other things, document any covered materials that the developer knows were used to train the model. The bill would require the developer to make available on the developer's internet website a mechanism that allows a rights owner to submit a request for information about the developer's use of covered materials and to provide the developer with, among other things, registration, preregistration, or index numbers and fingerprints for one or more covered materials. The bill would require a developer, within 7 days of receiving that request from the rights owner, to assess whether the covered material represented by a fingerprint provided by the rights owner is likely to be present in the developer's dataset and to provide the rights owner with a list of the rights owner's covered materials that were used to train the model and are likely to be present in the developer's dataset, as specified. The bill would provide that each day following the 7-day period on which a developer fails to provide a rights owner with that information constitutes a discrete violation. The bill would authorize a rights owner that is not provided with information according to these provisions to bring a civil action against the developer for specified relief. The bill would provide that the bill's requirements do not apply to a developer that makes all of the data used to train the model publicly available at no cost, as specified. The bill would define various terms for these purposes.
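
The 7-day response period and the per-day violation rule described above can be illustrated with a short date calculation. This is only a sketch under stated assumptions, not a reading of the bill text: it assumes calendar days, treats the day of receipt as day zero, treats a response on the seventh day as timely, and counts each later day up to and including the day of a late response (or the day of evaluation, if no response has been given). The function names and the constant are hypothetical and do not appear in the bill.

```python
from datetime import date, timedelta
from typing import Optional

# From the digest: the developer must respond "within 7 days of receiving that request".
# Assumption: calendar days, with the day of receipt counted as day zero.
RESPONSE_WINDOW_DAYS = 7


def response_deadline(received: date) -> date:
    """Last day on which a response is treated as timely (assumed)."""
    return received + timedelta(days=RESPONSE_WINDOW_DAYS)


def discrete_violations(received: date, responded: Optional[date], as_of: date) -> int:
    """Count the days following the 7-day period without the required response.

    Assumption: each day after the deadline counts as one violation, up to and
    including the day of a late response, or `as_of` if no response was given.
    """
    deadline = response_deadline(received)
    end = responded if responded is not None else as_of
    return max(0, (end - deadline).days)


# Request received March 1, 2026: the assumed deadline is March 8, 2026.
received = date(2026, 3, 1)
timely = discrete_violations(received, date(2026, 3, 8), as_of=date(2026, 3, 20))
late = discrete_violations(received, date(2026, 3, 11), as_of=date(2026, 3, 20))
unanswered = discrete_violations(received, None, as_of=date(2026, 3, 18))
```

Under these assumptions, a response on March 8 yields no violations, a response on March 11 yields three, and a request still unanswered on March 18 yields ten. How days are actually counted, and whether the day of a late response is included, would depend on the bill's operative text.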