Privacy Regulations in California Now Extend to AI
On September 28, 2024, Governor Newsom signed a number of bills concerning artificial intelligence (AI) into law. Among those laws were two that particularly affect the right to privacy as it applies to AI.
AB 1008[1] (Personal Information and AI Systems – effective January 1, 2025)
This bill amended the California Consumer Privacy Act (CCPA) to specify that personal information can exist in various formats, including artificial intelligence systems that can disclose such information. The new definition is as follows:
(4) “Personal information” can exist in various formats, including, but not limited to, all of the following:
- Physical formats, including paper documents, printed images, vinyl records, or video tapes.
- Digital formats, including text, image, audio, or video files.
- Abstract digital formats, including compressed or encrypted files, metadata, or artificial intelligence systems that are capable of outputting personal information. (emphasis added)
The change in the law extends the CCPA’s privacy protections to AI systems, such as large language models, that are capable of collecting, storing, and disclosing personal information. This also means that consumers may have a right to request that their data be deleted from the AI’s data pool.
Another significant AI bill signed by Governor Newsom is AB 2013,[2] which requires developers of AI models to disclose information about the data used to train those models, including, among other requirements:
- The sources or owners of the datasets.
- A description of how the datasets further the intended purpose of the artificial intelligence system or service.
- The number of data points included in the datasets, which may be in general ranges, and with estimated figures for dynamic datasets.
- A description of the types of data points within the datasets. For purposes of this paragraph, the following definitions apply:
  - As applied to datasets that include labels, “types of data points” means the types of labels used.
  - As applied to datasets without labeling, “types of data points” refers to the general characteristics.
- Whether the datasets include any data protected by copyright, trademark, or patent, or whether the datasets are entirely in the public domain.
- Whether the datasets were purchased or licensed by the developer.
- Whether the datasets include personal information, as defined in subdivision (v) of Section 1798.140.
- Whether the datasets include aggregate consumer information, as defined in subdivision (b) of Section 1798.140.
…
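For teams that maintain training datasets, the disclosure items listed above can be tracked as one structured record per dataset. The following is a minimal sketch in Python, not a compliance tool: the class name, field names, and helper function are hypothetical and do not come from the statute, and the record covers only the items quoted above, not the requirements elided from the list.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DatasetDisclosure:
    """Hypothetical per-dataset record of the disclosure items quoted above.

    Field names are illustrative assumptions, not statutory terms.
    A value of None means the item has not been answered yet.
    """
    sources_or_owners: Optional[str] = None
    purpose_description: Optional[str] = None
    data_point_count_range: Optional[str] = None  # general ranges or estimates
    data_point_types: Optional[str] = None        # label types, or general characteristics
    includes_protected_ip: Optional[bool] = None  # copyright, trademark, or patent
    purchased_or_licensed: Optional[bool] = None
    includes_personal_information: Optional[bool] = None
    includes_aggregate_consumer_information: Optional[bool] = None


def unanswered_items(record: DatasetDisclosure) -> List[str]:
    """Return the names of the fields that are still unset (None)."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]


# Example with made-up values: two items answered, the rest still open.
example = DatasetDisclosure(
    sources_or_owners="Example Corp (hypothetical)",
    includes_personal_information=True,
)
print(unanswered_items(example))
```

A record like this only helps surface which items still need an answer for a given dataset. Whether a particular answer satisfies the law is a legal question that the sketch does not address.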
As AI continues to proliferate, so will the statutes designed to provide transparency and protect individuals’ personal information.