Privacy Issues in the Creation of Facial Recognition Technology
Concerns regarding individuals' privacy rights in connection with facial recognition technologies have percolated over the past year, and recently peaked in connection with law enforcement's use of the technology. Law enforcement's use of facial recognition is one facet of the privacy concern, and it led in part to the City of San Francisco banning the technology's use by city law enforcement and local agencies in 2019. But there is an even more preliminary concern arising from how these technologies are built. Facial recognition software works in part by amassing a large collection of images against which a source image is then compared. These image collections are therefore a key factor in providing effective facial recognition, and considering how these databases are constructed is the foundational privacy building block in these technologies.
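The comparison step described above can be sketched in simplified form. A real system converts each face image into a numeric embedding with a trained model and finds the closest match in its collection; the embedding values below are purely illustrative stand-ins.

```python
import math

def cosine_similarity(a, b):
    """Similarity between two embedding vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical gallery: in practice, each vector would be derived from a
# face image in the amassed collection by a trained recognition model.
gallery = {
    "person_a": [0.9, 0.1, 0.2],
    "person_b": [0.1, 0.8, 0.3],
}

# Embedding of the source image to be identified (also illustrative).
probe = [0.85, 0.15, 0.25]

# The source image is "recognized" as the most similar gallery entry.
best_match = max(gallery, key=lambda name: cosine_similarity(probe, gallery[name]))
print(best_match)
```

The quality of the match depends directly on how large and diverse the gallery is, which is why building these image collections is central to the technology.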
In order for facial recognition technologies to effectively recognize faces, the recognition model must be trained on a robust and comprehensive data set, namely, photos of human faces. This raises the question of how a massive and comprehensive data set of photographs of faces is built. One answer is web scraping, a tool commonly deployed in the artificial intelligence space. At a very simplistic level, web scraping utilizes software to extract data from a website, often from websites that host large amounts of publicly available data. Even though this data is publicly available, individuals may not know of the extraction or consent to it, and a web platform's terms of service may expressly prohibit it.
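At that simplistic level, a scraper downloads a page and pulls out the data it wants, such as image URLs. The sketch below uses only Python's standard library to illustrate the extraction step; the HTML snippet stands in for a page that a real scraper would first fetch over HTTP, and the URLs in it are hypothetical.

```python
from html.parser import HTMLParser

class ImageScraper(HTMLParser):
    """Collects the src attribute of every <img> tag on a page."""
    def __init__(self):
        super().__init__()
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.image_urls.append(src)

# A static snippet stands in for a downloaded profile page; a real
# scraper would fetch pages in bulk with an HTTP client.
page = (
    '<html><body>'
    '<img src="/photos/alice.jpg">'
    '<img src="/photos/bob.jpg">'
    '</body></html>'
)

scraper = ImageScraper()
scraper.feed(page)
print(scraper.image_urls)
```

Each extracted URL would then be downloaded and added to the training collection, which is exactly the step that can occur without the pictured individuals' knowledge or consent.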
For example, IBM received criticism for scraping images from Flickr (an image hosting website) in an effort to build a more racially diverse and comprehensive data set. Additionally, Clearview AI scrapes platforms like Facebook, YouTube, Google, and Twitter to build its data set for facial recognition, and it received cease and desist letters from several of these companies regarding the practice. In each of these cases, while the images scraped are often publicly available, the scraping occurs without the consent of individuals and often in violation of the platform's terms of service.
So what recourse does a user of a platform have when a third-party company utilizes web scraping to build a proprietary facial recognition platform? Unfortunately, there is often no recourse. However, two states' laws provide some insight into how these facial recognition technologies may be regulated by states moving forward. First, the Illinois Biometric Information Privacy Act prohibits a private entity from collecting an individual's biometric information, including a photo with face geometry, without first providing notice to and obtaining consent from the Illinois resident. The Act provides a private right of action for enforcement against entities that violate its provisions. Second, the California Consumer Privacy Act provides that a business that collects personal information from a California consumer must first give the consumer notice of the collection, including the uses of the information, and it grants California consumers rights over their data, such as the right to deletion. The California attorney general has enforcement authority over violations of that Act.
While the privacy concerns associated with building data sets for facial recognition technologies are only one facet of the privacy prism these technologies create, they are foundational, and companies must understand and adhere to the applicable rules when building their technologies.