Microsoft’s Azure Face AI Platform Phasing Out Emotion, Identity Measurements
June 23, 2022 by Dave Haynes
Microsoft is phasing out some key capabilities of its Azure Face audience detection and measurement system, such as identifying viewers' age range, gender, and emotional state.
These are commonly marketed capabilities of the computer vision companies active in the digital signage sector, touted as ways to understand the characteristics of people who are looking at visual messaging and advertising displays, and in some cases to serve messages in near real-time based on those attributes. For example, a beauty brand might serve messaging for different hair care products based on gender.
Candidly, I did not know Microsoft even had Azure Face and offered these capabilities – but it may be something in use by some CMS software firms that focus on Windows and use Microsoft's Azure cloud services. If so, API access to those tools is going away.
In an announcement via the tech giant's corporate blog, Microsoft explained it was releasing a new framework for building AI systems responsibly, based on learnings to date and feedback from users and observers. The announcement/blog post is from the company's Chief Responsible AI Officer (a title I had never seen before), Natasha Crampton. It gets into areas such as speech-to-text and neural voice, but also addresses the challenges with audience detection and measurement tools:
Finally, we recognize that for AI systems to be trustworthy, they need to be appropriate solutions to the problems they are designed to solve. As part of our work to align our Azure Face service to the requirements of the Responsible AI Standard, we are also retiring capabilities that infer emotional states and identity attributes such as gender, age, smile, facial hair, hair, and makeup.
Taking emotional states as an example, we have decided we will not provide open-ended API access to technology that can scan people’s faces and purport to infer their emotional states based on their facial expressions or movements. Experts inside and outside the company have highlighted the lack of scientific consensus on the definition of “emotions,” the challenges in how inferences generalize across use cases, regions, and demographics, and the heightened privacy concerns around this type of capability.
We also decided that we need to carefully analyze all AI systems that purport to infer people’s emotional states, whether the systems use facial analysis or any other AI technology. The Fit for Purpose Goal and Requirements in the Responsible AI Standard now help us to make system-specific validity assessments upfront, and our Sensitive Uses process helps us provide nuanced guidance for high-impact use cases, grounded in science.
These real-world challenges informed the development of Microsoft’s Responsible AI Standard and demonstrate its impact on the way we design, develop, and deploy AI systems.
In a separate post on the Azure site, the technology’s group product manager Sarah Bird explains:
Effective today, new customers need to apply for access to use facial recognition operations in Azure Face API, Computer Vision, and Video Indexer. Existing customers have one year to apply and receive approval for continued access to the facial recognition services based on their provided use cases. By introducing Limited Access, we add an additional layer of scrutiny to the use and deployment of facial recognition to ensure use of these services aligns with Microsoft’s Responsible AI Standard and contributes to high-value end-user and societal benefit. This includes introducing use case and customer eligibility requirements to gain access to these services. Read about example use cases, and use cases to avoid, here.
Starting June 30, 2023, existing customers will no longer be able to access facial recognition capabilities if their facial recognition application has not been approved. Submit an application form for facial and celebrity recognition operations in Face API, Computer Vision, and Azure Video Indexer here, and our team will be in touch via email.
Facial detection capabilities (including detecting blur, exposure, glasses, head pose, landmarks, noise, occlusion, and facial bounding box) will remain generally available and do not require an application.
In another change, we will retire facial analysis capabilities that purport to infer emotional states and identity attributes such as gender, age, smile, facial hair, hair, and makeup. We collaborated with internal and external researchers to understand the limitations and potential benefits of this technology and navigate the tradeoffs. In the case of emotion classification specifically, these efforts raised important questions about privacy, the lack of consensus on a definition of “emotions,” and the inability to generalize the linkage between facial expression and emotional state across use cases, regions, and demographics. API access to capabilities that predict sensitive attributes also opens up a wide range of ways they can be misused—including subjecting people to stereotyping, discrimination, or unfair denial of services.
To mitigate these risks, we have opted to not support a general-purpose system in the Face API that purports to infer emotional states, gender, age, smile, facial hair, hair, and makeup. Detection of these attributes will no longer be available to new customers beginning June 21, 2022, and existing customers have until June 30, 2023, to discontinue use of these attributes before they are retired.
While API access to these attributes will no longer be available to customers for general-purpose use, Microsoft recognizes these capabilities can be valuable when used for a set of controlled accessibility scenarios. Microsoft remains committed to supporting technology for people with disabilities and will continue to use these capabilities in support of this goal by integrating them into applications such as Seeing AI.
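For readers on the technical side of CMS development, the detection-only features Bird says will remain generally available map to a single, stateless Face API call. Here is a rough sketch of what that looks like in Python – the endpoint, key, and image URL are placeholders, and you would want to check Microsoft's current documentation before relying on any of it:

```python
import requests

# Placeholder resource endpoint and key, for illustration only.
ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"
KEY = "<your-subscription-key>"

resp = requests.post(
    f"{ENDPOINT}/face/v1.0/detect",
    params={
        # returnFaceId=false keeps this strictly in "detection" territory:
        # no identifier comes back that could later be matched to a database.
        "returnFaceId": "false",
        "returnFaceLandmarks": "true",
        # Only the attributes the Azure post says remain generally available.
        "returnFaceAttributes": "headPose,glasses,blur,exposure,noise,occlusion",
    },
    headers={"Ocp-Apim-Subscription-Key": KEY, "Content-Type": "application/json"},
    json={"url": "https://example.com/sample-frame.jpg"},  # placeholder image
)
resp.raise_for_status()
for face in resp.json():
    print(face["faceRectangle"], face["faceAttributes"]["headPose"])
```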
So Microsoft is not backing away from face pattern detection, but it has rethought what it does and how it handles facial recognition. What always seems to get lost among people new to this tech, and weirdly some of the people marketing it, is that detection is anonymous (the AI/machine learning just looks at the understood geometry of faces), while recognition is matching a captured face image against a stored database of faces.
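In Azure's own API terms, the distinction is easy to see. Detection is the stateless call sketched above; recognition means taking a faceId from a detect call (made with returnFaceId set to true) and matching it against a stored group of enrolled people. A rough sketch, with placeholder values throughout – and this identify operation is the part that now sits behind the Limited Access application:

```python
import requests

ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"
HEADERS = {
    "Ocp-Apim-Subscription-Key": "<your-subscription-key>",
    "Content-Type": "application/json",
}

# Recognition: match a previously detected faceId against a stored,
# pre-enrolled database of people (a PersonGroup in Azure's terms).
resp = requests.post(
    f"{ENDPOINT}/face/v1.0/identify",
    headers=HEADERS,
    json={
        "faceIds": ["<faceId-from-a-detect-call>"],  # placeholder
        "personGroupId": "enrolled-customers",       # hypothetical group name
        "maxNumOfCandidatesReturned": 1,
        "confidenceThreshold": 0.7,
    },
)
for result in resp.json():
    # An empty candidates list means no match in the stored database.
    print(result.get("candidates", []))
```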
The general public is much more prone to being upset by recognition than by detection, particularly when the distinction is explained well and made readily apparent. There have been plenty of stories in mainstream media suggesting Digital OOH screens "are watching you!" when all they're really doing is anonymously detecting faces and logging how many and when.
The tech that detects faces and suggests the people behind those faces are happy or sad has never struck me as much more than a gimmick for trade show demos. Maybe it is widely used and I am just unaware, but I don't think so. I have stood in front of numerous trade show demos through the years that purported to detect emotional state, and the AI has not had much luck decoding my natural stone-faced state.
The broader issue is not with marketing activities, but the dystopian risks of things like individual workers being measured and assessed on the job based on their apparent happiness levels.
There is plenty of value in computer vision for audience measurement – how many people look, how long they look, and how that changes by time, date, and location – and a bunch of value in fairly low-tech uses like parking garage occupancy counts that get relayed to screens.
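That kind of anonymous counting doesn't even need a cloud service. A minimal sketch using OpenCV's bundled Haar cascade face detector – my choice for illustration, not something Microsoft ships or any particular vendor's approach – shows how little actually needs to be logged:

```python
import datetime
import cv2  # pip install opencv-python

# OpenCV ships Haar cascade files with the package; this one finds
# frontal faces as plain bounding boxes -- geometry only, no identity.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

frame = cv2.imread("camera_frame.jpg")  # placeholder frame from a venue camera
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

# All that gets stored is a timestamp and a count -- nothing that could
# identify anyone.
print(f"{datetime.datetime.now().isoformat()},{len(faces)}")
```

That's the whole privacy footprint for the counting use case: a number and a time.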