AI-generated empathy has its limits, new research suggests.


Conversational agents (CAs) such as Alexa and Siri are designed to answer questions, offer suggestions — and even show empathy. However, new research shows that they perform worse than humans when it comes to interpreting and exploring a user's experience.

CAs are powered by large language models (LLMs) that digest vast amounts of human-generated data, and thus may carry the same biases as the humans who produced that data.

Researchers at Cornell University, Olin College, and Stanford University tested this theory by prompting CAs to display empathy while interacting with roughly 65 distinct human identities.

The team found that CAs make value judgments about certain identities — such as gay and Muslim — and can encourage identities related to harmful ideologies, including Nazism.

“I think automated empathy can have a huge impact and huge potential for positive things — for example, in education or health care,” said lead author Andrea Cuadra, who is now a postdoctoral researcher at Stanford.

“It's highly unlikely that this (automated empathy) won't happen,” she said, “so it's important that as it happens, we have a critical approach so that we can be more deliberate about minimizing the potential harms.”

Cuadra will present “The Illusion of Empathy? Notes on Displays of Emotion in Human-Computer Interaction” at CHI '24, the Association for Computing Machinery conference on Human Factors in Computing Systems, May 11-18 in Honolulu. Research co-authors from Cornell University included Nicola Dell, associate professor of information science; Deborah Estrin, professor of computer science; and Malte Jung, associate professor of information science.

The researchers found that, in general, LLMs scored high for emotional reactions but low for interpretations and explorations. In other words, LLMs can respond to a query based on their training, but are unable to dig deeper.

Dell, Estrin and Jung said the inspiration for this work came while Cuadra was studying the use of earlier-generation CAs by older adults.

“She observed interesting uses of the technology for transactional purposes, such as health frailty assessments, as well as for open-ended reminiscence,” Estrin said. “Along the way, she observed clear examples of the tension between compelling and disturbing ‘empathy.’”

Funding for this research came from the National Science Foundation; a Cornell Tech Digital Life Initiative Doctoral Fellowship; a Stanford PRISM Baker Postdoctoral Fellowship; and the Stanford Institute for Human-Centered Artificial Intelligence.

