JMIR Preprints #54369: Can Large Language Models "Read Your Mind in Your Eyes"?

Can Large Language Models "Read Your Mind in Your Eyes"?

Zohar Elyoseph;
Elad Refoua;
Kfir Asraf;
Maya Lvovsky;
Yoav Shimoni;
Dorit Hadar-Shoval

ABSTRACT

Background:

Mentalization, integral to human cognitive processes, pertains to the interpretation of one's own and others' mental states, including emotions, beliefs, and intentions. With the advent of artificial intelligence (AI) and the prominence of large language models (LLMs) in mental health applications, questions persist about their aptitude in emotional comprehension. An earlier model, ChatGPT-3.5, demonstrated an advanced capacity to interpret emotions from textual data, surpassing human benchmarks. Given the introduction of ChatGPT-4, with its enhanced visual processing capabilities, and considering Bard's existing visual functionalities, a rigorous assessment of their proficiency in visual mentalizing is warranted.

Objective:

This study aimed to evaluate how well ChatGPT-4 and Google Bard discern visual mentalizing indicators, as contrasted with their text-based mentalizing abilities.

Methods:

We employed the Reading the Mind in the Eyes Test (RMET) developed by Baron-Cohen to assess the models' proficiency in interpreting visual emotional indicators. Simultaneously, the Levels of Emotional Awareness Scale (LEAS) was used to evaluate the LLMs' aptitude in textual mentalizing. Collating data from both tests provided a holistic view of the mentalizing capabilities of ChatGPT-4 and Bard.

Results:

• ChatGPT-4 RMET. ChatGPT-4, displaying a pronounced ability in emotion recognition, secured scores of 26 and 27 in two distinct evaluations, significantly deviating from a random response paradigm. These scores align with established benchmarks from the broader human demographic. Notably, ChatGPT-4 exhibited consistent responses, with no discernible biases pertaining to the gender of the model or nature of the emotion.

• Google Bard RMET. By contrast, Bard's performance aligned with random response patterns, securing scores of 10 and 12, rendering further detailed analysis redundant.

• LEAS. In the domain of textual analysis, both ChatGPT and Bard surpassed established benchmarks from the general population, with their performances being remarkably congruent.
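
To make the comparison with random responding concrete, the following is a minimal sketch of a one-sided binomial tail calculation applied to the four RMET scores reported above. It rests on assumptions not stated in this abstract: that the RMET version used has 36 items with four response options each (as in the commonly used revised version), so that chance performance is 25% per item, and that items are treated as independent. The function name `binomial_tail` and the constants are illustrative, and this is not necessarily the statistical test the authors applied.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumptions (not stated in the abstract): 36 items, 4 options per item.
N_ITEMS = 36
P_CHANCE = 0.25

# RMET scores as reported in the Results above.
reported_scores = [
    ("ChatGPT-4", 26),
    ("ChatGPT-4", 27),
    ("Google Bard", 10),
    ("Google Bard", 12),
]

for model, score in reported_scores:
    tail = binomial_tail(score, N_ITEMS, P_CHANCE)
    # A small tail probability means the score is unlikely under pure guessing.
    print(f"{model}: score {score}/{N_ITEMS}, P(X >= {score} | guessing) = {tail:.2e}")
```

Under these assumptions, guessing yields an expected score of 9. Scores of 26 and 27 fall far in the upper tail of that distribution, whereas scores of 10 and 12 sit close to its center, which is consistent with the contrast described in the Results.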

Conclusions:

ChatGPT-4 proved its efficacy in the domain of visual mentalizing, aligning closely with human performance standards. Although both models displayed commendable acumen in textual emotion interpretation, Bard's capabilities in visual emotion interpretation necessitate further scrutiny and potential refinement.

Citation

Please cite as:

Elyoseph Z, Refoua E, Asraf K, Lvovsky M, Shimoni Y, Hadar-Shoval D

Capacity of Generative AI to Interpret Human Emotions From Visual and Textual Data: Pilot Evaluation Study

JMIR Ment Health 2024;11:e54369

DOI: 10.2196/54369

PMID: 38319707

PMCID: 10879976

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR Mental Health

Date Submitted: Nov 7, 2023

Date Accepted: Dec 25, 2023
