Generative AI tools built on Large Language Models (LLMs) can unintentionally alter the sentiment expressed in the text they rewrite, according to academics from the Gillmore Centre for Financial Technology at Warwick Business School, part of the University of Warwick.
The paper, titled ‘Who’s Talking, Machine or Human? How Generative AI Alters Human Sentiment’, is authored by academics from the Gillmore Centre and explores the impact of the growing use of LLMs on public sentiment. It suggests that the changes LLMs make to content can render existing results unreliable.
Conducting the study
Dr Yi Ding, assistant professor of information systems, says, “Conducting this study looking at the use of Generative AI alongside human sentiment will play a critical role in LLM future developments, ultimately enhancing output, helping remove biases and improving efficiency for anyone who uses it.”
The findings were produced by replicating and adapting established experiments, making them a valuable addition to the field of Generative AI and user-generated content (UGC). They demonstrate that widespread use of LLMs alters the language characteristics of the text they process.
The effect emerged from an extensive analysis of 50,000 tweets, which the researchers rephrased using the advanced GPT-4 model. By applying the Valence Aware Dictionary and sEntiment Reasoner (VADER) method to compare the original tweets with their GPT-4 rephrasings, the researchers found that LLMs tend to make sentiment more neutral, moving the text away from both positive and negative tones.
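The comparison the researchers describe can be sketched in a few lines: score an original tweet and its rephrased counterpart, then measure how far the rephrasing has moved towards neutrality. The study used VADER; here a tiny hand-rolled lexicon (with invented word valences) stands in for it so the sketch is self-contained, and the example tweets are fabricated for illustration.

```python
# Stand-in lexicon: word -> valence. Invented values, purely illustrative.
LEXICON = {
    "love": 3.0, "great": 2.5, "terrible": -3.0, "hate": -3.0,
    "awful": -2.5, "good": 1.5, "bad": -1.5, "fine": 0.5,
}

def compound_score(text: str) -> float:
    """Crude VADER-like compound score squashed into (-1, 1)."""
    words = (w.strip(".,!?") for w in text.lower().split())
    total = sum(LEXICON.get(w, 0.0) for w in words)
    # Loosely mirrors VADER's normalisation of the summed valences.
    return total / (abs(total) + 4.0)

def neutrality_shift(original: str, rephrased: str) -> float:
    """Positive value = the rephrased text sits closer to neutral (score 0)."""
    return abs(compound_score(original)) - abs(compound_score(rephrased))

original = "I hate this awful, terrible service!"
rephrased = "The service was quite bad."  # a typical softened LLM rewrite
print(neutrality_shift(original, rephrased))  # positive: sentiment was dampened
```

Averaged over many tweet pairs, a consistently positive shift is exactly the pattern the paper reports, with the dampening strongest for negative originals.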
“While LLMs do tend to move positive sentiments closer to neutrality, the shift in negative sentiments towards a neutral position is more pronounced. This overall shift towards positivity can significantly impact the application of LLMs in sentiment analysis,” said Ashkan Eshghi, Houlden Fellow.
According to director Ram Gopal, “This bias arises from the application of LLMs for tasks such as paraphrasing, rewriting, and even content creation, resulting in sentiments that may diverge from those the individual would have expressed without LLMs being used.
“In turn, our research proposes a mitigation method aimed at reducing bias and enhancing the reliability of UGC. This involves predicting or estimating the sentiment of original tweets by analysing the sentiments of their rephrased counterparts,” Gopal added.
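The mitigation Gopal describes, estimating an original tweet's sentiment from its rephrased counterpart, can be sketched as a simple calibration step. A least-squares linear fit is assumed here purely for illustration; the paper's actual predictive model may differ, and the (rephrased, original) score pairs below are fabricated.

```python
def fit_line(xs, ys):
    """Ordinary least squares on paired scores: returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical calibration data: sentiment of rephrased texts (x) vs the
# sentiment of their human originals (y). Originals are more extreme.
rephrased_scores = [-0.3, -0.1, 0.0, 0.2, 0.4]
original_scores  = [-0.8, -0.4, 0.1, 0.3, 0.6]

slope, intercept = fit_line(rephrased_scores, original_scores)

def estimate_original(rephrased_score: float) -> float:
    """Approximately undo the LLM's pull towards neutrality."""
    return slope * rephrased_score + intercept

print(estimate_original(-0.2))  # more negative than the input score
```

Because rephrasing compresses sentiment towards zero, the fitted slope comes out above 1, so the calibration stretches rephrased scores back towards the extremes where the human originals sat.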
Yet more research is necessary to determine whether other aspects of UGC, such as emotion, sentence structure, or word choice, also change when AI is involved. The academics intend to employ other predictive models to recover genuine human sentiment and to propose further mitigation methods in future studies.