Author's note: This document was created with the help of ChatGPT, which was employed to summarize and rephrase the highlights of the original report. The ChatGPT output was then manually edited and reviewed by the author.

AI-Powered Influence Operations: Investigating the Risks and Benefits of Generative Language Models

As the capabilities of generative models have grown in recent years, there have been discussions about the potential benefits and risks that come with them. In a joint report of January 2023, OpenAI researchers collaborated with Georgetown University’s Center for Security and Emerging Technology and the Stanford Internet Observatory to investigate how large language models could be used to conduct disinformation campaigns or influence operations.

Influence operations can be defined as covert or deceptive attempts to sway the opinions of a target audience, regardless of the message's veracity or the identity of the person spreading it.

Conducting an influence operation with a language model requires access to a model, the means to distribute the content it produces, and a target audience on which that content can have an impact. An operation's impact can stem from the content of the message, if it persuades or reinforces a particular viewpoint, distracts from other ideas, or hinders the ability to think critically.

Often, the goal is to crowd out important information with irrelevant and attention-grabbing content, rather than to persuade the target with the information being spread. However, measuring the impact and effectiveness of these operations is difficult. Their success depends on the resources and abilities of the malicious actors, the quality and message of the content, and how easily it can be detected.

Influence operations can also have an impact by eroding trust in society. This can occur even with low-quality efforts and can undermine the credibility of reliable news sources. Propagandists often exploit vulnerabilities in how people establish trust, particularly in the digital age, by manipulating perceptions of reputation, using fake credentials, or altering photographic and video evidence. This can undermine trust in information beyond the specific topic of the campaign.

Recent progress in generative models

Generative models are composed of large artificial neural networks and are trained using a trial-and-error process on vast amounts of data.

Creating a generative model involves two steps: training a neural network on a large amount of raw data and then, optionally, fine-tuning the model on a small amount of task-specific data, which is cheaper. Fine-tuning can improve the model's capabilities or teach it domain-specific skills.

A number of organizations have developed advanced language models, and access to them ranges from fully public to fully private. Fully public models can be downloaded by anyone and used to produce outputs that the model's designer cannot monitor (this is the case for many models built with Google's TensorFlow). Fully private models are not accessible to the general public. A third category aims to balance public and private access, for example by requiring users to sign a license that bans certain use cases or by allowing access only through an application programming interface (API), as OpenAI does.


Generative models and influence operations

The report expands on the ABC (Actors, Behavior, and Content) model, a widely used model in the field of disinformation. When analyzing influence operations, it is important to consider all three aspects of actors, behaviors, and content, as even within a manipulative campaign, certain elements may be authentic. For example, real content may be spread using paid or automated engagement, or by actors who are not who they claim to be. Similarly, authentic actors (e.g. domestic political activists) may use inauthentic automation to spread content.
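The ABC framing can be made concrete with a small sketch. The Python below is illustrative only: the `CampaignElement` class and its field names are hypothetical and do not come from the report. It records whether each of the three dimensions is authentic and lists the ones that are not. Where the examples in the text leave a dimension unspecified, the sketch assumes it is authentic.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CampaignElement:
    """One observed piece of a campaign, rated on the three ABC dimensions.

    Hypothetical structure for illustration; not defined in the report.
    """
    description: str
    authentic_actor: bool
    authentic_behavior: bool
    authentic_content: bool

    def inauthentic_dimensions(self) -> List[str]:
        """Return the ABC dimensions on which this element is inauthentic."""
        flags = {
            "actor": self.authentic_actor,
            "behavior": self.authentic_behavior,
            "content": self.authentic_content,
        }
        return [name for name, authentic in flags.items() if not authentic]


# Real content spread using paid or automated engagement.
# Assumption: the actor is authentic, since the text does not say otherwise.
paid_engagement = CampaignElement(
    description="Real content spread using paid or automated engagement",
    authentic_actor=True,
    authentic_behavior=False,
    authentic_content=True,
)

# Real content spread by actors who are not who they claim to be.
# Assumption: the spreading behavior itself is authentic.
false_identity = CampaignElement(
    description="Real content spread by actors with a false identity",
    authentic_actor=False,
    authentic_behavior=True,
    authentic_content=True,
)

for element in (paid_engagement, false_identity):
    print(element.description, "->", element.inauthentic_dimensions())
```

Rating the three dimensions separately reflects the point made above: a campaign can be manipulative even when some of its elements, such as the content, are authentic.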

The authors believe the integration of AI into disinformation campaigns can (1) reduce the cost of running such campaigns by automating content production, (2) reduce the need to create fake personas, (3) expand the set of actors with the capacity to run influence operations, and (4) produce culturally appropriate outputs that are less likely to be identified as inauthentic. It can also (5) change the way existing behaviors are enacted in practice and introduce new behaviors. For example, it can (5a) replace or augment human writers in the content generation process, (5b) increase the scalability of propaganda campaigns, (5c) improve existing tactics and techniques, and (5d) falsify checks in areas in which text commentary is solicited. Additionally, generative models can (6) enable real-time, dynamic content generation that leverages demographic information to generate more persuasive articles and to deploy personalized chatbots that interact with targets one-on-one.

Ultimately, AI can (7) improve the quality and decrease the detectability of both short-form and long-form text content by masking some of the telltale signs, such as identical, repeated messaging, that bot detection systems rely on.
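To see why paraphrased output is harder to catch, consider a deliberately simple check for identical, repeated messaging. This is a minimal sketch under stated assumptions, not how any real bot detection system works: the `normalize` and `repeated_messages` functions are hypothetical, and the sample posts are invented for illustration.

```python
import re
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase the text and keep only letters and digits, separated by
    single spaces, so that trivial edits (case, punctuation) still match."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def repeated_messages(posts, min_count=2):
    """Return normalized messages that occur at least min_count times."""
    counts = Counter(normalize(post) for post in posts)
    return {message: n for message, n in counts.items() if n >= min_count}


# Invented sample data: copy-pasted posts with only trivial variations.
copy_pasted = [
    "Brand X is the best choice!",
    "brand x is the best choice",
    "Brand X is the BEST choice.",
]

# Invented sample data: the same message, reworded each time.
paraphrased = [
    "Brand X is the best choice!",
    "Nothing beats Brand X, in my experience.",
    "I would pick Brand X over anything else.",
]

print(repeated_messages(copy_pasted))
print(repeated_messages(paraphrased))
```

The copy-pasted posts collapse to a single normalized message seen three times, while the reworded posts share no exact match and pass unflagged. A check of this kind is defeated by varied wording, which is the masking effect described above.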

Possible improvements and future impact of generative models

While these models can generate plausible content for a wide range of tasks, they are not always reliable as task complexity increases. This means that a propagandist must either trust that the model will complete the task without errors or continuously monitor the model's output. Furthermore, models often fail to produce high-quality text consistently because they lack awareness of current events and information. To address this, researchers are working on continually retraining models and developing new algorithms to update a model's understanding of the world in real time. Language models will also become more efficient in the future, which will lower the cost of using them in influence operations and enable more efficient production of persuasive content. They could also act as a cultural context checker, helping operators target audiences they are not familiar with and produce more authentic-seeming content in less time.

When thinking about the future impact of language models on influence operations, it is important to consider which actors will have access to these models and what factors may lead to their use in these operations. Three key unknowns in this area are willingness to invest in generative models, greater accessibility from unregulated tooling, and norms and intent-to-use. Actors such as governments, private firms, and wealthy individuals may develop state-of-the-art language models for other purposes, which could then be repurposed for influence operations. Additionally, the proliferation of easy-to-use tools for language models may make propaganda campaigns more prevalent. However, norms and intentions may play a role in limiting the use of language models for these operations. Creating a norm that it is unacceptable to use language models for influence operations may require a coalition of states or of machine learning researchers and ethicists.

Mitigating the threat

Mitigations against AI-powered influence operations that rely on language models can target different stages of the operation pipeline. These stages are (1) construction of the models, (2) access to the models, (3) dissemination of the generated content, and (4) formation of beliefs in the target audience. To successfully use generative language models for shaping the information ecosystem, propagandists need AI models that can generate scalable and realistic-looking text, regular and reliable access to such models, infrastructure to disseminate the outputs, and a target audience that can be influenced by the content.

Researchers have identified possible mitigations to reduce the threat of AI-powered influence operations specifically related to language models. To limit the negative impacts of large language models, developers may build models that are more detectable or fact-sensitive, and governments may impose restrictions on data collection, place controls on AI hardware, and monitor computing power usage. Platforms may add rules to their terms of service for the appropriate use of language models, flag content they suspect may be inauthentic, and work with AI companies to determine whether a language model generated it. Media literacy campaigns can help individuals distinguish real from fake news online, but these campaigns may need updating as AI-generated content becomes more advanced. Developers can also create consumer-focused AI tools, such as browser extensions and mobile apps that warn users of potential fake content or accounts, or that selectively block ads on these sites. However, these tools may also present risks, such as susceptibility to bias and potential conflicts with the interests of social media platforms.

Essentially, each mitigation can be evaluated based on its technical feasibility (the ability to implement the mitigation on a technical level), social feasibility (the political, legal, and institutional feasibility of the mitigation), downside risk (the negative impacts that a mitigation may cause), and impact (the effectiveness of the mitigation in reducing the threat of AI-enabled influence operations). The responsibility for the implementation of these mitigations and the reasons why they would be implemented are, of course, open questions.
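The four evaluation criteria can be pictured as a simple record. The following Python is a sketch only: the `Rating` scale and the `MitigationAssessment` class are hypothetical, and the ratings in the example are placeholders chosen to show the structure, not findings from the report.

```python
from dataclasses import dataclass
from enum import IntEnum


class Rating(IntEnum):
    """Hypothetical three-level scale; the report does not prescribe one."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class MitigationAssessment:
    """One mitigation rated on the four criteria named in the text."""
    name: str
    technical_feasibility: Rating  # can it be implemented technically?
    social_feasibility: Rating     # political, legal, institutional feasibility
    downside_risk: Rating          # negative impacts the mitigation may cause
    impact: Rating                 # effectiveness in reducing the threat

    def summary(self) -> str:
        """Return a one-line, human-readable summary of the ratings."""
        return (
            f"{self.name}: "
            f"technical feasibility={self.technical_feasibility.name}, "
            f"social feasibility={self.social_feasibility.name}, "
            f"downside risk={self.downside_risk.name}, "
            f"impact={self.impact.name}"
        )


# Placeholder ratings for illustration only; not taken from the report.
example = MitigationAssessment(
    name="Example mitigation (placeholder)",
    technical_feasibility=Rating.MEDIUM,
    social_feasibility=Rating.MEDIUM,
    downside_risk=Rating.LOW,
    impact=Rating.MEDIUM,
)

print(example.summary())
```

Keeping the four criteria as separate fields, rather than collapsing them into one score, mirrors the text: a mitigation can be technically feasible yet socially difficult, or effective yet risky.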


Author: Massimo Terenzi (University of Urbino). Contact: {massimo (dot) terenzi (at) uniurb (dot) it}.

Editor: Jochen Spangenberg (DW)

As stated above, this text is a summary, produced using ChatGPT with some additional editing by the author named above, of a scientific report entitled Generative Language Models and Automated Influence Operations: Emerging Threats and Potential Mitigations. Authors of the report: Josh A. Goldstein, Girish Sastry, Micah Musser, Renee DiResta, Matthew Gentzel and Katerina Sedova. The project behind this website is co-funded by the European Commission under grant agreement ID 101070093, and by the UK and Swiss authorities. This website reflects the views of the consortium and respective contributors. The EU cannot be held responsible for any use which may be made of the information contained herein.