Here's a summary of what project partners CERTH, USFD and Ontotext presented at the Sofia Information Integrity Forum (SIIF), held from 7–9 November 2024 in Sofia, Bulgaria. Presenters showcased work, advances and outcomes in the domains of synthetic content detection, textual analysis and narrative exploration.
Back in 2022, Ontotext, together with a number of other local Bulgarian stakeholders such as the GATE Institute and the AI analytics company Identrics, co-organised a half-day “Technologies against disinformation” event to present their efforts in developing technological tools that help tackle the societal harm of spreading disinformation.
The follow-up edition of the event a year later, in 2023, spanned a whole day and attracted researchers from academia and NGOs, media literacy experts, and state security officers. It was evident that a vibrant counter-disinformation community was taking shape.
It soon became obvious that this flourishing ecosystem deserved an even more comprehensive gathering with a broader regional focus. In 2024, this led to a three-day forum with six organising partners, five thematic tracks, and speakers from all over Southeastern Europe and the Black Sea region.
The organisation of the 2024 event was made possible thanks to the support of the regional EDMO hub BROD, as well as the Konrad Adenauer Foundation, the Open Society Institute and other EU/US-funded projects. One of the latter supporters: vera.ai.
The SIIF programme (clustered into five thematic tracks) was rather packed: around 70 speakers from research, policy, media literacy, technology, and information operations presented to 100+ people in person, while numerous others joined online.
A key message conveyed by some of the most prominent speakers, including Mark Galeotti and Neville Bolt, was that all stakeholders engaged in the fight against disinformation should unite and coordinate their efforts as much as possible in order to effectively counteract, or better still stay ahead of, the opponents’ game.
Below, we take a closer look at the unique perspective and tools that vera.ai partners presented in sessions from the 2024 SIIF technology track.
Olga Papadopoulou, Research Associate at CERTH and vera.ai Project Manager, presented the comprehensive work done by the MeVer team at CERTH-ITI (led by Symeon Papadopoulos) and the University of Naples' Multimedia Forensics Lab (led by Luisa Verdoliva) on synthetic content detection. Olga did so in a 15-minute talk that was part of the session “Navigating the Infodemic: AI Solutions for Media Accuracy” that took place on day 1 of the conference.
Olga began with a brief introduction to the vera.ai project, highlighting its goal of building trustworthy AI solutions to empower and support media professionals in their fight against disinformation. The ambitious vera.ai project is dedicated to creating a diverse array of solutions and conducting computational social science research on disinformation. However, in the context of SIIF, Olga focused on the challenge of Generative AI (GenAI) and the tools for identifying synthetic content.
She referenced several reports and studies, underlining the pivotal role of GenAI in the proliferation of disinformation. This includes its involvement in spreading political, wartime, health-related, and various other types of harmful content, underscoring the urgency of the issue.
Finally, she introduced the tools developed as part of vera.ai, which are designed to detect synthetic images, deepfake videos, and synthetic audio. These tools are grounded in cutting-edge AI research and are made accessible to end users through user-friendly applications, namely the Verification Plugin (available for free via the Chrome store) and Truly Media. (Individual, stand-alone tools are also available via the MeVer team website.)
While Olga focused on the visual and audio modalities, in the next session presentation Olesya Razuvayevskaya from the University of Sheffield (USFD) joined online to provide insights into textual content and meme analysis methods.
Olesya explored various types of clues, known as credibility signals, that are instrumental in identifying misinformation. Specifically, she discussed the methods used to detect the genre of news articles (e.g., objective reporting, opinionated news, and satire), the different framings employed in articles, and the persuasive techniques often used to influence readers or manipulate their perceptions.
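To give a flavour of how such a credibility signal can be operationalised in code, detecting the genre of an article can be framed as a text classification task. The sketch below is a generic illustration using an off-the-shelf zero-shot model, not the fine-tuned USFD models presented at SIIF:

```python
# Generic illustration of framing a credibility signal (news genre) as
# text classification. This uses an off-the-shelf zero-shot model, NOT
# the actual vera.ai / USFD models.
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

article = (
    "Scientists have once again confirmed that the moon is, in fact, "
    "made of artisanal cheese, sources close to the moon report."
)

# Score the article against the three genre labels mentioned above.
result = classifier(
    article,
    candidate_labels=["objective reporting", "opinionated news", "satire"],
)
print(list(zip(result["labels"], result["scores"])))
```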
Additionally, Olesya addressed the multilingual and explainability aspects of the methods underlying each credibility signal tool. The explainability of the tools is operationalised by highlighting the individual parts of the text that trigger a certain class, with a spectrum of colours indicating lower or higher confidence in those text segments.
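A minimal sketch of this highlighting idea, with made-up segments and scores standing in for real classifier output, might look like this:

```python
# Illustrative only: render per-segment confidence scores as coloured
# highlights, similar in spirit to the explanations described above.
# The segments and scores are invented; in the real tools they would
# come from the underlying classifier.
def render_highlights(segments):
    """segments: list of (text, score) tuples with score in [0, 1]."""
    spans = []
    for text, score in segments:
        # Higher confidence -> more opaque highlight colour.
        alpha = round(score, 2)
        spans.append(
            f'<span style="background: rgba(255, 165, 0, {alpha})" '
            f'title="confidence: {score:.2f}">{text}</span>'
        )
    return " ".join(spans)

html = render_highlights([
    ("Experts warn", 0.15),
    ("of a catastrophic, unprecedented collapse", 0.92),  # loaded language
    ("in the coming weeks.", 0.30),
])
print(html)
```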
Finally, Olesya introduced a multilingual tool for optical character recognition (OCR) that recognises blocks of text within an image. The presented use case focused on analysing memes containing text in different languages and translating the individual text units into the target language.
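The pattern described here, recognising text in an image and then translating it, can be sketched with off-the-shelf components. The snippet below assumes Bulgarian meme text and uses pytesseract plus an open machine-translation model as stand-ins; it is not the actual USFD tool:

```python
# A minimal sketch of the meme-analysis idea: OCR the text out of an
# image, then translate it. pytesseract and an open MT model are used
# as stand-ins for the actual tool presented at SIIF.
from PIL import Image
import pytesseract
from transformers import pipeline

# Recognise text in the image; 'bul' assumes Bulgarian meme text and
# requires the corresponding Tesseract language pack to be installed.
text = pytesseract.image_to_string(Image.open("meme.png"), lang="bul")

# Translate each recognised line into English.
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-bg-en")
for line in filter(None, (l.strip() for l in text.splitlines())):
    print(line, "->", translator(line)[0]["translation_text"])
```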
Day two of the conference included another technology-focussed session on “AI and Understanding the Shifting (Dis)Information Landscape”. In it, Ontotext’s Research Lead in vera.ai, Andrey Tagarev, showcased a prototype of a chatbot that facilitates the exploration of disinformation narratives.
As part of their work in vera.ai, Ontotext is enhancing one of the user-facing instruments in the project’s toolbox, the so-called Database of Known Fakes (DBKF). This is not only a database that gathers debunks from trustworthy fact-checking sources, but also a powerful search tool. Upon ingestion, the content is processed by various AI-based algorithms that extract meaningful metadata and other useful information, such as identified concepts (people, organisations, locations, general terms) or events. In addition, as a result of research conducted in vera.ai, Andrey developed an approach to cluster data by textual similarity and thus reveal content reuse across multiple languages, which is often at the core of how disinformation narratives spread.
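As a rough illustration of the clustering idea (not Ontotext's actual implementation), a multilingual sentence encoder can map claims written in different languages into a shared vector space, so that reused content ends up in the same cluster:

```python
# Sketch of cross-lingual clustering by textual similarity, in the
# spirit of the DBKF content-reuse analysis. The claims are invented
# toy examples; this is not the DBKF pipeline itself.
from sentence_transformers import SentenceTransformer, util

claims = [
    "The vaccine contains microchips.",           # English
    "Ваксината съдържа микрочипове.",             # Bulgarian, same claim
    "Le vaccin contient des micropuces.",         # French, same claim
    "The election results were fully verified.",  # unrelated claim
]

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
embeddings = model.encode(claims, convert_to_tensor=True, normalize_embeddings=True)

# Group items whose cosine similarity exceeds the threshold.
clusters = util.community_detection(embeddings, threshold=0.8, min_community_size=2)
for cluster in clusters:
    print([claims[i] for i in cluster])
```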
Equipped with all these diverse capabilities for searching through the debunking articles, users may wonder how best to combine them for individual exploration tasks. This is where the chatbot assistant comes into play. The human expert can ask a question in natural language and the ‘AI buddy’ then leverages the instruments behind the scenes to retrieve a relevant output, as Andrey explained during his SIIF talk. When users are not satisfied with the answer, they can ask for a correction or pose additional questions to narrow down the results further. Andrey’s presentation also included a couple of sample conversations with the chatbot in both English and Bulgarian.
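The underlying pattern, routing a natural-language question to the right search tool while keeping the conversation history for follow-ups, can be caricatured in a few lines. In this toy sketch a keyword heuristic stands in for the language model planner, and the DBKF search backends are stubbed out; the actual chatbot is considerably more sophisticated:

```python
# Toy sketch of tool routing in a retrieval chatbot. A trivial keyword
# heuristic stands in for the language model; the search backends are stubs.
def search_by_entity(entity):
    return f"[debunks mentioning '{entity}']"          # stub

def search_by_similarity(text):
    return f"[debunks textually similar to '{text}']"  # stub

def answer(question, history=None):
    """Route the question to a tool; prior turns give follow-up context."""
    history = history or []
    previous = history[-1][0] if history else None
    if question.lower().startswith(("who ", "what about")):
        result = search_by_entity(question)
    else:
        result = search_by_similarity(question)
    if previous:
        # A real assistant would refine the query; here we only record it.
        result += f" (refining previous question: '{previous}')"
    history.append((question, result))
    return result, history

reply, history = answer("Are there debunks about chemtrails over Sofia?")
print(reply)
reply, history = answer("Only ones from 2024, please", history)
print(reply)
```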
In his concluding slides, Andrey showed how the DBKF data is structured and interlinked in the knowledge graph (KG), highlighting the benefits of combining KGs and AI agents. These include, among others, the efficient retrieval of trustworthy information in an explainable way, as well as enabling users without technical expertise to ask questions over the data in natural language.
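For readers unfamiliar with knowledge graphs, the snippet below illustrates what querying such interlinked data over SPARQL can look like. Both the endpoint URL and the dbkf: schema are invented for this example and do not reflect the real DBKF data model:

```python
# Illustrative SPARQL retrieval from a knowledge graph. The endpoint
# URL and the dbkf: schema are hypothetical and do NOT reproduce the
# real DBKF graph model.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://example.org/dbkf/sparql")  # hypothetical endpoint
sparql.setReturnFormat(JSON)
sparql.setQuery("""
    PREFIX dbkf: <https://example.org/dbkf/schema#>
    SELECT ?debunk ?title WHERE {
        ?debunk a dbkf:Debunk ;
                dbkf:title ?title ;
                dbkf:mentionsEntity ?entity .
        ?entity dbkf:label "NATO" .
    } LIMIT 10
""")

# Print each matching debunk together with its title.
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["debunk"]["value"], "-", row["title"]["value"])
```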
During the Q&A sessions the audience displayed a keen interest in the various tools discussed. Two particularly thoughtful issues were raised.
As Olga rightfully pointed out, vera.ai strives to set a good example in both respects: by building on user-defined requirements, by following a fact-checker-in-the-loop approach, and by iteratively updating models in response to new architectures and patterns.
It was furthermore stressed that the vera.ai project partners will continue to do their part in countering disinformation by equipping the community with as many useful tools and techniques as possible for tackling falsehoods, lies and manipulations. After all, no less than the future of our democracies is at stake.
Authors: Nedelina Mitankina (Ontotext) with input from Olga Papadopoulou (CERTH-ITI), Olesya Razuvayevskaya (USFD) and Andrey Tagarev (Ontotext).
Editor: Jochen Spangenberg (DW)