Project partner KInIT at the EMNLP conference in Singapore with 3 papers

The EMNLP (Empirical Methods in Natural Language Processing) conference is one of the top NLP conferences. KInIT researchers Róbert Móro and Ján Čegiň presented three full papers at this prestigious event. Here’s more about what they did.

Three KInIT papers accepted for the main track

KInIT submitted three papers to this renowned NLP conference. All were accepted.

Paper 1: MULTITuDE: Large-Scale Multilingual Machine-Generated Text Detection Benchmark

This paper is a collaborative effort between Penn State University, MIT Lincoln Laboratory, the University of Mississippi and KInIT, and a solid example of international cooperation. It is the result of KInIT work on projects including VIGILANT. The paper addresses the problem of detecting machine-generated text in 11 languages by building and publishing a benchmark dataset for this task and comparing existing state-of-the-art methods.

The paper was presented by Róbert Móro. You can read it here.

Paper 2: Multilingual Previously Fact-Checked Claim Retrieval

In this paper, the authors focus on the problem of retrieving previously fact-checked claims for different languages. A unique dataset is presented for this task, covering over 30 languages. Furthermore, an extensive comparison of different models for representing claims is performed.

Róbert Móro presented this paper. The outcomes are the result of KInIT work in the DisAI and CEDMO projects. The paper is available here.

Paper 3: ChatGPT to Replace Crowdsourcing of Paraphrases for Intent Classification: Higher Diversity and Comparable Model Robustness

The third paper was presented by KInIT PhD student Ján Čegiň. One of the co-authors is Peter Brusilovsky from the University of Pittsburgh. The paper is the result of KInIT work on projects including CEDMO. It explores the use of large language models (LLMs) such as ChatGPT to generate training data through paraphrasing for the so-called “intent classification” task.

The paper can be accessed here.

About the event

More than 2,000 participants attended the conference, which ran from 8 to 10 December 2023. The 2024 edition will take place in Miami, US. To stay updated, check out the website of the Association for Computational Linguistics.


Note: a similar version of this article first appeared on the KInIT website.

Author: Marianna Palková (KInIT)

Editor: Jochen Spangenberg (DW)

The project is co-funded by the European Commission under grant agreement ID 101070093, and the UK and Swiss authorities. This website reflects the views of the consortium and respective contributors. The EU cannot be held responsible for any use which may be made of the information contained herein.