The 2nd International Workshop on Natural Language Processing for Digital Humanities – NLP4DH 2022
The 2nd Workshop on Natural Language Processing for Digital Humanities (NLP4DH 2022) will be organized remotely with AACL 2022. The proceedings of the workshop will be published in the ACL Anthology. The workshop will take place online on the 20th of November 2022.
The focus of the workshop is on applying natural language processing techniques to digital humanities research. The topics can be anything of digital humanities interest with a natural language processing or generation aspect. Suitable topics include, but are not limited to:
- Text analysis and processing related to humanities using computational methods
- Thorough error analysis of an NLP system using (digital) humanities methods
- Dataset creation and curation for NLP (e.g., digitization, digitalization, datafication, and data preservation)
- Research on cultural heritage collections such as national archives and libraries using NLP
- NLP for error detection, correction, normalization, and denoising of data
- Generation and analysis of literary works such as poetry and novels
- Analysis and detection of text genres
Schedule
All times are in Taipei local time (UTC+8).
The workshop takes place in GatherTown (follow the signs for workshops and NLP4DH).
17:00 | Workshop opening |
17:00-18:30 | Poster session 1 |
A Stylometric Analysis of Amadís de Gaula and Sergas de Esplandián | Yoshifumi Kawasaki |
Computational Exploration of the Origin of Mood in Literary Texts | Emily Öhman and Riikka Rossi |
Sentiment is all you need to win US Presidential elections | Sovesh Mohapatra and Somesh Mohapatra |
Interactive Analysis and Visualisation of Annotated Collocations in Spanish (AVAnCES) | Simon Gonzalez |
Fractality of sentiment arcs for literary quality assessment: The case of Nobel laureates | Yuri Bizzoni, Kristoffer Nielbo, Mads Thomsen |
Style Classification of Rabbinic Literature for Detection of Lost Midrash Tanhuma Material | Solomon Tannor, Nachum Dershowitz, Moshe Lavee |
Use the Metadata, Luke! — An Experimental Joint Metadata Search and N-gram Trend Viewer for Personal Web Archives | Balázs Indig, Zsófia Sárközi-Lindner, Mihály Nagy |
18:30-19:30 | Lunch break |
19:30-21:00 | Poster session 2 |
MALM: Mixing Augmented Language Modeling for Zero-Shot Machine Translation | Kshitij Gupta |
ParsSimpleQA: The Persian Simple Question Answering Dataset and System over Knowledge Graph | Hamed Babaei Giglou, Niloufar Beyranvand, Reza Moradi, Amir Mohammad Salehoof, Saeed Bibak |
Enhancing Digital History – Event discovery via Topic Modeling and Change Detection | King Ip Lin, Sabrina Peng |
A Parallel Corpus and Dictionary for Amis-Mandarin Translation | Francis Zheng, Edison Marrese-Taylor, Yutaka Matsuo |
Machines in the media: semantic change in the lexicon of mechanization in 19th-century British newspapers | Nilo Pedrazzini and Barbara McGillivray |
Optimizing the weighted sequence alignment algorithm for large-scale text similarity computation | Maciej Janicki |
Domain-specific Evaluation of Word Embeddings for Philosophical Text using Direct Intrinsic Evaluation | Goya van Boven, Jelke Bloem |
21:00-21:30 | Coffee break |
21:30-23:00 | Poster session 3 |
Towards Bootstrapping a Chatbot on Industrial Heritage through Term and Relation Extraction | Mihael Arcan, Rory O’Halloran, Cécile Robin, Paul Buitelaar |
Non-Parametric Word Sense Disambiguation for Historical Languages | Enrique Manjavacas Arevalo and Lauren Fonteyn |
Introducing a Large Corpus of Tokenized Classical Chinese Poems of Tang and Song Dynasties | Chao-Lin Liu, Ti-Yong Zheng, Kuan-Chun Chen, Meng-Han Chung |
Creative Text-to-Image Generation: Suggestions for a Benchmark | Irene Russo |
The predictability of literary translation | Andrew Piper and Matt Erlin |
Emotion Conditioned Creative Dialog Generation | Khalid Alnajjar and Mika Hämäläinen |
Integration of Named Entity Recognition and Sentence Segmentation on Ancient Chinese based on Siku-BERT | Sijia Ge |
(Re-)Digitizing 吳守禮 Ngôo Siú-lé’s Mandarin — Taiwanese Dictionary | Pierre Magistry and Afala Phaxay |
Paper submission
We solicit original and unpublished work related to digital humanities and natural language processing. Short papers can be up to 4 pages in length and long papers up to 8 pages. Both submission formats can have an unlimited number of pages for references. All submissions must follow the ACL stylesheet (Overleaf template). We don’t accept submissions that consist of an abstract only.
Submissions must be anonymous and will be peer-reviewed by our program committee. The peer review is double-blind. Please see “Paper submission details” on the main conference website for more information.
Papers must be submitted using SoftConf by the workshop deadline. At least one of the authors of an accepted paper must register for the main conference and present the paper.
Accepted papers (short and long) will be published in the workshop proceedings, which will appear in the ACL Anthology. Accepted papers will also be given one additional page to address the reviewers’ comments. A camera-ready submission can therefore be up to 5 pages for a short paper and up to 9 pages for a long paper, with an unlimited number of pages for references.
The authors of the accepted papers will be invited to submit an extended version of their workshop paper to a special issue in the Journal of Data Mining & Digital Humanities.
Important dates
- Paper submission (full and short): August 28, 2022 (originally August 25, 2022)
- Notification of acceptance: September 25, 2022
- Camera-ready deadline: October 10, 2022
- Workshop: November 20, 2022
All times are Anywhere on Earth (AoE).
Organizers
Mika Hämäläinen, Rootroo Ltd and University of Helsinki
Khalid Alnajjar, Rootroo Ltd and University of Helsinki
Thierry Poibeau, École normale supérieure and CNRS
Niko Partanen, University of Helsinki
Jack Rueter, University of Helsinki
You can contact us by email at hello@rootroo.com.
Program committee
To be updated:
Iana Atanassova, Université de Bourgogne Franche-Comté
Yuri Bizzoni, Aarhus University
Miriam Butt, University of Konstanz
Won Ik Cho, Seoul National University
Quan Duong, University of Helsinki
Hugo Gonçalo Oliveira, University of Coimbra
Kenichi Iwatsuki, ARIKTTA
Heiki-Jaan Kaalep, University of Tartu
Enrique Manjavacas, Leiden University
Matej Martinc, Jozef Stefan Institute
Flammie Pirinen, UiT The Arctic University of Norway
Tyler Shoemaker, University of California, Davis
Liisa Lotta Tarvainen-Li, Acolad
Jörg Tiedemann, University of Helsinki
Jouni Tuominen, Aalto University
Shuo Zhang, Bose Corporation
Emily Öhman, Waseda University
Frederik Arnold, Humboldt-Universität zu Berlin
Nicolas Gutehrlé, Université de Bourgogne Franche-Comté
Thibault Clérice, Université PSL
Aynat Rubinstein, The Hebrew University of Jerusalem
Lama Alqazlan, University of Warwick
Gechuan Zhang, University College Dublin
Moshe Stekel, Ariel University
Alejandro Sierra-Múnera, University of Potsdam
Avinash Tulasi, IIIT Delhi