diff --git a/data/xml/2022.acl.xml b/data/xml/2022.acl.xml index b81c16095d..57f536bd80 100644 --- a/data/xml/2022.acl.xml +++ b/data/xml/2022.acl.xml @@ -34,6 +34,7 @@ MultiRC QNLI SST + 10.18653/v1/2022.acl-long.1 Quantified Reproducibility Assessment of <fixed-case>NLP</fixed-case> Results @@ -44,6 +45,7 @@ This paper describes and tests a method for carrying out quantified reproducibility assessment (QRA) that is based on concepts and definitions from metrology. QRA produces a single score estimating the degree of reproducibility of a given system and evaluation measure, on the basis of the scores from, and differences between, different reproductions. We test QRA on 18 different system and evaluation measure combinations (involving diverse NLP tasks and types of evaluation), for each of which we have the original results and one to seven reproduction results. The proposed QRA method produces degree-of-reproducibility scores that are comparable across multiple reproductions not only of the same, but also of different, original studies. We find that the proposed method facilitates insights into causes of variation between reproductions, and as a result, allows conclusions to be drawn about what aspects of system and/or evaluation design need to be changed in order to improve reproducibility. 2022.acl-long.2 belz-etal-2022-quantified + 10.18653/v1/2022.acl-long.2 Rare Tokens Degenerate All Tokens: Improving Neural Text Generation via Adaptive Gradient Gating for Rare Token Embeddings @@ -59,6 +61,7 @@ yu-etal-2022-rare WikiText-103 WikiText-2 + 10.18653/v1/2022.acl-long.3 <fixed-case>A</fixed-case>leph<fixed-case>BERT</fixed-case>: Language Model Pre-training and Evaluation from Sub-Word to Sentence Level @@ -73,6 +76,7 @@ 2022.acl-long.4 seker-etal-2022-alephbert OSCAR + 10.18653/v1/2022.acl-long.4 Learning to Imagine: Integrating Counterfactual Thinking in Neural Discrete Reasoning @@ -88,6 +92,7 @@ li-etal-2022-learning DROP HybridQA + 10.18653/v1/2022.acl-long.5 Domain Adaptation in Multilingual and Multi-Domain Monolingual Settings for Complex Word Identification @@ -99,6 +104,7 @@ Complex word identification (CWI) is a cornerstone process towards proper text simplification. CWI is highly dependent on context, whereas its difficulty is augmented by the scarcity of available datasets which vary greatly in terms of domains and languages. As such, it becomes increasingly more difficult to develop a robust model that generalizes across a wide array of input examples. In this paper, we propose a novel training technique for the CWI task based on domain adaptation to improve the target character and context representations. This technique addresses the problem of working with multiple domains, inasmuch as it creates a way of smoothing the differences between the explored datasets. Moreover, we also propose a similar auxiliary task, namely text simplification, that can be used to complement lexical complexity prediction. Our model obtains a boost of up to 2.42% in terms of Pearson Correlation Coefficients in contrast to vanilla training techniques, when considering the CompLex from the Lexical Complexity Prediction 2021 dataset. At the same time, we obtain an increase of 3% in Pearson scores, while considering a cross-lingual setup relying on the Complex Word Identification 2018 dataset. In addition, our model yields state-of-the-art results in terms of Mean Absolute Error. 
2022.acl-long.6 zaharia-etal-2022-domain + 10.18653/v1/2022.acl-long.6 <fixed-case>J</fixed-case>oint<fixed-case>CL</fixed-case>: A Joint Contrastive Learning Framework for Zero-Shot Stance Detection @@ -115,6 +121,7 @@ 2022.acl-long.7.software.zip liang-etal-2022-jointcl hitsz-hlt/jointcl + 10.18653/v1/2022.acl-long.7 [<fixed-case>CASPI</fixed-case>] Causal-aware Safe Policy Improvement for Task-oriented Dialogue @@ -126,6 +133,7 @@ 2022.acl-long.8 ramachandran-etal-2022-caspi MultiWOZ + 10.18653/v1/2022.acl-long.8 <fixed-case>U</fixed-case>ni<fixed-case>T</fixed-case>ran<fixed-case>S</fixed-case>e<fixed-case>R</fixed-case>: A Unified Transformer Semantic Representation Framework for Multimodal Task-Oriented Dialog System @@ -139,6 +147,7 @@ 2022.acl-long.9.software.zip ma-etal-2022-unitranser MMD + 10.18653/v1/2022.acl-long.9 Dynamic Schema Graph Fusion Network for Multi-Domain Dialogue State Tracking @@ -152,6 +161,7 @@ 2022.acl-long.10 feng-etal-2022-dynamic SGD + 10.18653/v1/2022.acl-long.10 Attention Temperature Matters in Abstractive Summarization Distillation @@ -165,6 +175,7 @@ 2022.acl-long.11.software.zip zhang-etal-2022-attention shengqiang-zhang/plate + 10.18653/v1/2022.acl-long.11 Towards Making the Most of Cross-Lingual Transfer for Zero-Shot Neural Machine Translation @@ -182,6 +193,7 @@ ghchen18/acl22-sixtp FLORES-101 FLoRes + 10.18653/v1/2022.acl-long.12 <fixed-case>T</fixed-case>op<fixed-case>WORDS</fixed-case>-Seg: Simultaneous Text Segmentation and Word Discovery for Open-Domain <fixed-case>C</fixed-case>hinese Texts via <fixed-case>B</fixed-case>ayesian Inference @@ -192,6 +204,7 @@ Processing open-domain Chinese texts has been a critical bottleneck in computational linguistics for decades, partially because text segmentation and word discovery often entangle with each other in this challenging scenario. No existing methods yet can achieve effective text segmentation and word discovery simultaneously in the open domain. This study fills in this gap by proposing a novel method called TopWORDS-Seg based on Bayesian inference, which enjoys robust performance and transparent interpretation when no training corpus and domain vocabulary are available. Advantages of TopWORDS-Seg are demonstrated by a series of experimental studies. 2022.acl-long.13 pan-etal-2022-topwords + 10.18653/v1/2022.acl-long.13 An Unsupervised Multiple-Task and Multiple-Teacher Model for Cross-lingual Named Entity Recognition @@ -207,6 +220,7 @@ 2022.acl-long.14.software.zip li-etal-2022-unsupervised-multiple CoNLL-2003 + 10.18653/v1/2022.acl-long.14 Discriminative Marginalized Probabilistic Neural Method for Multi-Document Summarization of Medical Literature @@ -218,6 +232,7 @@ Although current state-of-the-art Transformer-based solutions have succeeded in a wide range of single-document NLP tasks, they still struggle to address multi-input tasks such as multi-document summarization. Many solutions truncate the inputs, thus ignoring potential summary-relevant contents, which is unacceptable in the medical domain where every piece of information can be vital. Others leverage linear model approximations to apply multi-input concatenation, worsening the results because all information is considered, even if it is conflicting or noisy with respect to a shared background. Despite the importance and social impact of medicine, there are no ad-hoc solutions for multi-document summarization. 
For this reason, we propose a novel discriminative marginalized probabilistic method (DAMEN) trained to discriminate critical information from a cluster of topic-related medical documents and generate a multi-document summary via token probability marginalization. Results prove that we outperform the previous state-of-the-art on a biomedical dataset for multi-document summarization of systematic literature reviews. Moreover, we perform extensive ablation studies to motivate the design choices and prove the importance of each module of our method. 2022.acl-long.15 moro-etal-2022-discriminative + 10.18653/v1/2022.acl-long.15 Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm @@ -239,6 +254,7 @@ huang-etal-2022-sparse GLUE QNLI + 10.18653/v1/2022.acl-long.16 <fixed-case>C</fixed-case>ipher<fixed-case>DA</fixed-case>ug: Ciphertext based Data Augmentation for Neural Machine Translation @@ -250,6 +266,7 @@ 2022.acl-long.17 kambhatla-etal-2022-cipherdaug protonish/cipherdaug-nmt + 10.18653/v1/2022.acl-long.17 Overlap-based Vocabulary Generation Improves Cross-lingual Transfer Among Related Languages @@ -262,6 +279,7 @@ patil-etal-2022-overlap vaidehi99/obpe XNLI + 10.18653/v1/2022.acl-long.18 Long-range Sequence Modeling with Predictable Sparse Attention @@ -273,6 +291,7 @@ 2022.acl-long.19 zhuang-etal-2022-long LRA + 10.18653/v1/2022.acl-long.19 Improving Personalized Explanation Generation through Visualization @@ -286,6 +305,7 @@ In modern recommender systems, there are usually comments or reviews from users that justify their ratings for different items. Trained on such a textual corpus, explainable recommendation models learn to discover user interests and generate personalized explanations. Though able to provide plausible explanations, existing models tend to generate repeated sentences for different items or empty sentences with insufficient details. This begs an interesting question: can we immerse the models in a multimodal environment to gain proper awareness of real-world concepts and alleviate the above shortcomings? To this end, we propose a visually-enhanced approach named METER with the help of visualization generation and text–image matching discrimination: the explainable recommendation model is encouraged to visualize what it refers to while incurring a penalty if the visualization is incongruent with the textual explanation. Experimental results and a manual assessment demonstrate that our approach can improve not only the text quality but also the diversity and explainability of the generated explanations. 2022.acl-long.20 geng-etal-2022-improving + 10.18653/v1/2022.acl-long.20 New Intent Discovery with Pre-training and Contrastive Learning @@ -300,6 +320,7 @@ zhang-etal-2022-new zhang-yu-wei/mtp-clnn CLINC150 + 10.18653/v1/2022.acl-long.21 <fixed-case>M</fixed-case>odeling <fixed-case>U.S.</fixed-case> State-Level Policies by Extracting Winners and Losers from Legislative Texts @@ -310,6 +331,7 @@ Decisions on state-level policies have a deep effect on many aspects of our everyday life, such as health-care and education access. However, there is little understanding of how these policies and decisions are being formed in the legislative process. We take a data-driven approach by decoding the impact of legislation on relevant stakeholders (e.g., teachers in education bills) to understand legislators’ decision-making process and votes. 
We build a new dataset for multiple US states that interconnects multiple sources of data including bills, stakeholders, legislators, and money donors. Next, we develop a textual graph-based model to embed and analyze state bills. Our model predicts winners/losers of bills and then utilizes them to better determine the legislative body’s vote breakdown according to demographic/ideological criteria, e.g., gender. 2022.acl-long.22 davoodi-etal-2022-modeling + 10.18653/v1/2022.acl-long.22 Structural Characterization for Dialogue Disentanglement @@ -323,6 +345,7 @@ ma-etal-2022-structural xbmxb/structurecharacterization4dd Molweni + 10.18653/v1/2022.acl-long.23 Multi-Party Empathetic Dialogue Generation: A New Task for Dialog Systems @@ -338,6 +361,7 @@ zhu-etal-2022-multi MELD PEC + 10.18653/v1/2022.acl-long.24 <fixed-case>MISC</fixed-case>: A Mixed Strategy-Aware Model integrating <fixed-case>COMET</fixed-case> for Emotional Support Conversation @@ -354,6 +378,7 @@ morecry/misc ATOMIC ConceptNet + 10.18653/v1/2022.acl-long.25 <fixed-case>GLM</fixed-case>: General Language Model Pretraining with Autoregressive Blank Infilling @@ -375,6 +400,7 @@ SuperGLUE WikiText-103 WikiText-2 + 10.18653/v1/2022.acl-long.26 <fixed-case>Q</fixed-case>uote<fixed-case>R</fixed-case>: A Benchmark of Quote Recommendation for Writing @@ -391,6 +417,7 @@ qi-etal-2022-quoter thunlp/quoter BookCorpus + 10.18653/v1/2022.acl-long.27 Towards Comprehensive Patent Approval Predictions: Beyond Traditional Document Classification @@ -405,6 +432,7 @@ Predicting the approval chance of a patent application is a challenging problem involving multiple facets. The most crucial facet is arguably the novelty — 35 U.S. Code § 102 rejects more recent applications that have very similar prior arts. Such novelty evaluations differentiate patent approval prediction from conventional document classification — successful patent applications may share similar writing patterns; however, too-similar newer applications would receive the opposite label, thus confusing standard document classifiers (e.g., BERT). To address this issue, we propose a novel framework that unifies the document classifier with handcrafted features, particularly time-dependent novelty scores. Specifically, we formulate the novelty scores by comparing each application with millions of prior arts using a hybrid of efficient filters and a neural bi-encoder. Moreover, we impose a new regularization term into the classification objective to enforce the monotonic change of approval prediction w.r.t. novelty scores. From extensive experiments on a large-scale USPTO dataset, we find that standard BERT fine-tuning can partially learn the correct relationship between novelty and approvals from inconsistent data. However, our time-dependent novelty features offer a boost on top of it. Also, our monotonic regularization, while shrinking the search space, can drive the optimizer to better local optima, yielding a further small performance gain. 
2022.acl-long.28 gao-etal-2022-towards + 10.18653/v1/2022.acl-long.28 Hypergraph <fixed-case>T</fixed-case>ransformer: <fixed-case>W</fixed-case>eakly-Supervised Multi-hop Reasoning for Knowledge-based Visual Question Answering @@ -420,6 +448,7 @@ yujungheo/kbvqa-public DBpedia Visual Question Answering + 10.18653/v1/2022.acl-long.29 Cross-Utterance Conditioned <fixed-case>VAE</fixed-case> for Non-Autoregressive Text-to-Speech @@ -438,6 +467,7 @@ li-etal-2022-cross-utterance neurowave-ai/cucvae-tts LJSpeech + 10.18653/v1/2022.acl-long.30 Mix and Match: Learning-free Controllable Text Generation using Energy Language Models @@ -450,6 +480,7 @@ mireshghallah-etal-2022-mix mireshghallah/mixmatch GYAFC + 10.18653/v1/2022.acl-long.31 So Different Yet So Alike! Constrained Unsupervised Text Style Transfer @@ -463,6 +494,7 @@ 2022.acl-long.32 ramesh-kashyap-etal-2022-different abhinavkashyap/dct + 10.18653/v1/2022.acl-long.32 e-<fixed-case>CARE</fixed-case>: a New Dataset for Exploring Explainable Causal Reasoning @@ -479,6 +511,7 @@ COPA CommonsenseQA GenericsKB + 10.18653/v1/2022.acl-long.33 Fantastic Questions and Where to Find Them: <fixed-case>F</fixed-case>airytale<fixed-case>QA</fixed-case> – An Authentic Dataset for Narrative Comprehension @@ -508,6 +541,7 @@ CLOTH NarrativeQA RACE + 10.18653/v1/2022.acl-long.34 <fixed-case>K</fixed-case>a<fixed-case>FSP</fixed-case>: Knowledge-Aware Fuzzy Semantic Parsing for Conversational Question Answering over a Large-Scale Knowledge Base @@ -519,6 +553,7 @@ 2022.acl-long.35.software.zip li-xiong-2022-kafsp CSQA + 10.18653/v1/2022.acl-long.35 Multilingual Knowledge Graph Completion with Self-Supervised Adaptive Graph Alignment @@ -536,6 +571,7 @@ 2022.acl-long.36 huang-etal-2022-multilingual amzn/ss-aga-kgc + 10.18653/v1/2022.acl-long.36 Modeling Hierarchical Syntax Structure with Triplet Position for Source Code Summarization @@ -548,6 +584,7 @@ Automatic code summarization, which aims to describe the source code in natural language, has become an essential task in software maintenance. Our fellow researchers have attempted to achieve such a purpose through various machine learning-based approaches. One key challenge keeping these approaches from being practical lies in failing to retain the semantic structure of source code, which has unfortunately been overlooked by the state-of-the-art. Existing approaches resort to representing the syntax structure of code by modeling the Abstract Syntax Trees (ASTs). However, the hierarchical structures of ASTs have not been well explored. In this paper, we propose CODESCRIBE to model the hierarchical syntax structure of code by introducing a novel triplet position for code summarization. Specifically, CODESCRIBE leverages the graph neural network and Transformer to preserve the structural and sequential information of code, respectively. In addition, we propose a pointer-generator network that pays attention to both the structure and sequential tokens of code for better summary generation. Experiments on two real-world datasets in Java and Python demonstrate the effectiveness of our proposed approach when compared with several state-of-the-art baselines. 
2022.acl-long.37 guo-etal-2022-modeling + 10.18653/v1/2022.acl-long.37 <fixed-case>F</fixed-case>ew<fixed-case>NLU</fixed-case>: Benchmarking State-of-the-Art Methods for Few-Shot Natural Language Understanding @@ -575,6 +612,7 @@ MultiRC SuperGLUE WSC + 10.18653/v1/2022.acl-long.38 Learn to Adapt for Generalized Zero-Shot Text Classification @@ -590,6 +628,7 @@ zhang-etal-2022-learn quareia/lta ATIS + 10.18653/v1/2022.acl-long.39 <fixed-case>T</fixed-case>able<fixed-case>F</fixed-case>ormer: Robust Transformer Modeling for Table-Text Encoding @@ -606,6 +645,7 @@ google-research/tapas SQA TabFact + 10.18653/v1/2022.acl-long.40 Perceiving the World: Question-guided Reinforcement Learning for Text-based Games @@ -620,6 +660,7 @@ 2022.acl-long.41 xu-etal-2022-perceiving yunqiuxu/qwa + 10.18653/v1/2022.acl-long.41 Neural Label Search for Zero-Shot Multi-Lingual Extractive Summarization @@ -635,6 +676,7 @@ jia-etal-2022-neural MLSUM WikiLingua + 10.18653/v1/2022.acl-long.42 Few-Shot Class-Incremental Learning for Named Entity Recognition @@ -649,6 +691,7 @@ Previous work on class-incremental learning for Named Entity Recognition (NER) relies on the assumption that there exists an abundance of labeled data for the training of new classes. In this work, we study a more challenging but practical problem, i.e., few-shot class-incremental learning for NER, where an NER model is trained with only a few labeled samples of the new classes, without forgetting knowledge of the old ones. To alleviate the problem of catastrophic forgetting in few-shot class-incremental learning, we reconstruct synthetic training data of the old classes using the trained NER model, augmenting the training of new classes. We further develop a framework that distills from the existing model with both synthetic data and real data from the current training set. Experimental results show that our approach achieves significant improvements over existing baselines. 
2022.acl-long.43 wang-etal-2022-shot + 10.18653/v1/2022.acl-long.43 Improving Meta-learning for Low-resource Text Classification and Generation via Memory Imitation @@ -666,6 +709,7 @@ 2022.acl-long.44.software.zip zhao-etal-2022-improving PERSONA-CHAT + 10.18653/v1/2022.acl-long.44 Quality Controlled Paraphrase Generation @@ -681,6 +725,7 @@ bandel-etal-2022-quality ibm/quality-controlled-paraphrase-generation COCO + 10.18653/v1/2022.acl-long.45 Controllable Dictionary Example Generation: Generating Example Sentences for Specific Targeted Audiences @@ -691,6 +736,7 @@ 2022.acl-long.46 2022.acl-long.46.software.zip he-yiu-2022-controllable + 10.18653/v1/2022.acl-long.46 <fixed-case>A</fixed-case>ra<fixed-case>T</fixed-case>5: Text-to-Text Transformers for <fixed-case>A</fixed-case>rabic Language Generation @@ -703,6 +749,7 @@ nagoudi-etal-2022-arat5 C4 mC4 + 10.18653/v1/2022.acl-long.47 Legal Judgment Prediction via Event Extraction with Constraints @@ -715,6 +762,7 @@ 2022.acl-long.48.software.zip feng-etal-2022-legal wapay/epm + 10.18653/v1/2022.acl-long.48 Answer-level Calibration for Free-form Multiple Choice Question Answering @@ -733,6 +781,7 @@ SWAG Social IQA WinoGrande + 10.18653/v1/2022.acl-long.49 Learning When to Translate for Streaming Speech @@ -747,6 +796,7 @@ dong-etal-2022-learning dqqcasia/mosst MuST-C + 10.18653/v1/2022.acl-long.50 Compact Token Representations with Contextual Quantization for Efficient Document Re-ranking @@ -759,6 +809,7 @@ yang-etal-2022-compact yingrui-yang/ContextualQuantizer MS MARCO + 10.18653/v1/2022.acl-long.51 Early Stopping Based on Unlabeled Samples in Text Classification @@ -774,6 +825,7 @@ AG News IMDb Movie Reviews SST + 10.18653/v1/2022.acl-long.52 Meta-learning via Language Model In-context Tuning @@ -788,6 +840,7 @@ chen-etal-2022-meta yandachen/in-context-tuning LAMA + 10.18653/v1/2022.acl-long.53 It is <fixed-case>AI</fixed-case>’s Turn to Ask Humans a Question: Question-Answer Pair Generation for Children’s Story Books @@ -806,6 +859,7 @@ MS MARCO NarrativeQA PAQ + 10.18653/v1/2022.acl-long.54 Prompt-Based Rule Discovery and Boosting for Interactive Weakly-Supervised Learning @@ -820,6 +874,7 @@ zhang-etal-2022-prompt rz-zhang/prboost AG News + 10.18653/v1/2022.acl-long.55 Constrained Multi-Task Learning for Bridging Resolution @@ -831,6 +886,7 @@ 2022.acl-long.56 kobayashi-etal-2022-constrained juntaoy/dali-bridging + 10.18653/v1/2022.acl-long.56 <fixed-case>DEAM</fixed-case>: Dialogue Coherence Evaluation using <fixed-case>AMR</fixed-case>-based Semantic Manipulations @@ -846,6 +902,7 @@ FED PERSONA-CHAT Topical-Chat + 10.18653/v1/2022.acl-long.57 <fixed-case>HIBRIDS</fixed-case>: Attention with Hierarchical Biases for Structure-aware Long Document Summarization @@ -855,6 +912,7 @@ Document structure is critical for efficient information consumption. However, it is challenging to encode it efficiently into the modern Transformer architecture. In this work, we present HIBRIDS, which injects Hierarchical Biases foR Incorporating Document Structure into attention score calculation. We further present a new task, hierarchical question-summary generation, for summarizing salient content in the source document into a hierarchy of questions and summaries, where each follow-up question inquires about the content of its parent question-summary pair. We also annotate a new dataset with 6,153 question-summary hierarchies labeled on government reports. 
Experiment results show that our model produces better question-summary hierarchies than comparisons on both hierarchy quality and content coverage, a finding also echoed by human judges. Additionally, our model improves the generation of long-form summaries from long government reports and Wikipedia articles, as measured by ROUGE scores. 2022.acl-long.58 cao-wang-2022-hibrids + 10.18653/v1/2022.acl-long.58 De-Bias for Generative Extraction in Unified <fixed-case>NER</fixed-case> Task @@ -868,6 +926,7 @@ 2022.acl-long.59 zhang-etal-2022-de GENIA + 10.18653/v1/2022.acl-long.59 An Information-theoretic Approach to Prompt Engineering Without Ground Truth Labels @@ -890,6 +949,7 @@ CommonsenseQA IMDb Movie Reviews LAMBADA + 10.18653/v1/2022.acl-long.60 Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation @@ -902,6 +962,7 @@ wang-etal-2022-expanding cindyxinyiwang/expand-via-lexicon-based-adaptation MasakhaNER + 10.18653/v1/2022.acl-long.61 Language-agnostic <fixed-case>BERT</fixed-case> Sentence Embedding @@ -918,6 +979,7 @@ MPQA Opinion Corpus SST SentEval + 10.18653/v1/2022.acl-long.62 Nested Named Entity Recognition with Span-level Graphs @@ -930,6 +992,7 @@ 2022.acl-long.63 wan-etal-2022-nested GENIA + 10.18653/v1/2022.acl-long.63 <fixed-case>C</fixed-case>og<fixed-case>T</fixed-case>askonomy: Cognitively Inspired Task Taxonomy Is Beneficial to Transfer Learning in <fixed-case>NLP</fixed-case> @@ -944,6 +1007,7 @@ GLUE QNLI Taskonomy + 10.18653/v1/2022.acl-long.64 <fixed-case>R</fixed-case>o<fixed-case>CB</fixed-case>ert: Robust <fixed-case>C</fixed-case>hinese Bert with Multimodal Contrastive Pretraining @@ -959,6 +1023,7 @@ 2022.acl-long.65 2022.acl-long.65.software.zip su-etal-2022-rocbert + 10.18653/v1/2022.acl-long.65 Premise-based Multimodal Reasoning: Conditional Inference on Joint Textual and Visual Clues @@ -982,6 +1047,7 @@ SNLI-VE VCR Visual Question Answering + 10.18653/v1/2022.acl-long.66 Parallel Instance Query Network for Named Entity Recognition @@ -1006,6 +1072,7 @@ GENIA NNE OntoNotes 5.0 + 10.18653/v1/2022.acl-long.67 <fixed-case>P</fixed-case>rophet<fixed-case>C</fixed-case>hat: Enhancing Dialogue Generation with Simulation of Future Conversation @@ -1022,6 +1089,7 @@ liu-etal-2022-prophetchat DailyDialog PERSONA-CHAT + 10.18653/v1/2022.acl-long.68 Modeling Multi-hop Question Answering as Single Sequence Prediction @@ -1037,6 +1105,7 @@ HotpotQA IIRC SQuAD + 10.18653/v1/2022.acl-long.69 Learning Disentangled Semantic Representations for Zero-Shot Cross-Lingual Transfer in Multilingual Machine Reading Comprehension @@ -1057,6 +1126,7 @@ TyDi QA TyDiQA-GoldP XQuAD + 10.18653/v1/2022.acl-long.70 Multi-Granularity Structural Knowledge Distillation for Language Model Compression @@ -1074,6 +1144,7 @@ MRPC QNLI SST + 10.18653/v1/2022.acl-long.71 Auto-Debias: Debiasing Masked Language Models with Automated Biased Prompts @@ -1086,6 +1157,7 @@ guo-etal-2022-auto CrowS-Pairs GLUE + 10.18653/v1/2022.acl-long.72 Where to Go for the Holidays: Towards Mixed-Type Dialogs for Clarification of User Goals @@ -1103,6 +1175,7 @@ DuRecDial KdConv MultiWOZ + 10.18653/v1/2022.acl-long.73 Semi-supervised Domain Adaptation for Dependency Parsing with Dynamic Matching Network @@ -1113,6 +1186,7 @@ Supervised parsing models have achieved impressive results on in-domain texts. However, their performances drop drastically on out-of-domain texts due to the data distribution shift. 
The shared-private model has shown its promising advantages for alleviating this problem via feature separation, whereas prior works pay more attention to enhancing shared features but neglect the in-depth relevance of specific ones. To address this issue, we for the first time apply a dynamic matching network on the shared-private model for semi-supervised cross-domain dependency parsing. Meanwhile, considering the scarcity of target-domain labeled data, we leverage unlabeled data from two aspects, i.e., designing a new training strategy to improve the capability of the dynamic matching network and fine-tuning BERT to obtain domain-related contextualized representations. Experiments on benchmark datasets show that our proposed model consistently outperforms various baselines, leading to new state-of-the-art results on all domains. Detailed analysis on different matching strategies demonstrates that it is essential to learn suitable matching weights to emphasize useful features and ignore useless or even harmful ones. Besides, our proposed model can be directly extended to multi-source domain adaptation and achieves the best performance among various baselines, further verifying its effectiveness and robustness. 2022.acl-long.74 li-etal-2022-semi + 10.18653/v1/2022.acl-long.74 A Closer Look at How Fine-tuning Changes <fixed-case>BERT</fixed-case> @@ -1123,6 +1197,7 @@ 2022.acl-long.75 zhou-srikumar-2022-closer utahnlp/BERT-fine-tuning-analysis + 10.18653/v1/2022.acl-long.75 Sentence-aware Contrastive Learning for Open-Domain Passage Retrieval @@ -1137,6 +1212,7 @@ Natural Questions SQuAD TriviaQA + 10.18653/v1/2022.acl-long.76 <fixed-case>F</fixed-case>ai<fixed-case>RR</fixed-case>: Faithful and Robust Deductive Reasoning over Natural Language @@ -1150,6 +1226,7 @@ sanyal-etal-2022-fairr ink-usc/fairr ProofWriter + 10.18653/v1/2022.acl-long.77 <fixed-case>H</fixed-case>i<fixed-case>T</fixed-case>ab: A Hierarchical Table Dataset for Question Answering and Natural Language Generation @@ -1171,6 +1248,7 @@ FinQA ToTTo WikiSQL + 10.18653/v1/2022.acl-long.78 Doctor Recommendation in Online Health Forums via Expertise Learning @@ -1183,6 +1261,7 @@ 2022.acl-long.79 lu-etal-2022-doctor polyusmart/doctor-recommendation + 10.18653/v1/2022.acl-long.79 Continual Prompt Tuning for Dialog State Tracking @@ -1197,6 +1276,7 @@ 2022.acl-long.80.software.zip zhu-etal-2022-continual thu-coai/cpt4dst + 10.18653/v1/2022.acl-long.80 There’s a Time and Place for Reasoning Beyond the Image @@ -1211,6 +1291,7 @@ fu-etal-2022-theres zeyofu/tara WIT + 10.18653/v1/2022.acl-long.81 <fixed-case>FORTAP</fixed-case>: Using Formulas for Numerical-Reasoning-Aware Table Pretraining @@ -1227,6 +1308,7 @@ 2022.acl-long.82.software.zip cheng-etal-2022-fortap microsoft/TUTA_table_understanding + 10.18653/v1/2022.acl-long.82 Multimodal fusion via cortical network inspired losses @@ -1235,6 +1317,7 @@ Information integration from different modalities is an active area of research. Human beings and, in general, biological neural systems are quite adept at using a multitude of signals from different sensory perceptive fields to interact with the environment and each other. Recent work in deep fusion models via neural networks has led to substantial improvements over unimodal approaches in areas like speech recognition, emotion recognition and analysis, captioning and image description. 
However, such research has mostly focused on architectural changes allowing for fusion of different modalities while keeping the model complexity manageable. Inspired by neuroscientific ideas about multisensory integration and processing, we investigate the effect of introducing neural dependencies in the loss functions. Experiments on multimodal sentiment analysis tasks with different models show that our approach provides a consistent performance boost. 2022.acl-long.83 shankar-2022-multimodal + 10.18653/v1/2022.acl-long.83 Modeling Temporal-Modal Entity Graph for Procedural Multimodal Machine Comprehension @@ -1252,6 +1335,7 @@ zhang-etal-2022-modeling RecipeQA Visual Question Answering + 10.18653/v1/2022.acl-long.84 Explanation Graph Generation via Pre-trained Language Models: An Empirical Study with Contrastive Learning @@ -1264,6 +1348,7 @@ 2022.acl-long.85.software.zip saha-etal-2022-explanation swarnahub/explagraphgen + 10.18653/v1/2022.acl-long.85 Unsupervised Extractive Opinion Summarization Using Sparse Coding @@ -1276,6 +1361,7 @@ 2022.acl-long.86.software.zip basu-roy-chowdhury-etal-2022-unsupervised brcsomnath/semae + 10.18653/v1/2022.acl-long.86 <fixed-case>L</fixed-case>ex<fixed-case>S</fixed-case>ub<fixed-case>C</fixed-case>on: Integrating Knowledge from Lexical Resources into Contextual Embeddings for Lexical Substitution @@ -1289,6 +1375,7 @@ 2022.acl-long.87.software.zip michalopoulos-etal-2022-lexsubcon gmichalo/lexsubcon + 10.18653/v1/2022.acl-long.87 Think Before You Speak: Explicitly Generating Implicit Commonsense Knowledge for Response Generation @@ -1307,6 +1394,7 @@ zhou-etal-2022-think ConceptNet MuTual + 10.18653/v1/2022.acl-long.88 Flow-Adapter Architecture for Unsupervised Machine Translation @@ -1317,6 +1405,7 @@ In this work, we propose a flow-adapter architecture for unsupervised NMT. It leverages normalizing flows to explicitly model the distributions of sentence-level latent representations, which are subsequently used in conjunction with the attention mechanism for the translation task. The primary novelties of our model are: (a) capturing language-specific sentence representations separately for each language using normalizing flows and (b) using a simple transformation of these latent representations for translating from one language to another. This architecture allows for unsupervised training of each language independently. While there is prior work on latent variables for supervised MT, to the best of our knowledge, this is the first work that uses latent variables and normalizing flows for unsupervised MT. We obtain competitive results on several unsupervised MT benchmarks. 
2022.acl-long.89 liu-etal-2022-flow + 10.18653/v1/2022.acl-long.89 Efficient Unsupervised Sentence Compression by Fine-tuning Transformers with Reinforcement Learning @@ -1330,6 +1419,7 @@ complementizer/rl-sentence-compression NEWSROOM Sentence Compression + 10.18653/v1/2022.acl-long.90 Tracing Origins: Coreference-aware Machine Reading Comprehension @@ -1346,6 +1436,7 @@ Quoref SQuAD SearchQA + 10.18653/v1/2022.acl-long.91 <fixed-case>W</fixed-case>at<fixed-case>C</fixed-case>laim<fixed-case>C</fixed-case>heck: A new Dataset for Claim Entailment and Inference @@ -1357,6 +1448,7 @@ 2022.acl-long.92 khan-etal-2022-watclaimcheck PUBHEALTH + 10.18653/v1/2022.acl-long.92 <fixed-case>F</fixed-case>rugal<fixed-case>S</fixed-case>core: Learning Cheaper, Lighter and Faster Evaluation Metrics for Automatic Text Generation @@ -1369,6 +1461,7 @@ 2022.acl-long.93 kamal-eddine-etal-2022-frugalscore CNN/Daily Mail + 10.18653/v1/2022.acl-long.93 A Well-Composed Text is Half Done! Composition Sampling for Diverse Conditional Generation @@ -1385,6 +1478,7 @@ narayan-etal-2022-well google-research/language SQuAD + 10.18653/v1/2022.acl-long.94 Synthetic Question Value Estimation for Domain Adaptation of Question Answering @@ -1400,6 +1494,7 @@ Natural Questions NewsQA TriviaQA + 10.18653/v1/2022.acl-long.95 Better Language Model with Hypernym Class Prediction @@ -1415,6 +1510,7 @@ richardbaihe/robustlm WikiText-103 WikiText-2 + 10.18653/v1/2022.acl-long.96 Tackling Fake News Detection by Continually Improving Social Context Representations using Graph Neural Networks @@ -1426,6 +1522,7 @@ 2022.acl-long.97 mehta-etal-2022-tackling hockeybro12/fakenews_inference_operators + 10.18653/v1/2022.acl-long.97 Understanding Gender Bias in Knowledge Base Embeddings @@ -1439,6 +1536,7 @@ Knowledge base (KB) embeddings have been shown to contain gender biases. In this paper, we study two questions regarding these biases: how to quantify them, and how to trace their origins in KB? Specifically, first, we develop two novel bias measures respectively for a group of person entities and an individual person entity. Evidence of their validity is observed by comparison with real-world census data. Second, we use the influence function to inspect the contribution of each triple in KB to the overall group bias. To exemplify the potential applications of our study, we also present two strategies (by adding and removing KB triples) to mitigate gender biases in KB embeddings. 2022.acl-long.98 du-etal-2022-understanding + 10.18653/v1/2022.acl-long.98 Computational Historical Linguistics and Language Diversity in <fixed-case>S</fixed-case>outh <fixed-case>A</fixed-case>sia @@ -1451,6 +1549,7 @@ 2022.acl-long.99 arora-etal-2022-computational Universal Dependencies + 10.18653/v1/2022.acl-long.99 Faithful or Extractive? 
On Mitigating the Faithfulness-Abstractiveness Trade-off in Abstractive Summarization @@ -1465,6 +1564,7 @@ ladhak-etal-2022-faithful fladhak/effective-faithfulness WikiHow + 10.18653/v1/2022.acl-long.100 Slangvolution: <fixed-case>A</fixed-case> Causal Analysis of Semantic Change and Frequency Dynamics in Slang @@ -1478,6 +1578,7 @@ 2022.acl-long.101.software.zip keidar-etal-2022-slangvolution andreasopedal/slangvolution + 10.18653/v1/2022.acl-long.101 Spurious Correlations in Reference-Free Evaluation of Text Generation @@ -1491,6 +1592,7 @@ esdurmus/adversarial_eval DailyDialog PERSONA-CHAT + 10.18653/v1/2022.acl-long.102 On The Ingredients of an Effective Zero-shot Semantic Parser @@ -1502,6 +1604,7 @@ Semantic parsers map natural language utterances into meaning representations (e.g., programs). Such models are typically bottlenecked by the paucity of training data due to the required laborious annotation efforts. Recent studies have performed zero-shot learning by synthesizing training examples of canonical utterances and programs from a grammar, and further paraphrasing these utterances to improve linguistic diversity. However, such synthetic examples cannot fully capture patterns in real data. In this paper we analyze zero-shot parsers through the lenses of the language and logical gaps (Herzig and Berant, 2019), which quantify the discrepancy of language and programmatic patterns between the canonical examples and real-world user-issued ones. We propose bridging these gaps using improved grammars, stronger paraphrasers, and efficient learning methods using canonical examples that most likely reflect real user intents. Our model achieves strong performance on two semantic parsing benchmarks (Scholar, Geo) with zero labeled data. 2022.acl-long.103 yin-etal-2022-ingredients + 10.18653/v1/2022.acl-long.103 Bias Mitigation in Machine Translation Quality Estimation @@ -1516,6 +1619,7 @@ agesb/transquest MLQE-PE WikiMatrix + 10.18653/v1/2022.acl-long.104 Unified Speech-Text Pre-training for Speech Translation and Recognition @@ -1537,6 +1641,7 @@ Libri-Light LibriSpeech MuST-C + 10.18653/v1/2022.acl-long.105 Match the Script, Adapt if Multilingual: Analyzing the Effect of Multilingual Pretraining on Cross-lingual Transferability @@ -1548,6 +1653,7 @@ 2022.acl-long.106 fujinuma-etal-2022-match XNLI + 10.18653/v1/2022.acl-long.106 Structured Pruning Learns Compact and Accurate Models @@ -1566,6 +1672,7 @@ QNLI SQuAD SST + 10.18653/v1/2022.acl-long.107 How can <fixed-case>NLP</fixed-case> Help Revitalize Endangered Languages? A Case Study and Roadmap for the <fixed-case>C</fixed-case>herokee Language @@ -1577,6 +1684,7 @@ 2022.acl-long.108 zhang-etal-2022-nlp zhangshiyue/revitalizecherokee + 10.18653/v1/2022.acl-long.108 Differentiable Multi-Agent Actor-Critic for Multi-Step Radiology Report Summarization @@ -1588,6 +1696,7 @@ The IMPRESSIONS section of a radiology report about an imaging study is a summary of the radiologist’s reasoning and conclusions, and it also aids the referring physician in confirming or excluding certain diagnoses. A cascade of tasks are required to automatically generate an abstractive summary of the typical information-rich radiology report. These tasks include acquisition of salient content from the report and generation of a concise, easily consumable IMPRESSIONS section. Prior research on radiology report summarization has focused on single-step end-to-end models – which subsume the task of salient content acquisition. 
To fully explore the cascade structure and explainability of radiology report summarization, we introduce two innovations. First, we design a two-step approach: extractive summarization followed by abstractive summarization. Second, we additionally break down the extractive part into two independent tasks: extraction of salient (1) sentences and (2) keywords. Experiments on English radiology reports from two clinical sites show our novel approach leads to a more precise summary compared to single-step and to two-step-with-single-extractive-process baselines with an overall improvement in F1 score of 3-4%. 2022.acl-long.109 karn-etal-2022-differentiable + 10.18653/v1/2022.acl-long.109 Online Semantic Parsing for Latency Reduction in Task-Oriented Dialogue @@ -1601,6 +1710,7 @@ Standard conversational semantic parsing maps a complete user utterance into an executable program, after which the program is executed to respond to the user. This could be slow when the program contains expensive function calls. We investigate the opportunity to reduce latency by predicting and executing function calls while the user is still speaking. We introduce the task of online semantic parsing for this purpose, with a formal latency reduction metric inspired by simultaneous machine translation. We propose a general framework with first a learned prefix-to-program prediction module, and then a simple yet effective thresholding heuristic for subprogram selection for early execution. Experiments on the SMCalFlow and TreeDST datasets show our approach achieves large latency reduction with good parsing quality, with a 30%–65% latency reduction depending on function execution time and allowed cost. 2022.acl-long.110 zhou-etal-2022-online + 10.18653/v1/2022.acl-long.110 Few-Shot Tabular Data Enrichment Using Fine-Tuned Transformer Architectures @@ -1611,6 +1721,7 @@ 2022.acl-long.111 2022.acl-long.111.software.zip harari-katz-2022-shot + 10.18653/v1/2022.acl-long.111 <fixed-case>S</fixed-case>umm<tex-math>^N</tex-math>: A Multi-Stage Summarization Framework for Long Input Dialogues and Documents @@ -1630,6 +1741,7 @@ psunlpgroup/summ-n GovReport QMSum + 10.18653/v1/2022.acl-long.112 Open Domain Question Answering with A Unified Knowledge Interface @@ -1648,6 +1760,7 @@ Natural Questions OTT-QA WebQuestions + 10.18653/v1/2022.acl-long.113 Principled Paraphrase Generation with Parallel Corpora @@ -1661,6 +1774,7 @@ 2022.acl-long.114 ormazabal-etal-2022-principled aitorormazabal/paraphrasing-from-parallel + 10.18653/v1/2022.acl-long.114 <fixed-case>G</fixed-case>lobal<fixed-case>W</fixed-case>o<fixed-case>Z</fixed-case>: Globalizing <fixed-case>M</fixed-case>ulti<fixed-case>W</fixed-case>o<fixed-case>Z</fixed-case> to Develop Multilingual Task-Oriented Dialogue Systems @@ -1677,6 +1791,7 @@ 2022.acl-long.115.software.zip ding-etal-2022-globalwoz MultiWOZ + 10.18653/v1/2022.acl-long.115 Domain Knowledge Transferring for Pre-trained Language Model via Calibrated Activation Boundary Distillation @@ -1691,6 +1806,7 @@ dmcb-gist/doktra BLUE HOC + 10.18653/v1/2022.acl-long.116 Retrieval-guided Counterfactual Generation for <fixed-case>QA</fixed-case> @@ -1709,6 +1825,7 @@ Quoref SQuAD TriviaQA + 10.18653/v1/2022.acl-long.117 <fixed-case>DYLE</fixed-case>: Dynamic Latent Extraction for Abstractive Long-Input Summarization @@ -1729,6 +1846,7 @@ yale-lily/dyle GovReport QMSum + 10.18653/v1/2022.acl-long.118 Searching for fingerspelled content in <fixed-case>A</fixed-case>merican <fixed-case>S</fixed-case>ign 
<fixed-case>L</fixed-case>anguage @@ -1740,6 +1858,7 @@ Natural language processing for sign language video—including tasks like recognition, translation, and search—is crucial for making artificial intelligence technologies accessible to deaf individuals, and is gaining research interest in recent years. In this paper, we address the problem of searching for fingerspelled keywords or key phrases in raw sign language videos. This is an important task since significant content in sign language is often conveyed via fingerspelling, and to our knowledge the task has not been studied before. We propose an end-to-end model for this task, FSS-Net, that jointly detects fingerspelling and matches it to a text sequence. Our experiments, done on a large public dataset of ASL fingerspelling in the wild, show the importance of fingerspelling detection as a component of a search and retrieval model. Our model significantly outperforms baseline methods adapted from prior work on related tasks. 2022.acl-long.119 shi-etal-2022-searching + 10.18653/v1/2022.acl-long.119 Skill Induction and Planning with Latent Language @@ -1751,6 +1870,7 @@ 2022.acl-long.120 sharma-etal-2022-skill ALFRED + 10.18653/v1/2022.acl-long.120 <fixed-case>F</fixed-case>ully-<fixed-case>S</fixed-case>emantic <fixed-case>P</fixed-case>arsing and <fixed-case>G</fixed-case>eneration: the <fixed-case>B</fixed-case>abel<fixed-case>N</fixed-case>et <fixed-case>M</fixed-case>eaning <fixed-case>R</fixed-case>epresentation @@ -1762,6 +1882,7 @@ 2022.acl-long.121 martinez-lorenzo-etal-2022-fully sapienzanlp/bmr + 10.18653/v1/2022.acl-long.121 Leveraging Similar Users for Personalized Language Modeling with Limited Data @@ -1774,6 +1895,7 @@ Personalized language models are designed and trained to capture language patterns specific to individual users. This makes them more accurate at predicting what a user will write. However, when a new user joins a platform and not enough text is available, it is harder to build effective personalized language models. We propose a solution for this problem, using a model trained on users that are similar to a new user. In this paper, we explore strategies for finding the similarity between new users and existing ones and methods for using the data from existing users who are a good match. We further explore the trade-off between available data for new users and how well their language can be modeled. 2022.acl-long.122 welch-etal-2022-leveraging + 10.18653/v1/2022.acl-long.122 <fixed-case>DEEP</fixed-case>: <fixed-case>DE</fixed-case>noising Entity Pre-training for Neural Machine Translation @@ -1786,6 +1908,7 @@ 2022.acl-long.123 hu-etal-2022-deep ParaCrawl + 10.18653/v1/2022.acl-long.123 Multi-Modal Sarcasm Detection via Cross-Modal Graph Convolutional Network @@ -1802,6 +1925,7 @@ 2022.acl-long.124 2022.acl-long.124.software.zip liang-etal-2022-multi + 10.18653/v1/2022.acl-long.124 Composable Sparse Fine-Tuning for Cross-Lingual Transfer @@ -1817,6 +1941,7 @@ CoNLL-2003 GLUE MasakhaNER + 10.18653/v1/2022.acl-long.125 Toward Annotator Group Bias in Crowdsourcing @@ -1832,6 +1957,7 @@ Crowdsourcing has emerged as a popular approach for collecting annotated data to train supervised machine learning models. However, annotator bias can lead to defective annotations. Though there are a few works investigating individual annotator bias, the group effects in annotators are largely overlooked. 
In this work, we reveal that annotators within the same demographic group tend to show consistent group bias in annotation tasks and thus we conduct an initial study on annotator group bias. We first empirically verify the existence of annotator group bias in various real-world crowdsourcing datasets. Then, we develop a novel probabilistic graphical framework GroupAnno to capture annotator group bias with an extended Expectation Maximization (EM) algorithm. We conduct experiments on both synthetic and real-world datasets. Experimental results demonstrate the effectiveness of our model in modeling annotator group bias in label aggregation and model learning over competitive baselines. 2022.acl-long.126 liu-etal-2022-toward + 10.18653/v1/2022.acl-long.126 Under the Morphosyntactic Lens: A Multifaceted Evaluation of Gender Bias in Speech Translation @@ -1847,6 +1973,7 @@ mgaido91/FBK-fairseq-ST Europarl-ST WinoBias + 10.18653/v1/2022.acl-long.127 Answering Open-Domain Multi-Answer Questions via a Recall-then-Verify Framework @@ -1858,6 +1985,7 @@ shao-huang-2022-answering zhihongshao/rectify Natural Questions + 10.18653/v1/2022.acl-long.128 Probing as Quantifying Inductive Bias @@ -1871,6 +1999,7 @@ immer-etal-2022-probing BoolQ SuperGLUE + 10.18653/v1/2022.acl-long.129 Probing Structured Pruning on Multilingual Pre-trained Models: Settings, Algorithms, and Efficiency @@ -1892,6 +2021,7 @@ TyDi QA XNLI XQuAD + 10.18653/v1/2022.acl-long.130 <fixed-case>GPT</fixed-case>-<fixed-case>D</fixed-case>: Inducing Dementia-related Linguistic Anomalies by Deliberate Degradation of Artificial Neural Language Models @@ -1906,6 +2036,7 @@ 2022.acl-long.131.software.zip li-etal-2022-gpt linguisticanomalies/hammer-nets + 10.18653/v1/2022.acl-long.131 An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models @@ -1921,6 +2052,7 @@ CrowS-Pairs StereoSet WikiText-2 + 10.18653/v1/2022.acl-long.132 Exploring and Adapting <fixed-case>C</fixed-case>hinese <fixed-case>GPT</fixed-case> to <fixed-case>P</fixed-case>inyin Input Method @@ -1937,6 +2069,7 @@ 2022.acl-long.133 tan-etal-2022-exploring VisualJoyce/Transformers4IME + 10.18653/v1/2022.acl-long.133 Enhancing Cross-lingual Natural Language Inference by Prompt-learning from Cross-lingual Templates @@ -1951,6 +2084,7 @@ qi-etal-2022-enhancing qikunxun/pct PAWS-X + 10.18653/v1/2022.acl-long.134 Sense Embeddings are also Biased – Evaluating Social Biases in Static and Contextualised Sense Embeddings @@ -1964,6 +2098,7 @@ zhou-etal-2022-sense CrowS-Pairs StereoSet + 10.18653/v1/2022.acl-long.135 Hybrid Semantics for Goal-Directed Natural Language Generation @@ -1973,6 +2108,7 @@ We consider the problem of generating natural language given a communicative goal and a world description. We ask the question: is it possible to combine complementary meaning representations to scale a goal-directed NLG system without losing expressiveness? In particular, we consider using two meaning representations, one based on logical semantics and the other based on distributional semantics. We build upon an existing goal-directed generation system, S-STRUCT, which models sentence generation as planning in a Markov decision process. We develop a hybrid approach, which uses distributional semantics to quickly and imprecisely add the main elements of the sentence and then uses first-order logic based semantics to more slowly add the precise details. 
We find that our hybrid method allows S-STRUCT’s generation to scale significantly better in early phases of generation and that the hybrid can often generate sentences with the same quality as S-STRUCT in substantially less time. However, we also observe and give insight into cases where the imprecision in distributional semantics leads to generation that is not as good as using pure logical semantics. 2022.acl-long.136 baumler-ray-2022-hybrid + 10.18653/v1/2022.acl-long.136 Predicting Intervention Approval in Clinical Trials through Multi-Document Summarization @@ -1982,6 +2118,7 @@ Clinical trials offer a fundamental opportunity to discover new treatments and advance medical knowledge. However, the uncertainty of the outcome of a trial can lead to unforeseen costs and setbacks. In this study, we propose a new method to predict the effectiveness of an intervention in a clinical trial. Our method relies on generating an informative summary from multiple documents available in the literature about the intervention under study. Specifically, our method first gathers all the abstracts of PubMed articles related to the intervention. Then, an evidence sentence, which conveys information about the effectiveness of the intervention, is extracted automatically from each abstract. Based on the set of evidence sentences extracted from the abstracts, a short summary about the intervention is constructed. Finally, the produced summaries are used to train a BERT-based classifier, in order to infer the effectiveness of an intervention. To evaluate our proposed method, we introduce a new dataset which is a collection of clinical trials together with their associated PubMed articles. Our experiments demonstrate the effectiveness of producing short informative summaries and using them to predict the effectiveness of an intervention. 2022.acl-long.137 katsimpras-paliouras-2022-predicting + 10.18653/v1/2022.acl-long.137 <fixed-case>B</fixed-case>i<fixed-case>TIIMT</fixed-case>: A Bilingual Text-infilling Method for Interactive Machine Translation @@ -1997,6 +2134,7 @@ 2022.acl-long.138 xiao-etal-2022-bitiimt WMT 2014 + 10.18653/v1/2022.acl-long.138 Distributionally Robust Finetuning <fixed-case>BERT</fixed-case> for Covariate Drift in Spoken Language Understanding @@ -2007,6 +2145,7 @@ In this study, we investigate robustness against covariate drift in spoken language understanding (SLU). Covariate drift can occur in SLU when there is a drift between training and testing regarding what users request or how they request it. To study this, we propose a method that exploits natural variations in data to create a covariate drift in SLU datasets. Experiments show that a state-of-the-art BERT-based model suffers performance loss under this drift. To mitigate the performance loss, we investigate distributionally robust optimization (DRO) for finetuning BERT-based models. We discuss some recent DRO methods, propose two new variants and empirically show that DRO improves robustness under drift. 2022.acl-long.139 broscheit-etal-2022-distributionally + 10.18653/v1/2022.acl-long.139 Enhancing <fixed-case>C</fixed-case>hinese Pre-trained Language Model via Heterogeneous Linguistics Graph @@ -2026,6 +2165,7 @@ CMRC CMRC 2018 DRCD + 10.18653/v1/2022.acl-long.140 Divide and Denoise: Learning from Noisy Labels in Fine-Grained Entity Typing with Cluster-Wise Loss Correction @@ -2037,6 +2177,7 @@ Fine-grained Entity Typing (FET) has made great progress based on distant supervision but still suffers from label noise. 
Existing FET noise learning methods rely on prediction distributions in an instance-independent manner, which causes the problem of confirmation bias. In this work, we propose a clustering-based loss correction framework named Feature Cluster Loss Correction (FCLC) to address these two problems. FCLC first trains a coarse backbone model as a feature extractor and noise estimator. Loss correction is then applied to each feature cluster, learning directly from the noisy labels. Experimental results on three public datasets show that FCLC achieves the best performance over existing competitive systems. Auxiliary experiments further demonstrate that FCLC is stable to hyperparameters and it does help mitigate confirmation bias. We also find that in the extreme case of no clean data, the FCLC framework still achieves competitive performance. 2022.acl-long.141 pang-etal-2022-divide + 10.18653/v1/2022.acl-long.141 Towards Robustness of Text-to-<fixed-case>SQL</fixed-case> Models Against Natural and Realistic Adversarial Table Perturbation @@ -2055,6 +2196,7 @@ ConceptNet SParC WikiSQL + 10.18653/v1/2022.acl-long.142 Overcoming Catastrophic Forgetting beyond Continual Learning: Balanced Training for Neural Machine Translation @@ -2067,6 +2209,7 @@ ictnlp/cokd CIFAR-10 CIFAR-100 + 10.18653/v1/2022.acl-long.143 Metaphors in Pre-Trained Language Models: Probing and Generalization Across Datasets and Languages @@ -2079,6 +2222,7 @@ 2022.acl-long.144.software.zip aghazadeh-etal-2022-metaphors ehsanaghazadeh/metaphors_in_plms + 10.18653/v1/2022.acl-long.144 Discrete Opinion Tree Induction for Aspect-based Sentiment Analysis @@ -2092,6 +2236,7 @@ 2022.acl-long.145.software.zip chen-etal-2022-discrete MAMS + 10.18653/v1/2022.acl-long.145 Investigating Non-local Features for Neural Constituency Parsing @@ -2104,6 +2249,7 @@ 2022.acl-long.146.software.zip cui-etal-2022-investigating ringos/nfc-parser + 10.18653/v1/2022.acl-long.146 Learning from Sibling Mentions with Scalable Graph Inference in Fine-Grained Entity Typing @@ -2119,6 +2265,7 @@ 2022.acl-long.147 2022.acl-long.147.software.zip chen-etal-2022-learning-sibling + 10.18653/v1/2022.acl-long.147 A Variational Hierarchical Model for Neural Cross-Lingual Summarization @@ -2136,6 +2283,7 @@ liang-etal-2022-variational xl2248/vhm LCSTS + 10.18653/v1/2022.acl-long.148 On the Robustness of Question Rewriting Systems to Questions of Varying Hardness @@ -2149,6 +2297,7 @@ ye-etal-2022-robustness CANARD QuAC + 10.18653/v1/2022.acl-long.149 <fixed-case>O</fixed-case>pen<fixed-case>H</fixed-case>ands: Making Sign Language Recognition Accessible with Pose-based Pretrained Models across Languages @@ -2165,6 +2314,7 @@ AUTSL GSL WLASL + 10.18653/v1/2022.acl-long.150 bert2<fixed-case>BERT</fixed-case>: Towards Reusable Pretrained Language Models @@ -2185,6 +2335,7 @@ BookCorpus CoLA GLUE + 10.18653/v1/2022.acl-long.151 Vision-Language Pre-Training for Multimodal Aspect-Based Sentiment Analysis @@ -2196,6 +2347,7 @@ 2022.acl-long.152 ling-etal-2022-vision nustm/vlp-mabsa + 10.18653/v1/2022.acl-long.152 “<fixed-case>Y</fixed-case>ou might think about slightly revising the title”: Identifying Hedges in Peer-tutoring Interactions @@ -2206,6 +2358,7 @@ Hedges have an important role in the management of rapport. 
In peer-tutoring, they are notably used by tutors in dyads experiencing low rapport to tone down the impact of instructions and negative feedback. Pursuing the objective of building a tutoring agent that manages rapport with teenagers in order to improve learning, we used a multimodal peer-tutoring dataset to construct a computational framework for identifying hedges. We compared approaches relying on pre-trained resources with others that integrate insights from the social science literature. Our best performance involved a hybrid approach that outperforms the existing baseline while being easier to interpret. We employ a model explainability tool to explore the features that characterize hedges in peer-tutoring conversations, and we identify some novel features and the benefits of such a hybrid model approach. 2022.acl-long.153 raphalen-etal-2022-might + 10.18653/v1/2022.acl-long.153 Efficient Cluster-Based <tex-math>k</tex-math>-Nearest-Neighbor Machine Translation @@ -2221,6 +2374,7 @@ wang-etal-2022-efficient tjunlp-lab/pckmt WikiMatrix + 10.18653/v1/2022.acl-long.154 Headed-Span-Based Projective Dependency Parsing @@ -2232,6 +2386,7 @@ yang-tu-2022-headed sustcsonglin/span-based-dependency-parsing Penn Treebank + 10.18653/v1/2022.acl-long.155 Decoding Part-of-Speech from Human <fixed-case>EEG</fixed-case> Signals @@ -2243,6 +2398,7 @@ This work explores techniques to predict Part-of-Speech (PoS) tags from neural signals measured at millisecond resolution with electroencephalography (EEG) during text reading. We first show that information about word length, frequency and word class is encoded by the brain at different post-stimulus latencies. We then demonstrate that pre-training on averaged EEG data and data augmentation techniques boost PoS decoding accuracy for single EEG trials. Finally, applying optimised temporally-resolved decoding techniques we show that Transformers substantially outperform linear-SVMs on PoS tagging of unigram and bigram data. 
2022.acl-long.156 murphy-etal-2022-decoding + 10.18653/v1/2022.acl-long.156 Robust Lottery Tickets for Pre-trained Language Models @@ -2263,6 +2419,7 @@ AG News IMDb Movie Reviews SST + 10.18653/v1/2022.acl-long.157 Knowledgeable Prompt-tuning: Incorporating Knowledge into Prompt Verbalizer for Text Classification @@ -2282,6 +2439,7 @@ thunlp/knowledgeableprompttuning C4 IMDb Movie Reviews + 10.18653/v1/2022.acl-long.158 Cross-Lingual Contrastive Learning for Fine-Grained Entity Typing for Low-Resource Languages @@ -2301,6 +2459,7 @@ thunlp/crosset Few-NERD Open Entity + 10.18653/v1/2022.acl-long.159 <fixed-case>MELM</fixed-case>: Data Augmentation with Masked Entity Language Modeling for Low-Resource <fixed-case>NER</fixed-case> @@ -2317,6 +2476,7 @@ 2022.acl-long.160.software.zip zhou-etal-2022-melm randyzhouran/melm + 10.18653/v1/2022.acl-long.160 <fixed-case>W</fixed-case>ord2<fixed-case>B</fixed-case>ox: Capturing Set-Theoretic Semantics of Words using Box Embeddings @@ -2332,6 +2492,7 @@ 2022.acl-long.161 2022.acl-long.161.software.zip dasgupta-etal-2022-word2box + 10.18653/v1/2022.acl-long.161 <fixed-case>IAM</fixed-case>: A Comprehensive and Large-Scale Dataset for Integrated Argument Mining Tasks @@ -2348,6 +2509,7 @@ cheng-etal-2022-iam liyingcheng95/iam IAM Dataset + 10.18653/v1/2022.acl-long.162 <fixed-case>PLANET</fixed-case>: Dynamic Content Planning in Autoregressive Transformers for Long-form Text Generation @@ -2361,6 +2523,7 @@ Despite recent progress of pre-trained language models on generating fluent text, existing methods still suffer from incoherence problems in long-form text generation tasks that require proper content control and planning to form a coherent high-level logical flow. In this work, we propose PLANET, a novel generation framework leveraging autoregressive self-attention mechanism to conduct content planning and surface realization dynamically. To guide the generation of output sentences, our framework enriches the Transformer decoder with latent representations to maintain sentence-level semantic plans grounded by bag-of-words. Moreover, we introduce a new coherence-based contrastive learning objective to further improve the coherence of output. Extensive experiments are conducted on two challenging long-form text generation tasks including counterargument generation and opinion article generation. Both automatic and human evaluations show that our method significantly outperforms strong baselines and generates more coherent texts with richer contents. 2022.acl-long.163 hu-etal-2022-planet + 10.18653/v1/2022.acl-long.163 <fixed-case>CTRLE</fixed-case>val: An Unsupervised Reference-Free Metric for Evaluating Controlled Text Generation @@ -2376,6 +2539,7 @@ 2022.acl-long.164 2022.acl-long.164.software.zip ke-etal-2022-ctrleval + 10.18653/v1/2022.acl-long.164 Beyond the Granularity: Multi-Perspective Dialogue Collaborative Selection for Dialogue State Tracking @@ -2390,6 +2554,7 @@ 2022.acl-long.165.software.zip guo-etal-2022-beyond guojinyu88/dicos-master + 10.18653/v1/2022.acl-long.165 Are Prompt-based Models Clueless? 
@@ -2403,6 +2568,7 @@ GLUE SNLI SuperGLUE + 10.18653/v1/2022.acl-long.166 Learning Confidence for Transformer-based Neural Machine Translation @@ -2417,6 +2583,7 @@ 2022.acl-long.167.software.zip lu-etal-2022-learning yulu-dada/learned-conf-nmt + 10.18653/v1/2022.acl-long.167 Things not Written in Text: Exploring Spatial Commonsense from Visual Signals @@ -2432,6 +2599,7 @@ xxxiaol/spatial-commonsense COCO Relative Size + 10.18653/v1/2022.acl-long.168 Conditional Bilingual Mutual Information Based Adaptive Training for Neural Machine Translation @@ -2447,6 +2615,7 @@ 2022.acl-long.169 zhang-etal-2022-conditional songmzhang/cbmi + 10.18653/v1/2022.acl-long.169 <fixed-case>C</fixed-case>luster<fixed-case>F</fixed-case>ormer: Neural Clustering Attention for Efficient and Effective Transformer @@ -2465,6 +2634,7 @@ MPQA Opinion Corpus SNLI WikiQA + 10.18653/v1/2022.acl-long.170 Bottom-Up Constituency Parsing and Nested Named Entity Recognition with Pointer Networks @@ -2477,6 +2647,7 @@ sustcsonglin/pointer-net-for-nested GENIA Penn Treebank + 10.18653/v1/2022.acl-long.171 Redistributing Low-Frequency Words: Making the Most of Monolingual Data in Non-Autoregressive Translation @@ -2489,6 +2660,7 @@ Knowledge distillation (KD) is the preliminary step for training non-autoregressive translation (NAT) models, which eases the training of NAT models at the cost of losing important information for translating low-frequency words. In this work, we provide an appealing alternative for NAT – monolingual KD, which trains the NAT student on external monolingual data with an AT teacher trained on the original bilingual data. Monolingual KD is able to transfer both the knowledge of the original bilingual data (implicitly encoded in the trained AT teacher model) and that of the new monolingual data to the NAT student model. Extensive experiments on eight WMT benchmarks over two advanced NAT models show that monolingual KD consistently outperforms the standard KD by improving low-frequency word translation, without introducing any computational cost. Monolingual KD enjoys desirable expandability, which can be further enhanced (when given more computational budget) by combining with the standard KD, a reverse monolingual KD, or enlarging the scale of monolingual data. Extensive analyses demonstrate that these techniques can be used together profitably to further recall the useful information lost in the standard KD. Encouragingly, combining with standard KD, our approach achieves 30.4 and 34.1 BLEU points on the WMT14 English-German and German-English datasets, respectively. Our code and trained models are freely available at https://github.com/alphadl/RLFW-NAT.mono. 2022.acl-long.172 ding-etal-2022-redistributing + 10.18653/v1/2022.acl-long.172 Dependency Parsing as <fixed-case>MRC</fixed-case>-based Span-Span Prediction @@ -2507,6 +2679,7 @@ ShannonAI/mrc-for-dependency-parsing Penn Treebank Universal Dependencies + 10.18653/v1/2022.acl-long.173 Adversarial Soft Prompt Tuning for Cross-Domain Sentiment Analysis @@ -2516,6 +2689,7 @@ Cross-domain sentiment analysis has achieved promising results with the help of pre-trained language models. With the emergence of GPT-3, prompt tuning has been widely explored to enable better semantic modeling in many natural language processing tasks. However, directly using a fixed predefined template for cross-domain research cannot model different distributions of the \operatorname{[MASK]} token in different domains, thus underusing the prompt tuning technique.
In this paper, we propose a novel Adversarial Soft Prompt Tuning method (AdSPT) to better model cross-domain sentiment analysis. On the one hand, AdSPT adopts separate soft prompts instead of hard templates to learn different vectors for different domains, thus alleviating the domain discrepancy of the \operatorname{[MASK]} token in the masked language modeling task. On the other hand, AdSPT uses a novel domain adversarial training strategy to learn domain-invariant representations between each source domain and the target domain. Experiments on a publicly available sentiment analysis dataset show that our model achieves new state-of-the-art results for both single-source domain adaptation and multi-source domain adaptation. 2022.acl-long.174 wu-shi-2022-adversarial + 10.18653/v1/2022.acl-long.174 Generating Scientific Claims for Zero-Shot Scientific Fact Checking @@ -2533,6 +2707,7 @@ allenai/scientific-claim-generation FEVER SciFact + 10.18653/v1/2022.acl-long.175 Modeling Dual Read/Write Paths for Simultaneous Machine Translation @@ -2543,6 +2718,7 @@ 2022.acl-long.176 zhang-feng-2022-modeling ictnlp/dual-paths + 10.18653/v1/2022.acl-long.176 <fixed-case>E</fixed-case>xt<fixed-case>E</fixed-case>n<fixed-case>D</fixed-case>: Extractive Entity Disambiguation @@ -2555,6 +2731,7 @@ barba-etal-2022-extend sapienzanlp/extend AIDA CoNLL-YAGO + 10.18653/v1/2022.acl-long.177 Hierarchical Sketch Induction for Paraphrase Generation @@ -2570,6 +2747,7 @@ GLUE Paralex Quora Question Pairs + 10.18653/v1/2022.acl-long.178 Alignment-Augmented Consistent Translation for Multilingual Open Information Extraction @@ -2585,6 +2763,7 @@ kolluru-etal-2022-alignment dair-iitd/moie X-SRL + 10.18653/v1/2022.acl-long.179 Text-to-Table: A New Way of Information Extraction @@ -2598,6 +2777,7 @@ shirley-wu/text_to_table RotoWire WikiBio + 10.18653/v1/2022.acl-long.180 Accelerating Code Search with Deep Hashing and Code Classification @@ -2613,6 +2793,7 @@ 2022.acl-long.181 gu-etal-2022-accelerating CodeSearchNet + 10.18653/v1/2022.acl-long.181 Other Roles Matter! Enhancing Role-Oriented Dialogue Summarization via Role Interactions @@ -2628,6 +2809,7 @@ 2022.acl-long.182.software.zip lin-etal-2022-roles xiaolinandy/rods + 10.18653/v1/2022.acl-long.182 <fixed-case>C</fixed-case>lar<fixed-case>ET</fixed-case>: Pre-training a Correlation-Aware Context-To-Event Transformer for Event-Centric Generation and Classification @@ -2644,6 +2826,7 @@ yczhou001/ClarET GLUE ROCStories + 10.18653/v1/2022.acl-long.183 Measuring and Mitigating Name Biases in Neural Machine Translation @@ -2654,6 +2837,7 @@ Neural Machine Translation (NMT) systems exhibit problematic biases, such as stereotypical gender bias in the translation of occupation terms into languages with grammatical gender. In this paper we describe a new source of bias prevalent in NMT systems, relating to translations of sentences containing person names. To correctly translate such sentences, an NMT system needs to determine the gender of the name. We show that leading systems are particularly poor at this task, especially for female given names. This bias is deeper than given name gender: we show that the translation of terms with ambiguous sentiment can also be affected by person names, and the same holds true for proper nouns denoting race. To mitigate these biases we propose a simple but effective data augmentation method based on randomly switching entities during translation, which effectively eliminates the problem without any effect on translation quality.
2022.acl-long.184 wang-etal-2022-measuring + 10.18653/v1/2022.acl-long.184 Understanding and Improving Sequence-to-Sequence Pretraining for Neural Machine Translation @@ -2668,6 +2852,7 @@ In this paper, we present a substantial step in better understanding the SOTA sequence-to-sequence (Seq2Seq) pretraining for neural machine translation (NMT). We focus on studying the impact of the jointly pretrained decoder, which is the main difference between Seq2Seq pretraining and previous encoder-based pretraining approaches for NMT. By carefully designing experiments on three language pairs, we find that Seq2Seq pretraining is a double-edged sword: On one hand, it helps NMT models to produce more diverse translations and reduce adequacy-related translation errors. On the other hand, the discrepancies between Seq2Seq pretraining and NMT finetuning limit the translation quality (i.e., domain discrepancy) and induce the over-estimation issue (i.e., objective discrepancy). Based on these observations, we further propose simple and effective strategies, named in-domain pretraining and input adaptation to remedy the domain and objective discrepancies, respectively. Experimental results on several language pairs show that our approach can consistently improve both translation performance and model robustness upon Seq2Seq pretraining. 2022.acl-long.185 wang-etal-2022-understanding + 10.18653/v1/2022.acl-long.185 <fixed-case>MSCTD</fixed-case>: A Multimodal Sentiment Chat Translation Dataset @@ -2684,6 +2869,7 @@ BMELD MELD OpenViDial + 10.18653/v1/2022.acl-long.186 Learning Disentangled Textual Representations via Statistical Measures of Similarity @@ -2696,6 +2882,7 @@ 2022.acl-long.187 2022.acl-long.187.software.zip colombo-etal-2022-learning + 10.18653/v1/2022.acl-long.187 On the Sensitivity and Stability of Model Interpretations in <fixed-case>NLP</fixed-case> @@ -2710,6 +2897,7 @@ uclanlp/nlp-interpretation-faithfulness AG News SST + 10.18653/v1/2022.acl-long.188 Down and Across: Introducing Crossword-Solving as a New <fixed-case>NLP</fixed-case> Benchmark @@ -2721,6 +2909,7 @@ Solving crossword puzzles requires diverse reasoning capabilities, access to a vast amount of knowledge about language and the world, and the ability to satisfy the constraints imposed by the structure of the puzzle. In this work, we introduce solving crossword puzzles as a new natural language understanding task. We release a corpus of crossword puzzles collected from the New York Times daily crossword spanning 25 years and comprised of a total of around nine thousand puzzles. These puzzles include a diverse set of clues: historic, factual, word meaning, synonyms/antonyms, fill-in-the-blank, abbreviations, prefixes/suffixes, wordplay, and cross-lingual, as well as clues that depend on the answers to other clues. We separately release the clue-answer pairs from these puzzles as an open-domain question answering dataset containing over half a million unique clue-answer pairs. For the question answering task, our baselines include several sequence-to-sequence and retrieval-based generative models. We also introduce a non-parametric constraint satisfaction baseline for solving the entire crossword puzzle. Finally, we propose an evaluation framework which consists of several complementary performance metrics. 
2022.acl-long.189 kulshreshtha-etal-2022-across + 10.18653/v1/2022.acl-long.189 Generating Data to Mitigate Spurious Correlations in Natural Language Inference Datasets @@ -2737,6 +2926,7 @@ HANS MultiNLI SNLI + 10.18653/v1/2022.acl-long.190 <fixed-case>GL</fixed-case>-<fixed-case>CL</fixed-case>e<fixed-case>F</fixed-case>: A Global–Local Contrastive Learning Framework for Cross-lingual Spoken Language Understanding @@ -2753,6 +2943,7 @@ 2022.acl-long.191.software.zip qin-etal-2022-gl lightchen233/gl-clef + 10.18653/v1/2022.acl-long.191 Good Examples Make A Faster Learner: Simple Demonstration-based Learning for Low-resource <fixed-case>NER</fixed-case> @@ -2773,6 +2964,7 @@ lee-etal-2022-good ink-usc/fewner BC5CDR + 10.18653/v1/2022.acl-long.192 Contextual Representation Learning beyond Masked Language Modeling @@ -2790,6 +2982,7 @@ MRPC QNLI SST + 10.18653/v1/2022.acl-long.193 Efficient Hyper-parameter Search for Knowledge Graph Embedding @@ -2804,6 +2997,7 @@ automl-research/kgtuner FB15k-237 OGB + 10.18653/v1/2022.acl-long.194 A Meta-framework for Spatiotemporal Quantity Extraction from Text @@ -2817,6 +3011,7 @@ News events are often associated with quantities (e.g., the number of COVID-19 patients or the number of arrests in a protest), and it is often important to extract their type, time, and location from unstructured text in order to analyze these quantity events. This paper thus formulates the NLP problem of spatiotemporal quantity extraction, and proposes the first meta-framework for solving it. This meta-framework contains a formalism that decomposes the problem into several information extraction tasks, a shareable crowdsourcing pipeline, and transformer-based baseline models. We demonstrate the meta-framework in three domains—the COVID-19 pandemic, Black Lives Matter protests, and 2020 California wildfires—to show that the formalism is general and extensible, the crowdsourcing pipeline facilitates fast and high-quality data annotation, and the baseline system can handle spatiotemporal quantity extraction well enough to be practically useful. We release all resources for future research on this topic at https://github.com/steqe. 
2022.acl-long.195 ning-etal-2022-meta + 10.18653/v1/2022.acl-long.195 Leveraging Visual Knowledge in Language Tasks: An Empirical Study on Intermediate Pre-training for Cross-Modal Knowledge Transfer @@ -2838,6 +3033,7 @@ PIQA WikiText-103 WikiText-2 + 10.18653/v1/2022.acl-long.196 A Good Prompt Is Worth Millions of Parameters: Low-resource Prompt-based Learning for Vision-Language Models @@ -2858,6 +3054,7 @@ OK-VQA Visual Genome nocaps + 10.18653/v1/2022.acl-long.197 Continual Few-shot Relation Learning via Embedding Space Regularization and Data Augmentation @@ -2870,6 +3067,7 @@ qin-joty-2022-continual qcwthu/continual_fewshot_relation_learning FewRel + 10.18653/v1/2022.acl-long.198 Variational Graph Autoencoding as Cheap Supervision for <fixed-case>AMR</fixed-case> Coreference Resolution @@ -2882,6 +3080,7 @@ 2022.acl-long.199 li-etal-2022-variational AMR Bank + 10.18653/v1/2022.acl-long.199 Identifying <fixed-case>C</fixed-case>hinese Opinion Expressions with Extremely-Noisy Crowdsourcing Annotations @@ -2897,6 +3096,7 @@ 2022.acl-long.200.software.zip zhang-etal-2022-identifying MPQA Opinion Corpus + 10.18653/v1/2022.acl-long.200 Sequence-to-Sequence Knowledge Graph Completion and Question Answering @@ -2915,6 +3115,7 @@ WebQuestions WebQuestionsSP WikiMovies + 10.18653/v1/2022.acl-long.201 Learning to Mediate Disparities Towards Pragmatic Communication @@ -2926,6 +3127,7 @@ 2022.acl-long.202 bao-etal-2022-learning sled-group/pragmatic-rational-speaker + 10.18653/v1/2022.acl-long.202 Unsupervised Corpus Aware Language Model Pre-training for Dense Passage Retrieval @@ -2939,6 +3141,7 @@ MS MARCO Natural Questions TriviaQA + 10.18653/v1/2022.acl-long.203 Multimodal Dialogue Response Generation @@ -2958,6 +3161,7 @@ 2022.acl-long.204.software.zip sun-etal-2022-multimodal ImageNet + 10.18653/v1/2022.acl-long.204 <fixed-case>CAKE</fixed-case>: A Scalable Commonsense-Aware Framework For Multi-View Knowledge Graph Completion @@ -2973,6 +3177,7 @@ ConceptNet FB15k-237 NELL-995 + 10.18653/v1/2022.acl-long.205 Confidence Based Bidirectional Global Context Aware Training Framework for Neural Machine Translation @@ -2987,6 +3192,7 @@ 2022.acl-long.206 2022.acl-long.206.software.zip zhou-etal-2022-confidence + 10.18653/v1/2022.acl-long.206 <fixed-case>BRIO</fixed-case>: Bringing Order to Abstractive Summarization @@ -3001,6 +3207,7 @@ yixinl7/brio CNN/Daily Mail XSum + 10.18653/v1/2022.acl-long.207 Leveraging Relaxed Equilibrium by Lazy Transition for Sequence Modeling @@ -3012,6 +3219,7 @@ 2022.acl-long.208.software.zip ai-fang-2022-leveraging LAMBADA + 10.18653/v1/2022.acl-long.208 <fixed-case>FIBER</fixed-case>: Fill-in-the-Blanks as a Challenging Video Understanding Evaluation Framework @@ -3031,6 +3239,7 @@ ActivityNet Captions VATEX Visual Question Answering + 10.18653/v1/2022.acl-long.209 <fixed-case>K</fixed-case>en<fixed-case>M</fixed-case>e<fixed-case>SH</fixed-case>: Knowledge-enhanced End-to-end Biomedical Text Labelling @@ -3042,6 +3251,7 @@ 2022.acl-long.210 wang-etal-2022-kenmesh xdwang0726/kenmesh + 10.18653/v1/2022.acl-long.210 A Taxonomy of Empathetic Questions in Social Dialogs @@ -3055,6 +3265,7 @@ 2022.acl-long.211.software.zip svikhnushina-etal-2022-taxonomy sea94/eqt + 10.18653/v1/2022.acl-long.211 Enhanced Multi-Channel Graph Convolutional Network for Aspect Sentiment Triplet Extraction @@ -3069,6 +3280,7 @@ 2022.acl-long.212.software.zip chen-etal-2022-enhanced ccchenhao997/emcgcn-aste + 10.18653/v1/2022.acl-long.212 
<fixed-case>P</fixed-case>roto<fixed-case>TE</fixed-case>x: Explaining Model Decisions with Prototype Tensors @@ -3082,6 +3294,7 @@ 2022.acl-long.213 das-etal-2022-prototex anubrata/prototex + 10.18653/v1/2022.acl-long.213 Show Me More Details: Discovering Hierarchies of Procedures from Semi-structured Web Data @@ -3099,6 +3312,7 @@ zhou-etal-2022-show shuyanzhou/wikihow_hierarchy HowTo100M + 10.18653/v1/2022.acl-long.214 Cross-Modal Discrete Representation Learning @@ -3115,6 +3329,7 @@ ImageNet MSR-VTT Places205 + 10.18653/v1/2022.acl-long.215 Improving Event Representation via Simultaneous Weakly Supervised Contrastive Learning and Clustering @@ -3130,6 +3345,7 @@ 2022.acl-long.216.software.zip gao-etal-2022-improving gaojun4ever/swcc4event + 10.18653/v1/2022.acl-long.216 Contrastive Visual Semantic Pretraining Magnifies the Semantics of Natural Language Representations @@ -3140,6 +3356,7 @@ 2022.acl-long.217 2022.acl-long.217.software.zip wolfe-caliskan-2022-contrastive + 10.18653/v1/2022.acl-long.217 <fixed-case>C</fixed-case>on<fixed-case>T</fixed-case>in<fixed-case>T</fixed-case>in: Continual Learning from Task Instructions @@ -3150,6 +3367,7 @@ The mainstream machine learning paradigms for NLP often work with two underlying presumptions. First, the target task is predefined and static; a system merely needs to learn to solve it exclusively. Second, the supervision of a task mainly comes from a set of labeled examples. A question arises: how to build a system that can keep learning new tasks from their instructions? This work defines a new learning paradigm ConTinTin (Continual Learning from Task Instructions), in which a system should learn a sequence of new tasks one by one, where each task is explained by a piece of textual instruction. The system is required to (i) generate the expected outputs of a new task by learning from its instruction, (ii) transfer the knowledge acquired from upstream tasks to help solve downstream tasks (i.e., forward-transfer), and (iii) retain or even improve the performance on earlier tasks after learning new tasks (i.e., backward-transfer). This new problem is studied on a stream of more than 60 tasks, each equipped with an instruction. Technically, our method InstructionSpeak contains two strategies that make full use of task instructions to improve forward-transfer and backward-transfer: one is to learn from negative outputs, the other is to re-visit instructions of previous tasks. To our knowledge, this is the first study of ConTinTin in NLP. In addition to the problem formulation and our promising approach, this work also contributes to providing rich analyses for the community to better understand this novel learning problem.
2022.acl-long.218 yin-etal-2022-contintin + 10.18653/v1/2022.acl-long.218 Automated Crossword Solving @@ -3166,6 +3384,7 @@ 2022.acl-long.219.software.zip wallace-etal-2022-automated albertkx/berkeley-crossword-solver + 10.18653/v1/2022.acl-long.219 Learned Incremental Representations for Parsing @@ -3179,6 +3398,7 @@ kitaev-etal-2022-learned thomaslu2000/incremental-parsing-representations Penn Treebank + 10.18653/v1/2022.acl-long.220 Knowledge Enhanced Reflection Generation for Counseling Dialogues @@ -3192,6 +3412,7 @@ 2022.acl-long.221 shen-etal-2022-knowledge ConceptNet + 10.18653/v1/2022.acl-long.221 Misinfo Reaction Frames: Reasoning about Readers’ Reactions to News Headlines @@ -3209,6 +3430,7 @@ skgabriel/mrf-modeling CoAID RealNews + 10.18653/v1/2022.acl-long.222 On Continual Model Refinement in Out-of-Distribution Data Streams @@ -3227,6 +3449,7 @@ Natural Questions SQuAD SearchQA + 10.18653/v1/2022.acl-long.223 Achieving Conversational Goals with Unsupervised Post-hoc Knowledge Injection @@ -3240,6 +3463,7 @@ 2022.acl-long.224.software.zip majumder-etal-2022-achieving majumderb/poki + 10.18653/v1/2022.acl-long.224 Generated Knowledge Prompting for Commonsense Reasoning @@ -3261,6 +3485,7 @@ ConceptNet NumerSense QASC + 10.18653/v1/2022.acl-long.225 Training Data is More Valuable than You Think: A Simple and Effective Method by Retrieving from Training Data @@ -3287,6 +3512,7 @@ WikiHow WikiText-103 WikiText-2 + 10.18653/v1/2022.acl-long.226 Life after <fixed-case>BERT</fixed-case>: What do Other Muppets Understand about Language? @@ -3300,6 +3526,7 @@ lialin-etal-2022-life kev-zhao/life-after-bert WebText + 10.18653/v1/2022.acl-long.227 Tailor: Generating and Perturbing Text with Semantic Controls @@ -3317,6 +3544,7 @@ SNLI StylePTB Universal Dependencies + 10.18653/v1/2022.acl-long.228 <fixed-case>T</fixed-case>ruthful<fixed-case>QA</fixed-case>: Measuring How Models Mimic Human Falsehoods @@ -3330,6 +3558,7 @@ lin-etal-2022-truthfulqa sylinrl/truthfulqa TruthfulQA + 10.18653/v1/2022.acl-long.229 Adaptive Testing and Debugging of <fixed-case>NLP</fixed-case> Models @@ -3340,6 +3569,7 @@ 2022.acl-long.230 ribeiro-lundberg-2022-adaptive PAWS + 10.18653/v1/2022.acl-long.230 Right for the Right Reason: Evidence Extraction for Trustworthy Tabular Reasoning @@ -3354,6 +3584,7 @@ 2022.acl-long.231 gupta-etal-2022-right TabFact + 10.18653/v1/2022.acl-long.231 Interactive Word Completion for <fixed-case>P</fixed-case>lains <fixed-case>C</fixed-case>ree @@ -3364,6 +3595,7 @@ The composition of richly-inflected words in morphologically complex languages can be a challenge for language learners developing literacy. Accordingly, Lane and Bird (2020) proposed a finite state approach which maps prefixes in a language to a set of possible completions up to the next morpheme boundary, for the incremental building of complex words. In this work, we develop an approach to morph-based auto-completion based on a finite state morphological analyzer of Plains Cree (nêhiyawêwin), showing the portability of the concept to a much larger, more complete morphological transducer. Additionally, we propose and compare various novel ranking strategies on the morph auto-complete output. The best weighting scheme ranks the target completion in the top 10 results in 64.9% of queries, and in the top 50 in 73.9% of queries. 
2022.acl-long.232 lane-etal-2022-interactive + 10.18653/v1/2022.acl-long.232 <fixed-case>LAG</fixed-case>r: Label Aligned Graphs for Better Systematic Generalization in Semantic Parsing @@ -3374,6 +3606,7 @@ 2022.acl-long.233 jambor-bahdanau-2022-lagr CFQ + 10.18653/v1/2022.acl-long.233 <fixed-case>T</fixed-case>oxi<fixed-case>G</fixed-case>en: A Large-Scale Machine-Generated Dataset for Adversarial and Implicit Hate Speech Detection @@ -3391,6 +3624,7 @@ ToxiGen Hate Speech Implicit Hate + 10.18653/v1/2022.acl-long.234 Direct Speech-to-Speech Translation With Discrete Units @@ -3411,6 +3645,7 @@ 2022.acl-long.235 lee-etal-2022-direct LibriSpeech + 10.18653/v1/2022.acl-long.235 Hallucinated but Factual! Inspecting the Factuality of Hallucinations in Abstractive Summarization @@ -3423,6 +3658,7 @@ 2022.acl-long.236.software.zip cao-etal-2022-hallucinated mcao516/entfa + 10.18653/v1/2022.acl-long.236 <fixed-case>E</fixed-case>nt<fixed-case>SUM</fixed-case>: A Data Set for Entity-Centric Extractive Summarization @@ -3434,6 +3670,7 @@ 2022.acl-long.237 maddela-etal-2022-entsum bloomberg/entsum + 10.18653/v1/2022.acl-long.237 Sentence-level Privacy for Document Embeddings @@ -3445,6 +3682,7 @@ 2022.acl-long.238 meehan-etal-2022-sentence IMDb Movie Reviews + 10.18653/v1/2022.acl-long.238 Dataset Geography: Mapping Language Data to Language Users @@ -3460,6 +3698,7 @@ Natural Questions SQuAD TyDi QA + 10.18653/v1/2022.acl-long.239 <fixed-case>ILDAE</fixed-case>: Instance-Level Difficulty Analysis of Evaluation Data @@ -3479,6 +3718,7 @@ SNLI SWAG WinoGrande + 10.18653/v1/2022.acl-long.240 Image Retrieval from Contextual Descriptions @@ -3496,6 +3736,7 @@ Spot-the-diff Video Storytelling YouCook + 10.18653/v1/2022.acl-long.241 Multilingual Molecular Representation Learning via Contrastive Pre-training @@ -3509,6 +3750,7 @@ 2022.acl-long.242 guo-etal-2022-multilingual MoleculeNet + 10.18653/v1/2022.acl-long.242 Investigating Failures of Automatic Translation @@ -3521,6 +3763,7 @@ in the Case of Unambiguous Gender 2022.acl-long.243 2022.acl-long.243.software.zip renduchintala-williams-2022-investigating + 10.18653/v1/2022.acl-long.243 Cross-Task Generalization via Natural Language Crowdsourcing Instructions @@ -3540,6 +3783,7 @@ in the Case of Unambiguous Gender QASC Quoref WinoGrande + 10.18653/v1/2022.acl-long.244 Imputing Out-of-Vocabulary Embeddings with <fixed-case>LOVE</fixed-case> Makes <fixed-case>L</fixed-case>anguage<fixed-case>M</fixed-case>odels Robust with Little Cost @@ -3553,6 +3797,7 @@ in the Case of Unambiguous Gender chen-etal-2022-imputing tigerchen52/love SST + 10.18653/v1/2022.acl-long.245 <fixed-case>N</fixed-case>um<fixed-case>GLUE</fixed-case>: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks @@ -3570,6 +3815,7 @@ in the Case of Unambiguous Gender GLUE MATH SuperGLUE + 10.18653/v1/2022.acl-long.246 <fixed-case>U</fixed-case>pstream <fixed-case>M</fixed-case>itigation <fixed-case>I</fixed-case>s <i> @@ -3584,6 +3830,7 @@ in the Case of Unambiguous Gender 2022.acl-long.247 2022.acl-long.247.software.zip steed-etal-2022-upstream + 10.18653/v1/2022.acl-long.247 Improving Multi-label Malevolence Detection in Dialogues through Multi-faceted Label Correlation Enhancement @@ -3598,6 +3845,7 @@ in the Case of Unambiguous Gender 2022.acl-long.248.software.zip zhang-etal-2022-improving-multi repozhang/malevolent_dialogue + 10.18653/v1/2022.acl-long.248 How Do We Answer Complex Questions: Discourse Structure of Long-form Answers @@ -3611,6 +3859,7 @@ in the Case of 
Unambiguous Gender utcsnlp/lfqa_discourse ELI5 Natural Questions + 10.18653/v1/2022.acl-long.249 Understanding Iterative Revision from Human-Written Text @@ -3625,6 +3874,7 @@ in the Case of Unambiguous Gender 2022.acl-long.250 du-etal-2022-understanding-iterative vipulraheja/iterater + 10.18653/v1/2022.acl-long.250 Making Transformers Solve Compositional Tasks @@ -3640,6 +3890,7 @@ in the Case of Unambiguous Gender google-research/google-research CFQ SCAN + 10.18653/v1/2022.acl-long.251 Can Transformer be Too Compositional? Analysing Idiom Processing in Neural Machine Translation @@ -3651,6 +3902,7 @@ in the Case of Unambiguous Gender 2022.acl-long.252 2022.acl-long.252.software.zip dankers-etal-2022-transformer + 10.18653/v1/2022.acl-long.252 <fixed-case>C</fixed-case>onditional<fixed-case>QA</fixed-case>: A Complex Reading Comprehension Dataset with Conditional Answers @@ -3666,6 +3918,7 @@ in the Case of Unambiguous Gender PolicyQA QASPER ShARC + 10.18653/v1/2022.acl-long.253 Prompt-free and Efficient Few-shot Learning with Language Models @@ -3688,6 +3941,7 @@ in the Case of Unambiguous Gender SST SuperGLUE WiC + 10.18653/v1/2022.acl-long.254 Continual Sequence Generation with Adaptive Compositional Modules @@ -3702,6 +3956,7 @@ in the Case of Unambiguous Gender GT-SALT/Adaptive-Compositional-Modules MultiWOZ WikiSQL + 10.18653/v1/2022.acl-long.255 An Investigation of the (In)effectiveness of Counterfactually Augmented Data @@ -3714,6 +3969,7 @@ in the Case of Unambiguous Gender joshi-he-2022-investigation joshinh/investigation-cad BoolQ + 10.18653/v1/2022.acl-long.256 Inducing Positive Perspectives with Text Reframing @@ -3727,6 +3983,7 @@ in the Case of Unambiguous Gender 2022.acl-long.257 ziems-etal-2022-inducing gt-salt/positive-frames + 10.18653/v1/2022.acl-long.257 <fixed-case>VALUE</fixed-case>: <fixed-case>U</fixed-case>nderstanding Dialect Disparity in <fixed-case>NLU</fixed-case> @@ -3742,6 +3999,7 @@ in the Case of Unambiguous Gender CoLA GLUE QNLI + 10.18653/v1/2022.acl-long.258 From the Detection of Toxic Spans in Online Discussions to the Analysis of Toxic-to-Civil Transfer @@ -3755,6 +4013,7 @@ in the Case of Unambiguous Gender 2022.acl-long.259 pavlopoulos-etal-2022-detection ipavlopoulos/toxic_spans + 10.18653/v1/2022.acl-long.259 <fixed-case>F</fixed-case>orm<fixed-case>N</fixed-case>et: Structural Encoding beyond Sequential Modeling in Form Document Information Extraction @@ -3773,6 +4032,7 @@ in the Case of Unambiguous Gender 2022.acl-long.260 lee-etal-2022-formnet FUNSD + 10.18653/v1/2022.acl-long.260 The Moral Integrity Corpus: A Benchmark for Ethical Dialogue Systems @@ -3787,6 +4047,7 @@ in the Case of Unambiguous Gender ziems-etal-2022-moral gt-salt/mic ETHICS + 10.18653/v1/2022.acl-long.261 Token Dropping for Efficient <fixed-case>BERT</fixed-case> Pretraining @@ -3805,6 +4066,7 @@ in the Case of Unambiguous Gender QNLI SQuAD SST + 10.18653/v1/2022.acl-long.262 <fixed-case>D</fixed-case>ial<fixed-case>F</fixed-case>act: A Benchmark for Fact-Checking in Dialogue @@ -3821,6 +4083,7 @@ in the Case of Unambiguous Gender FEVER VitaminC Wizard of Wikipedia + 10.18653/v1/2022.acl-long.263 The Trade-offs of Domain Adaptation for Neural Language Models @@ -3830,6 +4093,7 @@ in the Case of Unambiguous Gender This work connects language model adaptation with concepts of machine learning theory. We consider a training setup with a large out-of-domain set and a small in-domain set. 
We derive how the benefit of training a model on either set depends on the size of the sets and the distance between their underlying distributions. We analyze how out-of-domain pre-training before in-domain fine-tuning achieves better generalization than either solution independently. Finally, we show how adaptation techniques based on data selection, such as importance sampling, intelligent data selection and influence functions, can be presented in a common framework which highlights their similarity and also their subtle differences. 2022.acl-long.264 grangier-iter-2022-trade + 10.18653/v1/2022.acl-long.264 Towards Afrocentric <fixed-case>NLP</fixed-case> for <fixed-case>A</fixed-case>frican Languages: Where We Are and Where We Can Go @@ -3839,6 +4103,7 @@ Aligning with ACL 2022 special Theme on “Language Diversity: from Low Resource to Endangered Languages”, we discuss the major linguistic and sociopolitical challenges facing development of NLP technologies for African languages. Situating African languages in a typological framework, we discuss how the particulars of these languages can be harnessed. To facilitate future research, we also highlight current efforts, communities, venues, datasets, and tools. Our main objective is to motivate and advocate for an Afrocentric approach to technology development. With this in mind, we recommend what technologies to build and how to build, evaluate, and deploy them based on the needs of local African communities. 2022.acl-long.265 adebara-abdul-mageed-2022-towards + 10.18653/v1/2022.acl-long.265 Ensembling and Knowledge Distilling of Large Sequence Taggers for Grammatical Error Correction @@ -3853,6 +4118,7 @@ makstarnavskyi/gector-large FCE WI-LOCNESS + 10.18653/v1/2022.acl-long.266 Speaker Information Can Guide Models to Better Inductive Biases: A Case Study On Predicting Code-Switching @@ -3866,6 +4132,7 @@ 2022.acl-long.267.software.zip ostapenko-etal-2022-speaker ostapen/switch-and-explain + 10.18653/v1/2022.acl-long.267 Detecting Unassimilated Borrowings in <fixed-case>S</fixed-case>panish: <fixed-case>A</fixed-case>n Annotated Corpus and Approaches to Modeling @@ -3877,6 +4144,7 @@ 2022.acl-long.268.software.zip alvarez-mellado-lignos-2022-detecting lirondos/coalas + 10.18653/v1/2022.acl-long.268 Is Attention Explanation? An Introduction to the Debate @@ -3891,6 +4159,7 @@ The performance of deep learning models in NLP and other fields of machine learning has led to a rise in their popularity, and so the need for explanations of these models becomes paramount. Attention has been seen as a solution to increase performance, while providing some explanations. However, a debate has started to cast doubt on the explanatory power of attention in neural networks. Although the debate has created a vast literature thanks to contributions from various areas, the lack of communication is becoming more and more tangible. In this paper, we provide a clear overview of the insights on the debate by critically confronting works from these different areas. This holistic vision can be of great interest for future works in all the communities concerned by this debate. We sum up the main challenges spotted in these areas, and we conclude by discussing the most promising future avenues on attention as an explanation.
2022.acl-long.269 bibal-etal-2022-attention + 10.18653/v1/2022.acl-long.269 There Are a Thousand Hamlets in a Thousand People’s Eyes: Enhancing Knowledge-grounded Dialogue with Personal Memory @@ -3904,6 +4173,7 @@ 2022.acl-long.270 2022.acl-long.270.software.zip fu-etal-2022-thousand + 10.18653/v1/2022.acl-long.270 Neural Pipeline for Zero-Shot Data-to-Text Generation @@ -3915,6 +4185,7 @@ kasner-dusek-2022-neural kasnerz/zeroshot-d2t-pipeline WikiSplit + 10.18653/v1/2022.acl-long.271 Not always about you: Prioritizing community needs when developing endangered language technology @@ -3926,6 +4197,7 @@ Languages are classified as low-resource when they lack the quantity of data necessary for training statistical and machine learning tools and models. Causes of resource scarcity vary but can include poor access to technology for developing these resources, a relatively small population of speakers, or a lack of urgency for collecting such resources in bilingual populations where the second language is high-resource. As a result, the languages described as low-resource in the literature are as different as Finnish on the one hand, with millions of speakers using it in every imaginable domain, and Seneca, with only a small handful of fluent speakers using the language primarily in a restricted domain. While issues stemming from the lack of resources necessary to train models unite this disparate group of languages, many other issues cut across the divide between widely-spoken low-resource languages and endangered languages. In this position paper, we discuss the unique technological, cultural, practical, and ethical challenges that researchers and indigenous speech community members face when working together to develop language technology to support endangered language documentation and revitalization. We report the perspectives of language teachers, Master Speakers and elders from indigenous communities, as well as the point of view of academics. We describe an ongoing fruitful collaboration and make recommendations for future partnerships between academic researchers and language community stakeholders. 2022.acl-long.272 liu-etal-2022-always + 10.18653/v1/2022.acl-long.272 Automatic Identification and Classification of Bragging in Social Media @@ -3937,6 +4209,7 @@ Bragging is a speech act employed with the goal of constructing a favorable self-image through positive statements about oneself. It is widespread in daily communication and especially popular in social media, where users aim to build a positive image of their persona directly or indirectly. In this paper, we present the first large-scale study of bragging in computational linguistics, building on previous research in linguistics and pragmatics. To facilitate this, we introduce a new publicly available data set of tweets annotated for bragging and their types. We empirically evaluate different transformer-based models injected with linguistic information in (a) binary bragging classification, i.e., if tweets contain bragging statements or not; and (b) multi-class bragging type prediction including not bragging. Our results show that our models can predict bragging with macro F1 up to 72.42 and 35.95 in the binary and multi-class classification tasks respectively. Finally, we present an extensive linguistic and error analysis of bragging prediction to guide future research on this topic.
2022.acl-long.273 jin-etal-2022-automatic + 10.18653/v1/2022.acl-long.273 Automatic Error Analysis for Document-level Information Extraction @@ -3954,6 +4227,7 @@ in the Case of Unambiguous Gender das-etal-2022-automatic icejinx33/auto-err-template-fill SciREX + 10.18653/v1/2022.acl-long.274 Learning Functional Distributional Semantics with Visual Data @@ -3964,6 +4238,7 @@ in the Case of Unambiguous Gender 2022.acl-long.275 liu-emerson-2022-learning Visual Question Answering + 10.18653/v1/2022.acl-long.275 e<fixed-case>P</fixed-case>i<fixed-case>C</fixed-case>: Employing Proverbs in Context as a Benchmark for Abstract Language Understanding @@ -3976,6 +4251,7 @@ in the Case of Unambiguous Gender ghosh-srivastava-2022-epic sgdgp/epic GLUE + 10.18653/v1/2022.acl-long.276 Chart-to-Text: A Large-Scale Benchmark for Chart Summarization @@ -3993,6 +4269,7 @@ in the Case of Unambiguous Gender Chart-to-text Chart2Text + 10.18653/v1/2022.acl-long.277 Characterizing Idioms: Conventionality and Contingency @@ -4004,6 +4281,7 @@ in the Case of Unambiguous Gender Idioms are unlike most phrases in two important ways. First, words in an idiom have non-canonical meanings. Second, the non-canonical meanings of words in an idiom are contingent on the presence of other words in the idiom. Linguistic theories differ on whether these properties depend on one another, as well as whether special theoretical machinery is needed to accommodate idioms. We define two measures that correspond to the properties above, and we show that idioms fall at the expected intersection of the two dimensions, but that the dimensions themselves are not correlated. Our results suggest that introducing special machinery to handle idioms may not be warranted. 2022.acl-long.278 socolof-etal-2022-characterizing + 10.18653/v1/2022.acl-long.278 Bag-of-Words vs. Graph vs. Sequence in Text Classification: Questioning the Necessity of Text-Graphs and the Surprising Strength of a Wide <fixed-case>MLP</fixed-case> @@ -4015,6 +4293,7 @@ in the Case of Unambiguous Gender 2022.acl-long.279.software.zip galke-scherp-2022-bag lgalke/text-clf-baselines + 10.18653/v1/2022.acl-long.279 Generative Pretraining for Paraphrase Evaluation @@ -4032,6 +4311,7 @@ in the Case of Unambiguous Gender PARANMT-50M PAWS SNLI + 10.18653/v1/2022.acl-long.280 Incorporating Stock Market Signals for <fixed-case>T</fixed-case>witter Stance Detection @@ -4047,6 +4327,7 @@ in the Case of Unambiguous Gender 2022.acl-long.281.software.zip conforti-etal-2022-incorporating cambridge-wtwt/acl2022-wtwt-stocks + 10.18653/v1/2022.acl-long.281 Multilingual Mix: Example Interpolation Improves Multilingual Neural Machine Translation @@ -4060,6 +4341,7 @@ in the Case of Unambiguous Gender Multilingual neural machine translation models are trained to maximize the likelihood of a mix of examples drawn from multiple language pairs. The dominant inductive bias applied to these models is a shared vocabulary and a shared set of parameters across languages; the inputs and labels corresponding to examples drawn from different language pairs might still reside in distinct sub-spaces. In this paper, we introduce multilingual crossover encoder-decoder (mXEncDec) to fuse language pairs at an instance level. Our approach interpolates instances from different language pairs into joint ‘crossover examples’ in order to encourage sharing input and output spaces across languages. 
To ensure better fusion of examples in multilingual settings, we propose several techniques to improve example interpolation across dissimilar languages under heavy data imbalance. Experiments on a large-scale WMT multilingual dataset demonstrate that our approach significantly improves quality on English-to-Many, Many-to-English and zero-shot translation tasks (from +0.5 BLEU up to +5.5 BLEU points). Results on code-switching sets demonstrate the capability of our approach to improve model generalization to out-of-distribution multilingual examples. We also conduct qualitative and quantitative representation comparisons to analyze the advantages of our approach at the representation level. 2022.acl-long.282 cheng-etal-2022-multilingual + 10.18653/v1/2022.acl-long.282 Word Segmentation as Unsupervised Constituency Parsing @@ -4069,6 +4351,7 @@ 2022.acl-long.283 alhama-2022-word OpenSubtitles + 10.18653/v1/2022.acl-long.283 <fixed-case>S</fixed-case>afety<fixed-case>K</fixed-case>it: First Aid for Measuring Safety in Open-domain Conversational Systems @@ -4085,6 +4368,7 @@ dinan-etal-2022-safetykit Blended Skill Talk HONEST + 10.18653/v1/2022.acl-long.284 Zero-Shot Cross-lingual Semantic Parsing @@ -4099,6 +4383,7 @@ ATIS MKQA ParaCrawl + 10.18653/v1/2022.acl-long.285 The Paradox of the Compositionality of Natural Language: A Neural Machine Translation Case Study @@ -4111,6 +4396,7 @@ 2022.acl-long.286 2022.acl-long.286.software.zip dankers-etal-2022-paradox i-machine-think/compositionality_paradox_mt + 10.18653/v1/2022.acl-long.286 Multilingual Document-Level Translation Enables Zero-Shot Transfer From Sentences to Documents @@ -4124,6 +4410,7 @@ Document-level neural machine translation (DocNMT) achieves coherent translations by incorporating cross-sentence context. However, for most language pairs there’s a shortage of parallel documents, although parallel sentences are readily available. In this paper, we study whether and how contextual modeling in DocNMT is transferable via multilingual modeling. We focus on the scenario of zero-shot transfer from teacher languages with document level data to student languages with no documents but sentence level data, and for the first time treat document-level translation as a transfer learning problem. Using simple concatenation-based DocNMT, we explore the effect of 3 factors on the transfer: the number of teacher languages with document level data, the balance between document and sentence level data at training, and the data condition of parallel documents (genuine vs. back-translated). Our experiments on Europarl-7 and IWSLT-10 show the feasibility of multilingual transfer for DocNMT, particularly on document-specific metrics. We observe that more teacher languages and adequate data balance both contribute to better transfer quality. Surprisingly, the transfer is less sensitive to the data condition, where multilingual DocNMT delivers decent performance with either back-translated or genuine document pairs. 2022.acl-long.287 zhang-etal-2022-multilingual + 10.18653/v1/2022.acl-long.287 Cross-Lingual Phrase Retrieval @@ -4139,6 +4426,7 @@ Cross-lingual retrieval aims to retrieve relevant text across languages. Current methods typically achieve cross-lingual retrieval by learning language-agnostic text representations at the word or sentence level.
However, how to learn phrase representations for cross-lingual phrase retrieval is still an open problem. In this paper, we propose XPR, a cross-lingual phrase retriever that extracts phrase representations from unlabeled example sentences. Moreover, we create a large-scale cross-lingual phrase retrieval dataset, which contains 65K bilingual phrase pairs and 4.2M example sentences in 8 English-centric language pairs. Experimental results show that XPR outperforms state-of-the-art baselines which utilize word-level or sentence-level representations. XPR also shows impressive zero-shot transferability that enables the model to perform retrieval in an unseen language pair during training. Our dataset, code, and trained models are publicly available at github.com/cwszz/XPR/. 2022.acl-long.288 zheng-etal-2022-cross-lingual + 10.18653/v1/2022.acl-long.288 Improving Compositional Generalization with Self-Training for Data-to-Text Generation @@ -4154,6 +4442,7 @@ mehta-etal-2022-improving google-research/google-research SGD + 10.18653/v1/2022.acl-long.289 <fixed-case>MMC</fixed-case>o<fixed-case>QA</fixed-case>: Conversational Question Answering over Text, Tables, and Images @@ -4168,6 +4457,7 @@ liyongqi67/mmcoqa ManyModalQA ORConvQA + 10.18653/v1/2022.acl-long.290 Effective Token Graph Modeling using a Novel Labeling Strategy for Structured Sentiment Analysis @@ -4183,6 +4473,7 @@ xgswlg/tgls MPQA Opinion Corpus NoReC_fine + 10.18653/v1/2022.acl-long.291 <fixed-case>P</fixed-case>rom<fixed-case>DA</fixed-case>: Prompt-based Data Augmentation for Low-Resource <fixed-case>NLU</fixed-case> Tasks @@ -4201,6 +4492,7 @@ garyyufei/promda CoNLL-2003 SST + 10.18653/v1/2022.acl-long.292 Disentangled Sequence to Sequence Learning for Compositional Generalization @@ -4212,6 +4504,7 @@ zheng-lapata-2022-disentangled mswellhao/dangle CFQ + 10.18653/v1/2022.acl-long.293 <fixed-case>RST</fixed-case> Discourse Parsing with Second-Stage <fixed-case>EDU</fixed-case>-Level Pre-training @@ -4224,6 +4517,7 @@ 2022.acl-long.294 2022.acl-long.294.software.zip yu-etal-2022-rst + 10.18653/v1/2022.acl-long.294 <fixed-case>S</fixed-case>im<fixed-case>KGC</fixed-case>: Simple Contrastive Knowledge Graph Completion with Pre-trained Language Models @@ -4237,6 +4531,7 @@ 2022.acl-long.295.software.zip wang-etal-2022-simkgc intfloat/simkgc + 10.18653/v1/2022.acl-long.295 Do Transformer Models Show Similar Attention Patterns to Task-Specific Human Gaze? @@ -4250,6 +4545,7 @@ eberle-etal-2022-transformer oeberle/task_gaze_transformers SST + 10.18653/v1/2022.acl-long.296 <fixed-case>L</fixed-case>ex<fixed-case>GLUE</fixed-case>: A Benchmark Dataset for Legal Language Understanding in <fixed-case>E</fixed-case>nglish @@ -4272,6 +4568,7 @@ ECtHR GLUE SuperGLUE + 10.18653/v1/2022.acl-long.297 <fixed-case>D</fixed-case>i<fixed-case>B</fixed-case>i<fixed-case>MT</fixed-case>: A Novel Benchmark for Measuring <fixed-case>W</fixed-case>ord <fixed-case>S</fixed-case>ense <fixed-case>D</fixed-case>isambiguation Biases in <fixed-case>M</fixed-case>achine <fixed-case>T</fixed-case>ranslation @@ -4286,6 +4583,7 @@ campolungo-etal-2022-dibimt Various fixes throughout the paper.
+ 10.18653/v1/2022.acl-long.298 Improving Word Translation via Two-Stage Contrastive Learning @@ -4302,6 +4600,7 @@ in the Case of Unambiguous Gender cambridgeltl/contrastivebli PanLex-BLI XLING + 10.18653/v1/2022.acl-long.299 Scheduled Multi-task Learning for Neural Chat Translation @@ -4316,6 +4615,7 @@ in the Case of Unambiguous Gender liang-etal-2022-scheduled xl2248/sml BMELD + 10.18653/v1/2022.acl-long.300 <fixed-case>F</fixed-case>air<fixed-case>L</fixed-case>ex: A Multilingual Benchmark for Evaluating Fairness in Legal Text Processing @@ -4332,6 +4632,7 @@ in the Case of Unambiguous Gender chalkidis-etal-2022-fairlex coastalcph/fairlex ECtHR + 10.18653/v1/2022.acl-long.301 Towards Abstractive Grounded Summarization of Podcast Transcripts @@ -4345,6 +4646,7 @@ in the Case of Unambiguous Gender 2022.acl-long.302 song-etal-2022-towards tencent-ailab/grndpodcastsum + 10.18653/v1/2022.acl-long.302 <fixed-case>F</fixed-case>i<fixed-case>NER</fixed-case>: Financial Numeric Entity Recognition for <fixed-case>XBRL</fixed-case> Tagging @@ -4361,6 +4663,7 @@ in the Case of Unambiguous Gender loukas-etal-2022-finer nlpaueb/finer FiNER-139 + 10.18653/v1/2022.acl-long.303 Keywords and Instances: A Hierarchical Contrastive Learning Framework Unifying Hybrid Granularities for Text Generation @@ -4381,6 +4684,7 @@ in the Case of Unambiguous Gender 2022.acl-long.304.software.zip li-etal-2022-keywords ROCStories + 10.18653/v1/2022.acl-long.304 <fixed-case>EPT</fixed-case>-<fixed-case>X</fixed-case>: An Expression-Pointer Transformer model that generates e<fixed-case>X</fixed-case>planations for numbers @@ -4393,6 +4697,7 @@ in the Case of Unambiguous Gender 2022.acl-long.305 2022.acl-long.305.software.tgz kim-etal-2022-ept + 10.18653/v1/2022.acl-long.305 Identifying the Human Values behind Arguments @@ -4407,6 +4712,7 @@ in the Case of Unambiguous Gender 2022.acl-long.306 kiesel-etal-2022-identifying webis-de/acl-22 + 10.18653/v1/2022.acl-long.306 <fixed-case>B</fixed-case>ench<fixed-case>IE</fixed-case>: A Framework for Multi-Faceted Fact-Based Open Information Extraction Evaluation @@ -4423,6 +4729,7 @@ in the Case of Unambiguous Gender gashteovski-etal-2022-benchie gkiril/benchie BenchIE + 10.18653/v1/2022.acl-long.307 Leveraging Unimodal Self-Supervised Learning for Multimodal Audio-Visual Speech Recognition @@ -4443,6 +4750,7 @@ in the Case of Unambiguous Gender LRW Libri-Light LibriSpeech + 10.18653/v1/2022.acl-long.308 <fixed-case>S</fixed-case>umma<fixed-case>R</fixed-case>eranker: A Multi-Task Mixture-of-Experts Re-ranking Framework for Abstractive Summarization @@ -4457,6 +4765,7 @@ in the Case of Unambiguous Gender ntunlp/summareranker CNN/Daily Mail Reddit TIFU + 10.18653/v1/2022.acl-long.309 Understanding Multimodal Procedural Knowledge by Sequencing Multimodal Instructional Manuals @@ -4473,6 +4782,7 @@ in the Case of Unambiguous Gender wu-etal-2022-understanding RecipeQA WikiHow + 10.18653/v1/2022.acl-long.310 Zoom Out and Observe: News Environment Perception for Fake News Detection @@ -4487,6 +4797,7 @@ in the Case of Unambiguous Gender 2022.acl-long.311 sheng-etal-2022-zoom ictmcg/news-environment-perception + 10.18653/v1/2022.acl-long.311 Divide and Rule: Effective Pre-Training for Context-Aware Multi-Encoder Translation Models @@ -4502,6 +4813,7 @@ in the Case of Unambiguous Gender IWSLT 2017 OpenSubtitles WMT 2014 + 10.18653/v1/2022.acl-long.312 Saliency as Evidence: Event Detection with Trigger Saliency Attribution @@ -4514,6 +4826,7 @@ in the Case of Unambiguous Gender 
2022.acl-long.313.software.zip liu-etal-2022-saliency MAVEN + 10.18653/v1/2022.acl-long.313 <fixed-case>SRL4E</fixed-case> – <fixed-case>S</fixed-case>emantic <fixed-case>R</fixed-case>ole <fixed-case>L</fixed-case>abeling for <fixed-case>E</fixed-case>motions: <fixed-case>A</fixed-case> Unified Evaluation Framework @@ -4525,6 +4838,7 @@ in the Case of Unambiguous Gender 2022.acl-long.314 campagnano-etal-2022-srl4e sapienzanlp/srl4e + 10.18653/v1/2022.acl-long.314 Context Matters: A Pragmatic Study of <fixed-case>PLM</fixed-case>s’ Negation Understanding @@ -4536,6 +4850,7 @@ in the Case of Unambiguous Gender gubelmann-handschuh-2022-context GLUE SuperGLUE + 10.18653/v1/2022.acl-long.315 Probing for Predicate Argument Structures in Pretrained Language Models @@ -4546,6 +4861,7 @@ in the Case of Unambiguous Gender 2022.acl-long.316 conia-navigli-2022-probing sapienzanlp/srl-pas-probing + 10.18653/v1/2022.acl-long.316 Multilingual Generative Language Models for Zero-Shot Cross-Lingual Event Argument Extraction @@ -4560,6 +4876,7 @@ in the Case of Unambiguous Gender 2022.acl-long.317.software.zip huang-etal-2022-multilingual-generative pluslabnlp/x-gear + 10.18653/v1/2022.acl-long.317 Identifying Moments of Change from Longitudinal User Text @@ -4573,6 +4890,7 @@ in the Case of Unambiguous Gender Identifying changes in individuals’ behaviour and mood, as observed via content shared on online platforms, is increasingly gaining importance. Most research to-date on this topic focuses on either: (a) identifying individuals at risk or with a certain mental health condition given a batch of posts or (b) providing equivalent labels at the post level. A disadvantage of such work is the lack of a strong temporal component and the inability to make longitudinal assessments following an individual’s trajectory and allowing timely interventions. Here we define a new task, that of identifying moments of change in individuals on the basis of their shared content online. The changes we consider are sudden shifts in mood (switches) or gradual mood progression (escalations). We have created detailed guidelines for capturing moments of change and a corpus of 500 manually annotated user timelines (18.7K posts). We have developed a variety of baseline models drawing inspiration from related tasks and show that the best performance is obtained through context aware sequential modelling. We also introduce new metrics for capturing rare events in temporal windows. 
2022.acl-long.318 tsakalidis-etal-2022-identifying + 10.18653/v1/2022.acl-long.318 Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System @@ -4589,6 +4907,7 @@ in the Case of Unambiguous Gender 2022.acl-long.319.software.zip su-etal-2022-multi awslabs/pptod + 10.18653/v1/2022.acl-long.319 Graph Enhanced Contrastive Learning for Radiology Findings Summarization @@ -4604,6 +4923,7 @@ in the Case of Unambiguous Gender 2022.acl-long.320.software.zip hu-etal-2022-graph jinpeng01/aig_cl + 10.18653/v1/2022.acl-long.320 Semi-Supervised Formality Style Transfer with Consistency Training @@ -4616,6 +4936,7 @@ in the Case of Unambiguous Gender liu-etal-2022-semi aolius/semi-fst GYAFC + 10.18653/v1/2022.acl-long.321 Cross-Lingual Ability of Multilingual Masked Language Models: A Study of Language Structure @@ -4627,6 +4948,7 @@ in the Case of Unambiguous Gender 2022.acl-long.322 chai-etal-2022-cross XNLI + 10.18653/v1/2022.acl-long.322 Rare and Zero-shot Word Sense Disambiguation using <fixed-case>Z</fixed-case>-Reweighting @@ -4641,6 +4963,7 @@ in the Case of Unambiguous Gender su-etal-2022-rare suytingwan/wsd-z-reweighting Word Sense Disambiguation: a Unified Evaluation Framework and Empirical Comparison + 10.18653/v1/2022.acl-long.323 <fixed-case>N</fixed-case>ibbling at the Hard Core of <fixed-case>W</fixed-case>ord <fixed-case>S</fixed-case>ense <fixed-case>D</fixed-case>isambiguation @@ -4654,6 +4977,7 @@ in the Case of Unambiguous Gender maru-etal-2022-nibbling sapienzanlp/wsd-hard-benchmark Word Sense Disambiguation: a Unified Evaluation Framework and Empirical Comparison + 10.18653/v1/2022.acl-long.324 Large Scale Substitution-based Word Sense Induction @@ -4667,6 +4991,7 @@ in the Case of Unambiguous Gender eyal-etal-2022-large CoarseWSD-20 WiC + 10.18653/v1/2022.acl-long.325 Can Synthetic Translations Improve Bitext Quality? 
@@ -4677,6 +5002,7 @@ in the Case of Unambiguous Gender 2022.acl-long.326 briakou-carpuat-2022-synthetic WikiMatrix + 10.18653/v1/2022.acl-long.326 Unsupervised Dependency Graph Network @@ -4692,6 +5018,7 @@ in the Case of Unambiguous Gender shen-etal-2022-unsupervised yikangshen/udgn Penn Treebank + 10.18653/v1/2022.acl-long.327 <fixed-case>W</fixed-case>iki<fixed-case>D</fixed-case>iverse: A Multimodal Entity Linking Dataset with Diversified Contextual Topics and Entity Types @@ -4710,6 +5037,7 @@ in the Case of Unambiguous Gender wang-etal-2022-wikidiverse wangxw5/wikidiverse ZESHEL + 10.18653/v1/2022.acl-long.328 Rewire-then-Probe: A Contrastive Recipe for Probing Biomedical Knowledge of Pre-trained Language Models @@ -4728,6 +5056,7 @@ in the Case of Unambiguous Gender BLUE BioLAMA LAMA + 10.18653/v1/2022.acl-long.329 Fine- and Coarse-Granularity Hybrid Self-Attention for Efficient <fixed-case>BERT</fixed-case> @@ -4745,6 +5074,7 @@ in the Case of Unambiguous Gender GLUE QNLI RACE + 10.18653/v1/2022.acl-long.330 Compression of Generative Pre-trained Language Models via Quantization @@ -4764,6 +5094,7 @@ in the Case of Unambiguous Gender PERSONA-CHAT WikiText-103 WikiText-2 + 10.18653/v1/2022.acl-long.331 Visual-Language Navigation Pretraining via Prompt-based Environmental Self-exploration @@ -4780,6 +5111,7 @@ in the Case of Unambiguous Gender Conceptual Captions Objects365 Places + 10.18653/v1/2022.acl-long.332 <fixed-case>D</fixed-case>ialog<fixed-case>VED</fixed-case>: A Pre-trained Latent Variable Encoder-Decoder Model for Dialog Response Generation @@ -4803,6 +5135,7 @@ in the Case of Unambiguous Gender DSTC7 Task 2 DailyDialog PERSONA-CHAT + 10.18653/v1/2022.acl-long.333 Contextual Fine-to-Coarse Distillation for Coarse-grained Response Selection in Open-Domain Conversations @@ -4823,6 +5156,7 @@ in the Case of Unambiguous Gender 2022.acl-long.334 chen-etal-2022-contextual lemuria-wchen/CFC + 10.18653/v1/2022.acl-long.334 Textomics: A Dataset for Genomics Data Summary Generation @@ -4835,6 +5169,7 @@ in the Case of Unambiguous Gender 2022.acl-long.335.software.zip wang-etal-2022-textomics amos814/textomics + 10.18653/v1/2022.acl-long.335 A Contrastive Framework for Learning Sentence Representations from Pairwise and Triple-wise Perspective in Angular Space @@ -4852,6 +5187,7 @@ in the Case of Unambiguous Gender MRPC SST SentEval + 10.18653/v1/2022.acl-long.336 Packed Levitated Marker for Entity and Relation Extraction @@ -4871,6 +5207,7 @@ in the Case of Unambiguous Gender Few-NERD OntoNotes 5.0 SciERC + 10.18653/v1/2022.acl-long.337 An Interpretable Neuro-Symbolic Reasoning Framework for Task-Oriented Dialogue Generation @@ -4884,6 +5221,7 @@ in the Case of Unambiguous Gender 2022.acl-long.338.software.zip yang-etal-2022-interpretable shiquanyang/ns-dial + 10.18653/v1/2022.acl-long.338 Impact of Evaluation Methodologies on Code Summarization @@ -4897,6 +5235,7 @@ in the Case of Unambiguous Gender 2022.acl-long.339 nie-etal-2022-impact engineeringsoftware/time-segmented-evaluation + 10.18653/v1/2022.acl-long.339 <fixed-case>KG</fixed-case>-<fixed-case>F</fixed-case>i<fixed-case>D</fixed-case>: Infusing Knowledge Graph in Fusion-in-Decoder for Open-Domain Question Answering @@ -4915,6 +5254,7 @@ in the Case of Unambiguous Gender yu-etal-2022-kg Natural Questions TriviaQA + 10.18653/v1/2022.acl-long.340 Which side are you on? 
Insider-Outsider classification in conspiracy-theoretic social media @@ -4928,6 +5268,7 @@ in the Case of Unambiguous Gender 2022.acl-long.341 2022.acl-long.341.software.zip holur-etal-2022-side + 10.18653/v1/2022.acl-long.341 Learning From Failure: Data Capture in an <fixed-case>A</fixed-case>ustralian Aboriginal Community @@ -4938,6 +5279,7 @@ in the Case of Unambiguous Gender Most low resource language technology development is premised on the need to collect data for training statistical models. When we follow the typical process of recording and transcribing text for small Indigenous languages, we hit up against the so-called “transcription bottleneck.” Therefore it is worth exploring new ways of engaging with speakers which generate data while avoiding the transcription bottleneck. We have deployed a prototype app for speakers to use for confirming system guesses in an approach to transcription based on word spotting. However, in the process of testing the app we encountered many new problems for engagement with speakers. This paper presents a close-up study of the process of deploying data capture technology on the ground in an Australian Aboriginal community. We reflect on our interactions with participants and draw lessons that apply to anyone seeking to develop methods for language data collection in an Indigenous community. 2022.acl-long.342 le-ferrand-etal-2022-learning + 10.18653/v1/2022.acl-long.342 Deep Inductive Logic Reasoning for Multi-Hop Reading Comprehension @@ -4949,6 +5291,7 @@ in the Case of Unambiguous Gender wang-pan-2022-deep MedHop WikiHop + 10.18653/v1/2022.acl-long.343 <fixed-case>CICERO</fixed-case>: A Dataset for Contextualized Commonsense Inference in Dialogues @@ -4966,6 +5309,7 @@ in the Case of Unambiguous Gender DREAM DailyDialog MuTual + 10.18653/v1/2022.acl-long.344 A Comparative Study of Faithfulness Metrics for Model Interpretability Methods @@ -4978,6 +5322,7 @@ in the Case of Unambiguous Gender chan-etal-2022-comparative IMDb Movie Reviews SST + 10.18653/v1/2022.acl-long.345 <fixed-case>SP</fixed-case>o<fixed-case>T</fixed-case>: Better Frozen Model Adaptation through Soft Prompt Transfer @@ -5013,6 +5358,7 @@ in the Case of Unambiguous Gender WSC WiC WinoGrande + 10.18653/v1/2022.acl-long.346 Pass off Fish Eyes for Pearls: Attacking Model Selection of Pre-trained Models @@ -5034,6 +5380,7 @@ in the Case of Unambiguous Gender OLID QNLI SST + 10.18653/v1/2022.acl-long.347 Educational Question Generation of Children Storybooks via Question Type Distribution Learning and Event-centric Summarization @@ -5050,6 +5397,7 @@ in the Case of Unambiguous Gender zhao-etal-2022-educational zhaozj89/Educational-Question-Generation FairytaleQA + 10.18653/v1/2022.acl-long.348 <fixed-case>H</fixed-case>eter<fixed-case>MPC</fixed-case>: A Heterogeneous Graph Neural Network for Response Generation in Multi-Party Conversations @@ -5065,6 +5413,7 @@ in the Case of Unambiguous Gender 2022.acl-long.349 gu-etal-2022-hetermpc lxchtan/hetermpc + 10.18653/v1/2022.acl-long.349 The patient is more dead than alive: exploring the current state of the multi-document summarisation of the biomedical literature @@ -5076,6 +5425,7 @@ in the Case of Unambiguous Gender Although multi-document summarisation (MDS) of the biomedical literature is a highly valuable task that has recently attracted substantial interest, evaluation of the quality of biomedical summaries lacks consistency and transparency. 
In this paper, we examine the summaries generated by two current models in order to understand the deficiencies of existing evaluation approaches in the context of the challenges that arise in the MDS task. Based on this analysis, we propose a new approach to human evaluation and identify several challenges that must be overcome to develop effective biomedical MDS systems. 2022.acl-long.350 otmakhova-etal-2022-patient + 10.18653/v1/2022.acl-long.350 A Multi-Document Coverage Reward for <fixed-case>RELAX</fixed-case>ed Multi-Document Summarization @@ -5089,6 +5439,7 @@ in the Case of Unambiguous Gender jacob-parnell-rozetta/longformer_coverage Multi-News WCEP + 10.18653/v1/2022.acl-long.351 <fixed-case>KNN</fixed-case>-Contrastive Learning for Out-of-Domain Intent Classification @@ -5099,6 +5450,7 @@ in the Case of Unambiguous Gender The Out-of-Domain (OOD) intent classification is a basic and challenging task for dialogue systems. Previous methods commonly restrict the region (in feature space) of In-domain (IND) intent features to be compact or simply-connected implicitly, which assumes no OOD intents reside, to learn discriminative semantic features. Then the distribution of the IND intent features is often assumed to obey a hypothetical distribution (Gaussian mostly) and samples outside this distribution are regarded as OOD samples. In this paper, we start from the nature of OOD intent classification and explore its optimization objective. We further propose a simple yet effective method, named KNN-contrastive learning. Our approach utilizes k-nearest neighbors (KNN) of IND intents to learn discriminative semantic features that are more conducive to OOD detection. Notably, the density-based novelty detection algorithm is so well-grounded in the essence of our method that it is reasonable to use it as the OOD detection algorithm without making any requirements for the feature distribution. Extensive experiments on four public datasets show that our approach can not only enhance the OOD detection performance substantially but also improve the IND intent classification while requiring no restrictions on feature distribution.
2022.acl-long.352 zhou-etal-2022-knn + 10.18653/v1/2022.acl-long.352 A Neural Network Architecture for Program Understanding Inspired by Human Behaviors @@ -5115,6 +5467,7 @@ in the Case of Unambiguous Gender recklessronan/pgnn-ek CodeSearchNet CodeXGLUE + 10.18653/v1/2022.acl-long.353 <fixed-case>F</fixed-case>a<fixed-case>VIQ</fixed-case>: <fixed-case>FA</fixed-case>ct Verification from Information-seeking Questions @@ -5134,6 +5487,7 @@ in the Case of Unambiguous Gender FM2 KILT Natural Questions + 10.18653/v1/2022.acl-long.354 Simulating Bandit Learning from User Feedback for Extractive Question Answering @@ -5152,6 +5506,7 @@ in the Case of Unambiguous Gender SQuAD SearchQA TriviaQA + 10.18653/v1/2022.acl-long.355 Beyond Goldfish Memory: Long-Term Open-Domain Conversation @@ -5163,6 +5518,7 @@ in the Case of Unambiguous Gender 2022.acl-long.356 xu-etal-2022-beyond PERSONA-CHAT + 10.18653/v1/2022.acl-long.356 <fixed-case>R</fixed-case>e<fixed-case>CLIP</fixed-case>: A Strong Zero-Shot Baseline for Referring Expression Comprehension @@ -5180,6 +5536,7 @@ in the Case of Unambiguous Gender CLEVR COCO RefCOCO + 10.18653/v1/2022.acl-long.357 Dynamic Prefix-Tuning for Generative Template-based Event Extraction @@ -5191,6 +5548,7 @@ in the Case of Unambiguous Gender We consider event extraction in a generative manner with template-based conditional generation. Although there is a rising trend of casting the task of event extraction as a sequence generation problem with prompts, these generation-based methods have two significant challenges, including using suboptimal prompts and static event type information. In this paper, we propose a generative template-based event extraction method with dynamic prefix (GTEE-DynPref) by integrating context information with type-specific prefixes to learn a context-specific prefix for each context. Experimental results show that our model achieves competitive results with the state-of-the-art classification-based model OneIE on ACE 2005 and achieves the best performances on ERE. Additionally, our model is proven to be portable to new types of events effectively. 2022.acl-long.358 liu-etal-2022-dynamic + 10.18653/v1/2022.acl-long.358 <fixed-case>E</fixed-case>-<fixed-case>LANG</fixed-case>: Energy-Based Joint Inferencing of Super and Swift Language Models @@ -5204,6 +5562,7 @@ in the Case of Unambiguous Gender GLUE QNLI SuperGLUE + 10.18653/v1/2022.acl-long.359 <fixed-case>PRIMERA</fixed-case>: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization @@ -5223,6 +5582,7 @@ in the Case of Unambiguous Gender WikiSum arXiv arXiv Summarization Dataset + 10.18653/v1/2022.acl-long.360 Dynamic Global Memory for Document-level Argument Extraction @@ -5235,6 +5595,7 @@ in the Case of Unambiguous Gender 2022.acl-long.361.software.zip du-etal-2022-dynamic xinyadu/memory_docie + 10.18653/v1/2022.acl-long.361 Measuring the Impact of (Psycho-)Linguistic and Readability Features and Their Spill Over Effects on the Prediction of Eye Movement Patterns @@ -5246,6 +5607,7 @@ in the Case of Unambiguous Gender There is a growing interest in the combined use of NLP and machine learning methods to predict gaze patterns during naturalistic reading. While promising results have been obtained through the use of transformer-based language models, little work has been undertaken to relate the performance of such models to general text characteristics.
In this paper, we report on experiments with two eye-tracking corpora of naturalistic reading and two language models (BERT and GPT-2). In all experiments, we test effects of a broad spectrum of features for predicting human reading behavior that fall into five categories (syntactic complexity, lexical richness, register-based multiword combinations, readability and psycholinguistic word properties). Our experiments show that both the features included and the architecture of the transformer-based language models play a role in predicting multiple eye-tracking measures during naturalistic reading. We also report the results of experiments aimed at determining the relative importance of features from different groups using SP-LIME. 2022.acl-long.362 wiechmann-kerz-2022-measuring + 10.18653/v1/2022.acl-long.362 Alternative Input Signals Ease Transfer in Multilingual Machine Translation @@ -5260,6 +5622,7 @@ in the Case of Unambiguous Gender Recent work in multilingual machine translation (MMT) has focused on the potential of positive transfer between languages, particularly cases where higher-resourced languages can benefit lower-resourced ones. While training an MMT model, the supervision signals learned from one language pair can be transferred to the other via the tokens shared by multiple source languages. However, the transfer is inhibited when the token overlap among source languages is small, which manifests naturally when languages use different writing systems. In this paper, we tackle inhibited transfer by augmenting the training data with alternative signals that unify different writing systems, such as phonetic, romanized, and transliterated input. We test these signals on Indic and Turkic languages, two language families where the writing systems differ but languages still share common features. Our results indicate that a straightforward multi-source self-ensemble – training a model on a mixture of various signals and ensembling the outputs of the same model fed with different signals during inference – outperforms strong ensemble baselines by 1.3 BLEU points on both language families. Further, we find that incorporating alternative inputs via self-ensemble can be particularly effective when the training set is small, leading to +5 BLEU when only 5% of the total training data is accessible. Finally, our analysis demonstrates that including alternative signals yields more consistency and translates named entities more accurately, which is crucial for increased factuality of automated systems.
2022.acl-long.363 sun-etal-2022-alternative + 10.18653/v1/2022.acl-long.363 Phone-ing it in: Towards Flexible Multi-Modal Language Model Training by Phonetic Representations of Data @@ -5271,6 +5634,7 @@ in the Case of Unambiguous Gender leong-whitenack-2022-phone sil-ai/phone-it-in MasakhaNER + 10.18653/v1/2022.acl-long.364 Noisy Channel Language Model Prompting for Few-Shot Text Classification @@ -5285,6 +5649,7 @@ in the Case of Unambiguous Gender shmsw25/Channel-LM-Prompting AG News SST + 10.18653/v1/2022.acl-long.365 Multilingual unsupervised sequence segmentation transfers to extremely low-resource languages @@ -5297,6 +5662,7 @@ in the Case of Unambiguous Gender 2022.acl-long.366 downey-etal-2022-multilingual cmdowney88/xlslm + 10.18653/v1/2022.acl-long.366 <fixed-case>K</fixed-case>inya<fixed-case>BERT</fixed-case>: a Morphology-aware <fixed-case>K</fixed-case>inyarwanda Language Model @@ -5310,6 +5676,7 @@ in the Case of Unambiguous Gender anzeyimana/kinyabert-acl2022 GLUE QNLI + 10.18653/v1/2022.acl-long.367 On the Calibration of Pre-trained Language Models using Mixup Guided by Area Under the Margin and Saliency @@ -5321,6 +5688,7 @@ in the Case of Unambiguous Gender park-caragea-2022-calibration SNLI SWAG + 10.18653/v1/2022.acl-long.368 <fixed-case>IMPLI</fixed-case>: Investigating <fixed-case>NLI</fixed-case> Models’ Performance on Figurative Language @@ -5332,6 +5700,7 @@ in the Case of Unambiguous Gender 2022.acl-long.369 stowe-etal-2022-impli ukplab/acl2022-impli + 10.18653/v1/2022.acl-long.369 <fixed-case>QAC</fixed-case>onv: Question Answering on Informative Conversations @@ -5353,6 +5722,7 @@ in the Case of Unambiguous Gender Molweni QuAC SQuAD + 10.18653/v1/2022.acl-long.370 Prix-<fixed-case>LM</fixed-case>: Pretraining for Multilingual Knowledge Base Construction @@ -5369,6 +5739,7 @@ in the Case of Unambiguous Gender DBpedia LAMA XL-BEL + 10.18653/v1/2022.acl-long.371 Semantic Composition with <fixed-case>PSHRG</fixed-case> for Derivation Tree Reconstruction from Graph-Based Meaning Representations @@ -5379,6 +5750,7 @@ in the Case of Unambiguous Gender We introduce a data-driven approach to generating derivation trees from meaning representation graphs with probabilistic synchronous hyperedge replacement grammar (PSHRG). SHRG has been used to produce meaning representation graphs from texts and syntax trees, but little is known about its viability on the reverse. In particular, we experiment on Dependency Minimal Recursion Semantics (DMRS) and adapt PSHRG as a formalism that approximates the semantic composition of DMRS graphs and simultaneously recovers the derivations that license the DMRS graphs. Consistent results are obtained as evaluated on a collection of annotated corpora. This work reveals the ability of PSHRG in formalizing a syntax–semantics interface, modelling compositional graph-to-tree translations, and channelling explainability to surface realization. 
2022.acl-long.372 lo-etal-2022-semantic + 10.18653/v1/2022.acl-long.372 <fixed-case>HOLM</fixed-case>: Hallucinating Objects with Language Models for Referring Expression Recognition in Partially-Observed Scenes @@ -5390,6 +5762,7 @@ in the Case of Unambiguous Gender 2022.acl-long.373 cirik-etal-2022-holm Visual Genome + 10.18653/v1/2022.acl-long.373 Multi Task Learning For Zero Shot Performance Prediction of Multilingual Models @@ -5409,6 +5782,7 @@ in the Case of Unambiguous Gender XCOPA XNLI XQuAD + 10.18653/v1/2022.acl-long.374 <tex-math>\infty</tex-math>-former: Infinite Memory Transformer @@ -5423,6 +5797,7 @@ in the Case of Unambiguous Gender PG-19 WikiText-103 WikiText-2 + 10.18653/v1/2022.acl-long.375 Systematic Inequalities in Language Technology Performance across the World’s Languages @@ -5435,6 +5810,7 @@ in the Case of Unambiguous Gender 2022.acl-long.376.software.zip blasi-etal-2022-systematic neubig/globalutility + 10.18653/v1/2022.acl-long.376 <fixed-case>CaMEL</fixed-case>: <fixed-case>C</fixed-case>ase <fixed-case>M</fixed-case>arker <fixed-case>E</fixed-case>xtraction without <fixed-case>L</fixed-case>abels @@ -5448,6 +5824,7 @@ in the Case of Unambiguous Gender 2022.acl-long.377.software.zip weissweiler-etal-2022-camel leonieweissweiler/camel + 10.18653/v1/2022.acl-long.377 Improving Generalizability in Implicitly Abusive Language Detection with Concept Activation Vectors @@ -5460,6 +5837,7 @@ in the Case of Unambiguous Gender 2022.acl-long.378.software.zip nejadgholi-etal-2022-improving isarnejad/tcav-for-text-classifiers + 10.18653/v1/2022.acl-long.378 Reports of personal experiences and stories in argumentation: datasets and analysis @@ -5469,6 +5847,7 @@ in the Case of Unambiguous Gender Reports of personal experiences or stories can play a crucial role in argumentation, as they represent an immediate and (often) relatable way to back up one’s position with respect to a given topic. They are easy to understand and increase empathy: this makes them powerful in argumentation. The impact of personal reports and stories in argumentation has been studied in the Social Sciences, but it is still largely underexplored in NLP. Our work is the first step towards filling this gap: our goal is to develop robust classifiers to identify documents containing personal experiences and reports. The main challenge is the scarcity of annotated data: our solution is to leverage existing annotations to be able to scale-up the analysis. Our contribution is two-fold. First, we conduct a set of in-domain and cross-domain experiments involving three datasets (two from Argument Mining, one from the Social Sciences), modeling architectures, training setups and fine-tuning options tailored to the involved domains. We show that despite the differences among datasets and annotations, robust cross-domain classification is possible. Second, we employ linear regression for performance mining, identifying performance trends both for overall classification performance and individual classifier predictions. 
2022.acl-long.379 falk-lapesa-2022-reports + 10.18653/v1/2022.acl-long.379 Non-neural Models Matter: a Re-evaluation of Neural Referring Expression Generation Systems @@ -5481,6 +5860,7 @@ in the Case of Unambiguous Gender 2022.acl-long.380.software.zip same-etal-2022-non WebNLG + 10.18653/v1/2022.acl-long.380 Bridging the Generalization Gap in Text-to-<fixed-case>SQL</fixed-case> Parsing with Schema Expansion @@ -5492,6 +5872,7 @@ in the Case of Unambiguous Gender Text-to-SQL parsers map natural language questions to programs that are executable over tables to generate answers, and are typically evaluated on large-scale datasets like Spider (Yu et al., 2018). We argue that existing benchmarks fail to capture a certain out-of-domain generalization problem that is of significant practical importance: matching domain specific phrases to composite operation over columns. To study this problem, we first propose a synthetic dataset along with a re-purposed train/test split of the Squall dataset (Shi et al., 2020) as new benchmarks to quantify domain generalization over column operations, and find existing state-of-the-art parsers struggle in these benchmarks. We propose to address this problem by incorporating prior domain knowledge by preprocessing table schemas, and design a method that consists of two components: schema expansion and schema pruning. This method can be easily applied to multiple existing base parsers, and we show that it significantly outperforms baseline parsers on this domain generalization problem, boosting the underlying parsers’ overall performance by up to 13.8% relative accuracy gain (5.1% absolute) on the new Squall data split. 2022.acl-long.381 zhao-etal-2022-bridging + 10.18653/v1/2022.acl-long.381 Predicate-Argument Based Bi-Encoder for Paraphrase Identification @@ -5505,6 +5886,7 @@ in the Case of Unambiguous Gender peng-etal-2022-predicate GLUE PIT + 10.18653/v1/2022.acl-long.382 <fixed-case>MINER</fixed-case>: Improving Out-of-Vocabulary Named Entity Recognition from an Information Theoretic Perspective @@ -5523,6 +5905,7 @@ in the Case of Unambiguous Gender wang-etal-2022-miner beyonderxx/miner WNUT 2017 + 10.18653/v1/2022.acl-long.383 Leveraging <fixed-case>W</fixed-case>ikipedia article evolution for promotional tone detection @@ -5533,6 +5916,7 @@ in the Case of Unambiguous Gender 2022.acl-long.384 de-kock-vlachos-2022-leveraging christinedekock11/wiki-evolve + 10.18653/v1/2022.acl-long.384 From text to talk: <fixed-case>H</fixed-case>arnessing conversational corpora for humane and diversity-aware language technology @@ -5542,6 +5926,7 @@ in the Case of Unambiguous Gender Informal social interaction is the primordial home of human language. Linguistically diverse conversational corpora are an important and largely untapped resource for computational linguistics and language technology. Through the efforts of a worldwide language documentation movement, such corpora are increasingly becoming available. We show how interactional data from 63 languages (26 families) harbours insights about turn-taking, timing, sequential structure and social action, with implications for language technology, natural language understanding, and the design of conversational interfaces. Harnessing linguistically diverse conversational corpora will provide the empirical foundations for flexible, localizable, humane language technologies of the future. 
2022.acl-long.385 dingemanse-liesenfeld-2022-text + 10.18653/v1/2022.acl-long.385 Flooding-<fixed-case>X</fixed-case>: Improving <fixed-case>BERT</fixed-case>’s Resistance to Adversarial Attacks via Loss-Restricted Fine-Tuning @@ -5562,6 +5947,7 @@ in the Case of Unambiguous Gender AG News IMDb Movie Reviews SST + 10.18653/v1/2022.acl-long.386 <fixed-case>R</fixed-case>o<fixed-case>M</fixed-case>e: A Robust Metric for Evaluating Natural Language Generation @@ -5577,6 +5963,7 @@ in the Case of Unambiguous Gender rashad101/rome CoLA KELM + 10.18653/v1/2022.acl-long.387 Finding Structural Knowledge in Multimodal-<fixed-case>BERT</fixed-case> @@ -5590,6 +5977,7 @@ in the Case of Unambiguous Gender vsjmilewski/multimodal-probes Flickr30k Visual Genome + 10.18653/v1/2022.acl-long.388 Fully Hyperbolic Neural Networks @@ -5607,6 +5995,7 @@ in the Case of Unambiguous Gender chen-etal-2022-fully chenweize1998/fully-hyperbolic-nn FB15k-237 + 10.18653/v1/2022.acl-long.389 Neural Machine Translation with Phrase-Level Universal Visual Representations @@ -5617,6 +6006,7 @@ in the Case of Unambiguous Gender 2022.acl-long.390 fang-feng-2022-neural ictnlp/pluvr + 10.18653/v1/2022.acl-long.390 <fixed-case>M</fixed-case>3<fixed-case>ED</fixed-case>: Multi-modal Multi-scene Multi-label Emotional Dialogue Database @@ -5639,6 +6029,7 @@ in the Case of Unambiguous Gender EmotionLines IEMOCAP MELD + 10.18653/v1/2022.acl-long.391 Few-shot Named Entity Recognition with Self-describing Networks @@ -5654,6 +6045,7 @@ in the Case of Unambiguous Gender chen-etal-2022-shot chen700564/sdnet WNUT 2017 + 10.18653/v1/2022.acl-long.392 <fixed-case>S</fixed-case>peech<fixed-case>T</fixed-case>5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing @@ -5681,6 +6073,7 @@ in the Case of Unambiguous Gender MuST-C VoxCeleb1 WHAM! + 10.18653/v1/2022.acl-long.393 Human Evaluation and Correlation with Automatic Metrics in Consultation Note Generation @@ -5697,6 +6090,7 @@ in the Case of Unambiguous Gender 2022.acl-long.394 moramarco-etal-2022-human CNN/Daily Mail + 10.18653/v1/2022.acl-long.394 Unified Structure Generation for Universal Information Extraction @@ -5714,6 +6108,7 @@ in the Case of Unambiguous Gender lu-etal-2022-unified CoNLL-2003 SciERC + 10.18653/v1/2022.acl-long.395 Subgraph Retrieval Enhanced Model for Multi-hop Knowledge Base Question Answering @@ -5729,6 +6124,7 @@ in the Case of Unambiguous Gender 2022.acl-long.396 zhang-etal-2022-subgraph ruckbreasoning/subgraphretrievalkbqa + 10.18653/v1/2022.acl-long.396 Pre-training to Match for Unified Low-shot Relation Extraction @@ -5743,6 +6139,7 @@ in the Case of Unambiguous Gender liu-etal-2022-pre fc-liu/mcmn FewRel + 10.18653/v1/2022.acl-long.397 Can Prompt Probe Pretrained Language Models? Understanding the Invisible Risks from a Causal View @@ -5760,6 +6157,7 @@ in the Case of Unambiguous Gender BioLAMA LAMA WebText + 10.18653/v1/2022.acl-long.398 Evaluating Extreme Hierarchical Multi-label Classification @@ -5769,6 +6167,7 @@ in the Case of Unambiguous Gender Several natural language processing (NLP) tasks are defined as a classification problem in its most complex form: Multi-label Hierarchical Extreme classification, in which items may be associated with multiple classes from a set of thousands of possible classes organized in a hierarchy and with a highly unbalanced distribution both in terms of class frequency and the number of labels per item. 
We analyze the state of the art of evaluation metrics based on a set of formal properties and we define an information-theoretic metric inspired by the Information Contrast Model (ICM). Experiments on synthetic data and a case study on real data show the suitability of the ICM for such scenarios. 2022.acl-long.399 amigo-delgado-2022-evaluating + 10.18653/v1/2022.acl-long.399 What does the sea say to the shore? A <fixed-case>BERT</fixed-case> based <fixed-case>DST</fixed-case> style approach for speaker to dialogue attribution in novels @@ -5779,6 +6178,7 @@ in the Case of Unambiguous Gender We present a complete pipeline to extract characters in a novel and link them to their direct-speech utterances. Our model is divided into three independent components: extracting direct-speech, compiling a list of characters, and attributing those characters to their utterances. Although we find that existing systems can perform the first two tasks accurately, attributing characters to direct speech is a challenging problem due to the narrator’s lack of explicit character mentions, and the frequent use of nominal and pronominal coreference when such explicit mentions are made. We adapt the progress made on Dialogue State Tracking to tackle a new problem: attributing speakers to dialogues. This is the first application of deep learning to speaker attribution, and it shows that it is possible to overcome the need for the hand-crafted features and rules used in the past. Our full pipeline improves the performance of state-of-the-art models by a relative 50% in F1-score. 2022.acl-long.400 cuesta-lazaro-etal-2022-sea + 10.18653/v1/2022.acl-long.400 Measuring Fairness of Text Classifiers via Prediction Sensitivity @@ -5792,6 +6192,7 @@ in the Case of Unambiguous Gender With the rapid growth in language processing applications, fairness has emerged as an important consideration in data-driven solutions. Although various fairness definitions have been explored in the recent literature, there is a lack of consensus on which metrics most accurately reflect the fairness of a system. In this work, we propose a new formulation – accumulated prediction sensitivity, which measures fairness in machine learning models based on the model’s prediction sensitivity to perturbations in input features. The metric attempts to quantify the extent to which a single prediction depends on a protected attribute, where the protected attribute encodes the membership status of an individual in a protected group. We show that the metric can be theoretically linked with a specific notion of group fairness (statistical parity) and individual fairness. It also correlates well with humans’ perception of fairness. We conduct experiments on two text classification datasets – Jigsaw Toxicity, and Bias in Bios, and evaluate the correlations between metrics and manual annotations on whether the model produced a fair outcome. We observe that the proposed fairness metric based on prediction sensitivity is statistically significantly more correlated with human annotation than the existing counterfactual fairness metric.
2022.acl-long.401 krishna-etal-2022-measuring + 10.18653/v1/2022.acl-long.401 <fixed-case>R</fixed-case>otate<fixed-case>QVS</fixed-case>: Representing Temporal Information as Rotations in Quaternion Vector Space for Temporal Knowledge Graph Completion @@ -5806,6 +6207,7 @@ in the Case of Unambiguous Gender chen-etal-2022-rotateqvs ICEWS YAGO + 10.18653/v1/2022.acl-long.402 Feeding What You Need by Understanding What You Learned @@ -5823,6 +6225,7 @@ in the Case of Unambiguous Gender HotpotQA RACE SQuAD + 10.18653/v1/2022.acl-long.403 Probing Simile Knowledge from Pre-trained Language Models @@ -5842,6 +6245,7 @@ in the Case of Unambiguous Gender chen-etal-2022-probing nairoj/Probing-Simile-from-PLM BookCorpus + 10.18653/v1/2022.acl-long.404 An Effective and Efficient Entity Alignment Decoding Algorithm via Third-Order Tensor Isomorphism @@ -5858,6 +6262,7 @@ in the Case of Unambiguous Gender 2022.acl-long.405 2022.acl-long.405.software.zip mao-etal-2022-effective + 10.18653/v1/2022.acl-long.405 Entailment Graph Learning with Textual Entailment and Soft Transitivity @@ -5870,6 +6275,7 @@ in the Case of Unambiguous Gender chen-etal-2022-entailment zacharychenpk/egt2 FIGER + 10.18653/v1/2022.acl-long.406 Logic Traps in Evaluating Attribution Scores @@ -5886,6 +6292,7 @@ in the Case of Unambiguous Gender GLUE RACE SST + 10.18653/v1/2022.acl-long.407 Continual Pre-training of Language Models for Math Problem Understanding with Syntax-Aware Memory Network @@ -5900,6 +6307,7 @@ in the Case of Unambiguous Gender 2022.acl-long.408 gong-etal-2022-continual MATH + 10.18653/v1/2022.acl-long.408 Multitasking Framework for Unsupervised Simple Definition Generation @@ -5914,6 +6322,7 @@ in the Case of Unambiguous Gender 2022.acl-long.409.software.zip kong-etal-2022-multitasking blcuicall/simpdefiner + 10.18653/v1/2022.acl-long.409 Learning to Reason Deductively: Math Word Problem Solving as Complex Relation Extraction @@ -5929,6 +6338,7 @@ in the Case of Unambiguous Gender Math23K MathQA SVAMP + 10.18653/v1/2022.acl-long.410 When did you become so smart, oh wise one?! 
Sarcasm Explanation in Multi-modal Multi-party Dialogues @@ -5943,6 +6353,7 @@ in the Case of Unambiguous Gender kumar-etal-2022-become lcs2-iiitd/maf WITS + 10.18653/v1/2022.acl-long.411 Toward Interpretable Semantic Textual Similarity via Optimal Transport-based Contrastive Sentence Learning @@ -5957,6 +6368,7 @@ in the Case of Unambiguous Gender lee-etal-2022-toward sh0416/clrcmd SNLI + 10.18653/v1/2022.acl-long.412 Pre-training and Fine-tuning Neural Topic Model: A Simple yet Effective Approach to Incorporating External Knowledge @@ -5972,6 +6384,7 @@ in the Case of Unambiguous Gender zhang-etal-2022-pre OpenWebText WebText + 10.18653/v1/2022.acl-long.413 Multi-View Document Representation Learning for Open-Domain Dense Retrieval @@ -5987,6 +6400,7 @@ in the Case of Unambiguous Gender Natural Questions SQuAD TriviaQA + 10.18653/v1/2022.acl-long.414 Graph Pre-training for <fixed-case>AMR</fixed-case> Parsing and Generation @@ -6004,6 +6418,7 @@ in the Case of Unambiguous Gender LDC2020T02 New3 The Little Prince + 10.18653/v1/2022.acl-long.415 Turning Tables: Generating Examples from Semi-structured Tables for Endowing Language Models with Reasoning Skills @@ -6018,6 +6433,7 @@ in the Case of Unambiguous Gender oriyor/turning_tables DROP IIRC + 10.18653/v1/2022.acl-long.416 <fixed-case>RNG</fixed-case>-<fixed-case>KBQA</fixed-case>: Generation Augmented Iterative Ranking for Knowledge Base Question Answering @@ -6032,6 +6448,7 @@ in the Case of Unambiguous Gender 2022.acl-long.417.software.zip ye-etal-2022-rng salesforce/rng-kbqa + 10.18653/v1/2022.acl-long.417 Rethinking Self-Supervision Objectives for Generalizable Coherence Modeling @@ -6043,6 +6460,7 @@ in the Case of Unambiguous Gender 2022.acl-long.418 2022.acl-long.418.software.zip jwalapuram-etal-2022-rethinking + 10.18653/v1/2022.acl-long.418 Just Rank: Rethinking Evaluation with Word and Sentence Similarities @@ -6059,6 +6477,7 @@ in the Case of Unambiguous Gender SST SciCite SentEval + 10.18653/v1/2022.acl-long.419 <fixed-case>M</fixed-case>arkup<fixed-case>LM</fixed-case>: Pre-training of Text and Markup Language for Visually Rich Document Understanding @@ -6070,6 +6489,7 @@ in the Case of Unambiguous Gender Multimodal pre-training with text, layout, and image has made significant progress for Visually Rich Document Understanding (VRDU), especially the fixed-layout documents such as scanned document images. However, there are still a large number of digital documents where the layout information is not fixed and needs to be interactively and dynamically rendered for visualization, making existing layout-based pre-training approaches not easy to apply. In this paper, we propose MarkupLM for document understanding tasks with markup languages as the backbone, such as HTML/XML-based documents, where text and markup information is jointly pre-trained. Experiment results show that the pre-trained MarkupLM significantly outperforms the existing strong baseline models on several document understanding tasks. The pre-trained model and code will be publicly available at https://aka.ms/markuplm.
2022.acl-long.420 li-etal-2022-markuplm + 10.18653/v1/2022.acl-long.420 <fixed-case>CLIP</fixed-case> Models are Few-Shot Learners: Empirical Studies on <fixed-case>VQA</fixed-case> and Visual Entailment @@ -6085,6 +6505,7 @@ in the Case of Unambiguous Gender song-etal-2022-clip SNLI-VE Visual Question Answering + 10.18653/v1/2022.acl-long.421 <fixed-case>KQA</fixed-case> Pro: A Dataset with Explicit Compositional Programs for Complex Question Answering over Knowledge Base @@ -6109,6 +6530,7 @@ in the Case of Unambiguous Gender ComplexWebQuestions MetaQA WebQuestions + 10.18653/v1/2022.acl-long.422 Debiased Contrastive Learning of Unsupervised Sentence Representations @@ -6121,6 +6543,7 @@ in the Case of Unambiguous Gender 2022.acl-long.423 zhou-etal-2022-debiased rucaibox/dclr + 10.18653/v1/2022.acl-long.423 <fixed-case>MSP</fixed-case>: Multi-Stage Prompting for Making Pre-trained Language Models Better Translators @@ -6133,6 +6556,7 @@ in the Case of Unambiguous Gender 2022.acl-long.424 tan-etal-2022-msp thunlp-mt/plm4mt + 10.18653/v1/2022.acl-long.424 <fixed-case>S</fixed-case>ales<fixed-case>B</fixed-case>ot: Transitioning from Chit-Chat to Task-Oriented Dialogues @@ -6148,6 +6572,7 @@ in the Case of Unambiguous Gender CommonsenseQA SGD SWAG + 10.18653/v1/2022.acl-long.425 <fixed-case>UCT</fixed-case>opic: Unsupervised Contrastive Learning for Phrase Representations and Topic Mining @@ -6165,6 +6590,7 @@ in the Case of Unambiguous Gender KP20k KPTimes WNUT 2017 + 10.18653/v1/2022.acl-long.426 <fixed-case>XLM</fixed-case>-<fixed-case>E</fixed-case>: Cross-lingual Language Model Pre-training via <fixed-case>ELECTRA</fixed-case> @@ -6191,6 +6617,7 @@ in the Case of Unambiguous Gender XNLI XQuAD XTREME + 10.18653/v1/2022.acl-long.427 Nested Named Entity Recognition as Latent Lexicalized Constituency Parsing @@ -6203,6 +6630,7 @@ in the Case of Unambiguous Gender lou-etal-2022-nested louchao98/nner_as_parsing NNE + 10.18653/v1/2022.acl-long.428 Can Explanations Be Useful for Calibrating Black Box Models? @@ -6218,6 +6646,7 @@ in the Case of Unambiguous Gender MRPC QNLI SQuAD + 10.18653/v1/2022.acl-long.429 <fixed-case>OIE</fixed-case>@<fixed-case>OIA</fixed-case>: an Adaptable and Efficient Open Information Extraction Framework @@ -6229,6 +6658,7 @@ in the Case of Unambiguous Gender Different Open Information Extraction (OIE) tasks require different types of information, so the OIE field requires strong adaptability of OIE algorithms to meet different task requirements. This paper discusses the adaptability problem in existing OIE systems and designs a new adaptable and efficient OIE system – OIE@OIA – as a solution. OIE@OIA follows the methodology of Open Information eXpression (OIX): parsing a sentence to an Open Information Annotation (OIA) Graph and then adapting the OIA graph to different OIE tasks with simple rules. As the core of our OIE@OIA system, we implement an end-to-end OIA generator by annotating a dataset (we make it openly available) and designing an efficient learning algorithm for the complex OIA graph. We easily adapt the OIE@OIA system to accomplish three popular OIE tasks. The experimental results show that our OIE@OIA achieves new SOTA performances on these tasks, showing the great adaptability of our OIE@OIA system. Furthermore, compared to other end-to-end OIE baselines that need millions of samples for training, our OIE@OIA needs much fewer training samples (12K), showing a significant advantage in terms of efficiency.
2022.acl-long.430 wang-etal-2022-oie + 10.18653/v1/2022.acl-long.430 <fixed-case>R</fixed-case>e<fixed-case>ACC</fixed-case>: A Retrieval-Augmented Code Completion Framework @@ -6245,6 +6675,7 @@ in the Case of Unambiguous Gender celbree/reacc CodeSearchNet CodeXGLUE + 10.18653/v1/2022.acl-long.431 Does Recommend-Revise Produce Reliable Annotations? An Analysis on Missing Instances in <fixed-case>D</fixed-case>oc<fixed-case>RED</fixed-case> @@ -6260,6 +6691,7 @@ in the Case of Unambiguous Gender huang-etal-2022-recommend andrewzhe/revisit-docred DocRED + 10.18653/v1/2022.acl-long.432 <fixed-case>U</fixed-case>ni<fixed-case>PELT</fixed-case>: A Unified Framework for Parameter-Efficient Language Model Tuning @@ -6278,6 +6710,7 @@ in the Case of Unambiguous Gender morningmoni/unipelt GLUE QNLI + 10.18653/v1/2022.acl-long.433 An Empirical Study of Memorization in <fixed-case>NLP</fixed-case> @@ -6290,6 +6723,7 @@ in the Case of Unambiguous Gender xszheng2020/memorization CIFAR-10 SST + 10.18653/v1/2022.acl-long.434 <fixed-case>A</fixed-case>mericas<fixed-case>NLI</fixed-case>: Evaluating Zero-shot Natural Language Understanding of Pretrained Multilingual Models in Truly Low-resource Languages @@ -6317,6 +6751,7 @@ in the Case of Unambiguous Gender AmericasNLP/americasnlp2021 SNLI SuperGLUE + 10.18653/v1/2022.acl-long.435 Towards Learning (Dis)-Similarity of Source Code from Program Contrasts @@ -6331,6 +6766,7 @@ in the Case of Unambiguous Gender 2022.acl-long.436 ding-etal-2022-towards CodeXGLUE + 10.18653/v1/2022.acl-long.436 Guided Attention Multimodal Multitask Financial Forecasting with Inter-Company Relationships and Global and Local News @@ -6341,6 +6777,7 @@ in the Case of Unambiguous Gender 2022.acl-long.437 2022.acl-long.437.software.zip ang-lim-2022-guided + 10.18653/v1/2022.acl-long.437 On Vision Features in Multimodal Machine Translation @@ -6356,6 +6793,7 @@ in the Case of Unambiguous Gender 2022.acl-long.438 li-etal-2022-vision libeineu/fairseq_mmt + 10.18653/v1/2022.acl-long.438 <fixed-case>CONT</fixed-case>ai<fixed-case>NER</fixed-case>: Few-Shot Named Entity Recognition via Contrastive Learning @@ -6370,6 +6808,7 @@ in the Case of Unambiguous Gender psunlpgroup/container Few-NERD WNUT 2017 + 10.18653/v1/2022.acl-long.439 <fixed-case>C</fixed-case>ree Corpus: A Collection of nêhiyawêwin Resources @@ -6382,6 +6821,7 @@ in the Case of Unambiguous Gender Plains Cree (nêhiyawêwin) is an Indigenous language that is spoken in Canada and the USA. It is the most widely spoken dialect of Cree and a morphologically complex language that is polysynthetic, highly inflective, and agglutinative. It is an extremely low resource language, with no existing corpus that is both available and prepared for supporting the development of language technologies. To support nêhiyawêwin revitalization and preservation, we developed a corpus covering diverse genres, time periods, and texts for a variety of intended audiences. The data has been verified and cleaned; it is ready for use in developing language technologies for nêhiyawêwin. The corpus includes the corresponding English phrases or audio files where available. We demonstrate the utility of the corpus through its community use and its use to build language technologies that can provide the types of support that community members have expressed are desirable. The corpus is available for public use. 
2022.acl-long.440 teodorescu-etal-2022-cree + 10.18653/v1/2022.acl-long.440 Learning to Rank Visual Stories From Human Ranking Data @@ -6399,6 +6839,7 @@ in the Case of Unambiguous Gender academiasinicanlplab/vhed VIST VIST-Edit + 10.18653/v1/2022.acl-long.441 Universal Conditional Masked Language Pre-training for Neural Machine Translation @@ -6412,6 +6853,7 @@ in the Case of Unambiguous Gender 2022.acl-long.442 li-etal-2022-universal huawei-noah/Pretrained-Language-Model + 10.18653/v1/2022.acl-long.442 <fixed-case>CARETS</fixed-case>: A Consistency And Robustness Evaluative Test Suite for <fixed-case>VQA</fixed-case> @@ -6427,6 +6869,7 @@ in the Case of Unambiguous Gender GQA Visual Genome Visual Question Answering + 10.18653/v1/2022.acl-long.443 Phrase-aware Unsupervised Constituency Parsing @@ -6439,6 +6882,7 @@ in the Case of Unambiguous Gender Recent studies have achieved inspiring success in unsupervised grammar induction using masked language modeling (MLM) as the proxy task. Despite their high accuracy in identifying low-level structures, prior arts tend to struggle in capturing high-level structures like clauses, since the MLM task usually only requires information from local context. In this work, we revisit LM-based constituency parsing from a phrase-centered perspective. Inspired by the natural reading process of human, we propose to regularize the parser with phrases extracted by an unsupervised phrase tagger to help the LM model quickly manage low-level structures. For a better understanding of high-level structures, we propose a phrase-guided masking strategy for LM to emphasize more on reconstructing non-phrase words. We show that the initial phrase regularization serves as an effective bootstrap, and phrase-guided masking improves the identification of high-level structures. Experiments on the public benchmark with two different backbone models demonstrate the effectiveness and generality of our method. 2022.acl-long.444 gu-etal-2022-phrase + 10.18653/v1/2022.acl-long.444 Achieving Reliable Human Assessment of Open-Domain Dialogue Systems @@ -6454,6 +6898,7 @@ in the Case of Unambiguous Gender tianboji/dialogue-eval ConvAI2 FED + 10.18653/v1/2022.acl-long.445 Updated Headline Generation: Creating Updated Summaries for Evolving News Stories @@ -6464,6 +6909,7 @@ in the Case of Unambiguous Gender We propose the task of updated headline generation, in which a system generates a headline for an updated article, considering both the previous article and headline. The system must identify the novel information in the article update, and modify the existing headline accordingly. We create data for this task using the NewsEdits corpus by automatically identifying contiguous article versions that are likely to require a substantive headline update. We find that models conditioned on the prior headline and body revisions produce headlines judged by humans to be as factual as gold headlines while making fewer unnecessary edits compared to a standard headline generation model. Our experiments establish benchmarks for this new contextual summarization task. 2022.acl-long.446 panthaplackel-etal-2022-updated + 10.18653/v1/2022.acl-long.446 <fixed-case>S</fixed-case>a<fixed-case>F</fixed-case>e<fixed-case>RD</fixed-case>ialogues: Taking Feedback Gracefully after Conversational Safety Failures @@ -6474,6 +6920,7 @@ in the Case of Unambiguous Gender Current open-domain conversational models can easily be made to talk in inadequate ways. 
Online learning from conversational feedback given by the conversation partner is a promising avenue for a model to improve and adapt, so as to generate fewer of these safety failures. However, current state-of-the-art models tend to react to feedback with defensive or oblivious responses. This makes for an unpleasant experience and may discourage conversation partners from giving feedback in the future. This work proposes SaFeRDialogues, a task and dataset of graceful responses to conversational feedback about safety failures. We collect a dataset of 8k dialogues demonstrating safety failures, feedback signaling them, and a response acknowledging the feedback. We show how fine-tuning on this dataset results in conversations that human raters deem considerably more likely to lead to a civil conversation, without sacrificing engagingness or general conversational ability. 2022.acl-long.447 ung-etal-2022-saferdialogues + 10.18653/v1/2022.acl-long.447 Compositional Generalization in Dependency Parsing @@ -6485,6 +6932,7 @@ in the Case of Unambiguous Gender Compositionality—the ability to combine familiar units like words into novel phrases and sentences—has been the focus of intense interest in artificial intelligence in recent years. To test compositional generalization in semantic parsing, Keysers et al. (2020) introduced Compositional Freebase Queries (CFQ). This dataset maximizes the similarity between the test and train distributions over primitive units, like words, while maximizing the compound divergence: the dissimilarity between test and train distributions over larger structures, like phrases. Dependency parsing, however, lacks a compositional generalization benchmark. In this work, we introduce a gold-standard set of dependency parses for CFQ, and use this to analyze the behaviour of a state-of-the-art dependency parser (Qi et al., 2020) on the CFQ dataset. We find that increasing compound divergence degrades dependency parsing performance, although not as dramatically as semantic parsing performance. Additionally, we find the performance of the dependency parser does not uniformly degrade relative to compound divergence, and the parser performs differently on different splits with the same compound divergence. We explore a number of hypotheses for what causes the non-uniform degradation in dependency parsing performance, and identify a number of syntactic structures that drive the dependency parser’s lower performance on the most challenging splits.
2022.acl-long.448 goodwin-etal-2022-compositional + 10.18653/v1/2022.acl-long.448 <fixed-case>ASPECTNEWS</fixed-case>: Aspect-Oriented Summarization of News Documents @@ -6499,6 +6947,7 @@ in the Case of Unambiguous Gender 2022.acl-long.449.software.zip ahuja-etal-2022-aspectnews oja/aosumm + 10.18653/v1/2022.acl-long.449 <fixed-case>M</fixed-case>em<fixed-case>S</fixed-case>um: Extractive Summarization of Long Documents Using Multi-Step Episodic <fixed-case>M</fixed-case>arkov Decision Processes @@ -6512,6 +6961,7 @@ in the Case of Unambiguous Gender gu-etal-2022-memsum nianlonggu/memsum GovReport + 10.18653/v1/2022.acl-long.450 <fixed-case>CLUES</fixed-case>: A Benchmark for Learning Classifiers using Natural Language Explanations @@ -6524,6 +6974,7 @@ in the Case of Unambiguous Gender 2022.acl-long.451.software.zip menon-etal-2022-clues CLUES (Classifier Learning Using natural language ExplanationS) + 10.18653/v1/2022.acl-long.451 Substructure Distribution Projection for Zero-Shot Cross-Lingual Dependency Parsing @@ -6537,6 +6988,7 @@ in the Case of Unambiguous Gender shi-etal-2022-substructure Universal Dependencies WikiMatrix + 10.18653/v1/2022.acl-long.452 Multilingual Detection of Personal Employment Status on <fixed-case>T</fixed-case>witter @@ -6550,6 +7002,7 @@ in the Case of Unambiguous Gender 2022.acl-long.453 tonneau-etal-2022-multilingual manueltonneau/twitter-unemployment + 10.18653/v1/2022.acl-long.453 <fixed-case>M</fixed-case>ulti<fixed-case>H</fixed-case>iertt: Numerical Reasoning over Multi Hierarchical Tabular and Textual Data @@ -6567,6 +7020,7 @@ in the Case of Unambiguous Gender HybridQA MATH MathQA + 10.18653/v1/2022.acl-long.454 Transformers in the loop: Polarity in neural models of language @@ -6581,6 +7035,7 @@ in the Case of Unambiguous Gender altsoph/transformers-in-the-loop Natural sentences that contain *any* Synthetic parallel sentences that contain *any* + 10.18653/v1/2022.acl-long.455 Bridging the Data Gap between Training and Inference for Unsupervised Neural Machine Translation @@ -6595,6 +7050,7 @@ in the Case of Unambiguous Gender 2022.acl-long.456.software.zip he-etal-2022-bridging zwhe99/selftraining4unmt + 10.18653/v1/2022.acl-long.456 <fixed-case>SDR</fixed-case>: Efficient Neural Re-ranking using Succinct Document Representation @@ -6607,6 +7063,7 @@ in the Case of Unambiguous Gender 2022.acl-long.457 cohen-etal-2022-sdr MS MARCO + 10.18653/v1/2022.acl-long.457 The <fixed-case>AI</fixed-case> Doctor Is In: A Survey of Task-Oriented Dialogue Systems for Healthcare Applications @@ -6616,6 +7073,7 @@ in the Case of Unambiguous Gender Task-oriented dialogue systems are increasingly prevalent in healthcare settings, and have been characterized by a diverse range of architectures and objectives. Although these systems have been surveyed in the medical community from a non-technical perspective, a systematic review from a rigorous computational perspective has to date remained noticeably absent. As a result, many important implementation details of healthcare-oriented dialogue systems remain limited or underspecified, slowing the pace of innovation in this area. To fill this gap, we investigated an initial pool of 4070 papers from well-known computer science, natural language processing, and artificial intelligence venues, identifying 70 papers discussing the system-level implementation of task-oriented dialogue systems for healthcare applications. 
We conducted a comprehensive technical review of these papers, and present our key findings including identified gaps and corresponding recommendations. 2022.acl-long.458 valizadeh-parde-2022-ai + 10.18653/v1/2022.acl-long.458 <fixed-case>SHIELD</fixed-case>: Defending Textual Neural Networks against Multiple Black-Box Adversarial Attacks with Stochastic Multi-Expert Patcher @@ -6627,6 +7085,7 @@ in the Case of Unambiguous Gender 2022.acl-long.459 le-etal-2022-shield lethaiq/shield-defend-adversarial-texts + 10.18653/v1/2022.acl-long.459 Accurate Online Posterior Alignments for Principled Lexically-Constrained Decoding @@ -6637,6 +7096,7 @@ in the Case of Unambiguous Gender Online alignment in machine translation refers to the task of aligning a target word to a source word when the target sequence has only been partially decoded. Good online alignments facilitate important applications such as lexically constrained translation where user-defined dictionaries are used to inject lexical constraints into the translation model. We propose a novel posterior alignment technique that is truly online in its execution and superior in terms of alignment error rates compared to existing methods. Our proposed inference technique jointly considers alignment and token probabilities in a principled manner and can be seamlessly integrated within existing constrained beam-search decoding algorithms. On five language pairs, including two distant language pairs, we achieve a consistent drop in alignment error rates. When deployed on seven lexically constrained translation tasks, we achieve significant improvements in BLEU specifically around the constrained positions. 2022.acl-long.460 chatterjee-etal-2022-accurate + 10.18653/v1/2022.acl-long.460 Leveraging Task Transferability to Meta-learning for Clinical Section Classification with Limited Data @@ -6648,6 +7108,7 @@ in the Case of Unambiguous Gender Identifying sections is one of the critical components of understanding medical information from unstructured clinical notes and developing assistive technologies for clinical note-writing tasks. Most state-of-the-art text classification systems require thousands of in-domain text data to achieve high performance. However, collecting in-domain and recent clinical note data with section labels is challenging given the high level of privacy and sensitivity. The present paper proposes an algorithmic way to improve the task transferability of meta-learning-based text classification in order to address the issue of low-resource target data. Specifically, we explore how to make the best use of the source dataset and propose a unique task transferability measure named Normalized Negative Conditional Entropy (NNCE). Leveraging the NNCE, we develop strategies for selecting clinical categories and sections from source task data to boost cross-domain meta-learning accuracy. Experimental results show that our task selection strategies improve section classification accuracy significantly compared to meta-learning algorithms.
2022.acl-long.461 chen-etal-2022-leveraging + 10.18653/v1/2022.acl-long.461 Reinforcement Guided Multi-Task Learning Framework for Low-Resource Stereotype Detection @@ -6664,6 +7125,7 @@ in the Case of Unambiguous Gender Hate Speech Hate Speech and Offensive Language StereoSet + 10.18653/v1/2022.acl-long.462 Letters From the Past: Modeling Historical Sound Change Through Diachronic Character Embeddings @@ -6674,6 +7136,7 @@ in the Case of Unambiguous Gender 2022.acl-long.463 boldsen-paggio-2022-letters syssel/letters-from-the-past + 10.18653/v1/2022.acl-long.463 A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation @@ -6690,6 +7153,7 @@ in the Case of Unambiguous Gender 2022.acl-long.464.software.zip liu-etal-2022-token microsoft/HaDes + 10.18653/v1/2022.acl-long.464 Low-Rank Softmax Can Have Unargmaxable Classes in Theory but Rarely in Practice @@ -6702,6 +7166,7 @@ in the Case of Unambiguous Gender 2022.acl-long.465.software.zip grivas-etal-2022-low andreasgrv/unargmaxable + 10.18653/v1/2022.acl-long.465 <fixed-case>P</fixed-case>rompt for Extraction? <fixed-case>PAIE</fixed-case>: <fixed-case>P</fixed-case>rompting Argument Interaction for Event Argument Extraction @@ -6718,6 +7183,7 @@ in the Case of Unambiguous Gender 2022.acl-long.466.software.zip ma-etal-2022-prompt mayubo2333/paie + 10.18653/v1/2022.acl-long.466 Reducing Position Bias in Simultaneous Machine Translation with Length-Aware Framework @@ -6727,6 +7193,7 @@ in the Case of Unambiguous Gender Simultaneous machine translation (SiMT) starts translating while receiving the streaming source inputs, and hence the source sentence is always incomplete during translating. Different from the full-sentence MT using the conventional seq-to-seq architecture, SiMT often applies prefix-to-prefix architecture, which forces each target word to only align with a partial source prefix to adapt to the incomplete source in streaming inputs. However, the source words in the front positions are always illusorily considered more important since they appear in more prefixes, resulting in position bias, which makes the model pay more attention to the front source positions in testing. In this paper, we first analyze the phenomenon of position bias in SiMT, and develop a Length-Aware Framework to reduce the position bias by bridging the structural gap between SiMT and full-sentence MT. Specifically, given the streaming inputs, we first predict the full-sentence length and then fill the future source position with positional encoding, thereby turning the streaming inputs into a pseudo full-sentence. The proposed framework can be integrated into most existing SiMT methods to further improve performance. Experiments on two representative SiMT methods, including the state-of-the-art adaptive policy, show that our method successfully reduces the position bias and thereby achieves better SiMT performance.
2022.acl-long.467 zhang-feng-2022-reducing + 10.18653/v1/2022.acl-long.467 A Statutory Article Retrieval Dataset in <fixed-case>F</fixed-case>rench @@ -6739,6 +7206,7 @@ in the Case of Unambiguous Gender louis-spanakis-2022-statutory maastrichtlawtech/bsard BSARD + 10.18653/v1/2022.acl-long.468 <fixed-case>P</fixed-case>ara<fixed-case>D</fixed-case>etox: Detoxification with Parallel Data @@ -6755,6 +7223,7 @@ in the Case of Unambiguous Gender 2022.acl-long.469 logacheva-etal-2022-paradetox skoltech-nlp/paradetox + 10.18653/v1/2022.acl-long.469 Interpreting Character Embeddings With Perceptual Representations: The Case of Shape, Sound, and Color @@ -6767,6 +7236,7 @@ in the Case of Unambiguous Gender 2022.acl-long.470.software.zip boldsen-etal-2022-interpreting syssel/interpreting-character-embeddings + 10.18653/v1/2022.acl-long.470 Fine-Grained Controllable Text Generation Using Non-Residual Prompting @@ -6783,6 +7253,7 @@ in the Case of Unambiguous Gender freddefrallan/non-residual-prompting C4 CommonGen + 10.18653/v1/2022.acl-long.471 Language-Agnostic Meta-Learning for Low-Resource Text-to-Speech with Articulatory Features @@ -6794,6 +7265,7 @@ in the Case of Unambiguous Gender lux-vu-2022-language digitalphonetics/ims-toucan CSS10 + 10.18653/v1/2022.acl-long.472 <fixed-case>T</fixed-case>witt<fixed-case>I</fixed-case>rish: A <fixed-case>U</fixed-case>niversal <fixed-case>D</fixed-case>ependencies Treebank of Tweets in <fixed-case>M</fixed-case>odern <fixed-case>I</fixed-case>rish @@ -6805,6 +7277,7 @@ in the Case of Unambiguous Gender Modern Irish is a minority language lacking sufficient computational resources for the task of accurate automatic syntactic parsing of user-generated content such as tweets. Although language technology for the Irish language has been developing in recent years, these tools tend to perform poorly on user-generated content. As with other languages, the linguistic style observed in Irish tweets differs, in terms of orthography, lexicon, and syntax, from that of standard texts more commonly used for the development of language models and parsers. We release the first Universal Dependencies treebank of Irish tweets, facilitating natural language processing of user-generated content in Irish. In this paper, we explore the differences between Irish tweets and standard Irish text, and the challenges associated with dependency parsing of Irish tweets. We describe our bootstrapping method of treebank development and report on preliminary parsing experiments. 
2022.acl-long.473 cassidy-etal-2022-twittirish + 10.18653/v1/2022.acl-long.473 Length Control in Abstractive Summarization by Pretraining Information Selection @@ -6817,6 +7290,7 @@ in the Case of Unambiguous Gender 2022.acl-long.474.software.zip liu-etal-2022-length yizhuliu/lengthcontrol + 10.18653/v1/2022.acl-long.474 <fixed-case>CQG</fixed-case>: A Simple and Effective Controlled Generation Framework for Multi-hop Question Generation @@ -6833,6 +7307,7 @@ in the Case of Unambiguous Gender fei-etal-2022-cqg sion-zcfei/cqg HotpotQA + 10.18653/v1/2022.acl-long.475 Word Order Does Matter and Shuffled Language Models Know It @@ -6852,6 +7327,7 @@ in the Case of Unambiguous Gender ReCoRD SuperGLUE WinoGrande + 10.18653/v1/2022.acl-long.476 An Empirical Study on Explanations in Out-of-Domain Settings @@ -6865,6 +7341,7 @@ in the Case of Unambiguous Gender gchrysostomou/ood_faith IMDb Movie Reviews SST + 10.18653/v1/2022.acl-long.477 <fixed-case>MILIE</fixed-case>: Modular & Iterative Multilingual Open Information Extraction @@ -6881,6 +7358,7 @@ in the Case of Unambiguous Gender 2022.acl-long.478 2022.acl-long.478.software.zip kotnis-etal-2022-milie + 10.18653/v1/2022.acl-long.478 What Makes Reading Comprehension Questions Difficult? @@ -6896,6 +7374,7 @@ in the Case of Unambiguous Gender MCTest RACE ReClor + 10.18653/v1/2022.acl-long.479 From Simultaneous to Streaming Machine Translation by Leveraging Streaming History @@ -6908,6 +7387,7 @@ in the Case of Unambiguous Gender 2022.acl-long.480.software.zip iranzo-sanchez-etal-2022-simultaneous MuST-C + 10.18653/v1/2022.acl-long.480 A Rationale-Centric Framework for Human-in-the-loop Machine Learning @@ -6922,6 +7402,7 @@ in the Case of Unambiguous Gender GeorgeLuImmortal/RDL-Rationales-centric-Double-robustness-Learning IMDb Movie Reviews SST + 10.18653/v1/2022.acl-long.481 Challenges and Strategies in Cross-Cultural <fixed-case>NLP</fixed-case> @@ -6944,6 +7425,7 @@ in the Case of Unambiguous Gender 2022.acl-long.482 hershcovich-etal-2022-challenges MaRVL + 10.18653/v1/2022.acl-long.482 Prototypical Verbalizer for Prompt-based Few-shot Tuning @@ -6958,6 +7440,7 @@ in the Case of Unambiguous Gender cui-etal-2022-prototypical thunlp/OpenPrompt Few-NERD + 10.18653/v1/2022.acl-long.483 Clickbait Spoiling via Question Answering and Passage Retrieval @@ -6974,6 +7457,7 @@ in the Case of Unambiguous Gender MS MARCO SQuAD TriviaQA + 10.18653/v1/2022.acl-long.484 <fixed-case>BERT</fixed-case> Learns to Teach: Knowledge Distillation with Meta Learning @@ -6990,6 +7474,7 @@ in the Case of Unambiguous Gender MRPC QNLI SST + 10.18653/v1/2022.acl-long.485 <fixed-case>STEMM</fixed-case>: Self-learning with Speech-text Manifold Mixup for Speech Translation @@ -7004,6 +7489,7 @@ in the Case of Unambiguous Gender fang-etal-2022-stemm ictnlp/stemm MuST-C + 10.18653/v1/2022.acl-long.486 Integrating Vectorized Lexical Constraints for Neural Machine Translation @@ -7015,6 +7501,7 @@ in the Case of Unambiguous Gender 2022.acl-long.487 wang-etal-2022-integrating shuo-git/vecconstnmt + 10.18653/v1/2022.acl-long.487 <fixed-case>MPII</fixed-case>: Multi-Level Mutual Promotion for Inference and Interpretation @@ -7030,6 +7517,7 @@ in the Case of Unambiguous Gender MultiNLI SNLI e-SNLI + 10.18653/v1/2022.acl-long.488 <fixed-case>S</fixed-case>table<fixed-case>M</fixed-case>o<fixed-case>E</fixed-case>: Stable Routing Strategy for Mixture of Experts @@ -7047,6 +7535,7 @@ in the Case of Unambiguous Gender dai-etal-2022-stablemoe hunter-ddm/stablemoe CC100 + 
10.18653/v1/2022.acl-long.489 Boundary Smoothing for Named Entity Recognition @@ -7061,6 +7550,7 @@ in the Case of Unambiguous Gender CoNLL++ Resume NER Weibo NER + 10.18653/v1/2022.acl-long.490 Incorporating Hierarchy into Text Encoder: a Contrastive Learning Approach for Hierarchical Text Classification @@ -7076,6 +7566,7 @@ in the Case of Unambiguous Gender wzh9969/contrastive-htc RCV1 WOS + 10.18653/v1/2022.acl-long.491 Signal in Noise: Exploring Meaning Encoded in Random Character Sequences with Character-Aware Language Models @@ -7091,6 +7582,7 @@ in the Case of Unambiguous Gender 2022.acl-long.492.software.zip chu-etal-2022-signal comp-syn/garble + 10.18653/v1/2022.acl-long.492 Hyperlink-induced Pre-training for Passage Retrieval in Open-domain Question Answering @@ -7117,6 +7609,7 @@ in the Case of Unambiguous Gender MS MARCO Natural Questions TriviaQA + 10.18653/v1/2022.acl-long.493 <fixed-case>A</fixed-case>da<fixed-case>L</fixed-case>o<fixed-case>GN</fixed-case>: Adaptive Logic Graph Network for Reasoning-Based Machine Reading Comprehension @@ -7133,6 +7626,7 @@ in the Case of Unambiguous Gender nju-websoft/adalogn LogiQA ReClor + 10.18653/v1/2022.acl-long.494 <fixed-case>CAMERO</fixed-case>: Consistency Regularized Ensemble of Perturbed Language Models with Weight Sharing @@ -7151,6 +7645,7 @@ in the Case of Unambiguous Gender MRPC QNLI SST + 10.18653/v1/2022.acl-long.495 Interpretability for Language Learners Using Example-Based Grammatical Error Correction @@ -7166,6 +7661,7 @@ in the Case of Unambiguous Gender kanekomasahiro/eb-gec FCE JFLEG + 10.18653/v1/2022.acl-long.496 Rethinking Negative Sampling for Handling Missing Entity Annotations @@ -7176,6 +7672,7 @@ in the Case of Unambiguous Gender Negative sampling is highly effective in handling missing annotations for named entity recognition (NER). One of our contributions is an analysis of why it works, introducing two insightful concepts: missampling and uncertainty. Empirical studies show that a low missampling rate and high uncertainty are both essential for achieving promising performance with negative sampling. Based on the sparsity of named entities, we also theoretically derive a lower bound for the probability of zero missampling rate, which depends only on sentence length. The other contribution is an adaptive and weighted sampling distribution that further improves negative sampling, building on the preceding analysis. Experiments on synthetic datasets and well-annotated datasets (e.g., CoNLL-2003) show that our proposed approach benefits negative sampling in terms of F1 score and loss convergence. Moreover, models with improved negative sampling have achieved new state-of-the-art results on real-world datasets (e.g., EC).
2022.acl-long.497 li-etal-2022-rethinking + 10.18653/v1/2022.acl-long.497 Distantly Supervised Named Entity Recognition via Confidence-Based Multi-Class Positive and Unlabeled Learning @@ -7187,6 +7684,7 @@ in the Case of Unambiguous Gender 2022.acl-long.498 2022.acl-long.498.software.zip zhou-etal-2022-distantly + 10.18653/v1/2022.acl-long.498 <fixed-case>U</fixed-case>ni<fixed-case>X</fixed-case>coder: Unified Cross-Modal Pre-training for Code Representation @@ -7204,6 +7702,7 @@ in the Case of Unambiguous Gender CoSQA CodeSearchNet CodeXGLUE + 10.18653/v1/2022.acl-long.499 One Country, 700+ Languages: <fixed-case>NLP</fixed-case> Challenges for Underrepresented Languages and Dialects in <fixed-case>I</fixed-case>ndonesia @@ -7223,6 +7722,7 @@ in the Case of Unambiguous Gender NLP research is impeded by a lack of resources and awareness of the challenges presented by underrepresented languages and dialects. Focusing on the languages spoken in Indonesia, the second most linguistically diverse and the fourth most populous nation of the world, we provide an overview of the current state of NLP research for Indonesia’s 700+ languages. We highlight challenges in Indonesian NLP and how these affect the performance of current NLP systems. Finally, we provide general recommendations to help develop NLP technology not only for languages of Indonesia but also other underrepresented languages. 2022.acl-long.500 aji-etal-2022-one + 10.18653/v1/2022.acl-long.500 Is <fixed-case>GPT</fixed-case>-3 Text Indistinguishable from Human Text? Scarecrow: A Framework for Scrutinizing Machine Text @@ -7237,6 +7737,7 @@ in the Case of Unambiguous Gender 2022.acl-long.501.software.zip dou-etal-2022-gpt WebText + 10.18653/v1/2022.acl-long.501 Transkimmer: Transformer Learns to Layer-wise Skim @@ -7252,6 +7753,7 @@ in the Case of Unambiguous Gender GLUE IMDb Movie Reviews QNLI + 10.18653/v1/2022.acl-long.502 <fixed-case>S</fixed-case>kip<fixed-case>BERT</fixed-case>: Efficient Inference with Shallow Layer Skipping @@ -7269,6 +7771,7 @@ in the Case of Unambiguous Gender MRPC SQuAD SST + 10.18653/v1/2022.acl-long.503 Pretraining with Artificial Language: Studying Transferable Knowledge in Language Models @@ -7279,6 +7782,7 @@ in the Case of Unambiguous Gender 2022.acl-long.504 ri-tsuruoka-2022-pretraining Penn Treebank + 10.18653/v1/2022.acl-long.504 m<fixed-case>LUKE</fixed-case>: <fixed-case>T</fixed-case>he Power of Entity Representations in Multilingual Pretrained Language Models @@ -7298,6 +7802,7 @@ in the Case of Unambiguous Gender RELX SQuAD XQuAD + 10.18653/v1/2022.acl-long.505 Evaluating Factuality in Text Simplification @@ -7313,6 +7818,7 @@ in the Case of Unambiguous Gender ashologn/evaluating-factuality-in-text-simplification Newsela WikiLarge + 10.18653/v1/2022.acl-long.506 Requirements and Motivations of Low-Resource Speech Synthesis for Language Revitalization @@ -7326,6 +7832,7 @@ in the Case of Unambiguous Gender This paper describes the motivation and development of speech synthesis systems for the purposes of language revitalization. By building speech synthesis systems for three Indigenous languages spoken in Canada, Kanien’kéha, Gitksan & SENĆOŦEN, we re-evaluate the question of how much data is required to build low-resource speech synthesis systems featuring state-of-the-art neural models. 
For example, preliminary results with English data show that a FastSpeech2 model trained with 1 hour of training data can produce speech with comparable naturalness to a Tacotron2 model trained with 10 hours of data. Finally, we motivate future research in evaluation and classroom integration in the field of speech synthesis for language revitalization. 2022.acl-long.507 pine-etal-2022-requirements + 10.18653/v1/2022.acl-long.507 Sharpness-Aware Minimization Improves Language Model Generalization @@ -7343,6 +7850,7 @@ in the Case of Unambiguous Gender TyDi QA TyDiQA-GoldP WebQuestions + 10.18653/v1/2022.acl-long.508 Adversarial Authorship Attribution for Deobfuscation @@ -7354,6 +7862,7 @@ in the Case of Unambiguous Gender Recent advances in natural language processing have enabled powerful privacy-invasive authorship attribution. To counter authorship attribution, researchers have proposed a variety of rule-based and learning-based text obfuscation approaches. However, existing authorship obfuscation approaches do not consider the adversarial threat model. Specifically, they are not evaluated against adversarially trained authorship attributors that are aware of potential obfuscation. To fill this gap, we investigate the problem of adversarial authorship attribution for deobfuscation. We show that adversarially trained authorship attributors are able to degrade the effectiveness of existing obfuscators from 20-30% to 5-10%. We also evaluate the effectiveness of adversarial training when the attributor makes incorrect assumptions about whether and which obfuscator was used. While there is a clear degradation in attribution accuracy, it is noteworthy that this degradation is still at or above the attribution accuracy of the attributor that is not adversarially trained at all. Our results motivate the need to develop authorship obfuscation approaches that are resistant to deobfuscation.
2022.acl-long.509 zhai-etal-2022-adversarial + 10.18653/v1/2022.acl-long.509 Weakly Supervised Word Segmentation for Computational Language Documentation @@ -7365,6 +7874,7 @@ in the Case of Unambiguous Gender 2022.acl-long.510 okabe-etal-2022-weakly shuokabe/pyseg + 10.18653/v1/2022.acl-long.510 <fixed-case>S</fixed-case>ci<fixed-case>NLI</fixed-case>: A Corpus for Natural Language Inference on Scientific Text @@ -7381,6 +7891,7 @@ in the Case of Unambiguous Gender SNLI SWAG SuperGLUE + 10.18653/v1/2022.acl-long.511 Neural reality of argument structure constructions @@ -7395,6 +7906,7 @@ in the Case of Unambiguous Gender 2022.acl-long.512.software.zip li-etal-2022-neural spoclab-ca/neural-reality-constructions + 10.18653/v1/2022.acl-long.512 On the Robustness of Offensive Language Classifiers @@ -7408,6 +7920,7 @@ in the Case of Unambiguous Gender rusert-etal-2022-robustness jonrusert/robustnessofoffensiveclassifiers OLID + 10.18653/v1/2022.acl-long.513 Few-shot Controllable Style Transfer for Low-Resource Multilingual Settings @@ -7423,6 +7936,7 @@ in the Case of Unambiguous Gender Samanantar XFORMAL mC4 + 10.18653/v1/2022.acl-long.514 <fixed-case>ABC</fixed-case>: Attention with Bounded-memory Control @@ -7443,6 +7957,7 @@ in the Case of Unambiguous Gender WMT 2014 WikiText-103 WikiText-2 + 10.18653/v1/2022.acl-long.515 The Dangers of Underclaiming: Reasons for Caution When Reporting How <fixed-case>NLP</fixed-case> Systems Fail @@ -7453,6 +7968,7 @@ in the Case of Unambiguous Gender bowman-2022-dangers SQuAD SuperGLUE + 10.18653/v1/2022.acl-long.516 <fixed-case>REL</fixed-case>i<fixed-case>C</fixed-case>: Retrieving Evidence for Literary Claims @@ -7467,6 +7983,7 @@ in the Case of Unambiguous Gender martiansideofthemoon/relic-retrieval RELiC BEIR + 10.18653/v1/2022.acl-long.517 Analyzing Generalization of Vision and Language Navigation to Unseen Outdoor Areas @@ -7480,6 +7997,7 @@ in the Case of Unambiguous Gender raphael-sch/map2seq_vln Touchdown Dataset map2seq + 10.18653/v1/2022.acl-long.518 Adapting Coreference Resolution Models through Active Learning @@ -7493,6 +8011,7 @@ in the Case of Unambiguous Gender 2022.acl-long.519 yuan-etal-2022-adapting forest-snow/incremental-coref + 10.18653/v1/2022.acl-long.519 An Imitation Learning Curriculum for Text Editing with Non-Autoregressive Models @@ -7503,6 +8022,7 @@ in the Case of Unambiguous Gender 2022.acl-long.520 agrawal-carpuat-2022-imitation Newsela + 10.18653/v1/2022.acl-long.520 Memorisation versus Generalisation in Pre-trained Language Models @@ -7517,6 +8037,7 @@ in the Case of Unambiguous Gender CoNLL++ CoNLL-2003 WNUT 2017 + 10.18653/v1/2022.acl-long.521 <fixed-case>C</fixed-case>hat<fixed-case>M</fixed-case>atch: Evaluating Chatbots by Autonomous Chat Tournaments @@ -7530,6 +8051,7 @@ in the Case of Unambiguous Gender 2022.acl-long.522.software.zip yang-etal-2022-chatmatch ruolanyang/chatmatch + 10.18653/v1/2022.acl-long.522 Do self-supervised speech models develop human-like perception biases? 
@@ -7541,6 +8063,7 @@ in the Case of Unambiguous Gender millet-dunbar-2022-self AudioSet LibriSpeech + 10.18653/v1/2022.acl-long.523 Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions @@ -7559,6 +8082,7 @@ in the Case of Unambiguous Gender RxR StreetLearn Talk the Walk + 10.18653/v1/2022.acl-long.524 Learning to Generate Programs for Table Fact Verification via Structure-Aware Semantic Parsing @@ -7570,6 +8094,7 @@ in the Case of Unambiguous Gender ou-liu-2022-learning ousuixin/sasp TabFact + 10.18653/v1/2022.acl-long.525 Cluster & Tune: <fixed-case>B</fixed-case>oost Cold Start Performance in Text Classification @@ -7585,6 +8110,7 @@ in the Case of Unambiguous Gender 2022.acl-long.526 shnarch-etal-2022-cluster ibm/intermediate-training-using-clustering + 10.18653/v1/2022.acl-long.526 Overcoming a Theoretical Limitation of Self-Attention @@ -7595,6 +8121,7 @@ in the Case of Unambiguous Gender 2022.acl-long.527 chiang-cholak-2022-overcoming ndnlp/parity + 10.18653/v1/2022.acl-long.527 Prediction Difference Regularization against Perturbation for Neural Machine Translation @@ -7606,6 +8133,7 @@ in the Case of Unambiguous Gender Regularization methods applying input perturbation have drawn considerable attention and have been frequently explored for NMT tasks in recent years. Despite their simplicity and effectiveness, we argue that these methods are limited by the under-fitting of training data. In this paper, we utilize prediction difference for ground-truth tokens to analyze the fitting of token-level samples and find that under-fitting is almost as common as over-fitting. We introduce prediction difference regularization (PD-R), a simple and effective method that can reduce over-fitting and under-fitting at the same time. For all token-level samples, PD-R minimizes the prediction difference between the original pass and the input-perturbed pass, making the model less sensitive to small input changes, thus more robust to both perturbations and under-fitted training data. Experiments on three widely used WMT translation tasks show that our approach can significantly improve over existing perturbation regularization methods. On the WMT16 En-De task, our model achieves a 1.80 SacreBLEU improvement over the vanilla transformer. 2022.acl-long.528 guo-etal-2022-prediction + 10.18653/v1/2022.acl-long.528 Make the Best of Cross-lingual Transfer: Evidence from <fixed-case>POS</fixed-case> Tagging with over 100 Languages @@ -7617,6 +8145,7 @@ in the Case of Unambiguous Gender 2022.acl-long.529 de-vries-etal-2022-make wietsedv/xpos + 10.18653/v1/2022.acl-long.529 Should a Chatbot be Sarcastic? Understanding User Preferences Towards Sarcasm Generation @@ -7627,6 +8156,7 @@ in the Case of Unambiguous Gender Previous sarcasm generation research has focused on how to generate text that people perceive as sarcastic to create more human-like interactions. In this paper, we argue that we should first turn our attention to the question of when sarcasm should be generated, finding that humans consider sarcastic responses inappropriate for many input utterances. Next, we use a theory-driven framework for generating sarcastic responses, which allows us to control the linguistic devices included during generation. For each device, we investigate how much humans associate it with sarcasm, finding that pragmatic insincerity and emotional markers are devices crucial for making sarcasm recognisable.
2022.acl-long.530 oprea-etal-2022-chatbot + 10.18653/v1/2022.acl-long.530 How Do <fixed-case>S</fixed-case>eq2<fixed-case>S</fixed-case>eq Models Perform on End-to-End Data-to-Text Generation? @@ -7639,6 +8169,7 @@ in the Case of Unambiguous Gender xunjianyin/seq2seqondata2text ToTTo WikiBio + 10.18653/v1/2022.acl-long.531 Probing for Labeled Dependency Trees @@ -7652,6 +8183,7 @@ in the Case of Unambiguous Gender muller-eberstein-etal-2022-probing personads/depprobe Universal Dependencies + 10.18653/v1/2022.acl-long.532 <fixed-case>D</fixed-case>o<fixed-case>C</fixed-case>o<fixed-case>G</fixed-case>en: <fixed-case>D</fixed-case>omain Counterfactual Generation for Low Resource Domain Adaptation @@ -7664,6 +8196,7 @@ in the Case of Unambiguous Gender 2022.acl-long.533 calderon-etal-2022-docogen nitaytech/docogen + 10.18653/v1/2022.acl-long.533 <fixed-case>L</fixed-case>i<fixed-case>LT</fixed-case>: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding @@ -7680,6 +8213,7 @@ in the Case of Unambiguous Gender FUNSD RVL-CDIP XFUND + 10.18653/v1/2022.acl-long.534 Dependency-based Mixture Language Models @@ -7693,6 +8227,7 @@ in the Case of Unambiguous Gender fadedcosine/dependency-guided-neural-text-generation Penn Treebank ROCStories + 10.18653/v1/2022.acl-long.535 Can Unsupervised Knowledge Transfer from Social Discussions Help Argument Mining? @@ -7706,6 +8241,7 @@ in the Case of Unambiguous Gender 2022.acl-long.536.software.zip dutta-etal-2022-unsupervised jeevesh8/arg_mining + 10.18653/v1/2022.acl-long.536 Entity-based Neural Local Coherence Modeling @@ -7717,6 +8253,7 @@ in the Case of Unambiguous Gender jeon-strube-2022-entity sdeva14/acl22-entity-neural-local-cohe GCDC + 10.18653/v1/2022.acl-long.537 “That Is a Suspicious Reaction!”: Interpreting Logits Variation to Detect <fixed-case>NLP</fixed-case> Adversarial Attacks @@ -7731,6 +8268,7 @@ in the Case of Unambiguous Gender mosca-etal-2022-suspicious AG News IMDb Movie Reviews + 10.18653/v1/2022.acl-long.538 Local Languages, Third Spaces, and other High-Resource Scenarios @@ -7739,6 +8277,7 @@ in the Case of Unambiguous Gender How can language technology address the diverse situations of the world’s languages? In one view, languages exist on a resource continuum and the challenge is to scale existing solutions, bringing under-resourced languages into the high-resource world. In another view, presented here, the world’s language ecology includes standardised languages, local languages, and contact languages. These are often subsumed under the label of “under-resourced languages” even though they have distinct functions and prospects. I explore this position and propose some ecologically-aware language technology agendas. 2022.acl-long.539 bird-2022-local + 10.18653/v1/2022.acl-long.539 That Slepen Al the Nyght with Open Ye! Cross-era Sequence Segmentation with Switch-memory @@ -7748,6 +8287,7 @@ in the Case of Unambiguous Gender The evolution of language follows the rule of gradual change. Grammar, vocabulary, and lexical semantic shifts take place over time, resulting in a diachronic linguistic gap. As such, a considerable number of texts are written in languages of different eras, which creates obstacles for natural language processing tasks, such as word segmentation and machine translation. Although the Chinese language has a long history, previous Chinese natural language processing research has primarily focused on tasks within a specific era.
Therefore, we propose a cross-era learning framework for Chinese word segmentation (CWS), CROSSWISE, which uses the Switch-memory (SM) module to incorporate era-specific linguistic knowledge. Experiments on four corpora from different eras show that performance on each corpus significantly improves. Further analyses also demonstrate that the SM can effectively integrate the knowledge of the eras into the neural network. 2022.acl-long.540 tang-su-2022-slepen + 10.18653/v1/2022.acl-long.540 Fair and Argumentative Language Modeling for Computational Argumentation @@ -7760,6 +8300,7 @@ in the Case of Unambiguous Gender 2022.acl-long.541.software.zip holtermann-etal-2022-fair umanlp/fairargumentativelm + 10.18653/v1/2022.acl-long.541 Learning Adaptive Segmentation Policy for End-to-End Simultaneous Translation @@ -7773,6 +8314,7 @@ in the Case of Unambiguous Gender zhang-etal-2022-learning BSTC MuST-C + 10.18653/v1/2022.acl-long.542 Can Pre-trained Language Models Interpret Similes as Smart as Human? @@ -7786,6 +8328,7 @@ in the Case of Unambiguous Gender 2022.acl-long.543 he-etal-2022-pre abbey4799/plms-interpret-simile + 10.18653/v1/2022.acl-long.543 <fixed-case>CBLUE</fixed-case>: A <fixed-case>C</fixed-case>hinese Biomedical Language Understanding Evaluation Benchmark @@ -7828,6 +8371,7 @@ in the Case of Unambiguous Gender CLUE CMeIE SuperGLUE + 10.18653/v1/2022.acl-long.544 Learning Non-Autoregressive Models from Search for Unsupervised Sentence Summarization @@ -7839,6 +8383,7 @@ in the Case of Unambiguous Gender 2022.acl-long.545 liu-etal-2022-learning manga-uofa/naus + 10.18653/v1/2022.acl-long.545 Learning to Generalize to More: Continuous Semantic Augmentation for Neural Machine Translation @@ -7854,6 +8399,7 @@ in the Case of Unambiguous Gender 2022.acl-long.546 wei-etal-2022-learning pemywei/csanmt + 10.18653/v1/2022.acl-long.546 Lexical Knowledge Internalization for Neural Dialog Generation @@ -7869,6 +8415,7 @@ in the Case of Unambiguous Gender lividwo/ki DailyDialog Wizard of Wikipedia + 10.18653/v1/2022.acl-long.547 Modeling Syntactic-Semantic Dependency Correlations in Semantic Role Labeling Using Mixture Models @@ -7881,6 +8428,7 @@ in the Case of Unambiguous Gender 2022.acl-long.548.software.zip chen-etal-2022-modeling christomartin/syn-sem_dependency_correlation_mixture_model + 10.18653/v1/2022.acl-long.548 Learning the Beauty in Songs: Neural Singing Voice Beautifier @@ -7894,6 +8442,7 @@ in the Case of Unambiguous Gender 2022.acl-long.549 liu-etal-2022-learning-beauty moonintheriver/neuralsvb + 10.18653/v1/2022.acl-long.549 A Model-agnostic Data Manipulation Method for Persona-based Dialogue Generation @@ -7909,6 +8458,7 @@ in the Case of Unambiguous Gender cao-etal-2022-model caoyu-noob/d3 PERSONA-CHAT + 10.18653/v1/2022.acl-long.550 <fixed-case>L</fixed-case>ink<fixed-case>BERT</fixed-case>: Pretraining Language Models with Document Links @@ -7943,6 +8493,7 @@ in the Case of Unambiguous Gender SQuAD SearchQA TriviaQA + 10.18653/v1/2022.acl-long.551 Improving Time Sensitivity for Question Answering over Temporal Knowledge Graphs @@ -7955,6 +8506,7 @@ in the Case of Unambiguous Gender 2022.acl-long.552 shang-etal-2022-improving CronQuestions + 10.18653/v1/2022.acl-long.552 Self-supervised Semantic-driven Phoneme Discovery for Zero-resource Speech Recognition @@ -7967,6 +8519,7 @@ in the Case of Unambiguous Gender 2022.acl-long.553 wang-etal-2022-self LibriSpeech + 10.18653/v1/2022.acl-long.553 Softmax Bottleneck Makes Language Models Unable to Represent Multi-mode Word
Distributions @@ -7979,6 +8532,7 @@ in the Case of Unambiguous Gender chang-mccallum-2022-softmax ProtoQA WebText + 10.18653/v1/2022.acl-long.554 Ditch the Gold Standard: Re-evaluating Conversational Question Answering @@ -7995,6 +8549,7 @@ in the Case of Unambiguous Gender CANARD CoQA QuAC + 10.18653/v1/2022.acl-long.555 Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity @@ -8011,6 +8566,7 @@ in the Case of Unambiguous Gender AG News MPQA Opinion Corpus SST + 10.18653/v1/2022.acl-long.556 Situated Dialogue Learning through Procedural Environment Generation @@ -8021,6 +8577,7 @@ in the Case of Unambiguous Gender We teach goal-driven agents to interactively act and speak in situated environments by training on generated curriculums. Our agents operate in LIGHT (Urbanek et al. 2019)—a large-scale crowd-sourced fantasy text adventure game wherein an agent perceives and interacts with the world through textual natural language. Goals in this environment take the form of character-based quests, consisting of personas and motivations. We augment LIGHT by learning to procedurally generate additional novel textual worlds and quests to create a curriculum of steadily increasing difficulty for training agents to achieve such goals. In particular, we measure curriculum difficulty in terms of the rarity of the quest in the original training distribution—an easier environment is one that is more likely to have been found in the unaugmented dataset. An ablation study shows that this method of learning from the tail of a distribution results in significantly higher generalization abilities as measured by zero-shot performance on never-before-seen quests. 2022.acl-long.557 ammanabrolu-etal-2022-situated + 10.18653/v1/2022.acl-long.557 <fixed-case>U</fixed-case>ni<fixed-case>TE</fixed-case>: Unified Translation Evaluation @@ -8037,6 +8594,7 @@ in the Case of Unambiguous Gender 2022.acl-long.558.software.zip wan-etal-2022-unite nlp2ct/unite + 10.18653/v1/2022.acl-long.558 Program Transfer for Answering Complex Questions over Knowledge Bases @@ -8056,6 +8614,7 @@ in the Case of Unambiguous Gender thu-keg/programtransfer ComplexWebQuestions WebQuestions + 10.18653/v1/2022.acl-long.559 <fixed-case>EAG</fixed-case>: Extract and Generate Multi-way Aligned Corpus for Complete Multi-lingual Neural Machine Translation @@ -8069,6 +8628,7 @@ in the Case of Unambiguous Gender 2022.acl-long.560.software.zip xu-etal-2022-eag OPUS-100 + 10.18653/v1/2022.acl-long.560 Using Context-to-Vector with Graph Retrofitting to Improve Word Embeddings @@ -8084,6 +8644,7 @@ in the Case of Unambiguous Gender Although contextualized embeddings generated from large-scale pre-trained models perform well in many tasks, traditional static embeddings (e.g., Skip-gram, Word2Vec) still play an important role in low-resource and lightweight settings due to their low computational cost, ease of deployment, and stability. In this paper, we aim to improve word embeddings by 1) incorporating more contextual information from existing pre-trained models into the Skip-gram framework, which we call Context-to-Vec; 2) proposing a post-processing retrofitting method for static embeddings independent of training by employing a priori synonym knowledge and weighted vector distribution. Through extrinsic and intrinsic tasks, our methods are shown to outperform the baselines by a large margin.
2022.acl-long.561 zheng-etal-2022-using + 10.18653/v1/2022.acl-long.561 Multimodal Sarcasm Target Identification in Tweets @@ -8098,6 +8659,7 @@ in the Case of Unambiguous Gender 2022.acl-long.562.software.zip wang-etal-2022-multimodal wjq-learning/msti + 10.18653/v1/2022.acl-long.562 Flexible Generation from Fragmentary Linguistic Input @@ -8109,6 +8671,7 @@ in the Case of Unambiguous Gender qian-levy-2022-flexible pqian11/fragment-completion New York Times Annotated Corpus + 10.18653/v1/2022.acl-long.563 Revisiting Over-Smoothness in Text to Speech @@ -8122,6 +8685,7 @@ in the Case of Unambiguous Gender 2022.acl-long.564 ren-etal-2022-revisiting LJSpeech + 10.18653/v1/2022.acl-long.564 Coherence boosting: When your pretrained language model is not paying enough attention @@ -8144,6 +8708,7 @@ in the Case of Unambiguous Gender PIQA SST WebText + 10.18653/v1/2022.acl-long.565 Uncertainty Estimation of Transformer Predictions for Misclassification Detection @@ -8169,6 +8734,7 @@ in the Case of Unambiguous Gender GLUE MRPC SST + 10.18653/v1/2022.acl-long.566 <fixed-case>VALSE</fixed-case>: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena @@ -8187,6 +8753,7 @@ in the Case of Unambiguous Gender VisDial Visual Question Answering Visual7W + 10.18653/v1/2022.acl-long.567 The Grammar-Learning Trajectories of Neural Language Models @@ -8203,6 +8770,7 @@ in the Case of Unambiguous Gender OpenSubtitles OpenWebText WebText + 10.18653/v1/2022.acl-long.568 Generating Scientific Definitions with Controllable Complexity @@ -8214,6 +8782,7 @@ in the Case of Unambiguous Gender 2022.acl-long.569 august-etal-2022-generating talaugust/definition-complexity + 10.18653/v1/2022.acl-long.569 Label Semantic Aware Pre-training for Few-shot Text Classification @@ -8231,6 +8800,7 @@ in the Case of Unambiguous Gender SGD SNIPS TOPv2 + 10.18653/v1/2022.acl-long.570 <fixed-case>ODE</fixed-case> Transformer: An Ordinary Differential Equation-Inspired Model for Sequence Generation @@ -8250,6 +8820,7 @@ in the Case of Unambiguous Gender 2022.acl-long.571.software.zip li-etal-2022-ode libeineu/ode-transformer + 10.18653/v1/2022.acl-long.571 A Comparison of Strategies for Source-Free Domain Adaptation @@ -8261,6 +8832,7 @@ in the Case of Unambiguous Gender 2022.acl-long.572 su-etal-2022-comparison xinsu626/sourcefreedomainadaptation + 10.18653/v1/2022.acl-long.572 Ethics Sheets for <fixed-case>AI</fixed-case> Tasks @@ -8269,6 +8841,7 @@ in the Case of Unambiguous Gender Several high-profile events, such as the mass testing of emotion recognition systems on vulnerable sub-populations and using question answering systems to make moral judgments, have highlighted how technology will often lead to more adverse outcomes for those that are already marginalized. At issue here are not just individual systems and datasets, but also the AI tasks themselves. In this position paper, I make a case for thinking about ethical considerations not just at the level of individual models and datasets, but also at the level of AI tasks. I will present a new form of such an effort, Ethics Sheets for AI Tasks, dedicated to fleshing out the assumptions and ethical considerations hidden in how a task is commonly framed and in the choices we make regarding the data, method, and evaluation. I will also present a template for ethics sheets with 50 ethical considerations, using the task of emotion recognition as a running example. 
Ethics sheets are a mechanism to engage with and document ethical considerations before building datasets and systems. Similar to survey articles, a small number of carefully created ethics sheets can serve numerous researchers and developers. 2022.acl-long.573 mohammad-2022-ethics + 10.18653/v1/2022.acl-long.573 Learning Disentangled Representations of Negation and Uncertainty @@ -8281,6 +8854,7 @@ in the Case of Unambiguous Gender 2022.acl-long.574 vasilakes-etal-2022-learning jvasilakes/disentanglement-vae + 10.18653/v1/2022.acl-long.574 <fixed-case>latent-GLAT</fixed-case>: Glancing at Latent Variables for Parallel Text Generation @@ -8298,6 +8872,7 @@ in the Case of Unambiguous Gender bao-etal-2022-textit baoy-nlp/latent-glat DailyDialog + 10.18653/v1/2022.acl-long.575 <fixed-case>PPT</fixed-case>: Pre-trained Prompt Tuning for Few-shot Learning @@ -8317,6 +8892,7 @@ in the Case of Unambiguous Gender OCNLI SST SuperGLUE + 10.18653/v1/2022.acl-long.576 Deduplicating Training Data Makes Language Models Better @@ -8335,6 +8911,7 @@ in the Case of Unambiguous Gender Billion Word Benchmark RealNews Wiki-40B + 10.18653/v1/2022.acl-long.577 Improving the Generalizability of Depression Detection by Leveraging Clinical Questionnaires @@ -8349,6 +8926,7 @@ in the Case of Unambiguous Gender nguyen-etal-2022-improving thongnt99/acl22-depression-phq9 SMHD + 10.18653/v1/2022.acl-long.578 <fixed-case>I</fixed-case>nternet-Augmented Dialogue Generation @@ -8362,6 +8940,7 @@ in the Case of Unambiguous Gender PERSONA-CHAT Topical-Chat Wizard of Wikipedia + 10.18653/v1/2022.acl-long.579 <fixed-case>SUPERB</fixed-case>-<fixed-case>SG</fixed-case>: Enhanced Speech processing Universal <fixed-case>PER</fixed-case>formance Benchmark for Semantic and Generative Capabilities @@ -8391,6 +8970,7 @@ in the Case of Unambiguous Gender Common Voice DEMAND LibriMix + 10.18653/v1/2022.acl-long.580 Knowledge Neurons in Pretrained Transformers @@ -8406,6 +8986,7 @@ in the Case of Unambiguous Gender 2022.acl-long.581.software.zip dai-etal-2022-knowledge hunter-ddm/knowledge-neurons + 10.18653/v1/2022.acl-long.581 Meta-Learning for Fast Cross-Lingual Adaptation in Dependency Parsing @@ -8422,6 +9003,7 @@ in the Case of Unambiguous Gender 2022.acl-long.582.software.zip langedijk-etal-2022-meta annaproxy/udify-metalearning + 10.18653/v1/2022.acl-long.582 <fixed-case>F</fixed-case>rench <fixed-case>C</fixed-case>row<fixed-case>S</fixed-case>-Pairs: Extending a challenge dataset for measuring social bias in masked language models to a language other than <fixed-case>E</fixed-case>nglish @@ -8434,6 +9016,7 @@ in the Case of Unambiguous Gender 2022.acl-long.583 neveol-etal-2022-french CrowS-Pairs + 10.18653/v1/2022.acl-long.583 Few-Shot Learning with <fixed-case>S</fixed-case>iamese Networks and Label Tuning @@ -8451,6 +9034,7 @@ in the Case of Unambiguous Gender IMDb Movie Reviews ISEAR SNLI + 10.18653/v1/2022.acl-long.584 Inferring Rewards from Language in Context @@ -8463,6 +9047,7 @@ in the Case of Unambiguous Gender 2022.acl-long.585 lin-etal-2022-inferring jlin816/rewards-from-language + 10.18653/v1/2022.acl-long.585 Generating Biographies on <fixed-case>W</fixed-case>ikipedia: The Impact of Gender Bias on the Retrieval-Based Generation of Women Biographies @@ -8473,6 +9058,7 @@ in the Case of Unambiguous Gender 2022.acl-long.586 fan-gardent-2022-generating WikiSum + 10.18653/v1/2022.acl-long.586 Your Answer is Incorrect... Would you like to know why? 
Introducing a Bilingual Short Answer Feedback Dataset @@ -8488,6 +9074,7 @@ in the Case of Unambiguous Gender filighera-etal-2022-answer sebochs/saf SNLI + 10.18653/v1/2022.acl-long.587 Towards Better Characterization of Paraphrases @@ -8502,6 +9089,7 @@ in the Case of Unambiguous Gender GLUE MRPC PAWS + 10.18653/v1/2022.acl-long.588 <fixed-case>S</fixed-case>umm<fixed-case>S</fixed-case>creen: A Dataset for Abstractive Screenplay Summarization @@ -8516,6 +9104,7 @@ in the Case of Unambiguous Gender mingdachen/SummScreen Multi-News TVRecap + 10.18653/v1/2022.acl-long.589 Sparsifying Transformer Models with Trainable Representation Pooling @@ -8530,6 +9119,7 @@ in the Case of Unambiguous Gender applicaai/pyramidions Pubmed arXiv Summarization Dataset + 10.18653/v1/2022.acl-long.590 Uncertainty Determines the Adequacy of the Mode and the Tractability of Decoding in Sequence-to-Sequence Models @@ -8541,6 +9131,7 @@ in the Case of Unambiguous Gender 2022.acl-long.591 stahlberg-etal-2022-uncertainty JFLEG + 10.18653/v1/2022.acl-long.591 <fixed-case>F</fixed-case>lip<fixed-case>DA</fixed-case>: Effective and Robust Data Augmentation for Few-Shot Learning @@ -8562,6 +9153,7 @@ in the Case of Unambiguous Gender SuperGLUE WSC WiC + 10.18653/v1/2022.acl-long.592 Text-Free Prosody-Aware Generative Spoken Language Modeling @@ -8582,6 +9174,7 @@ in the Case of Unambiguous Gender kharitonov-etal-2022-text pytorch/fairseq LibriSpeech + 10.18653/v1/2022.acl-long.593 Lite Unified Modeling for Discriminative Reading Comprehension @@ -8598,6 +9191,7 @@ in the Case of Unambiguous Gender DREAM RACE SQuAD + 10.18653/v1/2022.acl-long.594 Bilingual alignment transfers to multilingual alignment for unsupervised parallel text mining @@ -8608,6 +9202,7 @@ in the Case of Unambiguous Gender 2022.acl-long.595 tien-steinert-threlkeld-2022-bilingual cctien/bimultialign + 10.18653/v1/2022.acl-long.595 End-to-End Modeling via Information Tree for One-Shot Natural Language Spatial Video Grounding @@ -8627,6 +9222,7 @@ in the Case of Unambiguous Gender Natural language spatial video grounding aims to detect the relevant objects in video frames with descriptive sentences as the query. In spite of the great advances, most existing methods rely on dense video frame annotations, which require a tremendous amount of human effort. To achieve effective grounding under a limited annotation budget, we investigate one-shot video grounding and learn to ground natural language in all video frames with solely one frame labeled, in an end-to-end manner. One major challenge of end-to-end one-shot video grounding is the existence of video frames that are irrelevant to either the language query or the labeled frame. Another challenge relates to the limited supervision, which might result in ineffective representation learning. To address these challenges, we designed an end-to-end model via Information Tree for One-Shot video grounding (IT-OS). Its key module, the information tree, can eliminate the interference of irrelevant frames based on branch search and branch cropping techniques. In addition, several self-supervised tasks are proposed based on the information tree to improve the representation learning under insufficient labeling. Experiments on the benchmark dataset demonstrate the effectiveness of our model.
2022.acl-long.596 li-etal-2022-end + 10.18653/v1/2022.acl-long.596 <fixed-case>RNS</fixed-case>um: A Large-Scale Dataset for Automatic Release Note Generation via Commit Logs Summarization @@ -8639,6 +9235,7 @@ in the Case of Unambiguous Gender A release note is a technical document that describes the latest changes to a software product and is crucial in open source software development. However, it still remains challenging to generate release notes automatically. In this paper, we present a new dataset called RNSum, which contains approximately 82,000 English release notes and the associated commit messages derived from the online repositories in GitHub. Then, we propose classwise extractive-then-abstractive/abstractive summarization approaches to this task, which can employ a modern transformer-based seq2seq network like BART and can be applied to various repositories without specific constraints. The experimental results on the RNSum dataset show that the proposed methods can generate less noisy release notes at higher coverage than the baselines. We also observe that there is a significant gap in the coverage of essential information when compared to human references. Our dataset and the code are publicly available. 2022.acl-long.597 kamezawa-etal-2022-rnsum + 10.18653/v1/2022.acl-long.597 Improving Machine Reading Comprehension with Contextualized Commonsense Knowledge @@ -8654,6 +9251,7 @@ in the Case of Unambiguous Gender C3 ConceptNet DialogRE + 10.18653/v1/2022.acl-long.598 Modeling Persuasive Discourse to Adaptively Support Students’ Argumentative Writing @@ -8665,6 +9263,7 @@ in the Case of Unambiguous Gender 2022.acl-long.599.software.zip wambsganss-niklaus-2022-modeling thiemowa/-argumentative_business_model_pitches + 10.18653/v1/2022.acl-long.599 Active Evaluation: Efficient <fixed-case>NLG</fixed-case> Evaluation with Few Pairwise Comparisons @@ -8680,6 +9279,7 @@ in the Case of Unambiguous Gender ParaBank WMT 2015 WMT 2016 + 10.18653/v1/2022.acl-long.600 The Moral Debater: A Study on the Computational Generation of Morally Framed Arguments @@ -8693,6 +9293,7 @@ in the Case of Unambiguous Gender 2022.acl-long.601.software.zip alshomary-etal-2022-moral webis-de/acl-22 + 10.18653/v1/2022.acl-long.601 Pyramid-<fixed-case>BERT</fixed-case>: Reducing Complexity via Successive Core-set based Token Selection @@ -8708,6 +9309,7 @@ in the Case of Unambiguous Gender GLUE LRA QNLI + 10.18653/v1/2022.acl-long.602 Probing for the Usage of Grammatical Number @@ -8720,6 +9322,7 @@ in the Case of Unambiguous Gender A central quest of probing is to uncover how pre-trained models encode a linguistic property within their representations. An encoding, however, might be spurious—i.e., the model might not rely on it when making predictions. In this paper, we try to find an encoding that the model actually uses, introducing a usage-based probing setup. We first choose a behavioral task which cannot be solved without using the linguistic property. Then, we attempt to remove the property by intervening on the model’s representations. We contend that, if an encoding is used by the model, its removal should harm the performance on the chosen behavioral task. As a case study, we focus on how BERT encodes grammatical number, and on how it uses this encoding to solve the number agreement task. Experimentally, we find that BERT relies on a linear encoding of grammatical number to produce the correct behavioral output. 
We also find that BERT uses a separate encoding of grammatical number for nouns and verbs. Finally, we identify in which layers information about grammatical number is transferred from a noun to its head verb. 2022.acl-long.603 lasri-etal-2022-probing + 10.18653/v1/2022.acl-long.603 @@ -8755,6 +9358,7 @@ in the Case of Unambiguous Gender QNLI SQuAD SST + 10.18653/v1/2022.acl-short.1 Are Shortest Rationales the Best Explanations for Human Understanding? @@ -8767,6 +9371,7 @@ in the Case of Unambiguous Gender 2022.acl-short.2 shen-etal-2022-shortest huashen218/limitedink + 10.18653/v1/2022.acl-short.2 Analyzing Wrap-Up Effects through an Information-Theoretic Lens @@ -8779,6 +9384,7 @@ in the Case of Unambiguous Gender Numerous analyses of reading time (RT) data have been undertaken in the effort to learn more about the internal processes that occur during reading comprehension. However, data measured on words at the end of a sentence–or even clause–is often omitted due to the confounding factors introduced by so-called “wrap-up effects,” which manifests as a skewed distribution of RTs for these words. Consequently, the understanding of the cognitive processes that might be involved in these effects is limited. In this work, we attempt to learn more about these processes by looking for the existence–or absence–of a link between wrap-up effects and information theoretic quantities, such as word and context information content. We find that the information distribution of prior context is often predictive of sentence- and clause-final RTs (while not of sentence-medial RTs), which lends support to several prior hypotheses about the processes involved in wrap-up effects. 2022.acl-short.3 meister-etal-2022-analyzing + 10.18653/v1/2022.acl-short.3 Have my arguments been replied to? Argument Pair Extraction as Machine Reading Comprehension @@ -8791,6 +9397,7 @@ in the Case of Unambiguous Gender 2022.acl-short.4 2022.acl-short.4.software.zip bao-etal-2022-arguments + 10.18653/v1/2022.acl-short.4 On the probability–quality paradox in language generation @@ -8802,6 +9409,7 @@ in the Case of Unambiguous Gender When generating natural language from neural probabilistic models, high probability does not always coincide with high quality: It has often been observed that mode-seeking decoding methods, i.e., those that produce high-probability text under the model, lead to unnatural language. On the other hand, the lower-probability text generated by stochastic methods is perceived as more human-like. In this note, we offer an explanation for this phenomenon by analyzing language generation through an information-theoretic lens. Specifically, we posit that human-like language should contain an amount of information (quantified as negative log-probability) that is close to the entropy of the distribution over natural strings. Further, we posit that language with substantially more (or less) information is undesirable. We provide preliminary empirical evidence in favor of this hypothesis; quality ratings of both human and machine-generated text—covering multiple tasks and common decoding strategies—suggest high-quality text has an information content significantly closer to the entropy than we would expect by chance. 
2022.acl-short.5 meister-etal-2022-high + 10.18653/v1/2022.acl-short.5 Disentangled Knowledge Transfer for <fixed-case>OOD</fixed-case> Intent Discovery with Unified Contrastive Learning @@ -8818,6 +9426,7 @@ in the Case of Unambiguous Gender 2022.acl-short.6 mou-etal-2022-disentangled myt517/dkt + 10.18653/v1/2022.acl-short.6 Voxel-informed Language Grounding @@ -8831,6 +9440,7 @@ in the Case of Unambiguous Gender corona-etal-2022-voxel rcorona/voxel_informed_language_grounding SNARE + 10.18653/v1/2022.acl-short.7 <fixed-case>P</fixed-case>-Tuning: Prompt Tuning Can Be Comparable to Fine-tuning Across Scales and Tasks @@ -8848,6 +9458,7 @@ in the Case of Unambiguous Gender GLUE SQuAD SuperGLUE + 10.18653/v1/2022.acl-short.8 On Efficiently Acquiring Annotations for Multilingual Models @@ -8858,6 +9469,7 @@ in the Case of Unambiguous Gender When tasked with supporting multiple languages for a given problem, two approaches have arisen: training a model for each language with the annotation budget divided equally among them, and training on a high-resource language followed by zero-shot transfer to the remaining languages. In this work, we show that the strategy of joint learning across multiple languages using a single model performs substantially better than the aforementioned alternatives. We also demonstrate that active learning provides additional, complementary benefits. We show that this simple approach enables the model to be data efficient by allowing it to arbitrate its annotation budget to query languages it is less certain on. We illustrate the effectiveness of our proposed method on a diverse set of tasks: a classification task with 4 languages, a sequence tagging task with 4 languages and a dependency parsing task with 5 languages. Our proposed method, whilst simple, substantially outperforms the other viable alternatives for building a model in a multilingual setting under constrained budgets. 2022.acl-short.9 moniz-etal-2022-efficiently + 10.18653/v1/2022.acl-short.9 Automatic Detection of Entity-Manipulated Text using Factual Knowledge @@ -8869,6 +9481,7 @@ in the Case of Unambiguous Gender 2022.acl-short.10 jawahar-etal-2022-automatic RealNews + 10.18653/v1/2022.acl-short.10 Does <fixed-case>BERT</fixed-case> Know that the <fixed-case>IS</fixed-case>-A Relation Is Transitive? @@ -8880,6 +9493,7 @@ in the Case of Unambiguous Gender 2022.acl-short.11.software.zip lin-ng-2022-bert nusnlp/probe-bert-transitivity + 10.18653/v1/2022.acl-short.11 Buy Tesla, Sell Ford: Assessing Implicit Stock Market Preference in Pre-trained Language Models @@ -8889,6 +9503,7 @@ in the Case of Unambiguous Gender Pretrained language models such as BERT have achieved remarkable success in several NLP tasks. With the wide adoption of BERT in real-world applications, researchers have begun to investigate the implicit biases encoded in BERT. In this paper, we assess the implicit stock market preferences in BERT and its finance domain-specific model FinBERT. We find some interesting patterns. For example, the language models are overall more positive towards the stock market, but there are significant differences in preferences between a pair of industry sectors, or even within a sector. Given the prevalence of NLP models in financial decision-making systems, this work raises the awareness of their potential implicit preferences in the stock markets. Awareness of such problems can help practitioners improve robustness and accountability of their financial NLP pipelines.
2022.acl-short.12 chuang-yang-2022-buy + 10.18653/v1/2022.acl-short.12 Pixie: Preference in Implicit and Explicit Comparisons @@ -8901,6 +9516,7 @@ in the Case of Unambiguous Gender 2022.acl-short.13 haque-etal-2022-pixie ahaque2/pixie + 10.18653/v1/2022.acl-short.13 Counterfactual Explanations for Natural Language Interfaces @@ -8914,6 +9530,7 @@ in the Case of Unambiguous Gender 2022.acl-short.14.software.zip tolkachev-etal-2022-counterfactual georgeto20/counterfactual_explanations + 10.18653/v1/2022.acl-short.14 Predicting Difficulty and Discrimination of Natural Language Questions @@ -8925,6 +9542,7 @@ in the Case of Unambiguous Gender 2022.acl-short.15.software.zip byrd-srivastava-2022-predicting HotpotQA + 10.18653/v1/2022.acl-short.15 How does the pre-training objective affect what large language models learn about linguistic properties? @@ -8935,6 +9553,7 @@ in the Case of Unambiguous Gender 2022.acl-short.16 alajrami-aletras-2022-pre GLUE + 10.18653/v1/2022.acl-short.16 The Power of Prompt Tuning for Low-Resource Semantic Parsing @@ -8945,6 +9564,7 @@ in the Case of Unambiguous Gender Prompt tuning has recently emerged as an effective method for adapting pre-trained language models to a number of language understanding and generation tasks. In this paper, we investigate prompt tuning for semantic parsing—the task of mapping natural language utterances onto formal meaning representations. On the low-resource splits of Overnight and TOPv2, we find that a prompt-tuned T5-xl significantly outperforms its fine-tuned counterpart, as well as strong GPT-3 and BART baselines. We also conduct ablation studies across different model scales and target representations, finding that, with increasing model scale, prompt-tuned T5 models improve at generating target representations that are far from the pre-training distribution. 2022.acl-short.17 schucher-etal-2022-power + 10.18653/v1/2022.acl-short.17 Data Contamination: From Memorization to Exploitation @@ -8956,6 +9576,7 @@ in the Case of Unambiguous Gender magar-schwartz-2022-data schwartz-lab-nlp/data_contamination SST + 10.18653/v1/2022.acl-short.18 Detecting Annotation Errors in Morphological Data with the Transformer @@ -8965,6 +9586,7 @@ in the Case of Unambiguous Gender Annotation errors that stem from various sources are usually unavoidable when performing large-scale annotation of linguistic data. In this paper, we evaluate the feasibility of using the Transformer model to detect various types of annotator errors in morphological data sets that contain inflected word forms. We evaluate our error detection model on four languages by introducing three different types of artificial errors in the data: (1) typographic errors, where single characters in the data are inserted, replaced, or deleted; (2) linguistic confusion errors where two inflected forms are systematically swapped; and (3) self-adversarial errors where the Transformer model itself is used to generate plausible-looking, but erroneous forms by retrieving high-scoring predictions from the search beam. Results show that the Transformer model can detect errors with perfect or near-perfect recall in all three scenarios, even when significant amounts of the annotated data (5%-30%) are corrupted, in all languages tested. Precision varies across the languages and types of errors, but is high enough that the model can be very effectively used to flag suspicious entries in large data sets for further scrutiny by human annotators.
2022.acl-short.19 liu-hulden-2022-detecting + 10.18653/v1/2022.acl-short.19 Estimating the Entropy of Linguistic Distributions @@ -8976,6 +9598,7 @@ in the Case of Unambiguous Gender 2022.acl-short.20 2022.acl-short.20.software.zip arora-etal-2022-estimating + 10.18653/v1/2022.acl-short.20 Morphological Reinflection with Multiple Arguments: An Extended Annotation schema and a <fixed-case>G</fixed-case>eorgian Case Study @@ -8986,6 +9609,7 @@ in the Case of Unambiguous Gender In recent years, a flurry of morphological datasets has emerged, most notably UniMorph, a multi-lingual repository of inflection tables. However, the flat structure of the current morphological annotation makes the treatment of some languages quirky, if not impossible, specifically in cases of polypersonal agreement. In this paper we propose a general solution for such cases and expand the UniMorph annotation schema to naturally address this phenomenon, in which verbs agree with multiple arguments using true affixes. We apply this extended schema to one such language, Georgian, and provide a human-verified, accurate and balanced morphological dataset for Georgian verbs. The dataset has 4 times more tables and 6 times more verb forms compared to the existing UniMorph dataset, covering all possible variants of argument marking, demonstrating the adequacy of our proposed scheme. Experiments on a reinflection task show that generalization is easy when the data is split at the form level, but extremely hard when splitting along lemma lines. Expanding the other languages in UniMorph according to this schema is expected to improve the coverage, consistency, and interpretability of this benchmark. 2022.acl-short.21 guriel-etal-2022-morphological + 10.18653/v1/2022.acl-short.21 <fixed-case>DQ</fixed-case>-<fixed-case>BART</fixed-case>: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization @@ -9004,6 +9628,7 @@ in the Case of Unambiguous Gender li-etal-2022-dq CNN/Daily Mail ELI5 + 10.18653/v1/2022.acl-short.22 Learning-by-Narrating: Narrative Pre-Training for Zero-Shot Dialogue Comprehension @@ -9021,6 +9646,7 @@ in the Case of Unambiguous Gender CRD3 DREAM MovieNet + 10.18653/v1/2022.acl-short.23 Kronecker Decomposition for <fixed-case>GPT</fixed-case> Compression @@ -9040,6 +9666,7 @@ in the Case of Unambiguous Gender WebText WikiText-103 WikiText-2 + 10.18653/v1/2022.acl-short.24 Simple and Effective Knowledge-Driven Query Expansion for <fixed-case>QA</fixed-case>-Based Product Attribute Extraction @@ -9051,6 +9678,7 @@ in the Case of Unambiguous Gender A key challenge in attribute value extraction (AVE) from e-commerce sites is how to handle a large number of attributes for diverse products. Although this challenge is partially addressed by a question answering (QA) approach which finds a value in product data for a given query (attribute), it does not work effectively for rare and ambiguous queries. We thus propose simple knowledge-driven query expansion based on possible answers (values) of a query (attribute) for QA-based AVE. We retrieve values of a query (attribute) from the training data to expand the query. We train a model with two tricks, knowledge dropout and knowledge token mixing, which mimic the imperfection of the value knowledge in testing. Experimental results on our cleaned version of the AliExpress dataset show that our method improves the performance of AVE (+6.08 macro F1), especially for rare and ambiguous attributes (+7.82 and +6.86 macro F1, respectively).
2022.acl-short.25 shinzato-etal-2022-simple + 10.18653/v1/2022.acl-short.25 Event-Event Relation Extraction using Probabilistic Box Embedding @@ -9064,6 +9692,7 @@ in the Case of Unambiguous Gender To understand a story with multiple events, it is important to capture the proper relations across these events. However, existing event relation extraction (ERE) frameworks regard it as a multi-class classification task and do not guarantee any coherence between different relation types, such as anti-symmetry. If a phone line “died” after a “storm”, then it is obvious that the “storm” happened before the “died”. Current frameworks for event relation extraction do not guarantee this coherence and thus enforce it via a constraint loss function (Wang et al., 2020). In this work, we propose to modify the underlying ERE model to guarantee coherence by representing each event as a box representation (BERE) without applying explicit constraints. From our experiments, BERE also shows stronger conjunctive constraint satisfaction while performing on par or better in F1 compared to previous models with constraint injection. 2022.acl-short.26 hwang-etal-2022-event + 10.18653/v1/2022.acl-short.26 Sample, Translate, Recombine: Leveraging Audio Alignments for Data Augmentation in End-to-end Speech Translation @@ -9076,6 +9705,7 @@ in the Case of Unambiguous Gender 2022.acl-short.27.software.tgz lam-etal-2022-sample Europarl-ST + 10.18653/v1/2022.acl-short.27 Predicting Sentence Deletions for Text Simplification Using a Functional Discourse Structure @@ -9087,6 +9717,7 @@ in the Case of Unambiguous Gender 2022.acl-short.28 zhang-etal-2022-predicting Newsela + 10.18653/v1/2022.acl-short.28 Multilingual Pre-training with Language and Task Adaptation for Multilingual Text Style Transfer @@ -9100,6 +9731,7 @@ in the Case of Unambiguous Gender laihuiyuan/multilingual-tst GYAFC XFORMAL + 10.18653/v1/2022.acl-short.29 When to Use Multi-Task Learning vs Intermediate Fine-Tuning for Pre-Trained Encoder Transfer Learning @@ -9110,6 +9742,7 @@ in the Case of Unambiguous Gender Transfer learning (TL) in natural language processing (NLP) has seen a surge of interest in recent years, as pre-trained models have shown an impressive ability to transfer to novel tasks. Three main strategies have emerged for making use of multiple supervised datasets during fine-tuning: training on an intermediate task before training on the target task (STILTs), using multi-task learning (MTL) to train jointly on a supplementary task and the target task (pairwise MTL), or simply using MTL to train jointly on all available datasets (MTL-ALL). In this work, we compare all three TL methods in a comprehensive analysis on the GLUE dataset suite. We find that there is a simple heuristic for when to use one of these techniques over the other: pairwise MTL is better than STILTs when the target task has fewer instances than the supporting task and vice versa. We show that this holds true in more than 92% of applicable cases on the GLUE dataset and validate this hypothesis with experiments varying dataset size. The simplicity and effectiveness of this heuristic are surprising and warrant additional exploration by the TL community. Furthermore, we find that MTL-ALL is worse than the pairwise methods in almost every case. We hope this study will aid others as they choose between TL methods for NLP tasks. 
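The selection heuristic in the abstract above is simple enough to state as code. A worked example, with dataset sizes chosen for illustration (GLUE's RTE has roughly 2.5k training instances, MNLI roughly 393k):

```python
def choose_transfer_method(n_target: int, n_supporting: int) -> str:
    """Decision rule from the abstract: pairwise MTL tends to win when the
    target task is smaller than the supporting task, STILTs otherwise."""
    return "pairwise MTL" if n_target < n_supporting else "STILTs"

# A small target task (RTE) supported by a large one (MNLI): pick pairwise MTL.
print(choose_transfer_method(n_target=2_500, n_supporting=393_000))
```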
2022.acl-short.30 weller-etal-2022-use + 10.18653/v1/2022.acl-short.30 Leveraging Explicit Lexico-logical Alignments in Text-to-<fixed-case>SQL</fixed-case> Parsing @@ -9125,6 +9758,7 @@ in the Case of Unambiguous Gender 2022.acl-short.31 2022.acl-short.31.software.zip sun-etal-2022-leveraging + 10.18653/v1/2022.acl-short.31 Complex Evolutional Pattern Learning for Temporal Knowledge Graph Reasoning @@ -9144,6 +9778,7 @@ in the Case of Unambiguous Gender li-etal-2022-complex lee-zix/cen ICEWS + 10.18653/v1/2022.acl-short.32 Mismatch between Multi-turn Dialogue and its Evaluation Metric in Dialogue State Tracking @@ -9157,6 +9792,7 @@ in the Case of Unambiguous Gender 2022.acl-short.33 kim-etal-2022-mismatch MultiWOZ + 10.18653/v1/2022.acl-short.33 <fixed-case>LM</fixed-case>-<fixed-case>BFF</fixed-case>-<fixed-case>MS</fixed-case>: Improving Few-Shot Fine-tuning of Language Models based on Multiple Soft Demonstration Memory @@ -9175,6 +9811,7 @@ in the Case of Unambiguous Gender MRPC SNLI SST + 10.18653/v1/2022.acl-short.34 Towards Fair Evaluation of Dialogue State Tracking by Flexible Incorporation of Turn-level Performances @@ -9188,6 +9825,7 @@ in the Case of Unambiguous Gender dey-etal-2022-towards suvodipdey/fga MultiWOZ + 10.18653/v1/2022.acl-short.35 Exploiting Language Model Prompts Using Similarity Measures: A Case Study on the Word-in-Context Task @@ -9202,6 +9840,7 @@ in the Case of Unambiguous Gender SST SuperGLUE WiC + 10.18653/v1/2022.acl-short.36 Hierarchical Curriculum Learning for <fixed-case>AMR</fixed-case> Parsing @@ -9219,6 +9858,7 @@ in the Case of Unambiguous Gender wang-etal-2022-hierarchical wangpeiyi9979/hcl-text2amr Bio + 10.18653/v1/2022.acl-short.37 <fixed-case>PARE</fixed-case>: A Simple and Strong Baseline for Monolingual and Multilingual Distantly Supervised Relation Extraction @@ -9233,6 +9873,7 @@ in the Case of Unambiguous Gender rathore-etal-2022-pare dair-iitd/dsre DiS-ReX + 10.18653/v1/2022.acl-short.38 To Find Waldo You Need Contextual Cues: Debiasing Who’s Waldo @@ -9249,6 +9890,7 @@ in the Case of Unambiguous Gender COCO Visual Genome Who’s Waldo + 10.18653/v1/2022.acl-short.39 Translate-Train Embracing Translationese Artifacts @@ -9261,6 +9903,7 @@ in the Case of Unambiguous Gender 2022.acl-short.40 yu-etal-2022-translate TyDi QA + 10.18653/v1/2022.acl-short.40 <fixed-case>C</fixed-case>-<fixed-case>MORE</fixed-case>: Pretraining to Answer Open-Domain Questions by Consulting Millions of References @@ -9277,6 +9920,7 @@ in the Case of Unambiguous Gender xiangyue9607/c-more Natural Questions TriviaQA + 10.18653/v1/2022.acl-short.41 k-<fixed-case>R</fixed-case>ater <fixed-case>R</fixed-case>eliability: <fixed-case>T</fixed-case>he Correct Unit of Reliability for Aggregated Human Annotations @@ -9286,6 +9930,7 @@ in the Case of Unambiguous Gender Since the inception of crowdsourcing, aggregation has been a common strategy for dealing with unreliable data. Aggregate ratings are more reliable than individual ones. However, many Natural Language Processing (NLP) applications that rely on aggregate ratings only report the reliability of individual ratings, which is the incorrect unit of analysis. In these instances, the data reliability is under-reported, and a proposed k-rater reliability (kRR) should be used as the correct data reliability for aggregated datasets. It is a multi-rater generalization of inter-rater reliability (IRR). 
We conducted two replications of the WordSim-353 benchmark, and present empirical, analytical, and bootstrap-based methods for computing kRR on WordSim-353. These methods produce very similar results. We hope this discussion will nudge researchers to report kRR in addition to IRR. 2022.acl-short.42 wong-paritosh-2022-k + 10.18653/v1/2022.acl-short.42 An Embarrassingly Simple Method to Mitigate Undesirable Properties of Pretrained Language Model Tokenizers @@ -9298,6 +9943,7 @@ in the Case of Unambiguous Gender 2022.acl-short.43.software.zip hofmann-etal-2022-embarrassingly valentinhofmann/flota + 10.18653/v1/2022.acl-short.43 <fixed-case>SCD</fixed-case>: Self-Contrastive Decorrelation of Sentence Embeddings @@ -9312,6 +9958,7 @@ in the Case of Unambiguous Gender MRPC SST SentEval + 10.18653/v1/2022.acl-short.44 Problems with Cosine as a Measure of Embedding Similarity for High Frequency Words @@ -9325,6 +9972,7 @@ in the Case of Unambiguous Gender zhou-etal-2022-problems katezhou/cosine_and_frequency WiC + 10.18653/v1/2022.acl-short.45 Revisiting the Compositional Generalization Abilities of Neural Sequence Models @@ -9339,6 +9987,7 @@ in the Case of Unambiguous Gender patel-etal-2022-revisiting arkilpatel/compositional-generalization-seq2seq SCAN + 10.18653/v1/2022.acl-short.46 A Copy-Augmented Generative Model for Open-Domain Question Answering @@ -9353,6 +10002,7 @@ in the Case of Unambiguous Gender liu-etal-2022-copy Natural Questions TriviaQA + 10.18653/v1/2022.acl-short.47 Augmenting Document Representations for Dense Retrieval with Interpolation and Perturbation @@ -9368,6 +10018,7 @@ in the Case of Unambiguous Gender starsuzi/dar Natural Questions TriviaQA + 10.18653/v1/2022.acl-short.48 <fixed-case>WLASL</fixed-case>-<fixed-case>LEX</fixed-case>: a Dataset for Recognising Phonological Properties in <fixed-case>A</fixed-case>merican <fixed-case>S</fixed-case>ign <fixed-case>L</fixed-case>anguage @@ -9382,6 +10033,7 @@ in the Case of Unambiguous Gender 2022.acl-short.49.software.zip tavella-etal-2022-wlasl WLASL + 10.18653/v1/2022.acl-short.49 Investigating person-specific errors in chat-oriented dialogue systems @@ -9393,6 +10045,7 @@ in the Case of Unambiguous Gender Creating chatbots to behave like real people is important in terms of believability. Errors in general chatbots and chatbots that follow a rough persona have been studied, but those in chatbots that behave like real people have not been thoroughly investigated. We collected a large number of user interactions with a generation-based chatbot trained from large-scale dialogue data of a specific character, i.e., the target person, and analyzed errors related to that person. We found that person-specific errors can be divided into two types: errors in attributes and those in relations, each of which can be divided into two levels: self and other. The correspondence with an existing taxonomy of errors was also investigated, and person-specific errors that should be addressed in the future were clarified. 
2022.acl-short.50 mitsuda-etal-2022-investigating + 10.18653/v1/2022.acl-short.50 Direct parsing to sentiment graphs @@ -9409,6 +10062,7 @@ in the Case of Unambiguous Gender samuel-etal-2022-direct jerbarnes/direct_parsing_to_sent_graph MPQA Opinion Corpus + 10.18653/v1/2022.acl-short.51 <fixed-case>XDBERT</fixed-case>: <fixed-case>D</fixed-case>istilling Visual Information to <fixed-case>BERT</fixed-case> from Cross-Modal Systems to Improve Language Understanding @@ -9422,6 +10076,7 @@ in the Case of Unambiguous Gender hsu-etal-2022-xdbert GLUE SWAG + 10.18653/v1/2022.acl-short.52 As Little as Possible, as Much as Necessary: Detecting Over- and Undertranslations with Contrastive Conditioning @@ -9433,6 +10088,7 @@ in the Case of Unambiguous Gender 2022.acl-short.53.software.zip vamvas-sennrich-2022-little zurichnlp/coverage-contrastive-conditioning + 10.18653/v1/2022.acl-short.53 How Distributed are Distributed Representations? An Observation on the Locality of Syntactic Information in Verb Agreement Tasks @@ -9443,6 +10099,7 @@ in the Case of Unambiguous Gender This work addresses the question of the localization of syntactic information encoded in transformer representations. We tackle this question from two perspectives, considering the object-past participle agreement in French, by identifying, first, in which part of the sentence and, second, in which part of the representation the syntactic information is encoded. The results of our experiments, using probing, causal analysis, and feature selection methods, show that syntactic information is encoded locally in a way consistent with French grammar. 2022.acl-short.54 li-etal-2022-distributed + 10.18653/v1/2022.acl-short.54 Machine Translation for <fixed-case>L</fixed-case>ivonian: Catering to 20 Speakers @@ -9455,6 +10112,7 @@ in the Case of Unambiguous Gender Livonian is one of the most endangered languages in Europe with just a tiny handful of speakers and virtually no publicly available corpora. In this paper we tackle the task of developing neural machine translation (NMT) between Livonian and English, with a two-fold aim: on one hand, preserving the language and on the other – enabling access to Livonian folklore, life stories and other textual intangible heritage as well as making it easier to create further parallel corpora. We rely on Livonian’s linguistic similarity to Estonian and Latvian and collect parallel and monolingual data for the four languages for translation experiments. We combine different low-resource NMT techniques like zero-shot translation, cross-lingual transfer and synthetic data creation to reach the highest possible translation quality as well as to find which base languages are empirically more helpful for transfer to Livonian. The resulting NMT systems and the collected monolingual and parallel data, including a manually translated and verified translation benchmark, are publicly released via OPUS and Huggingface repositories. 
2022.acl-short.55 rikters-etal-2022-machine + 10.18653/v1/2022.acl-short.55 Fire Burns, Sword Cuts: Commonsense Inductive Bias for Exploration in Text-based Games @@ -9470,6 +10128,7 @@ in the Case of Unambiguous Gender ryu-etal-2022-fire ktr0921/comm-expl-kg-a2c Jericho + 10.18653/v1/2022.acl-short.56 A Simple but Effective Pluggable Entity Lookup Table for Pre-trained Language Models @@ -9489,6 +10148,7 @@ in the Case of Unambiguous Gender LAMA S2ORC T-REx + 10.18653/v1/2022.acl-short.57 S<tex-math>^4</tex-math>-Tuning: A Simple Cross-lingual Sub-network Tuning Method @@ -9503,6 +10163,7 @@ in the Case of Unambiguous Gender xu-etal-2022-s4 PAWS-X XNLI + 10.18653/v1/2022.acl-short.58 Region-dependent temperature scaling for certainty calibration and application to class-imbalanced token classification @@ -9513,6 +10174,7 @@ in the Case of Unambiguous Gender 2022.acl-short.59 dawkins-nejadgholi-2022-region Few-NERD + 10.18653/v1/2022.acl-short.59 Developmental Negation Processing in Transformer Language Models @@ -9524,6 +10186,7 @@ in the Case of Unambiguous Gender 2022.acl-short.60.software.zip laverghetta-jr-licato-2022-developmental advancing-machine-human-reasoning-lab/negation-processing-acl-2022 + 10.18653/v1/2022.acl-short.60 Canary Extraction in Natural Language Understanding Models @@ -9535,6 +10198,7 @@ in the Case of Unambiguous Gender 2022.acl-short.61 parikh-etal-2022-canary SNIPS + 10.18653/v1/2022.acl-short.61 On the Intrinsic and Extrinsic Fairness Evaluation Metrics for Contextualized Language Representations @@ -9550,6 +10214,7 @@ in the Case of Unambiguous Gender 2022.acl-short.62 cao-etal-2022-intrinsic StereoSet + 10.18653/v1/2022.acl-short.62 Sequence-to-sequence <fixed-case>AMR</fixed-case> Parsing with Ancestor Information @@ -9559,6 +10224,7 @@ in the Case of Unambiguous Gender AMR parsing is the task that maps a sentence to an AMR semantic graph automatically. The difficulty comes from generating the complex graph structure. The previous state-of-the-art method translates the AMR graph into a sequence, then directly fine-tunes a pretrained sequence-to-sequence Transformer model (BART). However, purely treating the graph as a sequence does not take advantage of structural information about the graph. In this paper, we design several strategies to add the important ancestor information into the Transformer Decoder. Our experiments show that we can improve the performance for both the AMR 2.0 and AMR 3.0 datasets and achieve new state-of-the-art results. 2022.acl-short.63 yu-gildea-2022-sequence + 10.18653/v1/2022.acl-short.63 Zero-Shot Dependency Parsing with Worst-Case Aware Automated Curriculum Learning @@ -9570,6 +10236,7 @@ in the Case of Unambiguous Gender 2022.acl-short.64 de-lhoneux-etal-2022-zero mdelhoneux/machamp-worst_case_acl + 10.18653/v1/2022.acl-short.64 <fixed-case>P</fixed-case>ri<fixed-case>M</fixed-case>ock57: A Dataset Of Primary Care Mock Consultations @@ -9581,6 +10248,7 @@ in the Case of Unambiguous Gender Recent advances in Automatic Speech Recognition (ASR) have made it possible to reliably produce automatic transcripts of clinician-patient conversations. However, access to clinical datasets is heavily restricted due to patient privacy, thus slowing down normal research practices. We detail the development of a public-access, high-quality dataset comprising 57 mocked primary care consultations, including audio recordings, their manual utterance-level transcriptions, and the associated consultation notes. 
Our work illustrates how the dataset can be used as a benchmark for conversational medical ASR as well as consultation note generation from transcripts. 2022.acl-short.65 papadopoulos-korfiatis-etal-2022-primock57 + 10.18653/v1/2022.acl-short.65 <fixed-case>U</fixed-case>ni<fixed-case>GDD</fixed-case>: <fixed-case>A</fixed-case> Unified Generative Framework for Goal-Oriented Document-Grounded Dialogue @@ -9593,6 +10261,7 @@ in the Case of Unambiguous Gender gao-etal-2022-unigdd gao-xiao-bai/UniGDD Doc2Dial + 10.18653/v1/2022.acl-short.66 <fixed-case>DM</fixed-case>ix: Adaptive Distance-aware Interpolative Mixup @@ -9611,6 +10280,7 @@ in the Case of Unambiguous Gender CoLA GLUE SST + 10.18653/v1/2022.acl-short.67 Sub-Word Alignment is Still Useful: A Vest-Pocket Method for Enhancing Low-Resource Machine Translation @@ -9622,6 +10292,7 @@ in the Case of Unambiguous Gender 2022.acl-short.68.software.zip xu-hong-2022-sub Cosmos-Break/transfer-mt-submit + 10.18653/v1/2022.acl-short.68 <fixed-case>HYPHEN</fixed-case>: Hyperbolic <fixed-case>H</fixed-case>awkes Attention For Text Streams @@ -9636,6 +10307,7 @@ in the Case of Unambiguous Gender 2022.acl-short.69.software.zip agarwal-etal-2022-hyphen gtfintechlab/hyphen-acl + 10.18653/v1/2022.acl-short.69 A Risk-Averse Mechanism for Suicidality Assessment on Social Media @@ -9646,6 +10318,7 @@ in the Case of Unambiguous Gender Recent studies have shown that social media has increasingly become a platform for users to express suicidal thoughts outside traditional clinical settings. With advances in Natural Language Processing strategies, it is now possible to design automated systems to assess suicide risk. However, such systems may generate uncertain predictions, leading to severe consequences. We hence reformulate suicide risk assessment as a selective prioritized prediction problem over the Columbia Suicide Severity Risk Scale (C-SSRS). We propose SASI, a risk-averse and self-aware transformer-based hierarchical attention classifier, augmented to refrain from making uncertain predictions. We show that SASI is able to refrain from 83% of incorrect predictions on real-world Reddit data. Furthermore, we discuss the qualitative, practical, and ethical aspects of SASI for suicide risk assessment as a human-in-the-loop framework. 2022.acl-short.70 sawhney-etal-2022-risk + 10.18653/v1/2022.acl-short.70 When classifying grammatical role, <fixed-case>BERT</fixed-case> doesn’t care about word order... except when it matters @@ -9657,6 +10330,7 @@ in the Case of Unambiguous Gender 2022.acl-short.71 2022.acl-short.71.software.tgz papadimitriou-etal-2022-classifying-grammatical + 10.18653/v1/2022.acl-short.71 Triangular Transfer: Freezing the Pivot for Triangular Machine Translation @@ -9667,6 +10341,7 @@ in the Case of Unambiguous Gender Triangular machine translation is a special case of low-resource machine translation where the language pair of interest has limited parallel data, but both languages have abundant parallel data with a pivot language. Naturally, the key to triangular machine translation is the successful exploitation of such auxiliary data. In this work, we propose a transfer-learning-based approach that utilizes all types of auxiliary data. 
As we train auxiliary source-pivot and pivot-target translation models, we initialize some parameters of the pivot side with a pre-trained language model and freeze them to encourage both translation models to work in the same pivot language space, so that they can be smoothly transferred to the source-target translation model. Experiments show that our approach can outperform previous ones. 2022.acl-short.72 zhang-etal-2022-triangular + 10.18653/v1/2022.acl-short.72 Can Visual Dialogue Models Do Scorekeeping? Exploring How Dialogue Representations Incrementally Encode Shared Knowledge @@ -9678,6 +10353,7 @@ in the Case of Unambiguous Gender madureira-schlangen-2022-visual COCO VisDial + 10.18653/v1/2022.acl-short.73 Focus on the Target’s Vocabulary: Masked Label Smoothing for Machine Translation @@ -9690,6 +10366,7 @@ in the Case of Unambiguous Gender 2022.acl-short.74.software.zip chen-etal-2022-focus chenllliang/MLS + 10.18653/v1/2022.acl-short.74 Contrastive Learning-Enhanced Nearest Neighbor Mechanism for Multi-Label Text Classification @@ -9701,6 +10378,7 @@ in the Case of Unambiguous Gender 2022.acl-short.75 su-etal-2022-contrastive RCV1 + 10.18653/v1/2022.acl-short.75 <fixed-case>N</fixed-case>oisy<fixed-case>T</fixed-case>une: A Little Noise Can Help You Finetune Pretrained Language Models Better @@ -9714,6 +10392,7 @@ in the Case of Unambiguous Gender wu-etal-2022-noisytune GLUE XTREME + 10.18653/v1/2022.acl-short.76 Adjusting the Precision-Recall Trade-Off with Align-and-Predict Decoding for Grammatical Error Correction @@ -9724,6 +10403,7 @@ in the Case of Unambiguous Gender 2022.acl-short.77 sun-wang-2022-adjusting autotemp/align-and-predict + 10.18653/v1/2022.acl-short.77 On the Effect of Isotropy on <fixed-case>VAE</fixed-case> Representations of Text @@ -9736,6 +10416,7 @@ in the Case of Unambiguous Gender 2022.acl-short.78.software.zip zhang-etal-2022-effect lanzhang128/IGPVAE + 10.18653/v1/2022.acl-short.78 Efficient Classification of Long Documents Using Transformers @@ -9747,6 +10428,7 @@ in the Case of Unambiguous Gender 2022.acl-short.79 park-etal-2022-efficient EURLEX57K + 10.18653/v1/2022.acl-short.79 Rewarding Semantic Similarity under Optimized Alignments for <fixed-case>AMR</fixed-case>-to-Text Generation @@ -9756,6 +10438,7 @@ in the Case of Unambiguous Gender A common way to combat exposure bias is by applying scores from evaluation metrics as rewards in reinforcement learning (RL). Metrics leveraging contextualized embeddings appear more flexible than their n-gram matching counterparts and thus ideal as training rewards. However, metrics such as BERTScore greedily align candidate and reference tokens, which can allow system outputs to receive excess credit relative to a reference. Furthermore, past approaches featuring semantic similarity rewards suffer from repetitive outputs and overfitting. We address these issues by proposing metrics that replace the greedy alignments in BERTScore with optimized ones. We compute them on a model’s trained token embeddings to prevent domain mismatch. Our model optimizing discrete alignment metrics consistently outperforms cross-entropy and BLEU reward baselines on AMR-to-text generation. In addition, we find that this approach enjoys stable training compared to a non-RL setting. 
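The idea in the Rewarding Semantic Similarity abstract above, replacing BERTScore's greedy token matching with an optimized alignment, can be approximated with an optimal one-to-one assignment over a token similarity matrix. The sketch below is one plausible instantiation, not the authors' exact metric; the names and the use of scipy's assignment solver are our assumptions.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def aligned_similarity(cand: np.ndarray, ref: np.ndarray) -> float:
    """Score a candidate against a reference under an optimal one-to-one
    token alignment, instead of the greedy argmax matching in BERTScore.
    `cand` and `ref` are (num_tokens, dim) embedding matrices."""
    cand = cand / np.linalg.norm(cand, axis=1, keepdims=True)
    ref = ref / np.linalg.norm(ref, axis=1, keepdims=True)
    sim = cand @ ref.T                        # pairwise cosine similarities
    rows, cols = linear_sum_assignment(-sim)  # maximize total similarity
    return float(sim[rows, cols].mean())      # usable as an RL reward

rng = np.random.default_rng(0)
print(aligned_similarity(rng.normal(size=(5, 8)), rng.normal(size=(6, 8))))
```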
2022.acl-short.80 jin-gildea-2022-rewarding + 10.18653/v1/2022.acl-short.80 An Analysis of Negation in Natural Language Understanding Corpora @@ -9777,6 +10460,7 @@ in the Case of Unambiguous Gender SuperGLUE WSC WiC + 10.18653/v1/2022.acl-short.81 <fixed-case>P</fixed-case>rimum <fixed-case>N</fixed-case>on <fixed-case>N</fixed-case>ocere: <fixed-case>B</fixed-case>efore working with <fixed-case>I</fixed-case>ndigenous data, the <fixed-case>ACL</fixed-case> must confront ongoing colonialism @@ -9785,6 +10469,7 @@ in the Case of Unambiguous Gender In this paper, we challenge the ACL community to reckon with historical and ongoing colonialism by adopting a set of ethical obligations and best practices drawn from the Indigenous studies literature. While the vast majority of NLP research focuses on a very small number of very high resource languages (English, Chinese, etc), some work has begun to engage with Indigenous languages. No research involving Indigenous language data can be considered ethical without first acknowledging that Indigenous languages are not merely very low resource languages. The toxic legacy of colonialism permeates every aspect of interaction between Indigenous communities and outside researchers. To this end, we propose that the ACL draft and adopt an ethical framework for NLP researchers and computational linguists wishing to engage in research involving Indigenous languages. 2022.acl-short.82 schwartz-2022-primum + 10.18653/v1/2022.acl-short.82 Unsupervised multiple-choice question generation for out-of-domain <fixed-case>Q</fixed-case>&<fixed-case>A</fixed-case> fine-tuning @@ -9801,6 +10486,7 @@ in the Case of Unambiguous Gender QASC SQuAD SciQ + 10.18653/v1/2022.acl-short.83 Can a Transformer Pass the Wug Test? Tuning Copying Bias in Neural Morphological Inflection Models @@ -9811,6 +10497,7 @@ in the Case of Unambiguous Gender 2022.acl-short.84 2022.acl-short.84.software.zip liu-hulden-2022-transformer + 10.18653/v1/2022.acl-short.84 Probing the Robustness of Trained Metrics for Conversational Dialogue Systems @@ -9826,6 +10513,7 @@ in the Case of Unambiguous Gender jderiu/metric-robustness DailyDialog PERSONA-CHAT + 10.18653/v1/2022.acl-short.85 Rethinking and Refining the Distinct Metric @@ -9840,6 +10528,7 @@ in the Case of Unambiguous Gender 2022.acl-short.86 liu-etal-2022-rethinking DailyDialog + 10.18653/v1/2022.acl-short.86 How reparametrization trick broke differentially-private text representation learning @@ -9850,6 +10539,7 @@ in the Case of Unambiguous Gender 2022.acl-short.87.software.zip habernal-2022-reparametrization trusthlt/acl2022-reparametrization-trick-broke-differential-privacy + 10.18653/v1/2022.acl-short.87 Towards Consistent Document-level Entity Linking: Joint Models for Entity Linking and Coreference Resolution @@ -9864,6 +10554,7 @@ in the Case of Unambiguous Gender zaporojets-etal-2022-towards klimzaporojets/consistent-el DWIE + 10.18653/v1/2022.acl-short.88 A Flexible Multi-Task Model for <fixed-case>BERT</fixed-case> Serving @@ -9879,6 +10570,7 @@ in the Case of Unambiguous Gender MRPC QNLI SST + 10.18653/v1/2022.acl-short.89 Understanding Game-Playing Agents with Natural Language Annotations @@ -9891,6 +10583,7 @@ in the Case of Unambiguous Gender 2022.acl-short.90.software.zip tomlin-etal-2022-understanding andrehe02/go-probe + 10.18653/v1/2022.acl-short.90 Code Synonyms Do Matter: Multiple Synonyms Matching Network for Automatic <fixed-case>ICD</fixed-case> Coding @@ -9904,6 +10597,7 @@ in the Case of Unambiguous Gender 
yuan-etal-2022-code ganjinzero/icd-msmn MIMIC-III + 10.18653/v1/2022.acl-short.91 <fixed-case>C</fixed-case>o<fixed-case>DA</fixed-case>21: Evaluating Language Understanding Capabilities of <fixed-case>NLP</fixed-case> Models With Context-Definition Alignment @@ -9915,6 +10609,7 @@ in the Case of Unambiguous Gender 2022.acl-short.92 senel-etal-2022-coda21 lksenel/coda21 + 10.18653/v1/2022.acl-short.92 On the Importance of Effectively Adapting Pretrained Language Models for Active Learning @@ -9929,6 +10624,7 @@ in the Case of Unambiguous Gender AG News IMDb Movie Reviews SST + 10.18653/v1/2022.acl-short.93 A Recipe for Arbitrary Text Style Transfer with Large Language Models @@ -9943,6 +10639,7 @@ in the Case of Unambiguous Gender 2022.acl-short.94 2022.acl-short.94.software.zip reif-etal-2022-recipe + 10.18653/v1/2022.acl-short.94 <fixed-case>D</fixed-case>i<fixed-case>S</fixed-case>-<fixed-case>R</fixed-case>e<fixed-case>X</fixed-case>: A Multilingual Dataset for Distantly Supervised Relation Extraction @@ -9957,6 +10654,7 @@ in the Case of Unambiguous Gender dair-iitd/DiS-ReX DiS-ReX RELX + 10.18653/v1/2022.acl-short.95 (Un)solving Morphological Inflection: Lemma Overlap Artificially Inflates Models’ Performance @@ -9967,6 +10665,7 @@ in the Case of Unambiguous Gender In the domain of Morphology, Inflection is a fundamental and important task that gained a lot of traction in recent years, mostly via SIGMORPHON’s shared-tasks. With average accuracy above 0.9 over the scores of all languages, the task is considered mostly solved using relatively generic neural seq2seq models, even with little data provided. In this work, we propose to re-evaluate morphological inflection models by employing harder train-test splits that will challenge the generalization capacity of the models. In particular, as opposed to the naïve split-by-form, we propose a split-by-lemma method to challenge the performance on existing benchmarks. Our experiments with the three top-ranked systems on SIGMORPHON’s 2020 shared-task show that the lemma-split presents an average drop of 30 percentage points in macro-average for the 90 languages included. The effect is most significant for low-resourced languages with a drop as high as 95 points, but even high-resourced languages lose about 10 points on average. Our results clearly show that generalizing inflection to unseen lemmas is far from being solved, presenting a simple yet effective means to promote more sophisticated models. 2022.acl-short.96 goldman-etal-2022-un + 10.18653/v1/2022.acl-short.96 Text Smoothing: Enhance Various Data Augmentation Methods on Text Classification Tasks @@ -9981,6 +10680,7 @@ in the Case of Unambiguous Gender wu-etal-2022-text SNIPS SST + 10.18653/v1/2022.acl-short.97 @@ -10006,6 +10706,7 @@ in the Case of Unambiguous Gender This work presents two experiments with the goal of replicating the transferability of dependency parsers and POS taggers trained on closely related languages within the low-resource language family Tupían. The experiments include both zero-shot settings and multilingual models. Previous studies have found that even a comparably small treebank from a closely related language will improve sequence labelling considerably in such cases. Results from both POS tagging and dependency parsing confirm previous evidence that the closer the phylogenetic relation between two languages, the better the predictions for sequence labelling tasks get. 
In many cases, the results are improved if multiple languages from the same family are combined. This suggests that in addition to leveraging similarity between two related languages, the incorporation of multiple languages of the same family might lead to better results in transfer learning for NLP applications. 2022.acl-srw.1 blum-2022-evaluating + 10.18653/v1/2022.acl-srw.1 <fixed-case>RFBFN</fixed-case>: A Relation-First Blank Filling Network for Joint Relational Triple Extraction @@ -10019,6 +10720,7 @@ in the Case of Unambiguous Gender 2022.acl-srw.2 li-etal-2022-rfbfn lizhe2016/rfbfn + 10.18653/v1/2022.acl-srw.2 Building a Dialogue Corpus Annotated with Expressed and Experienced Emotions @@ -10033,6 +10735,7 @@ in the Case of Unambiguous Gender EmoBank EmotionLines Story Commonsense + 10.18653/v1/2022.acl-srw.3 Darkness can not drive out darkness: Investigating Bias in Hate <fixed-case>S</fixed-case>peech<fixed-case>D</fixed-case>etection Models @@ -10041,6 +10744,7 @@ in the Case of Unambiguous Gender It has become crucial to develop tools for automated hate speech and abuse detection. These tools would help to stop the bullies and the haters and provide a safer environment for individuals, especially those from marginalized groups, to freely express themselves. However, recent research shows that machine learning models are biased and they might make the right decisions for the wrong reasons. In this thesis, I set out to understand the performance of hate speech and abuse detection models and the different biases that could influence them. I show that hate speech and abuse detection models are not only subject to social bias but also to other types of bias that have not been explored before. Finally, I investigate the causal effect of the social and intersectional bias on the performance and unfairness of hate speech detection models. 2022.acl-srw.4 elsafoury-2022-darkness + 10.18653/v1/2022.acl-srw.4 Ethical Considerations for Low-resourced Machine Translation @@ -10049,6 +10753,7 @@ in the Case of Unambiguous Gender This paper considers some ethical implications of machine translation for low-resourced languages. I use Armenian as a case study and investigate specific needs for and concerns arising from the creation and deployment of improved machine translation between English and Armenian. To do this, I conduct stakeholder interviews and construct Value Scenarios (Nathan et al., 2007) from the themes that emerge. These scenarios illustrate some of the potential harms that low-resourced language communities may face due to the deployment of improved machine translation systems. Based on these scenarios, I recommend 1) collaborating with stakeholders in order to create more useful and reliable machine translation tools, and 2) determining which other forms of language technology should be developed alongside efforts to improve machine translation in order to mitigate harms rendered to vulnerable language communities. Both of these goals require treating low-resourced machine translation as a language-specific, rather than language-agnostic, task. 2022.acl-srw.5 haroutunian-2022-ethical + 10.18653/v1/2022.acl-srw.5 Integrating Question Rewrites in Conversational Question Answering: A Reinforcement Learning Approach @@ -10065,6 +10770,7 @@ in the Case of Unambiguous Gender CoQA QReCC QuAC + 10.18653/v1/2022.acl-srw.6 What Do You Mean by Relation Extraction? 
A Survey on Datasets and Study on Scientific Relation Classification @@ -10079,6 +10785,7 @@ in the Case of Unambiguous Gender DocRED FewRel FewRel 2.0 + 10.18653/v1/2022.acl-srw.7 Logical Inference for Counting on Semi-structured Tables @@ -10090,6 +10797,7 @@ in the Case of Unambiguous Gender kurosawa-yanaka-2022-logical ynklab/sst_count InfoTabS + 10.18653/v1/2022.acl-srw.8 <fixed-case>GNN</fixed-case>er: Reducing Overlapping in Span-based <fixed-case>NER</fixed-case> Using Graph Neural Networks @@ -10104,6 +10812,7 @@ in the Case of Unambiguous Gender urchade/gnner CoNLL-2003 SciERC + 10.18653/v1/2022.acl-srw.9 Compositional Semantics and Inference System for Temporal Order based on <fixed-case>J</fixed-case>apanese <fixed-case>CCG</fixed-case> @@ -10114,6 +10823,7 @@ in the Case of Unambiguous Gender 2022.acl-srw.10 sugimoto-yanaka-2022-compositional ynklab/ccgtemp + 10.18653/v1/2022.acl-srw.10 Combine to Describe: Evaluating Compositional Generalization in Image Captioning @@ -10125,6 +10835,7 @@ in the Case of Unambiguous Gender 2022.acl-srw.11 pantazopoulos-etal-2022-combine COCO + 10.18653/v1/2022.acl-srw.11 Towards Unification of Discourse Annotation Frameworks @@ -10133,6 +10844,7 @@ in the Case of Unambiguous Gender Discourse information is difficult to represent and annotate. Among the major frameworks for annotating discourse information, RST, PDTB and SDRT are widely discussed and used, each having its own theoretical foundation and focus. Corpora annotated under different frameworks vary considerably. To make better use of the existing discourse corpora and achieve the possible synergy of different frameworks, it is worthwhile to investigate the systematic relations between different frameworks and devise methods of unifying the frameworks. Although the issue of framework unification has been a topic of discussion for a long time, there is currently no comprehensive approach which considers unifying both discourse structure and discourse relations and evaluates the unified framework intrinsically and extrinsically. We plan to use automatic means for the unification task and evaluate the result with structural complexity and downstream tasks. We will also explore the application of the unified framework in multi-task learning and graphical models. 2022.acl-srw.12 fu-2022-towards + 10.18653/v1/2022.acl-srw.12 <fixed-case>AMR</fixed-case> Alignment for Morphologically-rich and Pro-drop Languages @@ -10142,6 +10854,7 @@ in the Case of Unambiguous Gender Alignment between concepts in an abstract meaning representation (AMR) graph and the words within a sentence is one of the important stages of AMR parsing. Although there exist high-performing AMR aligners for English, unfortunately, these are not well suited for many languages where many concepts appear from morpho-semantic elements. For the first time in the literature, this paper presents an AMR aligner tailored for morphologically-rich and pro-drop languages by experimenting on the Turkish language being a prominent example of this language group. Our aligner focuses on the meaning considering the rich Turkish morphology and aligns AMR concepts that emerge from morphemes using a tree traversal approach without additional resources or rules. We evaluate our aligner over a manually annotated gold data set in terms of precision, recall and F1 score. 
Our aligner outperforms the Turkish adaptations of the previously proposed aligners for English and Portuguese with an F1 score of 0.87 and provides a relative error reduction of up to 76%. 2022.acl-srw.13 oral-eryigit-2022-amr + 10.18653/v1/2022.acl-srw.13 Sketching a Linguistically-Driven Reasoning Dialog Model for Social Talk @@ -10150,6 +10863,7 @@ in the Case of Unambiguous Gender The capability of holding social talk (or casual conversation) and making sense of conversational content requires context-sensitive natural language understanding and reasoning, which cannot be handled efficiently by the current popular open-domain dialog systems and chatbots. Heavily relying on corpus-based machine learning techniques to encode and decode context-sensitive meanings, these systems focus on fitting a particular training dataset, but not tracking what is actually happening in a conversation, and therefore easily derail in a new context. This work sketches out a more linguistically-informed architecture to handle social talk in English, in which corpus-based methods form the backbone of the relatively context-insensitive components (e.g. part-of-speech tagging, approximation of lexical meaning and constituent chunking), while symbolic modeling is used for reasoning out the context-sensitive components, which do not have any consistent mapping to linguistic forms. All components are fitted into a Bayesian game-theoretic model to address the interactive and rational aspects of conversation. 2022.acl-srw.14 luu-2022-sketching + 10.18653/v1/2022.acl-srw.14 Scoping natural language processing in <fixed-case>I</fixed-case>ndonesian and <fixed-case>M</fixed-case>alay for education applications @@ -10161,6 +10875,7 @@ in the Case of Unambiguous Gender 2022.acl-srw.15 maxwelll-smith-etal-2022-scoping IndoNLU Benchmark + 10.18653/v1/2022.acl-srw.15 <fixed-case>E</fixed-case>nglish-<fixed-case>M</fixed-case>alay Cross-Lingual Embedding Alignment using Bilingual Lexicon Augmentation @@ -10170,6 +10885,7 @@ in the Case of Unambiguous Gender As high-quality Malay language resources are still scarce, cross-lingual word embeddings make it possible for richer English resources to be leveraged for downstream Malay text classification tasks. This paper focuses on creating English-Malay cross-lingual word embeddings using embedding alignment by exploiting existing language resources. We augmented the training bilingual lexicons using machine translation with the goal of improving the alignment precision of our cross-lingual word embeddings. We investigated the quality of the current state-of-the-art English-Malay bilingual lexicon and worked on improving its quality using Google Translate. We also examined the effect of Malay word coverage on the quality of cross-lingual word embeddings. Experimental results with a precision up to 28.17% show that the alignment precision of the cross-lingual word embeddings would inevitably degrade after 1-NN but a better seed lexicon and cleaner nearest neighbours can reduce the number of word pairs required to achieve satisfactory performance. As the English and Malay monolingual embeddings are pre-trained on informal language corpora, our proposed English-Malay embeddings alignment approach is also able to map non-standard Malay translations in the English nearest neighbours. 
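Embedding alignment with a bilingual seed lexicon, as described in the English-Malay abstract above, is commonly solved with the orthogonal Procrustes method. A self-contained sketch on synthetic data (the English and Malay matrices here are random stand-ins, not real embeddings):

```python
import numpy as np

def procrustes_align(src: np.ndarray, tgt: np.ndarray) -> np.ndarray:
    """Orthogonal map W minimizing ||src @ W.T - tgt|| for embedding rows
    paired by a bilingual seed lexicon (the classic Procrustes solution)."""
    u, _, vt = np.linalg.svd(tgt.T @ src)
    return u @ vt

rng = np.random.default_rng(0)
en = rng.normal(size=(1000, 50))             # stand-in English vectors
true_w, _ = np.linalg.qr(rng.normal(size=(50, 50)))
ms = en @ true_w.T                           # stand-in Malay vectors
w = procrustes_align(en[:500], ms[:500])     # fit on a 500-pair seed lexicon
print(np.allclose(en @ w.T, ms, atol=1e-6))  # True: the rotation is recovered
```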
2022.acl-srw.16 lim-liew-2022-english + 10.18653/v1/2022.acl-srw.16 Towards Detecting Political Bias in <fixed-case>H</fixed-case>indi News Articles @@ -10181,6 +10897,7 @@ in the Case of Unambiguous Gender Political propaganda in recent times has been amplified by media news portals through biased reporting, creating untruthful narratives on serious issues that misinform public opinion in the interest of siding with and helping a particular political party. This poses the challenging NLP task of detecting political bias in news articles. We propose a transformer-based transfer learning method to fine-tune the pre-trained network on our data for this bias detection. As the required dataset for this particular task was not available, we created our dataset comprising 1388 Hindi news articles and their headlines from various Hindi news media outlets. We marked them on whether they are biased towards, against, or neutral to BJP, a political party, and the current ruling party at the centre in India. 2022.acl-srw.17 agrawal-etal-2022-towards + 10.18653/v1/2022.acl-srw.17 Restricted or Not: A General Training Framework for Neural Machine Translation @@ -10193,6 +10910,7 @@ in the Case of Unambiguous Gender 2022.acl-srw.18 li-etal-2022-restricted ASPEC + 10.18653/v1/2022.acl-srw.18 What do Models Learn From Training on More Than Text? Measuring Visual Commonsense Knowledge @@ -10203,6 +10921,7 @@ in the Case of Unambiguous Gender 2022.acl-srw.19 hagstrom-johansson-2022-models lovhag/measure-visual-commonsense-knowledge + 10.18653/v1/2022.acl-srw.19 <fixed-case>T</fixed-case>elugu<fixed-case>NER</fixed-case>: Leveraging Multi-Domain Named Entity Recognition with Deep Transformers @@ -10215,6 +10934,7 @@ in the Case of Unambiguous Gender 2022.acl-srw.20 duggenpudi-etal-2022-teluguner WikiAnn + 10.18653/v1/2022.acl-srw.20 Using Neural Machine Translation Methods for Sign Language Translation @@ -10226,6 +10946,7 @@ in the Case of Unambiguous Gender 2022.acl-srw.21 angelova-etal-2022-using PHOENIX14T + 10.18653/v1/2022.acl-srw.21 Flexible Visual Grounding @@ -10242,6 +10963,7 @@ in the Case of Unambiguous Gender RefCOCO Visual Genome Visual7W + 10.18653/v1/2022.acl-srw.22 A large-scale computational study of content preservation measures for text style transfer and paraphrase generation @@ -10254,6 +10976,7 @@ in the Case of Unambiguous Gender 2022.acl-srw.23 babakov-etal-2022-large skoltech-nlp/mutual_implication_score + 10.18653/v1/2022.acl-srw.23 Explicit Object Relation Alignment for Vision and Language Navigation @@ -10264,6 +10987,7 @@ in the Case of Unambiguous Gender 2022.acl-srw.24 zhang-kordjamshidi-2022-explicit hlr/object-grounding-for-vln + 10.18653/v1/2022.acl-srw.24 Mining Logical Event Schemas From Pre-Trained Language Models @@ -10274,6 +10998,7 @@ in the Case of Unambiguous Gender 2022.acl-srw.25 lawley-schubert-2022-mining FrameNet + 10.18653/v1/2022.acl-srw.25 Exploring Cross-lingual Text Detoxification with Large Multilingual Language Models. @@ -10284,6 +11009,7 @@ in the Case of Unambiguous Gender Detoxification is the task of generating text in a polite style while preserving the meaning and fluency of the original toxic text. Existing detoxification methods are monolingual, i.e., designed to work in one specific language. This work investigates multilingual and cross-lingual detoxification and the behavior of large multilingual models in this setting. 
Unlike previous works, we aim to make large language models able to perform detoxification without direct fine-tuning in a given language. Experiments show that multilingual models are capable of performing multilingual style transfer. However, the tested state-of-the-art models are not able to perform cross-lingual detoxification, and direct fine-tuning in the target language currently remains unavoidable, motivating the need for further research in this direction. 2022.acl-srw.26 moskovskiy-etal-2022-exploring + 10.18653/v1/2022.acl-srw.26 <fixed-case>MEKER</fixed-case>: Memory Efficient Knowledge Embedding Representation for Link Prediction and Question Answering @@ -10298,6 +11024,7 @@ in the Case of Unambiguous Gender chekalina-etal-2022-meker FB15k-237 SimpleQuestions + 10.18653/v1/2022.acl-srw.27 Discourse on <fixed-case>ASR</fixed-case> Measurement: Introducing the <fixed-case>ARPOCA</fixed-case> Assessment Tool @@ -10307,6 +11034,7 @@ in the Case of Unambiguous Gender Automatic speech recognition (ASR) has evolved from a pipeline architecture with pronunciation dictionaries, phonetic features and language models to end-to-end systems performing a direct translation from a raw waveform into a word sequence. With the increase in accuracy and the availability of pre-trained models, ASR systems are now omnipresent in our daily applications. On the other hand, the models’ interpretability and their computational cost have become more challenging, particularly when dealing with less-common languages or identifying regional variations of speakers. This research proposal will follow a four-stage process: 1) Providing an overview of acoustic features and feature extraction algorithms; 2) Exploring current ASR models, tools, and performance assessment techniques; 3) Aligning features with interpretable phonetic transcripts; and 4) Designing a prototype ARPOCA to increase awareness of regional language variation and improve model feedback by developing semi-automatic acoustic feature extraction using PRAAT in conjunction with phonetic transcription. 2022.acl-srw.28 merz-scrivner-2022-discourse + 10.18653/v1/2022.acl-srw.28 Pretrained Knowledge Base Embeddings for improved Sentential Relation Extraction @@ -10319,6 +11047,7 @@ in the Case of Unambiguous Gender 2022.acl-srw.29 papaluca-etal-2022-pretrained brunoliegibastonliegi/pretrained-kb-embeddings-for-re + 10.18653/v1/2022.acl-srw.29 Improving Cross-domain, Cross-lingual and Multi-modal Deception Detection @@ -10329,6 +11058,7 @@ in the Case of Unambiguous Gender 2022.acl-srw.30 panda-levitan-2022-improving LIAR + 10.18653/v1/2022.acl-srw.30 Automatic Generation of Distractors for Fill-in-the-Blank Exercises with Round-Trip Neural Machine Translation @@ -10340,6 +11070,7 @@ in the Case of Unambiguous Gender In a fill-in-the-blank exercise, a student is presented with a carrier sentence with one word hidden, and a multiple-choice list that includes the correct answer and several inappropriate options, called distractors. We propose to automatically generate distractors using round-trip neural machine translation: the carrier sentence is translated from English into another (pivot) language and back, and distractors are produced by aligning the original sentence and its round-trip translation. We show that using hundreds of translations for a given sentence allows us to generate a rich set of challenging distractors. Further, using multiple pivot languages produces a diverse set of candidates. 
The distractors are evaluated against a real corpus of cloze exercises and checked manually for validity. We demonstrate that the proposed method significantly outperforms two strong baselines. 2022.acl-srw.31 panda-etal-2022-automatic + 10.18653/v1/2022.acl-srw.31 On the Locality of Attention in Direct Speech Translation @@ -10351,6 +11082,7 @@ in the Case of Unambiguous Gender Transformers have achieved state-of-the-art results across multiple NLP tasks. However, the complexity of the self-attention mechanism scales quadratically with the sequence length, creating an obstacle for tasks involving long sequences, like in the speech domain. In this paper, we discuss the usefulness of self-attention for Direct Speech Translation. First, we analyze the layer-wise token contributions in the self-attention of the encoder, unveiling local diagonal patterns. To prove that some attention weights are avoidable, we propose to substitute the standard self-attention with a local efficient one, setting the amount of context used based on the results of the analysis. With this approach, our model matches the baseline performance, and improves the efficiency by skipping the computation of those weights that standard attention discards. 2022.acl-srw.32 alastruey-etal-2022-locality + 10.18653/v1/2022.acl-srw.32 Extraction of Diagnostic Reasoning Relations for Clinical Knowledge Graphs @@ -10359,6 +11091,7 @@ in the Case of Unambiguous Gender Clinical knowledge graphs lack meaningful diagnostic relations (e.g. comorbidities, sign/symptoms), limiting their ability to represent real-world diagnostic processes. Previous methods in biomedical relation extraction have focused on concept relations, such as gene-disease and disease-drug, and largely ignored clinical processes. In this thesis, we leverage a clinical reasoning ontology and propose methods to extract such relations from a physician-facing point-of-care reference wiki and consumer health resource texts. Given the lack of data labeled with diagnostic relations, we also propose new methods of evaluating the correctness of extracted triples in the zero-shot setting. We describe a process for the intrinsic evaluation of new facts by triple confidence filtering and clinician manual review, as well as extrinsic evaluation in the form of a differential diagnosis prediction task. 2022.acl-srw.33 socrates-2022-extraction + 10.18653/v1/2022.acl-srw.33 Scene-Text Aware Image and Text Retrieval with Dual-Encoder @@ -10372,6 +11105,7 @@ in the Case of Unambiguous Gender 2022.acl-srw.34 miyawaki-etal-2022-scene TextCaps + 10.18653/v1/2022.acl-srw.34 Towards Fine-grained Classification of Climate Change related Social Media Text @@ -10382,6 +11116,7 @@ in the Case of Unambiguous Gender With climate change becoming a cause of concern worldwide, it becomes essential to gauge people’s reactions. This can help educate and spread awareness about it and help leaders improve decision-making. This work explores the fine-grained classification and stance detection of climate change-related social media text. Firstly, we create two datasets, ClimateStance and ClimateEng, consisting of 3777 tweets each, posted during the 2019 United Nations Framework Convention on Climate Change and comprehensively outline the dataset collection, annotation methodology, and dataset composition. Secondly, we propose the task of Climate Change stance detection based on our proposed ClimateStance dataset. 
Thirdly, we propose a fine-grained classification based on the ClimateEng dataset, classifying social media text into five categories: Disaster, Ocean/Water, Agriculture/Forestry, Politics, and General. We benchmark both the datasets for climate change stance detection and fine-grained classification using state-of-the-art methods in text classification. We also create a Reddit-based dataset for both tasks, ClimateReddit, consisting of 6262 pseudo-labeled comments along with 329 manually annotated comments for the label. We then perform semi-supervised experiments for both tasks and benchmark their results using the best-performing model for the supervised experiments. Lastly, we provide insights into ClimateStance and ClimateReddit using part-of-speech tagging and named-entity recognition. 2022.acl-srw.35 vaid-etal-2022-towards + 10.18653/v1/2022.acl-srw.35 Deep Neural Representations for Multiword Expressions Detection @@ -10391,6 +11126,7 @@ in the Case of Unambiguous Gender Effective methods for multiword expression detection are important for many technologies related to Natural Language Processing. Most contemporary methods are based on the sequence labeling scheme applied to an annotated corpus, while traditional methods use statistical measures. In our approach, we want to integrate the concepts of those two approaches. We present a novel weakly supervised multiword expressions extraction method which focuses on their behaviour in various contexts. Our method uses a lexicon of English multiword lexical units acquired from The Oxford Dictionary of English as a reference knowledge base and leverages neural language modelling with deep learning architectures. In our approach, we do not need a corpus annotated specifically for the task. The only required components are: a lexicon of multiword units, a large corpus, and a general contextual embeddings model. We propose a method for building a silver dataset by spotting multiword expression occurrences and acquiring statistical collocations as negative samples. Sample representation has been inspired by representations used in Natural Language Inference and relation recognition. Very good results (F1=0.8) were obtained with a CNN network applied to individual occurrences, followed by weighted voting to combine results from the whole corpus. The proposed method can be quite easily applied to other languages. 2022.acl-srw.36 kanclerz-piasecki-2022-deep + 10.18653/v1/2022.acl-srw.36 A Checkpoint on Multilingual Misogyny Identification @@ -10400,6 +11136,7 @@ in the Case of Unambiguous Gender We address the problem of identifying misogyny in tweets in monolingual and multilingual settings in three languages: English, Italian, and Spanish. We explore model variations considering single and multiple languages both in the pre-training of the transformer and in the training of the downstream task to explore the feasibility of detecting misogyny through a transfer learning approach across multiple languages. That is, we train monolingual transformers with monolingual data, and multilingual transformers with both monolingual and multilingual data. Our models reach state-of-the-art performance on all three languages. The single-language BERT models perform the best, closely followed by different configurations of multilingual BERT models. The performance drops in zero-shot classification across languages. Our error analysis shows that multilingual and monolingual models tend to make the same mistakes. 
2022.acl-srw.37 muti-barron-cedeno-2022-checkpoint + 10.18653/v1/2022.acl-srw.37 Using dependency parsing for few-shot learning in distributional semantics @@ -10409,6 +11146,7 @@ in the Case of Unambiguous Gender In this work, we explore the novel idea of employing dependency parsing information in the context of few-shot learning, the task of learning the meaning of a rare word based on a limited number of context sentences. Firstly, we use dependency-based word embedding models as background spaces for few-shot learning. Secondly, we introduce two few-shot learning methods which enhance the additive baseline model by using dependencies. 2022.acl-srw.38 preda-emerson-2022-using + 10.18653/v1/2022.acl-srw.38 A Dataset and <fixed-case>BERT</fixed-case>-based Models for Targeted Sentiment Analysis on <fixed-case>T</fixed-case>urkish Texts @@ -10418,6 +11156,7 @@ in the Case of Unambiguous Gender Targeted Sentiment Analysis aims to extract sentiment towards a particular target from a given text. It is a field that is attracting attention due to the increasing accessibility of the Internet, which leads people to generate an enormous amount of data. Sentiment analysis, which in general requires annotated data for training, is a well-researched area for widely studied languages such as English. For low-resource languages such as Turkish, there is a lack of such annotated data. We present an annotated Turkish dataset suitable for targeted sentiment analysis. We also propose BERT-based models with different architectures to accomplish the task of targeted sentiment analysis. The results demonstrate that the proposed models outperform the traditional sentiment analysis models for the targeted sentiment analysis task. 2022.acl-srw.39 mutlu-ozgur-2022-dataset + 10.18653/v1/2022.acl-srw.39 @@ -10449,6 +11188,7 @@ in the Case of Unambiguous Gender 2022.acl-demo.1 lin-etal-2022-dotat fxlp/marktool + 10.18653/v1/2022.acl-demo.1 <fixed-case>UKP</fixed-case>-<fixed-case>SQUARE</fixed-case>: An Online Platform for Question Answering Research @@ -10475,6 +11215,7 @@ in the Case of Unambiguous Gender MS MARCO Natural Questions SQuAD + 10.18653/v1/2022.acl-demo.2 <fixed-case>V</fixed-case>i<fixed-case>LM</fixed-case>edic: a framework for research at the intersection of vision and language in medical <fixed-case>AI</fixed-case> @@ -10494,6 +11235,7 @@ in the Case of Unambiguous Gender jbdel/vilmedic PadChest Visual Question Answering + 10.18653/v1/2022.acl-demo.3 <fixed-case>T</fixed-case>ext<fixed-case>P</fixed-case>runer: A Model Pruning Toolkit for Pre-Trained Language Models @@ -10504,6 +11246,7 @@ in the Case of Unambiguous Gender Pre-trained language models have prevailed in natural language processing and become the backbone of many NLP tasks, but the demands for computational resources have limited their applications. In this paper, we introduce TextPruner, an open-source model pruning toolkit designed for pre-trained language models, targeting fast and easy model compression. TextPruner offers structured post-training pruning methods, including vocabulary pruning and transformer pruning, and can be applied to various models and tasks. We also propose a self-supervised pruning method that can be applied without labeled data. Our experiments with several NLP tasks demonstrate the ability of TextPruner to reduce the model size without re-training the model. 
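Vocabulary pruning, one of the structured methods the TextPruner abstract above mentions, can be illustrated framework-free: keep only the embedding rows for tokens observed in a target corpus and remap the ids. The sketch below is a simplified stand-in for what the toolkit does on real PyTorch models; all names are ours, not the toolkit's API.

```python
import numpy as np

def prune_vocabulary(embeddings, vocab, corpus_tokens):
    """Drop embedding rows for vocabulary entries never seen in the target
    corpus and remap token ids; the embedding matrix often dominates the
    size of a compact model, so this shrinks it without re-training."""
    id_to_token = {i: t for t, i in vocab.items()}
    keep = sorted({vocab[t] for t in corpus_tokens if t in vocab})
    new_vocab = {id_to_token[old_id]: new_id for new_id, old_id in enumerate(keep)}
    return embeddings[keep], new_vocab

emb = np.arange(20, dtype=float).reshape(5, 4)  # toy 5-token embedding table
vocab = {"[UNK]": 0, "the": 1, "cat": 2, "sat": 3, "xylophone": 4}
small_emb, small_vocab = prune_vocabulary(emb, vocab, ["[UNK]", "the", "cat", "sat"])
print(small_emb.shape, small_vocab)             # (4, 4), "xylophone" removed
```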
2022.acl-demo.4 yang-etal-2022-textpruner + 10.18653/v1/2022.acl-demo.4 <fixed-case>A</fixed-case>nn<fixed-case>IE</fixed-case>: An Annotation Platform for Constructing Complete Open Information Extraction Benchmark @@ -10519,6 +11262,7 @@ in the Case of Unambiguous Gender 2022.acl-demo.5 friedrich-etal-2022-annie nfriedri/annie-annotation-platform + 10.18653/v1/2022.acl-demo.5 <fixed-case>A</fixed-case>dapter<fixed-case>H</fixed-case>ub Playground: Simple and Flexible Few-Shot Learning with Adapters @@ -10539,6 +11283,7 @@ in the Case of Unambiguous Gender IMDb Movie Reviews MRPC SST + 10.18653/v1/2022.acl-demo.6 <fixed-case>Q</fixed-case>iu<fixed-case>N</fixed-case>iu: A <fixed-case>C</fixed-case>hinese Lyrics Generation System with Passage-Level Input @@ -10550,6 +11295,7 @@ in the Case of Unambiguous Gender Lyrics generation has been a very popular application of natural language generation. Previous works mainly focused on generating lyrics based on a couple of attributes or keywords, rendering very limited control over the content of the lyrics. In this paper, we demonstrate QiuNiu, a Chinese lyrics generation system which is conditioned on passage-level text rather than a few attributes or keywords. By using the passage-level text as input, the content of generated lyrics is expected to reflect the nuances of users’ needs. The QiuNiu system supports various forms of passage-level input, such as short stories, essays, and poetry. Its training is conducted under the framework of unsupervised machine translation, due to the lack of an aligned passage-level text-to-lyrics corpus. We initialize the parameters of QiuNiu with a custom pretrained Chinese GPT-2 model and adopt a two-step process to finetune the model for better alignment between passage-level text and lyrics. Additionally, a post-processing module is used to filter and rerank the generated lyrics to select the ones of highest quality. The demo video of the system is available at https://youtu.be/OCQNzahqWgM. 2022.acl-demo.7 zhang-etal-2022-qiuniu + 10.18653/v1/2022.acl-demo.7 Automatic Gloss Dictionary for Sign Language Learners @@ -10563,6 +11309,7 @@ in the Case of Unambiguous Gender 2022.acl-demo.8 xu-etal-2022-automatic WLASL + 10.18653/v1/2022.acl-demo.8 <fixed-case>P</fixed-case>rompt<fixed-case>S</fixed-case>ource: An Integrated Development Environment and Repository for Natural Language Prompts @@ -10599,6 +11346,7 @@ in the Case of Unambiguous Gender bach-etal-2022-promptsource bigscience-workshop/promptsource SNLI + 10.18653/v1/2022.acl-demo.9 <fixed-case>O</fixed-case>pen<fixed-case>P</fixed-case>rompt: An Open-source Framework for Prompt-learning @@ -10615,6 +11363,7 @@ in the Case of Unambiguous Gender ding-etal-2022-openprompt thunlp/OpenPrompt GLUE + 10.18653/v1/2022.acl-demo.10 Guided K-best Selection for Semantic Parsing Annotation @@ -10632,6 +11381,7 @@ in the Case of Unambiguous Gender Collecting data for conversational semantic parsing is a time-consuming and demanding process. In this paper we consider, given an incomplete dataset with only a small amount of data, how to build an AI-powered human-in-the-loop process to enable efficient data collection. A guided K-best selection process is proposed, which (i) generates a set of possible valid candidates; (ii) allows users to quickly traverse the set and filter incorrect parses; and (iii) asks users to select the correct parse, with minimal modification when necessary.
We investigate how to best support users in efficiently traversing the candidate set and locating the correct parse, in terms of speed and accuracy. In our user study, consisting of five annotators labeling 300 instances each, we find that combining keyword searching, where keywords can be used to query relevant candidates, and keyword suggestion, where representative keywords are automatically generated, enables fast and accurate annotation. 2022.acl-demo.11 belyy-etal-2022-guided + 10.18653/v1/2022.acl-demo.11 Hard and Soft Evaluation of <fixed-case>NLP</fixed-case> models with <fixed-case>BOO</fixed-case>t<fixed-case>ST</fixed-case>rap <fixed-case>SA</fixed-case>mpling - <fixed-case>B</fixed-case>oo<fixed-case>S</fixed-case>t<fixed-case>S</fixed-case>a @@ -10643,6 +11393,7 @@ in the Case of Unambiguous Gender The applied nature of Natural Language Processing (NLP) makes it necessary to select the most effective and robust models. Producing slightly higher performance is insufficient; we want to know whether this advantage will carry over to other data sets. Bootstrapped significance tests can indicate that ability. So, while necessary, computing the significance of models’ performance differences has many levels of complexity. It can be tedious, especially when the experimental design has many conditions to compare and several runs of experiments. We present BooStSa, a tool that makes it easy to compute significance levels with the BOOtSTrap SAmpling procedure to evaluate models that predict not only standard hard labels but also soft labels (i.e., probability distributions over different classes). 2022.acl-demo.12 fornaciari-etal-2022-hard + 10.18653/v1/2022.acl-demo.12 <fixed-case>COVID</fixed-case>-19 Claim Radar: A Structured Claim Extraction and Tracking System @@ -10659,6 +11410,7 @@ in the Case of Unambiguous Gender 2022.acl-demo.13 li-etal-2022-covid uiucnlp/covid-claim-radar + 10.18653/v1/2022.acl-demo.13 <fixed-case>TS</fixed-case>-<fixed-case>ANNO</fixed-case>: An Annotation Tool to Build, Annotate and Evaluate Text Simplification Corpora @@ -10670,6 +11422,7 @@ in the Case of Unambiguous Gender stodden-kallmeyer-2022-ts ASSET ASSET Corpus + 10.18653/v1/2022.acl-demo.14 Language Diversity: Visible to Humans, Exploitable by Machines @@ -10683,6 +11436,7 @@ in the Case of Unambiguous Gender The Universal Knowledge Core (UKC) is a large multilingual lexical database with a focus on language diversity and covering over two thousand languages. The aim of the database, as well as its tools and data catalogue, is to make the abstract notion of linguistic diversity visually understandable for humans and formally exploitable by machines. The UKC website lets users explore millions of individual words and their meanings, but also phenomena of cross-lingual convergence and divergence, such as shared interlingual meanings, lexicon similarities, cognate clusters, or lexical gaps. The UKC LiveLanguage Catalogue, in turn, provides access to the underlying lexical data in a computer-processable form, ready to be reused in cross-lingual applications.
2022.acl-demo.15 bella-etal-2022-language + 10.18653/v1/2022.acl-demo.15 <fixed-case>C</fixed-case>og<fixed-case>KGE</fixed-case>: A Knowledge Graph Embedding Toolkit and Benchmark for Representing Multi-source and Heterogeneous Knowledge @@ -10703,6 +11457,7 @@ in the Case of Unambiguous Gender jinzhuoran/cogkge ConceptNet FrameNet + 10.18653/v1/2022.acl-demo.16 Dynatask: A Framework for Creating Dynamic <fixed-case>AI</fixed-case> Benchmark Tasks @@ -10724,6 +11479,7 @@ in the Case of Unambiguous Gender ANLI AdversarialQA GLUE + 10.18653/v1/2022.acl-demo.17 <fixed-case>D</fixed-case>ata<fixed-case>L</fixed-case>ab: A Platform for Data Analysis and Intervention @@ -10741,6 +11497,7 @@ in the Case of Unambiguous Gender xiao-etal-2022-datalab BeerAdvocate SNLI + 10.18653/v1/2022.acl-demo.18 Cue-bot: A Conversational Agent for Assistive Technology @@ -10755,6 +11512,7 @@ in the Case of Unambiguous Gender Intelligent conversational assistants have become an integral part of our lives for performing simple tasks. However, such agents, for example, Google bots, Alexa, and others, are yet to have any social impact on minority populations, for example, people with neurological disorders and people with speech, language and social communication disorders, sometimes with locked-in states where speaking or typing is a challenge. Language model technologies can be very powerful tools in enabling these users to carry out daily communication and social interactions. In this work, we present a system that users with varied levels of disabilities can use to interact with the world, supported by eye-tracking, mouse controls and an intelligent agent, Cue-bot, that can represent the user in a conversation. The agent provides relevant controllable ‘cues’ to generate desirable responses quickly for an ongoing dialog context. In the context of usage of such systems for people with degenerative disorders, we present automatic and human evaluation of our cue/keyword predictor and the controllable dialog system and show that our models perform significantly better than models without control and can also reduce user effort (fewer keystrokes) and speed up communication (typing time) significantly.
2022.acl-demo.19 h-kumar-etal-2022-cue + 10.18653/v1/2022.acl-demo.19 <fixed-case>M</fixed-case>-<fixed-case>SENA</fixed-case>: An Integrated Platform for Multimodal Sentiment Analysis @@ -10772,6 +11530,7 @@ in the Case of Unambiguous Gender CH-SIMS CMU-MOSEI Multimodal Opinion-level Sentiment Intensity + 10.18653/v1/2022.acl-demo.20 <fixed-case>HOSMEL</fixed-case>: A Hot-Swappable Modularized Entity Linking Toolkit for <fixed-case>C</fixed-case>hinese @@ -10788,6 +11547,7 @@ in the Case of Unambiguous Gender zhang-li-etal-2022-hosmel thudm/hosmel CLUE + 10.18653/v1/2022.acl-demo.21 <fixed-case>BMI</fixed-case>nf: An Efficient Toolkit for Big Model Inference and Tuning @@ -10805,6 +11565,7 @@ in the Case of Unambiguous Gender 2022.acl-demo.22 han-etal-2022-bminf openbmb/bminf + 10.18653/v1/2022.acl-demo.22 <fixed-case>MMEKG</fixed-case>: Multi-modal Event Knowledge Graph towards Universal Representation across Modalities @@ -10824,6 +11585,7 @@ in the Case of Unambiguous Gender 2022.acl-demo.23 ma-etal-2022-mmekg FrameNet + 10.18653/v1/2022.acl-demo.23 <fixed-case>S</fixed-case>ocio<fixed-case>F</fixed-case>illmore: A Tool for Discovering Perspectives @@ -10836,6 +11598,7 @@ in the Case of Unambiguous Gender SOCIOFILLMORE is a multilingual tool which helps to bring to the fore the focus or the perspective that a text expresses in depicting an event. Our tool, whose rationale we also support through a large collection of human judgements, is theoretically grounded on frame semantics and cognitive linguistics, and implemented using the LOME frame semantic parser. We describe SOCIOFILLMORE’s development and functionalities, show how non-NLP researchers can easily interact with the tool, and present some example case studies which are already incorporated in the system, together with the kind of analysis that can be visualised. 2022.acl-demo.24 minnema-etal-2022-sociofillmore + 10.18653/v1/2022.acl-demo.24 <fixed-case>T</fixed-case>ime<fixed-case>LM</fixed-case>s: Diachronic Language Models from <fixed-case>T</fixed-case>witter @@ -10850,6 +11613,7 @@ in the Case of Unambiguous Gender loureiro-etal-2022-timelms cardiffnlp/timelms TweetEval + 10.18653/v1/2022.acl-demo.25 Adaptor: Objective-Centric Adaptation Framework for Language Models @@ -10862,6 +11626,7 @@ in the Case of Unambiguous Gender 2022.acl-demo.26 stefanik-etal-2022-adaptor gaussalgo/adaptor + 10.18653/v1/2022.acl-demo.26 <fixed-case>Q</fixed-case>uick<fixed-case>G</fixed-case>raph: A Rapid Annotation Tool for Knowledge Graph Extraction from Technical Text @@ -10873,6 +11638,7 @@ in the Case of Unambiguous Gender 2022.acl-demo.27 bikaun-etal-2022-quickgraph nlp-tlp/quickgraph + 10.18653/v1/2022.acl-demo.27 @@ -10905,6 +11671,7 @@ in the Case of Unambiguous Gender 2022.acl-tutorials.1 church-etal-2022-gentle GLUE + 10.18653/v1/2022.acl-tutorials.1 Towards Reproducible Machine Learning Research in Natural Language Processing @@ -10920,6 +11687,7 @@ in the Case of Unambiguous Gender While recent progress in the field of ML has been significant, the reproducibility of these cutting-edge results is often lacking, with many submissions omitting the information necessary to ensure subsequent reproducibility. Despite proposals such as the Reproducibility Checklist and reproducibility criteria at several major conferences, the reflex for carrying out research with reproducibility in mind is still missing in the broader ML community.
We propose this tutorial as a gentle introduction to ensuring reproducible research in ML, with a specific emphasis on computational linguistics and NLP. We also provide a framework for using reproducibility as a teaching tool in university-level computer science programs. 2022.acl-tutorials.2 lucic-etal-2022-towards + 10.18653/v1/2022.acl-tutorials.2 Knowledge-Augmented Methods for Natural Language Processing @@ -10936,6 +11704,7 @@ in the Case of Unambiguous Gender CommonGen CommonsenseQA ConceptNet + 10.18653/v1/2022.acl-tutorials.3 Non-Autoregressive Sequence Generation @@ -10945,6 +11714,7 @@ in the Case of Unambiguous Gender Non-autoregressive sequence generation (NAR) attempts to generate the entire or partial output sequences in parallel to speed up the generation process and avoid potential issues (e.g., label bias, exposure bias) in autoregressive generation. While it has received much research attention and has been applied in many sequence generation tasks in natural language and speech, naive NAR models still face many challenges in closing the performance gap with state-of-the-art autoregressive models because of a lack of modeling power. In this tutorial, we will provide a thorough introduction and review of non-autoregressive sequence generation, in four sections: 1) Background, which covers the motivation of NAR generation, the problem definition, the evaluation protocol, and the comparison with standard autoregressive generation approaches. 2) Method, which includes different aspects: model architecture, objective function, training data, learning paradigm, and additional inference tricks. 3) Application, which covers different tasks in text and speech generation, and some advanced topics in applications. 4) Conclusion, in which we describe several research challenges and discuss the potential future research directions. We hope this tutorial can serve both academic researchers and industry practitioners working on non-autoregressive sequence generation. 2022.acl-tutorials.4 gu-tan-2022-non + 10.18653/v1/2022.acl-tutorials.4 Learning with Limited Text Data @@ -10955,6 +11725,7 @@ in the Case of Unambiguous Gender Natural Language Processing (NLP) has achieved great progress in the past decade on the basis of neural models, which often make use of large amounts of labeled data to achieve state-of-the-art performance. The dependence on labeled data prevents NLP models from being applied to low-resource settings and languages because of the time, money, and expertise that is often required to label massive amounts of textual data. Consequently, the ability to learn with limited labeled data is crucial for deploying neural systems to real-world NLP applications. Recently, numerous approaches have been explored to alleviate the need for labeled data in NLP such as data augmentation and semi-supervised learning. This tutorial aims to provide a systematic and up-to-date overview of these methods in order to help researchers and practitioners understand the landscape of approaches and the challenges associated with learning from limited labeled data, an emerging topic in the computational linguistics community. We will consider applications to a wide variety of NLP tasks (including text classification, generation, and structured prediction) and will highlight current challenges and future directions.
2022.acl-tutorials.5 yang-etal-2022-learning + 10.18653/v1/2022.acl-tutorials.5 Zero- and Few-Shot <fixed-case>NLP</fixed-case> with Pretrained Language Models @@ -10967,6 +11738,7 @@ in the Case of Unambiguous Gender The ability to efficiently learn from little-to-no data is critical to applying NLP to tasks where data collection is costly or otherwise difficult. This is a challenging setting both academically and practically—particularly because training neural models typically requires large amounts of labeled data. More recently, advances in pretraining on unlabelled data have brought up the potential of better zero-shot or few-shot learning (Devlin et al., 2019; Brown et al., 2020). In particular, over the past year, a great deal of research has been conducted to better learn from limited data using large-scale language models. In this tutorial, we aim at bringing interested NLP researchers up to speed about the recent and ongoing techniques for zero- and few-shot learning with pretrained language models. Additionally, our goal is to reveal new research opportunities to the audience, which will hopefully bring us closer to addressing existing challenges in this domain. 2022.acl-tutorials.6 beltagy-etal-2022-zero + 10.18653/v1/2022.acl-tutorials.6 Vision-Language Pretraining: Current Trends and the Future @@ -10978,6 +11750,7 @@ in the Case of Unambiguous Gender 2022.acl-tutorials.7 agrawal-etal-2022-vision Visual Question Answering + 10.18653/v1/2022.acl-tutorials.7 Natural Language Processing for Multilingual Task-Oriented Dialogue @@ -10990,6 +11763,7 @@ in the Case of Unambiguous Gender Recent advances in deep learning have also enabled fast progress in the research of task-oriented dialogue (ToD) systems. However, the majority of ToD systems are developed for English and merely a handful of other widely spoken languages, e.g., Chinese and German. This hugely limits the global reach and, consequently, the transformative socioeconomic potential of such systems. In this tutorial, we will thus discuss and demonstrate the importance of (building) multilingual ToD systems, and then provide a systematic overview of current research gaps, challenges and initiatives related to multilingual ToD systems, with a particular focus on their connections to current research and challenges in multilingual and low-resource NLP. The tutorial will aim to provide answers or shed new light on the following questions: a) Why are multilingual dialogue systems so hard to build: what makes multilinguality for dialogue more challenging than for other NLP applications and tasks? b) What are the best existing methods and datasets for multilingual and cross-lingual (task-oriented) dialog systems? How are (multilingual) ToD systems usually evaluated? c) What are the promising future directions for multilingual ToD research: where can one draw inspiration from related NLP areas and tasks? 2022.acl-tutorials.8 razumovskaia-etal-2022-natural + 10.18653/v1/2022.acl-tutorials.8 diff --git a/data/xml/2022.bigscience.xml b/data/xml/2022.bigscience.xml index 4b10218a6d..ae20e9fc45 100644 --- a/data/xml/2022.bigscience.xml +++ b/data/xml/2022.bigscience.xml @@ -33,6 +33,7 @@ jin-etal-2022-lifelong S2ORC SciERC + 10.18653/v1/2022.bigscience-1.1 Using <fixed-case>ASR</fixed-case>-Generated Text for Spoken Language Modeling @@ -47,6 +48,7 @@ This paper aims at improving spoken language modeling (LM) using a very large amount of automatically transcribed speech.
We leverage the INA (French National Audiovisual Institute) collection and obtain 19GB of text after applying ASR on 350,000 hours of diverse TV shows. From this, spoken language models are trained either by fine-tuning an existing LM (FlauBERT) or through training an LM from scratch. The new models (FlauBERT-Oral) will be shared with the community and are evaluated not only in terms of word prediction accuracy but also for two downstream tasks: classification of TV shows and syntactic parsing of speech. Experimental results show that FlauBERT-Oral is better than its initial FlauBERT version, demonstrating that, despite its inherent noisy nature, ASR-generated text can be useful to improve spoken language modeling. 2022.bigscience-1.2 herve-etal-2022-using + 10.18653/v1/2022.bigscience-1.2 You reap what you sow: On the Challenges of Bias Evaluation Under Multilingual Settings @@ -71,6 +73,7 @@ 2022.bigscience-1.3 talat-etal-2022-reap CrowS-Pairs + 10.18653/v1/2022.bigscience-1.3 Diverse Lottery Tickets Boost Ensemble from a Single Pretrained Model @@ -87,6 +90,7 @@ SQuAD SST SuperGLUE + 10.18653/v1/2022.bigscience-1.4 <fixed-case>UNIREX</fixed-case>: A Unified Learning Framework for Language Model Rationale Extraction @@ -107,6 +111,7 @@ MultiRC SST e-SNLI + 10.18653/v1/2022.bigscience-1.5 Pipelines for Social Bias Testing of Large Language Models @@ -117,6 +122,7 @@ The maturity level of language models is now at a stage in which many companies rely on them to solve various tasks. However, while research has shown how biased and harmful these models are, systematic ways of integrating social bias tests into development pipelines are still lacking. This short paper suggests how to use these verification techniques in development pipelines. We take inspiration from software testing and suggest addressing social bias evaluation as software testing. We hope to open a discussion on the best methodologies to handle social bias testing in language models. 2022.bigscience-1.6 nozza-etal-2022-pipelines + 10.18653/v1/2022.bigscience-1.6 Entities, Dates, and Languages: Zero-Shot on Historical Texts with T0 @@ -131,6 +137,7 @@ In this work, we explore whether the recently demonstrated zero-shot abilities of the T0 model extend to Named Entity Recognition for out-of-distribution languages and time periods. Using a historical newspaper corpus in 3 languages as a test-bed, we use prompts to extract possible named entities. Our results show that a naive approach for prompt-based zero-shot multilingual Named Entity Recognition is error-prone, but highlights the potential of such an approach for historical languages lacking labeled datasets. Moreover, we also find that T0-like models can be probed to predict the publication date and language of a document, which could be very relevant for the study of historical texts.
2022.bigscience-1.7 de-toni-etal-2022-entities + 10.18653/v1/2022.bigscience-1.7 A Holistic Assessment of the Carbon Footprint of Noor, a Very Large <fixed-case>A</fixed-case>rabic Language Model @@ -144,6 +151,7 @@ 2022.bigscience-1.8 lakim-etal-2022-holistic CCNet + 10.18653/v1/2022.bigscience-1.8 <fixed-case>GPT</fixed-case>-<fixed-case>N</fixed-case>eo<fixed-case>X</fixed-case>-20<fixed-case>B</fixed-case>: An Open-Source Autoregressive Language Model @@ -179,6 +187,7 @@ PROST SuperGLUE The Pile + 10.18653/v1/2022.bigscience-1.9 Dataset Debt in Biomedical Language Modeling @@ -200,6 +209,7 @@ fries-etal-2022-dataset BLUE BLURB + 10.18653/v1/2022.bigscience-1.10 Emergent Structures and Training Dynamics in Large Language Models @@ -214,6 +224,7 @@ Large language models have achieved success on a number of downstream tasks, particularly in a few- and zero-shot manner. As a consequence, researchers have been investigating both the kind of information these networks learn and how such information can be encoded in the parameters of the model. We survey the literature on changes in the network during training, drawing from work outside of NLP when necessary, and on learned representations of linguistic features in large language models. We note in particular the lack of sufficient research on the emergence of functional units, subsections of the network where related functions are grouped or organised, within large language models and motivate future work that grounds the study of language models in an analysis of their changing internal structure during training time. 2022.bigscience-1.11 teehan-etal-2022-emergent + 10.18653/v1/2022.bigscience-1.11 Foundation Models of Scientific Knowledge for Chemistry: Opportunities, Challenges and Lessons Learned @@ -245,6 +256,7 @@ WSC WebText WiC + 10.18653/v1/2022.bigscience-1.12 diff --git a/data/xml/2022.bionlp.xml b/data/xml/2022.bionlp.xml index dd540d0613..482668f3aa 100644 --- a/data/xml/2022.bionlp.xml +++ b/data/xml/2022.bionlp.xml @@ -27,6 +27,7 @@ The healthcare domain suffers from the spread of poor-quality articles on the Internet. While manual efforts exist to check the quality of online healthcare articles, they are not sufficient to assess all those in circulation. Such quality assessment can be automated as a text classification task; however, explanations for the labels are necessary for the users to trust the model predictions. While current explainable systems tackle explanation generation as summarization, we propose a new approach based on question answering (QA) that allows us to generate explanations for multiple criteria using a single model. We show that this QA-based approach is competitive with the current state-of-the-art, and complements summarization-based models for explainable quality assessment. We also introduce a human evaluation protocol more appropriate than automatic metrics for the evaluation of explanation generation models. 2022.bionlp-1.1 boissonnet-etal-2022-explainable + 10.18653/v1/2022.bionlp-1.1 A sequence-to-sequence approach for document-level relation extraction @@ -41,6 +42,7 @@ BC5CDR CDR DocRED + 10.18653/v1/2022.bionlp-1.2 Position-based Prompting for Health Outcome Generation @@ -52,6 +54,7 @@ Probing factual knowledge in Pre-trained Language Models (PLMs) using prompts has indirectly implied that language models (LMs) can be treated as knowledge bases.
To this end, this phenomenon has been effective, especially when these LMs are fine-tuned towards not just the data, but also the style or linguistic pattern of the prompts themselves. We observe that satisfying a particular linguistic pattern in prompts is an unsustainable, time-consuming constraint in the probing task, especially because prompts are often manually designed and the range of possible prompt template patterns can vary depending on the prompting task. To alleviate this constraint, we propose using a position-attention mechanism to capture positional information of each word in a prompt relative to the mask to be filled, hence avoiding the need to re-construct prompts when the prompts’ linguistic pattern changes. Using our approach, we demonstrate the ability to elicit answers (in a case study on health outcome generation) not only for common prompt templates like Cloze and Prefix but also for rare ones, such as Postfix and Mixed patterns whose masks are respectively at the start and in multiple random places of the prompt. Moreover, using various biomedical PLMs, our approach consistently outperforms a baseline in which the default PLM representation is used to predict masked tokens. 2022.bionlp-1.3 abaho-etal-2022-position + 10.18653/v1/2022.bionlp-1.3 How You Say It Matters: Measuring the Impact of Verbal Disfluency Tags on Automated Dementia Detection @@ -63,6 +66,7 @@ 2022.bionlp-1.4 farzana-etal-2022-say ashwindeshpande96/measuring_the_impact_of_verbal_disfluency_tags_on_automated_dementia_detection + 10.18653/v1/2022.bionlp-1.4 Zero-Shot Aspect-Based Scientific Document Summarization using Self-Supervised Pre-training @@ -76,6 +80,7 @@ soleimani-etal-2022-zero asoleimanib/zeroshotaspectbased FacetSum + 10.18653/v1/2022.bionlp-1.5 Data Augmentation for Biomedical Factoid Question Answering @@ -90,6 +95,7 @@ BIOMRC BioASQ SQuAD + 10.18653/v1/2022.bionlp-1.6 Slot Filling for Biomedical Information Extraction @@ -104,6 +110,7 @@ ypapanik/biomedical-slot-filling KILT Natural Questions + 10.18653/v1/2022.bionlp-1.7 Automatic Biomedical Term Clustering by Learning Fine-grained Term Representations @@ -116,6 +123,7 @@ zeng-etal-2022-automatic GanjinZero/CODER BC5CDR + 10.18653/v1/2022.bionlp-1.8 <fixed-case>B</fixed-case>io<fixed-case>BART</fixed-case>: Pretraining and Evaluation of A Biomedical Generative Language Model @@ -138,6 +146,7 @@ MeQSum MedMentions Semantic Scholar + 10.18653/v1/2022.bionlp-1.9 Incorporating Medical Knowledge to Transformer-based Language Models for Medical Dialogue Generation @@ -150,6 +159,7 @@ Medical dialogue systems have the potential to assist doctors in expanding access to medical care, improving the quality of patient experiences, and lowering medical expenses. The computational methods are still in their early stages and are not ready for widespread application despite their great potential. Existing transformer-based language models have shown promising results but lack domain-specific knowledge. However, to diagnose like doctors, an automatic medical diagnosis necessitates more stringent requirements for the rationality of the dialogue in the context of relevant knowledge. In this study, we propose a new method that addresses the challenges of medical dialogue generation by incorporating medical knowledge into transformer-based language models. We present a method that leverages an external medical knowledge graph and injects triples as domain knowledge into the utterances.
Automatic and human evaluation on a publicly available dataset demonstrates that incorporating medical knowledge outperforms several state-of-the-art baseline methods. 2022.bionlp-1.10 naseem-etal-2022-incorporating + 10.18653/v1/2022.bionlp-1.10 Memory-aligned Knowledge Graph for Clinically Accurate Radiology Image Report Generation @@ -158,6 +168,7 @@ Automatically generating clinically accurate radiology reports from X-ray images is important but challenging. The identification of multi-grained abnormal regions in images and the corresponding abnormalities is difficult for data-driven neural models. In this work, we introduce a Memory-aligned Knowledge Graph (MaKG) of clinical abnormalities to better learn the visual patterns of abnormalities and their relationships by integrating it into a deep model architecture for report generation. We carry out extensive experiments and show that the proposed MaKG deep model can improve the clinical accuracy of the generated reports. 2022.bionlp-1.11 yan-2022-memory + 10.18653/v1/2022.bionlp-1.11 Simple Semantic-based Data Augmentation for Named Entity Recognition in Biomedical Texts @@ -167,6 +178,7 @@ Data augmentation is important in addressing data sparsity and low resources in NLP. Unlike data augmentation for other tasks such as sentence-level and sentence-pair ones, data augmentation for named entity recognition (NER) requires preserving the semantics of entities. To that end, in this paper we propose a simple semantic-based data augmentation method for biomedical NER. Our method leverages semantic information from pre-trained language models at both the entity level and the sentence level. Experimental results on two datasets, i2b2-2010 (English) and VietBioNER (Vietnamese), showed that the proposed method could improve NER performance. 2022.bionlp-1.12 phan-nguyen-2022-simple + 10.18653/v1/2022.bionlp-1.12 Auxiliary Learning for Named Entity Recognition with Multiple Auxiliary Biomedical Training Data @@ -181,6 +193,7 @@ 2022.bionlp-1.13 watanabe-etal-2022-auxiliary NCBI Disease + 10.18653/v1/2022.bionlp-1.13 <fixed-case>SNP</fixed-case>2<fixed-case>V</fixed-case>ec: Scalable Self-Supervised Pre-Training for Genome-Wide Association Study @@ -196,6 +209,7 @@ 2022.bionlp-1.14 cahyawijaya-etal-2022-snp2vec hltchkust/snp2vec + 10.18653/v1/2022.bionlp-1.14 Biomedical <fixed-case>NER</fixed-case> using Novel Schema and Distant Supervision @@ -207,6 +221,7 @@ Biomedical Named Entity Recognition (BMNER) is one of the most important tasks in the field of biomedical text mining. Most work so far on this task has not focused on identification of discontinuous and overlapping entities, even though they are present in significant fractions in real-life biomedical datasets. In this paper, we introduce a novel annotation schema to capture complex entities, and explore the effects of distant supervision on our deep-learning sequence labelling model. For the BMNER task, our annotation schema outperforms other BIO-based annotation schemes on the same model. We also achieve higher F1-scores than state-of-the-art models on multiple corpora without fine-tuning embeddings, highlighting the efficacy of neural feature extraction using our model. 2022.bionlp-1.15 khandelwal-etal-2022-biomedical + 10.18653/v1/2022.bionlp-1.15 Improving Supervised Drug-Protein Relation Extraction with Distantly Supervised Models @@ -217,6 +232,7 @@ This paper proposes novel drug-protein relation extraction models that indirectly utilize distant supervision data.
Concretely, instead of adding distant supervision data to the manually annotated training data, our models incorporate distantly supervised models, that is, relation extraction models trained with distant supervision data. Distantly supervised learning has been proposed to generate a large amount of pseudo-training data at low cost. However, there is still a problem of low prediction performance due to the inclusion of mislabeled data. Therefore, several methods have been proposed to suppress the effects of noisy cases by utilizing some manually annotated training data. However, their performance is lower than that of supervised learning on manually annotated data because mislabeled data that cannot be fully suppressed becomes noise when training the model. To overcome this issue, our methods indirectly utilize distant supervision data with manually annotated training data. The experimental results on the DrugProt corpus in the BioCreative VII Track 1 showed that our proposed model can consistently improve the supervised models in different settings. 2022.bionlp-1.16 iinuma-etal-2022-improving + 10.18653/v1/2022.bionlp-1.16 Named Entity Recognition for Cancer Immunology Research Using Distant Supervision @@ -227,6 +243,7 @@ Cancer immunology research involves several important cell and protein factors. Extracting the information on such cells and proteins and the interactions between them from text is crucial in text mining for cancer immunology research. However, there are few available datasets for these entities, and the number of annotated documents is not sufficient compared with other major named entity types. In this work, we introduce our automatically annotated dataset of key named entities, i.e., T-cells, cytokines, and transcription factors, which are engaged in recent cancer immunotherapy. The entities are annotated based on the UniProtKB knowledge base using dictionary matching. We build a neural named entity recognition (NER) model to be trained on this dataset and evaluate it on manually annotated data. Experimental results show that we can achieve promising NER performance even though our data is automatically annotated. Our dataset also enhances the NER performance when combined with existing data, especially gaining improvement on named entities that have not yet been widely investigated, such as cytokines and transcription factors. 2022.bionlp-1.17 trieu-etal-2022-named + 10.18653/v1/2022.bionlp-1.17 Intra-Template Entity Compatibility based Slot-Filling for Clinical Trial Information Extraction @@ -236,6 +253,7 @@ We present a deep learning based information extraction system that can extract the design and results of a published abstract describing a Randomized Controlled Trial (RCT). In contrast to other approaches, our system does not regard the PICO elements as flat objects or labels but as structured objects. We thus model the task as one of filling a set of templates and slots; our two-step approach recognizes relevant slot candidates as a first step and assigns them to a corresponding template as a second step, relying on a learned pairwise scoring function that models the compatibility of the different slot values. We evaluate the approach on a dataset of 211 manually annotated abstracts for type 2 Diabetes and Glaucoma, showing the positive impact of modelling intra-template entity compatibility.
As its main benefit, our approach yields a structured object for every RCT abstract that supports the aggregation and summarization of clinical trial results across published studies and can facilitate the task of creating a systematic review or meta-analysis. 2022.bionlp-1.18 witte-cimiano-2022-intra + 10.18653/v1/2022.bionlp-1.18 Pretrained Biomedical Language Models for Clinical <fixed-case>NLP</fixed-case> in <fixed-case>S</fixed-case>panish @@ -253,6 +271,7 @@ 2022.bionlp-1.19 carrino-etal-2022-pretrained PlanTL-GOB-ES/lm-biomedical-clinical-es + 10.18653/v1/2022.bionlp-1.19 Few-Shot Cross-lingual Transfer for Coarse-grained De-identification of Code-Mixed Clinical Texts @@ -268,6 +287,7 @@ amin-etal-2022-shot suamin/t2ner CoNLL 2002 + 10.18653/v1/2022.bionlp-1.20 <fixed-case>VPAI</fixed-case>_<fixed-case>L</fixed-case>ab at <fixed-case>M</fixed-case>ed<fixed-case>V</fixed-case>id<fixed-case>QA</fixed-case> 2022: A Two-Stage Cross-modal Fusion Method for Medical Instructional Video Classification @@ -283,6 +303,7 @@ lireanstar/medvidcl Kinetics MedVidQA + 10.18653/v1/2022.bionlp-1.21 <fixed-case>G</fixed-case>en<fixed-case>C</fixed-case>ompare<fixed-case>S</fixed-case>um: a hybrid unsupervised summarization method using salience @@ -298,6 +319,7 @@ Pubmed S2ORC arXiv + 10.18653/v1/2022.bionlp-1.22 <fixed-case>B</fixed-case>io<fixed-case>C</fixed-case>ite: A Deep Learning-based Citation Linkage Framework for Biomedical Research Articles @@ -307,6 +329,7 @@ Research papers reflect scientific advances. Citations are widely used in research publications to support the new findings and show their benefits, while also regulating the information flow to make the contents clearer for the audience. A citation in a research article refers to the information’s source, but not the specific text span from that source article. In biomedical research articles, this task is challenging as the same chemical or biological component can be represented in multiple ways in different papers from various domains. This paper suggests a mechanism for linking citing sentences in a publication with cited sentences in referenced sources. The framework presented here pairs the citing sentence with all of the sentences in the reference text, and then tries to retrieve the semantically equivalent pairs. These semantically related sentences from the reference paper are chosen as the cited statements. This effort involves designing a citation linkage framework utilizing sequential and tree-structured siamese deep learning models. This paper also provides a method to create a synthetic corpus for such a task. 2022.bionlp-1.23 singha-roy-mercer-2022-biocite + 10.18653/v1/2022.bionlp-1.23 Low Resource Causal Event Detection from Biomedical Literature @@ -318,6 +341,7 @@ Recognizing causal precedence relations among the chemical interactions in biomedical literature is crucial to understanding the underlying biological mechanisms. However, detecting such causal relations can be hard because: (1) many times, such causal relations among events are not explicitly expressed by certain phrases but implied by very diverse expressions in the text, and (2) annotating such causal relation detection datasets requires considerable expert knowledge and effort. In this paper, we propose a strategy to address both challenges by training neural models with in-domain pre-training and knowledge distillation.
We show that, by using a very limited amount of labeled data and a sufficient amount of unlabeled data, the neural models outperform previous baselines on the causal precedence detection task, and are ten times faster at inference compared to the BERT base model. 2022.bionlp-1.24 liang-etal-2022-low + 10.18653/v1/2022.bionlp-1.24 Overview of the <fixed-case>M</fixed-case>ed<fixed-case>V</fixed-case>id<fixed-case>QA</fixed-case> 2022 Shared Task on Medical Video Question-Answering @@ -329,6 +353,7 @@ gupta-demner-fushman-2022-overview HowTo100M MedVidQA + 10.18653/v1/2022.bionlp-1.25 Inter-annotator agreement is not the ceiling of machine learning performance: Evidence from a comprehensive set of simulations @@ -339,6 +364,7 @@ It is commonly claimed that inter-annotator agreement (IAA) is the ceiling of machine learning (ML) performance, i.e., that the agreement between an ML system’s predictions and an annotator cannot be higher than the agreement between two annotators. Although Boguslav & Cohen (2017) showed that this claim is falsified by many real-world ML systems, the claim has persisted. As a complement to this real-world evidence, we conducted a comprehensive set of simulations, and show that an ML model can beat IAA even if (and especially if) annotators are noisy and differ in their underlying classification functions, as long as the ML model is reasonably well-specified. Although the latter condition has long been elusive, leading ML models to underperform IAA, we anticipate that this condition will be increasingly met in the era of big data and deep learning. Our work has implications for (1) maximizing the value of machine learning, (2) adherence to ethical standards in computing, and (3) economical use of annotated resources, which is paramount in settings where annotation is especially expensive, like biomedical natural language processing. 2022.bionlp-1.26 richie-etal-2022-inter + 10.18653/v1/2022.bionlp-1.26 Conversational Bots for Psychotherapy: A Study of Generative Transformer Models Using Domain-specific Dialogues @@ -356,6 +382,7 @@ 2022.bionlp-1.27 das-etal-2022-conversational WebText + 10.18653/v1/2022.bionlp-1.27 <fixed-case>BEEDS</fixed-case>: Large-Scale Biomedical Event Extraction using Distant Supervision and Question Answering @@ -367,6 +394,7 @@ 2022.bionlp-1.28 wang-etal-2022-beeds wangxii/beeds + 10.18653/v1/2022.bionlp-1.28 Data Augmentation for Rare Symptoms in Vaccine Side-Effect Detection @@ -376,6 +404,7 @@ We study the problem of entity detection and normalization applied to patient self-reports of symptoms that arise as side-effects of vaccines. Our application domain presents unique challenges that render traditional classification methods ineffective: the number of entity types is large; and many symptoms are rare, resulting in a long-tail distribution of training examples per entity type. We tackle these challenges with an autoregressive model that generates standardized names of symptoms. We introduce a data augmentation technique to increase the number of training examples for rare symptoms. Experiments on real-life patient vaccine symptom self-reports show that our approach outperforms strong baselines, and that additional examples improve performance on the long-tail entities.
2022.bionlp-1.29 kim-nakashole-2022-data + 10.18653/v1/2022.bionlp-1.29 Improving <fixed-case>R</fixed-case>omanian <fixed-case>B</fixed-case>io<fixed-case>NER</fixed-case> Using a Biologically Inspired System @@ -385,6 +414,7 @@ Recognition of named entities present in text is an important step towards information extraction and natural language understanding. This work presents a named entity recognition system for the Romanian biomedical domain. The system makes use of a new and extended version of the SiMoNERo corpus, which is open-sourced. Also, the best system is available for direct usage in the RELATE platform. 2022.bionlp-1.30 mitrofan-pais-2022-improving + 10.18653/v1/2022.bionlp-1.30 <fixed-case>B</fixed-case>angla<fixed-case>B</fixed-case>io<fixed-case>M</fixed-case>ed: A Biomedical Named-Entity Annotated Corpus for <fixed-case>B</fixed-case>angla (<fixed-case>B</fixed-case>engali) @@ -394,6 +424,7 @@ 2022.bionlp-1.31 sazzed-2022-banglabiomed CoWeSe + 10.18653/v1/2022.bionlp-1.31 <fixed-case>ICDB</fixed-case>ig<fixed-case>B</fixed-case>ird: A Contextual Embedding Model for <fixed-case>ICD</fixed-case> Code Classification @@ -406,6 +437,7 @@ The International Classification of Diseases (ICD) system is the international standard for classifying diseases and procedures during a healthcare encounter and is widely used for healthcare reporting and management purposes. Assigning correct codes for clinical procedures is important for clinical, operational and financial decision-making in healthcare. Contextual word embedding models have achieved state-of-the-art results in multiple NLP tasks. However, these models have yet to achieve state-of-the-art results in the ICD classification task, since one of their main disadvantages is that they can only process documents that contain a small number of tokens, which is rarely the case with real patient notes. In this paper, we introduce ICDBigBird, a BigBird-based model which can integrate a Graph Convolutional Network (GCN), that takes advantage of the relations between ICD codes in order to create ‘enriched’ representations of their embeddings, with a BigBird contextual model that can process larger documents. Our experiments on a real-world clinical dataset demonstrate the effectiveness of our BigBird-based model on the ICD classification task as it outperforms the previous state-of-the-art models. 2022.bionlp-1.32 michalopoulos-etal-2022-icdbigbird + 10.18653/v1/2022.bionlp-1.32 Doctor <fixed-case>XA</fixed-case>v<fixed-case>I</fixed-case>er: Explainable Diagnosis on Physician-Patient Dialogues and <fixed-case>XAI</fixed-case> Evaluation @@ -416,6 +448,7 @@ 2022.bionlp-1.33 ngai-rudzicz-2022-doctor hillary-ngai/doctor_xavier + 10.18653/v1/2022.bionlp-1.33 <fixed-case>DISTANT</fixed-case>-<fixed-case>CTO</fixed-case>: A Zero Cost, Distantly Supervised Approach to Improve Low-Resource Entity Extraction Using Clinical Trials Literature @@ -425,6 +458,7 @@ PICO recognition is an information extraction task for identifying participant, intervention, comparator, and outcome information from clinical literature. Manually identifying PICO information is the most time-consuming step for conducting systematic reviews (SR), which is already labor-intensive. A lack of diversified and large, annotated corpora restricts innovation and adoption of automated PICO recognition systems. The largest-available PICO entity/span corpus is manually annotated, which is too expensive for a majority of the scientific community.
To break through this bottleneck, we propose DISTANT-CTO, a novel distantly supervised PICO entity extraction approach using the clinical trials literature, to generate a massive weakly-labeled dataset with more than a million ‘Intervention’ and ‘Comparator’ entity annotations. We train distant NER (named-entity recognition) models using this weakly-labeled dataset and demonstrate that it outperforms even the sophisticated models trained on the manually annotated dataset with a 2% F1 improvement on the Intervention entity of the PICO benchmark and more than 5% improvement when combined with the manually annotated dataset. We investigate the generalizability of our approach and gain an impressive F1 score on another domain-specific PICO benchmark. The approach is not only zero-cost but is also scalable for a constant stream of PICO entity annotations. 2022.bionlp-1.34 dhrangadhariya-muller-2022-distant + 10.18653/v1/2022.bionlp-1.34 <fixed-case>E</fixed-case>cho<fixed-case>G</fixed-case>en: Generating Conclusions from Echocardiogram Notes @@ -440,6 +474,7 @@ 2022.bionlp-1.35 tang-etal-2022-echogen MIMIC-III + 10.18653/v1/2022.bionlp-1.35 Quantifying Clinical Outcome Measures in Patients with Epilepsy Using the Electronic Health Record @@ -451,6 +486,7 @@ A wealth of important clinical information lies untouched in the Electronic Health Record, often in the form of unstructured textual documents. For patients with Epilepsy, such information includes outcome measures like Seizure Frequency and Dates of Last Seizure, key parameters that guide all therapy for these patients. Transformer models have been able to extract such outcome measures from unstructured clinical note text as sentences with human-like accuracy; however, these sentences are not yet usable in a quantitative analysis for large-scale studies. In this study, we developed a pipeline to quantify these outcome measures. We used text summarization models to convert unstructured sentences into specific formats, and then employed rules-based quantifiers to calculate seizure frequencies and dates of last seizure. We demonstrated that our pipeline of models does not excessively propagate errors and we analyzed its mistakes. We anticipate that our methods can be generalized outside of epilepsy to other disorders to drive large-scale clinical research. 2022.bionlp-1.36 xie-etal-2022-quantifying + 10.18653/v1/2022.bionlp-1.36 Comparing Encoder-Only and Encoder-Decoder Transformers for Relation Extraction from Biomedical Texts: An Empirical Study on Ten Benchmark Datasets @@ -462,6 +498,7 @@ 2022.bionlp-1.37 sarrouti-etal-2022-comparing DDI + 10.18653/v1/2022.bionlp-1.37 Utility Preservation of Clinical Text After De-Identification @@ -471,6 +508,7 @@ Electronic health records contain valuable information about symptoms, diagnosis, treatment and outcomes of the treatments of individual patients. However, the records may also contain information that can reveal the identity of the patients. Removing these identifiers - the Protected Health Information (PHI) - can protect the identity of the patient. Automatic de-identification is a process which employs machine learning techniques to detect and remove PHI. However, automatic techniques are imperfect in their precision and introduce noise into the data. This study examines the impact of this noise on the utility of Swedish de-identified clinical data by using human evaluators and by training and testing BERT models.
Our results indicate that de-identification does not harm the utility for clinical NLP and that human evaluators are less sensitive to noise from de-identification than expected. 2022.bionlp-1.38 vakili-dalianis-2022-utility + 10.18653/v1/2022.bionlp-1.38 Horses to Zebras: Ontology-Guided Data Augmentation and Synthesis for <fixed-case>ICD</fixed-case>-9 Coding @@ -483,6 +521,7 @@ 2022.bionlp-1.39 falis-etal-2022-horses MIMIC-III + 10.18653/v1/2022.bionlp-1.39 Towards Automatic Curation of Antibiotic Resistance Genes via Statement Extraction from Scientific Papers: A Benchmark Dataset and Models @@ -495,6 +534,7 @@ 2022.bionlp-1.40 chandak-etal-2022-towards vt-nlp/sciarg + 10.18653/v1/2022.bionlp-1.40 Model Distillation for Faithful Explanations of Medical Code Predictions @@ -505,6 +545,7 @@ Machine learning models that offer excellent predictive performance often lack the interpretability necessary to support integrated human-machine decision-making. In clinical medicine and other high-risk settings, domain experts may be unwilling to trust model predictions without explanations. Work in explainable AI must balance competing objectives along two different axes: 1) Models should ideally be both accurate and simple. 2) Explanations must balance faithfulness to the model’s decision-making with their plausibility to a domain expert. We propose to use knowledge distillation, or training a student model that mimics the behavior of a trained teacher model, as a technique to generate faithful and plausible explanations. We evaluate our approach on the task of assigning ICD codes to clinical notes to demonstrate that the student model is faithful to the teacher model’s behavior and produces quality natural language explanations. 2022.bionlp-1.41 wood-doughty-etal-2022-model + 10.18653/v1/2022.bionlp-1.41 Towards Generalizable Methods for Automating Risk Score Calculation @@ -521,6 +562,7 @@ liang-etal-2022-towards MIMIC-III emrQA + 10.18653/v1/2022.bionlp-1.42 <fixed-case>D</fixed-case>o<fixed-case>SSIER</fixed-case> at <fixed-case>M</fixed-case>ed<fixed-case>V</fixed-case>id<fixed-case>QA</fixed-case> 2022: Text-based Approaches to Medical Video Answer Localization Problem @@ -534,6 +576,7 @@ 2022.bionlp-1.43 kusa-etal-2022-dossier MedVidQA + 10.18653/v1/2022.bionlp-1.43 diff --git a/data/xml/2022.cmcl.xml b/data/xml/2022.cmcl.xml index ae2be7a3ad..b5d65c0ebc 100644 --- a/data/xml/2022.cmcl.xml +++ b/data/xml/2022.cmcl.xml @@ -31,6 +31,7 @@ DannyMerkx/speech2image COCO ImageNet + 10.18653/v1/2022.cmcl-1.1 A Neural Model for Compositional Word Embeddings and Sentence Processing @@ -40,6 +41,7 @@ We propose a new neural model for word embeddings, which uses Unitary Matrices as the primary device for encoding lexical information. It uses simple matrix multiplication to derive matrices for large units, yielding a sentence processing model that is strictly compositional, does not lose information over time steps, and is transparent, in the sense that word embeddings can be analysed regardless of context. This model does not employ activation functions, and so the network is fully accessible to analysis by the methods of linear algebra at each point in its operation on an input sequence. We test it in two NLP agreement tasks and obtain rule-like perfect accuracy, with greater stability than current state-of-the-art systems.
Our proposed model goes some way towards offering a class of computationally powerful deep learning systems that can be fully understood and compared to human cognitive processes for natural language learning and representation. 2022.cmcl-1.2 lappin-bernardy-2022-neural + 10.18653/v1/2022.cmcl-1.2 Visually Grounded Interpretation of Noun-Noun Compounds in <fixed-case>E</fixed-case>nglish @@ -52,6 +54,7 @@ 2022.cmcl-1.3 lang-etal-2022-visually ImageNet + 10.18653/v1/2022.cmcl-1.3 Less Descriptive yet Discriminative: Quantifying the Properties of Multimodal Referring Utterances via <fixed-case>CLIP</fixed-case> @@ -63,6 +66,7 @@ 2022.cmcl-1.4 takmaz-etal-2022-less ecekt/clip-desc-disc + 10.18653/v1/2022.cmcl-1.4 Codenames as a Game of Co-occurrence Counting @@ -75,6 +79,7 @@ 2022.cmcl-1.5 cserhati-etal-2022-codenames xerevity/codenamesagent + 10.18653/v1/2022.cmcl-1.5 Estimating word co-occurrence probabilities from pretrained static embeddings using a log-bilinear model @@ -83,6 +88,7 @@ We investigate how to use pretrained static word embeddings to deliver improved estimates of bilexical co-occurrence probabilities: conditional probabilities of one word given a single other word in a specific relationship. Such probabilities play important roles in psycholinguistics, corpus linguistics, and usage-based cognitive modeling of language more generally. We propose a log-bilinear model taking pretrained vector representations of the two words as input, enabling generalization based on the distributional information contained in both vectors. We show that this model outperforms baselines in estimating probabilities of adjectives given nouns that they attributively modify, and probabilities of nominal direct objects given their head verbs, given limited training data in Arabic, English, Korean, and Spanish. 2022.cmcl-1.6 futrell-2022-estimating + 10.18653/v1/2022.cmcl-1.6 Modeling the Relationship between Input Distributions and Learning Trajectories with the Tolerance Principle @@ -91,6 +97,7 @@ Child language learners develop with remarkable uniformity, both in their learning trajectories and ultimate outcomes, despite major differences in their learning environments. In this paper, we explore the role that the frequencies and distributions of irregular lexical items in the input play in driving learning trajectories. We conclude that while the Tolerance Principle, a type-based model of productivity learning, accounts for inter-learner uniformity, it also interacts with input distributions to drive cross-linguistic variation in learning trajectories. 2022.cmcl-1.7 kodner-2022-modeling + 10.18653/v1/2022.cmcl-1.7 Predicting scalar diversity with context-driven uncertainty over alternatives @@ -101,6 +108,7 @@ Scalar implicature (SI) arises when a speaker uses an expression (e.g., “some”) that is semantically compatible with a logically stronger alternative on the same scale (e.g., “all”), leading the listener to infer that they did not intend to convey the stronger meaning. Prior work has demonstrated that SI rates are highly variable across scales, raising the question of what factors determine the SI strength for a particular scale. Here, we test the hypothesis that SI rates depend on the listener’s confidence in the underlying scale, which we operationalize as uncertainty over the distribution of possible alternatives conditioned on the context. We use a T5 model fine-tuned on a text infilling task to estimate this distribution.
We find that scale uncertainty predicts human SI rates, measured as entropy over the sampled alternatives and over latent classes among alternatives in sentence embedding space. Furthermore, we do not find a significant effect of the surprisal of the strong scalemate. Our results suggest that pragmatic inferences depend on listeners’ context-driven uncertainty over alternatives. 2022.cmcl-1.8 hu-etal-2022-predicting + 10.18653/v1/2022.cmcl-1.8 Eye Gaze and Self-attention: How Humans and Transformers Attend Words in Sentences @@ -119,6 +127,7 @@ GLUE MovieQA SuperGLUE + 10.18653/v1/2022.cmcl-1.9 About Time: Do Transformers Learn Temporal Verbal Aspect? @@ -130,6 +139,7 @@ 2022.cmcl-1.10 metheniti-etal-2022-time lenakmeth/telicity_classification + 10.18653/v1/2022.cmcl-1.10 Poirot at <fixed-case>CMCL</fixed-case> 2022 Shared Task: Zero Shot Crosslingual Eye-Tracking Data Prediction using Multilingual Transformer Models @@ -138,6 +148,7 @@ Eye tracking data during reading is a useful source of information to understand the cognitive processes that take place during language comprehension. Different languages account for different cognitive triggers; however, there seem to be some uniform indicators across languages. In this paper, we describe our submission to the CMCL 2022 shared task on predicting human reading patterns for a multilingual dataset. Our model uses text representations from transformers and some hand-engineered features with a regression layer on top to predict statistical measures of mean and standard deviation for 2 main eye-tracking features. We train an end-to-end model to extract meaningful information from different languages and test our model on two separate datasets. We compare different transformer models and show ablation studies affecting model performance. Our final submission ranked 4th place for SubTask-1 and 1st place for SubTask-2 for the shared task. 2022.cmcl-1.11 srivastava-2022-poirot + 10.18653/v1/2022.cmcl-1.11 <fixed-case>NU</fixed-case> <fixed-case>HLT</fixed-case> at <fixed-case>CMCL</fixed-case> 2022 Shared Task: Multilingual and Crosslingual Prediction of Human Reading Behavior in Universal Language Space @@ -147,6 +158,7 @@ 2022.cmcl-1.12 imperial-2022-nu imperialite/cmcl2022-unified-eye-tracking-ipa + 10.18653/v1/2022.cmcl-1.12 <fixed-case>H</fixed-case>k<fixed-case>A</fixed-case>msters at <fixed-case>CMCL</fixed-case> 2022 Shared Task: Predicting Eye-Tracking Data from a Gradient Boosting Framework with Linguistic Features @@ -157,6 +169,7 @@ Eye movement data are used in psycholinguistic studies to infer information regarding cognitive processes during reading. In this paper, we describe our proposed method for the Shared Task of Cognitive Modeling and Computational Linguistics (CMCL) 2022 - Subtask 1, which involves data from multiple datasets on 6 languages. We compared different regression models using features of the target word and its previous word, and target word surprisal as regression features. Our final system, using a gradient boosting regressor, achieved the lowest mean absolute error (MAE), resulting in the best system of the competition. 2022.cmcl-1.13 salicchi-etal-2022-hkamsters + 10.18653/v1/2022.cmcl-1.13 <fixed-case>CMCL</fixed-case> 2022 Shared Task on Multilingual and Crosslingual Prediction of Human Reading Behavior @@ -170,6 +183,7 @@ We present the second shared task on eye-tracking data prediction of the Cognitive Modeling and Computational Linguistics Workshop (CMCL).
Differently from the previous edition, participating teams are asked to predict eye-tracking features from multiple languages, including a surprise language for which there were no available training data. Moreover, the task also included the prediction of standard deviations of feature values in order to account for individual differences between readers. A total of six teams registered for the task. For the first subtask on multilingual prediction, the winning team proposed a regression model based on lexical features, while for the second subtask on cross-lingual prediction, the winning team used a hybrid model based on multilingual transformer embeddings as well as statistical features. 2022.cmcl-1.14 hollenstein-etal-2022-cmcl + 10.18653/v1/2022.cmcl-1.14 Team <fixed-case>ÚFAL</fixed-case> at <fixed-case>CMCL</fixed-case> 2022 Shared Task: Figuring out the correct recipe for predicting Eye-Tracking features using Pretrained Language Models @@ -180,6 +194,7 @@ Eye-tracking data is a very useful source of information to study cognition and especially language comprehension in humans. In this paper, we describe our systems for the CMCL 2022 shared task on predicting eye-tracking information. We describe our experiments with pretrained models like BERT and XLM and the different ways in which we used those representations to predict four eye-tracking features. Along with analysing the effect of using two different kinds of pretrained multilingual language models and different ways of pooling the token-level representations, we also explore how contextual information affects the performance of the systems. Finally, we also explore whether factors like augmenting linguistic information affect the predictions. Our submissions achieved an average MAE of 5.72 and ranked 5th in the shared task. The average MAE showed further reduction to 5.25 in post-task evaluation. 2022.cmcl-1.15 bhattacharya-etal-2022-team + 10.18653/v1/2022.cmcl-1.15 Team <fixed-case>DMG</fixed-case> at <fixed-case>CMCL</fixed-case> 2022 Shared Task: Transformer Adapters for the Multi- and Cross-Lingual Prediction of Human Reading Behavior @@ -188,6 +203,7 @@ In this paper, we present the details of our approaches that attained second place in the shared task of the ACL 2022 Cognitive Modeling and Computational Linguistics Workshop. The shared task is focused on multi- and cross-lingual prediction of eye movement features in human reading behavior, which could provide valuable information regarding language processing. To this end, we train ‘adapters’ inserted into the layers of frozen transformer-based pretrained language models. We find that multilingual models equipped with adapters perform well in predicting eye-tracking features. Our results suggest that utilizing language- and task-specific adapters is beneficial and translating test sets into similar languages that exist in the training set could help with zero-shot transferability in the prediction of human reading behavior. 2022.cmcl-1.16 takmaz-2022-team + 10.18653/v1/2022.cmcl-1.16 diff --git a/data/xml/2022.computel.xml b/data/xml/2022.computel.xml index e198a7aca7..6b262d154e 100644 --- a/data/xml/2022.computel.xml +++ b/data/xml/2022.computel.xml @@ -31,6 +31,7 @@ In this paper we present the speech corpus for the Siberian Ingrian Finnish language. The speech corpus includes audio data, annotations, software tools for data-processing, two databases and a web application. We have published part of the audio data and annotations.
The software tool for parsing annotation files and populating a relational database has been developed and published under a free license. A web application has also been developed and is available. At the moment, about 300 words and 200 phrases can be displayed using this web application. 2022.computel-1.1 ubaleht-raudalainen-2022-development + 10.18653/v1/2022.computel-1.1 New syntactic insights for automated <fixed-case>W</fixed-case>olof <fixed-case>U</fixed-case>niversal <fixed-case>D</fixed-case>ependency parsing @@ -39,6 +40,7 @@ Focus on language-specific properties with insights from formal minimalist syntax can improve universal dependency (UD) parsing. Such improvements are especially sensitive for low-resource African languages, like Wolof, which have fewer UD treebanks in number and amount of annotations, and fewer contributing annotators. For two different UD parser pipelines, one parser model was trained on the original Wolof treebank, and one was trained on an edited treebank. For each parser pipeline, the accuracy of the edited treebank was higher than that of the original for both the dependency relations and dependency labels. Accuracy for universal dependency relations improved as much as 2.90%, while accuracy for universal dependency labels increased as much as 3.38%. An annotation scheme that better fits a language’s distinct syntax results in better parsing accuracy. 2022.computel-1.2 dyer-2022-new + 10.18653/v1/2022.computel-1.2 Corpus Development of Kiswahili Speech Recognition Test and Evaluation sets, Preemptively Mitigating Demographic Bias Through Collaboration with Linguists @@ -53,6 +55,7 @@ Language technologies, particularly speech technologies, are becoming more pervasive for access to digital platforms and resources. This brings to the forefront concerns of their inclusivity, first in terms of language diversity. Additionally, research shows speech recognition to be more accurate for men than for women and more accurate for individuals younger than 30 years of age than those older. In the Global South where languages are low resource, these same issues should be taken into consideration in data collection efforts to not replicate these mistakes. It is also important to note that in varying contexts within the Global South, this work presents additional nuance and potential for bias based on accents, related dialects and variants of a language. This paper documents i) the design and execution of a Linguists Engagement for purposes of building an inclusive Kiswahili Speech Recognition dataset, representative of the diversity among speakers of the language ii) the unexpected yet key learning in terms of socio-linguistics, which demonstrates the importance of multi-disciplinarity in teams developing datasets and NLP technologies iii) the creation of a test dataset intended to be used for evaluating the performance of Speech Recognition models on demographic groups that are likely to be underrepresented. 2022.computel-1.3 siminyu-etal-2022-corpus + 10.18653/v1/2022.computel-1.3 <fixed-case>CLD</fixed-case>² Language Documentation Meets Natural Language Processing for Revitalising Endangered Languages @@ -63,6 +66,7 @@ Language revitalisation should not be understood as a direct outcome of language documentation, which is mainly focused on the creation of language repositories.
Natural language processing (NLP) offers the potential to complement and exploit these repositories through the development of language technologies that may contribute to improving the vitality status of endangered languages. In this paper, we discuss the current state of the interaction between language documentation and computational linguistics, present a diagnosis of how the outputs of recent documentation projects for endangered languages are underutilised by the NLP community, and discuss how the situation could change from both the documentary linguistics and NLP perspectives. All this is introduced as a bridging paradigm dubbed Computational Language Documentation and Development (CLD²). CLD² calls for (1) the inclusion of NLP-friendly annotated data as a deliverable of future language documentation projects; and (2) the exploitation of language documentation databases by the NLP community to promote the computerization of endangered languages, as one way to contribute to their revitalization. 2022.computel-1.4 zariquiey-etal-2022-cld2 + 10.18653/v1/2022.computel-1.4 One Wug, Two Wug+s Transformer Inflection Models Hallucinate Affixes @@ -72,6 +76,7 @@ Data augmentation strategies are increasingly important in NLP pipelines for low-resourced and endangered languages, and in neural morphological inflection, augmentation by so-called data hallucination is a popular technique. This paper presents a detailed analysis of inflection models trained with and without data hallucination for the low-resourced Canadian Indigenous language Gitksan. Our analysis reveals evidence for a concatenative inductive bias in augmented models—in contrast to models trained without hallucination, they strongly prefer affixing inflection patterns over suppletive ones. We find that preference for affixation in general improves inflection performance in “wug test”-like settings, where the model is asked to inflect lexemes missing from the training set. However, data hallucination dramatically reduces prediction accuracy for reduplicative forms due to a misanalysis of reduplication as affixation. While the overall impact of data hallucination for unseen lexemes remains positive, our findings call for greater qualitative analysis and more varied evaluation conditions in testing automatic inflection systems. Our results indicate that further innovations in data augmentation for computational morphology are desirable. 2022.computel-1.5 samir-silfverberg-2022-one + 10.18653/v1/2022.computel-1.5 Automated speech tools for helping communities process restricted-access corpora for language revival efforts @@ -88,6 +93,7 @@ Many archival recordings of speech from endangered languages remain unannotated and inaccessible to community members and language learning programs. One bottleneck is the time-intensive nature of annotation. An even narrower bottleneck occurs for recordings with access constraints, such as language that must be vetted or filtered by authorised community members before annotation can begin. We propose a privacy-preserving workflow to widen both bottlenecks for recordings where speech in the endangered language is intermixed with a more widely-used language such as English for meta-linguistic commentary and questions (e.g. What is the word for ‘tree’?).
We integrate voice activity detection (VAD), spoken language identification (SLI), and automatic speech recognition (ASR) to transcribe the metalinguistic content, which an authorised person can quickly scan to triage recordings that can be annotated by people with lower levels of access. We report work-in-progress processing 136 hours of archival audio containing a mix of English and Muruwari. Our collaborative work with the Muruwari custodian of the archival materials shows that this workflow reduces metalanguage transcription time by 20% even given only minimal amounts of annotated training data: 10 utterances per language for SLI, and at most 39 minutes, possibly as little as 39 seconds, for ASR. 2022.computel-1.6 san-etal-2022-automated + 10.18653/v1/2022.computel-1.6 <fixed-case>G</fixed-case><tex-math>_i</tex-math>2<fixed-case>P</fixed-case><tex-math>_i</tex-math> Rule-based, index-preserving grapheme-to-phoneme transformations @@ -105,6 +111,7 @@ This paper describes the motivation and implementation details for a rule-based, index-preserving grapheme-to-phoneme engine ‘G_i2P_i’ implemented in pure Python and released under the open source MIT license. The engine and interface have been designed to prioritize the developer experience of potential contributors without requiring a high level of programming knowledge. ‘G_i2P_i’ already provides mappings for 30 (mostly Indigenous) languages, and the package is accompanied by a web-based interactive development environment, a RESTful API, and extensive documentation to encourage the addition of more mappings in the future. We also present three downstream applications of ‘G_i2P_i’ and show results of a preliminary evaluation. 2022.computel-1.7 pine-etal-2022-gi22pi + 10.18653/v1/2022.computel-1.7 Shallow Parsing for <fixed-case>N</fixed-case>epal <fixed-case>B</fixed-case>hasa Complement Clauses @@ -115,6 +122,7 @@ Accelerating the process of data collection, annotation, and analysis is an urgent need for linguistic fieldwork and documentation of endangered languages (Bird, 2009). Our experiments describe how we maximize the quality of the Nepal Bhasa syntactic complement structure chunking model. Native speaker language consultants were trained to annotate a minimally selected raw data set (Suárez et al., 2019). The embedded clauses, matrix verbs, and embedded verbs are annotated. We apply both statistical training algorithms and transfer learning in our training, including Naive Bayes, MaxEnt, and fine-tuning the pre-trained mBERT model (Devlin et al., 2018). We show that with limited annotated data, the model is already sufficient for the task. The modeling resources we used are largely available for many other endangered languages. The practice is easy to duplicate for training a shallow parser for other endangered languages in general. 2022.computel-1.8 zhang-etal-2022-shallow + 10.18653/v1/2022.computel-1.8 Using <fixed-case>LARA</fixed-case> to create image-based and phonetically annotated multimodal texts for endangered languages @@ -131,6 +139,7 @@ We describe recent extensions to the open source Learning And Reading Assistant (LARA) supporting image-based and phonetically annotated texts. We motivate the utility of these extensions both in general and specifically in relation to endangered and archaic languages, and illustrate with examples from the revived Australian language Barngarla, Icelandic Sign Language, Irish Gaelic, Old Norse manuscripts and Egyptian hieroglyphics.
2022.computel-1.9 bedi-etal-2022-using + 10.18653/v1/2022.computel-1.9 Recovering Text from Endangered Languages Corrupted <fixed-case>PDF</fixed-case> documents @@ -139,6 +148,7 @@ In this paper we present an approach to efficiently recover texts from corrupted documents of endangered languages. Textual resources for such languages are scarce, and sometimes the few available resources are corrupted PDF documents. Endangered languages are not supported by standard tools and present the additional difficulty of not possessing any corpus on which to train language models to assist with the recovery. The approach presented is able to fully recover born-digital PDF documents with minimal effort, thereby helping the preservation effort of endangered languages, by extending the range of documents usable for corpus building. 2022.computel-1.10 stefanovitch-2022-recovering + 10.18653/v1/2022.computel-1.10 Learning Through Transcription @@ -148,6 +158,7 @@ Transcribing speech for primarily oral, local languages is often a joint effort involving speakers and outsiders. It is commonly motivated by externally-defined scientific goals, alongside local motivations such as language acquisition and access to heritage materials. We explore the task of ‘learning through transcription’ through the design of a system for collaborative speech annotation. We have developed a prototype to support local and remote learner-speaker interactions in remote Aboriginal communities in northern Australia. We show that situated systems design for inclusive non-expert practice is a promising new direction for working with speakers of local languages. 2022.computel-1.11 bettinson-bird-2022-learning + 10.18653/v1/2022.computel-1.11 Developing a Part-Of-Speech tagger for te reo <fixed-case>M</fixed-case>āori @@ -160,6 +171,7 @@ This paper discusses the development of a Part-of-Speech tagger for te reo Māori, the Indigenous language of Aotearoa, also known as New Zealand (see Morrison). Henceforth, Part-of-Speech will be referred to as POS throughout this paper and te reo Māori will be referred to as Māori, while Universal Dependencies will be referred to as UD. Prior to the development of this tagger, there was no POS tagger for Māori from Aotearoa. POS taggers tag words according to their syntactic or grammatical category. However, many traditional syntactic categories, and by consequence POS labels, do not “work for” Māori. By this we mean that, for some of the traditional categories: the definition of, or guidelines for, an existing category are not suitable for Māori; there is no existing category for certain word classes of Māori; and they do not reflect a Māori worldview of the Māori language. We wanted a tagset that is usable with industry-wide tools, but we also needed a tagset that would meet the needs of Māori. Therefore, we based our tagset and guidelines on the UD tagset and tagging conventions, however the categorization of words has been significantly altered to be appropriate for Māori. This is because at the time of development of our POS tagger, the UD conventions had still not been used to tag a Polynesian language such as Māori, nor did they provide any guidelines about how to tag them. To that end, we worked with highly-proficient, specially-selected Māori speakers and linguists who are specialists in Māori. This has ensured that our POS labels and guidelines faithfully reflect a Māori speaker’s conceptualization of their language.
2022.computel-1.12 finn-etal-2022-developing + 10.18653/v1/2022.computel-1.12 Challenges and Perspectives for Innu-Aimun within Indigenous Language Technologies @@ -171,6 +183,7 @@ Innu-Aimun is an Algonquian language spoken in Eastern Canada. It is the language of the Innu, an Indigenous people that now lives for the most part in a dozen communities across Quebec and Labrador. Although it is alive, Innu-Aimun faces important preservation and revitalization challenges and issues. The state of its technology is still nascent, with very few existing applications. This paper proposes a first survey of the available linguistic resources and existing technology for Innu-Aimun. Considering the existing linguistic and textual resources, we argue that developing language technology is feasible and propose first steps towards NLP applications like machine translation. The goal of developing such technologies is first and foremost to help efforts in improving language transmission and cultural safety and preservation for Innu-Aimun speakers, as those are considered urgent and vital issues. Finally, we discuss the importance of close collaboration and consultation with the Innu community in order to ensure that language technologies are developed respectfully and in accordance with that goal. 2022.computel-1.13 cadotte-etal-2022-challenges + 10.18653/v1/2022.computel-1.13 Using Speech and <fixed-case>NLP</fixed-case> Resources to build an i<fixed-case>CALL</fixed-case> platform for a minority language, the story of An Scéalaí, the <fixed-case>I</fixed-case>rish experience to date @@ -184,6 +197,7 @@ This paper describes how emerging linguistic resources and technologies can be used to build a language learning platform for Irish, an endangered language. This platform, An Scéalaí, harvests learner corpora - a vital resource both to study the stages of learners’ language acquisition and to guide future platform development. A technical description of the platform is provided, including details of how different speech technologies and linguistic resources are fused to provide a holistic learner experience. The active continuous participation of the community, and platform evaluations by learners and teachers, are discussed. 2022.computel-1.14 ni-chiarain-etal-2022-using + 10.18653/v1/2022.computel-1.14 Closing the <fixed-case>NLP</fixed-case> Gap: Documentary Linguistics and <fixed-case>NLP</fixed-case> Need a Shared Software Infrastructure @@ -192,6 +206,7 @@ For decades, researchers in natural language processing and computational linguistics have been developing models and algorithms that aim to serve the needs of language documentation projects. However, these models have seen little use in language documentation despite their great potential for making documentary linguistic artefacts better and easier to produce. In this work, we argue that a major reason for this NLP gap is the lack of a strong foundation of application software which can on the one hand serve the complex needs of language documentation and on the other hand provide effortless integration with NLP models. We further present and describe a work-in-progress system we have developed to serve this need, Glam. 2022.computel-1.15 gessler-2022-closing + 10.18653/v1/2022.computel-1.15 Can We Use Word Embeddings for Enhancing <fixed-case>G</fixed-case>uarani-<fixed-case>S</fixed-case>panish Machine Translation?
@@ -203,6 +218,7 @@ 2022.computel-1.16 gongora-etal-2022-use sgongora27/Guarani-embeddings-for-MT + 10.18653/v1/2022.computel-1.16 Faoi Gheasa an adaptive game for <fixed-case>I</fixed-case>rish language learning @@ -213,6 +229,7 @@ In this paper, we present a game with a purpose (GWAP) (Von Ahn 2006). The aim of the game is to promote language learning and ‘noticing’ (Skehan, 2013). The game has been designed for Irish, but the framework could be used for other languages. Irish is a minority language which means that L2 learners have limited opportunities for exposure to the language, and additionally, there are also limited (digital) learning resources available. This research incorporates game development, language pedagogy and ICALL language materials development. This paper will focus on the language materials development as this is a bottleneck in the teaching and learning of minority and endangered languages. 2022.computel-1.17 xu-etal-2022-faoi + 10.18653/v1/2022.computel-1.17 Using Graph-Based Methods to Augment Online Dictionaries of Endangered Languages @@ -224,6 +241,7 @@ Many endangered Uralic languages have multilingual machine readable dictionaries saved in an XML format. However, the dictionaries cover translations very inconsistently between language pairs, for instance, the Livonian dictionary has some translations to Finnish, Latvian and Estonian, and the Komi-Zyrian dictionary has some translations to Finnish, English and Russian. We utilize graph-based approaches to augment such dictionaries by predicting new translations to existing and new languages based on different dictionaries for endangered languages and Wiktionaries. Our study focuses on the lexical resources for Komi-Zyrian (kpv), Erzya (myv) and Livonian (liv). We evaluate our approach by human judges fluent in the three endangered languages in question. Based on the evaluation, the method predicted good or acceptable translations 77% of the time. Furthermore, we train a neural prediction model to predict the quality of the automatically predicted translations with an 81% accuracy. The resulting extensions to the dictionaries are made available on the online dictionary platform used by the speakers of these languages. 2022.computel-1.18 alnajjar-etal-2022-using + 10.18653/v1/2022.computel-1.18 Reusing a Multi-lingual Setup to Bootstrap a Grammar Checker for a Very Low Resource Language without Data @@ -234,6 +252,7 @@ Grammar checkers (GEC) are needed for digital language survival. Very low resource languages like Lule Sámi with less than 3,000 speakers need to hurry to build these tools, but do not have the big corpus data that are required for the construction of machine learning tools. We present a rule-based tool and a workflow where the work done for a related language can speed up the process. We use an existing grammar to infer rules for the new language, and we do not need a large gold corpus of annotated grammar errors, but a smaller corpus of regression tests is built while developing the tool. We present a test case for Lule Sámi reusing resources from North Sámi, show how we achieve a categorisation of the most frequent errors, and present a preliminary evaluation of the system. We hope this serves as an inspiration for small languages that need advanced tools in a limited amount of time, but do not have big data. 
2022.computel-1.19 lill-sigga-mikkelsen-etal-2022-reusing + 10.18653/v1/2022.computel-1.19 A Word-and-Paradigm Workflow for Fieldwork Annotation @@ -246,6 +265,7 @@ There are many challenges in morphological fieldwork annotation: it heavily relies on segmentation and feature labeling (which have both practical and theoretical drawbacks), it is time-intensive, and the annotator needs to be linguistically trained and may still annotate things inconsistently. We propose a workflow that relies on unsupervised and active learning grounded in Word-and-Paradigm morphology (WP). Machine learning has the potential to greatly accelerate the annotation process and allow a human annotator to focus on problematic cases, while the WP approach makes for an annotation system that is word-based and relational, removing the need to make decisions about feature labeling and segmentation early in the process and allowing speakers of the language of interest to participate more actively, since linguistic training is not necessary. We present a proof-of-concept for the first step of the workflow; in a realistic fieldwork setting, annotators can process hundreds of forms per hour. 2022.computel-1.20 copot-etal-2022-word + 10.18653/v1/2022.computel-1.20 Fine-tuning pre-trained models for Automatic Speech Recognition, experiments on a fieldwork corpus of Japhug (Trans-Himalayan family) @@ -263,6 +283,7 @@ This is a report on results obtained in the development of speech recognition tools intended to support linguistic documentation efforts. The test case is an extensive fieldwork corpus of Japhug, an endangered language of the Trans-Himalayan (Sino-Tibetan) family. The goal is to reduce the transcription workload of field linguists. The method used is a deep learning approach based on the language-specific tuning of a generic pre-trained representation model, XLS-R, using a Transformer architecture. We note difficulties in implementation in terms of learning stability, but this approach nonetheless brings significant improvements. The quality of phonemic transcription is improved over earlier experiments; and most significantly, the new approach allows for reaching the stage of automatic word recognition. Subjective evaluation of the tool by the author of the training data confirms the usefulness of this approach. 2022.computel-1.21 guillaume-etal-2022-fine + 10.18653/v1/2022.computel-1.21 Morphologically annotated corpora of Pomak @@ -280,6 +301,7 @@ The project XXXX is developing a platform to enable researchers of living languages to easily create and make available state-of-the-art spoken and textual annotated resources. As a case study we use Greek and Pomak, the latter being an endangered oral Slavic language of the Balkans (including Thrace/Greece). The linguistic documentation of Pomak is an ongoing work by an interdisciplinary team in close cooperation with the Pomak community of Greece. We describe our experience in the development of a Latin-based orthography and morphologically annotated text corpora of Pomak with state-of-the-art NLP technology. These resources will be made openly available on the XXXX site and the gold annotated corpora of Pomak will be made available on the Universal Dependencies treebank repository.
2022.computel-1.22 jusuf-karahoga-etal-2022-morphologically + 10.18653/v1/2022.computel-1.22 Enhancing Documentation of <fixed-case>H</fixed-case>upa with Automatic Speech Recognition @@ -290,6 +312,7 @@ This study investigates applications of automatic speech recognition (ASR) techniques to Hupa, a critically endangered Native American language from the Dene (Athabaskan) language family. Using around 9h12m of spoken data produced by one elder who is a first-language Hupa speaker, we experimented with different evaluation schemes and training settings. On average a fully connected deep neural network reached a word error rate of 35.26%. Our overall results illustrate the utility of ASR for making Hupa language documentation more accessible and usable. In addition, we found that when training acoustic models, using recordings with transcripts that were not carefully verified did not necessarily have a negative effect on model performance. This shows promise for speech corpora of indigenous languages that commonly include transcriptions produced by second-language speakers or linguists who have advanced knowledge in the language of interest. 2022.computel-1.23 liu-etal-2022-enhancing + 10.18653/v1/2022.computel-1.23 diff --git a/data/xml/2022.constraint.xml b/data/xml/2022.constraint.xml index 0cbdbf1a75..ca2e33c2ed 100644 --- a/data/xml/2022.constraint.xml +++ b/data/xml/2022.constraint.xml @@ -33,6 +33,7 @@ We present the findings of the shared task at the CONSTRAINT 2022 Workshop: Hero, Villain, and Victim: Dissecting harmful memes for Semantic role labeling of entities. The task aims to delve deeper into the domain of meme comprehension by deciphering the connotations behind the entities present in a meme. In more nuanced terms, the shared task focuses on determining the victimizing, glorifying, and vilifying intentions embedded in meme entities to explicate their connotations. To this end, we curate HVVMemes, a novel meme dataset of about 7000 memes spanning the domains of COVID-19 and US Politics, each containing entities and their associated roles: hero, villain, victim, or none. The shared task attracted 105 participants, but eventually only 6 submissions were made. Most of the successful submissions relied on fine-tuning pre-trained language and multimodal models along with ensembles. The best submission achieved an F1-score of 58.67. 2022.constraint-1.1 sharma-etal-2022-findings + 10.18653/v1/2022.constraint-1.1 <fixed-case>DD</fixed-case>-<fixed-case>TIG</fixed-case> at Constraint@<fixed-case>ACL</fixed-case>2022: Multimodal Understanding and Reasoning for Role Labeling of Entities in Hateful Memes @@ -47,6 +48,7 @@ zhou-etal-2022-dd Hateful Memes VCR + 10.18653/v1/2022.constraint-1.2 Are you a hero or a villain? A semantic role labelling approach for detecting harmful memes. @@ -60,6 +62,7 @@ Identifying good and evil through representations of victimhood, heroism, and villainy (i.e., role labeling of entities) has recently caught the research community’s interest. Because of the growing popularity of memes, the amount of offensive information published on the internet is expanding at an alarming rate. It generated a larger need to address this issue and analyze the memes for content moderation. Framing is used to show the entities engaged as heroes, villains, victims, or others so that readers may better anticipate and understand their attitudes and behaviors as characters. 
Positive phrases are used to characterize heroes, whereas negative terms depict victims and villains, and terms that tend to be neutral are mapped to others. In this paper, we propose two approaches to role-label the entities of the meme as hero, villain, victim, or other through Named-Entity Recognition (NER), Sentiment Analysis, etc. With an F1-score of 23.855, our team secured eighth position in the Shared Task @ Constraint 2022. 2022.constraint-1.3 fharook-etal-2022-hero + 10.18653/v1/2022.constraint-1.3 Logically at the Constraint 2022: Multimodal role labelling @@ -70,6 +73,7 @@ This paper describes our system for the Constraint 2022 challenge at ACL 2022, whose goal is to detect which entities are glorified, vilified or victimised within a meme. The task should be done considering the perspective of the meme’s author. In our work, the challenge is treated as a multi-class classification task. For a given pair of a meme and an entity, we need to classify whether the entity is being referenced as a Hero, a Villain, a Victim or Other. Our solution ensembles different models: a unimodal (text-only) model and a multimodal (text + image) model. We conduct several experiments and benchmark different competitive pre-trained transformers and vision models in this work. Our solution, based on an ensembling method, is ranked first on the leaderboard and obtains a macro F1-score of 0.58 on the test set. The code for the experiments and results are available at https://bitbucket.org/logicallydevs/constraint_2022/src/master/ 2022.constraint-1.4 kun-etal-2022-logically + 10.18653/v1/2022.constraint-1.4 Combining Language Models and Linguistic Information to Label Entities in Memes @@ -80,6 +84,7 @@ This paper describes the system we developed for the shared task ‘Hero, Villain and Victim: Dissecting harmful memes for Semantic role labelling of entities’ organised in the framework of the Second Workshop on Combating Online Hostile Posts in Regional Languages during Emergency Situation (Constraint 2022). We present an ensemble approach combining transformer-based models and linguistic information, such as the presence of irony and implicit sentiment associated with the target named entities. The ensemble system obtains promising classification scores, resulting in a third place finish in the competition. 2022.constraint-1.5 singh-etal-2022-combining + 10.18653/v1/2022.constraint-1.5 Detecting the Role of an Entity in Harmful Memes: Techniques and their Limitations @@ -93,6 +98,7 @@ robi56/harmful_memes_block_fusion Hateful Memes Hateful Memes Challenge + 10.18653/v1/2022.constraint-1.6 Fine-tuning and Sampling Strategies for Multimodal Role Labeling of Entities under Class Imbalance @@ -104,6 +110,7 @@ We propose our solution to the multimodal semantic role labeling task from the CONSTRAINT’22 workshop. The task aims at classifying entities in memes into classes such as “hero” and “villain”. We use several pre-trained multi-modal models to jointly encode the text and image of the memes, and implement three systems to classify the role of the entities. We propose dynamic sampling strategies to tackle the issue of class imbalance. Finally, we perform qualitative analysis on the representations of the entities.
2022.constraint-1.7 montariol-etal-2022-fine + 10.18653/v1/2022.constraint-1.7 Document Retrieval and Claim Verification to Mitigate <fixed-case>COVID</fixed-case>-19 Misinformation @@ -119,6 +126,7 @@ sundriyal-etal-2022-document CORD-19 FEVER + 10.18653/v1/2022.constraint-1.8 <fixed-case>M</fixed-case>-<fixed-case>BAD</fixed-case>: A Multilabel Dataset for Detecting Aggressive Texts and Their Targets @@ -129,6 +137,7 @@ Recently, detection and categorization of undesired (e.g., aggressive, abusive, offensive, hate) content from online platforms has grabbed the attention of researchers because of its detrimental impact on society. Several attempts have been made to mitigate the usage and propagation of such content. However, most past studies were conducted primarily for English, where low-resource languages like Bengali remained out of focus. Therefore, to facilitate research in this arena, this paper introduces a novel multilabel Bengali dataset (named M-BAD) containing 15650 texts to detect aggressive texts and their targets. Each text of M-BAD went through rigorous two-level annotations. At the primary level, each text is labelled as either aggressive or non-aggressive. At the secondary level, the aggressive texts have been further annotated into five fine-grained target classes: religion, politics, verbal, gender and race. Baseline experiments are carried out with different machine learning (ML), deep learning (DL) and transformer models, where Bangla-BERT acquired the highest weighted f_1-score in both detection (0.92) and target identification (0.83) tasks. Error analysis of the models exhibits the difficulty of identifying context-dependent aggression, and this work argues that further research is required to address these issues. 2022.constraint-1.9 sharif-etal-2022-bad + 10.18653/v1/2022.constraint-1.9 How does fake news use a thumbnail?
<fixed-case>CLIP</fixed-case>-based Multimodal Detection on the Unrepresentative News Image @@ -141,6 +150,7 @@ 2022.constraint-1.10 choi-etal-2022-fake ssu-humane/fake-news-thumbnail + 10.18653/v1/2022.constraint-1.10 Detecting False Claims in Low-Resource Regions: A Case Study of Caribbean Islands @@ -153,6 +163,7 @@ 2022.constraint-1.11 lucas-etal-2022-detecting CoAID + 10.18653/v1/2022.constraint-1.11 diff --git a/data/xml/2022.csrr.xml b/data/xml/2022.csrr.xml index fe8dac8214..e0b4e93554 100644 --- a/data/xml/2022.csrr.xml +++ b/data/xml/2022.csrr.xml @@ -34,6 +34,7 @@ CommonsenseQA ConceptNet OpenBookQA + 10.18653/v1/2022.csrr-1.1 Cloze Evaluation for Deeper Understanding of Commonsense Stories in <fixed-case>I</fixed-case>ndonesian @@ -45,6 +46,7 @@ 2022.csrr-1.2 koto-etal-2022-cloze ROCStories + 10.18653/v1/2022.csrr-1.2 Psycholinguistic Diagnosis of Language Models’ Commonsense Reasoning @@ -55,6 +57,7 @@ cong-2022-psycholinguistic yancong222/pragamtics-commonsense-lms SuperGLUE + 10.18653/v1/2022.csrr-1.3 Bridging the Gap between Recognition-level Pre-training and Commonsensical Vision-language Tasks @@ -71,6 +74,7 @@ Conceptual Captions VCR Visual Question Answering + 10.18653/v1/2022.csrr-1.4 Materialized Knowledge Bases from Commonsense Transformers @@ -82,6 +86,7 @@ nguyen-razniewski-2022-materialized ConceptNet WebText + 10.18653/v1/2022.csrr-1.5 Knowledge-Augmented Language Models for Cause-Effect Relation Classification @@ -96,6 +101,7 @@ BCOPA-CE COPA TCR + 10.18653/v1/2022.csrr-1.6 <fixed-case>CURIE</fixed-case>: An Iterative Querying Approach for Reasoning About Situations @@ -114,6 +120,7 @@ QuaRTz QuaRel WIQA + 10.18653/v1/2022.csrr-1.7 diff --git a/data/xml/2022.deelio.xml b/data/xml/2022.deelio.xml index 029bb68a5e..516c647df5 100644 --- a/data/xml/2022.deelio.xml +++ b/data/xml/2022.deelio.xml @@ -24,6 +24,7 @@ Cross-lingual transfer learning typically involves training a model on a high-resource source language and applying it to a low-resource target language. In this work we introduce a lexical database called Valency Patterns Leipzig (ValPal) which provides the argument pattern information about various verb-forms in multiple languages including low-resource languages. We also provide a framework to integrate the ValPal database knowledge into the state-of-the-art LSTM-based model for cross-lingual semantic role labelling. Experimental results show that integrating such knowledge resulted in an improvement in the performance of the model on all the target languages on which it is evaluated. 2022.deelio-1.1 choudhary-oriordan-2022-cross + 10.18653/v1/2022.deelio-1.1 How Do Transformer-Architecture Models Address Polysemy of <fixed-case>K</fixed-case>orean Adverbial Postpositions?
@@ -34,6 +35,7 @@ 2022.deelio-1.2 2022.deelio-1.2.software.zip mun-desagulier-2022-transformer + 10.18653/v1/2022.deelio-1.2 Query Generation with External Knowledge for Dense Retrieval @@ -52,6 +54,7 @@ SciDocs SciFact SimpleQuestions + 10.18653/v1/2022.deelio-1.3 Uncovering Values: Detecting Latent Moral Content from Natural Language with Explainable and Non-Trained Methods @@ -67,6 +70,7 @@ asprino-etal-2022-uncovering stendoipanni/moraldilemmas DBpedia + 10.18653/v1/2022.deelio-1.4 Jointly Identifying and Fixing Inconsistent Readings from Information Extraction Systems @@ -79,6 +83,7 @@ padia-etal-2022-jointly FEVER TACRED + 10.18653/v1/2022.deelio-1.5 <fixed-case>KIQA</fixed-case>: Knowledge-Infused Question Answering Model for Financial Table-Text Data @@ -89,6 +94,7 @@ While entity retrieval models continue to advance their capabilities, our understanding of their wide-ranging applications is limited, especially in domain-specific settings. We highlighted this issue by using recent general-domain entity-linking models, LUKE and GENRE, to inject external knowledge into a question-answering (QA) model for a financial QA task with a hybrid tabular-textual dataset. We found that both models improved the baseline model by 1.57% overall and 8.86% on textual data. Nonetheless, the challenge remains as they still struggle to handle tabular inputs. We subsequently conducted a comprehensive attention-weight analysis, revealing how LUKE utilizes external knowledge supplied by GENRE. The analysis also elaborates how the injection of symbolic knowledge can be helpful and what needs further improvement, paving the way for future research on this challenging QA task and advancing our understanding of how a language model incorporates external knowledge. 2022.deelio-1.6 nararatwong-etal-2022-kiqa + 10.18653/v1/2022.deelio-1.6 Trans-<fixed-case>KBLSTM</fixed-case>: An External Knowledge Enhanced Transformer <fixed-case>B</fixed-case>i<fixed-case>LSTM</fixed-case> Model for Tabular Reasoning @@ -101,6 +107,7 @@ varun-etal-2022-trans ConceptNet GLUE + 10.18653/v1/2022.deelio-1.7 Fast Few-shot Debugging for <fixed-case>NLU</fixed-case> Test Suites @@ -113,6 +120,7 @@ malon-etal-2022-fast necla-ml/debug-test-suites SST + 10.18653/v1/2022.deelio-1.8 On Masked Language Models for Contextual Link Prediction @@ -123,6 +131,7 @@ In the real world, many relational facts require context; for instance, a politician holds a given elected position only for a particular timespan. This context (the timespan) is typically ignored in knowledge graph link prediction tasks, or is leveraged by models designed specifically to make use of it (i.e. n-ary link prediction models). Here, we show that the task of n-ary link prediction is easily performed using language models, applied with a basic method for constructing cloze-style query sentences. We introduce a pre-training methodology based around an auxiliary entity-linked corpus that outperforms other popular pre-trained models like BERT, even with a smaller model. This methodology also enables n-ary link prediction without access to any n-ary training set, which can be invaluable in circumstances where expensive and time-consuming curation of n-ary knowledge graphs is not feasible. We achieve state-of-the-art performance on the primary n-ary link prediction dataset WD50K and on WikiPeople facts that include literals - typically ignored by knowledge graph embedding methods. 
2022.deelio-1.9 brayne-etal-2022-masked + 10.18653/v1/2022.deelio-1.9 What Makes Good In-Context Examples for <fixed-case>GPT</fixed-case>-3? @@ -144,6 +153,7 @@ SNLI SST TriviaQA + 10.18653/v1/2022.deelio-1.10 diff --git a/data/xml/2022.dialdoc.xml b/data/xml/2022.dialdoc.xml index 24ca4287e8..8bf38b5cde 100644 --- a/data/xml/2022.dialdoc.xml +++ b/data/xml/2022.dialdoc.xml @@ -27,6 +27,7 @@ 2022.dialdoc-1.1 feng-etal-2022-msamsum xcfcode/msamsum + 10.18653/v1/2022.dialdoc-1.1 <fixed-case>U</fixed-case>ni<fixed-case>DS</fixed-case>: A Unified Dialogue System for Chit-Chat and Task-oriented Dialogues @@ -43,6 +44,7 @@ With the advances in deep learning, tremendous progress has been made with chit-chat dialogue systems and task-oriented dialogue systems. However, these two systems are often tackled separately in current methods. To achieve more natural interaction with humans, dialogue systems need to be capable of both chatting and accomplishing tasks. To this end, we propose a unified dialogue system (UniDS) with the two aforementioned skills. In particular, we design a unified dialogue data schema, compatible with both chit-chat and task-oriented dialogues. Besides, we propose a two-stage training method to train UniDS based on the unified dialogue data schema. UniDS does not need to add extra parameters to existing chit-chat dialogue systems. Experimental results demonstrate that the proposed UniDS performs comparably to the state-of-the-art chit-chat dialogue systems and task-oriented dialogue systems. More importantly, UniDS achieves better robustness than pure dialogue systems and a satisfactory ability to switch between the two types of dialogues. 2022.dialdoc-1.2 zhao-etal-2022-unids + 10.18653/v1/2022.dialdoc-1.2 Low-Resource Adaptation of Open-Domain Generative Chatbots @@ -57,6 +59,7 @@ Blended Skill Talk ConvAI2 QReCC + 10.18653/v1/2022.dialdoc-1.3 Pseudo Ambiguous and Clarifying Questions Based on Sentence Structures Toward Clarifying Question Answering System @@ -70,6 +73,7 @@ 2022.dialdoc-1.4 nakano-etal-2022-pseudo HotpotQA + 10.18653/v1/2022.dialdoc-1.4 Parameter-Efficient Abstractive Question Answering over Tables or Text @@ -82,6 +86,7 @@ pal-etal-2022-parameter kolk/pea-qa NarrativeQA + 10.18653/v1/2022.dialdoc-1.5 Conversation- and Tree-Structure Losses for Dialogue Disentanglement @@ -93,6 +98,7 @@ When multiple conversations occur simultaneously, a listener must decide which conversation each utterance is part of in order to interpret and respond to it appropriately. This task is referred to as dialogue disentanglement. A significant drawback of previous studies on disentanglement is that they focus only on pair-wise relationships between utterances while neglecting the conversation structure, which is important for conversation structure modeling. In this paper, we propose a hierarchical model, named Dialogue BERT (DIALBERT), which integrates the local and global semantics in the context range by using BERT to encode each message-pair and using BiLSTM to aggregate the chronological context information into the output of BERT. In order to integrate the conversation structure information into the model, two types of loss are designed: a conversation-structure loss and a tree-structure loss. In this way, our model can implicitly learn and leverage the conversation structures without being restricted by the lack of explicit access to such structures during the inference stage.
Experimental results on two large datasets show that our method outperforms previous methods by substantial margins, achieving strong performance on dialogue disentanglement. 2022.dialdoc-1.6 li-etal-2022-conversation + 10.18653/v1/2022.dialdoc-1.6 Conversational Search with Mixed-Initiative - Asking Good Clarification Questions backed-up by Passage Retrieval @@ -104,6 +110,7 @@ We deal with the scenario of conversational search, where user queries are under-specified or ambiguous. This calls for a mixed-initiative setup: the user asks (queries) and the system answers, but the system also asks (clarification questions) and the user responds, in order to clarify her information needs. We focus on the task of selecting the next clarification question, given the conversation context. Our method leverages passage retrieval from background content to fine-tune two deep-learning models for ranking candidate clarification questions. We evaluated our method on two different use-cases. The first is open-domain conversational search in a large web collection. The second is a task-oriented customer-support setup. We show that our method performs well on both use-cases. 2022.dialdoc-1.7 mass-etal-2022-conversational + 10.18653/v1/2022.dialdoc-1.7 Graph-combined Coreference Resolution Methods on Conversational Machine Reading Comprehension with Pre-trained Language Model @@ -115,6 +122,7 @@ wang-komatani-2022-graph CANARD CoQA + 10.18653/v1/2022.dialdoc-1.8 Construction of Hierarchical Structured Knowledge-based Recommendation Dialogue Dataset and Dialogue System @@ -127,6 +135,7 @@ kodama-etal-2022-construction KdConv Wizard of Wikipedia + 10.18653/v1/2022.dialdoc-1.9 Retrieval-Free Knowledge-Grounded Dialogue Response Generation with Adapters @@ -144,6 +153,7 @@ xu-etal-2022-retrieval hltchkust/knowexpert Wizard of Wikipedia + 10.18653/v1/2022.dialdoc-1.10 G4: Grounding-guided Goal-oriented Dialogues Generation with Multiple Documents @@ -157,6 +167,7 @@ 2022.dialdoc-1.11 zhang-etal-2022-g4 MultiDoc2Dial + 10.18653/v1/2022.dialdoc-1.11 <fixed-case>U</fixed-case><fixed-case>G</fixed-case>ent-<fixed-case>T2K</fixed-case> at the 2nd <fixed-case>D</fixed-case>ial<fixed-case>D</fixed-case>oc Shared Task: A Retrieval-Focused Dialog System Grounded in Multiple Documents @@ -172,6 +183,7 @@ Doc2Dial MultiDoc2Dial doc2dial + 10.18653/v1/2022.dialdoc-1.12 Grounded Dialogue Generation with Cross-encoding Re-ranker, Grounding Span Prediction, and Passage Dropout @@ -186,6 +198,7 @@ MultiDoc2Dial presents an important challenge on modeling dialogues grounded with multiple documents. This paper proposes a pipeline system of “retrieve, re-rank, and generate”, where each component is individually optimized. This enables the passage re-ranker and response generator to fully exploit training with ground-truth data. Furthermore, we use a deep cross-encoder trained with localized hard negative passages from the retriever. For the response generator, we use grounding span prediction as an auxiliary task to be jointly trained with the main task of response generation. We also adopt a passage dropout and regularization technique to improve response generation performance. Experimental results indicate that the system clearly surpasses the competitive baseline and our team CPII-NLP ranked 1st among the public submissions on ALL four leaderboards based on the sum of F1, SacreBLEU, METEOR and RougeL scores.
2022.dialdoc-1.13 li-etal-2022-grounded + 10.18653/v1/2022.dialdoc-1.13 A Knowledge storage and semantic space alignment Method for Multi-documents dialogue generation @@ -200,6 +213,7 @@ CoQA MultiDoc2Dial QuAC + 10.18653/v1/2022.dialdoc-1.14 Improving Multiple Documents Grounded Goal-Oriented Dialog Systems via Diverse Knowledge Enhanced Pretrained Language Model @@ -216,6 +230,7 @@ jang-etal-2022-improving CoQA MultiDoc2Dial + 10.18653/v1/2022.dialdoc-1.15 Docalog: Multi-document Dialogue System using Transformer-based Span Retrieval @@ -234,6 +249,7 @@ MultiDoc2Dial QuAC doc2dial + 10.18653/v1/2022.dialdoc-1.16 R3 : Refined Retriever-Reader pipeline for Multidoc2dial @@ -256,6 +272,7 @@ Natural Questions QuAC doc2dial + 10.18653/v1/2022.dialdoc-1.17 <fixed-case>D</fixed-case>ial<fixed-case>D</fixed-case>oc 2022 Shared Task: Open-Book Document-grounded Dialogue Modeling @@ -269,6 +286,7 @@ Doc2Dial MultiDoc2Dial doc2dial + 10.18653/v1/2022.dialdoc-1.18 <fixed-case>TRUE</fixed-case>: Re-evaluating Factual Consistency Evaluation @@ -292,6 +310,7 @@ GLUE PAWS VitaminC + 10.18653/v1/2022.dialdoc-1.19 Handling Comments in Documents through Interactions @@ -301,6 +320,7 @@ Comments are widely used by users in collaborative documents every day. The documents’ comments enable collaborative editing and review dynamics, transforming each document into a context-sensitive communication channel. Understanding the role of comments in communication dynamics within documents is the first step towards automating their management. In this paper we propose the first-ever taxonomy for different types of in-document comments based on analysis of a large-scale dataset of public documents from the web. We envision that the next generation of intelligent collaborative document experiences will allow interactive creation and consumption of content. We also introduce the components necessary for developing novel tools that automate the handling of comments through natural language interaction with the documents. We identify the commands that users would use to respond to various types of comments. We train machine learning algorithms to recognize the different types of comments and assess their feasibility. We conclude by discussing some of the implications for the design of automatic document management tools.
2022.dialdoc-1.20 nouri-toxtli-2022-handling + 10.18653/v1/2022.dialdoc-1.20 <fixed-case>T</fixed-case>ask2<fixed-case>D</fixed-case>ial: A Novel Task and Dataset for Commonsense-enhanced Task-based Dialogue Grounded in Documents @@ -313,6 +333,7 @@ CoQA Doc2Dial doc2dial + 10.18653/v1/2022.dialdoc-1.21 diff --git a/data/xml/2022.dravidianlangtech.xml b/data/xml/2022.dravidianlangtech.xml index 4e385bc9a2..3956b1d9f4 100644 --- a/data/xml/2022.dravidianlangtech.xml +++ b/data/xml/2022.dravidianlangtech.xml @@ -31,6 +31,7 @@ 2022.dravidianlangtech-1.1 kumar-etal-2022-bert Universal Dependencies + 10.18653/v1/2022.dravidianlangtech-1.1 A Dataset for Detecting Humor in <fixed-case>T</fixed-case>elugu Social Media Text @@ -42,6 +43,7 @@ 2022.dravidianlangtech-1.2 bellamkonda-etal-2022-dataset shaswa123/telugu_humour_dataset + 10.18653/v1/2022.dravidianlangtech-1.2 <fixed-case>M</fixed-case>u<fixed-case>C</fixed-case>o<fixed-case>T</fixed-case>: Multilingual Contrastive Training for Question-Answering in Low-resource Languages @@ -56,6 +58,7 @@ gokulkarthik/mucot ChAII - Hindi and Tamil Question Answering SQuAD + 10.18653/v1/2022.dravidianlangtech-1.3 <fixed-case>T</fixed-case>amil<fixed-case>ATIS</fixed-case>: Dataset for Task-Oriented Dialog in <fixed-case>T</fixed-case>amil @@ -67,6 +70,7 @@ 2022.dravidianlangtech-1.4 s-etal-2022-tamilatis ATIS + 10.18653/v1/2022.dravidianlangtech-1.4 <fixed-case>DE</fixed-case>-<fixed-case>ABUSE</fixed-case>@<fixed-case>T</fixed-case>amil<fixed-case>NLP</fixed-case>-<fixed-case>ACL</fixed-case> 2022: Transliteration as Data Augmentation for Abuse Detection in <fixed-case>T</fixed-case>amil @@ -78,6 +82,7 @@ With the rise of social media and the internet, there is a necessity to provide an inclusive space and prevent abusive topics against any gender, race or community. This paper describes the system submitted to the ACL-2022 shared task on fine-grained abuse detection in Tamil. In our approach we transliterated the code-mixed dataset as an augmentation technique to increase the size of the data. Using this method we were able to rank 3rd on the task with a 0.290 macro average F1 score and a 0.590 weighted F1 score. 2022.dravidianlangtech-1.5 palanikumar-etal-2022-de + 10.18653/v1/2022.dravidianlangtech-1.5 <fixed-case>UMUT</fixed-case>eam@<fixed-case>T</fixed-case>amil<fixed-case>NLP</fixed-case>-<fixed-case>ACL</fixed-case>2022: Emotional Analysis in <fixed-case>T</fixed-case>amil @@ -88,6 +93,7 @@ These working notes summarise the participation of the UMUTeam in the TamilNLP (ACL 2022) shared task concerning emotion analysis in Tamil. We participated in the two multi-classification challenges proposed with a neural network that combines linguistic features with different feature sets based on contextual and non-contextual sentence embeddings. Our proposal achieved first place in the second subtask, with an f1-score of 15.1% discerning among 30 different emotions. However, our results for the first subtask were not recorded in the official leaderboard. Accordingly, we report our results for this subtask with the validation split, reaching a macro f1-score of 32.360%.
2022.dravidianlangtech-1.6 garcia-diaz-etal-2022-umuteam + 10.18653/v1/2022.dravidianlangtech-1.6 <fixed-case>UMUT</fixed-case>eam@<fixed-case>T</fixed-case>amil<fixed-case>NLP</fixed-case>-<fixed-case>ACL</fixed-case>2022: Abusive Detection in <fixed-case>T</fixed-case>amil using Linguistic Features and Transformers @@ -98,6 +104,7 @@ Social media has become a dangerous place as bullies take advantage of the anonymity the Internet provides to target and intimidate vulnerable individuals and groups. In the past few years, the research community has focused on developing automatic classification tools for detecting hate-speech, its variants, and other types of abusive behaviour. However, these methods are still at an early stage in low-resource languages. With the aim of reducing this barrier, the TamilNLP shared task has proposed a multi-classification challenge for Tamil written in Tamil script and code-mixed to detect abusive comments and hope-speech. Our participation consists of a knowledge integration strategy that combines sentence embeddings from BERT, RoBERTa, FastText and a subset of language-independent linguistic features. We achieved our best result in code-mixed, reaching 3rd position with a macro-average f1-score of 35%. 2022.dravidianlangtech-1.7 garcia-diaz-etal-2022-umuteam-tamilnlp + 10.18653/v1/2022.dravidianlangtech-1.7 hate-alert@<fixed-case>D</fixed-case>ravidian<fixed-case>L</fixed-case>ang<fixed-case>T</fixed-case>ech-<fixed-case>ACL</fixed-case>2022: Ensembling Multi-Modalities for <fixed-case>T</fixed-case>amil <fixed-case>T</fixed-case>roll<fixed-case>M</fixed-case>eme Classification @@ -108,6 +115,7 @@ Social media platforms often act as breeding grounds for various forms of trolling or malicious content targeting users or communities. One way of trolling users is by creating memes, which in most cases unite an image with a short piece of text embedded on top of it. The situation is more complex for multilingual (e.g., Tamil) memes due to the lack of benchmark datasets and models. We explore several models to detect Troll memes in Tamil based on the shared task, “Troll Meme Classification in DravidianLangTech2022” at ACL-2022. We observe that while the text-based model MURIL performs better for Non-troll meme classification, the image-based model VGG16 performs better for Troll-meme classification. Further fusing these two modalities helps us achieve stable outcomes in both classes. Our fusion model achieved a 0.561 weighted average F1 score and ranked second in this task. 2022.dravidianlangtech-1.8 das-etal-2022-hate + 10.18653/v1/2022.dravidianlangtech-1.8 <fixed-case>J</fixed-case>udith<fixed-case>J</fixed-case>eyafreeda<fixed-case>A</fixed-case>ndrew@<fixed-case>T</fixed-case>amil<fixed-case>NLP</fixed-case>-<fixed-case>ACL</fixed-case>2022:<fixed-case>CNN</fixed-case> for Emotion Analysis in <fixed-case>T</fixed-case>amil @@ -116,6 +124,7 @@ Using technology for analysis of human emotion is a relatively nascent research area. There are several types of data where emotion recognition can be employed, such as text, images, audio and video. In this paper, the focus is on emotion recognition in text data. Emotion recognition in text can be performed from both written comments and from conversations. In this paper, the dataset used for emotion recognition is a list of comments. While extensive research is being performed in this area, the language of the text plays a very important role. In this work, the focus is on the Dravidian language of Tamil.
The language and its script demand extensive pre-processing. The paper contributes to this by adapting various pre-processing methods to the Dravidian language of Tamil. A CNN method has been adopted for the task at hand. The proposed method has achieved a comparable result. 2022.dravidianlangtech-1.9 andrew-2022-judithjeyafreedaandrew + 10.18653/v1/2022.dravidianlangtech-1.9 <fixed-case>MUCIC</fixed-case>@<fixed-case>T</fixed-case>amil<fixed-case>NLP</fixed-case>-<fixed-case>ACL</fixed-case>2022: Abusive Comment Detection in <fixed-case>T</fixed-case>amil Language using 1<fixed-case>D</fixed-case> Conv-<fixed-case>LSTM</fixed-case> @@ -128,6 +137,7 @@ 2022.dravidianlangtech-1.10 balouchzahi-etal-2022-mucic anushamdgowda/abusive-detection + 10.18653/v1/2022.dravidianlangtech-1.10 <fixed-case>CEN</fixed-case>-<fixed-case>T</fixed-case>amil@<fixed-case>D</fixed-case>ravidian<fixed-case>L</fixed-case>ang<fixed-case>T</fixed-case>ech-<fixed-case>ACL</fixed-case>2022: Abusive Comment detection in <fixed-case>T</fixed-case>amil using <fixed-case>TF</fixed-case>-<fixed-case>IDF</fixed-case> and Random Kitchen Sink Algorithm @@ -140,6 +150,7 @@ This paper describes the approach of team CEN-Tamil used for abusive comment detection in Tamil. This task aims to identify whether a given comment contains abusive content. We used TF-IDF with char-wb analyzers and the Random Kitchen Sink (RKS) algorithm to create feature vectors, and the Support Vector Machine (SVM) classifier with a polynomial kernel for classification. We used this method for both Tamil and Tamil-English datasets and secured first place with an f1-score of 0.32 and seventh place with an f1-score of 0.25, respectively. The code for our approach is shared in the GitHub repository. 2022.dravidianlangtech-1.11 s-n-etal-2022-cen + 10.18653/v1/2022.dravidianlangtech-1.11 <fixed-case>NITK</fixed-case>-<fixed-case>IT</fixed-case>_<fixed-case>NLP</fixed-case>@<fixed-case>T</fixed-case>amil<fixed-case>NLP</fixed-case>-<fixed-case>ACL</fixed-case>2022: Transformer based model for Toxic Span Identification in <fixed-case>T</fixed-case>amil @@ -150,6 +161,7 @@ Toxic span identification in Tamil is a shared task that focuses on identifying harmful content contributing to offensiveness. In this work, we have built a model that can efficiently identify the span of text contributing to offensive content. We have used various transformer-based models to develop the system, out of which the fine-tuned MuRIL model was able to achieve the best overall character F1-score of 0.4489. 2022.dravidianlangtech-1.12 lekshmiammal-etal-2022-nitk + 10.18653/v1/2022.dravidianlangtech-1.12 <fixed-case>T</fixed-case>eam<fixed-case>X</fixed-case>@<fixed-case>D</fixed-case>ravidian<fixed-case>L</fixed-case>ang<fixed-case>T</fixed-case>ech-<fixed-case>ACL</fixed-case>2022: A Comparative Analysis for Troll-Based Meme Classification @@ -162,6 +174,7 @@ nandi-etal-2022-teamx Hateful Memes Hateful Memes Challenge + 10.18653/v1/2022.dravidianlangtech-1.13 <fixed-case>GJG</fixed-case>@<fixed-case>T</fixed-case>amil<fixed-case>NLP</fixed-case>-<fixed-case>ACL</fixed-case>2022: Emotion Analysis and Classification in <fixed-case>T</fixed-case>amil using Transformers @@ -172,6 +185,7 @@ This paper describes the systems built by our team for the “Emotion Analysis in Tamil” shared task at the Second Workshop on Speech and Language Technologies for Dravidian Languages at ACL 2022. There were two multi-class classification sub-tasks as a part of this shared task.
The dataset for sub-task A contained 11 types of emotions while sub-task B was more fine-grained with 31 emotions. We fine-tuned an XLM-RoBERTa and a DeBERTa base model for each sub-task. For sub-task A, the XLM-RoBERTa model achieved an accuracy of 0.46 and the DeBERTa model achieved an accuracy of 0.45. We had the best classification performance out of 11 teams for sub-task A. For sub-task B, the XLM-RoBERTa model’s accuracy was 0.33 and the DeBERTa model had an accuracy of 0.26. We ranked 2nd out of 7 teams for sub-task B. 2022.dravidianlangtech-1.14 prasad-etal-2022-gjg + 10.18653/v1/2022.dravidianlangtech-1.14 <fixed-case>GJG</fixed-case>@<fixed-case>T</fixed-case>amil<fixed-case>NLP</fixed-case>-<fixed-case>ACL</fixed-case>2022: Using Transformers for Abusive Comment Classification in <fixed-case>T</fixed-case>amil @@ -182,6 +196,7 @@ This paper presents transformer-based models for the “Abusive Comment Detection” shared task at the Second Workshop on Speech and Language Technologies for Dravidian Languages at ACL 2022. Our team participated in both multi-class classification sub-tasks of this shared task. The dataset for sub-task A was Tamil text, while that for sub-task B was code-mixed Tamil-English text. Both datasets contained 8 classes of abusive comments. We trained an XLM-RoBERTa and a DeBERTa base model on the training splits for each sub-task. For sub-task A, the XLM-RoBERTa model achieved an accuracy of 0.66 and the DeBERTa model achieved an accuracy of 0.62. For sub-task B, both models achieved a classification accuracy of 0.72; however, the DeBERTa model performed better in other classification metrics. Our team ranked 2nd in the code-mixed classification sub-task and 8th in the Tamil-text sub-task. 2022.dravidianlangtech-1.15 prasad-etal-2022-gjg-tamilnlp + 10.18653/v1/2022.dravidianlangtech-1.15 <fixed-case>IIITDWD</fixed-case>@<fixed-case>T</fixed-case>amil<fixed-case>NLP</fixed-case>-<fixed-case>ACL</fixed-case>2022: Transformer-based approach to classify abusive content in <fixed-case>D</fixed-case>ravidian Code-mixed text @@ -191,6 +206,7 @@ Identifying abusive content or hate speech in social media text has raised the research community’s interest in recent times. The major driving force behind this is the widespread use of social media websites. Further, it also leads to identifying abusive content in low-resource regional languages, which is an important research problem in computational linguistics. As part of ACL-2022, the organizers of DravidianLangTech@ACL 2022 have released a shared task on abusive category identification in Tamil and Tamil-English code-mixed text to encourage further research on offensive content identification in low-resource Indic languages. This paper presents the working notes for the model submitted by IIITDWD at DravidianLangTech@ACL 2022. Our team competed in Sub-Task B and finished in 9th place among the participating teams. In our proposed approach, we used a pre-trained transformer model, Indic-BERT, for feature extraction, and on top of that, an SVM classifier is used for stance detection. Further, our model achieved 62% accuracy on code-mixed Tamil-English text.
2022.dravidianlangtech-1.16 biradar-saumya-2022-iiitdwd + 10.18653/v1/2022.dravidianlangtech-1.16 <fixed-case>PANDAS</fixed-case>@<fixed-case>T</fixed-case>amil<fixed-case>NLP</fixed-case>-<fixed-case>ACL</fixed-case>2022: Emotion Analysis in <fixed-case>T</fixed-case>amil Text using Language Agnostic Embeddings @@ -204,6 +220,7 @@ As the world around us continues to become increasingly digital, it has been acknowledged that there is a growing need for emotion analysis of social media content. The task of identifying the emotion in a given text has many practical applications ranging from screening public health to business and management. In this paper, we propose a language-agnostic model that focuses on emotion analysis in Tamil text. Our experiments yielded an F1-score of 0.010. 2022.dravidianlangtech-1.17 k-etal-2022-pandas + 10.18653/v1/2022.dravidianlangtech-1.17 <fixed-case>PANDAS</fixed-case>@Abusive Comment Detection in <fixed-case>T</fixed-case>amil Code-Mixed Data Using Custom Embeddings with <fixed-case>L</fixed-case>a<fixed-case>BSE</fixed-case> @@ -216,6 +233,7 @@ Abusive language has lately been prevalent in comments on various social media platforms. The increasing hostility observed on the internet calls for the creation of a system that can identify and flag such acerbic content, to prevent conflict and mental distress. This task becomes more challenging when low-resource languages like Tamil, as well as the often-observed Tamil-English code-mixed text, are involved. The approach used in this paper for the classification model includes different methods of feature extraction and the use of traditional classifiers. We propose a novel method of combining language-agnostic sentence embeddings with the TF-IDF vector representation that uses a curated corpus of words as vocabulary, to create a custom embedding, which is then passed to an SVM classifier. Our experimentation yielded an accuracy of 52% and an F1-score of 0.54. 2022.dravidianlangtech-1.18 swaminathan-etal-2022-pandas + 10.18653/v1/2022.dravidianlangtech-1.18 Translation Techies @<fixed-case>D</fixed-case>ravidian<fixed-case>L</fixed-case>ang<fixed-case>T</fixed-case>ech-<fixed-case>ACL</fixed-case>2022-Machine Translation in <fixed-case>D</fixed-case>ravidian Languages @@ -227,6 +245,7 @@ This paper discusses the details of the submission made by team Translation Techies to the Shared Task on Machine Translation in Dravidian languages at ACL 2022. In connection to the task, five language pairs were provided to test the accuracy of the submitted model. A baseline transformer model with the Neural Machine Translation (NMT) technique is used, which has been taken directly from the OpenNMT framework. On this baseline model, tokenization is applied using the IndicNLP library. Finally, the evaluation is performed using the BLEU scoring mechanism. 2022.dravidianlangtech-1.19 goyal-etal-2022-translation + 10.18653/v1/2022.dravidianlangtech-1.19 <fixed-case>SSNCSE</fixed-case>_<fixed-case>NLP</fixed-case>@<fixed-case>T</fixed-case>amil<fixed-case>NLP</fixed-case>-<fixed-case>ACL</fixed-case>2022: Transformer based approach for Emotion analysis in <fixed-case>T</fixed-case>amil language @@ -236,6 +255,7 @@ Emotion analysis is the process of identifying and analyzing the underlying emotions expressed in textual data. Identifying emotions from a textual conversation is a challenging task due to the absence of gestures, vocal intonation, and facial expressions.
Once chatbots and messengers detect and report the emotions of the user, a comfortable conversation can be carried out with no misunderstandings. Our task is to categorize text into a predefined notion of emotion. In this work, it is required to classify text into several emotional labels depending on the task. We have adopted the transformer model approach to identify the emotions present in the text sequence. Our task is to identify whether a given comment contains emotion, and the emotion it stands for. The datasets were provided to us by the LT-EDI organizers (CITATION) for two tasks, in the Tamil language. We have evaluated the datasets using the pretrained transformer models and obtained micro-averaged F1 scores of 0.19 and 0.12 for Task 1 and Task 2, respectively. 2022.dravidianlangtech-1.20 b-varsha-2022-ssncse + 10.18653/v1/2022.dravidianlangtech-1.20 <fixed-case>SSN</fixed-case>_<fixed-case>MLRG</fixed-case>1@<fixed-case>D</fixed-case>ravidian<fixed-case>L</fixed-case>ang<fixed-case>T</fixed-case>ech-<fixed-case>ACL</fixed-case>2022: Troll Meme Classification in <fixed-case>T</fixed-case>amil using Transformer Models @@ -248,6 +268,7 @@ The ACL shared task of DravidianLangTech-2022 for Troll Meme classification is a binary classification task that involves identifying Tamil memes as troll or not-troll. Classification of memes is a challenging task since memes express humour and sarcasm in an implicit way. Team SSN_MLRG1 tested and compared results obtained by using three models, namely BERT, ALBERT and XLNet. The XLNet model outperformed the other two models in terms of various performance metrics. The proposed XLNet model obtained the 3rd rank in the shared task with a weighted F1-score of 0.558. 2022.dravidianlangtech-1.21 hariprasad-etal-2022-ssn + 10.18653/v1/2022.dravidianlangtech-1.21 <fixed-case>B</fixed-case>p<fixed-case>H</fixed-case>igh@<fixed-case>T</fixed-case>amil<fixed-case>NLP</fixed-case>-<fixed-case>ACL</fixed-case>2022: Effects of Data Augmentation on Indic-Transformer based classifier for Abusive Comments Detection in <fixed-case>T</fixed-case>amil @@ -256,6 +277,7 @@ Social Media platforms have grown their reach worldwide. As an effect of this growth, many vernacular social media platforms have also emerged, focusing more on the diverse languages in the specific regions. Tamil has also emerged as a popular language for use on social media platforms due to the increasing penetration of vernacular media like Sharechat and Moj, which focus more on local Indian languages than English and encourage their users to converse in Indic languages. Abusive language remains a significant challenge in the social media framework, more so when we consider languages like Tamil, which are low-resource languages, have poor performance on multilingual models and lack language-specific models. Based on this shared task, “Abusive Comment detection in Tamil@DravidianLangTech-ACL 2022”, we present an exploration of different techniques used to tackle and increase the accuracy of our models using data augmentation in NLP. We also show the results of these techniques.
2022.dravidianlangtech-1.22 pahwa-2022-bphigh + 10.18653/v1/2022.dravidianlangtech-1.22 <fixed-case>MUCS</fixed-case>@<fixed-case>D</fixed-case>ravidian<fixed-case>L</fixed-case>ang<fixed-case>T</fixed-case>ech@<fixed-case>ACL</fixed-case>2022: Ensemble of Logistic Regression Penalties to Identify Emotions in <fixed-case>T</fixed-case>amil Text @@ -266,6 +288,7 @@ Emotion Analysis (EA) is the process of automatically analyzing and categorizing the input text into one of the predefined sets of emotions. In recent years, people have turned to social media to express their emotions, opinions or feelings about news, movies, products, services, and so on. These users’ emotions may help the public, governments, business organizations, film producers, and others in devising strategies, making decisions, and so on. The increasing number of social media users and the increasing amount of user-generated text containing emotions on social media demand automated tools for the analysis of such data, as handling this data manually is labor intensive and error prone. Further, the characteristics of social media data make EA challenging. Most EA research works have focused on the English language, leaving several Indian languages, including Tamil, unexplored for this task. To address the challenges of EA in Tamil texts, in this paper, we, team MUCS, describe the model submitted to the shared task on Emotion Analysis in Tamil at DravidianLangTech@ACL 2022. Out of the two subtasks in this shared task, our team submitted the model only for Task a. The proposed model comprises an Ensemble of Logistic Regression (LR) classifiers with three penalties, namely L1, L2, and Elasticnet. This Ensemble model, trained with Term Frequency - Inverse Document Frequency (TF-IDF) of character bigrams and trigrams, secured 4th rank in Task a with a macro averaged F1-score of 0.04. The code to reproduce the proposed models is available on GitHub. 2022.dravidianlangtech-1.23 hegde-etal-2022-mucs + 10.18653/v1/2022.dravidianlangtech-1.23 <fixed-case>BPHC</fixed-case>@<fixed-case>D</fixed-case>ravidian<fixed-case>L</fixed-case>ang<fixed-case>T</fixed-case>ech-<fixed-case>ACL</fixed-case>2022-A comparative analysis of classical and pre-trained models for troll meme classification in <fixed-case>T</fixed-case>amil @@ -277,6 +300,7 @@ Trolling refers to any user behaviour on the internet intended to provoke or instigate conflict, predominantly in social media. This paper aims to classify troll meme captions in Tamil-English code-mixed form. Embeddings are obtained for raw code-mixed text and for the translated and transliterated versions of the text, and their relative performances are compared. Furthermore, this paper compares the performances of 11 different classification algorithms using Accuracy and F1-Score. We conclude that we were able to achieve a weighted F1 score of 0.74 through the MuRIL pretrained model. 2022.dravidianlangtech-1.24 v-etal-2022-bphc + 10.18653/v1/2022.dravidianlangtech-1.24 <fixed-case>SSNCSE</fixed-case> <fixed-case>NLP</fixed-case>@<fixed-case>T</fixed-case>amil<fixed-case>NLP</fixed-case>-<fixed-case>ACL</fixed-case>2022: Transformer based approach for detection of abusive comment for <fixed-case>T</fixed-case>amil language @@ -286,6 +310,7 @@ Social media platforms along with many other public forums on the Internet have shown a significant rise in the cases of abusive behavior such as Misogynism, Misandry, Homophobia, and Cyberbullying.
To tackle these concerns, technologies are being developed and applied, as it is a tedious and time-consuming task to identify, report and block these offenders. Our task was to automate the process of identifying abusive comments and classify them into appropriate categories. The datasets provided by the DravidianLangTech@ACL2022 organizers were a code-mixed form of Tamil text. We trained the datasets using pre-trained transformer models such as BERT, m-BERT, and XLNet and achieved weighted average F1 scores of 0.96 for Tamil-English code-mixed text and 0.59 for Tamil text. 2022.dravidianlangtech-1.25 b-varsha-2022-ssncse-nlp + 10.18653/v1/2022.dravidianlangtech-1.25 <fixed-case>V</fixed-case>arsini_and_<fixed-case>K</fixed-case>irthanna@<fixed-case>D</fixed-case>ravidian<fixed-case>L</fixed-case>ang<fixed-case>T</fixed-case>ech-<fixed-case>ACL</fixed-case>2022-Emotional Analysis in <fixed-case>T</fixed-case>amil @@ -299,6 +324,7 @@ In this paper, we present our system for the task of Emotion analysis in Tamil. Over 3.96 million people use these platforms to send messages formed using texts, images, videos, audio or combinations of these to express their thoughts and feelings. Text communication on social media platforms is quite overwhelming due to its enormous quantity and simplicity. The data must be processed to understand the general feeling felt by the author. We present a lexicon-based approach for the extraction of emotion in Tamil texts. We use dictionaries of words labelled with their respective emotions. We assign an emotional label to each text and then capture the main emotion expressed in it. Finally, the F1-score on the official test set is 0.0300 and our method ranks 5th. 2022.dravidianlangtech-1.26 s-etal-2022-varsini + 10.18653/v1/2022.dravidianlangtech-1.26 <fixed-case>CUET</fixed-case>-<fixed-case>NLP</fixed-case>@<fixed-case>D</fixed-case>ravidian<fixed-case>L</fixed-case>ang<fixed-case>T</fixed-case>ech-<fixed-case>ACL</fixed-case>2022: Investigating Deep Learning Techniques to Detect Multimodal Troll Memes @@ -311,6 +337,7 @@ With the substantial rise of internet usage, social media has become a powerful communication medium to convey information, opinions, and feelings on various issues. Recently, memes have become a popular way of sharing information on social media. Usually, memes are visuals with text incorporated into them that quickly disseminate hatred and offensive content. Detecting or classifying memes is challenging due to their region-specific interpretation and multimodal nature. This work presents a meme classification technique in Tamil developed by the CUET NLP team under the shared task (DravidianLangTech-ACL2022). Several computational models have been investigated to perform the classification task. This work also explored visual and textual features using VGG16, ResNet50, VGG19, CNN and CNN+LSTM models. Multimodal features are extracted by combining image (VGG16) and text (CNN, LSTM+CNN) characteristics. Results demonstrate that the textual strategy with CNN+LSTM achieved the highest weighted f_1-score (0.52) and recall (0.57). Moreover, the CNN-Text+VGG16 outperformed the other models concerning multimodal meme detection by achieving the highest f_1-score of 0.49, but the LSTM+CNN model allowed the team to achieve 4^{th} place in the shared task.
2022.dravidianlangtech-1.27 hasan-etal-2022-cuet + 10.18653/v1/2022.dravidianlangtech-1.27 <fixed-case>PICT</fixed-case>@<fixed-case>D</fixed-case>ravidian<fixed-case>L</fixed-case>ang<fixed-case>T</fixed-case>ech-<fixed-case>ACL</fixed-case>2022: Neural Machine Translation On <fixed-case>D</fixed-case>ravidian Languages @@ -325,6 +352,7 @@ vyawahare-etal-2022-pict IndicCorp Samanantar + 10.18653/v1/2022.dravidianlangtech-1.28 Sentiment Analysis on Code-Switched <fixed-case>D</fixed-case>ravidian Languages with Kernel Based Extreme Learning Machines @@ -335,6 +363,7 @@ Code-switching refers to textual or spoken data containing multiple languages. Application of natural language processing (NLP) tasks like sentiment analysis is a harder problem on code-switched languages due to the irregularities in sentence structuring and ordering. This paper shows the experiment results of building Kernel based Extreme Learning Machines (ELM) for sentiment analysis for code-switched Dravidian languages with English. Our results show that ELM performs better than traditional machine learning classifiers on various metrics as well as trains faster than deep learning models. We also show that polynomial kernels perform better than others in the ELM architecture. We were able to achieve a median AUC of 0.79 with a polynomial kernel. 2022.dravidianlangtech-1.29 s-r-etal-2022-sentiment + 10.18653/v1/2022.dravidianlangtech-1.29 <fixed-case>CUET</fixed-case>-<fixed-case>NLP</fixed-case>@<fixed-case>D</fixed-case>ravidian<fixed-case>L</fixed-case>ang<fixed-case>T</fixed-case>ech-<fixed-case>ACL</fixed-case>2022: Exploiting Textual Features to Classify Sentiment of Multimodal Movie Reviews @@ -348,6 +377,7 @@ With the proliferation of internet usage, a massive growth of consumer-generated content on social media has been witnessed in recent years that provides people’s opinions on diverse issues. Through social media, users can convey their emotions and thoughts in distinctive forms such as text, image, audio, video, and emoji, which leads to the advancement of the multimodality of the content users share on social networking sites. This paper presents a technique for classifying multimodal sentiment using the text modality into five categories: highly positive, positive, neutral, negative, and highly negative. A shared task was organized to develop models that can identify the sentiments expressed by the videos of movie reviewers in both Malayalam and Tamil languages. This work applied several machine learning techniques (LR, DT, MNB, SVM) and deep learning (BiLSTM, CNN+BiLSTM) to accomplish the task. Results demonstrate that the proposed model with the decision tree (DT) outperformed the other methods and won the competition by acquiring the highest macro f_1-score of 0.24.
Therefore a shared task is organized to identify the underlying emotion of a given comment expressed in the Tamil language. The paper presents our approach to classifying the textual emotion in Tamil into 11 classes: ambiguous, anger, anticipation, disgust, fear, joy, love, neutral, sadness, surprise and trust. We investigated various machine learning (LR, DT, MNB, SVM), deep learning (CNN, LSTM, BiLSTM) and transformer-based models (Multilingual-BERT, XLM-R). Results reveal that the XLM-R model outdoes all other models by acquiring the highest macro f_1-score (0.33). 2022.dravidianlangtech-1.31 mustakim-etal-2022-cuet-nlp + 10.18653/v1/2022.dravidianlangtech-1.31 <fixed-case>DLRG</fixed-case>@<fixed-case>D</fixed-case>ravidian<fixed-case>L</fixed-case>ang<fixed-case>T</fixed-case>ech-<fixed-case>ACL</fixed-case>2022: Abusive Comment Detection in <fixed-case>T</fixed-case>amil using Multilingual Transformer Models @@ -371,6 +402,7 @@ Online social networks have let people connect and interact with each other. They do, however, also provide a platform for online abusers to propagate abusive content. The vast majority of abusive remarks are written in a multilingual style, which allows them to easily slip past internet inspection. This paper presents a system developed for the Shared Task on Abusive Comment Detection (Misogyny, Misandry, Homophobia, Transphobic, Xenophobia, CounterSpeech, Hope Speech) in Tamil at DravidianLangTech@ACL 2022 to detect the abusive category of each comment. We approach the task with three methodologies: Machine Learning, Deep Learning and Transformer-based modeling, for two sets of data: the Tamil and the Tamil+English language datasets. The dataset used in our system can be accessed from the competition on CodaLab. For Machine Learning, eight algorithms were implemented, among which Random Forest gave the best result with the Tamil+English dataset, with a weighted average F1-score of 0.78. For Deep Learning, Bi-Directional LSTM gave the best result with pre-trained word embeddings. In Transformer-based modeling, we used IndicBERT and mBERT with fine-tuning, among which mBERT gave the best result for the Tamil dataset with a weighted average F1-score of 0.7. 2022.dravidianlangtech-1.32 rajalakshmi-etal-2022-dlrg + 10.18653/v1/2022.dravidianlangtech-1.32 Aanisha@<fixed-case>T</fixed-case>amil<fixed-case>NLP</fixed-case>-<fixed-case>ACL</fixed-case>2022: Abusive Detection in <fixed-case>T</fixed-case>amil @@ -379,6 +411,7 @@ In social media, there are instances where people present their opinions in strong language, resorting to abusive/toxic comments. There are instances of communal hatred, hate-speech, toxicity and bullying. In this age of social media, it is very important to find means to keep a check on these toxic comments, so as to preserve the mental peace of people on social media. While there are tools and models to detect and potentially filter this kind of content, developing such models for the low-resource language space is an open research issue. In this paper, the task of abusive comment identification in the Tamil language is treated as a multi-class classification problem. Different pre-processing as well as modelling approaches are discussed in this paper. The different approaches are compared on the basis of weighted average accuracy.
2022.dravidianlangtech-1.33 bhattacharyya-2022-aanisha + 10.18653/v1/2022.dravidianlangtech-1.33 <fixed-case>COMBATANT</fixed-case>@<fixed-case>T</fixed-case>amil<fixed-case>NLP</fixed-case>-<fixed-case>ACL</fixed-case>2022: Fine-grained Categorization of Abusive Comments using Logistic Regression @@ -391,6 +424,7 @@ With the widespread usage of social media and effortless internet access, millions of posts and comments are generated every minute. Unfortunately, with this substantial rise, the usage of abusive language has increased significantly in these mediums. This proliferation leads to many hazards such as cyber-bullying, vulgarity, online harassment and abuse. Therefore, it becomes a crucial issue to detect and mitigate the usage of abusive language. This work presents our system developed as part of the shared task to detect abusive language in Tamil. We employed three machine learning models (LR, DT, SVM), two deep learning models (CNN+BiLSTM, CNN+BiLSTM with FastText) and a transformer-based model (Indic-BERT). The experimental results show that the Logistic Regression (LR) and CNN+BiLSTM models outperformed the others. Both Logistic Regression (LR) and CNN+BiLSTM with FastText achieved a weighted F_1-score of 0.39. However, LR obtained a higher recall value (0.44) than CNN+BiLSTM (0.36). This led us to secure the 2^{nd} rank in the shared task competition. 2022.dravidianlangtech-1.34 hossain-etal-2022-combatant + 10.18653/v1/2022.dravidianlangtech-1.34 <fixed-case>O</fixed-case>ptimize_<fixed-case>P</fixed-case>rime@<fixed-case>D</fixed-case>ravidian<fixed-case>L</fixed-case>ang<fixed-case>T</fixed-case>ech-<fixed-case>ACL</fixed-case>2022: Emotion Analysis in <fixed-case>T</fixed-case>amil @@ -403,6 +437,7 @@ This paper aims to perform an emotion analysis of social media comments in Tamil. Emotion analysis is the process of identifying the emotional context of the text. In this paper, we present the findings obtained by Team Optimize_Prime in the ACL 2022 shared task “Emotion Analysis in Tamil.” The task aimed to classify social media comments into categories of emotion like Joy, Anger, Trust, Disgust, etc. The task was further divided into two subtasks, one with 11 broad categories of emotions and the other with 31 specific categories of emotion. We implemented three different approaches to tackle this problem: transformer-based models, Recurrent Neural Networks (RNNs), and Ensemble models. XLM-RoBERTa performed the best on the first task with a macro-averaged f1 score of 0.27, while MuRIL provided the best results on the second task with a macro-averaged f1 score of 0.13. 2022.dravidianlangtech-1.35 gokhale-etal-2022-optimize + 10.18653/v1/2022.dravidianlangtech-1.35 <fixed-case>O</fixed-case>ptimize_<fixed-case>P</fixed-case>rime@<fixed-case>D</fixed-case>ravidian<fixed-case>L</fixed-case>ang<fixed-case>T</fixed-case>ech-<fixed-case>ACL</fixed-case>2022: Abusive Comment Detection in <fixed-case>T</fixed-case>amil @@ -415,6 +450,7 @@ This paper tries to address the problem of abusive comment detection in low-resource Indic languages. Abusive comments are statements that are offensive to a person or a group of people. These comments are targeted toward individuals belonging to specific ethnicities, genders, castes, races, sexualities, etc. Abusive Comment Detection is a significant problem, especially with the recent rise in social media users.
This paper presents the approach used by our team, Optimize_Prime, in the ACL 2022 shared task “Abusive Comment Detection in Tamil.” This task detects and classifies YouTube comments in Tamil and Tamil-English code-mixed format into multiple categories. We have used three methods to optimize our results: Ensemble models, Recurrent Neural Networks, and Transformers. On the Tamil data, MuRIL and XLM-RoBERTa were our best performing models with a macro-averaged f1 score of 0.43. Furthermore, for the code-mixed data, MuRIL and M-BERT provided sublime results, with a macro-averaged f1 score of 0.45. 2022.dravidianlangtech-1.36 patankar-etal-2022-optimize + 10.18653/v1/2022.dravidianlangtech-1.36 Zero-shot Code-Mixed Offensive Span Identification through Rationale Extraction @@ -426,6 +462,7 @@ 2022.dravidianlangtech-1.37 ravikiran-chakravarthi-2022-zero manikandan-ravikiran/zero-shot-offensive-span + 10.18653/v1/2022.dravidianlangtech-1.37 <fixed-case>DLRG</fixed-case>@<fixed-case>T</fixed-case>amil<fixed-case>NLP</fixed-case>-<fixed-case>ACL</fixed-case>2022: Offensive Span Identification in <fixed-case>T</fixed-case>amil using <fixed-case>B</fixed-case>i<fixed-case>LSTM</fixed-case>-<fixed-case>CRF</fixed-case> approach @@ -439,6 +476,7 @@ Identifying offensive speech is an exciting and essential area of research, with ample traction in recent times. This paper presents our system submission to subtask 1, focusing on using supervised approaches for extracting offensive spans from code-mixed Tamil-English comments. To identify offensive spans, we developed a Bidirectional Long Short-Term Memory (BiLSTM) model with GloVe embeddings. To this end, the developed system achieved an overall F1 of 0.1728. Additionally, for comments with less than 30 characters, the developed system shows an F1 of 0.3890, competitive with other submissions. 2022.dravidianlangtech-1.38 rajalakshmi-etal-2022-dlrg-tamilnlp + 10.18653/v1/2022.dravidianlangtech-1.38 Findings of the Shared Task on Multimodal Sentiment Analysis and Troll Meme Classification in <fixed-case>D</fixed-case>ravidian Languages @@ -455,6 +493,7 @@ This paper presents the findings of the shared task on Multimodal Sentiment Analysis and Troll meme classification in Dravidian languages held at ACL 2022. Multimodal sentiment analysis deals with the identification of sentiment from video. In addition to video data, the task requires the analysis of corresponding text and audio features for the classification of movie reviews into five classes. We created a dataset for this task in Malayalam and Tamil. The Troll meme classification task aims to classify multimodal Troll memes into two categories. This task assumes the analysis of both text and image features for making better predictions. The performance of the participating teams was analysed using the F1-score. Only one team submitted their results in the Multimodal Sentiment Analysis task, whereas we received six submissions in the Troll meme classification task. The only team that participated in the Multimodal Sentiment Analysis shared task obtained an F1-score of 0.24. In the Troll meme classification task, the winning team achieved an F1-score of 0.596.
2022.dravidianlangtech-1.39 b-etal-2022-findings + 10.18653/v1/2022.dravidianlangtech-1.39 Findings of the Shared Task on Offensive Span Identification from <fixed-case>C</fixed-case>ode-Mixed <fixed-case>T</fixed-case>amil-<fixed-case>E</fixed-case>nglish Comments @@ -470,6 +509,7 @@ Offensive content moderation is vital in social media platforms to support healthy online discussions. However, its prevalence in code-mixed Dravidian languages is limited to classifying whole comments without identifying the part of a comment that contributes to offensiveness. Such a limitation is primarily due to the lack of annotated data for offensive spans. Accordingly, in this shared task, we provide Tamil-English code-mixed social comments with offensive spans. This paper outlines the dataset so released, the methods, and the results of the submitted systems. 2022.dravidianlangtech-1.40 ravikiran-etal-2022-findings + 10.18653/v1/2022.dravidianlangtech-1.40 Overview of the Shared Task on Machine Translation in <fixed-case>D</fixed-case>ravidian Languages @@ -485,6 +525,7 @@ 2022.dravidianlangtech-1.41 madasamy-etal-2022-overview Samanantar + 10.18653/v1/2022.dravidianlangtech-1.41 Findings of the Shared Task on Emotion Analysis in <fixed-case>T</fixed-case>amil @@ -505,6 +546,7 @@ This paper presents the overview of the shared task on emotional analysis in Tamil. The result of the shared task is presented at the workshop. This paper presents the dataset used in the shared task, the task description, the methodology used by the participants, and the evaluation results of the submissions. This task is organized as two tasks. Task A is carried out with data annotated with 11 emotions for social media comments in Tamil, and Task B is organized with data annotated with 31 fine-grained emotions for social media comments in Tamil. For conducting experiments, training and development datasets were provided to the participants, and results were evaluated on the unseen data. In total, we received around 24 submissions from 13 teams. For evaluating the models, Precision, Recall, and micro-average metrics are used. 2022.dravidianlangtech-1.42 sampath-etal-2022-findings + 10.18653/v1/2022.dravidianlangtech-1.42 Findings of the Shared Task on Multi-task Learning in <fixed-case>D</fixed-case>ravidian Languages @@ -523,6 +565,7 @@ We present our findings from the first shared task on Multi-task Learning in Dravidian Languages at the second Workshop on Speech and Language Technologies for Dravidian Languages. In this task, a sentence in any of three Dravidian Languages is required to be classified according to two closely related tasks, namely Sentiment Analysis (SA) and Offensive Language Identification (OLI). The task spans three Dravidian Languages, namely Kannada, Malayalam, and Tamil. It is one of the first shared tasks that focuses on Multi-task Learning for closely related tasks, especially for a very low-resourced language family such as the Dravidian language family. In total, 55 people signed up to participate in the task, and due to the intricate nature of the task, especially in its first iteration, 3 submissions were received. 2022.dravidianlangtech-1.43 chakravarthi-etal-2022-findings + 10.18653/v1/2022.dravidianlangtech-1.43 Overview of Abusive Comment Detection in <fixed-case>T</fixed-case>amil-<fixed-case>ACL</fixed-case> 2022 @@ -538,6 +581,7 @@ Social media is one of the significant digital platforms that create a huge impact on people of all levels.
The comments posted on social media are powerful enough to change political and business scenarios in very few hours. They also tend to attack a particular individual or a group of individuals. This shared task aims at detecting abusive comments involving Homophobia, Misandry, Counter-speech, Misogyny, Xenophobia and Transphobia. Hope speech is also identified. A dataset collected from social media, tagged with the above categories in Tamil and Tamil-English code-mixed languages, was given to the participants. The participants used different machine learning and deep learning algorithms. This paper presents the overview of this task, comprising the dataset details and the results of the participants. 2022.dravidianlangtech-1.44 priyadharshini-etal-2022-overview + 10.18653/v1/2022.dravidianlangtech-1.44 diff --git a/data/xml/2022.ecnlp.xml b/data/xml/2022.ecnlp.xml index c6eaea6589..4e4ec7c166 100644 --- a/data/xml/2022.ecnlp.xml +++ b/data/xml/2022.ecnlp.xml @@ -26,6 +26,7 @@ Defect Triage is a time-sensitive and critical process in a large-scale agile software development lifecycle for e-commerce. Inefficiencies arising from human and process dependencies in this domain have motivated research in automated approaches using machine learning to accurately assign defects to qualified teams. This work proposes a novel framework for automated defect triage (DEFTri) using fine-tuned state-of-the-art pre-trained BERT on label-fused text embeddings to improve contextual representations of human-generated product defects. For our multi-label text classification defect triage task, we also introduce a Walmart proprietary dataset of product defects using weak supervision and adversarial learning, in a few-shot setting. 2022.ecnlp-1.1 mohanty-2022-deftri + 10.18653/v1/2022.ecnlp-1.1 Interactive Latent Knowledge Selection for <fixed-case>E</fixed-case>-Commerce Product Copywriting Generation @@ -40,6 +41,7 @@ As multi-modal e-commerce is thriving, high-quality advertising product copywriting has gained more attention; it plays a crucial role in e-commerce recommender, advertising and even search platforms. Advertising product copywriting is able to enhance the user experience by highlighting the product’s characteristics with textual descriptions and thus improve the likelihood of user click and purchase. Automatically generating product copywriting has attracted noticeable interest from both academic and industrial communities, where existing solutions merely make use of a product’s title and attribute information to generate its corresponding description. However, in addition to the product title and attributes, we observe that there are various auxiliary descriptions created by the shoppers or marketers in e-commerce platforms (namely human knowledge), which contain valuable information for product copywriting generation, yet are always accompanied by lots of noise. In this work, we propose a novel solution to automatically generating product copywriting that involves the title, attributes and denoised auxiliary knowledge. To be specific, we design an end-to-end generation framework equipped with two variational autoencoders that work interactively to select informative human knowledge and generate diverse copywriting.
2022.ecnlp-1.2 wang-etal-2022-interactive + 10.18653/v1/2022.ecnlp-1.2 Leveraging Seq2seq Language Generation for Multi-level Product Issue Identification @@ -54,6 +56,7 @@ In a leading e-commerce business, we receive hundreds of millions of customer feedback messages from different text communication channels such as product reviews. The feedback can contain rich information regarding customers’ dissatisfaction with the quality of goods and services. To harness such information to better serve customers, in this paper, we created a machine learning approach to automatically identify product issues and uncover root causes from the customer feedback text. We identify issues at two levels: coarse grained (L-Coarse) and fine grained (L-Granular). We formulate this multi-level product issue identification problem as a seq2seq language generation problem. Specifically, we utilize transformer-based seq2seq models due to their versatility and strong transfer-learning capability. We demonstrate that our approach is label efficient and outperforms traditional approaches such as the multi-class multi-label classification formulation. Based on human evaluation, our fine-tuned model achieves 82.1% and 95.4% human-level performance for L-Coarse and L-Granular issue identification, respectively. Furthermore, our experiments illustrate that the model can generalize to identify unseen L-Granular issues. 2022.ecnlp-1.3 liu-etal-2022-leveraging + 10.18653/v1/2022.ecnlp-1.3 Data Quality Estimation Framework for Faster Tax Code Classification @@ -64,6 +67,7 @@ This paper describes a novel framework to estimate the data quality of a collection of product descriptions to identify required relevant information for accurate product listing classification for tax-code assignment. Our Data Quality Estimation (DQE) framework consists of a Question Answering (QA) based attribute value extraction model to identify missing attributes and a classification model to identify bad quality records. We show that our framework can accurately predict the quality of product descriptions. In addition to identifying low-quality product listings, our framework can also generate a detailed report at a category level showing missing product information resulting in a better customer experience. 2022.ecnlp-1.4 kondadadi-etal-2022-data + 10.18653/v1/2022.ecnlp-1.4 <fixed-case>CML</fixed-case>: A Contrastive Meta Learning Method to Estimate Human Label Confidence Scores and Reduce Data Collection Cost @@ -77,6 +81,7 @@ Deep neural network models are especially susceptible to noise in annotated labels. In the real world, annotated data typically contains noise caused by a variety of factors such as task difficulty, annotator experience, and annotator bias. Label quality is critical for label validation tasks; however, correcting for noise by collecting more data is often costly. In this paper, we propose a contrastive meta-learning framework (CML) to address the challenges introduced by noisy annotated data, specifically in the context of natural language processing. CML combines contrastive and meta learning to improve the quality of text feature representations. Meta-learning is also used to generate confidence scores to assess label quality. We demonstrate that a model built on CML-filtered data outperforms a model built on clean data. Furthermore, we perform experiments on deidentified commercial voice assistant datasets and demonstrate that our model outperforms several SOTA approaches.
2022.ecnlp-1.5 dong-etal-2022-cml + 10.18653/v1/2022.ecnlp-1.5 Improving Relevance Quality in Product Search using High-Precision Query-Product Semantic Similarity @@ -92,6 +97,7 @@ Ensuring relevance quality in product search is a critical task as it impacts the customer’s ability to find intended products in the short-term as well as the general perception and trust of the e-commerce system in the long term. In this work we leverage a high-precision cross-encoder BERT model for semantic similarity between customer query and products and survey its effectiveness for three ranking applications where offline-generated scores could be used: (1) as an offline metric for estimating relevance quality impact, (2) as a re-ranking feature covering head/torso queries, and (3) as a training objective for optimization. We present results on the effectiveness of this strategy for the large e-commerce setting, which has general applicability for the choice of other high-precision models and tasks in ranking. 2022.ecnlp-1.6 bagheri-garakani-etal-2022-improving + 10.18653/v1/2022.ecnlp-1.6 Comparative Snippet Generation @@ -103,6 +109,7 @@ 2022.ecnlp-1.7 jain-etal-2022-comparative wing-nus/comparative-snippet-generation-dataset + 10.18653/v1/2022.ecnlp-1.7 Textual Content Moderation in <fixed-case>C</fixed-case>2<fixed-case>C</fixed-case> Marketplace @@ -113,6 +120,7 @@ Automatic monitoring systems for inappropriate user-generated messages have been found to be effective in reducing human operation costs in Consumer to Consumer (C2C) marketplace services, in which customers send messages directly to other customers. We propose a lightweight neural network that takes a conversation as input, which we deployed to a production service. Our results show that the system reduced the human operation costs to less than one-sixth compared to the conventional rule-based monitoring at Mercari. 2022.ecnlp-1.8 shido-etal-2022-textual + 10.18653/v1/2022.ecnlp-1.8 Spelling Correction using Phonetics in <fixed-case>E</fixed-case>-commerce Search @@ -127,6 +135,7 @@ In E-commerce search, spelling correction plays an important role in finding desired products for customers when processing user-typed search queries. However, resolving phonetic errors is a critical but much overlooked area. A query with phonetic spelling errors tends to appear correct based on pronunciation but is nonetheless inaccurate in spelling (e.g., “bluetooth sound system” vs. “blutut sant sistam”), with numerous noisy forms and sparse occurrences. In this work, we propose a generalized spelling correction system integrating phonetics to address phonetic errors in E-commerce search without additional latency cost. Using the India (IN) E-commerce market for illustration, the experiment shows that our proposed phonetic solution significantly improves the F1 score by 9%+ and the recall of phonetic errors by 8%+. This phonetic spelling correction system has been deployed to production, currently serving hundreds of millions of customers. 2022.ecnlp-1.9 yang-etal-2022-spelling + 10.18653/v1/2022.ecnlp-1.9 Logical Reasoning for Task Oriented Dialogue Systems @@ -139,6 +148,7 @@ In recent years, large pretrained models have been used in dialogue systems to improve successful task completion rates. However, the lack of reasoning capabilities of dialogue platforms makes it difficult to provide relevant and fluent responses, unless the designers of a conversational experience spend a considerable amount of time implementing these capabilities in external rule-based modules.
In this work, we propose a novel method to fine-tune pretrained transformer models such as RoBERTa and T5, to reason over a set of facts in a given dialogue context. Our method includes a synthetic data generation mechanism which helps the model learn logical relations, such as comparison between lists of numerical values, inverse relations (and negation), inclusion and exclusion for categorical attributes, application of a combination of attributes over both numerical and categorical values, and spoken form for numerical values, without the need for additional training data. We show that the transformer-based model can perform logical reasoning to answer questions when the dialogue context contains all the required information; otherwise, it is able to extract appropriate constraints to pass to downstream components (e.g. a knowledge base) when partial information is available. We observe that transformer-based models such as UnifiedQA-T5 can be fine-tuned to perform logical reasoning (such as numerical and categorical attributes’ comparison) over attributes seen at training time (e.g., accuracy of 90%+ for comparison of smaller than kmax=5 values over a heldout test dataset). 2022.ecnlp-1.10 beygi-etal-2022-logical + 10.18653/v1/2022.ecnlp-1.10 <fixed-case>C</fixed-case>o<fixed-case>VA</fixed-case>: Context-aware Visual Attention for Webpage Information Extraction @@ -153,6 +163,7 @@ kumar-etal-2022-cova kevalmorabia97/cova-web-object-detection CoVA + 10.18653/v1/2022.ecnlp-1.11 Product Titles-to-Attributes As a Text-to-Text Task @@ -162,6 +173,7 @@ Online marketplaces use attribute-value pairs, such as brand, size, size type, color, etc. to help define important and relevant facts about a listing. These help buyers to curate their search results using attribute filtering and overall create a richer experience. Despite their critical importance for listings’ discoverability, getting sellers to input tens of different attribute-value pairs per listing is costly and often results in missing information. This can later translate to the unnecessary removal of relevant listings from the search results when buyers are filtering by attribute values. In this paper we demonstrate using a Text-to-Text hierarchical multi-label ranking model framework to predict the most relevant attributes per listing, along with their expected values, using historic user behavioral data. This solution helps sellers by allowing them to focus on verifying information on attributes that are likely to be used by buyers, and thus increase the expected recall for their listings. Specifically for eBay’s case, we show that using this model can improve the relevancy of the attribute extraction process by 33.2% compared to the current highly-optimized production system. Apart from the empirical contribution, the highly generalized nature of the framework presented in this paper makes it relevant for many high-volume search-driven websites.
2022.ecnlp-1.12 fuchs-acriche-2022-product + 10.18653/v1/2022.ecnlp-1.12 Product Answer Generation from Heterogeneous Sources: A New Benchmark and Best Practices @@ -176,6 +188,7 @@ 2022.ecnlp-1.13 shen-etal-2022-product AmazonQA + 10.18653/v1/2022.ecnlp-1.13 semi<fixed-case>PQA</fixed-case>: A Study on Product Question Answering over Semi-structured Data @@ -192,6 +205,7 @@ Natural Questions NewsQA SQuAD + 10.18653/v1/2022.ecnlp-1.14 Improving Specificity in Review Response Generation with Data-Driven Data Filtering @@ -201,6 +215,7 @@ Responding to online customer reviews has become an essential part of successfully managing and growing a business both in e-commerce and the hospitality and tourism sectors. Recently, neural text generation methods intended to assist authors in composing responses have been shown to deliver highly fluent and natural looking texts. However, they also tend to learn a strong, undesirable bias towards generating overly generic, one-size-fits-all outputs to a wide range of inputs. While this often results in ‘safe’, high-probability responses, there are many practical settings in which greater specificity is preferable. In this work we examine the task of generating more specific responses for online reviews in the hospitality domain by identifying generic responses in the training data, filtering them and fine-tuning the generation model. We experiment with a range of data-driven filtering methods and show through automatic and human evaluation that, despite a 60% reduction in the amount of training data, filtering helps to derive models that are capable of generating more specific, useful responses. 2022.ecnlp-1.15 kew-volk-2022-improving + 10.18653/v1/2022.ecnlp-1.15 Extreme Multi-Label Classification with Label Masking for Product Attribute Value Extraction @@ -211,6 +226,7 @@ Although most studies have treated attribute value extraction (AVE) as named entity recognition, these approaches are not practical in real-world e-commerce platforms because they perform poorly and require canonicalization of extracted values. Furthermore, since the values needed for actual services are static for many attributes, extraction of new values is not always necessary. Given the above, we formalize AVE as extreme multi-label classification (XMC). A major problem in solving AVE as XMC is that the distribution between positive and negative labels for products is heavily imbalanced. To mitigate the negative impact derived from such a biased distribution, we propose label masking, a simple and effective method to reduce the number of negative labels in training. We exploit the attribute taxonomy designed for e-commerce platforms to determine which labels are negative for products. Experimental results using a dataset collected from a Japanese e-commerce platform demonstrate that label masking improves micro and macro F_1 scores by 3.38 and 23.20 points, respectively. 2022.ecnlp-1.16 chen-etal-2022-extreme + 10.18653/v1/2022.ecnlp-1.16 Enhanced Representation with Contrastive Loss for Long-Tail Query Classification in e-commerce @@ -222,6 +238,7 @@ Query classification is a fundamental task in an e-commerce search engine, which assigns one or multiple predefined product categories in response to each search query. Taking click-through logs as training data in deep learning methods is a common and effective approach for query classification. However, the frequency distribution of queries typically has a long-tail property, which means that there are few logs for most of the queries.
The lack of reliable user feedback information results in worse performance for long-tail queries compared with frequent queries. To solve the above problem, we propose a novel method that leverages an auxiliary module to enhance the representations of long-tail queries by taking advantage of reliable supervised information from variant frequent queries. The long-tail queries are guided by the contrastive loss to obtain category-aligned representations in the auxiliary module, where the variant frequent queries serve as anchors in the representation space. We train our model with real-world click data from AliExpress and conduct evaluation on both offline labeled data and an online A/B test. The results and further analysis demonstrate the effectiveness of our proposed method. 2022.ecnlp-1.17 zhu-etal-2022-enhanced + 10.18653/v1/2022.ecnlp-1.17 Domain-specific knowledge distillation yields smaller and better models for conversational commerce @@ -239,6 +256,7 @@ We demonstrate that knowledge distillation can be used not only to reduce model size, but to simultaneously adapt a contextual language model to a specific domain. We use Multilingual BERT (mBERT; Devlin et al., 2019) as a starting point and follow the knowledge distillation approach of Sanh et al. (2019) to train a smaller multilingual BERT model that is adapted to the domain at hand. We show that for in-domain tasks, the domain-specific model shows on average a 2.3% improvement in F1 score, relative to a model distilled on domain-general data. Whereas much previous work with BERT has fine-tuned the encoder weights during task training, we show that the model improvements from distillation on in-domain data persist even when the encoder weights are frozen during task training, allowing a single encoder to support classifiers for multiple tasks and languages. 2022.ecnlp-1.18 howell-etal-2022-domain + 10.18653/v1/2022.ecnlp-1.18 <fixed-case>O</fixed-case>pen<fixed-case>B</fixed-case>rand: Open Brand Value Extraction from Product Descriptions @@ -250,6 +268,7 @@ 2022.ecnlp-1.19 sabeh-etal-2022-openbrand kassemsabeh/open-brand + 10.18653/v1/2022.ecnlp-1.19 Robust Product Classification with Instance-Dependent Noise @@ -259,6 +278,7 @@ Noisy labels in large E-commerce product data (i.e., product items placed into incorrect categories) are a critical issue for the product categorization task because they are unavoidable, non-trivial to remove and degrade prediction performance significantly. Training a product title classification model which is robust to noisy labels in the data is very important to make product classification applications more practical. In this paper, we study the impact of instance-dependent noise on the performance of product title classification by comparing our data denoising algorithm and different noise-resistance training algorithms which were designed to prevent a classifier model from over-fitting to noise. We develop a simple yet effective Deep Neural Network for product title classification to use as a base classifier. Along with recent methods of simulating instance-dependent noise, we propose a novel noise simulation algorithm based on product title similarity. Our experiments cover multiple datasets, various noise methods and different training solutions. Results uncover the limits of the classification task when the noise rate is not negligible and the data distribution is highly skewed.
2022.ecnlp-1.20 nguyen-khatwani-2022-robust + 10.18653/v1/2022.ecnlp-1.20 Structured Extraction of Terms and Conditions from <fixed-case>G</fixed-case>erman and <fixed-case>E</fixed-case>nglish Online Shops @@ -270,6 +290,7 @@ 2022.ecnlp-1.21 schamel-etal-2022-structured sebischair/lowestcommonancestorextractor + 10.18653/v1/2022.ecnlp-1.21 “Does it come in black?” <fixed-case>CLIP</fixed-case>-like models are zero-shot recommenders @@ -282,6 +303,7 @@ Product discovery is a crucial component for online shopping. However, item-to-item recommendations today do not allow users to explore changes along selected dimensions: given a query item, can a model suggest something similar but in a different color? We consider item recommendations of a comparative nature (e.g. “something darker”) and show how CLIP-based models can support this use case in a zero-shot manner. Leveraging a large model built for fashion, we introduce GradREC and its industry potential, and offer a first rounded assessment of its strengths and weaknesses. 2022.ecnlp-1.22 chia-etal-2022-come + 10.18653/v1/2022.ecnlp-1.22 Clause Topic Classification in <fixed-case>G</fixed-case>erman and <fixed-case>E</fixed-case>nglish Standard Form Contracts @@ -291,6 +313,7 @@ So-called standard form contracts, i.e. contracts that are drafted unilaterally by one party, like terms and conditions of online shops or terms of services of social networks, are cornerstones of our modern economy. Their processing is, therefore, of significant practical value. Often, the sheer size of these contracts allows the drafting party to hide unfavourable terms from the other party. In this paper, we compare different approaches for automatically classifying the topics of clauses in standard form contracts, based on a dataset of more than 6,000 clauses from more than 170 contracts, which we collected from German and English online shops and annotated based on a taxonomy of clause topics that we developed together with legal experts. We show that, in our comparison of seven approaches, from simple keyword matching to transformer language models, BERT performed best with an F1-score of up to 0.91; however, much simpler and computationally cheaper models like logistic regression also achieved similarly good results of up to 0.87. 2022.ecnlp-1.23 braun-matthes-2022-clause + 10.18653/v1/2022.ecnlp-1.23 Investigating the Generative Approach for Question Answering in <fixed-case>E</fixed-case>-Commerce @@ -302,6 +325,7 @@ Many e-commerce websites provide a Product-related Question Answering (PQA) platform where potential customers can ask questions related to a product, and other consumers can post an answer to that question based on their experience. Recently, there has been a growing interest in providing automated responses to product questions. In this paper, we investigate the suitability of the generative approach for PQA. We use state-of-the-art generative models proposed by Deng et al. (2020) and Lu et al. (2020) for this purpose. On closer examination, we find several drawbacks in this approach: (1) input reviews are not always utilized significantly for answer generation, (2) the performance of the models is abysmal when answering numerical questions, and (3) many of the generated answers contain phrases like “I do not know” which are taken from the reference answer in the training data, and these answers do not convey any information to the customer.
Although these approaches achieve high ROUGE scores, the metric does not reflect these shortcomings of the generated answers. We hope that our analysis will lead to more rigorous PQA approaches, and that future research will focus on addressing these shortcomings in PQA. 2022.ecnlp-1.24 roy-etal-2022-investigating + 10.18653/v1/2022.ecnlp-1.24 Utilizing Cross-Modal Contrastive Learning to Improve Item Categorization <fixed-case>BERT</fixed-case> Model @@ -311,6 +335,7 @@ Item categorization (IC) is a core natural language processing (NLP) task in e-commerce. As a special text classification task, fine-tuning pre-trained models, e.g., BERT, has become a mainstream solution. To improve IC performance further, other product metadata, e.g., product images, have been used. Although multimodal IC (MIC) systems show higher performance, expanding from processing text to more resource-demanding images brings large engineering impacts and hinders the deployment of such dual-input MIC systems. In this paper, we propose a new way of using product images to improve a text-only IC model: leveraging cross-modal signals between products’ titles and associated images to adapt BERT models in a self-supervised learning (SSL) way. Our experiments on the three genres in the public Amazon product dataset show that the proposed method yields better prediction accuracy and macro-F1 values than simply using the original BERT. Moreover, the proposed method is able to keep using the existing text-only IC inference implementation and shows a resource advantage over the deployment of a dual-input MIC system. 2022.ecnlp-1.25 chen-chou-2022-utilizing + 10.18653/v1/2022.ecnlp-1.25 Towards Generalizeable Semantic Product Search by Text Similarity Pre-training on Search Click Logs @@ -324,6 +349,7 @@ Recently, semantic search has been successfully applied to E-commerce product search, and the learned semantic space for query and product encoding is expected to generalize well to unseen queries or products. Yet, whether generalization can conveniently emerge has not been thoroughly studied in the domain thus far. In this paper, we examine several general-domain and domain-specific pre-trained Roberta variants and discover that general-domain fine-tuning does not really help generalization, which aligns with the findings of prior art, yet proper domain-specific fine-tuning with clickstream data can lead to better model generalization, based on a bucketed analysis of manually annotated query-product relevance data. 2022.ecnlp-1.26 liu-etal-2022-towards + 10.18653/v1/2022.ecnlp-1.26 Can Pretrained Language Models Generate Persuasive, Faithful, and Informative Ad Text for Product Descriptions? @@ -334,6 +360,7 @@ For any e-commerce service, persuasive, faithful, and informative product descriptions can attract shoppers and improve sales. While not all sellers are capable of providing such interesting descriptions, a language generation system can be a source of such descriptions at scale, and potentially assist sellers to improve their product descriptions. Most previous work has addressed this task based on statistical approaches (Wang et al., 2017), limited attributes such as titles (Chen et al., 2019; Chan et al., 2020), and focused on only one product type (Wang et al., 2017; Munigala et al., 2018; Hong et al., 2021). In this paper, we jointly train image features and 10 text attributes across 23 diverse product types, with two different target text types with different writing styles: bullet points and paragraph descriptions.
Our findings suggest that multimodal training with modern pretrained language models can generate fluent and persuasive advertisements, but the outputs are less faithful and informative, especially out of domain. 2022.ecnlp-1.27 koto-etal-2022-pretrained + 10.18653/v1/2022.ecnlp-1.27 A Simple Baseline for Domain Adaptation in End to End <fixed-case>ASR</fixed-case> Systems Using Synthetic Data @@ -343,6 +370,7 @@ Automatic Speech Recognition (ASR) has been dominated by deep learning-based end-to-end speech recognition models. These approaches require large amounts of labeled data in the form of audio-text pairs. Moreover, these models are more susceptible to domain shift as compared to traditional models. It is common practice to train generic ASR models and then adapt them to target domains using comparatively smaller data sets. We consider a more extreme case of domain adaptation where only a text-only corpus is available. In this work, we propose a simple baseline technique for domain adaptation in end-to-end speech recognition models. We convert the text-only corpus to audio data using a single-speaker Text to Speech (TTS) engine. The parallel data in the target domain is then used to fine-tune the final dense layer of generic ASR models. We show that single-speaker synthetic TTS data coupled with fine-tuning of only the final dense layer provides reasonable improvements in word error rates. We use text data from the address and e-commerce search domains to show the effectiveness of our low-cost baseline approach on CTC and attention-based models. 2022.ecnlp-1.28 joshi-singh-2022-simple + 10.18653/v1/2022.ecnlp-1.28 Lot or Not: Identifying Multi-Quantity Offerings in <fixed-case>E</fixed-case>-Commerce @@ -352,6 +380,7 @@ The term lot is defined to mean an offering that contains a collection of multiple identical items for sale. In a large online marketplace, lot offerings play an important role, allowing buyers and sellers to set price levels to optimally balance supply and demand needs. In spite of their central role, platforms often struggle to identify lot offerings, since explicit lot status identification is frequently not provided by sellers. The ability to identify lot offerings plays a key role in many fundamental tasks, from matching offerings to catalog products, through ranking search results, to providing effective pricing guidance. In this work, we seek to determine the lot status (and lot size) of each offering, in order to facilitate an improved buyer experience, while reducing the friction for sellers posting new offerings. We demonstrate experimentally the ability to accurately classify offerings as lots and predict their lot size using only the offer title, by adapting state-of-the-art natural language techniques to the lot identification problem. 2022.ecnlp-1.29 lavee-guy-2022-lot + 10.18653/v1/2022.ecnlp-1.29 diff --git a/data/xml/2022.fever.xml b/data/xml/2022.fever.xml index 43438c9852..be78c35d52 100644 --- a/data/xml/2022.fever.xml +++ b/data/xml/2022.fever.xml @@ -34,6 +34,7 @@ IIRC QASC eQASC + 10.18653/v1/2022.fever-1.1 Heterogeneous-Graph Reasoning and Fine-Grained Aggregation for Fact Checking @@ -44,6 +45,7 @@ 2022.fever-1.2 lin-fu-2022-heterogeneous FEVER + 10.18653/v1/2022.fever-1.2 Distilling Salient Reviews with Zero Labels @@ -57,6 +59,7 @@ Many people read online reviews to learn about real-world entities of their interest.
However, the majority of reviews only describe general experiences and opinions of the customers, and may not reveal facts that are specific to the entity being reviewed. In this work, we focus on a novel task of mining, from a review corpus, sentences that are unique to each entity. We refer to this task as Salient Fact Extraction. Salient facts are extremely scarce due to their very nature. Consequently, collecting labeled examples for training supervised models is tedious and cost-prohibitive. To alleviate this scarcity problem, we develop an unsupervised method, ZL-Distiller, which leverages contextual language representations of the reviews and their distributional patterns to identify salient sentences about entities. Our experiments on multiple domains (hotels, products, and restaurants) show that ZL-Distiller achieves state-of-the-art performance and further boosts the performance of other supervised/unsupervised algorithms for the task. Furthermore, we show that salient sentences mined by ZL-Distiller provide unique and detailed information about entities, which benefits downstream NLP applications including question answering and summarization. 2022.fever-1.3 huang-etal-2022-distilling + 10.18653/v1/2022.fever-1.3 Automatic Fake News Detection: Are current models “fact-checking” or “gut-checking”? @@ -71,6 +74,7 @@ kelk-etal-2022-automatic PolitiFact Snopes + 10.18653/v1/2022.fever-1.4 A Semantics-Aware Approach to Automated Claim Verification @@ -82,6 +86,7 @@ 2022.fever-1.5 calvo-figueras-etal-2022-semantics FEVER + 10.18653/v1/2022.fever-1.5 <fixed-case>PHEMEP</fixed-case>lus: Enriching Social Media Rumour Verification with External Evidence @@ -95,6 +100,7 @@ 2022.fever-1.6 dougrez-lewis-etal-2022-phemeplus FEVER + 10.18653/v1/2022.fever-1.6 <fixed-case>XI</fixed-case>nfo<fixed-case>T</fixed-case>ab<fixed-case>S</fixed-case>: Evaluating Multilingual Tabular Natural Language Inference @@ -108,6 +114,7 @@ 2022.fever-1.7 minhas-etal-2022-xinfotabs TabFact + 10.18653/v1/2022.fever-1.7 Neural Machine Translation for Fact-checking Temporal Claims @@ -119,6 +126,7 @@ Computational fact-checking aims at supporting the verification process of textual claims by exploiting trustworthy sources. However, there are large classes of complex claims that cannot be automatically verified, for instance, those related to temporal reasoning. To this aim, in this work, we focus on the verification of economic claims against time series sources. Starting from given textual claims in natural language, we propose a neural machine translation approach to produce respective queries expressed in a recently proposed temporal fragment of the Datalog language. The adopted deep neural approach shows promising preliminary results for the translation of 10 categories of claims extracted from real use cases. 2022.fever-1.8 mori-etal-2022-neural + 10.18653/v1/2022.fever-1.8 diff --git a/data/xml/2022.findings.xml b/data/xml/2022.findings.xml index 11f3729ec8..d88f428e60 100644 --- a/data/xml/2022.findings.xml +++ b/data/xml/2022.findings.xml @@ -30,6 +30,7 @@ Whole word masking (WWM), which masks all subwords corresponding to a word at once, makes a better English BERT model. For the Chinese language, however, there is no subword because each token is an atomic character. The meaning of a word in Chinese is different in that a word is a compositional unit consisting of multiple characters. This difference motivates us to investigate whether WWM leads to better context understanding ability for Chinese BERT.
To achieve this, we introduce two probing tasks related to grammatical error correction and ask pretrained models to revise or insert tokens in a masked language modeling manner. We construct a dataset including labels for 19,075 tokens in 10,448 sentences. We train three Chinese BERT models with standard character-level masking (CLM), WWM, and a combination of CLM and WWM, respectively. Our major findings are as follows: First, when one character needs to be inserted or replaced, the model trained with CLM performs the best. Second, when more than one character needs to be handled, WWM is the key to better performance. Finally, when fine-tuned on sentence-level downstream tasks, models trained with different masking strategies perform comparably. 2022.findings-acl.1 dai-etal-2022-whole + 10.18653/v1/2022.findings-acl.1 Compilable Neural Code Generation with Compiler Feedback @@ -48,6 +49,7 @@ 2022.findings-acl.2 wang-etal-2022-compilable CodeSearchNet + 10.18653/v1/2022.findings-acl.2 Towards Unifying the Label Space for Aspect- and Sentence-based Sentiment Analysis @@ -59,6 +61,7 @@ Aspect-based sentiment analysis (ABSA) is a fine-grained task that aims to determine the sentiment polarity towards targeted aspect terms occurring in the sentence. The development of the ABSA task is very much hindered by the lack of annotated data. To tackle this, prior works have studied the possibility of utilizing sentiment analysis (SA) datasets to assist in training the ABSA model, primarily via pretraining or multi-task learning. In this article, we follow this line, and for the first time, we manage to apply the Pseudo-Label (PL) method to merge the two homogeneous tasks. While it seems straightforward to use generated pseudo labels to handle this case of label granularity unification for two highly related tasks, we identify its major challenge in this paper and propose a novel framework, dubbed Dual-granularity Pseudo Labeling (DPL). Further, similar to PL, we regard DPL as a general framework capable of combining other prior methods in the literature. Through extensive experiments, DPL has achieved state-of-the-art performance on standard benchmarks, surpassing prior work significantly. 2022.findings-acl.3 zhang-etal-2022-towards + 10.18653/v1/2022.findings-acl.3 Input-specific Attention Subnetworks for Adversarial Detection @@ -78,6 +81,7 @@ QNLI SNLI SST + 10.18653/v1/2022.findings-acl.4 <fixed-case>R</fixed-case>elation<fixed-case>P</fixed-case>rompt: Leveraging Prompts to Generate Synthetic Data for Zero-Shot Relation Triplet Extraction @@ -92,6 +96,7 @@ declare-lab/relationprompt FewRel Wiki-ZSL + 10.18653/v1/2022.findings-acl.5 Pre-Trained Multilingual Sequence-to-Sequence Models: A Hope for Low-Resource Language Translation? @@ -108,6 +113,7 @@ 2022.findings-acl.6.software.zip lee-etal-2022-pre PMIndia + 10.18653/v1/2022.findings-acl.6 Multi-Scale Distribution Deep Variational Autoencoder for Explanation Generation @@ -120,6 +126,7 @@ Generating explanations for recommender systems is essential for improving their transparency, as users often wish to understand the reason for receiving a specified recommendation. Previous methods mainly focus on improving the generation quality, but often produce generic explanations that fail to incorporate user- and item-specific details.
To resolve this problem, we present Multi-Scale Distribution Deep Variational Autoencoders (MVAE). These are deep hierarchical VAEs with a prior network that eliminates noise while retaining meaningful signals in the input, coupled with a recognition network serving as the source of information to guide the learning of the prior network. Further, the Multi-scale distribution Learning Framework (MLF) and a Target Tracking Kullback-Leibler divergence (TKL) mechanism are proposed to employ multiple KL divergences at different scales for more effective learning. Extensive empirical experiments demonstrate that our methods can generate explanations with concrete input-specific contents. 2022.findings-acl.7 cai-etal-2022-multi + 10.18653/v1/2022.findings-acl.7 Dual Context-Guided Continuous Prompt Tuning for Few-Shot Learning @@ -133,6 +140,7 @@ The prompt-based paradigm has shown competitive performance in many NLP tasks. However, its success heavily depends on prompt design, and its effectiveness varies with the model and training data. In this paper, we propose a novel dual context-guided continuous prompt (DCCP) tuning method. To explore the rich contextual information in language structure and close the gap between discrete prompt tuning and continuous prompt tuning, DCCP introduces two auxiliary training objectives and constructs input in a pair-wise fashion. Experimental results demonstrate that our method is applicable to many NLP tasks, and can often outperform existing prompt tuning methods by a large margin in the few-shot setting. 2022.findings-acl.8 zhou-etal-2022-dual + 10.18653/v1/2022.findings-acl.8 Extract-Select: A Span Selection Framework for Nested Named Entity Recognition with Generative Adversarial Training @@ -146,6 +154,7 @@ Nested named entity recognition (NER) is a task in which named entities may overlap with each other. Span-based approaches regard nested NER as a two-stage span enumeration and classification task, thus having the innate ability to handle this task. However, they face the problems of error propagation, ignorance of span boundaries, difficulty in long entity recognition, and the requirement for large-scale annotated data. In this paper, we propose Extract-Select, a span selection framework for nested NER, to tackle these problems. Firstly, we introduce a span selection framework in which nested entities with different input categories would be separately extracted by the extractor, thus naturally avoiding error propagation in two-stage span-based approaches. In the inference phase, the trained extractor selects final results specific to the given entity category. Secondly, we propose a hybrid selection strategy in the extractor, which not only makes full use of span boundaries but also improves the ability of long entity recognition. Thirdly, we design a discriminator to evaluate the extraction result, and train both extractor and discriminator with generative adversarial training (GAT). The use of GAT greatly alleviates the stress on the dataset size. Experimental results on four benchmark datasets demonstrate that Extract-Select outperforms competitive nested NER models, obtaining state-of-the-art results. The proposed model also performs well when less labeled data are given, proving the effectiveness of GAT.
2022.findings-acl.9 huang-etal-2022-extract + 10.18653/v1/2022.findings-acl.9 Controlled Text Generation Using Dictionary Prior in Variational Autoencoders @@ -161,6 +170,7 @@ fang-etal-2022-controlled Penn Treebank SNLI + 10.18653/v1/2022.findings-acl.10 Challenges to Open-Domain Constituency Parsing @@ -175,6 +185,7 @@ yang-etal-2022-challenges ringos/multi-domain-parsing-analysis Penn Treebank + 10.18653/v1/2022.findings-acl.11 Going “Deeper”: Structured Sememe Prediction via Transformer with Tree Attention @@ -188,6 +199,7 @@ 2022.findings-acl.12.software.zip ye-etal-2022-going thunlp/stg + 10.18653/v1/2022.findings-acl.12 Table-based Fact Verification with Self-adaptive Mixture of Experts @@ -201,6 +213,7 @@ zhou-etal-2022-table thumlp/samoe TabFact + 10.18653/v1/2022.findings-acl.13 Investigating Data Variance in Evaluations of Automatic Machine Translation Metrics @@ -215,6 +228,7 @@ Current practices in metric evaluation focus on one single dataset, e.g., the Newstest dataset in each year’s WMT Metrics Shared Task. However, in this paper, we qualitatively and quantitatively show that the performances of metrics are sensitive to data. The ranking of metrics varies when the evaluation is conducted on different datasets. Then this paper further investigates two potential hypotheses, i.e., insignificant data points and the deviation from the i.i.d. assumption, which may be responsible for the issue of data variance. In conclusion, our findings suggest that when evaluating automatic translation metrics, researchers should take data variance into account and be cautious about reporting results on unreliable datasets, because doing so may lead to results inconsistent with most of the other datasets. 2022.findings-acl.14 xiang-etal-2022-investigating + 10.18653/v1/2022.findings-acl.14 Sememe Prediction for <fixed-case>B</fixed-case>abel<fixed-case>N</fixed-case>et Synsets using Multilingual and Multimodal Information @@ -231,6 +245,7 @@ qi-etal-2022-sememe thunlp/msgi ImageNet + 10.18653/v1/2022.findings-acl.15 Query and Extract: Refining Event Extraction as Type-oriented Binary Decoding @@ -244,6 +259,7 @@ 2022.findings-acl.16 wang-etal-2022-query MAVEN + 10.18653/v1/2022.findings-acl.16 <fixed-case>LEVEN</fixed-case>: A Large-Scale <fixed-case>C</fixed-case>hinese Legal Event Detection Dataset @@ -264,6 +280,7 @@ yao-etal-2022-leven thunlp/leven MAVEN + 10.18653/v1/2022.findings-acl.17 Analyzing Dynamic Adversarial Training Data in the Limit @@ -278,6 +295,7 @@ facebookresearch/dadc-limit FEVER SNLI + 10.18653/v1/2022.findings-acl.18 <fixed-case>A</fixed-case>bduction<fixed-case>R</fixed-case>ules: Training Transformers to Explain Unexpected Inputs @@ -291,6 +309,7 @@ young-etal-2022-abductionrules strong-ai-lab/abductionrules ProofWriter + 10.18653/v1/2022.findings-acl.19 On the Importance of Data Size in Probing Fine-tuned Models @@ -306,6 +325,7 @@ GLUE MRPC SST + 10.18653/v1/2022.findings-acl.20 <fixed-case>R</fixed-case>u<fixed-case>CC</fixed-case>o<fixed-case>N</fixed-case>: Clinical Concept Normalization in <fixed-case>R</fixed-case>ussian @@ -324,6 +344,7 @@ 2022.findings-acl.21 nesterov-etal-2022-ruccon XL-BEL + 10.18653/v1/2022.findings-acl.21 A Sentence is Worth 128 Pseudo Tokens: A Semantic-Aware Contrastive Learning Framework for Sentence Embeddings @@ -337,6 +358,7 @@ 2022.findings-acl.22 tan-etal-2022-sentence namco0816/pt-bert + 10.18653/v1/2022.findings-acl.22 Eider: Empowering Document-level Relation Extraction with Efficient Evidence Extraction and Inference-stage Fusion @@ -351,6
+373,7 @@ xie-etal-2022-eider veronicium/eider DocRED + 10.18653/v1/2022.findings-acl.23 Meta-X<tex-math>_{NLG}</tex-math>: A Meta-Learning Approach Based on Language Clustering for Zero-Shot Cross-Lingual Transfer and Generation @@ -366,6 +389,7 @@ TyDi QA WikiLingua XQuAD + 10.18653/v1/2022.findings-acl.24 <fixed-case>MR</fixed-case>-<fixed-case>P</fixed-case>: A Parallel Decoding Algorithm for Iterative Refinement Non-Autoregressive Translation @@ -375,6 +399,7 @@ Non-autoregressive translation (NAT) predicts all the target tokens in parallel and significantly speeds up the inference process. The Conditional Masked Language Model (CMLM) is a strong baseline for NAT. It decodes with the Mask-Predict algorithm, which iteratively refines the output. Most work on CMLM focuses on the model structure and the training objective. However, the decoding algorithm is equally important. We propose a simple, effective, and easy-to-implement decoding algorithm that we call MaskRepeat-Predict (MR-P). The MR-P algorithm gives higher priority to consecutive repeated tokens when selecting tokens to mask for the next iteration and stops the iteration after target tokens converge. We conduct extensive experiments on six translation directions with varying data sizes. The results show that MR-P significantly improves the performance with the same model parameters. Specifically, we achieve a BLEU increase of 1.39 points in the WMT’14 En-De translation task. 2022.findings-acl.25 cheng-zhang-2022-mr + 10.18653/v1/2022.findings-acl.25 Open Relation Modeling: Learning to Define Relations between Entities @@ -388,6 +413,7 @@ 2022.findings-acl.26.software.zip huang-etal-2022-open jeffhj/open-relation-modeling + 10.18653/v1/2022.findings-acl.26 A Slot Is Not Built in One Utterance: Spoken Language Dialogs with Sub-Slots @@ -409,6 +435,7 @@ SSD_NAME SSD_PHONE SSD_PLATE + 10.18653/v1/2022.findings-acl.27 Towards Transparent Interactive Semantic Parsing via Step-by-Step Correction @@ -424,6 +451,7 @@ BREAK GEM SPLASH + 10.18653/v1/2022.findings-acl.28 <fixed-case>MINER</fixed-case>: Multi-Interest Matching Network for News Recommendation @@ -440,6 +468,7 @@ 2022.findings-acl.29 li-etal-2022-miner MIND + 10.18653/v1/2022.findings-acl.29 <fixed-case>KSAM</fixed-case>: Infusing Multi-Source Knowledge into Dialogue Generation via Knowledge Source Aware Multi-Head Decoding @@ -451,6 +480,7 @@ Knowledge-enhanced methods have bridged the gap between human beings and machines in generating dialogue responses. However, most previous works solely seek knowledge from a single source, and thus they often fail to obtain available knowledge because of the insufficient coverage of a single knowledge source. To this end, infusing knowledge from multiple sources becomes a trend. This paper proposes a novel approach, Knowledge Source Aware Multi-Head Decoding (KSAM), to infuse multi-source knowledge into dialogue generation more efficiently. Rather than following the traditional single decoder paradigm, KSAM uses multiple independent source-aware decoder heads to alleviate three challenging problems in infusing multi-source knowledge, namely, the diversity among different knowledge sources, the indefinite knowledge alignment issue, and the insufficient flexibility/scalability in knowledge usage. Experiments on a Chinese multi-source knowledge-aligned dataset demonstrate the superior performance of KSAM against various competitive approaches.
2022.findings-acl.30 wu-etal-2022-ksam + 10.18653/v1/2022.findings-acl.30 Towards Responsible Natural Language Annotation for the Varieties of <fixed-case>A</fixed-case>rabic @@ -460,6 +490,7 @@ When building NLP models, there is a tendency to aim for broader coverage, often overlooking cultural and (socio)linguistic nuance. In this position paper, we make the case for care and attention to such nuances, particularly in dataset annotation, as well as the inclusion of cultural and linguistic expertise in the process. We present a playbook for responsible dataset creation for polyglossic, multidialectal languages. This work is informed by a study on Arabic annotation of social media content. 2022.findings-acl.31 bergman-diab-2022-towards + 10.18653/v1/2022.findings-acl.31 Dynamically Refined Regularization for Improving Cross-corpora Hate Speech Detection @@ -472,6 +503,7 @@ 2022.findings-acl.32 bose-etal-2022-dynamically tbose20/d-ref + 10.18653/v1/2022.findings-acl.32 Towards Large-Scale Interpretable Knowledge Graph Reasoning for Dialogue Systems @@ -486,6 +518,7 @@ 2022.findings-acl.33 tuan-etal-2022-towards OpenDialKG + 10.18653/v1/2022.findings-acl.33 <fixed-case>MDER</fixed-case>ank: A Masked Document Embedding Rank Approach for Unsupervised Keyphrase Extraction @@ -502,6 +535,7 @@ 2022.findings-acl.34 zhang-etal-2022-mderank linhanz/mderank + 10.18653/v1/2022.findings-acl.34 Visualizing the Relationship Between Encoded Linguistic Information and Task Performance @@ -515,6 +549,7 @@ Probing is a popular way to analyze whether linguistic information can be captured by a well-trained deep neural model, but it is hard to answer how changes in the encoded linguistic information affect task performance. To this end, we study the dynamic relationship between the encoded linguistic information and task performance from the viewpoint of Pareto Optimality. Its key idea is to obtain a set of models which are Pareto-optimal in terms of both objectives. From this viewpoint, we propose a method to optimize the Pareto-optimal models by formalizing it as a multi-objective optimization problem. We conduct experiments on two popular NLP tasks, i.e., machine translation and language modeling, and investigate the relationship between several kinds of linguistic information and task performance. Experimental results demonstrate that the proposed method is better than a baseline method. Our empirical findings suggest that some syntactic information is helpful for NLP tasks whereas encoding more syntactic information does not necessarily lead to better performance, because the model architecture is also an important factor. 2022.findings-acl.35 xiang-etal-2022-visualizing + 10.18653/v1/2022.findings-acl.35 Efficient Argument Structure Extraction with Transfer Learning and Active Learning @@ -525,6 +560,7 @@ 2022.findings-acl.36 hua-wang-2022-efficient CDCP + 10.18653/v1/2022.findings-acl.36 Plug-and-Play Adaptation for Continuously-updated <fixed-case>QA</fixed-case> @@ -541,6 +577,7 @@ lee-etal-2022-plug Natural Questions SituatedQA + 10.18653/v1/2022.findings-acl.37 Reinforced Cross-modal Alignment for Radiology Report Generation @@ -552,6 +589,7 @@ 2022.findings-acl.38.software.zip qin-song-2022-reinforced CheXpert + 10.18653/v1/2022.findings-acl.38 What Works and Doesn’t Work, A Deep Decoder for Neural Machine Translation @@ -565,6 +603,7 @@ Deep learning has demonstrated performance advantages in a wide range of natural language processing tasks, including neural machine translation (NMT).
Transformer NMT models are typically strengthened by deeper encoder layers, but deepening their decoder layers usually results in failure. In this paper, we first identify the cause of the failure of the deep decoder in the Transformer model. Inspired by this discovery, we then propose approaches to improving it, with respect to model structure and model training, to make the deep decoder practical in NMT. Specifically, with respect to model structure, we propose a cross-attention drop mechanism to allow the decoder layers to perform their own different roles, to reduce the difficulty of deep-decoder learning. For model training, we propose a collapse reducing training approach to improve the stability and effectiveness of deep-decoder training. We experimentally evaluated our proposed Transformer NMT model structure modification and novel training methods on several popular machine translation benchmarks. The results showed that deepening the NMT model by increasing the number of decoder layers successfully prevented the deepened decoder from degrading to an unconditional language model. In contrast to prior work on deepening an NMT model on the encoder, our method can deepen the model on both the encoder and decoder at the same time, resulting in a deeper model and improved performance. 2022.findings-acl.39 li-etal-2022-works + 10.18653/v1/2022.findings-acl.39 <fixed-case>S</fixed-case>y<fixed-case>MC</fixed-case>o<fixed-case>M</fixed-case> - Syntactic Measure of Code Mixing A Study Of <fixed-case>E</fixed-case>nglish-<fixed-case>H</fixed-case>indi Code-Mixing @@ -578,6 +617,7 @@ 2022.findings-acl.40 kodali-etal-2022-symcom LinCE + 10.18653/v1/2022.findings-acl.40 <fixed-case>H</fixed-case>ybri<fixed-case>D</fixed-case>ialogue: An Information-Seeking Dialogue Dataset Grounded on Tabular and Textual Data @@ -598,6 +638,7 @@ RecipeQA SQA ShARC + 10.18653/v1/2022.findings-acl.41 <fixed-case>NEWTS</fixed-case>: A Corpus for News Topic-Focused Summarization @@ -608,6 +649,7 @@ Text summarization models are approaching human levels of fidelity. Existing benchmarking corpora provide concordant pairs of full and abridged versions of Web, news or professional content. To date, all summarization datasets operate under a one-size-fits-all paradigm that may not reflect the full range of organic summarization needs. Several recently proposed models (e.g., plug and play language models) have the capacity to condition the generated summaries on a desired range of themes. These capacities remain largely unused and unevaluated as there is no dedicated dataset that would support the task of topic-focused summarization. This paper introduces the first topical summarization corpus NEWTS, based on the well-known CNN/Dailymail dataset, and annotated via online crowd-sourcing. Each source article is paired with two reference summaries, each focusing on a different theme of the source document. We evaluate a representative range of existing techniques and analyze the effectiveness of different prompting methods. 2022.findings-acl.42 bahrainian-etal-2022-newts + 10.18653/v1/2022.findings-acl.42 Classification without (Proper) Representation: Political Heterogeneity in Social Media and Its Implications for Classification and Behavioral Analysis @@ -618,6 +660,7 @@ Reddit is home to a broad spectrum of political activity, and users signal their political affiliations in multiple ways—from self-declarations to community participation.
Frequently, computational studies have treated political users as a single bloc, both in developing models to infer political leaning and in studying political behavior. Here, we test this assumption about political users and show that commonly-used political-inference models do not generalize, indicating heterogeneous types of political users. The models remain imprecise at best for most users, regardless of which sources of data or methods are used. Across a 14-year longitudinal analysis, we demonstrate that the choice of definition of a political user has significant implications for behavioral analysis. Controlling for multiple factors, political users are more toxic on the platform and inter-party interactions are even more toxic—but not all political users behave this way. Last, we identify a subset of political users who repeatedly flip affiliations, showing that these users are the most controversial of all, acting as provocateurs by more frequently bringing up politics, and are more likely to be banned, suspended, or deleted. 2022.findings-acl.43 alkiek-etal-2022-classification + 10.18653/v1/2022.findings-acl.43 Toward More Meaningful Resources for Lower-resourced Languages @@ -631,6 +674,7 @@ lignos-etal-2022-toward MasakhaNER WikiAnn + 10.18653/v1/2022.findings-acl.44 Better Quality Estimation for Low Resource Corpus Mining @@ -643,6 +687,7 @@ 2022.findings-acl.45.software.zip kocyigit-etal-2022-better MLQE + 10.18653/v1/2022.findings-acl.45 End-to-End Segmentation-based News Summarization @@ -654,6 +699,7 @@ 2022.findings-acl.46 liu-etal-2022-end CNN/Daily Mail + 10.18653/v1/2022.findings-acl.46 Fast Nearest Neighbor Machine Translation @@ -669,6 +715,7 @@ 2022.findings-acl.47 meng-etal-2022-fast ShannonAI/fast-knn-nmt + 10.18653/v1/2022.findings-acl.47 Extracting Latent Steering Vectors from Pretrained Language Models @@ -681,6 +728,7 @@ subramani-etal-2022-extracting nishantsubramani/steering_vectors StylePTB + 10.18653/v1/2022.findings-acl.48 Domain Generalisation of <fixed-case>NMT</fixed-case>: Fusing Adapters with Leave-One-Domain-Out Training @@ -692,6 +740,7 @@ Generalising to unseen domains is under-explored and remains a challenge in neural machine translation. Inspired by recent research in parameter-efficient transfer learning from pretrained models, this paper proposes a fusion-based generalisation method that learns to combine domain-specific parameters. We propose a leave-one-domain-out training strategy to avoid information leakage, addressing the challenge of not knowing the test domain during training time. Empirical results on three language pairs show that our proposed fusion method outperforms other baselines by up to +0.8 BLEU on average. 2022.findings-acl.49 vu-etal-2022-domain + 10.18653/v1/2022.findings-acl.49 Reframing Instructional Prompts to <fixed-case>GPT</fixed-case>k’s Language @@ -709,6 +758,7 @@ MC-TACO QASC WinoGrande + 10.18653/v1/2022.findings-acl.50 Read Top News First: A Document Reordering Approach for Multi-Document News Summarization @@ -725,6 +775,7 @@ zhaochaocs/mds-dr CNN/Daily Mail Multi-News + 10.18653/v1/2022.findings-acl.51 Human Language Modeling @@ -737,6 +788,7 @@ 2022.findings-acl.52 soni-etal-2022-human humanlab/hart + 10.18653/v1/2022.findings-acl.52 Inverse is Better!
Fast and Accurate Prompt for Few-shot Slot Tagging @@ -750,6 +802,7 @@ 2022.findings-acl.53 hou-etal-2022-inverse atmahou/promptslottagging + 10.18653/v1/2022.findings-acl.53 Cross-Modal Cloze Task: A New Task to Brain-to-Word Decoding @@ -762,6 +815,7 @@ 2022.findings-acl.54 zou-etal-2022-cross littletreezou/cross-modal-cloze-task + 10.18653/v1/2022.findings-acl.54 Mitigating Gender Bias in Distilled Language Models via Counterfactual Role Reversal @@ -780,6 +834,7 @@ 2022.findings-acl.55 gupta-etal-2022-mitigating WebText + 10.18653/v1/2022.findings-acl.55 Domain Representative Keywords Selection: A Probabilistic Approach @@ -795,6 +850,7 @@ akash-etal-2022-domain pritomsaha/keyword-selection AMiner + 10.18653/v1/2022.findings-acl.56 Hierarchical Inductive Transfer for Continual Dialogue Learning @@ -806,6 +862,7 @@ Pre-trained models have achieved excellent performance on the dialogue task. However, with the continual increase of online chit-chat scenarios, directly fine-tuning these models for each of the new tasks not only explodes the capacity of the dialogue system on the embedded devices but also causes knowledge forgetting on pre-trained models and knowledge interference among diverse dialogue tasks. In this work, we propose a hierarchical inductive transfer framework to learn and deploy the dialogue skills continually and efficiently. First, we introduce the adapter module into pre-trained models for learning new dialogue tasks. As the only trainable module, it is beneficial for the dialogue system on the embedded devices to acquire new dialogue skills with negligible additional parameters. Then, for alleviating knowledge interference between tasks yet benefiting the regularization between them, we further design hierarchical inductive transfer that enables new tasks to use general knowledge in the base adapter without being misled by diverse knowledge in task-specific adapters. Empirical evaluation and analysis indicate that our framework obtains comparable performance under deployment-friendly model capacity. 2022.findings-acl.57 feng-etal-2022-hierarchical + 10.18653/v1/2022.findings-acl.57 Why Exposure Bias Matters: An Imitation Learning Perspective of Error Accumulation in Language Generation @@ -820,6 +877,7 @@ kushalarora/quantifying_exposure_bias WikiText-103 WikiText-2 + 10.18653/v1/2022.findings-acl.58 Question Answering Infused Pre-training of General-Purpose Contextualized Representations @@ -840,6 +898,7 @@ SQuAD SST SearchQA + 10.18653/v1/2022.findings-acl.59 Automatic Song Translation for Tonal Languages @@ -854,6 +913,7 @@ This paper develops automatic song translation (AST) for tonal languages and addresses the unique challenge of aligning words’ tones with the melody of a song in addition to conveying the original meaning. We propose three criteria for effective AST—preserving meaning, singability and intelligibility—and design metrics for these criteria. We develop a new benchmark for English–Mandarin song translation and develop an unsupervised AST system, Guided AliGnment for Automatic Song Translation (GagaST), which combines pre-training with three decoding constraints. Both automatic and human evaluations show GagaST successfully balances semantics and singability. 2022.findings-acl.60 guo-etal-2022-automatic + 10.18653/v1/2022.findings-acl.60 Read before Generate!
Faithful Long Form Question Answering with Machine Reading @@ -873,6 +933,7 @@ KILT MS MARCO Natural Questions + 10.18653/v1/2022.findings-acl.61 A Simple yet Effective Relation Information Guided Approach for Few-Shot Relation Extraction @@ -887,6 +948,7 @@ liu-etal-2022-simple lylylylylyly/simplefsre FewRel + 10.18653/v1/2022.findings-acl.62 <fixed-case>MIMIC</fixed-case>ause: <fixed-case>R</fixed-case>epresentation and automatic extraction of causal relation types from clinical notes @@ -902,6 +964,7 @@ khetan-etal-2022-mimicause MIMIC-III ROCStories + 10.18653/v1/2022.findings-acl.63 Compressing Sentence Representation for Semantic Retrieval via Homomorphic Projective Distillation @@ -915,6 +978,7 @@ zhao-etal-2022-compressing xuandongzhao/hpd SNLI + 10.18653/v1/2022.findings-acl.64 Debiasing Event Understanding for Visual Commonsense Tasks @@ -929,6 +993,7 @@ 2022.findings-acl.65.software.zip seo-etal-2022-debiasing VCR + 10.18653/v1/2022.findings-acl.65 Fact-Tree Reasoning for N-ary Question Answering over Knowledge Graphs @@ -941,6 +1006,7 @@ The current Question Answering over Knowledge Graphs (KGQA) task mainly focuses on performing answer reasoning upon KGs with binary facts. However, it neglects the n-ary facts, which contain more than two entities. In this work, we highlight a more challenging but under-explored task: n-ary KGQA, i.e., answering n-ary fact questions upon n-ary KGs. Nevertheless, the multi-hop reasoning framework popular in the binary KGQA task is not directly applicable to n-ary KGQA. We propose two feasible improvements: 1) upgrade the basic reasoning unit from entity or relation to fact, and 2) upgrade the reasoning structure from chain to tree. Therefore, we propose a novel fact-tree reasoning framework, FacTree, which integrates the above two upgrades. FacTree transforms the question into a fact tree and performs iterative fact reasoning on the fact tree to infer the correct answer. Experimental results on the n-ary KGQA dataset we constructed and two binary KGQA benchmarks demonstrate the effectiveness of FacTree compared with state-of-the-art methods.
2022.findings-acl.66 zhang-etal-2022-fact + 10.18653/v1/2022.findings-acl.66 <fixed-case>D</fixed-case>eep<fixed-case>S</fixed-case>truct: Pretraining of Language Models for Structure Prediction @@ -965,6 +1031,7 @@ OPIEC T-REx TekGen + 10.18653/v1/2022.findings-acl.67 The Change that Matters in Discourse Parsing: Estimating the Impact of Domain Shift on Parser Error @@ -977,6 +1044,7 @@ 2022.findings-acl.68 atwell-etal-2022-change anthonysicilia/change-that-matters-acl2022 + 10.18653/v1/2022.findings-acl.68 Mukayese: <fixed-case>T</fixed-case>urkish <fixed-case>NLP</fixed-case> Strikes Back @@ -990,6 +1058,7 @@ safaya-etal-2022-mukayese alisafaya/mukayese GLUE + 10.18653/v1/2022.findings-acl.69 Virtual Augmentation Supported Contrastive Learning of Sentence Representations @@ -1003,6 +1072,7 @@ 2022.findings-acl.70 zhang-etal-2022-virtual amazon-research/sentence-representations + 10.18653/v1/2022.findings-acl.70 <fixed-case>M</fixed-case>o<fixed-case>E</fixed-case>fication: Transformer Feed-forward Layers are Mixtures of Experts @@ -1020,6 +1090,7 @@ GLUE RACE SST + 10.18653/v1/2022.findings-acl.71 <fixed-case>DS</fixed-case>-<fixed-case>TOD</fixed-case>: Efficient Domain Specialization for Task-Oriented Dialog @@ -1034,6 +1105,7 @@ hung-etal-2022-ds umanlp/ds-tod CCNet + 10.18653/v1/2022.findings-acl.72 Distinguishing Non-natural from Natural Adversarial Samples for More Robust Pre-trained Language Model @@ -1048,6 +1120,7 @@ lilynlp/distinguishing-non-natural IMDb Movie Reviews SST + 10.18653/v1/2022.findings-acl.73 Learning Adaptive Axis Attentions in Fine-tuning: Beyond Fixed Sparse Attention Patterns @@ -1067,6 +1140,7 @@ GLUE LRA QNLI + 10.18653/v1/2022.findings-acl.74 Using Interactive Feedback to Improve the Accuracy and Explainability of Question Answering Systems Post-Deployment @@ -1079,6 +1153,7 @@ Most research on question answering focuses on the pre-deployment stage; i.e., building an accurate model for deployment. In this paper, we ask the question: Can we improve QA systems further post-deployment based on user interactions? We focus on two kinds of improvements: 1) improving the QA system’s performance itself, and 2) providing the model with the ability to explain the correctness or incorrectness of an answer. We collect a retrieval-based QA dataset, FeedbackQA, which contains interactive feedback from users. We collect this dataset by deploying a base QA system to crowdworkers who then engage with the system and provide feedback on the quality of its answers. The feedback contains both structured ratings and unstructured natural language explanations. We train a neural model with this feedback data that can generate explanations and re-score answer candidates. We show that feedback data not only improves the accuracy of the deployed QA system but also that of other stronger non-deployed systems. The generated explanations also help users make informed decisions about the correctness of answers. 2022.findings-acl.75 li-etal-2022-using + 10.18653/v1/2022.findings-acl.75 To be or not to be an Integer? Encoding Variables for Mathematical Text @@ -1092,6 +1167,7 @@ 2022.findings-acl.76 2022.findings-acl.76.software.zip ferreira-etal-2022-integer + 10.18653/v1/2022.findings-acl.76 <fixed-case>GRS</fixed-case>: Combining Generation and Revision in Unsupervised Sentence Simplification @@ -1106,6 +1182,7 @@ ASSET CoLA Newsela + 10.18653/v1/2022.findings-acl.77 <fixed-case>BPE</fixed-case> vs.
Morphological Segmentation: A Case Study on Machine Translation of Four Polysynthetic Languages @@ -1118,6 +1195,7 @@ Morphologically-rich polysynthetic languages present a challenge for NLP systems due to data sparsity, and a common strategy to handle this issue is to apply subword segmentation. We investigate a wide variety of supervised and unsupervised morphological segmentation methods for four polysynthetic languages: Nahuatl, Raramuri, Shipibo-Konibo, and Wixarika. Then, we compare the morphologically inspired segmentation methods against Byte-Pair Encodings (BPEs) as inputs for machine translation (MT) when translating to and from Spanish. We show that for all language pairs except for Nahuatl, an unsupervised morphological segmentation algorithm outperforms BPEs consistently and that, although supervised methods achieve better segmentation scores, they under-perform in MT challenges. Finally, we contribute two new morphological segmentation datasets for Raramuri and Shipibo-Konibo, and a parallel corpus for Raramuri–Spanish. 2022.findings-acl.78 mager-etal-2022-bpe + 10.18653/v1/2022.findings-acl.78 Distributed <fixed-case>NLI</fixed-case>: Learning to Predict Human Opinion Distributions for Language Reasoning @@ -1132,6 +1210,7 @@ easonnie/ChaosNLI ChaosNLI SNLI + 10.18653/v1/2022.findings-acl.79 Morphological Processing of Low-Resource Languages: Where We Are and What’s Next @@ -1146,6 +1225,7 @@ Automatic morphological processing can aid downstream natural language processing applications, especially for low-resource languages, and assist language documentation efforts for endangered languages. Having long been multilingual, the field of computational morphology is increasingly moving towards approaches suitable for languages with minimal or no annotated resources. First, we survey recent developments in computational morphology with a focus on low-resource languages. Second, we argue that the field is ready to tackle the logical next challenge: understanding a language’s morphology from raw text alone. We perform an empirical study on a truly unsupervised version of the paradigm completion task and show that, while existing state-of-the-art models bridged by two newly proposed models we devise perform reasonably, there is still much room for improvement. The stakes are high: solving this task will increase the language coverage of morphological resources by orders of magnitude. 2022.findings-acl.80 wiemerslage-etal-2022-morphological + 10.18653/v1/2022.findings-acl.80 Learning and Evaluating Character Representations in Novels @@ -1158,6 +1238,7 @@ 2022.findings-acl.81 inoue-etal-2022-learning naoya-i/charembench + 10.18653/v1/2022.findings-acl.81 Answer Uncertainty and Unanswerability in Multiple-Choice Machine Reading Comprehension @@ -1169,6 +1250,7 @@ raina-gales-2022-answer RACE ReClor + 10.18653/v1/2022.findings-acl.82 Measuring the Language of Self-Disclosure across Corpora @@ -1181,6 +1263,7 @@ Being able to reliably estimate self-disclosure – a key component of friendship and intimacy – from language is important for many psychology studies. We build single-task models on five self-disclosure corpora, but find that these models generalize poorly; the within-domain accuracy of predicted message-level self-disclosure of the best-performing model (mean Pearson’s r=0.69) is much higher than the respective across-dataset accuracy (mean Pearson’s r=0.32), due to both variations in the corpora (e.g., medical vs.
general topics) and labeling instructions (target variables: self-disclosure, emotional disclosure, intimacy). However, some lexical features, such as the expression of negative emotions and the use of first-person personal pronouns such as ‘I’, reliably predict self-disclosure across corpora. We develop a multi-task model that yields better results, with an average Pearson’s r of 0.37 for out-of-corpora prediction. 2022.findings-acl.83 reuel-etal-2022-measuring + 10.18653/v1/2022.findings-acl.83 When Chosen Wisely, More Data Is What You Need: A Universal Sample-Efficient Strategy For Data Augmentation @@ -1199,6 +1282,7 @@ QNLI SICK SQuAD + 10.18653/v1/2022.findings-acl.84 Explaining Classes through Stable Word Attributions @@ -1213,6 +1297,7 @@ 2022.findings-acl.85.software.tgz ronnqvist-etal-2022-explaining turkunlp/class-explainer + 10.18653/v1/2022.findings-acl.85 What to Learn, and How: <fixed-case>T</fixed-case>oward Effective Learning from Rationales @@ -1227,6 +1312,7 @@ FEVER MultiRC e-SNLI + 10.18653/v1/2022.findings-acl.86 Listening to Affected Communities to Define Extreme Speech: Dataset and Experiments @@ -1241,6 +1327,7 @@ 2022.findings-acl.87 maronikolakis-etal-2022-listening antmarakis/xtremespeech + 10.18653/v1/2022.findings-acl.87 Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists @@ -1255,6 +1342,7 @@ attanasio-etal-2022-entropy g8a9/ear MLMA Hate Speech + 10.18653/v1/2022.findings-acl.88 From <fixed-case>BERT</fixed-case>’s <fixed-case>P</fixed-case>oint of <fixed-case>V</fixed-case>iew: <fixed-case>R</fixed-case>evealing the <fixed-case>P</fixed-case>revailing <fixed-case>C</fixed-case>ontextual <fixed-case>D</fixed-case>ifferences @@ -1265,6 +1353,7 @@ 2022.findings-acl.89 2022.findings-acl.89.software.zip schuster-hegelich-2022-berts + 10.18653/v1/2022.findings-acl.89 Learning Bias-reduced Word Embeddings Using Dictionary Definitions @@ -1276,6 +1365,7 @@ 2022.findings-acl.90 an-etal-2022-learning haozhe-an/dd-glove + 10.18653/v1/2022.findings-acl.90 Knowledge Graph Embedding by Adaptive Limit Scoring Loss Using Dynamic Weighting Strategy @@ -1291,6 +1381,7 @@ 2022.findings-acl.91 yang-etal-2022-knowledge FB15k-237 + 10.18653/v1/2022.findings-acl.91 <fixed-case>OCR</fixed-case> Improves Machine Translation for Low-Resource Languages @@ -1302,6 +1393,7 @@ We aim to investigate the performance of current OCR systems on low resource languages and low resource scripts. We introduce and make publicly available a novel benchmark, OCR4MT, consisting of real and synthetic data, enriched with noise, for 60 low-resource languages in low resource scripts. We evaluate state-of-the-art OCR systems on our benchmark and analyse the most common errors. We show that OCR monolingual data is a valuable resource that can increase the performance of Machine Translation models when used in backtranslation. We then perform an ablation study to investigate how OCR errors impact Machine Translation performance and determine the minimum level of OCR quality needed for the monolingual data to be useful for Machine Translation.
2022.findings-acl.92 ignat-etal-2022-ocr + 10.18653/v1/2022.findings-acl.92 <fixed-case>C</fixed-case>o<fixed-case>C</fixed-case>o<fixed-case>LM</fixed-case>: Complex Commonsense Enhanced Language Model with Discourse Relations @@ -1319,6 +1411,7 @@ LAMA ROCStories SuperGLUE + 10.18653/v1/2022.findings-acl.93 Learning to Robustly Aggregate Labeling Functions for Semi-supervised Data Programming @@ -1334,6 +1427,7 @@ 2022.findings-acl.94.software.zip maheshwari-etal-2022-learning SST + 10.18653/v1/2022.findings-acl.94 Multi-Granularity Semantic Aware Graph Model for Reducing Position Bias in Emotion Cause Pair Extraction @@ -1346,6 +1440,7 @@ The emotion cause pair extraction (ECPE) task aims to extract emotions and causes as pairs from documents. We observe that the relative distance distribution of emotions and causes is extremely imbalanced in the typical ECPE dataset. Existing methods have set a fixed-size window to capture relations between neighboring clauses. However, they neglect the effective semantic connections between distant clauses, leading to poor generalization ability towards position-insensitive data. To alleviate the problem, we propose a novel \textbf{M}ulti-\textbf{G}ranularity \textbf{S}emantic \textbf{A}ware \textbf{G}raph model (MGSAG) to incorporate fine-grained and coarse-grained semantic features jointly, without regard to distance limitation. In particular, we first explore semantic dependencies between clauses and keywords extracted from the document that convey fine-grained semantic features, obtaining keyword-enhanced clause representations. Besides, a clause graph is also established to model coarse-grained semantic relations between clauses. Experimental results indicate that MGSAG surpasses the existing state-of-the-art ECPE models. Notably, MGSAG significantly outperforms other models on position-insensitive data. 2022.findings-acl.95 bao-etal-2022-multi + 10.18653/v1/2022.findings-acl.95 Cross-lingual Inference with A <fixed-case>C</fixed-case>hinese Entailment Graph @@ -1362,6 +1457,7 @@ teddy-li/chineseentgraph CLUE FIGER + 10.18653/v1/2022.findings-acl.96 Multi-task Learning for Paraphrase Generation With Keyword and Part-of-Speech Reconstruction @@ -1374,6 +1470,7 @@ 2022.findings-acl.97.software.zip xie-etal-2022-multi COCO + 10.18653/v1/2022.findings-acl.97 <fixed-case>MDCS</fixed-case>pell: A Multi-task Detector-Corrector Framework for <fixed-case>C</fixed-case>hinese Spelling Correction @@ -1385,6 +1482,7 @@ Chinese Spelling Correction (CSC) is a task to detect and correct misspelled characters in Chinese texts. CSC is challenging since many Chinese characters are visually or phonologically similar but with quite different semantic meanings. Many recent works use BERT-based language models to directly correct each character of the input sentence. However, these methods can be sub-optimal since they correct every character of the sentence only by the context, which is easily negatively affected by the misspelled characters. Some other works propose to use an error detector to guide the correction by masking the detected errors. Nevertheless, these methods dampen the visual or phonological features from the misspelled characters, which could be critical for correction.
In this work, we propose a novel general detector-corrector multi-task framework where the corrector uses BERT to capture the visual and phonological features from each character in the raw sentence and uses a late fusion strategy to fuse the hidden states of the corrector with those of the detector to minimize the negative impact from the misspelled characters. Comprehensive experiments on benchmarks demonstrate that our proposed method can significantly outperform the state-of-the-art methods in the CSC task. 2022.findings-acl.98 zhu-etal-2022-mdcspell + 10.18653/v1/2022.findings-acl.98 <fixed-case>S</fixed-case><tex-math>^2</tex-math><fixed-case>SQL</fixed-case>: Injecting Syntax to Question-Schema Interaction Graph Encoder for Text-to-<fixed-case>SQL</fixed-case> Parsers @@ -1402,6 +1500,7 @@ 2022.findings-acl.99.software.zip hui-etal-2022-s2sql SPIDER + 10.18653/v1/2022.findings-acl.99 Constructing Open Cloze Tests Using Generation and Discrimination Capabilities of Transformers @@ -1412,6 +1511,7 @@ This paper presents the first multi-objective transformer model for generating open cloze tests that exploits generation and discrimination capabilities to improve performance. Our model is further enhanced by tweaking its loss function and applying a post-processing re-ranking algorithm that improves overall test structure. Experiments using automatic and human evaluation show that our approach can achieve up to 82% accuracy according to experts, outperforming previous work and baselines. We also release a collection of high-quality open cloze tests along with sample system output and human annotations that can serve as a future benchmark. 2022.findings-acl.100 felice-etal-2022-constructing + 10.18653/v1/2022.findings-acl.100 <fixed-case>C</fixed-case>o-training an <fixed-case>U</fixed-case>nsupervised <fixed-case>C</fixed-case>onstituency <fixed-case>P</fixed-case>arser with <fixed-case>W</fixed-case>eak <fixed-case>S</fixed-case>upervision @@ -1424,6 +1524,7 @@ Nickil21/weakly-supervised-parsing Chinese Treebank Penn Treebank + 10.18653/v1/2022.findings-acl.101 <fixed-case>H</fixed-case>i<fixed-case>S</fixed-case>truct+: Improving Extractive Text Summarization with Hierarchical Structure Information @@ -1438,6 +1539,7 @@ QianRuan/histruct Pubmed arXiv + 10.18653/v1/2022.findings-acl.102 An Isotropy Analysis in the Multilingual <fixed-case>BERT</fixed-case> Embedding Space @@ -1448,6 +1550,7 @@ 2022.findings-acl.103 rajaee-pilehvar-2022-isotropy sara-rajaee/multilingual-isotropy + 10.18653/v1/2022.findings-acl.103 Multi-Stage Prompting for Knowledgeable Dialogue Generation @@ -1464,6 +1567,7 @@ liu-etal-2022-multi NVIDIA/Megatron-LM Wizard of Wikipedia + 10.18653/v1/2022.findings-acl.104 <tex-math>\textrm{DuReader}_{\textrm{vis}}</tex-math>: A <fixed-case>C</fixed-case>hinese Dataset for Open-domain Document Visual Question Answering @@ -1486,6 +1590,7 @@ InfographicVQA Natural Questions VisualMRC + 10.18653/v1/2022.findings-acl.105 Coloring the Blank Slate: Pre-training Imparts a Hierarchical Inductive Bias to Sequence-to-sequence Models @@ -1500,6 +1605,7 @@ mueller-etal-2022-coloring sebschu/multilingual-transformations mC4 + 10.18653/v1/2022.findings-acl.106 <fixed-case>C</fixed-case><tex-math>^3</tex-math><fixed-case>KG</fixed-case>: A <fixed-case>C</fixed-case>hinese Commonsense Conversation Knowledge Graph @@ -1517,6 +1623,7 @@ ATOMIC ConceptNet MOD + 10.18653/v1/2022.findings-acl.107 Graph Neural Networks for Multiparallel Word Alignment @@ -1529,6 +1636,7 @@ After a period of
decrease, interest in word alignments is increasing again for their usefulness in domains such as typological research, cross-lingual annotation projection and machine translation. Generally, alignment algorithms only use bitext and do not make use of the fact that many parallel corpora are multiparallel. Here, we compute high-quality word alignments between multiple language pairs by considering all language pairs together. First, we create a multiparallel word alignment graph, joining all bilingual word alignment pairs in one graph. Next, we use graph neural networks (GNNs) to exploit the graph structure. Our GNN approach (i) utilizes information about the meaning, position and language of the input words, (ii) incorporates information from multiple parallel sentences, (iii) adds and removes edges from the initial alignments, and (iv) yields a prediction model that can generalize beyond the training sentences. We show that community detection algorithms can provide valuable information for multiparallel word alignment. Our method outperforms previous work on three word alignment datasets and on a downstream task. 2022.findings-acl.108 imani-etal-2022-graph + 10.18653/v1/2022.findings-acl.108 Sentiment Word Aware Multimodal Refinement for Multimodal Sentiment Analysis with <fixed-case>ASR</fixed-case> Errors @@ -1545,6 +1653,7 @@ wu-etal-2022-sentiment albertwy/SWRM Multimodal Opinionlevel Sentiment Intensity + 10.18653/v1/2022.findings-acl.109 A Novel Framework Based on Medical Concept Driven Attention for Explainable Medical Code Prediction via External Knowledge @@ -1557,6 +1666,7 @@ Medical code prediction from clinical notes aims at automatically associating medical codes with the clinical notes. The rare code problem, i.e., medical codes with low occurrences, is prominent in medical code prediction. Recent studies employ deep neural networks and external knowledge to tackle it. However, such approaches lack interpretability, which is a vital issue in medical application. Moreover, due to the lengthy and noisy clinical notes, such approaches fail to achieve satisfactory results. Therefore, in this paper, we propose a novel framework based on medical concept driven attention to incorporate external knowledge for explainable medical code prediction. Specifically, both the clinical notes and Wikipedia documents are aligned into topic space to extract medical concepts using topic modeling. Then, the medical concept-driven attention mechanism is applied to uncover the medical code related concepts which provide explanations for medical code prediction. Experimental results on the benchmark dataset show the superiority of the proposed framework over several state-of-the-art baselines. 2022.findings-acl.110 wang-etal-2022-novel + 10.18653/v1/2022.findings-acl.110 Effective Unsupervised Constrained Text Generation based on Perturbed Masking @@ -1568,6 +1678,7 @@ Unsupervised constrained text generation aims to generate text under a given set of constraints without any supervised data. Current state-of-the-art methods stochastically sample edit positions and actions, which may cause unnecessary search steps. In this paper, we propose PMCTG to improve effectiveness by searching for the best edit position and action in each step. Specifically, PMCTG extends the perturbed masking technique to effectively search for the most incongruent token to edit. Then it introduces four multi-aspect scoring functions to select edit action to further reduce search difficulty.
Since PMCTG does not require supervised data, it could be applied to different generation tasks. We show that under the unsupervised setting, PMCTG achieves new state-of-the-art results in two representative tasks, namely keywords-to-sentence generation and paraphrasing. 2022.findings-acl.111 fu-etal-2022-effective + 10.18653/v1/2022.findings-acl.111 Combining (Second-Order) Graph-Based and Headed-Span-Based Projective Dependency Parsing @@ -1579,6 +1690,7 @@ yang-tu-2022-combining sustcsonglin/span-based-dependency-parsing Penn Treebank + 10.18653/v1/2022.findings-acl.112 End-to-End Speech Translation for Code Switched Speech @@ -1595,6 +1707,7 @@ weller-etal-2022-end apple/ml-code-switched-speech-translation CoVoST + 10.18653/v1/2022.findings-acl.113 A Transformational Biencoder with In-Domain Negative Sampling for Zero-Shot Entity Linking @@ -1609,6 +1722,7 @@ 2022.findings-acl.114.software.zip sun-etal-2022-transformational ZESHEL + 10.18653/v1/2022.findings-acl.114 Finding the Dominant Winning Ticket in Pre-Trained Language Models @@ -1626,6 +1740,7 @@ gong-etal-2022-finding GLUE QNLI + 10.18653/v1/2022.findings-acl.115 <fixed-case>T</fixed-case>hai Nested Named Entity Recognition Corpus @@ -1642,6 +1757,7 @@ CoNLL-2003 DaN+ NNE + 10.18653/v1/2022.findings-acl.116 Two-Step Question Retrieval for Open-Domain <fixed-case>QA</fixed-case> @@ -1660,6 +1776,7 @@ Natural Questions PAQ TriviaQA + 10.18653/v1/2022.findings-acl.117 Semantically Distributed Robust Optimization for Vision-and-Language Inference @@ -1674,6 +1791,7 @@ gokhale-etal-2022-semantically asu-apg/vli_sdro Violin + 10.18653/v1/2022.findings-acl.118 Learning from Missing Relations: Contrastive Learning with Commonsense Knowledge Graphs for Commonsense Inference @@ -1691,6 +1809,7 @@ yongho94/solar-framework_commonsense-inference ConceptNet Event2Mind + 10.18653/v1/2022.findings-acl.119 Capture Human Disagreement Distributions by Calibrated Networks for Natural Language Inference @@ -1708,6 +1827,7 @@ wang-etal-2022-capture ChaosNLI MultiNLI + 10.18653/v1/2022.findings-acl.120 Efficient, Uncertainty-based Moderation of Neural Networks Text Classifiers @@ -1720,6 +1840,7 @@ andersen-maalej-2022-efficient jsandersen/cmt IMDb Movie Reviews + 10.18653/v1/2022.findings-acl.121 Revisiting Automatic Evaluation of Extractive Summarization Task: Can We Do Better than <fixed-case>ROUGE</fixed-case>? @@ -1730,6 +1851,7 @@ It has been the norm for a long time to evaluate automated summarization tasks using the popular ROUGE metric. Although several studies in the past have highlighted the limitations of ROUGE, researchers have struggled to reach a consensus on a better alternative until today. One major limitation of the traditional ROUGE metric is the lack of semantic understanding (relies on direct overlap of n-grams). In this paper, we exclusively focus on the extractive summarization task and propose a semantic-aware nCG (normalized cumulative gain)-based evaluation metric (called Sem-nCG) for evaluating this task. One fundamental contribution of the paper is that it demonstrates how we can generate more reliable semantic-aware ground truths for evaluating extractive summarization tasks without any additional human intervention. To the best of our knowledge, this work is the first of its kind. We have conducted extensive experiments with this new metric using the widely used CNN/DailyMail dataset.
Experimental results show that the new Sem-nCG metric is indeed semantic-aware, shows higher correlation with human judgement (more reliable) and yields a large number of disagreements with the original ROUGE metric (suggesting that ROUGE often leads to inaccurate conclusions also verified by humans). 2022.findings-acl.122 akter-etal-2022-revisiting + 10.18653/v1/2022.findings-acl.122 Open Vocabulary Extreme Classification Using Generative Models @@ -1744,6 +1866,7 @@ The extreme multi-label classification (XMC) task aims at tagging content with a subset of labels from an extremely large label set. The label vocabulary is typically defined in advance by domain experts and assumed to capture all necessary tags. However, in real-world scenarios this label set, although large, is often incomplete and experts frequently need to refine it. To develop systems that simplify this process, we introduce the task of open vocabulary XMC (OXMC): given a piece of content, predict a set of labels, some of which may be outside of the known tag set. Hence, in addition to not having training data for some labels, as is the case in zero-shot classification, models need to invent some labels on-the-fly. We propose GROOV, a fine-tuned seq2seq model for OXMC that generates the set of labels as a flat sequence and is trained using a novel loss independent of predicted label order. We show the efficacy of the approach, experimenting with popular XMC datasets for which GROOV is able to predict meaningful labels outside the given vocabulary while performing on par with state-of-the-art solutions for known labels. 2022.findings-acl.123 simig-etal-2022-open + 10.18653/v1/2022.findings-acl.123 Decomposed Meta-Learning for Few-Shot Named Entity Recognition @@ -1761,6 +1884,7 @@ CoNLL 2002 Few-NERD WNUT 2017 + 10.18653/v1/2022.findings-acl.124 <fixed-case>T</fixed-case>eg<fixed-case>T</fixed-case>ok: Augmenting Text Generation via Task-specific and Open-world Knowledge @@ -1777,6 +1901,7 @@ 2022.findings-acl.125 tan-etal-2022-tegtok lxchtan/tegtok + 10.18653/v1/2022.findings-acl.125 <fixed-case>E</fixed-case>mo<fixed-case>C</fixed-case>aps: Emotion Capsule based Model for Conversational Emotion Recognition @@ -1791,6 +1916,7 @@ li-etal-2022-emocaps IEMOCAP MELD + 10.18653/v1/2022.findings-acl.126 Logic-Driven Context Extension and Data Augmentation for Logical Reasoning of Text @@ -1809,6 +1935,7 @@ wang-etal-2022-logic WangsyGit/LReasoner ReClor + 10.18653/v1/2022.findings-acl.127 Transfer Learning and Prediction Consistency for Detecting Offensive Spans of Text @@ -1823,6 +1950,7 @@ 2022.findings-acl.128 2022.findings-acl.128.software.zip pouran-ben-veyseh-etal-2022-transfer + 10.18653/v1/2022.findings-acl.128 Learning Reasoning Patterns for Relational Triple Extraction with Mutual Generation of Text and Graph @@ -1833,6 +1961,7 @@ Relational triple extraction is a critical task for constructing knowledge graphs. Existing methods focused on learning text patterns from explicit relational mentions. However, they usually suffered from ignoring relational reasoning patterns, and thus failed to extract the implicitly implied triples. Fortunately, the graph structure of a sentence’s relational triples can help find multi-hop reasoning paths. Moreover, the type inference logic through the paths can be captured with the sentence’s supplementary relational expressions that represent the real-world conceptual meanings of the paths’ composite relations.
In this paper, we propose a unified framework to learn the relational reasoning patterns for this task. To identify multi-hop reasoning paths, we construct a relational graph from the sentence (text-to-graph generation) and apply multi-layer graph convolutions to it. To capture the relation type inference logic of the paths, we propose to understand the unlabeled conceptual expressions by reconstructing the sentence from the relational graph (graph-to-text generation) in a self-supervised manner. Experimental results on several benchmark datasets demonstrate the effectiveness of our method. 2022.findings-acl.129 chen-etal-2022-learning + 10.18653/v1/2022.findings-acl.129 Document-Level Event Argument Extraction via Optimal Transport @@ -1846,6 +1975,7 @@ 2022.findings-acl.130 2022.findings-acl.130.software.zip pouran-ben-veyseh-etal-2022-document + 10.18653/v1/2022.findings-acl.130 N-Shot Learning for Augmenting Task-Oriented Dialogue State Tracking @@ -1858,6 +1988,7 @@ 2022.findings-acl.131 aksu-etal-2022-n MultiWOZ + 10.18653/v1/2022.findings-acl.131 Document-Level Relation Extraction with Adaptive Focal Loss and Knowledge Distillation @@ -1872,6 +2003,7 @@ tan-etal-2022-document tonytan48/kd-docre DocRED + 10.18653/v1/2022.findings-acl.132 Calibration of Machine Reading Systems at Scale @@ -1884,6 +2016,7 @@ 2022.findings-acl.133 dhuliawala-etal-2022-calibration Natural Questions + 10.18653/v1/2022.findings-acl.133 Towards Adversarially Robust Text Classifiers by Learning to Reweight Clean Examples @@ -1901,6 +2034,7 @@ xu-etal-2022-towards AG News SST + 10.18653/v1/2022.findings-acl.134 Morphosyntactic Tagging with Pre-trained Language Models for <fixed-case>A</fixed-case>rabic and its Dialects @@ -1912,6 +2046,7 @@ 2022.findings-acl.135 inoue-etal-2022-morphosyntactic camel-lab/camelbert_morphosyntactic_tagger + 10.18653/v1/2022.findings-acl.135 How Pre-trained Language Models Capture Factual Knowledge? A Causal-Inspired Analysis @@ -1929,6 +2064,7 @@ 2022.findings-acl.136 li-etal-2022-pre LAMA + 10.18653/v1/2022.findings-acl.136 Metadata Shaping: A Simple Approach for Knowledge-Enhanced Language Models @@ -1944,6 +2080,7 @@ FewRel Open Entity TACRED + 10.18653/v1/2022.findings-acl.137 Enhancing Natural Language Representation with Large-Scale Out-of-Domain Commonsense @@ -1959,6 +2096,7 @@ QNLI WSC WinoGrande + 10.18653/v1/2022.findings-acl.138 Weighted self Distillation for <fixed-case>C</fixed-case>hinese word segmentation @@ -1972,6 +2110,7 @@ 2022.findings-acl.139.software.zip he-etal-2022-weighted anzi20/weidc + 10.18653/v1/2022.findings-acl.139 Sibylvariant Transformations for Robust Text Classification @@ -1987,6 +2126,7 @@ AG News IMDb Movie Reviews SST + 10.18653/v1/2022.findings-acl.140 <fixed-case>D</fixed-case>a<fixed-case>LC</fixed-case>: Domain Adaptation Learning Curve Prediction for Neural Machine Translation @@ -1999,6 +2139,7 @@ Domain Adaptation (DA) of Neural Machine Translation (NMT) model often relies on a pre-trained general NMT model which is adapted to the new domain on a sample of in-domain parallel data. Without parallel data, there is no way to estimate the potential benefit of DA, nor the amount of parallel samples it would require. It is however a desirable functionality that could help MT practitioners to make an informed decision before investing resources in dataset creation. We propose a Domain adaptation Learning Curve prediction (DaLC) model that predicts prospective DA performance based on in-domain monolingual samples in the source language. 
Our model relies on the NMT encoder representations combined with various instance and corpus-level features. We demonstrate that instance-level features are better able to distinguish between different domains compared to corpus-level frameworks proposed in previous studies. Finally, we perform in-depth analyses of the results, highlighting the limitations of our approach, and provide directions for future research. 2022.findings-acl.141 park-etal-2022-dalc + 10.18653/v1/2022.findings-acl.141 Hey <fixed-case>AI</fixed-case>, Can You Solve Complex Tasks by Talking to Agents? @@ -2013,6 +2154,7 @@ allenai/commaqa DROP MathQA + 10.18653/v1/2022.findings-acl.142 Modality-specific Learning Rates for Effective Multimodal Additive Late-fusion @@ -2023,6 +2165,7 @@ 2022.findings-acl.143 yao-mihalcea-2022-modality MELD + 10.18653/v1/2022.findings-acl.143 <fixed-case>B</fixed-case>i<fixed-case>S</fixed-case>yn-<fixed-case>GAT</fixed-case>+: Bi-Syntax Aware Graph Attention Network for Aspect-based Sentiment Analysis @@ -2037,6 +2180,7 @@ liang-etal-2022-bisyn CCIIPLab/BiSyn_GAT_plus MAMS + 10.18653/v1/2022.findings-acl.144 <fixed-case>I</fixed-case>ndic<fixed-case>BART</fixed-case>: A Pre-trained Model for Indic Natural Language Generation @@ -2054,6 +2198,7 @@ FLoRes IndicCorp Samanantar + 10.18653/v1/2022.findings-acl.145 Sentence-T5: Scalable Sentence Encoders from Pre-trained Text-to-Text Models @@ -2073,6 +2218,7 @@ ReQA SentEval SuperGLUE + 10.18653/v1/2022.findings-acl.146 Improving Relation Extraction through Syntax-induced Pre-training with Dependency Masking @@ -2088,6 +2234,7 @@ Updated code link in footnote. Penn Treebank SemEval-2010 Task 8 + 10.18653/v1/2022.findings-acl.147 Striking a Balance: Alleviating Inconsistency in Pre-trained Models for Symmetric Classification Tasks @@ -2103,6 +2250,7 @@ PAWS QNLI SST + 10.18653/v1/2022.findings-acl.148 Diversifying Content Generation for Commonsense Reasoning with Mixture of Knowledge Graph Experts @@ -2117,6 +2265,7 @@ 2022.findings-acl.149 yu-etal-2022-diversifying DM2-ND/MoKGE + 10.18653/v1/2022.findings-acl.149 Dict-<fixed-case>BERT</fixed-case>: Enhancing Language Model Pre-training with Dictionary @@ -2135,6 +2284,7 @@ GLUE QNLI WNLaMPro + 10.18653/v1/2022.findings-acl.150 A Feasibility Study of Answer-Unaware Question Generation for Education @@ -2151,6 +2301,7 @@ 2022.findings-acl.151 dugan-etal-2022-feasibility liamdugan/summary-qg + 10.18653/v1/2022.findings-acl.151 Relevant <fixed-case>C</fixed-case>ommon<fixed-case>S</fixed-case>ense Subgraphs for “What if...” Procedural Reasoning @@ -2162,6 +2313,7 @@ zheng-kordjamshidi-2022-relevant ConceptNet WIQA + 10.18653/v1/2022.findings-acl.152 Combining Feature and Instance Attribution to Detect Artifacts @@ -2176,6 +2328,7 @@ BoolQ IMDb Movie Reviews SuperGLUE + 10.18653/v1/2022.findings-acl.153 Leveraging Expert Guided Adversarial Augmentation For Improving Generalization in Named Entity Recognition @@ -2191,6 +2344,7 @@ reich-etal-2022-leveraging gt-salt/guided-adversarial-augmentation CoNLL-2003 + 10.18653/v1/2022.findings-acl.154 Label Semantics for Few Shot Named Entity Recognition @@ -2208,6 +2362,7 @@ CoNLL-2003 NCBI Disease WNUT 2017 + 10.18653/v1/2022.findings-acl.155 Detection, Disambiguation, Re-ranking: Autoregressive Entity Linking as a Multi-Task Problem @@ -2223,6 +2378,7 @@ mrini-etal-2022-detection AIDA CoNLL-YAGO COMETA + 10.18653/v1/2022.findings-acl.156 <fixed-case>VISITRON</fixed-case>: Visual Semantics-Aligned Interactively Trained Object-Navigator @@ -2240,6 +2396,7 @@
alexa/visitron Matterport3D RxR + 10.18653/v1/2022.findings-acl.157 Investigating Selective Prediction Approaches Across Several Tasks in <fixed-case>IID</fixed-case>, <fixed-case>OOD</fixed-case>, and Adversarial Settings @@ -2252,6 +2409,7 @@ 2022.findings-acl.158.software.zip varshney-etal-2022-investigating SNLI + 10.18653/v1/2022.findings-acl.158 Unsupervised Natural Language Inference Using <fixed-case>PHL</fixed-case> Triplet Generation @@ -2269,6 +2427,7 @@ ConceptNet MultiNLI SNLI + 10.18653/v1/2022.findings-acl.159 Data Augmentation and Learned Layer Aggregation for Improved Multilingual Language Understanding in Dialogue @@ -2281,6 +2440,7 @@ razumovskaia-etal-2022-data CC100 xSID + 10.18653/v1/2022.findings-acl.160 Ranking-Constrained Learning with Rationales for Text Classification @@ -2291,6 +2451,7 @@ We propose a novel approach that jointly utilizes the labels and elicited rationales for text classification to speed up the training of deep learning models with limited training data. We define and optimize a ranking-constrained loss function that combines cross-entropy loss with ranking losses as rationale constraints. We evaluate our proposed rationale-augmented learning approach on three human-annotated datasets, and show that our approach provides significant improvements over classification approaches that do not utilize rationales as well as other state-of-the-art rationale-augmented baselines. 2022.findings-acl.161 wang-etal-2022-ranking + 10.18653/v1/2022.findings-acl.161 <fixed-case>C</fixed-case>a<fixed-case>M</fixed-case>-<fixed-case>G</fixed-case>en: <fixed-case>C</fixed-case>ausally Aware Metric-Guided Text Generation @@ -2304,6 +2465,7 @@ Content is created for a well-defined purpose, often described by a metric or signal represented in the form of structured information. The relationship between the goal (metrics) of target content and the content itself is non-trivial. While large-scale language models show promising text generation capabilities, guiding the generated text with external metrics is challenging. These metrics and content tend to have inherent relationships and not all of them may be of consequence. We introduce CaM-Gen: Causally aware Generative Networks guided by user-defined target metrics incorporating the causal relationships between the metric and content features. We leverage causal inference techniques to identify causally significant aspects of a text that lead to the target metric and then explicitly guide generative models towards these by a feedback mechanism. We propose this mechanism for variational autoencoder and Transformer-based generative models. The proposed models beat baselines in terms of the target metric control while maintaining fluency and language quality of the generated text. To the best of our knowledge, this is one of the early attempts at controlled generation incorporating a metric guide using causal inference. 2022.findings-acl.162 goyal-etal-2022-cam + 10.18653/v1/2022.findings-acl.162 Training Dynamics for Text Summarization Models @@ -2315,6 +2477,7 @@ Pre-trained language models (e.g. BART) have shown impressive results when fine-tuned on large summarization datasets. However, little is understood about this fine-tuning process, including what knowledge is retained from pre-training time or how content selection and generation strategies are learnt across iterations. In this work, we analyze the training dynamics for generation models, focusing on summarization.
Across different datasets (CNN/DM, XSum, MediaSum) and summary properties, such as abstractiveness and hallucination, we study what the model learns at different stages of its fine-tuning process. We find that a propensity to copy the input is learned early in the training process consistently across all datasets studied. On the other hand, factual errors, such as hallucination of unsupported facts, are learnt in the later stages, though this behavior is more varied across domains. Based on these observations, we explore complementary approaches for modifying training: first, disregarding high-loss tokens that are challenging to learn and second, disregarding low-loss tokens that are learnt very quickly in the latter stages of the training process. We show that these simple training modifications allow us to configure our model to achieve different goals, such as improving factuality or improving abstractiveness. 2022.findings-acl.163 goyal-etal-2022-training + 10.18653/v1/2022.findings-acl.163 Richer Countries and Richer Representations @@ -2326,6 +2489,7 @@ 2022.findings-acl.164 zhou-etal-2022-richer katezhou/country_distortions + 10.18653/v1/2022.findings-acl.164 <fixed-case>BBQ</fixed-case>: A hand-built bias benchmark for question answering @@ -2344,6 +2508,7 @@ nyu-mll/bbq BBQ RACE + 10.18653/v1/2022.findings-acl.165 Zero-shot Learning for Grapheme to Phoneme Conversion with Language Ensemble @@ -2357,6 +2522,7 @@ 2022.findings-acl.166 li-etal-2022-zero xinjli/transphone + 10.18653/v1/2022.findings-acl.166 Dim Wihl Gat Tun: <fixed-case>T</fixed-case>he Case for Linguistic Expertise in <fixed-case>NLP</fixed-case> for Under-Documented Languages @@ -2371,6 +2537,7 @@ Recent progress in NLP is driven by pretrained models leveraging massive datasets and has predominantly benefited the world’s political and economic superpowers. Technologically underserved languages are left behind because they lack such resources. Hundreds of underserved languages, nevertheless, have available data sources in the form of interlinear glossed text (IGT) from language documentation efforts. IGT remains underutilized in NLP work, perhaps because its annotations are only semi-structured and often language-specific. With this paper, we make the case that IGT data can be leveraged successfully provided that target language expertise is available. We specifically advocate for collaboration with documentary linguists. Our paper provides a roadmap for successful projects utilizing IGT data: (1) It is essential to define which NLP tasks can be accomplished with the given IGT data and how these will benefit the speech community. (2) Great care and target language expertise is required when converting the data into structured formats commonly employed in NLP. (3) Task-specific and user-specific evaluation can help to ascertain that the tools which are created benefit the target language speech community. We illustrate each step through a case study on developing a morphological reinflection system for the Tsimchianic language Gitksan. 
2022.findings-acl.167 forbes-etal-2022-dim + 10.18653/v1/2022.findings-acl.167 Question Generation for Reading Comprehension Assessment by Modeling How and What to Ask @@ -2385,6 +2552,7 @@ ghanem-etal-2022-question CosmosQA SQuAD + 10.18653/v1/2022.findings-acl.168 <fixed-case>TAB</fixed-case>i: <fixed-case>T</fixed-case>ype-Aware Bi-Encoders for Open-Domain Entity Retrieval @@ -2400,6 +2568,7 @@ FIGER KILT Natural Questions + 10.18653/v1/2022.findings-acl.169 Hierarchical Recurrent Aggregative Generation for Few-Shot <fixed-case>NLG</fixed-case> @@ -2411,6 +2580,7 @@ 2022.findings-acl.170 zhou-etal-2022-hierarchical SGD + 10.18653/v1/2022.findings-acl.170 Training Text-to-Text Transformers with Privacy Guarantees @@ -2424,6 +2594,7 @@ C4 GLUE QNLI + 10.18653/v1/2022.findings-acl.171 Revisiting Uncertainty-based Query Strategies for Active Learning with Transformers @@ -2439,6 +2610,7 @@ MR SUBJ TREC-10 + 10.18653/v1/2022.findings-acl.172 The impact of lexical and grammatical processing on generating code from natural language @@ -2451,6 +2623,7 @@ codegenfact/BertranX CoNaLa Django + 10.18653/v1/2022.findings-acl.173 <fixed-case>S</fixed-case>eq2<fixed-case>P</fixed-case>ath: Generating Sentiment Tuples as Paths of a Tree @@ -2464,6 +2637,7 @@ 2022.findings-acl.174 2022.findings-acl.174.software.zip mao-etal-2022-seq2path + 10.18653/v1/2022.findings-acl.174 Mitigating the Inconsistency Between Word Saliency and Model Confidence with Pathological Contrastive Training @@ -2479,6 +2653,7 @@ zhan-etal-2022-mitigating AG News IMDb Movie Reviews + 10.18653/v1/2022.findings-acl.175 Your fairness may vary: Pretrained language model fairness in toxic text classification @@ -2492,6 +2667,7 @@ 2022.findings-acl.176 baldini-etal-2022-fairness HateXplain + 10.18653/v1/2022.findings-acl.176 <fixed-case>C</fixed-case>hart<fixed-case>QA</fixed-case>: A Benchmark for Question Answering about Charts with Visual and Logical Reasoning @@ -2509,6 +2685,7 @@ FigureQA LEAF-QA PlotQA + 10.18653/v1/2022.findings-acl.177 A Novel Perspective to Look At Attention: Bi-level Attention-based Explainable Topic Modeling for News Classification @@ -2520,6 +2697,7 @@ 2022.findings-acl.178 liu-etal-2022-novel MIND + 10.18653/v1/2022.findings-acl.178 Learn and Review: Enhancing Continual Named Entity Recognition via Reviewing Synthetic Samples @@ -2535,6 +2713,7 @@ 2022.findings-acl.179 xia-etal-2022-learn CoNLL-2003 + 10.18653/v1/2022.findings-acl.179 Phoneme transcription of endangered languages: an evaluation of recent <fixed-case>ASR</fixed-case> architectures in the single speaker scenario @@ -2543,6 +2722,7 @@ Transcription is often reported as the bottleneck in endangered language documentation, requiring large efforts from scarce speakers and transcribers. In general, automatic speech recognition (ASR) can be accurate enough to accelerate transcription only if trained on large amounts of transcribed data. However, when a single speaker is involved, several studies have reported encouraging results for phonetic transcription even with small amounts of training. Here we expand this body of work on speaker-dependent transcription by comparing four ASR approaches, notably recent transformer and pretrained multilingual models, on a common dataset of 11 languages. 
To automate data preparation, training and evaluation steps, we also developed a phoneme recognition setup which handles morphologically complex languages and writing systems for which no pronunciation dictionary exists. We find that fine-tuning a multilingual pretrained model yields an average phoneme error rate (PER) of 15% for 6 languages with 99 minutes or less of transcribed data for training. For the 5 languages with between 100 and 192 minutes of training, we achieved a PER of 8.4% or less. These results on a number of varied languages suggest that ASR can now significantly reduce transcription efforts in the speaker-dependent situation common in endangered language work. 2022.findings-acl.180 boulianne-2022-phoneme + 10.18653/v1/2022.findings-acl.180 Does <fixed-case>BERT</fixed-case> really agree ? Fine-grained Analysis of Lexical Dependence on a Syntactic Task @@ -2553,6 +2733,7 @@ Although transformer-based Neural Language Models demonstrate impressive performance on a variety of tasks, their generalization abilities are not well understood. They have been shown to perform strongly on subject-verb number agreement in a wide array of settings, suggesting that they learned to track syntactic dependencies during their training even without explicit supervision. In this paper, we examine the extent to which BERT is able to perform lexically-independent subject-verb number agreement (NA) on targeted syntactic templates. To do so, we disrupt the lexical patterns found in naturally occurring stimuli for each targeted structure in a novel fine-grained analysis of BERT’s behavior. Our results on nonce sentences suggest that the model generalizes well for simple templates, but fails to perform lexically-independent syntactic generalization when as little as one attractor is present. 2022.findings-acl.181 lasri-etal-2022-bert + 10.18653/v1/2022.findings-acl.181 Combining Static and Contextualised Multilingual Embeddings @@ -2567,6 +2748,7 @@ kathyhaem/combining-static-contextual TyDi QA XQuAD + 10.18653/v1/2022.findings-acl.182 An Accurate Unsupervised Method for Joint Entity Alignment and Dangling Entity Detection @@ -2578,6 +2760,7 @@ 2022.findings-acl.183.software.zip luo-yu-2022-accurate luosx18/ued + 10.18653/v1/2022.findings-acl.183 Square One Bias in <fixed-case>NLP</fixed-case>: Towards a Multi-Dimensional Exploration of the Research Manifold @@ -2588,6 +2771,7 @@ The prototypical NLP experiment trains a standard architecture on labeled English data and optimizes for accuracy, without accounting for other dimensions such as fairness, interpretability, or computational efficiency. We show through a manual classification of recent NLP research papers that this is indeed the case and refer to it as the square one experimental setup. We observe that NLP research often goes beyond the square one setup, e.g., focusing not only on accuracy, but also on fairness or interpretability, but typically only along a single dimension. Most work targeting multilinguality, for example, considers only accuracy; most work on fairness or interpretability considers only English; and so on. Such one-dimensionality of most research means we are only exploring a fraction of the NLP research search space. We provide historical and recent examples of how the square one bias has led researchers to draw false conclusions or make unwise choices, point to promising yet unexplored directions on the research manifold, and make practical recommendations to enable more multi-dimensional research.
We open-source the results of our annotations to enable further analysis. 2022.findings-acl.184 ruder-etal-2022-square + 10.18653/v1/2022.findings-acl.184 Systematicity, Compositionality and Transitivity of Deep <fixed-case>NLP</fixed-case> Models: a Metamorphic Testing Perspective @@ -2601,6 +2785,7 @@ 2022.findings-acl.185 2022.findings-acl.185.software.zip manino-etal-2022-systematicity + 10.18653/v1/2022.findings-acl.185 Improving Neural Political Statement Classification with Class Hierarchical Information @@ -2616,6 +2801,7 @@ 2022.findings-acl.186 2022.findings-acl.186.software.zip dayanik-etal-2022-improving + 10.18653/v1/2022.findings-acl.186 Enabling Multimodal Generation on <fixed-case>CLIP</fixed-case> via Vision-Language Knowledge Distillation @@ -2633,6 +2819,7 @@ GLUE OK-VQA nocaps + 10.18653/v1/2022.findings-acl.187 Co-<fixed-case>VQA</fixed-case> : Answering by Interactive Sub Question Sequence @@ -2649,6 +2836,7 @@ Visual Genome Visual Question Answering Visual Question Answering v2.0 + 10.18653/v1/2022.findings-acl.188 A Simple Hash-Based Early Exiting Approach For Language Understanding and Generation @@ -2671,6 +2859,7 @@ MRPC SNLI SST + 10.18653/v1/2022.findings-acl.189 Auxiliary tasks to boost Biaffine Semantic Dependency Parsing @@ -2681,6 +2870,7 @@ 2022.findings-acl.190.software.tgz candito-2022-auxiliary mcandito/aux-tasks-biaffine-graph-parser-findingsacl22 + 10.18653/v1/2022.findings-acl.190 Syntax-guided Contrastive Learning for Pre-trained Language Model @@ -2699,6 +2889,7 @@ GLUE Open Entity QNLI + 10.18653/v1/2022.findings-acl.191 Improved Multi-label Classification under Temporal Concept Drift: Rethinking Group-Robust Algorithms in a Label-Wise Setting @@ -2711,6 +2902,7 @@ chalkidis-sogaard-2022-improved coastalcph/lw-robust BioASQ + 10.18653/v1/2022.findings-acl.192 <fixed-case>ASCM</fixed-case>: An Answer Space Clustered Prompting Method without Answer Engineering @@ -2726,6 +2918,7 @@ 2022.findings-acl.193 wang-etal-2022-ascm miaomiao1215/ascm + 10.18653/v1/2022.findings-acl.193 Why don’t people use character-level machine translation? 
@@ -2737,6 +2930,7 @@ 2022.findings-acl.194 2022.findings-acl.194.software.tgz libovicky-etal-2022-dont + 10.18653/v1/2022.findings-acl.194 Seeking Patterns, Not just Memorizing Procedures: Contrastive Learning for Solving Math Word Problems @@ -2754,6 +2948,7 @@ zwx980624/mwp-cl Math23K MathQA + 10.18653/v1/2022.findings-acl.195 x<fixed-case>GQA</fixed-case>: Cross-Lingual Visual Question Answering @@ -2772,6 +2967,7 @@ GQA IGLUE MultiSubs + 10.18653/v1/2022.findings-acl.196 Automatic Speech Recognition and Query By Example for Creole Languages Documentation @@ -2784,6 +2980,7 @@ 2022.findings-acl.197 macaire-etal-2022-automatic macairececile/asr-qbe-creole + 10.18653/v1/2022.findings-acl.197 <fixed-case>MR</fixed-case>e<fixed-case>D</fixed-case>: A Meta-Review Dataset for Structure-Controllable Text Generation @@ -2799,6 +2996,7 @@ shen-etal-2022-mred shen-chenhui/mred CNN/Daily Mail + 10.18653/v1/2022.findings-acl.198 Single Model Ensemble for Subword Regularized Models in Low-Resource Machine Translation @@ -2809,6 +3007,7 @@ Subword regularizations use multiple subword segmentations during training to improve the robustness of neural machine translation models. In previous subword regularizations, we use multiple segmentations in the training process but use only one segmentation in the inference. In this study, we propose an inference strategy to address this discrepancy. The proposed strategy approximates the marginalized likelihood by using multiple segmentations including the most plausible segmentation and several sampled segmentations. Because the proposed strategy aggregates predictions from several segmentations, we can regard it as a single model ensemble that does not require any additional cost for training. Experimental results show that the proposed strategy improves the performance of models trained with subword regularization in low-resource machine translation tasks. 2022.findings-acl.199 takase-etal-2022-single + 10.18653/v1/2022.findings-acl.199 Detecting Various Types of Noise for Neural Machine Translation @@ -2820,6 +3019,7 @@ The filtering and/or selection of training data is one of the core aspects to be considered when building a strong machine translation system. In their influential work, Khayrallah and Koehn (2018) investigated the impact of different types of noise on the performance of machine translation systems. In the same year the WMT introduced a shared task on parallel corpus filtering, which went on to be repeated in the following years, and resulted in many different filtering approaches being proposed. In this work we aim to combine the recent achievements in data filtering with the original analysis of Khayrallah and Koehn (2018) and investigate whether state-of-the-art filtering systems are capable of removing all the suggested noise types. We observe that most of these types of noise can be detected with an accuracy of over 90% by modern filtering systems when operating in a well studied high resource setting. However, we also find that when confronted with more refined noise categories or when working with a less common language pair, the performance of the filtering systems is far from optimal, showing that there is still room for improvement in this area of research.
2022.findings-acl.200 herold-etal-2022-detecting + 10.18653/v1/2022.findings-acl.200 <fixed-case>DU</fixed-case>-<fixed-case>VLG</fixed-case>: Unifying Vision-and-Language Generation via Dual Sequence-to-Sequence Pre-training @@ -2833,6 +3033,7 @@ 2022.findings-acl.201 huang-etal-2022-du COCO + 10.18653/v1/2022.findings-acl.201 <fixed-case>H</fixed-case>i<fixed-case>CLRE</fixed-case>: A Hierarchical Contrastive Learning Framework for Distantly Supervised Relation Extraction @@ -2846,6 +3047,7 @@ 2022.findings-acl.202 li-etal-2022-hiclre matnlp/hiclre + 10.18653/v1/2022.findings-acl.202 Prompt-Driven Neural Machine Translation @@ -2858,6 +3060,7 @@ 2022.findings-acl.203 li-etal-2022-prompt yafuly/promptnmt + 10.18653/v1/2022.findings-acl.203 On Controlling Fallback Responses for Grounded Dialogue Generation @@ -2870,6 +3073,7 @@ 2022.findings-acl.204 2022.findings-acl.204.software.zip lu-etal-2022-controlling + 10.18653/v1/2022.findings-acl.204 <fixed-case>CRAFT</fixed-case>: A Benchmark for Causal Reasoning About Forces and in<fixed-case>T</fixed-case>eractions @@ -2891,6 +3095,7 @@ PHYRE TVQA TVQA+ + 10.18653/v1/2022.findings-acl.205 A Graph Enhanced <fixed-case>BERT</fixed-case> Model for Event Prediction @@ -2905,6 +3110,7 @@ 2022.findings-acl.206.software.zip du-etal-2022-graph ROCStories + 10.18653/v1/2022.findings-acl.206 Long Time No See! Open-Domain Conversation with Long-Term Persona Memory @@ -2921,6 +3127,7 @@ xu-etal-2022-long PaddlePaddle/Research DuLeMon + 10.18653/v1/2022.findings-acl.207 Lacking the Embedding of a Word? Look it up into a Traditional Dictionary @@ -2935,6 +3142,7 @@ 2022.findings-acl.208 2022.findings-acl.208.software.zip ruzzetti-etal-2022-lacking + 10.18653/v1/2022.findings-acl.208 <fixed-case>MTR</fixed-case>ec: Multi-Task Learning over <fixed-case>BERT</fixed-case> for News Recommendation @@ -2949,6 +3157,7 @@ 2022.findings-acl.209 bi-etal-2022-mtrec MIND + 10.18653/v1/2022.findings-acl.209 Cross-domain Named Entity Recognition via Graph Matching @@ -2961,6 +3170,7 @@ 2022.findings-acl.210.software.zip zheng-etal-2022-cross CrossNER + 10.18653/v1/2022.findings-acl.210 Assessing Multilingual Fairness in Pre-trained Multimodal Representations @@ -2973,6 +3183,7 @@ 2022.findings-acl.211.software.tgz wang-etal-2022-assessing FairFace + 10.18653/v1/2022.findings-acl.211 More Than Words: Collocation Retokenization for <fixed-case>L</fixed-case>atent <fixed-case>D</fixed-case>irichlet <fixed-case>A</fixed-case>llocation Models @@ -2983,6 +3194,7 @@ Traditionally, Latent Dirichlet Allocation (LDA) ingests words in a collection of documents to discover their latent topics using word-document co-occurrences. Previous studies show that representing bigrams collocations in the input can improve topic coherence in English. However, it is unclear how to achieve the best results for languages without marked word boundaries such as Chinese and Thai. Here, we explore the use of retokenization based on chi-squared measures, t-statistics, and raw frequency to merge frequent token ngrams into collocations when preparing input to the LDA model. Based on the goodness of fit and the coherence metric, we show that topics trained with merged tokens result in topic keys that are clearer, more coherent, and more effective at distinguishing topics than those of unmerged models. 
2022.findings-acl.212 cheevaprawatdomrong-etal-2022-words + 10.18653/v1/2022.findings-acl.212 <i>Generalized but not Robust?</i> Comparing the Effects of Data Modification Methods on Out-of-Domain Generalization and Adversarial Robustness @@ -3001,6 +3213,7 @@ Natural Questions SNLI SVHN + 10.18653/v1/2022.findings-acl.213 <fixed-case>ASSIST</fixed-case>: Towards Label Noise-Robust Dialogue State Tracking @@ -3015,6 +3228,7 @@ smartyfh/dst-assist MultiWOZ SGD + 10.18653/v1/2022.findings-acl.214 Graph Refinement for Coreference Resolution @@ -3024,6 +3238,7 @@ The state-of-the-art models for coreference resolution are based on independent mention pair-wise decisions. We propose a modelling approach that learns coreference at the document-level and takes global decisions. For this purpose, we model coreference links in a graph structure where the nodes are tokens in the text, and the edges represent the relationship between them. Our model predicts the graph in a non-autoregressive manner, then iteratively refines it based on previous predictions, allowing global dependencies between decisions. The experimental results show improvements over various baselines, reinforcing the hypothesis that document-level information improves coreference resolution. 2022.findings-acl.215 miculicich-henderson-2022-graph + 10.18653/v1/2022.findings-acl.215 <fixed-case>ECO</fixed-case> v1: Towards Event-Centric Opinion Mining @@ -3040,6 +3255,7 @@ 2022.findings-acl.216 2022.findings-acl.216.software.zip xu-etal-2022-eco + 10.18653/v1/2022.findings-acl.216 Deep Reinforcement Learning for Entity Alignment @@ -3053,6 +3269,7 @@ 2022.findings-acl.217.software.zip guo-etal-2022-deep guolingbing/rlea + 10.18653/v1/2022.findings-acl.217 Breaking Down Multilingual Machine Translation @@ -3064,6 +3281,7 @@ While multilingual training is now an essential ingredient in machine translation (MT) systems, recent work has demonstrated that it has different effects in different multilingual settings, such as many-to-one, one-to-many, and many-to-many learning. These training settings expose the encoder and the decoder of a machine translation model to different data distributions. In this paper, we examine how different varieties of multilingual training contribute to learning these two components of the MT model. Specifically, we compare bilingual models with encoders and/or decoders initialized by multilingual training. We show that multilingual training is beneficial to encoders in general, while it only benefits decoders for low-resource languages (LRLs). We further find the important attention heads for each language pair and compare their correlations during inference. Our analysis sheds light on how multilingual translation models work and also enables us to propose methods to improve performance by training with highly related languages. Our many-to-one models for high-resource languages and one-to-many models for LRLs outperform the best results reported by Aharoni et al. (2019).
2022.findings-acl.218 chiang-etal-2022-breaking + 10.18653/v1/2022.findings-acl.218 Mitigating Contradictions in Dialogue Based on Contrastive Learning @@ -3076,6 +3294,7 @@ 2022.findings-acl.219 2022.findings-acl.219.software.zip li-etal-2022-mitigating + 10.18653/v1/2022.findings-acl.219 <fixed-case>ELLE</fixed-case>: Efficient Lifelong Pre-training for Emerging Data @@ -3092,6 +3311,7 @@ 2022.findings-acl.220.software.zip qin-etal-2022-elle thunlp/elle + 10.18653/v1/2022.findings-acl.220 <fixed-case>E</fixed-case>n<fixed-case>CBP</fixed-case>: A New Benchmark Dataset for Finer-Grained Cultural Background Prediction in <fixed-case>E</fixed-case>nglish @@ -3109,6 +3329,7 @@ GoEmotions QNLI SST + 10.18653/v1/2022.findings-acl.221 Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models @@ -3126,6 +3347,7 @@ ucinlp/null-prompts GLUE QNLI + 10.18653/v1/2022.findings-acl.222 u<fixed-case>FACT</fixed-case>: Unfaithful Alien-Corpora Training for Semantically Consistent Data-to-Text Generation @@ -3137,6 +3359,7 @@ 2022.findings-acl.223 anders-etal-2022-ufact ViGGO + 10.18653/v1/2022.findings-acl.223 Good Night at 4 pm?! Time Expressions in Different Cultures @@ -3146,6 +3369,7 @@ 2022.findings-acl.224 shwartz-2022-good vered1986/time_expressions + 10.18653/v1/2022.findings-acl.224 Extracting Person Names from User Generated Text: Named-Entity Recognition for Combating Human Trafficking @@ -3158,6 +3382,7 @@ 2022.findings-acl.225 li-etal-2022-extracting WNUT 2017 + 10.18653/v1/2022.findings-acl.225 <fixed-case>O</fixed-case>ne<fixed-case>A</fixed-case>ligner: Zero-shot Cross-lingual Transfer with One Rich-Resource Language Pair for Low-Resource Sentence Retrieval @@ -3171,6 +3396,7 @@ niu-etal-2022-onealigner CC100 WikiMatrix + 10.18653/v1/2022.findings-acl.226 Suum Cuique: Studying Bias in Taboo Detection with a Community Perspective @@ -3184,6 +3410,7 @@ khalid-etal-2022-suum jonrusert/suumcuique OLID + 10.18653/v1/2022.findings-acl.227 Modeling Intensification for Sign Language Generation: A Computational Approach @@ -3199,6 +3426,7 @@ inan-etal-2022-modeling merterm/modeling-intensification-for-slg PHOENIX14T + 10.18653/v1/2022.findings-acl.228 Controllable Natural Language Generation with Contrastive Prefixes @@ -3212,6 +3440,7 @@ 2022.findings-acl.229 qian-etal-2022-controllable AG News + 10.18653/v1/2022.findings-acl.229 Revisiting the Effects of Leakage on Dependency Parsing @@ -3223,6 +3452,7 @@ 2022.findings-acl.230 krasner-etal-2022-revisiting miriamwanner/reu-nlp-project + 10.18653/v1/2022.findings-acl.230 Learning to Describe Solutions for Bug Reports Based on Developer Discussions @@ -3235,6 +3465,7 @@ 2022.findings-acl.231 panthaplackel-etal-2022-learning panthap2/describing-bug-report-solutions + 10.18653/v1/2022.findings-acl.231 Perturbations in the Wild: Leveraging Human-Written Text Perturbations for Realistic Adversarial Attack and Defense @@ -3248,6 +3479,7 @@ 2022.findings-acl.232 le-etal-2022-perturbations lethaiq/perturbations-in-the-wild + 10.18653/v1/2022.findings-acl.232 Improving <fixed-case>C</fixed-case>hinese Grammatical Error Detection via Data augmentation by Conditional Error Generation @@ -3261,6 +3493,7 @@ Chinese Grammatical Error Detection (CGED) aims at detecting grammatical errors in Chinese texts. One of the main challenges for CGED is the lack of annotated data.
To alleviate this problem, previous studies proposed various methods to automatically generate more training samples, which can be roughly categorized into rule-based methods and model-based methods. The rule-based methods construct erroneous sentences by directly introducing noises into original sentences. However, the introduced noises are usually context-independent, which are quite different from those made by humans. The model-based methods utilize generative models to imitate human errors. The generative model may bring too many changes to the original sentences and generate semantically ambiguous sentences, so it is difficult to detect grammatical errors in these generated sentences. In addition, generated sentences may be error-free and thus become noisy data. To handle these problems, we propose CNEG, a novel Conditional Non-Autoregressive Error Generation model for generating Chinese grammatical errors. Specifically, in order to generate a context-dependent error, we first mask a span in a correct text, then predict an erroneous span conditioned on both the masked text and the correct span. Furthermore, we filter out error-free spans by measuring their perplexities in the original sentences. Experimental results show that our proposed method achieves better performance than all compared data augmentation methods on the CGED-2018 and CGED-2020 benchmarks. 2022.findings-acl.233 yue-etal-2022-improving + 10.18653/v1/2022.findings-acl.233 Modular and Parameter-Efficient Multimodal Fusion with Prompting @@ -3272,6 +3505,7 @@ 2022.findings-acl.234 2022.findings-acl.234.software.zip liang-etal-2022-modular + 10.18653/v1/2022.findings-acl.234 Synchronous Refinement for Neural Machine Translation @@ -3284,6 +3518,7 @@ Machine translation typically adopts an encoder-to-decoder framework, in which the decoder generates the target sentence word-by-word in an auto-regressive manner. However, the auto-regressive decoder faces a deep-rooted one-pass issue whereby each generated word is considered as one element of the final output regardless of whether it is correct or not. These generated wrong words further constitute the target historical context to affect the generation of subsequent target words. This paper proposes a novel synchronous refinement method to revise potential errors in the generated words by considering part of the target future context. Particularly, the proposed approach allows the auto-regressive decoder to refine the previously generated target words and generate the next target word synchronously. The experimental results on three widely-used machine translation tasks demonstrated the effectiveness of the proposed approach. 
2022.findings-acl.235 chen-etal-2022-synchronous + 10.18653/v1/2022.findings-acl.235 <fixed-case>HIE</fixed-case>-<fixed-case>SQL</fixed-case>: History Information Enhanced Network for Context-Dependent Text-to-<fixed-case>SQL</fixed-case> Semantic Parsing @@ -3297,6 +3532,7 @@ 2022.findings-acl.236 zheng-etal-2022-hie CoSQL + 10.18653/v1/2022.findings-acl.236 <fixed-case>CRAS</fixed-case>pell: A Contextual Typo Robust Approach to Improve <fixed-case>C</fixed-case>hinese Spelling Correction @@ -3313,6 +3549,7 @@ 2022.findings-acl.237.software.zip liu-etal-2022-craspell liushulinle/craspell + 10.18653/v1/2022.findings-acl.237 <fixed-case>G</fixed-case>aussian Multi-head Attention for Simultaneous Machine Translation @@ -3323,6 +3560,7 @@ 2022.findings-acl.238 zhang-feng-2022-gaussian ictnlp/gma + 10.18653/v1/2022.findings-acl.238 Composing Structure-Aware Batches for Pairwise Sentence Classification @@ -3336,6 +3574,7 @@ ukplab/acl2022-structure-batches GLUE QNLI + 10.18653/v1/2022.findings-acl.239 Factual Consistency of Multilingual Pretrained Language Models @@ -3348,6 +3587,7 @@ fierro-sogaard-2022-factual coastalcph/mpararel LAMA + 10.18653/v1/2022.findings-acl.240 Selecting Stickers in Open-Domain Dialogue through Multitask Learning @@ -3362,6 +3602,7 @@ 2022.findings-acl.241.software.zip zhang-etal-2022-selecting nonstopfor/sticker-selection + 10.18653/v1/2022.findings-acl.241 <fixed-case>Z</fixed-case>i<fixed-case>N</fixed-case>et: <fixed-case>L</fixed-case>inking <fixed-case>C</fixed-case>hinese Characters Spanning Three Thousand Years @@ -3377,6 +3618,7 @@ 2022.findings-acl.242.software.zip chi-etal-2022-zinet yangchijlu/ancientchinesecharsim + 10.18653/v1/2022.findings-acl.242 How Can Cross-lingual Knowledge Contribute Better to Fine-Grained Entity Typing? @@ -3392,6 +3634,7 @@ 2022.findings-acl.243 jin-etal-2022-cross FIGER + 10.18653/v1/2022.findings-acl.243 <fixed-case>AMR-DA</fixed-case>: <fixed-case>D</fixed-case>ata Augmentation by <fixed-case>A</fixed-case>bstract <fixed-case>M</fixed-case>eaning <fixed-case>R</fixed-case>epresentation @@ -3403,6 +3646,7 @@ 2022.findings-acl.244 shou-etal-2022-amr zzshou/amr-data-augmentation + 10.18653/v1/2022.findings-acl.244 Using Pre-Trained Language Models for Producing Counter Narratives Against Hate Speech: a Comparative Study @@ -3414,6 +3658,7 @@ In this work, we present an extensive study on the use of pre-trained language models for the task of automatic Counter Narrative (CN) generation to fight online hate speech in English. We first present a comparative study to determine whether there is a particular Language Model (or class of LMs) and a particular decoding mechanism that are the most appropriate to generate CNs. Findings show that autoregressive models combined with stochastic decodings are the most promising. We then investigate how an LM performs in generating a CN with regard to an unseen target of hate. We find that a key element for successful ‘out of target’ experiments is not an overall similarity with the training data but the presence of a specific subset of training data, i.e., a target that shares some commonalities with the test target that can be defined a priori. We finally introduce the idea of a pipeline based on the addition of an automatic post-editing step to refine generated CNs.
2022.findings-acl.245 tekiroglu-etal-2022-using + 10.18653/v1/2022.findings-acl.245 Improving Robustness of Language Models from a Geometry-aware Perspective @@ -3429,6 +3674,7 @@ zhu-etal-2022-improving IMDb Movie Reviews SST + 10.18653/v1/2022.findings-acl.246 Task-guided Disentangled Tuning for Pretrained Language Models @@ -3444,6 +3690,7 @@ lemon0830/tdt CLUE GLUE + 10.18653/v1/2022.findings-acl.247 Exploring the Impact of Negative Samples of Contrastive Learning: A Case Study of Sentence Embedding @@ -3459,6 +3706,7 @@ 2022.findings-acl.248 cao-etal-2022-exploring xbdxwyh/mocose + 10.18653/v1/2022.findings-acl.248 The Inefficiency of Language Models in Scholarly Retrieval: An Experimental Walk-through @@ -3469,6 +3717,7 @@ 2022.findings-acl.249 singh-singh-2022-inefficiency shruti-singh/scilm_exp + 10.18653/v1/2022.findings-acl.249 Fusing Heterogeneous Factors with Triaffine Mechanism for Nested Named Entity Recognition @@ -3485,6 +3734,7 @@ ACE 2004 ACE 2005 GENIA + 10.18653/v1/2022.findings-acl.250 <fixed-case>UNIMO</fixed-case>-2: End-to-End Unified Vision-Language Grounded Learning @@ -3505,6 +3755,7 @@ SNLI-VE SST Visual Genome + 10.18653/v1/2022.findings-acl.251 The Past Mistake is the Future Wisdom: Error-driven Contrastive Probability Optimization for <fixed-case>C</fixed-case>hinese Spell Checking @@ -3523,6 +3774,7 @@ 2022.findings-acl.252 2022.findings-acl.252.software.zip li-etal-2022-past + 10.18653/v1/2022.findings-acl.252 <fixed-case>XFUND</fixed-case>: A Benchmark Dataset for Multilingual Visually Rich Form Understanding @@ -3540,6 +3792,7 @@ 2022.findings-acl.253.software.zip xu-etal-2022-xfund FUNSD + 10.18653/v1/2022.findings-acl.253 Type-Driven Multi-Turn Corrections for Grammatical Error Correction @@ -3558,6 +3811,7 @@ deeplearnxmu/tmtc FCE WI-LOCNESS + 10.18653/v1/2022.findings-acl.254 Leveraging Knowledge in Multilingual Commonsense Reasoning @@ -3577,6 +3831,7 @@ ConceptNet X-CSQA XCOPA + 10.18653/v1/2022.findings-acl.255 Encoding and Fusing Semantic Connection and Linguistic Evidence for Implicit Discourse Relation Recognition @@ -3589,6 +3844,7 @@ 2022.findings-acl.256 xiang-etal-2022-encoding hustminslab/manf + 10.18653/v1/2022.findings-acl.256 One Agent To Rule Them All: Towards Multi-agent Conversational <fixed-case>AI</fixed-case> @@ -3607,6 +3863,7 @@ clarke-etal-2022-one ChrisIsKing/black-box-multi-agent-integation BBAI Dataset + 10.18653/v1/2022.findings-acl.257 Word-level Perturbation Considering Word Length and Compositional Subwords @@ -3620,6 +3877,7 @@ 2022.findings-acl.258 hiraoka-etal-2022-word tathi/cwr + 10.18653/v1/2022.findings-acl.258 Bridging Pre-trained Language Models and Hand-crafted Features for Unsupervised <fixed-case>POS</fixed-case> Tagging @@ -3635,6 +3893,7 @@ Jacob-Zhou/FeatureCRFAE Penn Treebank Universal Dependencies + 10.18653/v1/2022.findings-acl.259 Controlling the Focus of Pretrained Language Generation Models @@ -3649,6 +3908,7 @@ question406/learningtofocus CNN/Daily Mail PERSONA-CHAT + 10.18653/v1/2022.findings-acl.260 Comparative Opinion Summarization via Collaborative Decoding @@ -3661,6 +3921,7 @@ 2022.findings-acl.261 iso-etal-2022-comparative megagonlabs/cocosum + 10.18653/v1/2022.findings-acl.261 <fixed-case>I</fixed-case>so<fixed-case>S</fixed-case>core: Measuring the Uniformity of Embedding Space Utilization @@ -3674,6 +3935,7 @@ rudman-etal-2022-isoscore bcbi-edu/p_eickhoff_isoscore WikiText-2 + 10.18653/v1/2022.findings-acl.262 A Natural Diet: Towards Improving Naturalness of Machine Translation Output @@ 
-3686,6 +3948,7 @@ Machine translation (MT) evaluation often focuses on accuracy and fluency, without paying much attention to translation style. This means that, even when considered accurate and fluent, MT output can still sound less natural than high quality human translations or text originally written in the target language. Machine translation output notably exhibits lower lexical diversity, and employs constructs that mirror those in the source sentence. In this work we propose a method for training MT systems to achieve a more natural style, i.e. mirroring the style of text originally written in the target language. Our method tags parallel training data according to the naturalness of the target side by contrasting language models trained on natural and translated data. Tagging data allows us to put greater emphasis on target sentences originally written in the target language. Automatic metrics show that the resulting models achieve lexical richness on par with human translations, mimicking a style much closer to sentences originally written in the target language. Furthermore, we find that their output is preferred by human experts when compared to the baseline translations. 2022.findings-acl.263 freitag-etal-2022-natural + 10.18653/v1/2022.findings-acl.263 From Stance to Concern: Adaptation of Propositional Analysis to New Tasks and Domains @@ -3700,6 +3963,7 @@ 2022.findings-acl.264 mather-etal-2022-stance ihmc/findings-of-acl-2022-concern-detection + 10.18653/v1/2022.findings-acl.264 <fixed-case>CUE</fixed-case> Vectors: Modular Training of Language Models Conditioned on Diverse Contextual Signals @@ -3711,6 +3975,7 @@ We propose a framework to modularize the training of neural language models that use diverse forms of context by eliminating the need to jointly train context and within-sentence encoders. Our approach, contextual universal embeddings (CUE), trains LMs on one type of contextual data and adapts to novel context types. The model consists of a pretrained neural sentence LM, a BERT-based contextual encoder, and a masked transformer decoder that estimates LM probabilities using sentence-internal and contextual evidence. When contextually annotated data is unavailable, our model learns to combine contextual and sentence-internal information using noisy oracle unigram embeddings as a proxy. Real context data can be introduced later and used to adapt a small number of parameters that map contextual data into the decoder’s embedding space. We validate the CUE framework on a NYTimes text corpus with multiple metadata types, for which the LM perplexity can be lowered from 36.6 to 27.4 by conditioning on context. Bootstrapping a contextual LM with only a subset of the metadata during training retains 85% of the achievable gain. Training the model initially with proxy context retains 67% of the perplexity gain after adapting to real context. Furthermore, we can swap one type of pretrained sentence LM for another without retraining the context encoders, by only adapting the decoder model. Overall, we obtain a modular framework that allows incremental, scalable training of context-enhanced LMs.
2022.findings-acl.265 novotney-etal-2022-cue + 10.18653/v1/2022.findings-acl.265 Cross-Lingual <fixed-case>UMLS</fixed-case> Named Entity Linking using <fixed-case>UMLS</fixed-case> Dictionary Fine-Tuning @@ -3725,6 +3990,7 @@ rinagalperin/biomedical_nel BC5CDR MedMentions + 10.18653/v1/2022.findings-acl.266 Aligned Weight Regularizers for Pruning Pretrained Neural Networks @@ -3735,6 +4001,7 @@ Pruning aims to reduce the number of parameters while maintaining performance close to the original network. This work proposes a novel self-distillation based pruning strategy, whereby the representational similarity between the pruned and unpruned versions of the same network is maximized. Unlike previous approaches that treat distillation and pruning separately, we use distillation to inform the pruning criteria, without requiring a separate student network as in knowledge distillation. We show that the proposed cross-correlation objective for self-distilled pruning implicitly encourages sparse solutions, naturally complementing magnitude-based pruning criteria. Experiments on the GLUE and XGLUE benchmarks show that self-distilled pruning increases mono- and cross-lingual language model performance. Self-distilled pruned models also outperform smaller Transformers with an equal number of parameters and are competitive against (6 times) larger distilled networks. We also observe that self-distillation (1) maximizes class separability, (2) increases the signal-to-noise ratio, and (3) converges faster after pruning steps, providing further insights into why self-distilled pruning improves generalization. 2022.findings-acl.267 o-neill-etal-2022-aligned + 10.18653/v1/2022.findings-acl.267 Consistent Representation Learning for Continual Relation Extraction @@ -3749,6 +4016,7 @@ thuiar/CRL FewRel TACRED + 10.18653/v1/2022.findings-acl.268 Event Transition Planning for Open-ended Text Generation @@ -3763,6 +4031,7 @@ 2022.findings-acl.269 li-etal-2022-event ATOMIC + 10.18653/v1/2022.findings-acl.269 Comprehensive Multi-Modal Interactions for Referring Image Segmentation @@ -3776,6 +4045,7 @@ COCO Google Refexp RefCOCO + 10.18653/v1/2022.findings-acl.270 <fixed-case>M</fixed-case>eta<fixed-case>W</fixed-case>eighting: Learning to Weight Tasks in Multi-Task Learning @@ -3789,6 +4059,7 @@ 2022.findings-acl.271 2022.findings-acl.271.software.zip mao-etal-2022-metaweighting + 10.18653/v1/2022.findings-acl.271 Improving Controllable Text Generation with Position-Aware Weighted Decoding @@ -3804,6 +4075,7 @@ gu-etal-2022-improving IMDb Movie Reviews SST + 10.18653/v1/2022.findings-acl.272 Prompt Tuning for Discriminative Pre-trained Language Models @@ -3825,6 +4097,7 @@ AG News Quoref SST + 10.18653/v1/2022.findings-acl.273 Two Birds with One Stone: Unified Model Learning for Both Recall and Ranking in News Recommendation @@ -3837,6 +4110,7 @@ 2022.findings-acl.274 wu-etal-2022-two MIND + 10.18653/v1/2022.findings-acl.274 What does it take to bake a cake? 
The <fixed-case>R</fixed-case>ecipe<fixed-case>R</fixed-case>ef corpus and anaphora resolution in procedural text @@ -3848,6 +4122,7 @@ 2022.findings-acl.275 fang-etal-2022-take biaoyanf/reciperef + 10.18653/v1/2022.findings-acl.275 <fixed-case>MERI</fixed-case>t: <fixed-case>M</fixed-case>eta-<fixed-case>P</fixed-case>ath <fixed-case>G</fixed-case>uided <fixed-case>C</fixed-case>ontrastive <fixed-case>L</fixed-case>earning for <fixed-case>L</fixed-case>ogical <fixed-case>R</fixed-case>easoning @@ -3862,6 +4137,7 @@ sparkjiao/merit LogiQA ReClor + 10.18653/v1/2022.findings-acl.276 <fixed-case>THE</fixed-case>-<fixed-case>X</fixed-case>: Privacy-Preserving Transformer Inference with Homomorphic Encryption @@ -3884,6 +4160,7 @@ MRPC QNLI SST + 10.18653/v1/2022.findings-acl.277 <fixed-case>HLDC</fixed-case>: <fixed-case>H</fixed-case>indi Legal Documents Corpus @@ -3903,6 +4180,7 @@ 2022.findings-acl.278.software.zip kapoor-etal-2022-hldc exploration-lab/hldc + 10.18653/v1/2022.findings-acl.278 Rethinking Document-level Neural Machine Translation @@ -3918,6 +4196,7 @@ 2022.findings-acl.279 sun-etal-2022-rethinking sunzewei2715/Doc2Doc_NMT + 10.18653/v1/2022.findings-acl.279 Incremental Intent Detection for Medical Domain with Contrast Replay Networks @@ -3930,6 +4209,7 @@ 2022.findings-acl.280 bai-etal-2022-incremental KUAKE-QIC + 10.18653/v1/2022.findings-acl.280 <fixed-case>L</fixed-case>a<fixed-case>P</fixed-case>ra<fixed-case>D</fixed-case>o<fixed-case>R</fixed-case>: Unsupervised Pretrained Dense Retriever for Zero-Shot Text Retrieval @@ -3950,6 +4230,7 @@ MS MARCO Natural Questions SciFact + 10.18653/v1/2022.findings-acl.281 Do Pre-trained Models Benefit Knowledge Graph Completion? A Reliable Evaluation and a Reasonable Approach @@ -3967,6 +4248,7 @@ lv-etal-2022-pre InferWiki LAMA + 10.18653/v1/2022.findings-acl.282 <fixed-case>EICO</fixed-case>: Improving Few-Shot Text Classification via Explicit and Implicit Consistency Regularization @@ -3978,6 +4260,7 @@ zhao-yao-2022-eico MPQA Opinion Corpus SST + 10.18653/v1/2022.findings-acl.283 Improving the Adversarial Robustness of <fixed-case>NLP</fixed-case> Models by Information Bottleneck @@ -3994,6 +4277,7 @@ zhang-etal-2022-improving IMDb Movie Reviews SST + 10.18653/v1/2022.findings-acl.284 Incorporating Dynamic Semantics into Pre-Trained Language Model for Aspect-based Sentiment Analysis @@ -4008,6 +4292,7 @@ Aspect-based sentiment analysis (ABSA) predicts sentiment polarity towards a specific aspect in the given sentence. While pre-trained language models such as BERT have achieved great success, incorporating dynamic semantic changes into ABSA remains challenging. To this end, in this paper, we propose to address this problem by Dynamic Re-weighting BERT (DR-BERT), a novel method designed to learn dynamic aspect-oriented semantics for ABSA. Specifically, we first take the Stack-BERT layers as a primary encoder to grasp the overall semantics of the sentence and then fine-tune it by incorporating a lightweight Dynamic Re-weighting Adapter (DRA). Note that the DRA can pay close attention to a small region of the sentences at each step and re-weight the vitally important words for better aspect-aware sentiment understanding. Finally, experimental results on three benchmark datasets demonstrate the effectiveness and the rationality of our proposed model and provide good interpretable insights for future semantic modeling.
2022.findings-acl.285 zhang-etal-2022-incorporating + 10.18653/v1/2022.findings-acl.285 <fixed-case>DARER</fixed-case>: Dual-task Temporal Relational Recurrent Reasoning Network for Joint Dialog Sentiment Classification and Act Recognition @@ -4019,6 +4304,7 @@ xing-tsang-2022-darer xingbowen714/darer DailyDialog + 10.18653/v1/2022.findings-acl.286 Divide and Conquer: Text Semantic Matching with Disentangled Keywords and Intents @@ -4037,6 +4323,7 @@ rowitzou/dc-match GLUE MRPC + 10.18653/v1/2022.findings-acl.287 Modular Domain Adaptation @@ -4050,6 +4337,7 @@ jkvc/modular-domain-adaptation IMDb Movie Reviews SST + 10.18653/v1/2022.findings-acl.288 Detection of Adversarial Examples in Text Classification: Benchmark and Baseline via Robust Density Estimation @@ -4066,6 +4354,7 @@ AG News IMDb Movie Reviews SST + 10.18653/v1/2022.findings-acl.289 <fixed-case>P</fixed-case>latt-Bin: Efficient Posterior Calibrated Training for <fixed-case>NLP</fixed-case> Classifiers @@ -4076,6 +4365,7 @@ 2022.findings-acl.290 2022.findings-acl.290.software.zip singh-goshtasbpour-2022-platt + 10.18653/v1/2022.findings-acl.290 Addressing Resource and Privacy Constraints in Semantic Parsing Through Data Augmentation @@ -4091,6 +4381,7 @@ yang-etal-2022-addressing ATIS BREAK + 10.18653/v1/2022.findings-acl.291 Improving Candidate Retrieval with Entity Profile Generation for <fixed-case>W</fixed-case>ikidata Entity Linking @@ -4102,6 +4393,7 @@ 2022.findings-acl.292 lai-etal-2022-improving laituan245/el-dockers + 10.18653/v1/2022.findings-acl.292 Local Structure Matters Most: Perturbation Study in <fixed-case>NLU</fixed-case> @@ -4115,6 +4407,7 @@ 2022.findings-acl.293.software.zip clouatre-etal-2022-local GLUE + 10.18653/v1/2022.findings-acl.293 Probing Factually Grounded Content Transfer with Factual Ablation @@ -4126,6 +4419,7 @@ Despite recent success, large neural models often generate factually incorrect text. Compounding this is the lack of a standard automatic evaluation for factuality–it cannot be meaningfully improved if it cannot be measured. Grounded generation promises a path to solving both of these problems: models draw on a reliable external document (grounding) for factual information, simplifying the challenge of factuality. Measuring factuality is also simplified–to factual consistency, testing whether the generation agrees with the grounding, rather than all facts. Yet, without a standard automatic metric for factual consistency, factually grounded generation remains an open problem. We study this problem for content transfer, in which generations extend a prompt, using information from factual grounding. Particularly, this domain allows us to introduce the notion of factual ablation for automatically measuring factual consistency: this captures the intuition that the model should be less likely to produce an output given a less relevant grounding document. In practice, we measure this by presenting a model with two grounding documents, and the model should prefer to use the more factually relevant one. We contribute two evaluation sets to measure this. Applying our new evaluation, we propose multiple novel methods improving over strong baselines. 
2022.findings-acl.294 west-etal-2022-probing + 10.18653/v1/2022.findings-acl.294 <fixed-case>ED</fixed-case>2<fixed-case>LM</fixed-case>: Encoder-Decoder to Language Model for Faster Document Re-ranking Inference @@ -4146,6 +4440,7 @@ hui-etal-2022-ed2lm MS MARCO Natural Questions + 10.18653/v1/2022.findings-acl.295 Benchmarking Answer Verification Methods for Question Answering-Based Summarization Evaluation Metrics @@ -4155,6 +4450,7 @@ Question answering-based summarization evaluation metrics must automatically determine whether the QA model’s prediction is correct or not, a task known as answer verification. In this work, we benchmark the lexical answer verification methods which have been used by current QA-based metrics as well as two more sophisticated text comparison methods, BERTScore and LERC. We find that LERC outperforms the other methods in some settings while remaining statistically indistinguishable from lexical overlap in others. However, our experiments reveal that improved verification performance does not necessarily translate to overall QA-based metric quality: In some scenarios, using a worse verification method — or using none at all — has comparable performance to using the best verification method, a result that we attribute to properties of the datasets. 2022.findings-acl.296 deutsch-roth-2022-benchmarking + 10.18653/v1/2022.findings-acl.296 Prior Knowledge and Memory Enriched Transformer for Sign Language Translation @@ -4167,6 +4463,7 @@ 2022.findings-acl.297 jin-etal-2022-prior PHOENIX14T + 10.18653/v1/2022.findings-acl.297 Discontinuous Constituency and <fixed-case>BERT</fixed-case>: A Case Study of <fixed-case>D</fixed-case>utch @@ -4177,6 +4474,7 @@ 2022.findings-acl.298 2022.findings-acl.298.software.zip kogkalidis-wijnholds-2022-discontinuous + 10.18653/v1/2022.findings-acl.298 Probing Multilingual Cognate Prediction Models @@ -4186,6 +4484,7 @@ Character-based neural machine translation models have become the reference models for cognate prediction, a historical linguistics task. So far, all linguistic interpretations about latent information captured by such models have been based on external analysis (accuracy, raw results, errors). In this paper, we investigate what probing can tell us about both models and previous interpretations, and learn that though our models store linguistic and diachronic information, they do not do so in the ways previously assumed.
2022.findings-acl.299 fourrier-sagot-2022-probing + 10.18653/v1/2022.findings-acl.299 A Neural Pairwise Ranking Model for Readability Assessment @@ -4198,6 +4497,7 @@ lee-vajjala-2022-neural jlee118/nprm Newsela + 10.18653/v1/2022.findings-acl.300 First the Worst: Finding Better Gender Translations During Beam Search @@ -4210,6 +4510,7 @@ 2022.findings-acl.301.software.zip saunders-etal-2022-first dcsaunders/nmt-gender-rerank + 10.18653/v1/2022.findings-acl.301 Dialogue Summaries as Dialogue States (<fixed-case>DS</fixed-case>2), Template-Guided Summarization for Few-shot Dialogue State Tracking @@ -4226,6 +4527,7 @@ jshin49/ds2 MultiWOZ SAMSum Corpus + 10.18653/v1/2022.findings-acl.302 Unsupervised Preference-Aware Language Identification @@ -4242,6 +4544,7 @@ 2022.findings-acl.303.software.zip ren-etal-2022-unsupervised xzhren/preferenceawarelid + 10.18653/v1/2022.findings-acl.303 Using <fixed-case>NLP</fixed-case> to quantify the environmental cost and diversity benefits of in-person <fixed-case>NLP</fixed-case> conferences @@ -4252,6 +4555,7 @@ 2022.findings-acl.304 przybyla-shardlow-2022-using piotrmp/nlp_geography + 10.18653/v1/2022.findings-acl.304 Interpretable Research Replication Prediction via Variational Contextual Consistency Sentence Masking @@ -4265,6 +4569,7 @@ 2022.findings-acl.305.software.zip luo-etal-2022-interpretable ECHR + 10.18653/v1/2022.findings-acl.305 <fixed-case>C</fixed-case>hinese Synesthesia Detection: New Dataset and Models @@ -4276,6 +4581,7 @@ In this paper, we introduce a new task called synesthesia detection, which aims to extract the sensory word of a sentence, and to predict the original and synesthetic sensory modalities of the corresponding sensory word. Synesthesia refers to the description of perceptions in one sensory modality through concepts from other modalities. It involves not only a linguistic phenomenon, but also a cognitive phenomenon structuring human thought and action, which makes it a bridge between figurative linguistic phenomena and abstract cognition, and thus helpful for understanding deep semantics. To address this, we construct a large-scale human-annotated Chinese synesthesia dataset, which contains 7,217 annotated sentences accompanied by 187 sensory words. Based on this dataset, we propose a family of strong and representative baseline models. Upon these baselines, we further propose a radical-based neural network model to identify the boundary of the sensory word, and to jointly detect the original and synesthetic sensory modalities for the word. Through extensive experiments, we observe that the importance of the proposed task and dataset is confirmed by the corpus statistics and the progressive performance of the models. In addition, our proposed model achieves state-of-the-art results on the synesthesia dataset.
2022.findings-acl.306 jiang-etal-2022-chinese + 10.18653/v1/2022.findings-acl.306 Rethinking Offensive Text Detection as a Multi-Hop Reasoning Problem @@ -4288,6 +4594,7 @@ zhang-etal-2022-rethinking qzx7/slight OLID + 10.18653/v1/2022.findings-acl.307 On the Safety of Conversational Models: Taxonomy, Dataset, and Benchmark @@ -4306,6 +4613,7 @@ 2022.findings-acl.308.software.zip sun-etal-2022-safety thu-coai/diasafety + 10.18653/v1/2022.findings-acl.308 Word Segmentation by Separation Inference for <fixed-case>E</fixed-case>ast <fixed-case>A</fixed-case>sian Languages @@ -4319,6 +4627,7 @@ 2022.findings-acl.309 tong-etal-2022-word um-nlper/spin-ws + 10.18653/v1/2022.findings-acl.309 Unsupervised <fixed-case>C</fixed-case>hinese Word Segmentation with <fixed-case>BERT</fixed-case> Oriented Probing and Transformation @@ -4332,6 +4641,7 @@ 2022.findings-acl.310.software.zip li-etal-2022-unsupervised liweitj47/bert_unsupervised_word_segmentation + 10.18653/v1/2022.findings-acl.310 <fixed-case>E</fixed-case>-<fixed-case>KAR</fixed-case>: A Benchmark for Rationalizing Natural Language Analogical Reasoning @@ -4350,6 +4660,7 @@ 2022.findings-acl.311 chen-etal-2022-e E-KAR + 10.18653/v1/2022.findings-acl.311 Implicit Relation Linking for Question Answering over Knowledge Graph @@ -4367,6 +4678,7 @@ zhao-etal-2022-implicit DBpedia SimpleQuestions + 10.18653/v1/2022.findings-acl.312 Attention Mechanism with Energy-Friendly Operations @@ -4383,6 +4695,7 @@ 2022.findings-acl.313 wan-etal-2022-attention nlp2ct/e-att + 10.18653/v1/2022.findings-acl.313 Probing <fixed-case>BERT</fixed-case>’s priors with serial reproduction chains @@ -4393,6 +4706,7 @@ Sampling is a promising bottom-up method for exposing what generative models have learned about language, but it remains unclear how to generate representative samples from popular masked language models (MLMs) like BERT. The MLM objective yields a dependency network with no guarantee of consistent conditional distributions, posing a problem for naive approaches. Drawing from theories of iterated learning in cognitive science, we explore the use of serial reproduction chains to sample from BERT’s priors. In particular, we observe that a unique and consistent estimator of the ground-truth joint distribution is given by a Generative Stochastic Network (GSN) sampler, which randomly selects which token to mask and reconstruct on each step. We show that the lexical and syntactic statistics of sentences from GSN chains closely match the ground-truth corpus distribution and perform better than other methods in a large corpus of naturalness judgments. Our findings establish a firmer theoretical foundation for bottom-up probing and highlight richer deviations from human priors. 2022.findings-acl.314 yamakoshi-etal-2022-probing + 10.18653/v1/2022.findings-acl.314 Interpreting the Robustness of Neural <fixed-case>NLP</fixed-case> Models to Textual Perturbations @@ -4404,6 +4718,7 @@ Modern Natural Language Processing (NLP) models are known to be sensitive to input perturbations and their performance can decrease when applied to real-world, noisy data. However, it is still unclear why models are less robust to some perturbations than others. In this work, we test the hypothesis that the extent to which a model is affected by an unseen textual perturbation (robustness) can be explained by the learnability of the perturbation (defined as how well the model learns to identify the perturbation with a small amount of evidence). 
We further give a causal justification for the learnability metric. We conduct extensive experiments with four prominent NLP models — TextRNN, BERT, RoBERTa and XLNet — over eight types of textual perturbations on three datasets. We show that a model which is better at identifying a perturbation (higher learnability) becomes worse at ignoring such a perturbation at test time (lower robustness), providing empirical support for our hypothesis. 2022.findings-acl.315 zhang-etal-2022-interpreting + 10.18653/v1/2022.findings-acl.315 Zero-Shot Dense Retrieval with Momentum Adversarial Domain Invariant Representations @@ -4419,6 +4734,7 @@ xin-etal-2022-zero BEIR Natural Questions + 10.18653/v1/2022.findings-acl.316 A Few-Shot Semantic Parser for <fixed-case>W</fixed-case>izard-of-<fixed-case>O</fixed-case>z Dialogues with the Precise <fixed-case>T</fixed-case>hing<fixed-case>T</fixed-case>alk Representation @@ -4432,6 +4748,7 @@ Previous attempts to build effective semantic parsers for Wizard-of-Oz (WOZ) conversations suffer from the difficulty in acquiring a high-quality, manually annotated training set. Approaches based only on dialogue synthesis are insufficient, as dialogues generated from state-machine based models are poor approximations of real-life conversations. Furthermore, previously proposed dialogue state representations are ambiguous and lack the precision necessary for building an effective agent. This paper proposes a new dialogue representation and a sample-efficient methodology that can predict precise dialogue states in WOZ conversations. We extended the ThingTalk representation to capture all information an agent needs to respond properly. Our training strategy is sample-efficient: we combine (1) few-shot data sparsely sampling the full dialogue space and (2) synthesized data covering a subset space of dialogues generated by a succinct state-based dialogue model. The completeness of the extended ThingTalk language is demonstrated with a fully operational agent, which is also used in training data synthesis. We demonstrate the effectiveness of our methodology on MultiWOZ 3.0, a reannotation of the MultiWOZ 2.1 dataset in ThingTalk. ThingTalk can represent 98% of the test turns, while the simulator can emulate 85% of the validation set. We train a contextual semantic parser using our strategy, and obtain 79% turn-by-turn exact match accuracy on the reannotated test set.
2022.findings-acl.317 campagna-etal-2022-shot + 10.18653/v1/2022.findings-acl.317 <fixed-case>GCPG</fixed-case>: A General Framework for Controllable Paraphrase Generation @@ -4448,6 +4765,7 @@ 2022.findings-acl.318 2022.findings-acl.318.software.zip yang-etal-2022-gcpg + 10.18653/v1/2022.findings-acl.318 <fixed-case>C</fixed-case>ross<fixed-case>A</fixed-case>ligner & Co: Zero-Shot Transfer Methods for Task-Oriented Cross-lingual Natural Language Understanding @@ -4460,6 +4778,7 @@ gritta-etal-2022-crossaligner huawei-noah/noah-research MTOP + 10.18653/v1/2022.findings-acl.319 Attention as Grounding: Exploring Textual and Cross-Modal Attention on Entities and Relations in Language-and-Vision Transformer @@ -4471,6 +4790,7 @@ ilinykh-dobnik-2022-attention gu-clasp/attention-as-grounding Image Description Sequences + 10.18653/v1/2022.findings-acl.320 Improving Zero-Shot Cross-lingual Transfer Between Closely Related Languages by Injecting Character-Level Noise @@ -4481,6 +4801,7 @@ 2022.findings-acl.321 aepli-sennrich-2022-improving Universal Dependencies + 10.18653/v1/2022.findings-acl.321 Structural Supervision for Word Alignment and Machine Translation @@ -4492,6 +4813,7 @@ Syntactic structure has long been argued to be potentially useful for enforcing accurate word alignment and improving generalization performance of machine translation. Unfortunately, existing wisdom demonstrates its significance by considering only the syntactic structure of source tokens, neglecting the rich structural information from target tokens and the structural similarity between the source and target sentences. In this work, we propose to incorporate the syntactic structure of both source and target tokens into the encoder-decoder framework, tightly correlating the internal logic of word alignment and machine translation for multi-task learning. Particularly, we do not leverage any annotated syntactic graph of the target side during training, so we introduce Dynamic Graph Convolution Networks (DGCN) on observed target tokens to sequentially and simultaneously generate the target tokens and the corresponding syntactic graphs, and further guide the word alignment. On this basis, Hierarchical Graph Random Walks (HGRW) are performed on the syntactic graphs of both source and target sides, for incorporating structured constraints on machine translation outputs. Experiments on four publicly available language pairs verify that our method is highly effective in capturing syntactic structure in different languages, consistently outperforming baselines in alignment accuracy and demonstrating promising results in translation quality. 2022.findings-acl.322 li-etal-2022-structural + 10.18653/v1/2022.findings-acl.322 Focus on the Action: Learning to Highlight and Summarize Jointly for Email To-Do Items Summarization @@ -4502,6 +4824,7 @@ Automatic email to-do item generation is the task of generating to-do items from a given email to help people overview emails and schedule daily work. Different from prior research on email summarization, to-do item generation focuses on generating action mentions to provide more structured summaries of email text. Prior work either requires a large amount of annotation for key sentences with potential actions or fails to pay attention to nuanced actions from these unstructured emails, and thus often leads to unfaithful summaries.
To fill these gaps, we propose a simple and effective learning to highlight and summarize framework (LHS) to learn to identify the most salient text and actions, and incorporate these structured representations to generate more faithful to-do items. Experiments show that our LHS model outperforms the baselines and achieves state-of-the-art performance in terms of both quantitative evaluation and human judgement. We also discuss specific challenges that current models face with email to-do summarization. 2022.findings-acl.323 zhang-etal-2022-focus + 10.18653/v1/2022.findings-acl.323 Exploring the Capacity of a Large-scale Masked Language Model to Recognize Grammatical Errors @@ -4512,6 +4835,7 @@ In this paper, we explore the capacity of a language model-based method for grammatical error detection in detail. We first show that 5 to 10% of training data are enough for a BERT-based error detection method to achieve performance equivalent to what a non-language model-based method can achieve with the full training data; recall improves much faster with respect to training data size in the BERT-based method than in the non-language model method. This suggests that (i) the BERT-based method should have a good knowledge of the grammar required to recognize certain types of error and that (ii) it can transform the knowledge into error detection rules by fine-tuning with few training samples, which explains its high generalization ability in grammatical error detection. We further show with pseudo error data that it actually exhibits such nice properties in learning rules for recognizing various types of error. Finally, based on these findings, we discuss a cost-effective method for detecting grammatical errors with feedback comments explaining relevant grammatical rules to learners. 2022.findings-acl.324 nagata-etal-2022-exploring + 10.18653/v1/2022.findings-acl.324 Should We Trust This Summary?
<fixed-case>B</fixed-case>ayesian Abstractive Summarization to The Rescue @@ -4522,6 +4846,7 @@ 2022.findings-acl.325 gidiotis-tsoumakas-2022-trust AESLC + 10.18653/v1/2022.findings-acl.325 On the data requirements of probing @@ -4536,6 +4861,7 @@ zhu-etal-2022-data spoclab-ca/probing_dataset SentEval + 10.18653/v1/2022.findings-acl.326 Translation Error Detection as Rationale Extraction @@ -4547,6 +4873,7 @@ 2022.findings-acl.327 fomicheva-etal-2022-translation MLQE-PE + 10.18653/v1/2022.findings-acl.327 Towards Collaborative Neural-Symbolic Graph Semantic Parsing via Uncertainty @@ -4558,6 +4885,7 @@ 2022.findings-acl.328 lin-etal-2022-towards SCAN + 10.18653/v1/2022.findings-acl.328 Towards Few-shot Entity Recognition in Document Images: A Label-aware Sequence-to-Sequence Framework @@ -4568,6 +4896,7 @@ 2022.findings-acl.329 wang-shang-2022-towards FUNSD + 10.18653/v1/2022.findings-acl.329 On Length Divergence Bias in Textual Matching Models @@ -4582,6 +4911,7 @@ 2022.findings-acl.330 jiang-etal-2022-length TrecQA + 10.18653/v1/2022.findings-acl.330 What is wrong with you?: Leveraging User Sentiment for Automatic Dialog Evaluation @@ -4596,6 +4926,7 @@ ghazarian-etal-2022-wrong alexa/conture FED + 10.18653/v1/2022.findings-acl.331 diff --git a/data/xml/2022.fl4nlp.xml b/data/xml/2022.fl4nlp.xml index e3ab30f053..f672f77425 100644 --- a/data/xml/2022.fl4nlp.xml +++ b/data/xml/2022.fl4nlp.xml @@ -34,6 +34,7 @@ In the context of personalized federated learning (FL), the critical challenge is to balance local model improvement and global model tuning when the personal and global objectives may not be exactly aligned. Inspired by Bayesian hierarchical models, we develop ActPerFL, a self-aware personalized FL method where each client can automatically balance the training of its local personal model and the global model that implicitly contributes to other clients’ training. Such a balance is derived from the inter-client and intra-client uncertainty quantification. Consequently, ActPerFL can adapt to the underlying clients’ heterogeneity with uncertainty-driven local training and model aggregation. With experimental studies on Sent140 and Amazon Alexa audio data, we show that ActPerFL can achieve superior personalization performance compared with the existing counterparts. 2022.fl4nlp-1.1 chen-etal-2022-actperfl + 10.18653/v1/2022.fl4nlp-1.1 Scaling Language Model Size in Cross-Device Federated Learning @@ -49,6 +50,7 @@ 2022.fl4nlp-1.2 ro-etal-2022-scaling Billion Word Benchmark + 10.18653/v1/2022.fl4nlp-1.2 Adaptive Differential Privacy for Language Model Training @@ -61,6 +63,7 @@ wu-etal-2022-adaptive WikiText-103 WikiText-2 + 10.18653/v1/2022.fl4nlp-1.3 Intrinsic Gradient Compression for Scalable and Efficient Federated Learning @@ -72,6 +75,7 @@ melas-kyriazi-wang-2022-intrinsic PERSONA-CHAT SST + 10.18653/v1/2022.fl4nlp-1.4 diff --git a/data/xml/2022.humeval.xml b/data/xml/2022.humeval.xml index 2e42a427fd..55a10d5b0c 100644 --- a/data/xml/2022.humeval.xml +++ b/data/xml/2022.humeval.xml @@ -25,6 +25,7 @@ SacreBLEU, by incorporating a text normalizing step in the pipeline, has become a rising automatic evaluation metric in recent MT studies. With agglutinative languages such as Korean, however, the lexical-level metric cannot provide a conceivable result without a customized pre-tokenization. 
This paper endeavors to examine the influence of diversified tokenization schemes – word, morpheme, subword, character, and consonants & vowels (CV) – on the metric after its protective layer is peeled off. By performing meta-evaluation with manually-constructed into-Korean resources, our empirical study demonstrates that the human correlation of the surface-based metric and other homogeneous ones (as an extension) vacillates greatly by the token type. Moreover, the human correlation of the metric often deteriorates due to some tokenization, with CV one of its culprits. Guiding through the proper usage of tokenizers for the given metric, we discover i) the feasibility of the character tokens and ii) the deficit of CV in the Korean MT evaluation. 2022.humeval-1.1 kim-kim-2022-vacillating + 10.18653/v1/2022.humeval-1.1 A Methodology for the Comparison of Human Judgments With Metrics for Coreference Resolution @@ -37,6 +38,7 @@ 2022.humeval-1.2 borovikova-etal-2022-methodology CoNLL-2012 + 10.18653/v1/2022.humeval-1.2 Perceptual Quality Dimensions of Machine-Generated Text with a Focus on Machine Translation @@ -49,6 +51,7 @@ 2022.humeval-1.3 macketanz-etal-2022-perceptual dfki-nlp/textq + 10.18653/v1/2022.humeval-1.3 Human evaluation of web-crawled parallel corpora for machine translation @@ -61,6 +64,7 @@ 2022.humeval-1.4 ramirez-sanchez-etal-2022-human ParaCrawl + 10.18653/v1/2022.humeval-1.4 Beyond calories: evaluating how tailored communication reduces emotional load in diet-coaching @@ -70,6 +74,7 @@ Dieting is a behaviour change task that is difficult for many people to conduct successfully. This is due to many factors, including stress and cost. Mobile applications offer an alternative to traditional coaching. However, previous work on app evaluation focused only on dietary outcomes, ignoring users’ emotional state despite its influence on eating habits. In this work, we introduce a novel evaluation of the effects that tailored communication can have on the emotional load of dieting. We implement this by augmenting a traditional diet-app with affective NLG, text-tailoring and persuasive communication techniques. We then run a short 2-week experiment and check dietary outcomes, user feedback on the produced text and, most importantly, its impact on emotional state, through the PANAS questionnaire. Results show that tailored communication significantly improved users’ emotional state, compared to an app-only control group.
2022.humeval-1.5 balloccu-reiter-2022-beyond + 10.18653/v1/2022.humeval-1.5 The Human Evaluation Datasheet: A Template for Recording Details of Human Evaluation Experiments in <fixed-case>NLP</fixed-case> @@ -80,6 +85,7 @@ 2022.humeval-1.6 shimorina-belz-2022-human Shimorina/human-evaluation-datasheet + 10.18653/v1/2022.humeval-1.6 Toward More Effective Human Evaluation for Machine Translation @@ -92,6 +98,7 @@ 2022.humeval-1.7 saldias-fuentes-etal-2022-toward WMT 2020 + 10.18653/v1/2022.humeval-1.7 A Study on Manual and Automatic Evaluation for Text Style Transfer: The Case of Detoxification @@ -107,6 +114,7 @@ 2022.humeval-1.8 logacheva-etal-2022-study CoLA + 10.18653/v1/2022.humeval-1.8 Human Judgement as a Compass to Navigate Automatic Metrics for Formality Transfer @@ -120,6 +128,7 @@ lai-etal-2022-human laihuiyuan/eval-formality-transfer GYAFC + 10.18653/v1/2022.humeval-1.9 Towards Human Evaluation of Mutual Understanding in Human-Computer Spontaneous Conversation: An Empirical Study of Word Sense Disambiguation for Naturalistic Social Dialogs in <fixed-case>A</fixed-case>merican <fixed-case>E</fixed-case>nglish @@ -128,6 +137,7 @@ Current evaluation practices for social dialog systems, dedicated to human-computer spontaneous conversation, exclusively focus on the quality of system-generated surface text, but not human-verifiable aspects of mutual understanding between the systems and their interlocutors. This work proposes Word Sense Disambiguation (WSD) as an essential component of a valid and reliable human evaluation framework, whose long-term goal is to radically improve the usability of dialog systems in real-life human-computer collaboration. The practicality of this proposal is demonstrated by experimentally investigating (1) the WordNet 3.0 sense inventory coverage of lexical meanings in spontaneous conversation between humans in American English, assumed as an upper bound of lexical diversity of human-computer communication, and (2) the effectiveness of state-of-the-art WSD models and pretrained transformer-based contextual embeddings on this type of data. 2022.humeval-1.10 luu-2022-towards + 10.18653/v1/2022.humeval-1.10 diff --git a/data/xml/2022.in2writing.xml b/data/xml/2022.in2writing.xml index de00494aeb..58c50f12dd 100644 --- a/data/xml/2022.in2writing.xml +++ b/data/xml/2022.in2writing.xml @@ -31,6 +31,7 @@ Today, data-to-text systems are used as commercial solutions for automated text production of large quantities of text. Therefore, they already represent a new technology of writing. This new technology requires the author, as an act of writing, both to configure a system that then takes over the transformation into a real text, and to maintain strategies of traditional writing. What should an environment look like, where a human guides a machine to write texts? Based on a comparison of the NLG pipeline architecture with the results of the research on the human writing process, this paper attempts to take an overview of which tasks need to be solved and which strategies are necessary to produce good texts in this environment. From this synopsis, principles for the design of data-to-text systems as a functioning writing environment are then derived. 2022.in2writing-1.1 schneider-etal-2022-data + 10.18653/v1/2022.in2writing-1.1 A Design Space for Writing Support Tools Using a Cognitive Process Model of Writing @@ -42,6 +43,7 @@ Improvements in language technology have led to an increasing interest in writing support tools.
In this paper we propose a design space for such tools based on a cognitive process model of writing. We conduct a systematic review of recent computer science papers that present and/or study such tools, analyzing 30 papers from the last five years using the design space. Tools are plotted according to three distinct cognitive processes–planning, translating, and reviewing–and the level of constraint each process entails. Analyzing recent work with the design space shows that highly constrained planning and reviewing are under-studied areas that recent technology improvements may now be able to serve. Finally, we propose shared evaluation methodologies and tasks that may help the field mature. 2022.in2writing-1.2 gero-etal-2022-design + 10.18653/v1/2022.in2writing-1.2 A Selective Summary of Where to Hide a Stolen Elephant: Leaps in Creative Writing with Multimodal Machine Intelligence @@ -53,6 +55,7 @@ While developing a story, novices and published writers alike have had to look outside themselves for inspiration. Language models have recently been able to generate text fluently, producing new stochastic narratives upon request. However, effectively integrating such capabilities with human cognitive faculties and creative processes remains challenging. We propose to investigate this integration with a multimodal writing support interface that offers writing suggestions textually, visually, and aurally. We conduct an extensive study that combines elicitation of prior expectations before writing, observation and semi-structured interviews during writing, and outcome evaluations after writing. Our results illustrate individual and situational variation in machine-in-the-loop writing approaches, suggestion acceptance, and ways the system is helpful. Centrally, we report how participants perform integrative leaps, by which they do cognitive work to integrate suggestions of varying semantic relevance into their developing stories. We interpret these findings, offering modeling and design recommendations for future creative writing support technologies. 2022.in2writing-1.3 singh-etal-2022-selective + 10.18653/v1/2022.in2writing-1.3 A text-writing system for Easy-to-Read <fixed-case>G</fixed-case>erman evaluated with low-literate users with cognitive impairment @@ -63,6 +66,7 @@ 2022.in2writing-1.4 steinmetz-harbusch-2022-text CELEX + 10.18653/v1/2022.in2writing-1.4 Language Models as Context-sensitive Word Search Engines @@ -78,6 +82,7 @@ CLOTH WikiText-103 WikiText-2 + 10.18653/v1/2022.in2writing-1.5 Plug-and-Play Controller for Story Completion: A Pilot Study toward Emotion-aware Story Writing Assistance @@ -90,6 +95,7 @@ 2022.in2writing-1.6 mori-etal-2022-plug ROCStories + 10.18653/v1/2022.in2writing-1.6 Text Revision by On-the-Fly Representation Optimization @@ -105,6 +111,7 @@ jingjingli01/oreo GYAFC Newsela + 10.18653/v1/2022.in2writing-1.7 The Pure Poet: How Good is the Subjective Credibility and Stylistic Quality of Literary Short Texts Written with an Artificial Intelligence Tool as Compared to Texts Written by Human Authors? @@ -118,6 +125,7 @@ The application of artificial intelligence (AI) for text generation in creative domains raises questions regarding the credibility of AI-generated content. In two studies, we explored if readers can differentiate between AI-based and human-written texts (generated based on the first line of texts and poems of classic authors) and how the stylistic qualities of these texts are rated. 
Participants read 9 AI-based continuations and either 9 human-written continuations (Study 1, N=120) or 9 original continuations (Study 2, N=302). Participants’ task was to decide whether a continuation was written with an AI-tool or not, to indicate their confidence in each decision, and to assess the stylistic text quality. Results showed that participants generally had low accuracy for differentiating between text types but were overconfident in their decisions. Regarding the assessment of stylistic quality, AI-continuations were perceived as less well-written, inspiring, fascinating, interesting, and aesthetic than both human-written and original continuations. 2022.in2writing-1.8 gunser-etal-2022-pure + 10.18653/v1/2022.in2writing-1.8 Interactive Children’s Story Rewriting Through Parent-Children Interaction @@ -129,6 +137,7 @@ Storytelling in early childhood provides significant benefits in language and literacy development, relationship building, and entertainment. To maximize these benefits, it is important to empower children with more agency. Interactive story rewriting through parent-children interaction can boost children’s agency and help build the relationship between parent and child as they collaboratively create changes to an original story. However, for children with limited proficiency in reading and writing, parents must carry out multiple tasks to guide the rewriting process, which can incur a high cognitive load. In this work, we introduce an interface design that aims to support children and parents to rewrite stories together with the help of AI techniques. We describe three design goals determined by a review of prior literature in interactive storytelling and existing educational activities. We also propose a preliminary prompt-based pipeline that uses GPT-3 to realize the design goals and enable the interface. 2022.in2writing-1.9 lee-etal-2022-interactive + 10.18653/v1/2022.in2writing-1.9 News Article Retrieval in Context for Event-centric Narrative Creation @@ -141,6 +150,7 @@ 2022.in2writing-1.10 voskarides-etal-2022-news nickvosk/ictir2021-news-retrieval-in-context + 10.18653/v1/2022.in2writing-1.10 Unmet Creativity Support Needs in Computationally Supported Creative Writing @@ -150,6 +160,7 @@ Large language models (LLMs) enabled by the datasets and computing power of the last decade have recently gained popularity for their capacity to generate plausible natural language text from human-provided prompts. This ability makes them appealing to fiction writers as prospective co-creative agents, addressing the common challenge of writer’s block, or getting unstuck. However, creative writers face additional challenges, including maintaining narrative consistency, developing plot structure, architecting reader experience, and refining their expressive intent, which are not well-addressed by current LLM-backed tools. In this paper, we define these needs by grounding them in cognitive and theoretical literature, then survey previous computational narrative research that holds promise for supporting each of them in a co-creative setting. 2022.in2writing-1.11 kreminski-martens-2022-unmet + 10.18653/v1/2022.in2writing-1.11 Sparks: Inspiration for Science Writing using Language Models @@ -160,6 +171,7 @@ Large-scale language models are rapidly improving, performing well on a variety of tasks with little to no customization. 
In this work we investigate how language models can support science writing, a challenging writing task that is both open-ended and highly constrained. We present a system for generating “sparks”, sentences related to a scientific concept intended to inspire writers. We run a user study with 13 STEM graduate students and find three main use cases of sparks—inspiration, translation, and perspective—each of which correlates with a unique interaction pattern. We also find that while participants were more likely to select higher quality sparks, the overall quality of sparks seen by a given participant did not correlate with their satisfaction with the tool. 2022.in2writing-1.12 gero-etal-2022-sparks + 10.18653/v1/2022.in2writing-1.12 <fixed-case>C</fixed-case>hip<fixed-case>S</fixed-case>ong: A Controllable Lyric Generation System for <fixed-case>C</fixed-case>hinese Popular Song @@ -175,6 +187,7 @@ 2022.in2writing-1.13 liu-etal-2022-chipsong korokes/chipsong + 10.18653/v1/2022.in2writing-1.13 Read, Revise, Repeat: A System Demonstration for Human-in-the-loop Iterative Text Revision @@ -188,6 +201,7 @@ 2022.in2writing-1.14 du-etal-2022-read vipulraheja/iterater + 10.18653/v1/2022.in2writing-1.14 diff --git a/data/xml/2022.insights.xml b/data/xml/2022.insights.xml index 1ace04d59f..cc861d753d 100644 --- a/data/xml/2022.insights.xml +++ b/data/xml/2022.insights.xml @@ -31,6 +31,7 @@ 2022.insights-1.1 ding-etal-2022-isotropy GLUE + 10.18653/v1/2022.insights-1.1 Do Dependency Relations Help in the Task of Stance Detection? @@ -41,6 +42,7 @@ In this paper we present a set of multilingual experiments tackling the task of Stance Detection in five different languages: English, Spanish, Catalan, French and Italian. Furthermore, we study the phenomenon of stance with respect to six different targets – one per language, and two different for Italian – employing a variety of machine learning algorithms that primarily exploit morphological and syntactic knowledge as features, represented in the format of Universal Dependencies. Results seem to suggest that the methodology employed is not beneficial per se, but that the same features might be useful with a different methodology. 2022.insights-1.2 cignarella-etal-2022-dependency + 10.18653/v1/2022.insights-1.2 Evaluating the Practical Utility of Confidence-score based Techniques for Unsupervised Open-world Classification @@ -50,6 +52,7 @@ Open-world classification in dialog systems requires models to detect open intents, while ensuring the quality of in-domain (ID) intent classification. In this work, we revisit methods that leverage distance-based statistics for unsupervised out-of-domain (OOD) detection. We show that despite their superior performance on threshold-independent metrics like AUROC on test-set, threshold values chosen based on the performance on a validation-set do not generalize well to the test-set, thus resulting in substantially lower performance on ID or OOD detection accuracy and F1-scores. Our analysis shows that this lack of generalizability can be successfully mitigated by setting aside a hold-out set from validation data for threshold selection (sometimes achieving relative gains as high as 100%). Extensive experiments on seven benchmark datasets show that this fix puts the performance of these methods on par with, or sometimes even better than, the current state-of-the-art OOD detection techniques.
2022.insights-1.3 khosla-gangadharaiah-2022-evaluating + 10.18653/v1/2022.insights-1.3 Extending the Scope of Out-of-Domain: Examining <fixed-case>QA</fixed-case> models in multiple subdomains @@ -63,6 +66,7 @@ lyuchenyang/analysing-question-answering-data NewsQA SQuAD + 10.18653/v1/2022.insights-1.4 What Do You Get When You Cross Beam Search with Nucleus Sampling? @@ -72,6 +76,7 @@ We combine beam search with the probabilistic pruning technique of nucleus sampling to create two deterministic nucleus search algorithms for natural language generation. The first algorithm, p-exact search, locally prunes the next-token distribution and performs an exact search over the remaining space. The second algorithm, dynamic beam search, shrinks and expands the beam size according to the entropy of the candidate’s probability distribution. Despite the probabilistic intuition behind nucleus search, experiments on machine translation and summarization benchmarks show that both algorithms reach the same performance levels as standard beam search. 2022.insights-1.5 shaham-levy-2022-get + 10.18653/v1/2022.insights-1.5 How Much Do Modifications to Transformer Language Models Affect Their Ability to Learn Linguistic Knowledge? @@ -83,6 +88,7 @@ 2022.insights-1.6 sun-etal-2022-much BLiMP + 10.18653/v1/2022.insights-1.6 Cross-lingual Inflection as a Data Augmentation Method for Parsing @@ -93,6 +99,7 @@ We propose a morphology-based method for low-resource (LR) dependency parsing. We train a morphological inflector for target LR languages, and apply it to related rich-resource (RR) treebanks to create cross-lingual (x-inflected) treebanks that resemble the target LR language. We use such inflected treebanks to train parsers in zero- (training on x-inflected treebanks) and few-shot (training on x-inflected and target language treebanks) setups. The results show that the method sometimes improves the baselines, but not consistently. 2022.insights-1.7 munoz-ortiz-etal-2022-cross + 10.18653/v1/2022.insights-1.7 Is <fixed-case>BERT</fixed-case> Robust to Label Noise? A Study on Learning with Noisy Labels in Text Classification @@ -108,6 +115,7 @@ uds-lsv/bert-lnl AG News IMDb Movie Reviews + 10.18653/v1/2022.insights-1.8 Ancestor-to-Creole Transfer is Not a Walk in the Park @@ -118,6 +126,7 @@ We aim to learn language models for Creole languages for which large volumes of data are not readily available, and therefore explore the potential transfer from ancestor languages (the ‘Ancestry Transfer Hypothesis’). We find that standard transfer methods do not facilitate ancestry transfer. Surprisingly, different from other non-Creole languages, a very distinct two-phase pattern emerges for Creoles: As our training losses plateau, and language models begin to overfit on their source languages, perplexity on the Creoles drops. We explore if this compression phase can lead to practically useful language models (the ‘Ancestry Bottleneck Hypothesis’), but also falsify this. Moreover, we show that Creoles exhibit this two-phase pattern even when training on random, unrelated languages. Thus Creoles seem to be typological outliers and we speculate whether there is a link between the two observations.
2022.insights-1.9 lent-etal-2022-ancestor + 10.18653/v1/2022.insights-1.9 What <fixed-case>GPT</fixed-case> Knows About Who is Who @@ -133,6 +142,7 @@ yang-etal-2022-gpt awesomecoref/prompt-coref WSC + 10.18653/v1/2022.insights-1.10 Evaluating Biomedical Word Embeddings for Vocabulary Alignment at Scale in the <fixed-case>UMLS</fixed-case> <fixed-case>M</fixed-case>etathesaurus Using <fixed-case>S</fixed-case>iamese Networks @@ -148,6 +158,7 @@ Recent work uses a Siamese Network, initialized with BioWordVec embeddings (distributed word embeddings), for predicting synonymy among biomedical terms to automate a part of the UMLS (Unified Medical Language System) Metathesaurus construction process. We evaluate the use of contextualized word embeddings extracted from nine different biomedical BERT-based models for synonym prediction in the UMLS by replacing BioWordVec embeddings with embeddings extracted from each biomedical BERT model using different feature extraction methods. Finally, we conduct a thorough grid search, which prior work lacks, to find the best set of hyperparameters. Surprisingly, we find that Siamese Networks initialized with BioWordVec embeddings still outperform the Siamese Networks initialized with embeddings extracted from biomedical BERT models. 2022.insights-1.11 bajaj-etal-2022-evaluating + 10.18653/v1/2022.insights-1.11 On the Impact of Data Augmentation on Downstream Performance in Natural Language Processing @@ -160,6 +171,7 @@ 2022.insights-1.12 okimura-etal-2022-impact SST + 10.18653/v1/2022.insights-1.12 Can Question Rewriting Help Conversational Question Answering? @@ -176,6 +188,7 @@ CoQA QReCC QuAC + 10.18653/v1/2022.insights-1.13 Clustering Examples in Multi-Dataset Benchmarks with Item Response Theory @@ -191,6 +204,7 @@ MRQA SST SuperGLUE + 10.18653/v1/2022.insights-1.14 On the Limits of Evaluating Embodied Agent Model Generalization Using Validation Sets @@ -205,6 +219,7 @@ kim-etal-2022-limits AI2-THOR ALFRED + 10.18653/v1/2022.insights-1.15 Do Data-based Curricula Work? @@ -215,6 +230,7 @@ Current state-of-the-art NLP systems use large neural networks that require extensive computational resources for training. Inspired by human knowledge acquisition, researchers have proposed curriculum learning - sequencing tasks (task-based curricula) or ordering and sampling the datasets (data-based curricula) that facilitate training. This work investigates the benefits of data-based curriculum learning for large language models such as BERT and T5. We experiment with various curricula based on complexity measures and different sampling strategies. Extensive experiments on several NLP tasks show that curricula based on various complexity measures rarely have any benefits, while random sampling performs either as well or better than curricula. 2022.insights-1.16 surkov-etal-2022-data + 10.18653/v1/2022.insights-1.16 The Document Vectors Using Cosine Similarity Revisited @@ -226,6 +242,7 @@ bingyu-arefyev-2022-document bgzh/dv_cosine_revisited IMDb Movie Reviews + 10.18653/v1/2022.insights-1.17 Challenges in including extra-linguistic context in pre-trained language models @@ -236,6 +253,7 @@ To successfully account for language, computational models need to take into account both the linguistic context (the content of the utterances) and the extra-linguistic context (for instance, the participants in a dialogue).
We focus on a referential task that asks models to link entity mentions in a TV show to the corresponding characters, and design an architecture that attempts to account for both kinds of context. In particular, our architecture combines a previously proposed specialized module (an “entity library”) for character representation with transfer learning from a pre-trained language model. We find that, although the model does improve linguistic contextualization, it fails to successfully integrate extra-linguistic information about the participants in the dialogue. Our work shows that it is very challenging to incorporate extra-linguistic information into pre-trained language models. 2022.insights-1.18 sorodoc-etal-2022-challenges + 10.18653/v1/2022.insights-1.18 Label Errors in <fixed-case>BANKING</fixed-case>77 @@ -245,6 +263,7 @@ We investigate potential label errors present in the popular BANKING77 dataset and the associated negative impacts on intent classification methods. Motivated by our own negative results when constructing an intent classifier, we applied two automated approaches to identify potential label errors in the dataset. We found that over 1,400 (14%) of the 10,003 training utterances may have been incorrectly labelled. In a simple experiment, we found that by removing the utterances with potential errors, our intent classifier saw an increase of 4.5% and 8% for the F1-Score and Adjusted Rand Index, respectively, in supervised and unsupervised classification. This paper serves as a warning of the potential of noisy labels in popular NLP datasets. Further study is needed to fully identify the breadth and depth of label errors in BANKING77 and other datasets. 2022.insights-1.19 ying-thomas-2022-label + 10.18653/v1/2022.insights-1.19 Pathologies of Pre-trained Language Models in Few-shot Fine-tuning @@ -259,6 +278,7 @@ chen-etal-2022-pathologies IMDb Movie Reviews SNLI + 10.18653/v1/2022.insights-1.20 An Empirical study to understand the Compositional Prowess of Neural Dialog Models @@ -273,6 +293,7 @@ vinayshekharcmu/ComposionalityOfDialogModels DailyDialog MutualFriends + 10.18653/v1/2022.insights-1.21 Combining Extraction and Generation for Constructing Belief-Consequence Causal Links @@ -283,6 +304,7 @@ In this paper, we introduce and justify a new task—causal link extraction based on beliefs—and do a qualitative analysis of the ability of a large language model—InstructGPT-3—to generate implicit consequences of beliefs. With the language model-generated consequences being promising, but not consistent, we propose directions of future work, including data collection, explicit consequence extraction using rule-based and language modeling-based approaches, and using explicitly stated consequences of beliefs to fine-tune or prompt the language model to produce outputs suitable for the task. 2022.insights-1.22 alexeeva-etal-2022-combining + 10.18653/v1/2022.insights-1.22 Replicability under Near-Perfect Conditions – A Case-Study from Automatic Summarization @@ -291,6 +313,7 @@ Replication of research results has become more and more important in Natural Language Processing. Nevertheless, we still rely on results reported in the literature for comparison. Additionally, elements of an experimental setup are not always completely reported. This includes, but is not limited to, reporting specific parameters used or omitting an implementational detail.
In our experiment based on two frequently used data sets from the domain of automatic summarization and the seemingly full disclosure of research artifacts, we examine how well results reported are replicable and what elements influence the success or failure of replication. Our results indicate that publishing research artifacts is far from sufficient, and that publishing all relevant parameters in all possible detail is crucial. 2022.insights-1.23 mieskes-2022-replicability + 10.18653/v1/2022.insights-1.23 <fixed-case>BPE</fixed-case> beyond Word Boundary: How <fixed-case>NOT</fixed-case> to use Multi Word Expressions in Neural Machine Translation @@ -303,6 +326,7 @@ 2022.insights-1.24.OptionalSupplementaryData.zip kumar-thawani-2022-bpe pegasus-lynx/mwe-bpe + 10.18653/v1/2022.insights-1.24 Pre-trained language models evaluating themselves - A comparative study @@ -314,6 +338,7 @@ 2022.insights-1.25 koch-etal-2022-pre lazerlambda/metricscomparison + 10.18653/v1/2022.insights-1.25 diff --git a/data/xml/2022.iwslt.xml b/data/xml/2022.iwslt.xml index 953f53be1a..ac23672461 100644 --- a/data/xml/2022.iwslt.xml +++ b/data/xml/2022.iwslt.xml @@ -25,6 +25,7 @@ This paper addresses the problem of evaluating the quality of automatically generated subtitles, which includes not only the quality of the machine-transcribed or translated speech, but also the quality of line segmentation and subtitle timing. We propose SubER - a single novel metric based on edit distance with shifts that takes all of these subtitle properties into account. We compare it to existing metrics for evaluating transcription, translation, and subtitle quality. A careful human evaluation in a post-editing scenario shows that the new metric has a high correlation with the post-editing effort and direct human assessment scores, outperforming baseline metrics considering only the subtitle text, such as WER and BLEU, and existing methods to integrate segmentation and timing features. 2022.iwslt-1.1 wilken-etal-2022-suber + 10.18653/v1/2022.iwslt-1.1 Improving <fixed-case>A</fixed-case>rabic Diacritization by Learning to Diacritize and Translate @@ -35,6 +36,7 @@ 2022.iwslt-1.2 thompson-alshehri-2022-improving WikiMatrix + 10.18653/v1/2022.iwslt-1.2 Simultaneous Neural Machine Translation with Prefix Alignment @@ -45,6 +47,7 @@ Simultaneous translation is a task that requires starting translation before the speaker has finished speaking, so we face a trade-off between latency and accuracy. In this work, we focus on prefix-to-prefix translation and propose a method to extract alignment between bilingual prefix pairs. We use the alignment to segment a streaming input and fine-tune a translation model. The proposed method demonstrated higher BLEU than the baselines in low latency ranges in our experiments on the IWSLT simultaneous translation benchmark. 2022.iwslt-1.3 kano-etal-2022-simultaneous + 10.18653/v1/2022.iwslt-1.3 Locality-Sensitive Hashing for Long Context Neural Machine Translation @@ -56,6 +59,7 @@ After its introduction the Transformer architecture quickly became the gold standard for the task of neural machine translation. A major advantage of the Transformer compared to previous architectures is the faster training speed achieved by complete parallelization across timesteps due to the use of attention over recurrent layers. However, this also leads to one of the biggest problems of the Transformer, namely the quadratic time and memory complexity with respect to the input length.
In this work, we adapt the locality-sensitive hashing approach of Kitaev et al. (2020) to self-attention in the Transformer, extend it to cross-attention, and apply this memory-efficient framework to sentence- and document-level machine translation. Our experiments show that the LSH attention scheme at the sentence level comes at the cost of slightly reduced translation quality. For document-level NMT, we are able to include much larger context sizes than is possible with the baseline Transformer. However, more context neither improves translation quality nor scores on targeted test suites. 2022.iwslt-1.4 petrick-etal-2022-locality + 10.18653/v1/2022.iwslt-1.4 Anticipation-Free Training for Simultaneous Machine Translation @@ -67,6 +71,7 @@ 2022.iwslt-1.5 chang-etal-2022-anticipation george0828zhang/sinkhorn-simultrans + 10.18653/v1/2022.iwslt-1.5 Who Are We Talking About? Handling Person Names in Speech Translation @@ -79,6 +84,7 @@ gaido-etal-2022-talking hlt-mt/fbk-fairseq Europarl-ST + 10.18653/v1/2022.iwslt-1.6 Joint Generation of Captions and Subtitles with Dual Decoding @@ -93,6 +99,7 @@ xu-etal-2022-joint jitao-xu/dual-decoding MuST-Cinema + 10.18653/v1/2022.iwslt-1.7 <fixed-case>M</fixed-case>irror<fixed-case>A</fixed-case>lign: A Super Lightweight Unsupervised Word Alignment Model via Cross-Lingual Contrastive Learning @@ -104,6 +111,7 @@ Word alignment is essential for downstream cross-lingual language understanding and generation tasks. Recently, the performance of neural word alignment models has exceeded that of statistical models. However, they heavily rely on sophisticated translation models. In this study, we propose a super lightweight unsupervised word alignment model named MirrorAlign, in which bidirectional symmetric attention trained with a contrastive learning objective is introduced, and an agreement loss is employed to bind the attention maps, such that the alignments follow the mirror-like symmetry hypothesis. Experimental results on several public benchmarks demonstrate that our model achieves competitive, if not better, performance compared to the state of the art in word alignment while significantly reducing the training and decoding time on average. Further ablation analysis and case studies show the superiority of our proposed MirrorAlign. Notably, we recognize our model as a pioneering attempt to unify bilingual word embedding and word alignments. Encouragingly, our approach achieves a 16.4X speedup against GIZA++, and 50X parameter compression compared with Transformer-based alignment methods. We release our code to facilitate the community: https://github.com/moore3930/MirrorAlign.
2022.iwslt-1.8 wu-etal-2022-mirroralign + 10.18653/v1/2022.iwslt-1.8 On the Impact of Noises in Crowd-Sourced Data for Speech Translation @@ -116,6 +124,7 @@ ouyang-etal-2022-impact owaski/must-c-clean MuST-C + 10.18653/v1/2022.iwslt-1.9 Findings of the <fixed-case>IWSLT</fixed-case> 2022 Evaluation Campaign @@ -172,6 +181,7 @@ LibriSpeech MuST-C VoxPopuli + 10.18653/v1/2022.iwslt-1.10 The <fixed-case>Y</fixed-case>i<fixed-case>T</fixed-case>rans Speech Translation System for <fixed-case>IWSLT</fixed-case> 2022 Offline Shared Task @@ -185,6 +195,7 @@ MuST-C OpenSubtitles VoxPopuli + 10.18653/v1/2022.iwslt-1.11 <fixed-case>A</fixed-case>mazon <fixed-case>A</fixed-case>lexa <fixed-case>AI</fixed-case>’s System for <fixed-case>IWSLT</fixed-case> 2022 Offline Speech Translation Shared Task @@ -199,6 +210,7 @@ Europarl-ST LibriSpeech MuST-C + 10.18653/v1/2022.iwslt-1.12 Efficient yet Competitive Speech Translation: <fixed-case>FBK</fixed-case>@<fixed-case>IWSLT</fixed-case>2022 @@ -213,6 +225,7 @@ 2022.iwslt-1.13 gaido-etal-2022-efficient hlt-mt/fbk-fairseq + 10.18653/v1/2022.iwslt-1.13 Effective combination of pretrained models - <fixed-case>KIT</fixed-case>@<fixed-case>IWSLT</fixed-case>2022 @@ -230,6 +243,7 @@ How2 LibriSpeech MuST-C + 10.18653/v1/2022.iwslt-1.14 The <fixed-case>USTC</fixed-case>-<fixed-case>NELSLIP</fixed-case> Offline Speech Translation Systems for <fixed-case>IWSLT</fixed-case> 2022 @@ -250,6 +264,7 @@ This paper describes USTC-NELSLIP’s submissions to the IWSLT 2022 Offline Speech Translation task, including speech translation of talks from English to German, English to Chinese and English to Japanese. We describe both cascaded architectures and end-to-end models which can directly translate source speech into target text. In the cascaded condition, we investigate the effectiveness of different model architectures with robust training and achieve a 2.72 BLEU improvement over last year’s optimal system on the MuST-C English-German test set. In the end-to-end condition, we build models based on Transformer and Conformer architectures, achieving a 2.26 BLEU improvement over last year’s optimal end-to-end system. The end-to-end system has obtained promising results, but it still lags behind our cascaded models. 2022.iwslt-1.15 zhang-etal-2022-ustc + 10.18653/v1/2022.iwslt-1.15 The <fixed-case>AISP</fixed-case>-<fixed-case>SJTU</fixed-case> Simultaneous Translation System for <fixed-case>IWSLT</fixed-case> 2022 @@ -266,6 +281,7 @@ This paper describes AISP-SJTU’s submissions for the IWSLT 2022 Simultaneous Translation task. We participate in the text-to-text and speech-to-text simultaneous translation from English to Mandarin Chinese. The training of the CAAT is improved by training across multiple values of the right-context window size, which achieves good online performance without fixing a right-context window size in advance. For the speech-to-text task, the best model we submitted achieves 25.87, 26.21, 26.45 BLEU in the low, medium and high regimes on tst-COMMON, corresponding to 27.94, 28.31, 28.43 BLEU in the text-to-text task. 2022.iwslt-1.16 zhu-etal-2022-aisp + 10.18653/v1/2022.iwslt-1.16 The Xiaomi Text-to-Text Simultaneous Speech Translation System for <fixed-case>IWSLT</fixed-case> 2022 @@ -282,6 +298,7 @@ This system paper describes the Xiaomi Translation System for the IWSLT 2022 Simultaneous Speech Translation (noted as SST) shared task. We participate in the English-to-Mandarin Chinese Text-to-Text (noted as T2T) track.
Our system is built on the Transformer model with novel techniques borrowed from our recent research work. For data filtering, language-model-based and rule-based methods are applied to obtain high-quality bilingual parallel corpora. We also strengthen our system with several established data augmentation techniques, such as knowledge distillation, tagged back-translation, and iterative back-translation. We also incorporate novel training techniques such as R-Drop, deep models, and large-batch training, which have been shown to be beneficial to the naive Transformer model. In the SST scenario, several variations of wait-k strategies are explored. Furthermore, in terms of robustness, both data-based and model-based methods are used to reduce the sensitivity of our system to Automatic Speech Recognition (ASR) outputs. We finally design inference algorithms and use an adaptive-ensemble method based on multiple model variants to further improve the performance of the system. Compared with strong baselines, fusing all techniques can improve our system by 2~3 BLEU under different latency regimes. 2022.iwslt-1.17 guo-etal-2022-xiaomi + 10.18653/v1/2022.iwslt-1.17 <fixed-case>NVIDIA</fixed-case> <fixed-case>N</fixed-case>e<fixed-case>M</fixed-case>o Offline Speech Translation Systems for <fixed-case>IWSLT</fixed-case> 2022 @@ -299,6 +316,7 @@ Europarl-ST LibriSpeech VoxPopuli + 10.18653/v1/2022.iwslt-1.18 The <fixed-case>N</fixed-case>iu<fixed-case>T</fixed-case>rans’s Submission to the <fixed-case>IWSLT</fixed-case>22 <fixed-case>E</fixed-case>nglish-to-<fixed-case>C</fixed-case>hinese Offline Speech Translation Task @@ -314,6 +332,7 @@ This paper describes NiuTrans’s submission to the IWSLT22 English-to-Chinese (En-Zh) offline speech translation task. The end-to-end and bilingual system is built with constrained English and Chinese data and translates the English speech to Chinese text without intermediate transcription. Our speech translation models are composed of different pre-trained acoustic models and machine translation models, connected by two kinds of adapters. We compare the effect of the standard speech feature (e.g. log Mel-filterbank) and the pre-trained speech feature and try to make them interact. The final submission is an ensemble of three potential speech translation models. Our single best and ensemble models achieve 18.66 BLEU and 19.35 BLEU, respectively, on the MuST-C En-Zh tst-COMMON set.
2022.iwslt-1.19 zhang-etal-2022-niutranss + 10.18653/v1/2022.iwslt-1.19 The <fixed-case>HW</fixed-case>-<fixed-case>TSC</fixed-case>’s Offline Speech Translation System for <fixed-case>IWSLT</fixed-case> 2022 Evaluation @@ -335,6 +354,7 @@ wang-etal-2022-hw LibriSpeech TED-LIUM 3 + 10.18653/v1/2022.iwslt-1.20 The <fixed-case>HW</fixed-case>-<fixed-case>TSC</fixed-case>’s Simultaneous Speech Translation System for <fixed-case>IWSLT</fixed-case> 2022 Evaluation @@ -356,6 +376,7 @@ wang-etal-2022-hw-tscs LibriSpeech TED-LIUM 3 + 10.18653/v1/2022.iwslt-1.21 <fixed-case>MLLP</fixed-case>-<fixed-case>VRAIN</fixed-case> <fixed-case>UPV</fixed-case> systems for the <fixed-case>IWSLT</fixed-case> 2022 Simultaneous Speech Translation and Speech-to-Speech Translation tasks @@ -376,6 +397,7 @@ Europarl-ST MuST-C OpenSubtitles + 10.18653/v1/2022.iwslt-1.22 Pretrained Speech Encoders and Efficient Fine-tuning Methods for Speech Translation: <fixed-case>UPC</fixed-case> at <fixed-case>IWSLT</fixed-case> 2022 @@ -390,6 +412,7 @@ tsiamas-etal-2022-pretrained Europarl-ST MuST-C + 10.18653/v1/2022.iwslt-1.23 <fixed-case>CUNI</fixed-case>-<fixed-case>KIT</fixed-case> System for Simultaneous Speech Translation Task at <fixed-case>IWSLT</fixed-case> 2022 @@ -405,6 +428,7 @@ In this paper, we describe our submission to the Simultaneous Speech Translation task at IWSLT 2022. We explore strategies to utilize an offline model in a simultaneous setting without the need to modify the original model. In our experiments, we show that our onlinization algorithm is almost on par with the offline setting while being 3x faster than the offline model in terms of latency on the test set. We also show that the onlinized offline model outperforms the best IWSLT2021 simultaneous system in the medium and high latency regimes and is almost on par in the low latency regime. We make our system publicly available. 2022.iwslt-1.24 polak-etal-2022-cuni + 10.18653/v1/2022.iwslt-1.24 <fixed-case>NAIST</fixed-case> Simultaneous Speech-to-Text Translation System for <fixed-case>IWSLT</fixed-case> 2022 @@ -421,6 +445,7 @@ 2022.iwslt-1.25 fukuda-etal-2022-naist MuST-C + 10.18653/v1/2022.iwslt-1.25 The <fixed-case>HW</fixed-case>-<fixed-case>TSC</fixed-case>’s Speech to Speech Translation System for <fixed-case>IWSLT</fixed-case> 2022 Evaluation @@ -442,6 +467,7 @@ guo-etal-2022-hw LibriSpeech TED-LIUM 3 + 10.18653/v1/2022.iwslt-1.26 <fixed-case>CMU</fixed-case>’s <fixed-case>IWSLT</fixed-case> 2022 Dialect Speech Translation System @@ -458,6 +484,7 @@ This paper describes CMU’s submissions to the IWSLT 2022 dialect speech translation (ST) shared task for translating Tunisian-Arabic speech to English text. We use additional paired Modern Standard Arabic (MSA) data to directly improve the speech recognition (ASR) and machine translation (MT) components of our cascaded systems. We also augment the paired ASR data with pseudo translations via sequence-level knowledge distillation from an MT model and use these artificial triplet ST data to improve our end-to-end (E2E) systems. Our E2E models are based on the Multi-Decoder architecture with searchable hidden intermediates. We extend the Multi-Decoder by orienting the speech encoder towards the target language, applying ST supervision as a hierarchical connectionist temporal classification (CTC) multi-task objective. During inference, we apply joint decoding of the ST CTC and ST autoregressive decoder branches of our modified Multi-Decoder.
Finally, we apply ROVER voting, posterior combination, and minimum Bayes-risk decoding with combined N-best lists to ensemble our various cascaded and E2E systems. Our best systems reached 20.8 and 19.5 BLEU on test2 (blind) and test1, respectively. Without any additional MSA data, we reached 20.4 and 19.2 on the same test sets. 2022.iwslt-1.27 yan-etal-2022-cmus + 10.18653/v1/2022.iwslt-1.27 <fixed-case>ON</fixed-case>-<fixed-case>TRAC</fixed-case> Consortium Systems for the <fixed-case>IWSLT</fixed-case> 2022 Dialect and Low-resource Speech Translation Tasks @@ -476,6 +503,7 @@ This paper describes the ON-TRAC Consortium translation systems developed for two challenge tracks featured in the Evaluation Campaign of IWSLT 2022: low-resource and dialect speech translation. For the Tunisian Arabic-English dataset (low-resource and dialect tracks), we build an end-to-end model as our joint primary submission, and compare it against cascaded models that leverage a large fine-tuned wav2vec 2.0 model for ASR. Our results show that, in our settings, pipeline approaches are still very competitive, and that with the use of transfer learning, they can outperform end-to-end models for speech translation (ST). For the Tamasheq-French dataset (low-resource track), our primary submission leverages intermediate representations from a wav2vec 2.0 model trained on 234 hours of Tamasheq audio, while our contrastive model uses a French phonetic transcription of the Tamasheq audio as input in a Conformer speech translation architecture jointly trained on automatic speech recognition, ST and machine translation losses. Our results highlight that self-supervised models trained on smaller sets of target data are more effective for low-resource end-to-end ST fine-tuning, compared to large off-the-shelf models. Results also illustrate that even approximate phonetic transcriptions can improve ST scores. 2022.iwslt-1.28 zanon-boito-etal-2022-trac + 10.18653/v1/2022.iwslt-1.28 <fixed-case>JHU</fixed-case> <fixed-case>IWSLT</fixed-case> 2022 Dialect Speech Translation System Description @@ -487,6 +515,7 @@ This paper details the Johns Hopkins speech translation (ST) system used in the IWSLT2022 dialect speech translation task. Our system uses a cascade of automatic speech recognition (ASR) and machine translation (MT). We use a Conformer model for ASR systems and a Transformer model for machine translation. Surprisingly, we found that while using additional ASR training data resulted in only a negligible change in performance as measured by BLEU or word error rate (WER), aggressive text normalization improved BLEU more significantly. We also describe an approach, similar to back-translation, for improving performance using synthetic dialectal source text produced from source sentences in mismatched dialects.
2022.iwslt-1.29 yang-etal-2022-jhu + 10.18653/v1/2022.iwslt-1.29 Controlling Translation Formality Using Pre-trained Multilingual Language Models @@ -499,6 +528,7 @@ rippeth-etal-2022-controlling CCMatrix ParaCrawl + 10.18653/v1/2022.iwslt-1.30 Controlling Formality in Low-Resource <fixed-case>NMT</fixed-case> with Domain Adaptation and Re-Ranking: <fixed-case>SLT</fixed-case>-<fixed-case>CDT</fixed-case>-<fixed-case>U</fixed-case>o<fixed-case>S</fixed-case> at <fixed-case>IWSLT</fixed-case>2022 @@ -512,6 +542,7 @@ MuST-C ParaCrawl WikiMatrix + 10.18653/v1/2022.iwslt-1.31 Improving Machine Translation Formality Control with Weakly-Labelled Data Augmentation and Post Editing Strategies @@ -524,6 +555,7 @@ This paper describes Amazon Alexa AI’s implementation for the IWSLT 2022 shared task on formality control. We focus on the unconstrained and supervised task for en→hi (Hindi) and en→ja (Japanese) pairs, where very limited formality-annotated data is available. We propose three simple yet effective post-editing strategies, namely T-V conversion, utilizing a verb conjugator, and seq2seq models, in order to rewrite the translated phrases into formal or informal language. Considering nuances of formality and informality in different languages, our analysis shows that a language-specific post-editing strategy achieves the best performance. To address the unique challenge of limited formality annotations, we further develop a formality classifier to perform weakly labelled data augmentation, which automatically generates synthetic formality labels from a large parallel corpus. Empirical results on the IWSLT formality testset have shown that the proposed system achieves significant improvements in formality accuracy while retaining a BLEU score on par with the baseline. 2022.iwslt-1.32 zhang-etal-2022-improving-machine + 10.18653/v1/2022.iwslt-1.32 <fixed-case>HW</fixed-case>-<fixed-case>TSC</fixed-case>’s Participation in the <fixed-case>IWSLT</fixed-case> 2022 Isometric Spoken Language Translation @@ -543,6 +575,7 @@ This paper presents our submissions to the IWSLT 2022 Isometric Spoken Language Translation task. We participate in all three language pairs (English-German, English-French, English-Spanish) under the constrained setting, and submit an English-German result under the unconstrained setting. We use the standard Transformer model as the baseline and obtain the best performance via one of its variants that shares the decoder input and output embedding. We perform detailed pre-processing and filtering on the provided bilingual data. Several strategies are used to train our models, such as Multilingual Translation, Back Translation, Forward Translation, R-Drop, Average Checkpoint, and Ensemble. We investigate three methods for biasing the output length: i) conditioning the output on a given target-source length-ratio class; ii) enriching the Transformer positional embedding with length information; and iii) length-control decoding for non-autoregressive translation. Our submissions achieve 30.7, 41.6 and 36.7 BLEU, respectively, on the tst-COMMON test sets for the English-German, English-French, and English-Spanish tasks, and 100% comply with the length requirements.
2022.iwslt-1.33 li-etal-2022-hw + 10.18653/v1/2022.iwslt-1.33 <fixed-case>A</fixed-case>pp<fixed-case>T</fixed-case>ek’s Submission to the <fixed-case>IWSLT</fixed-case> 2022 Isometric Spoken Language Translation Task @@ -553,6 +586,7 @@ 2022.iwslt-1.34 wilken-matusov-2022-appteks MuST-C + 10.18653/v1/2022.iwslt-1.34 Hierarchical Multi-task learning framework for Isometric-Speech Language Translation @@ -567,6 +601,7 @@ aakash0017/machine-translation-iswlt MuST-C PAWS-X + 10.18653/v1/2022.iwslt-1.35 diff --git a/data/xml/2022.lchange.xml b/data/xml/2022.lchange.xml index a9ccee8d26..1853ca666f 100644 --- a/data/xml/2022.lchange.xml +++ b/data/xml/2022.lchange.xml @@ -44,6 +44,7 @@ We present a benchmark in six European languages containing manually annotated information about olfactory situations and events following a FrameNet-like approach. The document selection covers ten domains of interest to cultural historians in the olfactory domain and includes texts published between 1620 and 1920, allowing a diachronic analysis of smell descriptions. With this work, we aim to foster the development of olfactory information extraction approaches as well as the analysis of changes in smell descriptions over time. 2022.lchange-1.1 menini-etal-2022-multilingual + 10.18653/v1/2022.lchange-1.1 Language Acquisition, Neutral Change, and Diachronic Trends in Noun Classifiers @@ -54,6 +55,7 @@ 2022.lchange-1.2 kali-kodner-2022-language an-k45/classifier-change + 10.18653/v1/2022.lchange-1.2 Deconstructing destruction: A Cognitive Linguistics perspective on a computational analysis of diachronic change @@ -64,6 +66,7 @@ In this paper, we aim to introduce a Cognitive Linguistics perspective into a computational analysis of near-synonyms. We focus on a single set of Dutch near-synonyms, vernielen and vernietigen, roughly translated as ‘to destroy’, replicating the analysis from Geeraerts (1997) with distributional models. Our analysis, which tracks the meaning of both words in a corpus of 16th-20th century prose data, shows that both lexical items have undergone semantic change, led by differences in their prototypical semantic core. 2022.lchange-1.3 franco-etal-2022-deconstructing + 10.18653/v1/2022.lchange-1.3 What is Done is Done: an Incremental Approach to Semantic Shift Detection @@ -75,6 +78,7 @@ Contextual word embedding techniques for semantic shift detection are receiving more and more attention. In this paper, we present What is Done is Done (WiDiD), an incremental approach to semantic shift detection based on incremental clustering techniques and contextual embedding methods to capture the changes in the meanings of a target word across a diachronic corpus. In WiDiD, the word contexts observed in the past are consolidated as a set of clusters that constitute the “memory” of the word meanings observed so far. Such a memory is exploited as a basis for subsequent word observations, so that the meanings observed in the present are stratified over the past ones. 2022.lchange-1.4 periti-etal-2022-done + 10.18653/v1/2022.lchange-1.4 From qualifiers to quantifiers: semantic shift at the paradigm level @@ -83,6 +87,7 @@ Language change has often been conceived as a competition between linguistic variants. However, language units may be complex organizations in themselves, e.g. in the case of schematic constructions, featuring a free slot. Such a slot is filled by words forming a set or ‘paradigm’ and engaging in inter-related dynamics within this constructional environment.
To tackle this complexity, a simple computational method is offered to automatically characterize their interactions, and visualize them through networks of cooperation and competition. Applying this method to the French paradigm of quantifiers, I show that this method efficiently captures phenomena regarding the evolving organization of constructional paradigms, in particular the constitution of competing clusters of fillers that promote different semantic strategies overall. 2022.lchange-1.5 feltgen-2022-qualifiers + 10.18653/v1/2022.lchange-1.5 Do Not Fire the Linguist: Grammatical Profiles Help Language Models Detect Semantic Change @@ -93,6 +98,7 @@ Morphological and syntactic changes in word usage — as captured, e.g., by grammatical profiles — have been shown to be good predictors of a word’s meaning change. In this work, we explore whether large pre-trained contextualised language models, a common tool for lexical semantic change detection, are sensitive to such morphosyntactic changes. To this end, we first compare the performance of grammatical profiles against that of a multilingual neural language model (XLM-R) on 10 datasets, covering 7 languages, and then combine the two approaches in ensembles to assess their complementarity. Our results show that ensembling grammatical profiles with XLM-R improves semantic change detection performance for most datasets and languages. This indicates that language models do not fully cover the fine-grained morphological and syntactic signals that are explicitly represented in grammatical profiles. Interesting exceptions are the test sets where the time spans under analysis are much longer than the time gap between them (for example, century-long spans with a one-year gap between them). Morphosyntactic change is slow, so grammatical profiles fail to detect it in such cases. In contrast, language models, thanks to their access to lexical information, are able to detect fast topical changes. 2022.lchange-1.6 giulianelli-etal-2022-fire + 10.18653/v1/2022.lchange-1.6 Explainable Publication Year Prediction of Eighteenth Century Texts with the <fixed-case>BERT</fixed-case> Model @@ -109,6 +115,7 @@ In this paper, we describe a BERT model trained on the Eighteenth Century Collections Online (ECCO) dataset of digitized documents. The ECCO dataset poses unique modelling challenges due to the presence of Optical Character Recognition (OCR) artifacts. We establish the performance of the BERT model on a publication year prediction task against linear baseline models and human judgement, finding the BERT model to be superior to both and able to date the works, on average, with less than 7 years absolute error. 2022.lchange-1.7 rastas-etal-2022-explainable + 10.18653/v1/2022.lchange-1.7 Using Cross-Lingual Part of Speech Tagging for Partially Reconstructing the Classic Language Family Tree Model @@ -119,6 +126,7 @@ The tree model is well known for expressing the historic evolution of languages. This model has been considered a method of describing genetic relationships between languages. Nevertheless, some researchers question the model’s ability to predict the proximity between two languages, since it represents genetic relatedness rather than linguistic resemblance. Defining other language proximity models has been an active research area for many years.
In this paper, we explore a part-of-speech model for defining proximity between languages using a multilingual language model that was fine-tuned on the task of cross-lingual part-of-speech tagging. We train the model on one language and evaluate it on another; the measured performance is then used to define the proximity between the two languages. By further developing the model, we show that it can reconstruct some parts of the tree model. 2022.lchange-1.8 samohi-etal-2022-using + 10.18653/v1/2022.lchange-1.8 A New Framework for Fast Automated Phonological Reconstruction Using Trimmed Alignments and Sound Correspondence Patterns @@ -130,6 +138,7 @@ 2022.lchange-1.9 list-etal-2022-new lingpy/supervised-reconstruction-paper + 10.18653/v1/2022.lchange-1.9 Caveats of Measuring Semantic Change of Cognates and Borrowings using Multilingual Word Embeddings @@ -140,6 +149,7 @@ 2022.lchange-1.10 fourrier-montariol-2022-caveats clefourrier/historical-semantic-change + 10.18653/v1/2022.lchange-1.10 Lexicon of Changes: Towards the Evaluation of Diachronic Semantic Shift in <fixed-case>C</fixed-case>hinese @@ -150,6 +160,7 @@ Recent research has brought a wave of computational approaches to the classic topic of semantic change, aiming to tackle one of the most challenging issues in the evolution of human language. While several methods for detecting semantic change have been proposed, such studies are limited to a few languages, where evaluation datasets are available. This paper presents the first dataset for evaluating Chinese semantic change in contexts preceding and following the Reform and Opening-up, covering a 50-year period in Modern Chinese. Following the DURel framework, we collected 6,000 human judgments for the dataset. We also report the performance of alignment-based word embedding models on this evaluation dataset, achieving high and significant correlation scores. 2022.lchange-1.11 chen-etal-2022-lexicon + 10.18653/v1/2022.lchange-1.11 Low <fixed-case>S</fixed-case>axon dialect distances at the orthographic and syntactic level @@ -160,6 +171,7 @@ We compare five Low Saxon dialects from the 19th and 21st century from Germany and the Netherlands with each other as well as with modern Standard Dutch and Standard German. Our comparison is based on character n-grams on the one hand and PoS n-grams on the other, and we show that these two lead to different distances. Particularly in the PoS-based distances, one can observe all of the 21st century Low Saxon dialects shifting towards the modern majority languages. 2022.lchange-1.12 siewert-etal-2022-low + 10.18653/v1/2022.lchange-1.12 “Vaderland”, “Volk” and “Natie”: Semantic Change Related to Nationalism in <fixed-case>D</fixed-case>utch Literature Between 1700 and 1880 Captured with Dynamic <fixed-case>B</fixed-case>ernoulli Word Embeddings @@ -170,6 +182,7 @@ Languages can respond to external events in various ways - new words or named entities may be created, additional senses might develop for already existing words, or the valence of words can change. In this work, we explore the semantic shift of the Dutch words “natie” (“nation”), “volk” (“people”) and “vaderland” (“fatherland”) over a period that is known for the rise of nationalism in Europe: 1700-1880. The semantic change is measured by means of Dynamic Bernoulli Word Embeddings which allow for comparison between word embeddings over different time slices. The word embeddings were generated based on Dutch fiction literature divided over different decades.
From the analysis of the absolute drifts, it appears that the word “natie” underwent a relatively small drift. However, the drifts of “vaderland” and “volk” show multiple peaks, culminating around the turn of the nineteenth century. To verify whether this semantic change can indeed be attributed to nationalistic movements, a detailed analysis of the nearest neighbours of the target words is provided. From the analysis, it appears that “natie”, “volk” and “vaderland” became more nationalistically-loaded over time. 2022.lchange-1.13 timmermans-etal-2022-vaderland + 10.18653/v1/2022.lchange-1.13 Using neural topic models to track context shifts of words: a case study of <fixed-case>COVID</fixed-case>-related terms before and after the lockdown in <fixed-case>A</fixed-case>pril 2020 @@ -179,6 +192,7 @@ This paper explores lexical meaning changes in a new dataset, which includes tweets from before and after the COVID-related lockdown in April 2020. We use this dataset to evaluate traditional and more recent unsupervised approaches to lexical semantic change that make use of contextualized word representations based on the BERT neural language model to obtain representations of word usages. We argue that previous models that encode local representations of words cannot capture global context shifts such as the context shift of face masks since the pandemic outbreak. We experiment with neural topic models to track context shifts of words. We show that this approach can reveal textual associations of words that go beyond their lexical meaning representation. We discuss future work and how to proceed in capturing the pragmatic aspect of meaning change as opposed to lexical semantic change. 2022.lchange-1.14 kellert-mahmud-uz-zaman-2022-using + 10.18653/v1/2022.lchange-1.14 Roadblocks in Gender Bias Measurement for Diachronic Corpora @@ -192,6 +206,7 @@ 2022.lchange-1.15 alshahrani-etal-2022-roadblocks clarkson-accountability-transparency/gbiasroadblocks + 10.18653/v1/2022.lchange-1.15 <fixed-case>LSCD</fixed-case>iscovery: A shared task on semantic change discovery and detection in <fixed-case>S</fixed-case>panish @@ -202,6 +217,7 @@ We present the first shared task on semantic change discovery and detection in Spanish. We create the first dataset of Spanish words manually annotated for semantic change using the DURel framework (Schlechtweg et al., 2018). The task is divided into two phases: 1) graded change discovery, and 2) binary change detection. In addition to introducing a new language for this task, the main novelty with respect to the previous tasks consists in predicting and evaluating changes for all vocabulary words in the corpus. Six teams participated in phase 1 and seven teams in phase 2 of the shared task, and the best system obtained a Spearman rank correlation of 0.735 for phase 1 and an F1 score of 0.735 for phase 2. We describe the systems developed by the competing teams, highlighting the techniques that were particularly useful. 2022.lchange-1.16 d-zamora-reina-etal-2022-black + 10.18653/v1/2022.lchange-1.16 <fixed-case>BOS</fixed-case> at <fixed-case>LSCD</fixed-case>iscovery: Lexical Substitution for Interpretable Lexical Semantic Change Detection @@ -211,6 +227,7 @@ We propose a solution for the LSCDiscovery shared task on Lexical Semantic Change Detection in Spanish. Our approach is based on generating lexical substitutes that describe old and new senses of a given word. This approach achieves the second best result in the sense loss and sense gain detection subtasks.
By observing those substitutes that are specific to only one time period, one can understand which senses were gained or lost. This allows us to provide more detailed information about semantic change to the user and makes our method interpretable. 2022.lchange-1.17 kudisov-arefyev-2022-black + 10.18653/v1/2022.lchange-1.17 <fixed-case>D</fixed-case>eep<fixed-case>M</fixed-case>istake at <fixed-case>LSCD</fixed-case>iscovery: Can a Multilingual Word-in-Context Model Replace Human Annotators? @@ -220,6 +237,7 @@ In this paper, we describe our solution for the LSCDiscovery shared task on Lexical Semantic Change Discovery (LSCD) in Spanish. Our solution employs a Word-in-Context (WiC) model, which is trained to determine if a particular word has the same meaning in two given contexts. We essentially try to replicate the annotation of the dataset for the shared task, but with human annotators replaced by a neural network. In the graded change discovery subtask, our solution has achieved the 2nd best result according to all metrics. In the main binary change detection subtask, our F1-score is 0.655 compared to 0.716 for the best submission, corresponding to the 5th place. However, in the optional sense gain detection subtask we have outperformed all other participants. During the post-evaluation experiments, we compared different ways to prepare WiC data in Spanish for fine-tuning. We have found that it helps to keep only examples annotated as 1 (unrelated senses) and 4 (identical senses), rather than using twice as many examples including intermediate annotations. 2022.lchange-1.18 homskiy-arefyev-2022-black + 10.18653/v1/2022.lchange-1.18 <fixed-case>UA</fixed-case>lberta at <fixed-case>LSCD</fixed-case>iscovery: Lexical Semantic Change Detection via Word Sense Disambiguation @@ -230,6 +248,7 @@ We describe our two systems for the shared task on Lexical Semantic Change Discovery in Spanish. For binary change detection, we frame the task as a word sense disambiguation (WSD) problem. We derive sense frequency distributions for target words in both old and modern corpora. We assume that the word semantics have changed if a sense is observed in only one of the two corpora, or the relative change for any sense exceeds a tuned threshold. For graded change discovery, we follow the design of CIRCE (Pömsl and Lyapin, 2020) by combining both static and contextual embeddings. For contextual embeddings, we use XLM-RoBERTa instead of BERT, and train the model to predict a masked token instead of the time period. Our language-independent methods achieve results that are close to the best-performing systems in the shared task. 2022.lchange-1.19 teodorescu-etal-2022-black + 10.18653/v1/2022.lchange-1.19 <fixed-case>C</fixed-case>o<fixed-case>T</fixed-case>o<fixed-case>H</fixed-case>i<fixed-case>L</fixed-case>i at <fixed-case>LSCD</fixed-case>iscovery: the Role of Linguistic Features in Predicting Semantic Change @@ -243,6 +262,7 @@ This paper presents the contributions of the CoToHiLi team to the LSCDiscovery shared task on semantic change in the Spanish language. We participated in both tasks (graded discovery and binary change, including sense gain and sense loss) and proposed models based on word embedding distances combined with hand-crafted linguistic features, including polysemy, number of neological synonyms, and relation to cognates in English. We find that models that include linguistically informed features combined using weights assigned manually by experts lead to promising results.
2022.lchange-1.20 sabina-uban-etal-2022-black + 10.18653/v1/2022.lchange-1.20 <fixed-case>HSE</fixed-case> at <fixed-case>LSCD</fixed-case>iscovery in <fixed-case>S</fixed-case>panish: Clustering and Profiling for Lexical Semantic Change Discovery @@ -256,6 +276,7 @@ kashleva-etal-2022-black Various fixes throughout the paper. + 10.18653/v1/2022.lchange-1.21 <fixed-case>G</fixed-case>loss<fixed-case>R</fixed-case>eader at <fixed-case>LSCD</fixed-case>iscovery: Train to Select a Proper Gloss in <fixed-case>E</fixed-case>nglish – Discover Lexical Semantic Change in <fixed-case>S</fixed-case>panish @@ -265,6 +286,7 @@ The contextualized embeddings obtained from neural networks pre-trained as Language Models (LM) or Masked Language Models (MLM) are not well suited to solving the Lexical Semantic Change Detection (LSCD) task because they are more sensitive to changes in word forms than in word meaning, a property previously known as the word form bias or orthographic bias. Unlike many other NLP tasks, it is also not obvious how to fine-tune such models for LSCD. In order to conclude if there are any differences between senses of a particular word in two corpora, a human annotator or a system must analyze many examples containing this word from both corpora. This makes annotation of LSCD datasets very labour-intensive. The existing LSCD datasets contain up to 100 words that are labeled according to their semantic change, which is hardly enough for fine-tuning. To solve these problems, we fine-tune the XLM-R MLM as part of a gloss-based WSD system on a large WSD dataset in English. Then we employ the zero-shot cross-lingual transferability of XLM-R to build the contextualized embeddings for examples in Spanish. In order to obtain the graded change score for each word, we calculate the average distance between our improved contextualized embeddings of its old and new occurrences. For the binary change detection subtask, we apply thresholding to the same scores. Our solution has shown the best results among all participants in all subtasks except for the optional sense gain detection subtask. 2022.lchange-1.22 rachinskiy-arefyev-2022-black + 10.18653/v1/2022.lchange-1.22 diff --git a/data/xml/2022.lnls.xml b/data/xml/2022.lnls.xml index d02d8b1635..da6435ffd7 100644 --- a/data/xml/2022.lnls.xml +++ b/data/xml/2022.lnls.xml @@ -27,6 +27,7 @@ 2022.lnls-1.1 ri-etal-2022-finding ALFRED + 10.18653/v1/2022.lnls-1.1 <fixed-case>G</fixed-case>rammar<fixed-case>SHAP</fixed-case>: An Efficient Model-Agnostic and Structure-Aware <fixed-case>NLP</fixed-case> Explainer @@ -41,6 +42,7 @@ mosca-etal-2022-grammarshap IMDb Movie Reviews SST + 10.18653/v1/2022.lnls-1.2 Single-Turn Debate Does Not Help Humans Answer Hard Reading-Comprehension Questions @@ -55,6 +57,7 @@ Current QA systems can generate reasonable-sounding yet false answers without explanation or evidence for the generated answer, which is especially problematic when humans cannot readily check the model’s answers. This presents a challenge for building trust in machine learning systems. We take inspiration from real-world situations where difficult questions are answered by considering opposing sides (see Irving et al., 2018). For multiple-choice QA examples, we build a dataset of single arguments for both a correct and incorrect answer option in a debate-style set-up as an initial step in training models to produce explanations for two candidate answers.
We use long contexts: humans familiar with the context write convincing explanations for pre-selected correct and incorrect answers, and we test if those explanations allow humans who have not read the full context to more accurately determine the correct answer. We do not find that explanations in our set-up improve human accuracy, but a baseline condition shows that providing human-selected text snippets does improve accuracy. We use these findings to suggest ways of improving the debate set-up for future data collection efforts. 2022.lnls-1.3 parrish-etal-2022-single + 10.18653/v1/2022.lnls-1.3 When Can Models Learn From Explanations? A Formal Framework for Understanding the Roles of Explanation Data @@ -68,6 +71,7 @@ SNLI TACRED e-SNLI + 10.18653/v1/2022.lnls-1.4 A survey on improving <fixed-case>NLP</fixed-case> models with human explanations @@ -78,6 +82,7 @@ 2022.lnls-1.5 hartmann-sonntag-2022-survey e-SNLI + 10.18653/v1/2022.lnls-1.5 diff --git a/data/xml/2022.ltedi.xml b/data/xml/2022.ltedi.xml index 9ccbdc425b..ba44e72a92 100644 --- a/data/xml/2022.ltedi.xml +++ b/data/xml/2022.ltedi.xml @@ -27,6 +27,7 @@ 2022.ltedi-1.1 markl-2022-mind Common Voice + 10.18653/v1/2022.ltedi-1.1 Regex in a Time of Deep Learning: The Role of an Old Technology in Age Discrimination Detection in Job Advertisements @@ -37,6 +38,7 @@ Deep learning holds great promise for detecting discriminatory language in the public sphere. However, for the detection of illegal age discrimination in job advertisements, regex approaches are still strong performers. In this paper, we investigate job advertisements in the Netherlands. We present a qualitative analysis of the benefits of the ‘old’ approach based on regexes and investigate how neural embeddings could address its limitations. 2022.ltedi-1.2 pillar-etal-2022-regex + 10.18653/v1/2022.ltedi-1.2 Doing not Being: Concrete Language as a Bridge from Language Technology to Ethnically Inclusive Job Ads @@ -48,6 +50,7 @@ This paper makes the case for studying concreteness in language as a bridge that will allow language technology to support the understanding and improvement of ethnic inclusivity in job advertisements. We propose an annotation scheme that guides the assignment of sentences in job ads to classes that reflect concrete actions, i.e., what the employer needs people to do, and abstract dispositions, i.e., who the employer expects people to be. Using an annotated dataset of Dutch-language job ads, we demonstrate that machine learning technology is effectively able to distinguish these classes. 2022.ltedi-1.3 adams-etal-2022-concrete + 10.18653/v1/2022.ltedi-1.3 Measuring Harmful Sentence Completion in Language Models for <fixed-case>LGBTQIA</fixed-case>+ Individuals @@ -61,6 +64,7 @@ nozza-etal-2022-measuring milanlproc/honest HONEST + 10.18653/v1/2022.ltedi-1.4 Using <fixed-case>BERT</fixed-case> Embeddings to Model Word Importance in Conversational Transcripts for Deaf and Hard of Hearing Users @@ -72,6 +76,7 @@ Deaf and hard of hearing individuals regularly rely on captioning while watching live TV. Live TV captioning is evaluated by regulatory agencies using various caption evaluation metrics. However, caption evaluation metrics are often not informed by preferences of DHH users or how meaningful the captions are. There is a need to construct caption evaluation metrics that take the relative importance of words in a transcript into account.
We conducted a correlation analysis between two types of word embeddings and human-annotated word-importance scores in an existing corpus. We found that normalized contextualized word embeddings generated using BERT correlated better with manually annotated importance scores than word2vec-based word embeddings. We make available a pairing of word embeddings and their human-annotated importance scores. We also provide proof-of-concept utility by training word importance models, achieving an F1-score of 0.57 in the 6-class word importance classification task. 2022.ltedi-1.5 amin-etal-2022-using + 10.18653/v1/2022.ltedi-1.5 Detoxifying Language Models with a Toxic Corpus @@ -82,6 +87,7 @@ 2022.ltedi-1.6 park-rudzicz-2022-detoxifying WebText + 10.18653/v1/2022.ltedi-1.6 Inferring Gender: A Scalable Methodology for Gender Detection with Online Lexical Databases @@ -92,6 +98,7 @@ 2022.ltedi-1.7 bartl-leavy-2022-inferring marionbartl/lexical-gender + 10.18653/v1/2022.ltedi-1.7 Debiasing Pre-Trained Language Models via Efficient Fine-Tuning @@ -106,6 +113,7 @@ CrowS-Pairs StereoSet WinoBias + 10.18653/v1/2022.ltedi-1.8 Disambiguation of morpho-syntactic features of <fixed-case>A</fixed-case>frican <fixed-case>A</fixed-case>merican <fixed-case>E</fixed-case>nglish – the case of habitual be @@ -117,6 +125,7 @@ Recent research has highlighted that natural language processing (NLP) systems exhibit a bias against African American speakers. These errors are often caused by poor representation of linguistic features unique to African American English (AAE), which is due to the relatively low probability of occurrence for many such features. We present a workflow to overcome this issue in the case of habitual “be”. Habitual “be” is isomorphic, and therefore ambiguous, with other forms of uninflected “be” found in both AAE and General American English (GAE). This creates a clear bias challenge for NLP technologies. To overcome this scarcity, we employ a combination of rule-based filters and data augmentation that generates a corpus balanced between habitual and non-habitual instances. This balanced corpus trains unbiased machine learning classifiers, as demonstrated on a corpus of AAE transcribed texts, achieving an F1 score of 0.65 at classifying habitual “be”. 2022.ltedi-1.9 santiago-etal-2022-disambiguation + 10.18653/v1/2022.ltedi-1.9 Behind the Mask: Demographic bias in name detection for <fixed-case>PII</fixed-case> masking @@ -128,6 +137,7 @@ 2022.ltedi-1.10 mansfield-etal-2022-behind csmansfield/pii-masking-bias + 10.18653/v1/2022.ltedi-1.10 Mapping the Multilingual Margins: Intersectional Biases of Sentiment Analysis Systems in <fixed-case>E</fixed-case>nglish, <fixed-case>S</fixed-case>panish, and <fixed-case>A</fixed-case>rabic @@ -140,6 +150,7 @@ As natural language processing systems become more widespread, it is necessary to address fairness issues in their implementation and deployment to ensure that their negative impacts on society are understood and minimized. However, there is limited work that studies fairness using a multilingual and intersectional framework or on downstream tasks. In this paper, we introduce four multilingual Equity Evaluation Corpora, supplementary test sets designed to measure social biases, and a novel statistical framework for studying unisectional and intersectional social biases in natural language processing.
We use these tools to measure gender, racial, ethnic, and intersectional social biases across five models trained on emotion regression tasks in English, Spanish, and Arabic. We find that many systems demonstrate statistically significant unisectional and intersectional social biases. We make our code and datasets available for download. 2022.ltedi-1.11 camara-etal-2022-mapping + 10.18653/v1/2022.ltedi-1.11 <fixed-case>M</fixed-case>onte <fixed-case>C</fixed-case>arlo Tree Search for Interpreting Stress in Natural Language @@ -151,6 +162,7 @@ 2022.ltedi-1.12 swanson-etal-2022-monte swansonk14/mcts_interpretability + 10.18653/v1/2022.ltedi-1.12 <fixed-case>IIITS</fixed-case>urat@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: Hope Speech Detection using Machine Learning @@ -162,6 +174,7 @@ This paper addresses the issue of Hope Speech detection using machine learning techniques. Designing a robust model that helps in predicting the target class with higher accuracy is a challenging task in machine learning, especially when the distribution of the class labels is highly imbalanced. This study applies and compares the experimental outcomes of different oversampling techniques. Many models are implemented to classify the comments into Hope and Non-Hope speech, and it was found that machine learning algorithms perform better than deep learning models. The English language dataset used in this research was developed by collecting YouTube comments and is part of the task “ACL-2022: Hope Speech Detection for Equality, Diversity, and Inclusion”. The proposed model achieved a weighted F1-score of 0.55 on the test dataset and secured the first rank among the participating teams. 2022.ltedi-1.13 roy-etal-2022-iiitsurat + 10.18653/v1/2022.ltedi-1.13 The Best of both Worlds: Dual Channel Language modeling for Hope Speech Detection in low-resourced <fixed-case>K</fixed-case>annada @@ -175,6 +188,7 @@ 2022.ltedi-1.14 hande-etal-2022-best KanHope + 10.18653/v1/2022.ltedi-1.14 <fixed-case>NYCU</fixed-case>_<fixed-case>TWD</fixed-case>@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: Ensemble Models with <fixed-case>VADER</fixed-case> and Contrastive Learning for Detecting Signs of Depression from Social Media @@ -186,6 +200,7 @@ This paper presents a state-of-the-art solution to the LT-EDI-ACL 2022 Task 4: Detecting Signs of Depression from Social Media Text. The goal of this task is to detect the severity levels of depression of people from social media posts, where people often share their feelings on a daily basis. To detect the signs of depression, we propose a framework with pre-trained language models using rich information instead of training from scratch, gradient boosting and deep learning models for modeling various aspects, and supervised contrastive learning for the generalization ability. Moreover, ensemble techniques are also employed in consideration of the different advantages of each method. Experiments show that our framework achieves a 2nd prize ranking with a macro F1-score of 0.552, showing the effectiveness and robustness of our approach.
2022.ltedi-1.15 wang-etal-2022-nycu + 10.18653/v1/2022.ltedi-1.15 <fixed-case>UMUT</fixed-case>eam@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: Detecting homophobic and transphobic comments in <fixed-case>T</fixed-case>amil @@ -196,6 +211,7 @@ These working notes describe the participation of the UMUTeam in the LT-EDI shared task concerning the identification of homophobic and transphobic comments on YouTube. These comments are written in English, which has a high availability of machine-learning resources; Tamil, which has fewer resources; and a transliteration from Tamil to Roman script combined with English sentences. To carry out this shared task, we train a neural network that combines several feature sets applying a knowledge integration strategy. These features are linguistic features extracted from a tool developed by our research group and contextual and non-contextual sentence embeddings. We ranked 7th for the English subtask (macro f1-score of 45%), 3rd for the Tamil subtask (macro f1-score of 82%), and 2nd for the Tamil-English subtask (macro f1-score of 58%). 2022.ltedi-1.16 garcia-diaz-etal-2022-umuteam-lt + 10.18653/v1/2022.ltedi-1.16 <fixed-case>UMUT</fixed-case>eam@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: Detecting Signs of Depression from text @@ -205,6 +221,7 @@ Depression is a mental condition related to sadness and the lack of interest in common daily tasks. In these working notes, we describe the proposal of the UMUTeam in the LT-EDI shared task (ACL 2022) concerning the identification of signs of depression in social network posts. This task is somehow related to other relevant Natural Language Processing tasks such as Emotion Analysis. In this shared task, the organisers challenged the participants to distinguish between moderate and severe signs of depression (or no signs of depression at all) in a set of social posts written in English. Our proposal is based on the combination of linguistic features and several sentence embeddings using a knowledge integration strategy. Our proposal achieved the 6th position, with a macro f1-score of 53.82 in the official leaderboard. 2022.ltedi-1.17 garcia-diaz-valencia-garcia-2022-umuteam + 10.18653/v1/2022.ltedi-1.17 bitsa_nlp@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: Leveraging Pretrained Language Models for Detecting Homophobia and Transphobia in Social Media Comments @@ -215,6 +232,7 @@ 2022.ltedi-1.18 bhandari-goyal-2022-bitsa vitthal-bhandari/homophobia-transphobia-detection + 10.18653/v1/2022.ltedi-1.18 <fixed-case>ABLIMET</fixed-case> @<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: A Roberta based Approach for Homophobia/Transphobia Detection in Social Media @@ -223,6 +241,7 @@ This paper describes our system that participated in LT-EDI-ACL2022 Homophobia/Transphobia Detection in Social Media. Sexual minorities face a lot of unfair treatment and discrimination in our world. This creates enormous stress and many psychological problems for sexual minorities. There is a lot of hate speech on the internet, and Homophobia/Transphobia is the kind directed against sexual minorities. Identifying and processing Homophobia/Transphobia through natural language processing technology can improve the efficiency of handling such content and can quickly screen out Homophobia/Transphobia on the Internet.
The organizers of LT-EDI-ACL2022 Homophobia/Transphobia Detection in Social Media constructed a Homophobia/Transphobia detection dataset based on YouTube comments in English and Tamil. We use a RoBERTa-based approach to conduct Homophobia/Transphobia detection experiments on the competition dataset and achieve good results. 2022.ltedi-1.19 maimaitituoheti-2022-ablimet + 10.18653/v1/2022.ltedi-1.19 <fixed-case>MUCIC</fixed-case>@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: Hope Speech Detection using Data Re-Sampling and 1<fixed-case>D</fixed-case> Conv-<fixed-case>LSTM</fixed-case> @@ -234,6 +253,7 @@ Spreading positive vibes or hope content on social media may help many people to get motivated in their life. To address Hope Speech detection in YouTube comments, this paper describes the models submitted by our team, MUCIC, to the Hope Speech Detection for Equality, Diversity, and Inclusion (HopeEDI) shared task at the Association for Computational Linguistics (ACL) 2022. This shared task consists of texts in five languages, namely: English, Spanish (in Latin scripts), and Tamil, Malayalam, and Kannada (in code-mixed native and Roman scripts), with the aim of classifying the YouTube comment into “Hope”, “Not-Hope” or “Not-Intended” categories. The proposed methodology uses a re-sampling technique to deal with imbalanced data in the corpus and obtained 1st rank for the English language with a macro-averaged F1-score of 0.550 and a weighted-averaged F1-score of 0.860. The code to reproduce this work is available on GitHub. 2022.ltedi-1.20 gowda-etal-2022-mucic + 10.18653/v1/2022.ltedi-1.20 <fixed-case>D</fixed-case>eep<fixed-case>B</fixed-case>lues@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: Depression level detection modelling through domain specific <fixed-case>BERT</fixed-case> and short text Depression classifiers @@ -245,6 +265,7 @@ We discuss a variety of approaches to build a robust Depression level detection model from longer social media posts (i.e., Reddit Depression forum posts) using a mental health text pre-trained BERT model. Further, we report our experimental results based on a strategy to select excerpts from long text and then fine-tune the BERT model to combat the issue of memory constraints while processing such texts. We show that, with domain-specific BERT, we can achieve reasonable accuracy with a fixed text size (in this case, 200 tokens) for this task. In addition, we can use short-text classifiers to extract relevant text from the long text and achieve slightly better accuracy, albeit at the cost of the processing time needed to extract such excerpts. 2022.ltedi-1.21 farruque-etal-2022-deepblues + 10.18653/v1/2022.ltedi-1.21 <fixed-case>SSN</fixed-case>_<fixed-case>ARMM</fixed-case>@ <fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case> -<fixed-case>ACL</fixed-case>2022: Hope Speech Detection for Equality, Diversity, and Inclusion Using <fixed-case>ALBERT</fixed-case> model @@ -259,6 +280,7 @@ In recent years, social media has become one of the major forums for expressing human views and emotions. With the help of smartphones and high-speed internet, anyone can express their views on social media. However, this can also lead to the spread of hatred and violence in society. Therefore, it is necessary to build a method to find and support helpful social media content.
In this paper, we studied a Natural Language Processing approach for detecting Hope speech in a given sentence. The task was to classify the sentences into ‘Hope speech’ and ‘Non-hope speech’. The dataset was provided by the LT-EDI organizers with text from YouTube comments. Based on the task description, we developed a system using the pre-trained language model BERT to complete this task. Our model achieved 1st rank in the Kannada language with a weighted average F1 score of 0.750, 2nd rank in the Malayalam language with a weighted average F1 score of 0.740, 3rd rank in the Tamil language with a weighted average F1 score of 0.390, and 6th rank in the English language with a weighted average F1 score of 0.880. 2022.ltedi-1.22 vijayakumar-etal-2022-ssn + 10.18653/v1/2022.ltedi-1.22 <fixed-case>SUH</fixed-case>_<fixed-case>ASR</fixed-case>@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: Transformer based Approach for Speech Recognition for Vulnerable Individuals in <fixed-case>T</fixed-case>amil @@ -268,6 +290,7 @@ An Automatic Speech Recognition system is developed for addressing the Tamil conversational speech data of elderly and transgender people. The speech corpus used in this system is collected from people who communicate in Tamil at primary places such as banks, hospitals, and vegetable markets. Our ASR system is designed with a pre-trained model that is used to recognize the speech data. A WER (Word Error Rate) calculation is used to analyse the performance of the ASR system. This evaluation could help to make a comparison of utterances between elderly people and others. Similarly, the comparison between transgender and other people is also done. Our proposed ASR system achieves a word error rate of 39.65%. 2022.ltedi-1.23 s-b-2022-suh + 10.18653/v1/2022.ltedi-1.23 <fixed-case>LPS</fixed-case>@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: An Ensemble Approach about Hope Speech Detection @@ -276,6 +299,7 @@ This paper describes our work on the shared task on Hope Speech Detection for Equality, Diversity, and Inclusion at LT-EDI-ACL-2022. The goal of this task is to identify whether a given comment contains hope speech or not, and hope is considered significant for the well-being, recuperation and restoration of human life. Our work aims to change the prevalent way of thinking by moving away from a preoccupation with discrimination, loneliness or the worst things in life to building confidence, support and good qualities based on comments by individuals. In response to the need to detect equality, diversity and inclusion of hope speech in a multilingual environment, we built an integration model and achieved good performance on multiple datasets presented by the organisers; the specific results can be found in the experimental results section. 2022.ltedi-1.24 zhu-2022-lps + 10.18653/v1/2022.ltedi-1.24 <fixed-case>CURAJ</fixed-case>_<fixed-case>IIITDWD</fixed-case>@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case> 2022: Hope Speech Detection in <fixed-case>E</fixed-case>nglish <fixed-case>Y</fixed-case>ou<fixed-case>T</fixed-case>ube Comments using Deep Learning Techniques @@ -286,6 +310,7 @@ Hope Speech consists of positive terms that help to promote or criticise a point of view without hurting the user’s or community’s feelings. Non-Hope Speech, on the other hand, includes expressions that are harsh, ridiculing, or demotivating.
The goal of this article is to find hope speech comments in a YouTube dataset. The datasets were created as part of the “LT-EDI-ACL 2022: Hope Speech Detection for Equality, Diversity, and Inclusion” shared task. The shared task dataset was proposed in the Malayalam, Tamil, English, Spanish, and Kannada languages. In this paper, we worked on English-language YouTube comments. We employed several deep learning based models such as DNN (dense or fully connected neural network), CNN (Convolutional Neural Network), Bi-LSTM (Bidirectional Long Short Term Memory Network), and GRU (Gated Recurrent Unit) to identify the hopeful comments. We also used Stacked LSTM-CNN and Stacked LSTM-LSTM networks to train the model. The best macro average F1-score of 0.67 on the development dataset was obtained using the DNN model. A macro average F1-score of 0.67 was achieved for the classification done on the test data as well. 2022.ltedi-1.25 jha-etal-2022-curaj + 10.18653/v1/2022.ltedi-1.25 <fixed-case>SSN</fixed-case>_<fixed-case>MLRG</fixed-case>3@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022-Depression Detection System from Social Media Text using Transformer Models @@ -299,6 +324,7 @@ Depression is a common mental illness that involves sadness and lack of interest in all day-to-day activities. The task is to classify the social media text as signs of depression into three labels namely “not depressed”, “moderately depressed”, and “severely depressed”. We have built a system using the Deep Learning Model “Transformers”. Transformers provides thousands of pretrained models to perform tasks on different modalities such as text, vision, and audio. The multi-class classification model used in our system is based on the ALBERT model. In the ACL 2022 shared task, our team SSN_MLRG3 obtained a Macro F1 score of 0.473. 2022.ltedi-1.26 esackimuthu-etal-2022-ssn + 10.18653/v1/2022.ltedi-1.26 <fixed-case>BERT</fixed-case> 4<fixed-case>EVER</fixed-case>@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022-Detecting signs of Depression from Social Media: Detecting Depression in Social Media using Prompt-Learning and Word-Emotion Cluster @@ -311,6 +337,7 @@ In this paper, we report the solution of the team BERT 4EVER for the LT-EDI-2022 shared task 2: Homophobia/Transphobia Detection in social media comments in ACL 2022, which aims to classify YouTube comments into one of the following categories: no, moderate, or severe depression. We model the problem as a text classification task and a text generation task and respectively propose two different models for the tasks. To combine the knowledge learned from these two different models, we softly fuse the predicted probabilities of the models above and then select the label with the highest probability as the final output. In addition, multiple augmentation strategies are leveraged to improve the model generalization capability, such as back translation and adversarial training. Experimental results demonstrate the effectiveness of the proposed models and the two augmentation strategies. 2022.ltedi-1.27 lin-etal-2022-bert + 10.18653/v1/2022.ltedi-1.27 <fixed-case>CIC</fixed-case>@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: Are transformers the only hope? Hope speech detection for <fixed-case>S</fixed-case>panish and <fixed-case>E</fixed-case>nglish comments @@ -322,6 +349,7 @@ Hope is an inherent part of human life and essential for improving the quality of life.
Hope increases happiness and reduces stress and feelings of helplessness. Hope speech expresses the desire for a better outcome and can be studied using text from various online sources where people express their desires and outcomes. In this paper, we address a deep-learning approach with a combination of linguistic and psycho-linguistic features for hope-speech detection. We report our best results submitted to LT-EDI-2022, which ranked 2nd and 3rd in English and Spanish respectively. 2022.ltedi-1.28 balouchzahi-etal-2022-cic + 10.18653/v1/2022.ltedi-1.28 scube<fixed-case>MSEC</fixed-case>@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: Detection of Depression using Transformer Models @@ -334,6 +362,7 @@ Social media platforms play a major role in our day-to-day life and are considered a virtual friend by many users, who use social media to share their feelings all day. Many a time, the content which is shared by users on social media replicates their internal life. Nowadays people love to share their daily life incidents, like happy or unhappy moments, and their feelings on social media; it makes them feel complete, and it has become a habit for many users. Social media provides a new chance to identify the feelings of a person through their posts. The aim of the shared task is to develop a model in which the system is capable of analyzing the grammatical markers related to onset and permanent symptoms of depression. We as a team participated in the shared task Detecting Signs of Depression from Social Media Text at LT-EDI 2022 - ACL 2022, and we have proposed a model which predicts depression from English social media posts using the data set shared for the task. The prediction is done based on the labels Moderate, Severe and Not Depressed. We have implemented this using different transformer models like DistilBERT, RoBERTa and ALBERT, with which we were able to achieve macro F1 scores of 0.337, 0.457 and 0.387 respectively. Our code is publicly available on GitHub. 2022.ltedi-1.29 s-etal-2022-scubemsec + 10.18653/v1/2022.ltedi-1.29 <fixed-case>SSNCSE</fixed-case>_<fixed-case>NLP</fixed-case>@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: Hope Speech Detection for Equality, Diversity and Inclusion using sentence transformers @@ -346,6 +375,7 @@ In recent times, applications have been developed to regulate and control the spread of negativity and toxicity on online platforms. The world is filled with serious problems like political & religious conflicts, wars and pandemics, and offensive hate speech is the last thing we desire. Our task was to classify a text into ‘Hope Speech’ and ‘Non-Hope Speech’. We searched for datasets acquired from YouTube comments that offer support, reassurance, inspiration, and insight, and the ones that don’t. The datasets were provided to us by the LT-EDI organizers in English, Tamil, Spanish, Kannada, and Malayalam. To successfully identify and classify them, we employed several machine learning transformer models such as m-BERT, MLNet, BERT, XLMRoberta, and XLM_MLM. The observed results indicate that BERT and m-BERT have obtained the best results among all the other techniques, gaining weighted F1-scores of 0.92, 0.71, 0.76, 0.87, and 0.83 for English, Tamil, Spanish, Kannada, and Malayalam respectively. This paper depicts our work for the Shared Task on Hope Speech Detection for Equality, Diversity, and Inclusion at LT-EDI 2022.
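Most of the system papers above report macro- and weighted-averaged F1-scores on the shared-task labels. As a point of reference for readers re-deriving such numbers, here is a minimal sketch with scikit-learn; the gold and predicted labels are invented placeholders, not data from any of these shared tasks.

from sklearn.metrics import f1_score

# Invented placeholder labels, not shared-task data.
y_true = ["Hope", "Not-Hope", "Hope", "Not-Hope", "Not-Hope", "Hope"]
y_pred = ["Hope", "Not-Hope", "Not-Hope", "Not-Hope", "Hope", "Hope"]

macro = f1_score(y_true, y_pred, average="macro")        # unweighted mean over classes
weighted = f1_score(y_true, y_pred, average="weighted")  # weighted by class support
print(f"macro-F1 = {macro:.3f}, weighted-F1 = {weighted:.3f}")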
2022.ltedi-1.30 b-etal-2022-ssncse + 10.18653/v1/2022.ltedi-1.30 <fixed-case>SOA</fixed-case>_<fixed-case>NLP</fixed-case>@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: An Ensemble Model for Hope Speech Detection from <fixed-case>Y</fixed-case>ou<fixed-case>T</fixed-case>ube Comments @@ -356,6 +386,7 @@ Language should be accommodating of equality and diversity as a fundamental aspect of communication. The language of internet users has a big impact on peer users all over the world. On virtual platforms such as Facebook, Twitter, and YouTube, people express their opinions in different languages. People respect others’ accomplishments, pray for their well-being, and cheer them on when they fail. Such motivational remarks are hope speech remarks. Simultaneously, a group of users encourages discrimination against women, people of color, people with disabilities, and other minorities based on gender, race, sexual orientation, and other factors. To recognize hope speech from YouTube comments, the current study offers an ensemble approach that combines support vector machine, logistic regression, and random forest classifiers. Extensive testing was carried out to discover the best features for the aforementioned classifiers. In the support vector machine and logistic regression classifiers, char-level TF-IDF features were used, whereas in the random forest classifier, word-level features were used. The proposed ensemble model performed significantly well on English, Spanish, Tamil, Malayalam, and Kannada YouTube comments. 2022.ltedi-1.31 kumar-etal-2022-soa + 10.18653/v1/2022.ltedi-1.31 <fixed-case>IIT</fixed-case> Dhanbad @<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022- Hope Speech Detection for Equality, Diversity, and Inclusion @@ -366,6 +397,7 @@ Hope is considered significant for the wellbeing, recuperation and restoration of human life by health professionals. Hope speech reflects the belief that one can discover pathways to their desired objectives and become roused to utilise those pathways. Hope speech offers support, reassurance, suggestions, inspiration and insight. Hate speech is a prevalent practice that society has to struggle with every day. The freedom of speech and ease of anonymity granted by social media has also resulted in incitement to hatred. In this paper, we work to identify and promote positive and supportive content on these platforms. We work with several machine learning models to classify social media comments as hope speech or non-hope speech in English. This paper portrays our work for the Shared Task on Hope Speech Detection for Equality, Diversity, and Inclusion at LT-EDI-ACL 2022. 2022.ltedi-1.32 gupta-etal-2022-iit + 10.18653/v1/2022.ltedi-1.32 <fixed-case>IISERB</fixed-case>@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: A Bag of Words and Document Embeddings Based Framework to Identify Severity of Depression Over Social Media @@ -374,6 +406,7 @@ The DepSign-LT-EDI-ACL2022 shared task focuses on early prediction of the severity of depression over social media posts. The BioNLP group at the Department of Data Science and Engineering at the Indian Institute of Science Education and Research Bhopal (IISERB) has participated in this challenge and submitted three runs based on three different text mining models.
The severity of depression was categorized into three classes, viz., no depression, moderate, and severe, and the data to build models were released as part of this shared task. The objective of this work is to identify relevant features from the given social media texts for effective text classification. As part of our investigation, we explored features derived from text data using a document embeddings technique and a simple bag-of-words model following different weighting schemes. Subsequently, adaptive boosting, logistic regression, random forest and support vector machine (SVM) classifiers were used to identify the scale of depression from the given texts. The experimental analysis on the given validation data shows that the SVM classifier using the bag-of-words model following the term frequency and inverse document frequency weighting scheme outperforms the other models for identifying depression. However, this framework could not achieve a place among the top ten runs of the shared task. This paper describes the potential of the proposed framework as well as the possible reasons behind its mediocre performance on the given data. 2022.ltedi-1.33 basu-2022-iiserb + 10.18653/v1/2022.ltedi-1.33 <fixed-case>SSNCSE</fixed-case>_<fixed-case>NLP</fixed-case>@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: Homophobia/Transphobia Detection in Multiple Languages using <fixed-case>SVM</fixed-case> Classifiers and <fixed-case>BERT</fixed-case>-based Transformers @@ -385,6 +418,7 @@ Over the years, there has been a slow but steady change in the attitude of society towards different kinds of sexuality. However, on social media platforms, where people have the license to be anonymous, toxic comments targeted at homosexuals, transgenders and the LGBTQ+ community are not uncommon. Detection of homophobic comments on social media can be useful in making the internet a safer place for everyone. For this task, we used a combination of word embeddings and SVM classifiers as well as some BERT-based transformers. We achieved a weighted F1-score of 0.93 on the English dataset, 0.75 on the Tamil dataset and 0.87 on the Tamil-English code-mixed dataset. 2022.ltedi-1.34 swaminathan-etal-2022-ssncse + 10.18653/v1/2022.ltedi-1.34 <fixed-case>KUCST</fixed-case>@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: Detecting Signs of Depression from Social Media Text @@ -394,6 +428,7 @@ In this paper we present our approach for detecting signs of depression from social media text. Our model relies on word unigrams, part-of-speech tags, readability measures and the use of first, second or third person and the number of words. Our best model obtained a macro F1-score of 0.439 and ranked 25th out of 31 teams. We further take advantage of the interpretability of the Logistic Regression model and we make an attempt to interpret the model coefficients with the hope that these will be useful for further research on the topic. 2022.ltedi-1.35 agirrezabal-amann-2022-kucst + 10.18653/v1/2022.ltedi-1.35 E8-<fixed-case>IJS</fixed-case>@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022 - <fixed-case>BERT</fixed-case>, <fixed-case>A</fixed-case>uto<fixed-case>ML</fixed-case> and Knowledge-graph backed Detection of Depression @@ -405,6 +440,7 @@ Depression is a mental illness that negatively affects a person’s well-being and can, if left untreated, lead to serious consequences such as suicide.
Therefore, it is important to recognize the signs of depression early. In the last decade, social media has become one of the most common places to express one’s feelings. Hence, it is possible to apply text processing and machine learning techniques to detect possible signs of depression. In this paper, we present our approaches to solving the shared task titled Detecting Signs of Depression from Social Media Text. We explore three different approaches to solve the challenge: fine-tuning a BERT model; leveraging AutoML for feature construction and classifier selection; and finally, exploring latent spaces derived from the combination of textual and knowledge-based representations. We ranked 9th out of 31 teams in the competition. Our best solution, based on knowledge graph and textual representations, was 4.9% behind the best model in terms of Macro F1, and only 1.9% behind in terms of Recall. 2022.ltedi-1.36 tavchioski-etal-2022-e8 + 10.18653/v1/2022.ltedi-1.36 Nozza@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: Ensemble Modeling for Homophobia and Transphobia Detection @@ -413,6 +449,7 @@ In this paper, we describe our approach for the task of homophobia and transphobia detection in English social media comments. The dataset consists of YouTube comments, and it has been released for the shared task on Homophobia/Transphobia Detection in social media comments. Given the high class imbalance, we propose a solution based on data augmentation and ensemble modeling. We fine-tuned different large language models (BERT, RoBERTa, and HateBERT) and used the weighted majority vote on their predictions. Our proposed model obtained 0.48 and 0.94 for macro and weighted F1-score, respectively, ranking in third position. 2022.ltedi-1.37 nozza-2022-nozza + 10.18653/v1/2022.ltedi-1.37 <fixed-case>KADO</fixed-case>@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: <fixed-case>BERT</fixed-case>-based Ensembles for Detecting Signs of Depression from Social Media Text @@ -423,6 +460,7 @@ Depression is a common and serious mental illness; early detection can improve the patient’s symptoms and make depression easier to treat. This paper mainly introduces the relevant content of the task “Detecting Signs of Depression from Social Media Text at DepSign-LT-EDI@ACL-2022”. The goal of DepSign is to classify the signs of depression into three labels namely “not depressed”, “moderately depressed”, and “severely depressed” based on social media posts. In this paper, we propose a predictive ensemble model that utilizes fine-tuned contextualized word embeddings from ALBERT, DistilBERT, RoBERTa, and the BERT base model. We show that our model outperforms the baseline models in all considered metrics and achieves an F1 score of 54% and an accuracy of 61%, ranking 5th on the leaderboard for the DepSign task.
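BERT 4EVER (above) softly fuses the predicted probabilities of two models and takes the highest-probability label, and KADO similarly ensembles several fine-tuned transformers. A model-agnostic sketch of that fusion step follows; the probability arrays and label set are hypothetical stand-ins for real model outputs, not values from either system.

import numpy as np

# Hypothetical softmax outputs of two fine-tuned models (2 examples x 3 classes).
probs_a = np.array([[0.7, 0.2, 0.1], [0.3, 0.4, 0.3]])
probs_b = np.array([[0.6, 0.3, 0.1], [0.1, 0.2, 0.7]])

labels = ["not depressed", "moderately depressed", "severely depressed"]
fused = (probs_a + probs_b) / 2                        # soft fusion: average the probabilities
predicted = [labels[i] for i in fused.argmax(axis=1)]  # pick the highest-probability label
print(predicted)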
2022.ltedi-1.38 janatdoust-etal-2022-kado + 10.18653/v1/2022.ltedi-1.38 Sammaan@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: Ensembled Transformers Against Homophobia and Transphobia @@ -434,6 +472,7 @@ 2022.ltedi-1.39 upadhyay-etal-2022-sammaan GLUE + 10.18653/v1/2022.ltedi-1.39 <fixed-case>OPI</fixed-case>@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: Detecting Signs of Depression from Social Media Text using <fixed-case>R</fixed-case>o<fixed-case>BERT</fixed-case>a Pre-trained Language Models @@ -443,6 +482,7 @@ This paper presents our winning solution for the Shared Task on Detecting Signs of Depression from Social Media Text at LT-EDI-ACL2022. The task was to create a system that, given social media posts in English, should detect the level of depression as ‘not depressed’, ‘moderately depressed’ or ‘severely depressed’. We based our solution on transformer-based language models. We fine-tuned selected models: BERT, RoBERTa, XLNet, of which the best results were obtained for RoBERTa. Then, using the prepared corpus, we trained our own language model called DepRoBERTa (RoBERTa for Depression Detection). Fine-tuning of this model improved the results. The third solution was to use ensemble averaging, which turned out to be the best solution. It achieved a macro-averaged F1-score of 0.583. The source code of the prepared solution is available at https://github.com/rafalposwiata/depression-detection-lt-edi-2022. 2022.ltedi-1.40 poswiata-perelkiewicz-2022-opi + 10.18653/v1/2022.ltedi-1.40 <fixed-case>F</fixed-case>ilip<fixed-case>N</fixed-case>@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022-Detecting signs of Depression from Social Media: Examining the use of summarization methods as data augmentation for text classification @@ -454,6 +494,7 @@ nilsson-kovacs-2022-filipn flippe3/dsdsm_augmentation C4 + 10.18653/v1/2022.ltedi-1.41 <fixed-case>NAYEL</fixed-case>@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: Homophobia/Transphobia Detection for Equality, Diversity, and Inclusion using <fixed-case>SVM</fixed-case> @@ -465,6 +506,7 @@ Analysing the contents of social media platforms such as YouTube, Facebook and Twitter has gained interest due to the vast number of users. One of the important tasks is homophobia/transphobia detection. This paper illustrates the system submitted by our team for the homophobia/transphobia detection in social media comments shared task. A machine learning-based model has been designed and various classification algorithms have been implemented for automatic detection of homophobia in YouTube comments. TF/IDF with a bigram model has been used for vectorization of comments. A Support Vector Machine has been used to develop the proposed model, and our submission reported weighted F1-scores of 0.91, 0.92 and 0.88 for the English, Tamil and Tamil-English datasets, respectively.
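The NAYEL system above pairs TF/IDF n-gram vectorization with a Support Vector Machine. Here is a minimal scikit-learn sketch of that kind of pipeline, assuming unigram-plus-bigram features; the training comments are invented toy examples, not shared-task data.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Invented toy comments standing in for the YouTube data.
texts = ["wishing you strength and support", "a hateful slur-filled comment"]
labels = ["Non-anti-LGBT+ content", "Homophobic"]

# TF-IDF over unigrams and bigrams feeding a linear SVM.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
model.fit(texts, labels)
print(model.predict(["sending you support"]))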
2022.ltedi-1.42 ashraf-etal-2022-nayel + 10.18653/v1/2022.ltedi-1.42 gini<fixed-case>U</fixed-case>s @<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: Aasha: Transformers based Hope-<fixed-case>EDI</fixed-case> @@ -475,6 +517,7 @@ 2022.ltedi-1.43 surana-chinagundi-2022-ginius HopeEDI + 10.18653/v1/2022.ltedi-1.43 <fixed-case>SSN</fixed-case>_<fixed-case>MLRG</fixed-case>1@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: Multi-Class Classification using <fixed-case>BERT</fixed-case> models for Detecting Depression Signs from Social Media Text @@ -487,6 +530,7 @@ DepSign-LT-EDI@ACL-2022 aims to ascertain the signs of depression of a person from their messages and posts on social media wherein people share their feelings and emotions. Given social media postings in English, the system should classify the signs of depression into three labels namely “not depressed”, “moderately depressed”, and “severely depressed”. To achieve this objective, we have adopted a fine-tuned BERT model. This solution from team SSN_MLRG1 achieves 58.5% accuracy on the DepSign-LT-EDI@ACL-2022 test set. 2022.ltedi-1.44 anantharaman-etal-2022-ssn + 10.18653/v1/2022.ltedi-1.44 <fixed-case>D</fixed-case>epression<fixed-case>O</fixed-case>ne@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: Using Machine Learning with <fixed-case>SMOTE</fixed-case> and Random <fixed-case>U</fixed-case>nder<fixed-case>S</fixed-case>ampling to Detect Signs of Depression on Social Media Text. @@ -496,6 +540,7 @@ Depression is a common and serious medical illness that negatively affects how you feel, the way you think, and how you act. Detecting depression is essential as it must be treated early to avoid painful consequences. Nowadays, people are broadcasting how they feel via posts and comments. Using social media, we can extract many comments related to depression and use NLP techniques to train and detect depression. This work presents the submission of the DepressionOne team at LT-EDI-2022 for the shared task, detecting signs of depression from social media text. The depression data is small and unbalanced. Thus, we have used oversampling and undersampling methods such as SMOTE and RandomUnderSampler to represent the data. Later, we used machine learning methods to train and detect the signs of depression. 2022.ltedi-1.45 dowlagar-mamidi-2022-depressionone + 10.18653/v1/2022.ltedi-1.45 <fixed-case>L</fixed-case>eaning<fixed-case>T</fixed-case>ower@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: When Hope and Hate Collide @@ -507,6 +552,7 @@ The 2022 edition of LT-EDI proposed two tasks in various languages. Task Hope Speech Detection required models for the automatic identification of hopeful comments for equality, diversity, and inclusion. Task Homophobia/Transphobia Detection focused on the identification of homophobic and transphobic comments. We targeted both tasks in English by using reinforced BERT-based approaches. Our core strategy aimed at exploiting the data available for each given task to augment the amount of supervised instances in the other. On the basis of an active learning process, we trained a model on the dataset for Task i and applied it to the dataset for Task j to iteratively integrate new silver data for Task i.
Our official submissions to the shared task obtained a macro-averaged F_1 score of 0.53 for Hope Speech and 0.46 for Homo/Transphobia, placing our team in the third and fourth positions out of 11 and 12 participating teams, respectively. 2022.ltedi-1.46 muti-etal-2022-leaningtower + 10.18653/v1/2022.ltedi-1.46 <fixed-case>MUCS</fixed-case>@Text-<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>@<fixed-case>ACL</fixed-case> 2022: Detecting Sign of Depression from Social Media Text using Supervised Learning Approach @@ -518,6 +564,7 @@ Social media has seen enormous growth in its users recently, and knowingly or unknowingly the behavior of a person will be reflected in the comments she/he posts on social media. Users having signs of depression may post negative or disturbing content seeking the attention of other users. Hence, social media data can be analysed to check whether users show signs of depression and to help them get through the situation if required. However, as analyzing the increasing amount of social media data manually is laborious and error-prone, automated tools have to be developed for the same. To address the issue of detecting signs of depression in social media content, in this paper, we, team MUCS, describe an Ensemble of Machine Learning (ML) models and a Transfer Learning (TL) model submitted to the “Detecting Signs of Depression from Social Media Text-LT-EDI@ACL 2022” (DepSign-LT-EDI@ACL-2022) shared task at Association for Computational Linguistics (ACL) 2022. Both frequency and text based features are used to train the Ensemble model, and Bidirectional Encoder Representations from Transformers (BERT) fine-tuned with raw text is used to train the TL model. Among the two models, the TL model performed better, with a macro averaged F-score of 0.479, and placed 18th in the shared task. The code to reproduce the proposed models is available on GitHub. 2022.ltedi-1.47 hegde-etal-2022-mucs-text + 10.18653/v1/2022.ltedi-1.47 <fixed-case>SSNCSE</fixed-case>_<fixed-case>NLP</fixed-case>@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: Speech Recognition for Vulnerable Individuals in <fixed-case>T</fixed-case>amil using pre-trained <fixed-case>XLSR</fixed-case> models @@ -529,6 +576,7 @@ Automatic speech recognition is a tool used to transform human speech into a written form. It is used in a variety of avenues, such as voice commands, customer service and more. It has emerged as an essential tool in the digitisation of daily life. It has been known to be of vital importance in making the lives of elderly and disabled people much easier. In this paper we describe an automatic speech recognition model, determined by using three pre-trained models, fine-tuned from the Facebook XLSR Wav2Vec2 model, which was trained using the Common Voice Dataset. The best model for speech recognition in Tamil is determined by finding the word error rate of the data. This work explains the submission made by SSNCSE_NLP to the shared task organized by LT-EDI at ACL 2022. A word error rate of 39.4512 is achieved.
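The ASR submissions above are all evaluated with WER (Word Error Rate): the word-level edit distance between hypothesis and reference divided by the reference length. A self-contained sketch of that computation follows; the example sentences are made up.

def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + insertions + deletions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

# Made-up example: one substitution in a four-word reference -> WER 0.25.
print(wer("the weather is nice", "the weather was nice"))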
2022.ltedi-1.48 srinivasan-etal-2022-ssncse + 10.18653/v1/2022.ltedi-1.48 <fixed-case>IDIAP</fixed-case>_<fixed-case>TIET</fixed-case>@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022 : Hope Speech Detection in Social Media using Contextualized <fixed-case>BERT</fixed-case> with Attention Mechanism @@ -540,6 +588,7 @@ 2022.ltedi-1.49 khanna-etal-2022-idiap deepanshu-beep/hope-speech-attention + 10.18653/v1/2022.ltedi-1.49 <fixed-case>SSN</fixed-case>@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: Transfer Learning using <fixed-case>BERT</fixed-case> for Detecting Signs of Depression from Social Media Texts @@ -549,6 +598,7 @@ Depression is one of the most common mental issues faced by people. Detecting signs of depression early on can help in the treatment and prevention of extreme outcomes like suicide. Since the advent of the internet, people have felt more comfortable discussing topics like depression online due to the anonymity it provides. This shared task has used data scraped from various social media sites and aims to develop models that detect signs and the severity of depression effectively. In this paper, we employ transfer learning by applying an enhanced BERT model trained on a Wikipedia dataset to the social media text and perform text classification. The model gives an F1-score of 63.8%, which was reasonably better than the other competing models. 2022.ltedi-1.50 s-antony-2022-ssn + 10.18653/v1/2022.ltedi-1.50 Findings of the Shared Task on Detecting Signs of Depression from Social Media @@ -560,6 +610,7 @@ Social media is considered as a platform where users express themselves. The rise of social media as one of humanity’s most important public communication platforms presents a potential prospect for early identification and management of mental illness. Depression is one such illness that can lead to a variety of emotional and physical problems. It is necessary to measure the level of depression from the social media text to treat them and to avoid the negative consequences. Detecting levels of depression is a challenging task since it involves the mindset of the people, which can change periodically. The aim of the DepSign-LT-EDI@ACL-2022 shared task is to classify the social media text into three levels of depression namely “Not Depressed”, “Moderately Depressed”, and “Severely Depressed”. This overview presents a description of the task, the data set, the methodologies used and an analysis of the results of the submissions. The models that were submitted as a part of the shared task had used a variety of technologies from traditional machine learning algorithms to deep learning models. It could be observed from the results that the transformer based models have outperformed the other models. Among the 31 teams who had submitted their results for the shared task, the best macro F1-score of 0.583 was obtained using a transformer based model. 2022.ltedi-1.51 s-etal-2022-findings + 10.18653/v1/2022.ltedi-1.51 Findings of the Shared Task on Speech Recognition for Vulnerable Individuals in <fixed-case>T</fixed-case>amil @@ -573,6 +624,7 @@ This paper illustrates the overview of the shared task on automatic speech recognition in the Tamil language. In the shared task, spontaneous Tamil speech data gathered from elderly and transgender people was given for recognition and evaluation. These utterances were collected from people when they communicated in public locations such as hospitals, markets, vegetable shops, etc.
The speech corpus includes utterances of male, female, and transgender speakers and was split into training and testing data. The given task was evaluated using WER (Word Error Rate). The participants used transformer-based models for automatic speech recognition. Different results using different pre-trained transformer models are discussed in this overview paper. 2022.ltedi-1.52 b-etal-2022-findings-shared + 10.18653/v1/2022.ltedi-1.52 <fixed-case>DLRG</fixed-case>@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: Detecting signs of Depression from Social Media using <fixed-case>XGB</fixed-case>oost Method @@ -582,6 +634,7 @@ Depression is linked to the development of dementia. Cognitive functions such as thinking and remembering generally deteriorate in dementia patients. Social media usage has increased among people in recent days. The technology advancements help the community to express their views publicly. Analysing the signs of depression from texts has become an important area of research now, as it helps to identify this kind of mental disorder among people from their social media posts. As part of the shared task on detecting signs of depression from social media text, a dataset has been provided by the organizers (Sampath et al.). We applied different machine learning techniques such as Support Vector Machine, Random Forest and XGBoost classifiers to classify the signs of depression. Experimental results revealed that the XGBoost model outperformed other models with the highest classification accuracy of 0.61 and a macro F1 score of 0.54. 2022.ltedi-1.53 sharen-rajalakshmi-2022-dlrg + 10.18653/v1/2022.ltedi-1.53 <fixed-case>IDIAP</fixed-case> Submission@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022 : Hope Speech Detection for Equality, Diversity and Inclusion @@ -593,6 +646,7 @@ singh-motlicek-2022-idiap muskaan-singh/hate-speech-detection HopeEDI + 10.18653/v1/2022.ltedi-1.54 <fixed-case>IDIAP</fixed-case> Submission@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: Homophobia/Transphobia Detection in social media comments @@ -603,6 +657,7 @@ 2022.ltedi-1.55 singh-motlicek-2022-idiap-submission muskaan-singh/homophobia-and-transphobia-acl-submission + 10.18653/v1/2022.ltedi-1.55 <fixed-case>IDIAP</fixed-case> Submission@<fixed-case>LT</fixed-case>-<fixed-case>EDI</fixed-case>-<fixed-case>ACL</fixed-case>2022: Detecting Signs of Depression from Social Media Text @@ -612,6 +667,7 @@ Depression is a common illness involving sadness and lack of interest in all day-to-day activities. It is important to detect depression at an early stage, as treating it early avoids serious consequences. In this paper, we present our system submission of ARGUABLY for DepSign-LT-EDI@ACL-2022. We aim to detect the signs of depression of a person from their social media postings wherein people share their feelings and emotions. The proposed system is an ensembled voting model with fine-tuned BERT, RoBERTa, and XLNet. Given social media postings in English, the submitted system classifies the signs of depression into three labels, namely “not depressed,” “moderately depressed,” and “severely depressed.” Our best model is ranked in 3^{rd} position with an accuracy of 0.54. We make our codebase accessible here.
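DepressionOne (above) counters the small, unbalanced depression data with SMOTE oversampling and random undersampling. Below is a minimal sketch with the imbalanced-learn package on synthetic features; the generated data is a stand-in for the shared-task corpus, not the real posts.

from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler

# Synthetic, imbalanced stand-in for vectorized social media posts (3 classes).
X, y = make_classification(n_samples=300, n_classes=3, n_informative=5,
                           weights=[0.7, 0.2, 0.1], random_state=0)
print("original:", Counter(y))

X_os, y_os = SMOTE(random_state=0).fit_resample(X, y)               # synthesize minority samples
print("after SMOTE:", Counter(y_os))

X_us, y_us = RandomUnderSampler(random_state=0).fit_resample(X, y)  # drop majority samples
print("after undersampling:", Counter(y_us))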
2022.ltedi-1.56 singh-motlicek-2022-idiap-submission-lt + 10.18653/v1/2022.ltedi-1.56 Overview of The Shared Task on Homophobia and Transphobia Detection in Social Media Comments @@ -626,6 +682,7 @@ Homophobia and Transphobia Detection is the task of identifying homophobia, transphobia, and non-anti-LGBT+ content from the given corpus. Homophobia and transphobia are both toxic language directed at LGBTQ+ individuals that is described as hate speech. This paper summarizes our findings on the “Homophobia and Transphobia Detection in social media comments” shared task held at LT-EDI 2022 - ACL 2022. This shared task focused on three sub-tasks for Tamil, English, and Tamil-English (code-mixed) languages. It received 10 systems for Tamil, 13 systems for English, and 11 systems for Tamil-English. The best systems for Tamil, English, and Tamil-English scored 0.570, 0.870, and 0.610, respectively, on average macro F1-score. 2022.ltedi-1.57 chakravarthi-etal-2022-overview + 10.18653/v1/2022.ltedi-1.57 Overview of the Shared Task on Hope Speech Detection for Equality, Diversity, and Inclusion @@ -645,6 +702,7 @@ Hope Speech detection is the task of classifying a sentence as hope speech or non-hope speech given a corpus of sentences. Hope speech is any message or content that is positive, encouraging, reassuring, inclusive and supportive that inspires and engenders optimism in the minds of people. In contrast to identifying and censoring negative speech patterns, hope speech detection is focussed on recognising and promoting positive speech patterns online. In this paper, we report an overview of the findings and results from the shared task on hope speech detection for Tamil, Malayalam, Kannada, English and Spanish languages conducted in the second workshop on Language Technology for Equality, Diversity and Inclusion (LT-EDI-2022) organised as a part of ACL 2022. The participants were provided with annotated training & development datasets and unlabelled test datasets in all the five languages. The goal of the shared task is to classify the given sentences into one of the two hope speech classes. The performances of the systems submitted by the participants were evaluated in terms of micro-F1 score and weighted-F1 score. The datasets for this challenge are openly available. 2022.ltedi-1.58 chakravarthi-etal-2022-overview-shared + 10.18653/v1/2022.ltedi-1.58 diff --git a/data/xml/2022.mml.xml b/data/xml/2022.mml.xml index e8107572d4..aa74f4da47 100644 --- a/data/xml/2022.mml.xml +++ b/data/xml/2022.mml.xml @@ -38,6 +38,7 @@ jung-etal-2022-language COCO COCO-CN + 10.18653/v1/2022.mml-1.1 diff --git a/data/xml/2022.nlp4convai.xml b/data/xml/2022.nlp4convai.xml index aa965f8ace..4cc5a1de09 100644 --- a/data/xml/2022.nlp4convai.xml +++ b/data/xml/2022.nlp4convai.xml @@ -31,6 +31,7 @@ 2022.nlp4convai-1.1 lee-etal-2022-randomized DailyDialog + 10.18653/v1/2022.nlp4convai-1.1 Are Pre-trained Transformers Robust in Intent Classification? A Missing Ingredient in Evaluation of Out-of-Scope Intent Detection @@ -45,6 +46,7 @@ Pre-trained Transformer-based models were reported to be robust in intent classification. In this work, we first point out the importance of in-domain out-of-scope detection in few-shot intent recognition tasks and then illustrate the vulnerability of pre-trained Transformer-based models against samples that are in-domain but out-of-scope (ID-OOS).
We construct two new datasets, and empirically show that pre-trained models do not perform well on both ID-OOS examples and general out-of-scope examples, especially on fine-grained few-shot intent detection tasks. 2022.nlp4convai-1.2 zhang-etal-2022-pre-trained + 10.18653/v1/2022.nlp4convai-1.2 Conversational <fixed-case>AI</fixed-case> for Positive-sum Retailing under Falsehood Control @@ -56,6 +58,7 @@ Retailing combines complicated communication skills and strategies to reach an agreement between buyer and seller with identical or different goals. In each transaction a good seller finds an optimal solution by considering his/her own profits while simultaneously considering whether the buyer’s needs have been met. In this paper, we manage the retailing problem by mixing cooperation and competition. We present a rich dataset of buyer-seller bargaining in a simulated marketplace in which each agent values goods and utility separately. Various attributes (preference, quality, and profit) are initially hidden from one agent with respect to its role; during the conversation, both sides may reveal, fake, or retain the information uncovered to come to a final decision through natural language. Using this dataset, we leverage transfer learning techniques on a pretrained, end-to-end model and enhance its decision-making ability toward the best choice in terms of utility by means of multi-agent reinforcement learning. An automatic evaluation shows that our approach results in more optimal transactions than humans do. We also show that our framework controls the falsehoods generated by seller agents. 2022.nlp4convai-1.3 liao-etal-2022-conversational + 10.18653/v1/2022.nlp4convai-1.3 <fixed-case>D</fixed-case>-<fixed-case>REX</fixed-case>: Dialogue Relation Extraction with Explanations @@ -70,6 +73,7 @@ albalak-etal-2022-rex alon-albalak/D-REX DialogRE + 10.18653/v1/2022.nlp4convai-1.4 Data Augmentation for Intent Classification with Off-the-shelf Large Language Models @@ -85,6 +89,7 @@ sahu-etal-2022-data elementai/data-augmentation-with-llms CLINC150 + 10.18653/v1/2022.nlp4convai-1.5 Extracting and Inferring Personal Attributes from Dialogue @@ -101,6 +106,7 @@ ConceptNet PERSONA-CHAT Universal Dependencies + 10.18653/v1/2022.nlp4convai-1.6 From Rewriting to Remembering: Common Ground for Conversational <fixed-case>QA</fixed-case> Models @@ -114,6 +120,7 @@ 2022.nlp4convai-1.7 tredici-etal-2022-rewriting QReCC + 10.18653/v1/2022.nlp4convai-1.7 Human Evaluation of Conversations is an Open Problem: comparing the sensitivity of various methods for evaluating dialogue agents @@ -128,6 +135,7 @@ 2022.nlp4convai-1.8 smith-etal-2022-human PERSONA-CHAT + 10.18653/v1/2022.nlp4convai-1.8 <fixed-case>KG</fixed-case>-<fixed-case>CR</fixed-case>u<fixed-case>SE</fixed-case>: Recurrent Walks over Knowledge Graph for Explainable Conversation Reasoning using Semantic Embeddings @@ -140,6 +148,7 @@ sarkar-etal-2022-kg rajbsk/kg-cruse OpenDialKG + 10.18653/v1/2022.nlp4convai-1.9 Knowledge Distillation Meets Few-Shot Learning: An Approach for Few-Shot Intent Classification Within and Across Domains @@ -151,6 +160,7 @@ 2022.nlp4convai-1.10 sauer-etal-2022-knowledge ATIS + 10.18653/v1/2022.nlp4convai-1.10 <fixed-case>MTL</fixed-case>-<fixed-case>SLT</fixed-case>: Multi-Task Learning for Spoken Language Tasks @@ -169,6 +179,7 @@ LibriSpeech SLURP Spoken-SQuAD + 10.18653/v1/2022.nlp4convai-1.11 Multimodal Conversational <fixed-case>AI</fixed-case>: A Survey of Datasets and Approaches @@ -193,6 +204,7 @@ Visual Question
Answering Visual7W YouCook2 + 10.18653/v1/2022.nlp4convai-1.12 Open-domain Dialogue Generation: What We Can Do, Cannot Do, And Should Do Next @@ -207,6 +219,7 @@ kann-etal-2022-open PERSONA-CHAT Wizard of Wikipedia + 10.18653/v1/2022.nlp4convai-1.13 Relevance in Dialogue: Is Less More? An Empirical Comparison of Existing Metrics, and a Novel Simple Metric @@ -219,6 +232,7 @@ ikb-a/idk-dialogue-relevance FED Topical-Chat + 10.18653/v1/2022.nlp4convai-1.14 <fixed-case>R</fixed-case>etro<fixed-case>NLU</fixed-case>: Retrieval Augmented Task-Oriented Semantic Parsing @@ -232,6 +246,7 @@ 2022.nlp4convai-1.15 gupta-etal-2022-retronlu TOPv2 + 10.18653/v1/2022.nlp4convai-1.15 Stylistic Response Generation by Controlling Personality Traits and Intent @@ -246,6 +261,7 @@ PANDORA Topical-Chat Wizard of Wikipedia + 10.18653/v1/2022.nlp4convai-1.16 Toward Knowledge-Enriched Conversational Recommendation Systems @@ -262,6 +278,7 @@ zhang-etal-2022-toward ConceptNet ReDial + 10.18653/v1/2022.nlp4convai-1.17 Understanding and Improving the Exemplar-based Generation for Open-domain Conversation @@ -274,6 +291,7 @@ Exemplar-based generative models for open-domain conversation produce responses based on the exemplars provided by the retriever, taking advantage of generative models and retrieval models. However, due to the one-to-many problem of the open-domain conversation, they often ignore the retrieved exemplars while generating responses or produce responses over-fitted to the retrieved exemplars. To address these drawbacks, we introduce a training method selecting exemplars that are semantically relevant to the gold response but lexically distanced from the gold response. In the training phase, our training method first uses the gold response instead of dialogue context as a query to select exemplars that are semantically relevant to the gold response. Then, it eliminates the exemplars that lexically resemble the gold responses to alleviate the dependency of the generative models on those exemplars. The remaining exemplars could be irrelevant to the given context since they are searched depending on the gold response. Thus, our training method further utilizes the relevance scores between the given context and the exemplars to penalize the irrelevant exemplars. Extensive experiments demonstrate that our proposed training method alleviates the drawbacks of the existing exemplar-based generative models and significantly improves the performance in terms of appropriateness and informativeness. 2022.nlp4convai-1.18 han-etal-2022-understanding + 10.18653/v1/2022.nlp4convai-1.18 diff --git a/data/xml/2022.nlppower.xml b/data/xml/2022.nlppower.xml index d704c35a00..7aaf0b6864 100644 --- a/data/xml/2022.nlppower.xml +++ b/data/xml/2022.nlppower.xml @@ -30,6 +30,7 @@ GLUE SQuAD SuperGLUE + 10.18653/v1/2022.nlppower-1.1 Towards Stronger Adversarial Baselines Through Human-<fixed-case>AI</fixed-case> Collaboration @@ -40,6 +41,7 @@ 2022.nlppower-1.2 you-lowd-2022-towards SST + 10.18653/v1/2022.nlppower-1.2 Benchmarking for Public Health Surveillance tasks on Social Media with a Domain-Specific Pretrained Language Model @@ -54,6 +56,7 @@ naseem-etal-2022-benchmarking Dreaddit PUBHEALTH + 10.18653/v1/2022.nlppower-1.3 Why only Micro-F1?
Class Weighting of Measures for Relation Classification @@ -67,6 +70,7 @@ harbecke-etal-2022-micro dfki-nlp/weighting-schemes-report DocRED + 10.18653/v1/2022.nlppower-1.4 Automatically Discarding Straplines to Improve Data Quality for Abstractive News Summarization @@ -81,6 +85,7 @@ keleg-etal-2022-automatically CNN/Daily Mail NEWSROOM + 10.18653/v1/2022.nlppower-1.5 A global analysis of metrics used for measuring performance in natural language processing @@ -94,6 +99,7 @@ 2022.nlppower-1.6 blagec-etal-2022-global OpenBioLink/ITO + 10.18653/v1/2022.nlppower-1.6 Beyond Static models and test sets: Benchmarking the potential of pre-trained models across tasks and languages @@ -112,6 +118,7 @@ XCOPA XNLI XQuAD + 10.18653/v1/2022.nlppower-1.7 Checking <fixed-case>H</fixed-case>ate<fixed-case>C</fixed-case>heck: a cross-functional analysis of behaviour-aware learning for hate speech detection @@ -122,6 +129,7 @@ 2022.nlppower-1.8 henrique-luz-de-araujo-roth-2022-checking peluz/checking-hatecheck-code + 10.18653/v1/2022.nlppower-1.8 Language Invariant Properties in Natural Language Processing @@ -133,6 +141,7 @@ 2022.nlppower-1.9 bianchi-etal-2022-language milanlproc/language-invariant-properties + 10.18653/v1/2022.nlppower-1.9 <fixed-case>DACT</fixed-case>-<fixed-case>BERT</fixed-case>: Differentiable Adaptive Computation Time for an Efficient <fixed-case>BERT</fixed-case> Inference @@ -145,6 +154,7 @@ 2022.nlppower-1.10 eyzaguirre-etal-2022-dact GLUE + 10.18653/v1/2022.nlppower-1.10 Benchmarking Post-Hoc Interpretability Approaches for Transformer-based Misogyny Detection @@ -157,6 +167,7 @@ 2022.nlppower-1.11 attanasio-etal-2022-benchmarking milanlproc/benchmarking-xai-misogyny + 10.18653/v1/2022.nlppower-1.11 Characterizing the Efficiency vs. Accuracy Trade-off for Long-Context <fixed-case>NLP</fixed-case> Models @@ -172,6 +183,7 @@ LRA QASPER SCROLLS + 10.18653/v1/2022.nlppower-1.12 diff --git a/data/xml/2022.repl4nlp.xml b/data/xml/2022.repl4nlp.xml index 3091fb389c..96a47c208a 100644 --- a/data/xml/2022.repl4nlp.xml +++ b/data/xml/2022.repl4nlp.xml @@ -38,6 +38,7 @@ 2022.repl4nlp-1.1 valerio-miceli-barone-etal-2022-distributionally MTNT + 10.18653/v1/2022.repl4nlp-1.1 <fixed-case>Q</fixed-case>-Learning Scheduler for Multi Task Learning Through the use of Histogram of Task Uncertainty @@ -50,6 +51,7 @@ meshgi-etal-2022-q IMDb Movie Reviews Penn Treebank + 10.18653/v1/2022.repl4nlp-1.2 When does <fixed-case>CLIP</fixed-case> generalize better than unimodal models? When judging human-centric concepts @@ -62,6 +64,7 @@ 2022.repl4nlp-1.4 bielawski-etal-2022-clip Book Cover Dataset + 10.18653/v1/2022.repl4nlp-1.4 From Hyperbolic Geometry Back to Word Embeddings @@ -74,6 +77,7 @@ 2022.repl4nlp-1.5 assylbekov-etal-2022-hyperbolic soltustik/rhg + 10.18653/v1/2022.repl4nlp-1.5 A Comparative Study of Pre-trained Encoders for Low-Resource Named Entity Recognition @@ -90,6 +94,7 @@ CoNLL-2003 Few-NERD WNUT 2017 + 10.18653/v1/2022.repl4nlp-1.6 Clozer: Adaptable Data Augmentation for Cloze-style Reading Comprehension @@ -105,6 +110,7 @@ 2022.repl4nlp-1.7 lovenia-etal-2022-clozer ReCAM + 10.18653/v1/2022.repl4nlp-1.7 Analyzing Gender Representation in Multilingual Models @@ -115,6 +121,7 @@ Multilingual language models were shown to allow for nontrivial transfer across scripts and languages. In this work, we study the structure of the internal representations that enable this transfer.
We focus on the representations of gender distinctions as a practical case study, and examine the extent to which the gender concept is encoded in shared subspaces across different languages. Our analysis shows that gender representations consist of several prominent components that are shared across languages, alongside language-specific components. The existence of language-independent and language-specific components provides an explanation for an intriguing empirical observation we make: while gender classification transfers well across languages, interventions for gender removal trained on a single language do not transfer easily to others. 2022.repl4nlp-1.8 gonen-etal-2022-analyzing + 10.18653/v1/2022.repl4nlp-1.8 Detecting Textual Adversarial Examples Based on Distributional Characteristics of Data Representations @@ -130,6 +137,7 @@ MultiNLI WikiText-103 WikiText-2 + 10.18653/v1/2022.repl4nlp-1.9 A Vocabulary-Free Multilingual Neural Tokenizer for End-to-End Task Learning @@ -143,6 +151,7 @@ Subword tokenization is a commonly used input pre-processing step in most recent NLP models. However, it limits the models’ ability to leverage end-to-end task learning. Its frequency-based vocabulary creation compromises tokenization in low-resource languages, leading models to produce suboptimal representations. Additionally, the dependency on a fixed vocabulary limits the subword models’ adaptability across languages and domains. In this work, we propose a vocabulary-free neural tokenizer by distilling segmentation information from heuristic-based subword tokenization. We pre-train our character-based tokenizer by processing unique words from a multilingual corpus, thereby extensively increasing word diversity across languages. Unlike the predefined and fixed vocabularies in subword methods, our tokenizer allows end-to-end task learning, resulting in optimal task-specific tokenization. The experimental results show that replacing the subword tokenizer with our neural tokenizer consistently improves performance on multilingual (NLI) and code-switching (sentiment analysis) tasks, with larger gains in low-resource languages. Additionally, our neural tokenizer exhibits robust performance on downstream tasks when adversarial noise is present (typos and misspelling), further increasing the initial improvements over statistical subword tokenizers.
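The gender-representation analysis above looks for gender components shared across languages in embedding space. As a generic illustration of how such a component can be estimated (not the paper's exact procedure), the sketch below runs a PCA, via SVD, over difference vectors of gendered word pairs; the random vectors are stand-ins for real embeddings.

import numpy as np

rng = np.random.default_rng(0)
dim = 32
# Random stand-ins for embeddings of gendered word pairs ("he"/"she", "king"/"queen", ...).
pairs = [(rng.normal(size=dim), rng.normal(size=dim)) for _ in range(10)]

# Principal components of the pair differences approximate a gender subspace.
diffs = np.stack([a - b for a, b in pairs])
diffs -= diffs.mean(axis=0)
_, s, vt = np.linalg.svd(diffs, full_matrices=False)
gender_components = vt[:3]             # top-3 candidate gender directions
explained = (s**2 / (s**2).sum())[:3]  # share of variance per component
print("variance explained by top components:", explained.round(3))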
2022.repl4nlp-1.10 mofijul-islam-etal-2022-vocabulary + 10.18653/v1/2022.repl4nlp-1.10 Identifying the Limits of Cross-Domain Knowledge Transfer for Pretrained Models @@ -160,6 +169,7 @@ QNLI SNLI SST + 10.18653/v1/2022.repl4nlp-1.11 Temporal Knowledge Graph Reasoning with Low-rank and Model-agnostic Representations @@ -172,6 +182,7 @@ dikeoulias-etal-2022-temporal iodike/chronokge ICEWS + 10.18653/v1/2022.repl4nlp-1.12 <fixed-case>ANNA</fixed-case>: Enhanced Language Representation for Question Answering @@ -189,6 +200,7 @@ C4 GLUE SQuAD + 10.18653/v1/2022.repl4nlp-1.13 Video Language Co-Attention with Multimodal Fast-Learning Feature Fusion for <fixed-case>V</fixed-case>ideo<fixed-case>QA</fixed-case> @@ -201,6 +213,7 @@ abdessaied-etal-2022-video MSR-VTT MSVD + 10.18653/v1/2022.repl4nlp-1.15 Detecting Word-Level Adversarial Text Attacks via <fixed-case>SH</fixed-case>apley Additive ex<fixed-case>P</fixed-case>lanations @@ -215,6 +228,7 @@ AG News IMDb Movie Reviews SST + 10.18653/v1/2022.repl4nlp-1.16 Binary Encoded Word Mover’s Distance @@ -223,6 +237,7 @@ Word Mover’s Distance is a textual distance metric which calculates the minimum transport cost between two sets of word embeddings. This metric achieves impressive results on semantic similarity tasks, but is slow and difficult to scale due to the large number of floating point calculations. This paper demonstrates that by combining pre-existing lower bounds with binary encoded word vectors, the metric can be rendered highly efficient in terms of computation time and memory while still maintaining accuracy on several textual similarity tasks. 2022.repl4nlp-1.17 johnson-2022-binary + 10.18653/v1/2022.repl4nlp-1.17 Unsupervised Geometric and Topological Approaches for Cross-Lingual Sentence Representation and Comparison @@ -232,6 +247,7 @@ We propose novel structural-based approaches for the generation and comparison of cross-lingual sentence representations. We do so by applying geometric and topological methods to analyze the structure of sentences, as captured by their word embeddings. The key properties of our methods are: (a) They are designed to be isometric invariant, in order to provide language-agnostic representations. (b) They are fully unsupervised, and use no cross-lingual signal. The quality of our representations, and their preservation across languages, are evaluated in similarity comparison tasks, achieving competitive results. Furthermore, we show that our structural-based representations can be combined with existing methods for improved results. 2022.repl4nlp-1.18 haim-meirom-bobrowski-2022-unsupervised + 10.18653/v1/2022.repl4nlp-1.18 A Study on Entity Linking Across Domains: Which Data is Best for Fine-Tuning? @@ -244,6 +260,7 @@ Entity linking disambiguates mentions by mapping them to entities in a knowledge graph (KG). One important question in today’s research is how to extend neural entity linking systems to new domains. In this paper, we aim at a system that enables linking mentions to entities from a general-domain KG and a domain-specific KG at the same time. In particular, we represent the entities of different KGs in a joint vector space and address the questions of which data is best suited for creating and fine-tuning that space, and whether fine-tuning harms performance on the general domain. We find that a combination of data from both the general and the special domain is most helpful. The first is especially necessary for avoiding performance loss on the general domain.
While additional supervision on entities that appear in both KGs performs best in an intrinsic evaluation of the vector space, it has less impact on the downstream task of entity linking. 2022.repl4nlp-1.19 soliman-etal-2022-study + 10.18653/v1/2022.repl4nlp-1.19 <fixed-case>TRA</fixed-case>ttack: Text Rewriting Attack Against Text Retrieval @@ -256,6 +273,7 @@ Text retrieval has been widely-used in many online applications to help users find relevant information from a text collection. In this paper, we study a new attack scenario against text retrieval to evaluate its robustness to adversarial attacks under the black-box setting, in which attackers want their own texts to always get high relevance scores with different users’ input queries and thus be retrieved frequently and can receive large amounts of impressions for profits. Considering that most current attack methods only simply follow certain fixed optimization rules, we propose a novel text rewriting attack (TRAttack) method with learning ability from the multi-armed bandit mechanism. Extensive experiments conducted on simulated victim environments demonstrate that TRAttack can yield texts that have higher relevance scores with different given users’ queries than those generated by current state-of-the-art attack methods. We also evaluate TRAttack on Tencent Cloud’s and Baidu Cloud’s commercially-available text retrieval APIs, and the rewritten adversarial texts successfully get high relevance scores with different user queries, which shows the practical potential of our method and the risk of text retrieval systems. 2022.repl4nlp-1.20 song-etal-2022-trattack + 10.18653/v1/2022.repl4nlp-1.20 On the Geometry of Concreteness @@ -264,6 +282,7 @@ In this paper we investigate how concreteness and abstractness are represented in word embedding spaces. We use data for English and German, and show that concreteness and abstractness can be determined independently and turn out to be completely opposite directions in the embedding space. Various methods can be used to determine the direction of concreteness, always resulting in roughly the same vector. Though concreteness is a central aspect of the meaning of words and can be detected clearly in embedding spaces, it seems not as easy to subtract or add concreteness to words to obtain other words or word senses like e.g. can be done with a semantic property like gender. 2022.repl4nlp-1.21 wartena-2022-geometry + 10.18653/v1/2022.repl4nlp-1.21 Towards Improving Selective Prediction Ability of <fixed-case>NLP</fixed-case> Systems @@ -276,6 +295,7 @@ varshney-etal-2022-towards MRPC SNLI + 10.18653/v1/2022.repl4nlp-1.23 On Target Representation in Continuous-output Neural Machine Translation @@ -285,6 +305,7 @@ Continuous generative models proved their usefulness in high-dimensional data, such as image and audio generation. However, continuous models for text generation have received limited attention from the community. In this work, we study continuous text generation using Transformers for neural machine translation (NMT). We argue that the choice of embeddings is crucial for such models, so we aim to focus on one particular aspect: target representation via embeddings. We explore pretrained embeddings and also introduce knowledge transfer from the discrete Transformer model using embeddings in Euclidean and non-Euclidean spaces. Our results on the WMT Romanian-English and English-Turkish benchmarks show such transfer leads to the best-performing continuous model.
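TRAttack (above) chooses rewriting operations "with learning ability from the multi-armed bandit mechanism". The abstract does not spell out the algorithm, so the sketch below shows a generic epsilon-greedy bandit over hypothetical rewrite arms, with a random stub in place of the victim retriever's relevance reward; it illustrates the bandit ingredient only, not the paper's method.

import random

random.seed(0)
arms = ["synonym_swap", "word_insert", "word_reorder"]  # hypothetical rewrite operations
counts = {a: 0 for a in arms}
values = {a: 0.0 for a in arms}
true_payoff = {"synonym_swap": 0.5, "word_insert": 0.2, "word_reorder": 0.1}

def reward(arm: str) -> float:
    # Stub standing in for the change in the retriever's relevance score.
    return random.gauss(true_payoff[arm], 0.1)

epsilon = 0.1
for _ in range(1000):
    # Explore a random arm with probability epsilon, otherwise exploit the best estimate.
    arm = random.choice(arms) if random.random() < epsilon else max(values, key=values.get)
    counts[arm] += 1
    values[arm] += (reward(arm) - values[arm]) / counts[arm]  # incremental mean update

print({a: round(v, 2) for a, v in values.items()})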
2022.repl4nlp-1.24 tokarchuk-niculae-2022-target + 10.18653/v1/2022.repl4nlp-1.24 Zero-shot Cross-lingual Transfer is Under-specified Optimization @@ -296,6 +317,7 @@ 2022.repl4nlp-1.25 wu-etal-2022-zero XNLI + 10.18653/v1/2022.repl4nlp-1.25 Same Author or Just Same Topic? Towards Content-Independent Style Representations @@ -306,6 +328,7 @@ Linguistic style is an integral component of language. Recent advances in the development of style representations have increasingly used training objectives from authorship verification (AV): Do two texts have the same author? The assumption underlying the AV training task (same author approximates same writing style) enables self-supervised and, thus, extensive training. However, a good performance on the AV task does not ensure good “general-purpose” style representations. For example, as the same author might typically write about certain topics, representations trained on AV might also encode content information instead of style alone. We introduce a variation of the AV training task that controls for content using conversation or domain labels. We evaluate whether known style dimensions are represented and preferred over content information through an original variation to the recently proposed STEL framework. We find that representations trained by controlling for conversation are better than representations trained with domain or no content control at representing style independent from content. 2022.repl4nlp-1.26 wegmann-etal-2022-author + 10.18653/v1/2022.repl4nlp-1.26 <fixed-case>W</fixed-case>ea<fixed-case>NF</fixed-case>: Weak Supervision with Normalizing Flows @@ -316,6 +339,7 @@ 2022.repl4nlp-1.27 stephan-roth-2022-weanf IMDb Movie Reviews + 10.18653/v1/2022.repl4nlp-1.27 diff --git a/data/xml/2022.slpat.xml b/data/xml/2022.slpat.xml index d1ddf15bae..cdf2b36f1d 100644 --- a/data/xml/2022.slpat.xml +++ b/data/xml/2022.slpat.xml @@ -24,6 +24,7 @@ We present MozoLM, an open-source language model microservice package intended for use in AAC text-entry applications, with a particular focus on the design principles of the library. The intent of the library is to allow the ensembling of multiple diverse language models without requiring the clients (user interface designers, system users or speech-language pathologists) to attend to the formats of the models. Issues around privacy, security, dynamic versus static models, and methods of model combination are explored and specific design choices motivated. Some simulation experiments demonstrating the benefits of personalized language model ensembling via the library are presented. 2022.slpat-1.1 roark-gutkin-2022-design + 10.18653/v1/2022.slpat-1.1 <fixed-case>C</fixed-case>olor<fixed-case>C</fixed-case>ode: A <fixed-case>B</fixed-case>ayesian Approach to Augmentative and Alternative Communication with Two Buttons @@ -33,6 +34,7 @@ 2022.slpat-1.2 daly-2022-colorcode mrdaly/colorcode + 10.18653/v1/2022.slpat-1.2 A glimpse of assistive technology in daily life @@ -48,6 +50,7 @@ Robitaille (2010) wrote ‘if all technology companies have accessibility in their mind then people with disabilities won’t be left behind.’ Current technology has come a long way from where it stood decades ago; however, researchers and manufacturers often do not include people with disabilities in the design process and tend to accommodate them after the fact. In this paper we share feedback from four assistive technology users who rely on one or more assistive technology devices in their everyday lives.
We believe end users should be part of the design process and that by bringing together experts and users, we can bridge the research/practice gap. 2022.slpat-1.3 vaidyanathan-etal-2022-glimpse + 10.18653/v1/2022.slpat-1.3 A comparison study on patient-psychologist voice diarization @@ -64,6 +67,7 @@ Conversations between a clinician and a patient, in natural conditions, are valuable sources of information for medical follow-up. The automatic analysis of these dialogues could help extract new language markers and speed up the clinicians’ reports. Yet, it is not clear which model is the most efficient to detect and identify the speaker turns, especially for individuals with speech disorders. Here, we proposed a split of the data that allows conducting a comparative evaluation of different diarization methods. We designed and trained end-to-end neural network architectures to directly tackle this task from the raw signal and evaluate each approach under the same metric. We also studied the effect of fine-tuning models to find the best performance. Experimental results are reported on naturalistic clinical conversations between Psychologists and Interviewees, at different stages of Huntington’s disease, displaying a large panel of speech disorders. We found out that our best end-to-end model achieved 19.5% IER on the test set, compared to 23.6% achieved by the finetuning of the X-vector architecture. Finally, we observed that we could extract clinical markers directly from the automatic systems, highlighting the clinical relevance of our methods. 2022.slpat-1.4 riad-etal-2022-comparison + 10.18653/v1/2022.slpat-1.4 Producing Standard <fixed-case>G</fixed-case>erman Subtitles for <fixed-case>S</fixed-case>wiss <fixed-case>G</fixed-case>erman <fixed-case>TV</fixed-case> Content @@ -74,6 +78,7 @@ In this study we compare two approaches (neural machine translation and edit-based) and the use of synthetic data for the task of translating normalised Swiss German ASR output into correct written Standard German for subtitles, with a special focus on syntactic differences. Results suggest that NMT is better suited to this task and that relatively simple rule-based generation of training data could be a valuable approach for cases where little training data is available and transformations are simple. 2022.slpat-1.5 gerlach-etal-2022-producing + 10.18653/v1/2022.slpat-1.5 Investigating the Medical Coverage of a Translation System into Pictographs for Patients with an Intellectual Disability @@ -85,6 +90,7 @@ Communication between physician and patients can lead to misunderstandings, especially for disabled people. An automatic system that translates natural language into a pictographic language is one of the solutions that could help to overcome this issue. In this preliminary study, we present the French version of a translation system using the Arasaac pictographs and we investigate the strategies used by speech therapists to translate into pictographs. We also evaluate the medical coverage of this tool for translating physician questions and patient instructions. 
2022.slpat-1.6 norre-etal-2022-investigating + 10.18653/v1/2022.slpat-1.6 On the Ethical Considerations of Text Simplification @@ -94,6 +100,7 @@ 2022.slpat-1.7 gooding-2022-ethical Newsela + 10.18653/v1/2022.slpat-1.7 Applying the Stereotype Content Model to assess disability bias in popular pre-trained <fixed-case>NLP</fixed-case> models underlying <fixed-case>AI</fixed-case>-based assistive technologies @@ -104,6 +111,7 @@ Stereotypes are a positive or negative, generalized, and often widely shared belief about the attributes of certain groups of people, such as people with sensory disabilities. If stereotypes manifest in assistive technologies used by deaf or blind people, they can harm the user in a number of ways, especially considering the vulnerable nature of the target population. AI models underlying assistive technologies have been shown to contain biased stereotypes, including racial, gender, and disability biases. We build on this work to present a psychology-based stereotype assessment of the representation of disability, deafness, and blindness in BERT using the Stereotype Content Model. We show that BERT contains disability bias, and that this bias differs along established stereotype dimensions. 2022.slpat-1.8 herold-etal-2022-applying + 10.18653/v1/2022.slpat-1.8 <fixed-case>C</fixed-case>ue<fixed-case>B</fixed-case>ot: Cue-Controlled Response Generation for Assistive Interaction Usages @@ -119,6 +127,7 @@ 2022.slpat-1.9 h-kumar-etal-2022-cuebot DailyDialog + 10.18653/v1/2022.slpat-1.9 Challenges in assistive technology development for an endangered language: an <fixed-case>I</fixed-case>rish (<fixed-case>G</fixed-case>aelic) perspective @@ -133,6 +142,7 @@ This paper describes three areas of assistive technology development which deploy the resources and speech technology for Irish (Gaelic), newly emerging from the ABAIR initiative. These include (i) a screenreading facility for visually impaired people, (ii) an application to help develop phonological awareness and early literacy for dyslexic people (iii) a speech-enabled AAC system for non-speaking people. Each of these is at a different stage of development and poses unique challenges: these are discussed along with the approaches adopted to address them. Three guiding principles underlie development. Firstly, the sociolinguistic context and the needs of the community are essential considerations in setting priorities. Secondly, development needs to be language sensitive. The need for skilled researchers with a deep knowledge of Irish structure is illustrated in the case of (ii) and (iii), where aspects of Irish linguistic structure (phonological, morphological and grammatical) and the striking differences from English pose challenges for systems aimed at bilingual Irish-English users. Thirdly, and most importantly, the users and their support networks are central – not as passive recipients of ready-made technologies, but as active partners at every stage of development, from design to implementation, evaluation and dissemination. 
2022.slpat-1.10 ni-chasaide-etal-2022-challenges + 10.18653/v1/2022.slpat-1.10 diff --git a/data/xml/2022.spanlp.xml b/data/xml/2022.spanlp.xml index 14fe7903a5..9f1d51557c 100644 --- a/data/xml/2022.spanlp.xml +++ b/data/xml/2022.spanlp.xml @@ -30,6 +30,7 @@ tran-etal-2022-improving FewRel Wiki-ZSL + 10.18653/v1/2022.spanlp-1.1 Choose Your <fixed-case>QA</fixed-case> Model Wisely: A Systematic Study of Generative and Extractive Readers for Question Answering @@ -46,6 +47,7 @@ MRQA Natural Questions SQuAD + 10.18653/v1/2022.spanlp-1.2 Efficient Machine Translation Domain Adaptation @@ -57,6 +59,7 @@ 2022.spanlp-1.3 martins-etal-2022-efficient deep-spin/efficient_knn_mt + 10.18653/v1/2022.spanlp-1.3 Field Extraction from Forms with Unlabeled Data @@ -71,6 +74,7 @@ 2022.spanlp-1.4 gao-etal-2022-field salesforce/inv-cdip + 10.18653/v1/2022.spanlp-1.4 Knowledge Base Index Compression via Dimensionality and Precision Reduction @@ -84,6 +88,7 @@ zouhar-etal-2022-knowledge HotpotQA Natural Questions + 10.18653/v1/2022.spanlp-1.5 diff --git a/data/xml/2022.spnlp.xml b/data/xml/2022.spnlp.xml index f5ef4c8df4..c71a8f4e7d 100644 --- a/data/xml/2022.spnlp.xml +++ b/data/xml/2022.spnlp.xml @@ -29,6 +29,7 @@ kando-etal-2022-multilingual CLAMS Universal Dependencies + 10.18653/v1/2022.spnlp-1.1 Joint Entity and Relation Extraction Based on Table Labeling Using Convolutional Neural Networks @@ -40,6 +41,7 @@ 2022.spnlp-1.2 ma-etal-2022-joint youmima/tablert-cnn + 10.18653/v1/2022.spnlp-1.2 <fixed-case>T</fixed-case>emp<fixed-case>C</fixed-case>aps: A Capsule Network-based Embedding Model for Temporal Knowledge Graph Completion @@ -57,6 +59,7 @@ fu-etal-2022-tempcaps fuguigui/tempcaps ICEWS + 10.18653/v1/2022.spnlp-1.3 <fixed-case>S</fixed-case>lot<fixed-case>GAN</fixed-case>: Detecting Mentions in Text via Adversarial Distant Learning @@ -68,6 +71,7 @@ 2022.spnlp-1.4 daza-etal-2022-slotgan CoNLL-2003 + 10.18653/v1/2022.spnlp-1.4 A Joint Learning Approach for Semi-supervised Neural Topic Modeling @@ -80,6 +84,7 @@ Topic models are some of the most popular ways to represent textual data in an interpretable manner. Recently, advances in deep generative models, specifically auto-encoding variational Bayes (AEVB), have led to the introduction of unsupervised neural topic models, which leverage deep generative models as opposed to traditional statistics-based topic models. We extend upon these neural topic models by introducing the Label-Indexed Neural Topic Model (LI-NTM), which is, to the extent of our knowledge, the first effective upstream semi-supervised neural topic model. We find that LI-NTM outperforms existing neural topic models in document reconstruction benchmarks, with the most notable results in low labeled data regimes and for datasets with informative labels; furthermore, our jointly learned classifier outperforms baseline classifiers in ablation studies. 
2022.spnlp-1.5 chiu-etal-2022-joint + 10.18653/v1/2022.spnlp-1.5 Neural String Edit Distance @@ -90,6 +95,7 @@ 2022.spnlp-1.6 libovicky-fraser-2022-neural jlibovicky/neural-string-edit-distance + 10.18653/v1/2022.spnlp-1.6 Predicting Attention Sparsity in Transformers @@ -104,6 +110,7 @@ treviso-etal-2022-predicting WikiText-103 WikiText-2 + 10.18653/v1/2022.spnlp-1.7 diff --git a/data/xml/2022.wassa.xml b/data/xml/2022.wassa.xml index 803f7975ab..781cc1cf20 100644 --- a/data/xml/2022.wassa.xml +++ b/data/xml/2022.wassa.xml @@ -30,6 +30,7 @@ Authors of posts in social media communicate their emotions and what causes them with text and images. While there is work on emotion and stimulus detection for each modality separately, it is yet unknown if the modalities contain complementary emotion information in social media. We aim at filling this research gap and contribute a novel, annotated corpus of English multimodal Reddit posts. On this resource, we develop models to automatically detect the relation between image and text, an emotion stimulus category and the emotion class. We evaluate if these tasks require both modalities and find for the image–text relations, that text alone is sufficient for most categories (complementary, illustrative, opposing): the information in the text allows to predict if an image is required for emotion understanding. The emotions of anger and sadness are best predicted with a multimodal model, while text alone is sufficient for disgust, joy, and surprise. Stimuli depicted by objects, animals, food, or a person are best predicted by image-only models, while multimodal models are most effective on art, events, memes, places, or screenshots. 2022.wassa-1.1 khlyzova-etal-2022-complementarity + 10.18653/v1/2022.wassa-1.1 Multiplex Anti-<fixed-case>A</fixed-case>sian Sentiment before and during the Pandemic: Introducing New Datasets from <fixed-case>T</fixed-case>witter Mining @@ -42,6 +43,7 @@ COVID-19 has disproportionately threatened minority communities in the U.S., not only in health but also in societal impact. However, social scientists and policymakers lack critical data to capture the dynamics of the anti-Asian hate trend and to evaluate its scale and scope. We introduce new datasets from Twitter related to anti-Asian hate sentiment before and during the pandemic. Relying on Twitter’s academic API, we retrieve hateful and counter-hate tweets from the Twitter Historical Database. To build contextual understanding and collect related racial cues, we also collect instances of heated arguments, often political, but not necessarily hateful, discussing Chinese issues. We then use the state-of-the-art hate speech classifiers to discern whether these tweets express hatred. These datasets can be used to study hate speech, general anti-Asian or Chinese sentiment, and hate linguistics by social scientists as well as to evaluate and build hate speech or sentiment analysis classifiers by computational scholars. 2022.wassa-1.2 lin-etal-2022-multiplex + 10.18653/v1/2022.wassa-1.2 Domain-Aware Contrastive Knowledge Transfer for Multi-domain Imbalanced Data @@ -53,6 +55,7 @@ 2022.wassa-1.3 ke-etal-2022-domain LIAR + 10.18653/v1/2022.wassa-1.3 “splink” is happy and “phrouth” is scary: Emotion Intensity Analysis for Nonsense Words @@ -64,6 +67,7 @@ People associate affective meanings to words - “death” is scary and sad while “party” is connotated with surprise and joy. 
This raises the question if the association is purely a product of the learned affective imports inherent to semantic meanings, or is also an effect of other features of words, e.g., morphological and phonological patterns. We approach this question with an annotation-based analysis leveraging nonsense words. Specifically, we conduct a best-worst scaling crowdsourcing study in which participants assign intensity scores for joy, sadness, anger, disgust, fear, and surprise to 272 nonsense words and, for comparison of the results to previous work, to 68 real words. Based on this resource, we develop character-level and phonology-based intensity regressors. We evaluate them on both nonsense words and real words (making use of the NRC emotion intensity lexicon of 7493 words), across six emotion categories. The analysis of our data reveals that some phonetic patterns show clear differences between emotion intensities. For instance, s as a first phoneme contributes to joy, sh to surprise, p as last phoneme more to disgust than to anger and fear. In the modelling experiments, a regressor trained on real words from the NRC emotion intensity lexicon shows a higher performance (r = 0.17) than regressors that aim at learning the emotion connotation purely from nonsense words. We conclude that humans do associate affective meaning to words based on surface patterns, but also based on similarities to existing words (“juy” to “joy”, or “flike” to “like”). 2022.wassa-1.4 sabbatino-etal-2022-splink + 10.18653/v1/2022.wassa-1.4 <fixed-case>S</fixed-case>ent<fixed-case>EMO</fixed-case>: A Multilingual Adaptive Platform for Aspect-based Sentiment and Emotion Analysis @@ -78,6 +82,7 @@ In this paper, we present the SentEMO platform, a tool that provides aspect-based sentiment analysis and emotion detection of unstructured text data such as reviews, emails and customer care conversations. Currently, models have been trained for five domains and one general domain and are implemented in a pipeline approach, where the output of one model serves as the input for the next. The results are presented in three dashboards, allowing companies to gain more insights into what stakeholders think of their products and services. The SentEMO platform is available at https://sentemo.ugent.be 2022.wassa-1.5 de-geyndt-etal-2022-sentemo + 10.18653/v1/2022.wassa-1.5 Can Emotion Carriers Explain Automatic Sentiment Prediction? A Study on Personal Narratives @@ -91,6 +96,7 @@ 2022.wassa-1.6 mousavi-etal-2022-emotion sislab/pns_val-ec_annotation + 10.18653/v1/2022.wassa-1.6 Infusing Knowledge from <fixed-case>W</fixed-case>ikipedia to Enhance Stance Detection @@ -102,6 +108,7 @@ 2022.wassa-1.7 he-etal-2022-infusing zihaohe123/wiki-enhanced-stance-detection + 10.18653/v1/2022.wassa-1.7 Uncertainty Regularized Multi-Task Learning @@ -113,6 +120,7 @@ 2022.wassa-1.8 meshgi-etal-2022-uncertainty IMDb Movie Reviews + 10.18653/v1/2022.wassa-1.8 Evaluating Contextual Embeddings and their Extraction Layers for Depression Assessment @@ -123,6 +131,7 @@ Many recent works in natural language processing have demonstrated ability to assess aspects of mental health from personal discourse. At the same time, pre-trained contextual word embedding models have grown to dominate much of NLP but little is known empirically on how to best apply them for mental health assessment. 
Using degree of depression as a case study, we do an empirical analysis on which off-the-shelf language model, individual layers, and combinations of layers seem most promising when applied to human-level NLP tasks. Notably, we find RoBERTa most effective and, despite the standard in past work suggesting the second-to-last or concatenation of the last 4 layers, we find layer 19 (sixth-to-last) is at least as good as layer 23 when using 1 layer. Further, when using multiple layers, distributing them across the second half (i.e. Layers 12+), rather than last 4, of the 24 layers yielded the most accurate results. 2022.wassa-1.9 matero-etal-2022-understanding + 10.18653/v1/2022.wassa-1.9 Emotion Analysis of Writers and Readers of <fixed-case>J</fixed-case>apanese Tweets on Vaccinations @@ -136,6 +145,7 @@ 2022.wassa-1.10 ramos-etal-2022-emotion patrickjohnramos/bert-japan-vaccination + 10.18653/v1/2022.wassa-1.10 Opinion-based Relational Pivoting for Cross-domain Aspect Term Extraction @@ -149,6 +159,7 @@ Domain adaptation methods often exploit domain-transferable input features, a.k.a. pivots. The task of Aspect and Opinion Term Extraction presents a special challenge for domain transfer: while opinion terms largely transfer across domains, aspects change drastically from one domain to another (e.g. from restaurants to laptops). In this paper, we investigate and establish empirically a prior conjecture, which suggests that the linguistic relations connecting opinion terms to their aspects transfer well across domains and therefore can be leveraged for cross-domain aspect term extraction. We present several analyses supporting this conjecture, via experiments with four linguistic dependency formalisms to represent relation patterns. Subsequently, we present an aspect term extraction method that drives models to consider opinion–aspect relations via explicit multitask objectives. This method provides significant performance gains, even on top of a prior state-of-the-art linguistically-informed model, which are shown in analysis to stem from the relational pivoting signal. 2022.wassa-1.11 klein-etal-2022-opinion + 10.18653/v1/2022.wassa-1.11 <fixed-case>E</fixed-case>nglish-<fixed-case>M</fixed-case>alay Word Embeddings Alignment for Cross-lingual Emotion Classification with Hierarchical Attention Network @@ -158,6 +169,7 @@ The main challenge in English-Malay cross-lingual emotion classification is that there are no Malay training emotion corpora. Given that machine translation could fall short in contextually complex tweets, we only limited machine translation to the word level. In this paper, we bridge the language gap between English and Malay through cross-lingual word embeddings constructed using singular value decomposition. We pre-trained our hierarchical attention model using English tweets and fine-tuned it using a set of gold standard Malay tweets. Our model uses significantly less computational resources compared to the language models. Experimental results show that the performance of our model is better than mBERT in zero-shot learning by 2.4% and Malay BERT by 0.8% when a limited number of Malay tweets is available. In exchange for 6 – 7 times less in computational time, our model only lags behind mBERT and XLM-RoBERTa by a margin of 0.9 – 4.3 % in few-shot learning. Also, the word-level attention could be transferred to the Malay tweets accurately using the cross-lingual word embeddings. 
2022.wassa-1.12 lim-liew-2022-english-malay + 10.18653/v1/2022.wassa-1.12 Assessment of Massively Multilingual Sentiment Classifiers @@ -171,6 +183,7 @@ Models are increasing in size and complexity in the hunt for SOTA. But what if those 2% increase in performance does not make a difference in a production use case? Maybe benefits from a smaller, faster model outweigh those slight performance gains. Also, equally good performance across languages in multilingual tasks is more important than SOTA results on a single one. We present the biggest, unified, multilingual collection of sentiment analysis datasets. We use these to assess 11 models and 80 high-quality sentiment datasets (out of 342 raw datasets collected) in 27 languages and included results on the internally annotated datasets. We deeply evaluate multiple setups, including fine-tuning transformer-based models for measuring performance. We compare results in numerous dimensions addressing the imbalance in both languages coverage and dataset sizes. Finally, we present some best practices for working with such a massive collection of datasets and models for a multi-lingual perspective. 2022.wassa-1.13 rajda-etal-2022-assessment + 10.18653/v1/2022.wassa-1.13 Improving Social Meaning Detection with Pragmatic Masking and Surrogate Fine-Tuning @@ -180,6 +193,7 @@ Masked language models (MLMs) are pre-trained with a denoising objective that is in a mismatch with the objective of downstream fine-tuning. We propose pragmatic masking and surrogate fine-tuning as two complementing strategies that exploit social cues to drive pre-trained representations toward a broad set of concepts useful for a wide class of social meaning tasks. We test our models on 15 different Twitter datasets for social meaning detection. Our methods achieve 2.34% F1 over a competitive baseline, while outperforming domain-specific language models pre-trained on large datasets. Our methods also excel in few-shot learning: with only 5% of training data (severely few-shot), our methods enable an impressive 68.54% average F1. The methods are also language agnostic, as we show in a zero-shot setting involving six datasets from three different languages. 2022.wassa-1.14 zhang-abdul-mageed-2022-improving + 10.18653/v1/2022.wassa-1.14 Distinguishing In-Groups and Onlookers by Language Use @@ -195,6 +209,7 @@ Inferring group membership of social media users is of high interest in many domains. Group membership is typically inferred via network interactions with other members, or by the usage of in-group language. However, network information is incomplete when users or groups move between platforms, and in-group keywords lose significance as public discussion about a group increases. Similarly, using keywords to filter content and users can fail to distinguish between the various groups that discuss a topic—perhaps confounding research on public opinion and narrative trends. We present a classifier intended to distinguish members of groups from users discussing a group based on contextual usage of keywords. We demonstrate the classifier on a sample of community pairs from Reddit and focus on results related to the COVID-19 pandemic. 
2022.wassa-1.15 minot-etal-2022-distinguishing + 10.18653/v1/2022.wassa-1.15 Irony Detection for <fixed-case>D</fixed-case>utch: a Venture into the Implicit @@ -206,6 +221,7 @@ This paper presents the results of a replication experiment for automatic irony detection in Dutch social media text, investigating both a feature-based SVM classifier, as was done by Van Hee et al. (2017), and a transformer-based approach. In addition to building a baseline model, an important goal of this research is to explore the implementation of common-sense knowledge in the form of implicit sentiment, as we strongly believe that common-sense and connotative knowledge are essential to the identification of irony and implicit meaning in tweets. We show promising results and the presented approach can provide a solid baseline and serve as a staging ground to build on in future experiments for irony detection in Dutch. 2022.wassa-1.16 maladry-etal-2022-irony + 10.18653/v1/2022.wassa-1.16 Pushing on Personality Detection from Verbal Behavior: A Transformer Meets Text Contours of Psycholinguistic Features @@ -217,6 +233,7 @@ Research at the intersection of personality psychology, computer science, and linguistics has recently focused increasingly on modeling and predicting personality from language use. We report two major improvements in predicting personality traits from text data: (1) to our knowledge, the most comprehensive set of theory-based psycholinguistic features and (2) hybrid models that integrate a pre-trained Transformer Language Model BERT and Bidirectional Long Short-Term Memory (BLSTM) networks trained on within-text distributions (‘text contours’) of psycholinguistic features. We experiment with BLSTM models (with and without Attention) and with two techniques for applying pre-trained language representations from the transformer model - ‘feature-based’ and ‘fine-tuning’. We evaluate the performance of the models we built on two benchmark datasets that target the two dominant theoretical models of personality: the Big Five Essay dataset (Pennebaker and King, 1999) and the MBTI Kaggle dataset (Li et al., 2018). Our results are encouraging as our models outperform existing work on the same datasets. More specifically, our models achieve improvement in classification accuracy by 2.9% on the Essay dataset and 8.28% on the Kaggle MBTI dataset. In addition, we perform ablation experiments to quantify the impact of different categories of psycholinguistic features in the respective personality prediction models. 2022.wassa-1.17 kerz-etal-2022-pushing + 10.18653/v1/2022.wassa-1.17 <fixed-case>XLM</fixed-case>-<fixed-case>EMO</fixed-case>: Multilingual Emotion Prediction in Social Media Text @@ -228,6 +245,7 @@ 2022.wassa-1.18 bianchi-etal-2022-xlm milanlproc/xlm-emo + 10.18653/v1/2022.wassa-1.18 Evaluating Content Features and Classification Methods for Helpfulness Prediction of Online Reviews: Establishing a Benchmark for <fixed-case>P</fixed-case>ortuguese @@ -237,6 +255,7 @@ Over the years, the review helpfulness prediction task has been the subject of several works, but remains being a challenging issue in Natural Language Processing, as results vary a lot depending on the domain, on the adopted features and on the chosen classification strategy. This paper attempts to evaluate the impact of content features and classification methods for two different domains. In particular, we run our experiments for a low resource language – Portuguese –, trying to establish a benchmark for this language. 
We show that simple features and classical classification methods are powerful for the task of helpfulness prediction, but are largely outperformed by a convolutional neural network-based solution. 2022.wassa-1.19 sousa-pardo-2022-evaluating + 10.18653/v1/2022.wassa-1.19 <fixed-case>WASSA</fixed-case> 2022 Shared Task: Predicting Empathy, Emotion and Personality in Reaction to News Stories @@ -249,6 +268,7 @@ 2022.wassa-1.20 barriere-etal-2022-wassa GoEmotions + 10.18653/v1/2022.wassa-1.20 <fixed-case>IUCL</fixed-case> at <fixed-case>WASSA</fixed-case> 2022 Shared Task: A Text-only Approach to Empathy and Emotion Detection @@ -259,6 +279,7 @@ Our system, IUCL, participated in the WASSA 2022 Shared Task on Empathy Detection and Emotion Classification. Our main goal in building this system is to investigate how the use of demographic attributes influences performance. Our (official) results show that our text-only systems perform very competitively, ranking first in the empathy detection task, reaching an average Pearson correlation of 0.54, and second in the emotion classification task, reaching a Macro-F of 0.572. Our systems that use both text and demographic data are less competitive. 2022.wassa-1.21 chen-etal-2022-iucl + 10.18653/v1/2022.wassa-1.21 Continuing Pre-trained Model with Multiple Training Strategies for Emotional Classification @@ -271,6 +292,7 @@ Emotion is the essential attribute of human beings. Perceiving and understanding emotions in a human-like manner is the most central part of developing emotional intelligence. This paper describes the contribution of the LingJing team’s method to the Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis (WASSA) 2022 shared task on Emotion Classification. The participants are required to predict seven emotions from empathic responses to news or stories that caused harm to individuals, groups, or others. This paper describes the continual pre-training method for the masked language model (MLM) to enhance the DeBERTa pre-trained language model. Several training strategies are designed to further improve the final downstream performance including the data augmentation with the supervised transfer, child-tuning training, and the late fusion method. Extensive experiments on the emotional classification dataset show that the proposed method outperforms other state-of-the-art methods, demonstrating our method’s effectiveness. Moreover, our submission ranked Top-1 with all metrics in the evaluation phase for the Emotion Classification task. 2022.wassa-1.22 li-etal-2022-continuing + 10.18653/v1/2022.wassa-1.22 Empathy and Distress Prediction using Transformer Multi-output Regression and Emotion Analysis with an Ensemble of Supervised and Zero-Shot Learning Models @@ -283,6 +305,7 @@ 2022.wassa-1.23 del-arco-etal-2022-empathy CARER + 10.18653/v1/2022.wassa-1.23 Leveraging Emotion-Specific features to improve Transformer performance for Emotion Classification @@ -295,6 +318,7 @@ This paper describes team PVG’s AI Club’s approach to the Emotion Classification shared task held at WASSA 2022. This Track 2 sub-task focuses on building models which can predict a multi-class emotion label based on essays from news articles where a person, group or another entity is affected. 
Baseline transformer models have been demonstrating good results on sequence classification tasks, and we aim to improve this performance with the help of ensembling techniques, and by leveraging two variations of emotion-specific representations. We observe better results than our baseline models and achieve an accuracy of 0.619 and a macro F1 score of 0.520 on the emotion classification task. 2022.wassa-1.24 desai-etal-2022-leveraging + 10.18653/v1/2022.wassa-1.24 Transformer based ensemble for emotion detection @@ -307,6 +331,7 @@ 2022.wassa-1.25 kane-etal-2022-transformer GoEmotions + 10.18653/v1/2022.wassa-1.25 Team <fixed-case>IITP</fixed-case>-<fixed-case>AINLPML</fixed-case> at <fixed-case>WASSA</fixed-case> 2022: Empathy Detection, Emotion Classification and Personality Detection @@ -318,6 +343,7 @@ Computational comprehension and identifying emotional components in language have been critical in enhancing human-computer connection in recent years. The WASSA 2022 Shared Task introduced four tracks and released a dataset of news stories: Track-1 for Empathy and Distress Prediction, Track-2 for Emotion classification, Track-3 for Personality prediction, and Track-4 for Interpersonal Reactivity Index prediction at the essay level. This paper describes our participation in the WASSA 2022 shared task on the tasks mentioned above. We developed multi-task deep learning methods to address Tracks 1 and 2 and machine learning models for Track 3 and 4. Our developed systems achieved average Pearson scores of 0.483, 0.05, and 0.08 for Track 1, 3, and 4, respectively, and a macro F1 score of 0.524 for Track 2 on the test set. We ranked 8th, 11th, 2nd and 2nd for tracks 1, 2, 3, and 4 respectively. 2022.wassa-1.26 ghosh-etal-2022-team + 10.18653/v1/2022.wassa-1.26 Transformer-based Architecture for Empathy Prediction and Emotion Classification @@ -329,6 +355,7 @@ This paper describes the contribution of team PHG to the WASSA 2022 shared task on Empathy Prediction and Emotion Classification. The broad goal of this task was to model an empathy score, a distress score and the type of emotion associated with the person who had reacted to the essay written in response to a newspaper article. We have used the RoBERTa model for training, on top of which a few layers are added to finetune the transformer. We also use a few machine learning techniques to augment as well as upsample the data. Our system achieves a Pearson Correlation Coefficient of 0.488 on Task 1 (Empathy - 0.470 and Distress - 0.506) and Macro F1-score of 0.531 on Task 2. 2022.wassa-1.27 vasava-etal-2022-transformer + 10.18653/v1/2022.wassa-1.27 Prompt-based Pre-trained Model for Personality and Interpersonal Reactivity Prediction @@ -342,6 +369,7 @@ This paper describes the LingJing team’s method to the Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis (WASSA) 2022 shared task on Personality Prediction (PER) and Reactivity Index Prediction (IRI). In this paper, we adopt the prompt-based method with the pre-trained language model to accomplish these tasks. Specifically, the prompt is designed to provide knowledge of the extra personalized information for enhancing the pre-trained model. Data augmentation and model ensemble are adopted for obtaining better results. Extensive experiments are performed, which shows the effectiveness of the proposed method. On the final submission, our system achieves a Pearson Correlation Coefficient of 0.2301 and 0.2546 on Track 3 and Track 4 respectively. 
We ranked 1st on both sub-tasks. 2022.wassa-1.28 li-etal-2022-prompt-based + 10.18653/v1/2022.wassa-1.28 <fixed-case>SURREY</fixed-case>-<fixed-case>CTS</fixed-case>-<fixed-case>NLP</fixed-case> at <fixed-case>WASSA</fixed-case>2022: An Experiment of Discourse and Sentiment Analysis for the Prediction of Empathy, Distress and Emotion @@ -355,6 +383,7 @@ 2022.wassa-1.29 qian-etal-2022-surrey GoEmotions + 10.18653/v1/2022.wassa-1.29 An Ensemble Approach to Detect Emotions at an Essay Level @@ -366,6 +395,7 @@ maheshwari-varma-2022-ensemble him-mah10/an-ensemble-approach-to-detect-emotions-at-an-essay-level GoEmotions + 10.18653/v1/2022.wassa-1.30 <fixed-case>CAISA</fixed-case> at <fixed-case>WASSA</fixed-case> 2022: Adapter-Tuning for Empathy Prediction @@ -378,6 +408,7 @@ lahnala-etal-2022-caisa caisa-lab/wassa-empathy-adapters CARER + 10.18653/v1/2022.wassa-1.31 <fixed-case>NLPOP</fixed-case>: a Dataset for Popularity Prediction of Promoted <fixed-case>NLP</fixed-case> Research on <fixed-case>T</fixed-case>witter @@ -389,6 +420,7 @@ 2022.wassa-1.32 obadic-etal-2022-nlpop lobadic/nlpop + 10.18653/v1/2022.wassa-1.32 Tagging Without Rewriting: A Probabilistic Model for Unpaired Sentiment and Style Transfer @@ -399,6 +431,7 @@ shuo-2022-tagging GYAFC IMDb Movie Reviews + 10.18653/v1/2022.wassa-1.33 Polite Task-oriented Dialog Agents: To Generate or to Rewrite? @@ -410,6 +443,7 @@ 2022.wassa-1.34 silva-etal-2022-polite MMD + 10.18653/v1/2022.wassa-1.34 Items from Psychometric Tests as Training Data for Personality Profiling Models of <fixed-case>T</fixed-case>witter Users @@ -420,6 +454,7 @@ Machine-learned models for author profiling in social media often rely on data acquired via self-reporting-based psychometric tests (questionnaires) filled out by social media users. This is an expensive but accurate data collection strategy. Another, less costly alternative, which leads to potentially more noisy and biased data, is to rely on labels inferred from publicly available information in the profiles of the users, for instance self-reported diagnoses or test results. In this paper, we explore a third strategy, namely to directly use a corpus of items from validated psychometric tests as training data. Items from psychometric tests often consist of sentences from an I-perspective (e.g., ‘I make friends easily.’). Such corpora of test items constitute ‘small data’, but their availability for many concepts is a rich resource. We investigate this approach for personality profiling, and evaluate BERT classifiers fine-tuned on such psychometric test items for the big five personality traits (openness, conscientiousness, extraversion, agreeableness, neuroticism) and analyze various augmentation strategies regarding their potential to address the challenges coming with such a small corpus. Our evaluation on a publicly available Twitter corpus shows a comparable performance to in-domain training for 4/5 personality traits with T5-based data augmentation. 
2022.wassa-1.35 kreuter-etal-2022-items + 10.18653/v1/2022.wassa-1.35 diff --git a/data/xml/2022.wit.xml b/data/xml/2022.wit.xml index b3d18dbec0..d3ae7690df 100644 --- a/data/xml/2022.wit.xml +++ b/data/xml/2022.wit.xml @@ -27,6 +27,7 @@ 2022.wit-1.1 park-lee-2022-unsupervised seongminp/graph-dialogue-summary + 10.18653/v1/2022.wit-1.1 An Interactive Analysis of User-reported Long <fixed-case>COVID</fixed-case> Symptoms using <fixed-case>T</fixed-case>witter Data @@ -37,6 +38,7 @@ With millions of documented recoveries from COVID-19 worldwide, various long-term sequelae have been observed in a large group of survivors. This paper is aimed at systematically analyzing user-generated conversations on Twitter that are related to long-term COVID symptoms for a better understanding of the Long COVID health consequences. Using an interactive information extraction tool built especially for this purpose, we extracted key information from the relevant tweets and analyzed the user-reported Long COVID symptoms with respect to their demographic and geographical characteristics. The results of our analysis are expected to improve the public awareness on long-term COVID-19 sequelae and provide important insights to public health authorities. 2022.wit-1.2 miao-etal-2022-interactive + 10.18653/v1/2022.wit-1.2 Bi-Directional Recurrent Neural Ordinary Differential Equations for Social Media Text Classification @@ -47,6 +49,7 @@ Classification of posts in social media such as Twitter is difficult due to the noisy and short nature of texts. Sequence classification models based on recurrent neural networks (RNN) are popular for classifying posts that are sequential in nature. RNNs assume the hidden representation dynamics to evolve in a discrete manner and do not consider the exact time of the posting. In this work, we propose to use recurrent neural ordinary differential equations (RNODE) for social media post classification which consider the time of posting and allow the computation of hidden representation to evolve in a time-sensitive continuous manner. In addition, we propose a novel model, Bi-directional RNODE (Bi-RNODE), which can consider the information flow in both the forward and backward directions of posting times to predict the post label. Our experiments demonstrate that RNODE and Bi-RNODE are effective for the problem of stance classification of rumours in social media. 2022.wit-1.3 tamire-etal-2022-bi + 10.18653/v1/2022.wit-1.3