Update metadata from Papers with Code
acl-pwc-bot authored Aug 11, 2022
1 parent b0d8978 commit bdc0ece
Showing 30 changed files with 808 additions and 795 deletions.
2 changes: 1 addition & 1 deletion data/xml/2020.acl.xml
@@ -6614,7 +6614,7 @@
<doi>10.18653/v1/2020.acl-main.447</doi>
<video href="http://slideslive.com/38929131"/>
<bibkey>lo-etal-2020-s2orc</bibkey>
-<pwccode url="https://github.com/allenai/s2orc" additional="true">allenai/s2orc</pwccode>
+<pwccode url="https://github.com/allenai/s2-gorc" additional="true">allenai/s2-gorc</pwccode>
<pwcdataset url="https://paperswithcode.com/dataset/s2orc">S2ORC</pwcdataset>
<pwcdataset url="https://paperswithcode.com/dataset/cord-19">CORD-19</pwcdataset>
<pwcdataset url="https://paperswithcode.com/dataset/dblp">DBLP</pwcdataset>
2 changes: 1 addition & 1 deletion data/xml/2021.eacl.xml
@@ -2866,7 +2866,7 @@
<url hash="73788212">2021.eacl-main.213</url>
<bibkey>padmakumar-he-2021-unsupervised</bibkey>
<doi>10.18653/v1/2021.eacl-main.213</doi>
-<pwccode url="https://github.com/vishakhpk/mi-unsup-summ" additional="true">vishakhpk/mi-unsup-summ</pwccode>
+<pwccode url="https://github.com/vishakhpk/mi-unsup-summ" additional="false">vishakhpk/mi-unsup-summ</pwccode>
<pwcdataset url="https://paperswithcode.com/dataset/reddit-tifu">Reddit TIFU</pwcdataset>
</paper>
<paper id="214">
1 change: 0 additions & 1 deletion data/xml/2021.naacl.xml
@@ -6767,7 +6767,6 @@
<doi>10.18653/v1/2021.naacl-main.441</doi>
<bibkey>chen-etal-2021-shadowgnn</bibkey>
<video href="2021.naacl-main.441.mp4"/>
-<pwccode url="https://github.com/WowCZ/shadowgnn" additional="false">WowCZ/shadowgnn</pwccode>
<pwcdataset url="https://paperswithcode.com/dataset/scan">SCAN</pwcdataset>
<pwcdataset url="https://paperswithcode.com/dataset/spider-1">SPIDER</pwcdataset>
</paper>
3 changes: 3 additions & 0 deletions data/xml/2022.acl.xml
@@ -2673,6 +2673,7 @@
<url hash="a7669577">2022.acl-long.172</url>
<bibkey>ding-etal-2022-redistributing</bibkey>
<doi>10.18653/v1/2022.acl-long.172</doi>
+<pwccode url="https://github.com/alphadl/rlfw-nat.mono" additional="false">alphadl/rlfw-nat.mono</pwccode>
</paper>
<paper id="173">
<title>Dependency Parsing as <fixed-case>MRC</fixed-case>-based Span-Span Prediction</title>
@@ -3752,6 +3753,7 @@
<bibkey>krojer-etal-2022-image</bibkey>
<doi>10.18653/v1/2022.acl-long.241</doi>
<pwccode url="https://github.com/mcgill-nlp/imagecode" additional="false">mcgill-nlp/imagecode</pwccode>
+<pwcdataset url="https://paperswithcode.com/dataset/imagecode">ImageCoDe</pwcdataset>
<pwcdataset url="https://paperswithcode.com/dataset/spot-the-diff">Spot-the-diff</pwcdataset>
<pwcdataset url="https://paperswithcode.com/dataset/video-storytelling">Video Storytelling</pwcdataset>
<pwcdataset url="https://paperswithcode.com/dataset/youcook">YouCook</pwcdataset>
@@ -8129,6 +8131,7 @@ in the Case of Unambiguous Gender</title>
<pwcdataset url="https://paperswithcode.com/dataset/rxr">RxR</pwcdataset>
<pwcdataset url="https://paperswithcode.com/dataset/streetlearn">StreetLearn</pwcdataset>
<pwcdataset url="https://paperswithcode.com/dataset/talk-the-walk">Talk the Walk</pwcdataset>
+<pwcdataset url="https://paperswithcode.com/dataset/vln-ce">VLN-CE</pwcdataset>
</paper>
<paper id="525">
<title>Learning to Generate Programs for Table Fact Verification via Structure-Aware Semantic Parsing</title>
12 changes: 6 additions & 6 deletions data/xml/2022.autosimtrans.xml
@@ -30,9 +30,9 @@
<abstract>This paper reports the results of the shared task we hosted on the Third Workshop of Automatic Simultaneous Translation (AutoSimTrans). The shared task aims to promote the development of text-to-text and speech-to-text simultaneous translation, and includes Chinese-English and English-Spanish tracks. The number of systems submitted this year has increased fourfold compared with last year. Additionally, the top 1 ranked system in the speech-to-text track is the first end-to-end submission we have received in the past three years, which has shown great potential. This paper reports the results and descriptions of the 14 participating teams, compares different evaluation metrics, and revisits the ranking method.</abstract>
<url hash="3926445e">2022.autosimtrans-1.1</url>
<bibkey>zhang-etal-2022-findings</bibkey>
+<doi>10.18653/v1/2022.autosimtrans-1.1</doi>
<pwcdataset url="https://paperswithcode.com/dataset/aishell-1">AISHELL-1</pwcdataset>
<pwcdataset url="https://paperswithcode.com/dataset/bstc">BSTC</pwcdataset>
-<doi>10.18653/v1/2022.autosimtrans-1.1</doi>
</paper>
<paper id="2">
<title>Over-Generation Cannot Be Rewarded: Length-Adaptive Average Lagging for Simultaneous Speech Translation</title>
@@ -44,8 +44,8 @@
<abstract>Simultaneous speech translation (SimulST) systems aim at generating their output with the lowest possible latency, which is normally computed in terms of Average Lagging (AL). In this paper we highlight that, despite its widespread adoption, AL provides underestimated scores for systems that generate longer predictions compared to the corresponding references. We also show that this problem has practical relevance, as recent SimulST systems have indeed a tendency to over-generate. As a solution, we propose LAAL (Length-Adaptive Average Lagging), a modified version of the metric that takes into account the over-generation phenomenon and allows for unbiased evaluation of both under-/over-generating systems.</abstract>
<url hash="2210de52">2022.autosimtrans-1.2</url>
<bibkey>papi-etal-2022-generation</bibkey>
-<pwccode url="https://github.com/hlt-mt/fbk-fairseq" additional="false">hlt-mt/fbk-fairseq</pwccode>
<doi>10.18653/v1/2022.autosimtrans-1.2</doi>
+<pwccode url="https://github.com/hlt-mt/fbk-fairseq" additional="false">hlt-mt/fbk-fairseq</pwccode>
</paper>
<paper id="3">
<title>System Description on Automatic Simultaneous Translation Workshop</title>
@@ -56,9 +56,9 @@
<abstract>This paper describes our system submitted on the third automatic simultaneous translation workshop at NAACL2022. We participate in the Chinese audio-&gt;English text direction of Chinese-to-English translation. Our speech-to-text system is a pipeline system, in which we resort to rhymological features for audio split, ASRT model for speech recoginition, STACL model for streaming text translation. To translate streaming text, we use wait-k policy trained to generate the target sentence concurrently with the source sentence, but always k words behind. We propose a competitive simultaneous translation system and rank 3rd in the audio input track. The code will release soon.</abstract>
<url hash="9101567a">2022.autosimtrans-1.3</url>
<bibkey>li-etal-2022-system</bibkey>
+<doi>10.18653/v1/2022.autosimtrans-1.3</doi>
<pwcdataset url="https://paperswithcode.com/dataset/aishell-1">AISHELL-1</pwcdataset>
<pwcdataset url="https://paperswithcode.com/dataset/thchs-30">THCHS-30</pwcdataset>
-<doi>10.18653/v1/2022.autosimtrans-1.3</doi>
</paper>
<paper id="4">
<title>System Description on Third Automatic Simultaneous Translation Workshop</title>
@@ -79,8 +79,8 @@
<abstract>This paper describes the system submitted to AutoSimTrans 2022 from Huawei Noah’s Ark Lab, which won the first place in the audio input track of the Chinese-English translation task. Our system is based on RealTranS, an end-to-end simultaneous speech translation model. We enhance the model with pretraining, by initializing the acoustic encoder with ASR encoder, and the semantic encoder and decoder with NMT encoder and decoder, respectively. To relieve the data scarcity, we further construct pseudo training corpus as a kind of knowledge distillation with ASR data and the pretrained NMT model. Meanwhile, we also apply several techniques to improve the robustness and domain generalizability, including punctuation removal, token-level knowledge distillation and multi-domain finetuning. Experiments show that our system significantly outperforms the baselines at all latency and also verify the effectiveness of our proposed methods.</abstract>
<url hash="8bcdac35">2022.autosimtrans-1.5</url>
<bibkey>zeng-etal-2022-end</bibkey>
-<pwcdataset url="https://paperswithcode.com/dataset/bstc">BSTC</pwcdataset>
<doi>10.18653/v1/2022.autosimtrans-1.5</doi>
+<pwcdataset url="https://paperswithcode.com/dataset/bstc">BSTC</pwcdataset>
</paper>
<paper id="6">
<title><fixed-case>BIT</fixed-case>-Xiaomi’s System for <fixed-case>A</fixed-case>uto<fixed-case>S</fixed-case>im<fixed-case>T</fixed-case>rans 2022</title>
@@ -97,8 +97,8 @@
<abstract>This system paper describes the BIT-Xiaomi simultaneous translation system for Autosimtrans 2022 simultaneous translation challenge. We participated in three tracks: the Zh-En text-to-text track, the Zh-En audio-to-text track and the En-Es test-to-text track. In our system, wait-k is employed to train prefix-to-prefix translation models. We integrate streaming chunking to detect boundaries as the source streaming read in. We further improve our system with data selection, data-augmentation and R-drop training methods. Results show that our wait-k implementation outperforms organizer’s baseline by 8 BLEU score at most, and our proposed streaming chunking method further improves about 2 BLEU in low latency regime.</abstract>
<url hash="04afefe1">2022.autosimtrans-1.6</url>
<bibkey>liu-etal-2022-bit</bibkey>
-<pwcdataset url="https://paperswithcode.com/dataset/bstc">BSTC</pwcdataset>
<doi>10.18653/v1/2022.autosimtrans-1.6</doi>
+<pwcdataset url="https://paperswithcode.com/dataset/bstc">BSTC</pwcdataset>
</paper>
<paper id="7">
<title><fixed-case>USST</fixed-case>’s System for <fixed-case>A</fixed-case>uto<fixed-case>S</fixed-case>im<fixed-case>T</fixed-case>rans 2022</title>
@@ -108,9 +108,9 @@
<abstract>This paper describes our submitted text-to-text Simultaneous translation (ST) system, which won the second place in the Chinese→English streaming translation task of AutoSimTrans 2022. Our baseline system is a BPE-based Transformer model trained with the PaddlePaddle framework. In our experiments, we employ data synthesis and ensemble approaches to enhance the base model. In order to bridge the gap between general domain and spoken domain, we select in-domain data from general corpus and mixed then with spoken corpus for mixed fine tuning. Finally, we adopt fixed wait-k policy to transfer our full-sentence translation model to simultaneous translation model. Experiments on the development data show that our system outperforms than the baseline system.</abstract>
<url hash="6cb6f915">2022.autosimtrans-1.7</url>
<bibkey>hui-jun-2022-ussts</bibkey>
+<doi>10.18653/v1/2022.autosimtrans-1.7</doi>
<pwccode url="https://github.com/tyy2022/usst_autosimultrans2022" additional="false">tyy2022/usst_autosimultrans2022</pwccode>
<pwcdataset url="https://paperswithcode.com/dataset/bstc">BSTC</pwcdataset>
-<doi>10.18653/v1/2022.autosimtrans-1.7</doi>
</paper>
</volume>
</collection>