Commit a97ddfa (initial commit, 0 parents)
Showing 51 changed files with 3,242 additions and 0 deletions.
Gemfile
@@ -0,0 +1,15 @@
source "https://rubygems.org" | ||
|
||
git_source(:github) {|repo_name| "https://github.com/#{repo_name}" } | ||
|
||
gem 'jekyll' | ||
|
||
group :jekyll_plugins do | ||
gem 'github-pages' | ||
gem 'jekyll-remote-theme' | ||
gem 'jekyll-include-cache' | ||
gem 'webrick' | ||
end | ||
|
||
# gem "rails" | ||
|
README.md
@@ -0,0 +1,28 @@
# PMLR 262

To suggest fixes to this volume, please make a pull request containing the requested changes and a justification for the changes.

To edit the details of this conference, edit the [_config.yml](./_config.yml) file and submit a pull request.

To make changes to the individual paper details, edit the associated paper file in the [./_posts](./_posts) subdirectory.

For details of how to publish in PMLR, please check https://proceedings.mlr.press/faq.html

For details of what is required to submit a proceedings, please check https://proceedings.mlr.press/spec.html

Published as Volume 262 by the Proceedings of Machine Learning Research on 10 December 2024.

Volume Edited by:
* Mehdi Rezagholizadeh
* Peyman Passban
* Soheila Samiee
* Vahid Partovi Nia
* Yu Cheng
* Yue Deng
* Qun Liu
* Boxing Chen

Series Editors:
* Neil D. Lawrence
_config.yml
@@ -0,0 +1,110 @@
---
booktitle: Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing
  Workshop
shortname: ENLSP-IV 2024
sections:
- name: Training
  title: Training
- name: Model Design \& Architecture
  title: Model Design \& Architecture
- name: Model Efficiency \& Compression
  title: Model Efficiency \& Compression
- name: Inference
  title: Inference
- name: " Benchmark \\& Evaluation"
  title: " Benchmark \\& Evaluation"
- name: 'Applications '
  title: 'Applications '
volume: '262'
year: '2024'
start: &1 2024-12-14
end: 2024-12-14
published: 2024-12-10
layout: proceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: ENLSP-2024
month: 0
cycles: false
bibtex_editor: Rezagholizadeh, Mehdi and Passban, Peyman and Samiee, Soheila and Partovi
  Nia, Vahid and Cheng, Yu and Deng, Yue and Liu, Qun and Chen, Boxing
editor:
- given: Mehdi
  family: Rezagholizadeh
- given: Peyman
  family: Passban
- given: Soheila
  family: Samiee
- given: Vahid
  family: Partovi Nia
- given: Yu
  family: Cheng
- given: Yue
  family: Deng
- given: Qun
  family: Liu
- given: Boxing
  family: Chen
title: Proceedings of Machine Learning Research
description: |
  Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop
  Held in Vancouver, British Columbia, Canada on 14 December 2024
  Published as Volume 262 by the Proceedings of Machine Learning Research on 10 December 2024.
  Volume Edited by:
  Mehdi Rezagholizadeh
  Peyman Passban
  Soheila Samiee
  Vahid Partovi Nia
  Yu Cheng
  Yue Deng
  Qun Liu
  Boxing Chen
  Series Editors:
  Neil D. Lawrence
date_str: 14 Dec
url: https://proceedings.mlr.press
author:
  name: PMLR
baseurl: "/v262"
twitter_username: MLResearchPress
github_username: mlresearch
markdown: kramdown
exclude:
- README.md
- Gemfile
- ".gitignore"
plugins:
- jekyll-feed
- jekyll-seo-tag
- jekyll-remote-theme
remote_theme: mlresearch/jekyll-theme
style: pmlr
permalink: "/:title.html"
ghub:
  edit: true
  repository: v262
display:
  copy_button:
    bibtex: true
    endnote: true
    apa: true
  comments: false
volume_type: Volume
volume_dir: v262
email: ''
conference:
  name: NeurIPS Efficient Natural Language and Speech Processing Workshop
  url: https://neurips2024-enlsp.github.io/
  location: Vancouver, British Columbia, Canada
  dates:
  - *1
analytics:
  google:
    tracking_id: UA-92432422-1
orig_bibfile: "/Users/neil/mlresearch/v262/enlsp24.bib"
# Site settings
# Original source: /Users/neil/mlresearch/v262/enlsp24.bib
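A note on how the files in this commit fit together: each paper file in ./_posts declares a `section` value (for example `section: Inference`), and that value is expected to match one of the `name` entries in the `sections` list of _config.yml. A small, hypothetical consistency check is sketched below; it is not part of this commit, and it assumes PyYAML is installed and that the script is run from the repository root.

```python
# Hypothetical helper, not part of this repository: checks that every paper's
# `section` front-matter value matches a section `name` defined in _config.yml.
import glob

import yaml  # PyYAML

with open("_config.yml") as f:
    config = yaml.safe_load(f)
known_sections = {s["name"].strip() for s in config.get("sections", [])}

for path in sorted(glob.glob("_posts/*.md")):
    with open(path) as f:
        # The paper files consist only of YAML front matter between two '---' markers.
        front_matter = yaml.safe_load(f.read().split("---")[1])
    section = (front_matter.get("section") or "").strip()
    if section not in known_sections:
        print(f"{path}: unknown section {section!r}")
```

The `.strip()` calls account for the fact that a couple of the section names above carry leading or trailing spaces.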
_posts/2024-12-10-agrawal24a.md
@@ -0,0 +1,60 @@
---
title: 'AdaEDL: Early Draft Stopping for Speculative Decoding of Large Language Models
  via an Entropy-based Lower Bound on Token Acceptance Probability'
section: Inference
abstract: 'Speculative decoding is a powerful technique that attempts to circumvent
  the autoregressive constraint of modern Large Language Models (LLMs). The aim of
  speculative decoding techniques is to improve the average inference time of a large
  target model without sacrificing its accuracy, by using a more efficient draft model
  to propose draft tokens which are then verified in parallel. The number of draft
  tokens produced in each drafting round is referred to as the draft length and is
  often a static hyperparameter chosen based on the acceptance rate statistics of
  the draft tokens. However, setting a static draft length can negatively impact performance,
  especially in scenarios where drafting is expensive and there is a high variance
  in the number of tokens accepted. Adaptive Entropy-based Draft Length (AdaEDL) is
  a simple, training- and parameter-free criterion which allows for early stopping of
  the token drafting process by approximating a lower bound on the expected acceptance
  probability of the drafted token based on the currently observed entropy of the
  drafted logits. We show that AdaEDL consistently outperforms static draft-length
  speculative decoding by 10%-57% as well as other training-free draft-stopping techniques
  by up to 10% in a variety of settings and datasets. At the same time, we show that
  AdaEDL is more robust than these techniques and preserves performance in high-sampling-temperature
  scenarios. Since it is training-free, in contrast to techniques that rely on the
  training of dataset-specific draft-stopping predictors, AdaEDL can seamlessly be
  integrated into a variety of pre-existing LLM systems. '
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: agrawal24a
month: 0
tex_title: "{AdaEDL}: Early Draft Stopping for Speculative Decoding of Large Language
  Models via an Entropy-based Lower Bound on Token Acceptance Probability"
firstpage: 355
lastpage: 369
page: 355-369
order: 355
cycles: false
bibtex_author: Agrawal, Sudhanshu and Jeon, Wonseok and Lee, Mingu
author:
- given: Sudhanshu
  family: Agrawal
- given: Wonseok
  family: Jeon
- given: Mingu
  family: Lee
date: 2024-12-10
address:
container-title: Proceedings of The 4th NeurIPS Efficient Natural Language and Speech
  Processing Workshop
volume: '262'
genre: inproceedings
issued:
  date-parts:
  - 2024
  - 12
  - 10
pdf: https://raw.githubusercontent.com/mlresearch/v262/main/assets/agrawal24a/agrawal24a.pdf
extras: []
# Format based on Martin Fenner's citeproc: https://blog.front-matter.io/posts/citeproc-yaml-for-bibliographies/
---
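The abstract above describes a concrete control loop: keep drafting tokens while an entropy-based lower bound on their acceptance probability stays high, and stop the drafting round early once it drops. The sketch below only illustrates that idea; the `exp(-entropy)` surrogate, the `0.35` threshold, the greedy draft step, and the `draft_model(ids)` interface (a callable returning next-token logits of shape `[batch, seq, vocab]` for a batch of one) are assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F


def entropy_based_acceptance_bound(draft_logits: torch.Tensor) -> torch.Tensor:
    """Illustrative surrogate for a lower bound on the acceptance probability
    of a drafted token, computed from the entropy of the draft distribution."""
    log_probs = F.log_softmax(draft_logits, dim=-1)
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1)  # H(p_draft), shape [batch]
    # Higher draft entropy -> less confident draft -> lower expected acceptance.
    return torch.exp(-entropy)


def draft_adaptively(draft_model, prefix_ids, max_draft_len=8, threshold=0.35):
    """Draft up to `max_draft_len` tokens, stopping early when the estimated
    acceptance bound falls below `threshold` (assumes batch size 1)."""
    drafted, ids = [], prefix_ids
    for _ in range(max_draft_len):
        logits = draft_model(ids)[:, -1, :]              # next-token logits
        if entropy_based_acceptance_bound(logits).item() < threshold:
            break                                        # stop this drafting round early
        next_id = logits.argmax(dim=-1, keepdim=True)
        drafted.append(next_id)
        ids = torch.cat([ids, next_id], dim=-1)
    return drafted                                       # tokens to verify in parallel
```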
_posts/2024-12-10-ali-sadraei-javaheri24a.md
@@ -0,0 +1,57 @@
---
title: 'SuperPos-Prompt: Enhancing Soft Prompt Tuning of Language Models with Superposition
  of Multi Token Embeddings'
section: Training
abstract: 'Soft prompt tuning techniques have recently gained traction as an effective
  strategy for the parameter-efficient tuning of pre-trained language models, particularly
  minimizing the required adjustment of model parameters. Despite their growing use,
  achieving optimal tuning with soft prompts, especially with smaller datasets, remains
  a substantial challenge. This study makes two contributions in this domain: (i)
  we introduce SuperPos-Prompt, a new reparameterization technique employing the superposition
  of multiple pre-trained vocabulary embeddings to improve the learning of soft prompts.
  Our experiments across several GLUE and SuperGLUE benchmarks consistently highlight
  SuperPos-Prompt’s superiority over Residual Prompt tuning, exhibiting an average
  score increase of +6.4 in T5-Small and +5.0 in T5-Base along with a faster convergence.
  Remarkably, SuperPos-Prompt occasionally outperforms even full fine-tuning methods.
  (ii) Additionally, we demonstrate enhanced performance and rapid convergence by
  omitting dropouts from the frozen network, yielding consistent improvements across
  various scenarios and tuning methods.'
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: ali-sadraei-javaheri24a
month: 0
tex_title: "{SuperPos-Prompt}: Enhancing Soft Prompt Tuning of Language Models with
  Superposition of Multi Token Embeddings"
firstpage: 34
lastpage: 46
page: 34-46
order: 34
cycles: false
bibtex_author: Ali Sadraei Javaheri, Mohammad and Asgari, Ehsaneddin and C. McHardy,
  Alice and R. Rabiee, Hamid
author:
- given: Mohammad
  family: Ali Sadraei Javaheri
- given: Ehsaneddin
  family: Asgari
- given: Alice
  family: C. McHardy
- given: Hamid
  family: R. Rabiee
date: 2024-12-10
address:
container-title: Proceedings of The 4th NeurIPS Efficient Natural Language and Speech
  Processing Workshop
volume: '262'
genre: inproceedings
issued:
  date-parts:
  - 2024
  - 12
  - 10
pdf: https://raw.githubusercontent.com/mlresearch/v262/main/assets/ali-sadraei-javaheri24a/ali-sadraei-javaheri24a.pdf
extras: []
# Format based on Martin Fenner's citeproc: https://blog.front-matter.io/posts/citeproc-yaml-for-bibliographies/
---
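To make the reparameterization described in this abstract concrete, here is a rough sketch of a soft prompt parameterized as a superposition of frozen vocabulary embeddings. The basis size of 128, the random choice of basis tokens, and the softmax mixing are illustrative assumptions, not the paper's exact construction.

```python
import torch
import torch.nn as nn


class SuperpositionPrompt(nn.Module):
    """Each soft-prompt vector is a learned mixture ("superposition") of frozen,
    pre-trained vocabulary embeddings; only the mixing weights are trained."""

    def __init__(self, vocab_embeddings: torch.Tensor, prompt_len: int = 10, n_basis: int = 128):
        super().__init__()
        vocab_size, d_model = vocab_embeddings.shape
        # Fix a random basis of vocabulary embeddings and keep it frozen.
        idx = torch.randperm(vocab_size)[:n_basis]
        self.register_buffer("basis", vocab_embeddings[idx].detach())  # (n_basis, d_model)
        self.weights = nn.Parameter(torch.zeros(prompt_len, n_basis))  # trainable mixing weights

    def forward(self) -> torch.Tensor:
        # (prompt_len, n_basis) @ (n_basis, d_model) -> (prompt_len, d_model)
        return torch.softmax(self.weights, dim=-1) @ self.basis
```

The resulting `(prompt_len, d_model)` tensor would be prepended to the input embeddings of the frozen language model, so gradients flow only into `weights`.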
_posts/2024-12-10-alizadeh-vahid24a.md
@@ -0,0 +1,73 @@
---
title: 'Duo-LLM: A Framework for Studying Adaptive Computation in Large Language Models'
section: Inference
abstract: 'Large Language Models (LLMs) typically generate outputs token by token
  using a fixed compute budget, leading to inefficient resource utilization. To address
  this shortcoming, recent advancements in mixture of experts (MoE) models, speculative
  decoding, and early exit strategies leverage the insight that computational demands
  can vary significantly based on the complexity and nature of the input. However,
  identifying optimal routing patterns for dynamic execution remains an open challenge,
  limiting the full potential of these adaptive methods. To address this need, we
  study adaptive computation in LLMs more systematically. We propose a novel framework
  that integrates smaller auxiliary modules within each Feed-Forward Network layer
  of the LLM. This design enables dynamic routing of tokens based on task complexity:
  tokens can be processed by either the small or big modules at each layer, or even
  bypass certain layers entirely. This allows us to introduce a novel notion of a
  token’s difficulty, defined by its potential to benefit from additional computational
  resources. Importantly, by employing oracles to identify optimal patterns of adaptive
  computations, we gain valuable insights into the internal workings of LLMs and the
  routing processes in a simplified heterogeneous MoE setup. We show that trained
  routers operate differently from oracles and often yield suboptimal solutions. Notably,
  activating a large module in just one layer outperforms models that use large modules
  across all layers, underscoring the gap between practical implementations of routing
  in MoE models and theoretical optima for adaptive computation.'
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: alizadeh-vahid24a
month: 0
tex_title: "{Duo-LLM}: A Framework for Studying Adaptive Computation in Large Language
  Models"
firstpage: 443
lastpage: 455
page: 443-455
order: 443
cycles: false
bibtex_author: Alizadeh-Vahid, Keivan and Iman Mirzadeh, Seyed and Shahrkokhi, Hooman
  and Belenko, Dmitry and Sun, Frank and Cho, Minsik and Hossein Sekhavat, Mohammad
  and Nabi, Moin and Farajtabar, Mehrdad
author:
- given: Keivan
  family: Alizadeh-Vahid
- given: Seyed
  family: Iman Mirzadeh
- given: Hooman
  family: Shahrkokhi
- given: Dmitry
  family: Belenko
- given: Frank
  family: Sun
- given: Minsik
  family: Cho
- given: Mohammad
  family: Hossein Sekhavat
- given: Moin
  family: Nabi
- given: Mehrdad
  family: Farajtabar
date: 2024-12-10
address:
container-title: Proceedings of The 4th NeurIPS Efficient Natural Language and Speech
  Processing Workshop
volume: '262'
genre: inproceedings
issued:
  date-parts:
  - 2024
  - 12
  - 10
pdf: https://raw.githubusercontent.com/mlresearch/v262/main/assets/alizadeh-vahid24a/alizadeh-vahid24a.pdf
extras: []
# Format based on Martin Fenner's citeproc: https://blog.front-matter.io/posts/citeproc-yaml-for-bibliographies/
---
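The framework described in this abstract pairs each feed-forward layer with a smaller auxiliary module and routes tokens between them, or around the layer entirely. A toy version of such a layer is sketched below; the module widths, the three-way router, the hard argmax routing, and the residual connections are assumptions made for illustration rather than the authors' implementation.

```python
import torch
import torch.nn as nn


class AdaptiveFFNLayer(nn.Module):
    """Toy layer with a small and a big feed-forward module plus a per-token
    router choosing between {bypass, small, big}."""

    def __init__(self, d_model: int, d_small: int = 256, d_big: int = 2048):
        super().__init__()
        self.small = nn.Sequential(nn.Linear(d_model, d_small), nn.GELU(), nn.Linear(d_small, d_model))
        self.big = nn.Sequential(nn.Linear(d_model, d_big), nn.GELU(), nn.Linear(d_big, d_model))
        self.router = nn.Linear(d_model, 3)  # logits for bypass / small / big

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model); hard routing is used here purely for illustration.
        choice = self.router(x).argmax(dim=-1)   # (batch, seq)
        out = x.clone()                          # choice 0: token bypasses the layer
        small_mask, big_mask = choice == 1, choice == 2
        out[small_mask] = x[small_mask] + self.small(x[small_mask])
        out[big_mask] = x[big_mask] + self.big(x[big_mask])
        return out
```

An oracle router, in the spirit of the analysis the abstract mentions, would replace the learned `argmax` decision with whichever choice most benefits each token.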
_posts/2024-12-10-ardestani24a.md
@@ -0,0 +1,45 @@
---
title: Text Summarization With Graph Attention Networks
section: Applications
abstract: This study aimed to leverage graph information, particularly Rhetorical
  Structure Theory (RST) and Co-reference (Coref) graphs, to enhance the performance
  of our baseline summarization models. Specifically, we experimented with a Graph
  Attention Network architecture to incorporate graph information. However, this architecture
  did not enhance the performance. Subsequently, we used a simple Multi-layer Perceptron
  architecture, which improved the results in our proposed model on our primary dataset,
  CNN/DM. Additionally, we annotated the XSum dataset with RST graph information, establishing
  a benchmark for future graph-based summarization models. This secondary dataset
  posed multiple challenges, revealing both the merits and limitations of our models.
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: ardestani24a
month: 0
tex_title: Text Summarization With Graph Attention Networks
firstpage: 540
lastpage: 553
page: 540-553
order: 540
cycles: false
bibtex_author: Ardestani, Mohammadreza and Chali, Yllias
author:
- given: Mohammadreza
  family: Ardestani
- given: Yllias
  family: Chali
date: 2024-12-10
address:
container-title: Proceedings of The 4th NeurIPS Efficient Natural Language and Speech
  Processing Workshop
volume: '262'
genre: inproceedings
issued:
  date-parts:
  - 2024
  - 12
  - 10
pdf: https://raw.githubusercontent.com/mlresearch/v262/main/assets/ardestani24a/ardestani24a.pdf
extras: []
# Format based on Martin Fenner's citeproc: https://blog.front-matter.io/posts/citeproc-yaml-for-bibliographies/
---
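For context on the graph component mentioned in this abstract, a generic single-head graph-attention layer over sentence nodes (with edges taken from, for example, an RST or coreference graph) is sketched below. This is a standard GAT-style layer shown for illustration only, not the authors' architecture; the self-loop trick and the LeakyReLU scoring are generic choices, and the study reports that this kind of layer did not improve over their baseline.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SentenceGraphAttention(nn.Module):
    """Single-head graph attention over sentence representations connected by
    discourse (e.g. RST) or coreference edges."""

    def __init__(self, d_model: int):
        super().__init__()
        self.proj = nn.Linear(d_model, d_model, bias=False)
        self.score = nn.Linear(2 * d_model, 1, bias=False)

    def forward(self, nodes: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # nodes: (n_sent, d_model); adj: (n_sent, n_sent), nonzero where an edge exists.
        n = nodes.size(0)
        adj = adj + torch.eye(n, dtype=adj.dtype, device=adj.device)  # self-loops avoid empty rows
        h = self.proj(nodes)
        pairs = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                           h.unsqueeze(0).expand(n, n, -1)], dim=-1)
        scores = F.leaky_relu(self.score(pairs).squeeze(-1))          # (n_sent, n_sent)
        scores = scores.masked_fill(adj == 0, float("-inf"))
        weights = torch.softmax(scores, dim=-1)
        return weights @ h  # graph-contextualized sentence representations
```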