Commit

Add pages for volume v262
lawrennd committed Dec 10, 2024
0 parents commit a97ddfa
Showing 51 changed files with 3,242 additions and 0 deletions.
15 changes: 15 additions & 0 deletions Gemfile
@@ -0,0 +1,15 @@
source "https://rubygems.org"

git_source(:github) {|repo_name| "https://github.com/#{repo_name}" }

gem 'jekyll'

group :jekyll_plugins do
  gem 'github-pages'          # GitHub Pages dependency set used to build the site
  gem 'jekyll-remote-theme'   # enables the remote_theme setting in _config.yml
  gem 'jekyll-include-cache'  # caches rendered includes for faster builds
  gem 'webrick'               # needed for `jekyll serve` on Ruby 3.0 and later
end

# gem "rails"

28 changes: 28 additions & 0 deletions README.md
@@ -0,0 +1,28 @@
# PMLR 262

To suggest fixes to this volume, please make a pull request containing the requested changes and a justification for them.

To edit the details of this conference, edit the [_config.yml](./_config.yml) file and submit a pull request.

To make changes to the individual paper details, edit the associated paper file in the [./_posts](./_posts) subdirectory.
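
As an illustrative sketch (not a complete file), the YAML front matter of a paper file looks roughly like the following; the field names are taken from the files in this volume, but the values shown here are placeholders:

```yaml
---
title: 'Paper Title Here'      # placeholder title
section: Inference             # appears to correspond to a section name in _config.yml
firstpage: 1
lastpage: 10
page: 1-10
author:
- given: First
  family: Author
pdf: https://raw.githubusercontent.com/mlresearch/v262/main/assets/<paper-id>/<paper-id>.pdf
---
```

The complete field set (abstract, bibtex_author, issued date, and so on) can be seen in any of the existing files under [./_posts](./_posts).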

For details of how to publish in PMLR, please check https://proceedings.mlr.press/faq.html

For details of what is required to submit a proceedings, please check https://proceedings.mlr.press/spec.html



Published as Volume 262 by the Proceedings of Machine Learning Research on 10 December 2024.

Volume Edited by:
* Mehdi Rezagholizadeh
* Peyman Passban
* Soheila Samiee
* Vahid Partovi Nia
* Yu Cheng
* Yue Deng
* Qun Liu
* Boxing Chen

Series Editors:
* Neil D. Lawrence
110 changes: 110 additions & 0 deletions _config.yml
@@ -0,0 +1,110 @@
---
booktitle: Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing
Workshop
shortname: ENLSP-IV 2024
sections:
- name: Training
title: Training
- name: Model Design \& Architecture
title: Model Design \& Architecture
- name: Model Efficiency \& Compression
title: Model Efficiency \& Compression
- name: Inference
title: Inference
- name: " Benchmark \\& Evaluation"
title: " Benchmark \\& Evaluation"
- name: 'Applications '
title: 'Applications '
volume: '262'
year: '2024'
start: &1 2024-12-14
end: 2024-12-14
published: 2024-12-10
layout: proceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: ENLSP-2024
month: 0
cycles: false
bibtex_editor: Rezagholizadeh, Mehdi and Passban, Peyman and Samiee, Soheila and Partovi
Nia, Vahid and Cheng, Yu and Deng, Yue and Liu, Qun and Chen, Boxing
editor:
- given: Mehdi
family: Rezagholizadeh
- given: Peyman
family: Passban
- given: Soheila
family: Samiee
- given: Vahid
family: Partovi Nia
- given: Yu
family: Cheng
- given: Yue
family: Deng
- given: Qun
family: Liu
- given: Boxing
family: Chen
title: Proceedings of Machine Learning Research
description: |
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop
Held in Vancouver, British Columbia, Canada on 14 December 2024
Published as Volume 262 by the Proceedings of Machine Learning Research on 10 December 2024.
Volume Edited by:
Mehdi Rezagholizadeh
Peyman Passban
Soheila Samiee
Vahid Partovi Nia
Yu Cheng
Yue Deng
Qun Liu
Boxing Chen
Series Editors:
Neil D. Lawrence
date_str: 14 Dec
url: https://proceedings.mlr.press
author:
name: PMLR
baseurl: "/v262"
twitter_username: MLResearchPress
github_username: mlresearch
markdown: kramdown
exclude:
- README.md
- Gemfile
- ".gitignore"
plugins:
- jekyll-feed
- jekyll-seo-tag
- jekyll-remote-theme
remote_theme: mlresearch/jekyll-theme
style: pmlr
permalink: "/:title.html"
ghub:
edit: true
repository: v262
display:
copy_button:
bibtex: true
endnote: true
apa: true
comments: false
volume_type: Volume
volume_dir: v262
email: ''
conference:
name: NeurIPS Efficient Natural Language and Speech Processing Workshop
url: https://neurips2024-enlsp.github.io/
location: Vancouver, British Columbia, Canada
dates:
- *1
analytics:
google:
tracking_id: UA-92432422-1
orig_bibfile: "/Users/neil/mlresearch/v262/enlsp24.bib"
# Site settings
# Original source: /Users/neil/mlresearch/v262/enlsp24.bib
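
One detail in the file above that is easy to misread: `start: &1 2024-12-14` attaches a YAML anchor (here named `1`) to the start date, and the `- *1` entry under `conference.dates` is an alias that resolves to that same date, so the workshop date only has to be written once. A minimal sketch of the same mechanism, using an illustrative anchor name rather than the auto-generated `1`:

```yaml
start: &start_date 2024-12-14   # anchor the date under the name "start_date"
conference:
  dates:
  - *start_date                 # alias; the parser expands this to 2024-12-14
```

Parsers that support anchors and aliases expand the alias at load time, so the loaded `dates` list contains the actual date value; the loader used for these volumes evidently supports this, since the generated configuration relies on it.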
60 changes: 60 additions & 0 deletions _posts/2024-12-10-agrawal24a.md
@@ -0,0 +1,60 @@
---
title: 'AdaEDL: Early Draft Stopping for Speculative Decoding of Large Language Models
via an Entropy-based Lower Bound on Token Acceptance Probability'
section: Inference
abstract: 'Speculative decoding is a powerful technique that attempts to circumvent
the autoregressive constraint of modern Large Language Models (LLMs). The aim of
speculative decoding techniques is to improve the average inference time of a large,
target model without sacrificing its accuracy, by using a more efficient draft model
to propose draft tokens which are then verified in parallel. The number of draft
tokens produced in each drafting round is referred to as the draft length and is
often a static hyperparameter chosen based on the acceptance rate statistics of
the draft tokens. However, setting a static draft length can negatively impact performance,
especially in scenarios where drafting is expensive and there is a high variance
in the number of tokens accepted. Adaptive Entropy-based Draft Length (AdaEDL) is
a simple, training- and parameter-free criterion that allows for early stopping of
the token drafting process by approximating a lower bound on the expected acceptance
probability of the drafted token based on the currently observed entropy of the
drafted logits. We show that AdaEDL consistently outperforms static draft-length
speculative decoding by 10%-57% as well as other training-free draft-stopping techniques
by up to 10% in a variety of settings and datasets. At the same time, we show that
AdaEDL is more robust than these techniques and preserves performance in high-sampling-temperature
scenarios. Since it is training-free, in contrast to techniques that rely on the
training of dataset-specific draft-stopping predictors, AdaEDL can seamlessly be
integrated into a variety of pre-existing LLM systems. '
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: agrawal24a
month: 0
tex_title: "{AdaEDL}: Early Draft Stopping for Speculative Decoding of Large Language
Models via an Entropy-based Lower Bound on Token Acceptance Probability"
firstpage: 355
lastpage: 369
page: 355-369
order: 355
cycles: false
bibtex_author: Agrawal, Sudhanshu and Jeon, Wonseok and Lee, Mingu
author:
- given: Sudhanshu
family: Agrawal
- given: Wonseok
family: Jeon
- given: Mingu
family: Lee
date: 2024-12-10
address:
container-title: Proceedings of The 4th NeurIPS Efficient Natural Language and Speech
Processing Workshop
volume: '262'
genre: inproceedings
issued:
date-parts:
- 2024
- 12
- 10
pdf: https://raw.githubusercontent.com/mlresearch/v262/main/assets/agrawal24a/agrawal24a.pdf
extras: []
# Format based on Martin Fenner's citeproc: https://blog.front-matter.io/posts/citeproc-yaml-for-bibliographies/
---
57 changes: 57 additions & 0 deletions _posts/2024-12-10-ali-sadraei-javaheri24a.md
@@ -0,0 +1,57 @@
---
title: 'SuperPos-Prompt: Enhancing Soft Prompt Tuning of Language Models with Superposition
of Multi Token Embeddings'
section: Training
abstract: 'Soft prompt tuning techniques have recently gained traction as an effective
strategy for the parameter-efficient tuning of pre-trained language models, particularly
minimizing the required adjustment of model parameters. Despite their growing use,
achieving optimal tuning with soft prompts, especially with smaller datasets, remains
a substantial challenge. This study makes two contributions in this domain: (i)
we introduce SuperPos-Prompt, a new reparameterization technique employing the superposition
of multiple pre-trained vocabulary embeddings to improve the learning of soft prompts.
Our experiments across several GLUE and SuperGLUE benchmarks consistently highlight
SuperPos-Prompt’s superiority over Residual Prompt tuning, exhibiting an average
score increase of +6.4 in T5-Small and +5.0 in T5-Base along with a faster convergence.
Remarkably, SuperPos-Prompt occasionally outperforms even full fine-tuning methods.
(ii) Additionally, we demonstrate enhanced performance and rapid convergence by
omitting dropouts from the frozen network, yielding consistent improvements across
various scenarios and tuning methods.'
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: ali-sadraei-javaheri24a
month: 0
tex_title: "{SuperPos-Prompt}: Enhancing Soft Prompt Tuning of Language Models with
Superposition of Multi Token Embeddings"
firstpage: 34
lastpage: 46
page: 34-46
order: 34
cycles: false
bibtex_author: Ali Sadraei Javaheri, Mohammad and Asgari, Ehsaneddin and C. McHardy,
Alice and R. Rabiee, Hamid
author:
- given: Mohammad
family: Ali Sadraei Javaheri
- given: Ehsaneddin
family: Asgari
- given: Alice
family: C. McHardy
- given: Hamid
family: R. Rabiee
date: 2024-12-10
address:
container-title: Proceedings of The 4th NeurIPS Efficient Natural Language and Speech
Processing Workshop
volume: '262'
genre: inproceedings
issued:
date-parts:
- 2024
- 12
- 10
pdf: https://raw.githubusercontent.com/mlresearch/v262/main/assets/ali-sadraei-javaheri24a/ali-sadraei-javaheri24a.pdf
extras: []
# Format based on Martin Fenner's citeproc: https://blog.front-matter.io/posts/citeproc-yaml-for-bibliographies/
---
73 changes: 73 additions & 0 deletions _posts/2024-12-10-alizadeh-vahid24a.md
@@ -0,0 +1,73 @@
---
title: 'Duo-LLM: A Framework for Studying Adaptive Computation in Large Language Models'
section: Inference
abstract: 'Large Language Models (LLMs) typically generate outputs token by token
using a fixed compute budget, leading to inefficient resource utilization. To address
this shortcoming, recent advancements in mixture-of-experts (MoE) models, speculative
decoding, and early exit strategies leverage the insight that computational demands
can vary significantly based on the complexity and nature of the input. However,
identifying optimal routing patterns for dynamic execution remains an open challenge,
limiting the full potential of these adaptive methods. To address this need, we
study adaptive computation in LLMs more systematically. We propose a novel framework
that integrates smaller auxiliary modules within each Feed-Forward Network layer
of the LLM. This design enables dynamic routing of tokens based on task complexity:
tokens can be processed by either the small or big modules at each layer, or even
bypass certain layers entirely. This allows us to introduce a novel notion of a
token’s difficulty, defined by its potential to benefit from additional computational
resources. Importantly, by employing oracles to identify optimal patterns of adaptive
computations, we gain valuable insights into the internal workings of LLMs and the
routing processes in a simplified heterogeneous MoE setup. We show that trained
routers operate differently from oracles and often yield suboptimal solutions. Notably,
activating a large module in just one layer outperforms models that use large modules
across all layers, underscoring the gap between practical implementations of routing
in MoE models and theoretical optima for adaptive computation.'
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: alizadeh-vahid24a
month: 0
tex_title: "{Duo-LLM}: A Framework for Studying Adaptive Computation in Large Language
Models"
firstpage: 443
lastpage: 455
page: 443-455
order: 443
cycles: false
bibtex_author: Alizadeh-Vahid, Keivan and Iman Mirzadeh, Seyed and Shahrkokhi, Hooman
and Belenko, Dmitry and Sun, Frank and Cho, Minsik and Hossein Sekhavat, Mohammad
and Nabi, Moin and Farajtabar, Mehrdad
author:
- given: Keivan
family: Alizadeh-Vahid
- given: Seyed
family: Iman Mirzadeh
- given: Hooman
family: Shahrkokhi
- given: Dmitry
family: Belenko
- given: Frank
family: Sun
- given: Minsik
family: Cho
- given: Mohammad
family: Hossein Sekhavat
- given: Moin
family: Nabi
- given: Mehrdad
family: Farajtabar
date: 2024-12-10
address:
container-title: Proceedings of The 4th NeurIPS Efficient Natural Language and Speech
Processing Workshop
volume: '262'
genre: inproceedings
issued:
date-parts:
- 2024
- 12
- 10
pdf: https://raw.githubusercontent.com/mlresearch/v262/main/assets/alizadeh-vahid24a/alizadeh-vahid24a.pdf
extras: []
# Format based on Martin Fenner's citeproc: https://blog.front-matter.io/posts/citeproc-yaml-for-bibliographies/
---
45 changes: 45 additions & 0 deletions _posts/2024-12-10-ardestani24a.md
@@ -0,0 +1,45 @@
---
title: Text Summarization With Graph Attention Networks
section: Applications
abstract: This study aimed to leverage graph information, particularly Rhetorical
Structure Theory (RST) and Co-reference (Coref) graphs, to enhance the performance
of our baseline summarization models. Specifically, we experimented with a Graph
Attention Network architecture to incorporate graph information. However, this architecture
did not enhance the performance. Subsequently, we used a simple Multi-layer Perceptron
architecture, which improved the results in our proposed model on our primary dataset,
CNN/DM. Additionally, we annotated the XSum dataset with RST graph information, establishing
a benchmark for future graph-based summarization models. This secondary dataset
posed multiple challenges, revealing both the merits and limitations of our models.
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: ardestani24a
month: 0
tex_title: Text Summarization With Graph Attention Networks
firstpage: 540
lastpage: 553
page: 540-553
order: 540
cycles: false
bibtex_author: Ardestani, Mohammadreza and Chali, Yllias
author:
- given: Mohammadreza
family: Ardestani
- given: Yllias
family: Chali
date: 2024-12-10
address:
container-title: Proceedings of The 4th NeurIPS Efficient Natural Language and Speech
Processing Workshop
volume: '262'
genre: inproceedings
issued:
date-parts:
- 2024
- 12
- 10
pdf: https://raw.githubusercontent.com/mlresearch/v262/main/assets/ardestani24a/ardestani24a.pdf
extras: []
# Format based on Martin Fenner's citeproc: https://blog.front-matter.io/posts/citeproc-yaml-for-bibliographies/
---
