Continual Joint Multiple Events Extraction via Adaptive Bias Mixing

A paper I am working on

Model Architecture

Continual event extraction is of practical utility in natural language processing. In the real world, it is common to encounter novel event types or data sources, on which the model needs to adapt quickly without forgetting knowledge of old tasks. Existing work on continual event extraction either always reuses existing parameters to learn new tasks, or blindly adds new parameters for every new task, incurring significant computational cost while preventing potential sharing of knowledge between tasks. To get the best of both worlds, in this work we propose continual joint event extraction with adaptive bias mixing, which adapts the model to incoming tasks in a parameter-efficient manner. We also incorporate metric learning to construct a prototypical network for maximum parameter efficiency. Experimental results on the ACE2005 dataset show that our framework retains baseline performance with a significantly smaller parameter size.

Visualization of mixed bias composition
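
The mixing mechanism itself is not spelled out in this README, so the following is only a minimal sketch of what a BitFit-style linear layer with an adaptive mixture of per-task bias vectors could look like. The class and parameter names (`MixedBiasLinear`, `task_biases`, `mix_logits`, `num_tasks`) are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedBiasLinear(nn.Module):
    """Illustrative BitFit-style layer: the weight is frozen and the effective
    bias is an adaptively weighted mixture of per-task bias vectors."""

    def __init__(self, in_features: int, out_features: int, num_tasks: int):
        super().__init__()
        weight = torch.empty(out_features, in_features)
        nn.init.xavier_uniform_(weight)
        # Frozen backbone weight: only biases and mixing weights are trained.
        self.register_buffer("weight", weight)
        # One bias vector per task seen so far.
        self.task_biases = nn.Parameter(torch.zeros(num_tasks, out_features))
        # Learnable mixing logits; softmax turns them into mixture weights.
        self.mix_logits = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        mix = F.softmax(self.mix_logits, dim=0)   # (num_tasks,)
        bias = mix @ self.task_biases             # (out_features,)
        return F.linear(x, self.weight, bias)

# Toy usage: mix biases from 3 tasks inside a single linear layer.
layer = MixedBiasLinear(in_features=768, out_features=768, num_tasks=3)
out = layer(torch.randn(2, 768))
```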

Data

I used the ACE2005 dataset. If you want to run your own datasets, please follow their guidelines to prepare the data.

Environment

Requirements.txt provides the dependencies. All experiments in the paper were run on a 2080 Ti with PyTorch 1.9.

(Note that the authors themselves observe that the same code may produce different numbers on different devices/library versions, and that the findings in the paper still hold.)

Note that the folder mytransformers contains version 2.0 of adapter-transformers (aka AdapterHub). We added some necessary functions to support our framework.
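
As a quick sanity check of the environment (not part of the repository), you can verify the PyTorch version and GPU visibility before running any scripts:

```python
import torch

# Simple environment check; the paper's experiments used PyTorch 1.9 on a 2080 Ti.
print("torch:", torch.__version__)
print("cuda available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
```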

Setup

  1. Create the following two directories wherever you want (you can name them arbitrarily):
    • data directory: where the dataset will be loaded by the model.
    • model directory: where the model dumps its outputs.
  2. Download the dataset using the link in the prior work's repo.
  3. Set up the env file (a sketch is given after this list).
  4. Install pyrouge manually; you might find this link useful.
  5. Set up other necessary custom configs.
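
As an illustration of steps 1 and 3, here is a small sketch; the directory names and env-file keys (`DATA_DIR`, `MODEL_DIR`) are assumptions on our part, not values prescribed by the repository.

```python
from pathlib import Path

# Hypothetical directory names; pick whatever locations you like.
data_dir = Path("./ace2005_data")    # where the dataset will be loaded from
model_dir = Path("./model_outputs")  # where the model dumps its outputs
data_dir.mkdir(parents=True, exist_ok=True)
model_dir.mkdir(parents=True, exist_ok=True)

# Hypothetical env file pointing the code at the two directories;
# the actual keys expected by the code may differ.
Path("env").write_text(
    f"DATA_DIR={data_dir.resolve()}\n"
    f"MODEL_DIR={model_dir.resolve()}\n"
)
```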

Training and Testing

  1. Follow the guidelines in the prior works' repos: LAMOL and L2KD.
  2. We provide example runs in LAMOL.sh and LAMOL_myadaptor.sh.
  3. We also document the different hyperparameters in LAMOL.sh and LAMOL_myadaptor.sh.

Tips from the authors

  1. We add many args in settings.py and settings_myadaptor.py. Many of them are not used in our paper, though they may still appear somewhere in the code (without any effect). For the args we added on top of the original LAMOL implementation that are actually used, we provide help strings so you can tell what each one does.
  2. Our code is based on two prior repos: (i) LAMOL and its follow-up work (LAMOL and L2KD), and (ii) adapter-transformers 2.0. Here are some suggestions for understanding our code:
    1. For the training and testing logic, which follows the pattern of LAMOL, start by reading the LAMOL code.
    2. For how to add and use adapter modules, start by reading the AdapterHub source code to get a basic understanding of how adapters are added and set as trainable; a minimal illustration of that API pattern follows below.
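
As a minimal illustration of the upstream AdapterHub (adapter-transformers 2.0) API pattern, using a stock BERT backbone purely as an example; this is not the repository's modified mytransformers code:

```python
# Requires the adapter-transformers package, which is imported as `transformers`.
from transformers import BertModel

model = BertModel.from_pretrained("bert-base-uncased")

# Register a new adapter for an incoming task.
model.add_adapter("task_new")

# Freeze the backbone and mark only this adapter's parameters as trainable.
model.train_adapter("task_new")

# Activate the adapter so it is used in the forward pass.
model.set_active_adapters("task_new")
```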

Acknowledgement

  • We adapt the code of LAMOL and L2KD. Huge thanks to these open-source prior works!!!
  • We adapt the code of AdapterHub (version 2.0). Huge thanks!!!

(Copied from their acknowledgements as follows:)

  • We use the language model offered by transformers, a state-of-the-art natural language processing library by Thomas Wolf et al.
  • The implementation of MAS follows MAS-Memory-Aware-Synapses, the Memory Aware Synapses method implementation code by Aljundi R. et al.
  • The implementation of GEM follows GradientEpisodicMemory, the Gradient Episodic Memory method implementation code by Lopez-Paz, David et al.
  • The implementation of fp16 (fp16.py, fp16util.py) is from Megatron-LM, the ongoing research on training transformer language models at scale by NVIDIA.
  • Data format conversion refers to decaNLP, the Natural Language Decathlon: Multitask Learning as Question Answering implementation code by Bryan McCann et al.

Citation

Yet to be published

Questions

If you have any questions about our paper and code, please contact Lu Chengeng via [email protected].
