Releases: FlagOpen/FlagEmbedding
v1.3.4
What's Changed
- Inference docstring by @ZiyiXia in #1186
- delete useless parameters for embedder classes by @hanhainebula in #1189
- Bug of BGE M3 training by @baochi0212 in #1183
- feat:add bce-embedding-base_v1 by @zhudongwork in #1198
- Docstring by @ZiyiXia in #1200
- Update AbsDataset.py by @jhyeom1545 in #1204
- Fix bugs by @hanhainebula in #1211
- fixed a bug in AbsReranker.py for mps device support by @Swgj in #1216
- Fix bugs by @hanhainebula in #1219
- update stop pool by @545999961 in #1221
- update mteb eval by @545999961 in #1227
- update adjust batch size by @545999961 in #1229
- update mteb eval by @545999961 in #1230
- fix bugs and refactor code by @hanhainebula in #1231
- update mteb eval by @545999961 in #1235
- release training data for bge-multilingual-gemma2 by @hanhainebula in #1245
- add missed trust_remote_code for finetune code by @hanhainebula in #1248
- fix DecoderOnlyEmbedderICLSameDatasetTrainDataset category index error by @billvsme in #1232
- Clean code by @hanhainebula in #1250
- Fix bugs by @hanhainebula in #1253
- update examples by @545999961 in #1254
- update examples by @545999961 in #1255
- Fix air-bench eval bugs: AIRBenchEvalArgs by @hanhainebula in #1256
- Fix air-bench eval bugs: AIRBenchEvalArgs by @hanhainebula in #1257
- update code and README for scripts by @hanhainebula in #1258
- update examples by @545999961 in #1261
- update `C_MTEB` reference by @emmanuel-ferdman in #1296
- [Bugfix] Typehint error on py38 by @DrDavidS in #1300
- Update model_mapping.py by @pengjunfeng11 in #1311
- fix bugs for embedder finetune by @hanhainebula in #1328
- fix a bug in icl/dataset.py by @hanhainebula in #1330
- Fix bugs by @hanhainebula in #1340
- fix beir data_loader.py: dev -> validation by @hanhainebula in #1341
- update embedder finetune code by @hanhainebula in #1342
- Fix Bug: OOM by @545999961 in #1349
- fix transformers 4.48.0 by @Hypothesis-Z in #1343
- Fix a bug in beir evaluation and release v1.3.4 by @hanhainebula in #1359
- del dp code by @hanhainebula in #1360
- support musa backend in FlagEmbedding by @qiyulei-mt in #1350
- docs: fix link to https://bge-model.com/ within NEWS section by @bufferoverflow in #1355
- fix/reranking tutorial typos by @rendyfebry in #1313
New Contributors
- @baochi0212 made their first contribution in #1183
- @zhudongwork made their first contribution in #1198
- @jhyeom1545 made their first contribution in #1204
- @Swgj made their first contribution in #1216
- @billvsme made their first contribution in #1232
- @emmanuel-ferdman made their first contribution in #1296
- @DrDavidS made their first contribution in #1300
- @pengjunfeng11 made their first contribution in #1311
- @Hypothesis-Z made their first contribution in #1343
- @qiyulei-mt made their first contribution in #1350
- @bufferoverflow made their first contribution in #1355
- @rendyfebry made their first contribution in #1313
Full Changelog: v1.3.2-BGE-Update...v1.3.4
1.3.2
We have completely updated the BGE code repository, including the following key improvements:
Inference Code
- Added `FlagAutoModel` and `FlagAutoReranker`, making it easier to use the models.
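A minimal usage sketch, assuming the `FlagAutoModel.from_finetuned` entry point and a BGE checkpoint such as `BAAI/bge-base-en-v1.5` (the reranker side works analogously through `FlagAutoReranker.from_finetuned`):

```python
from FlagEmbedding import FlagAutoModel

# FlagAutoModel resolves the right embedder class for the given checkpoint.
model = FlagAutoModel.from_finetuned(
    "BAAI/bge-base-en-v1.5",
    query_instruction_for_retrieval="Represent this sentence for searching relevant passages:",
    use_fp16=True,  # halves memory at a small precision cost
)

queries = ["What is BGE?"]
passages = ["BGE is a family of embedding models released by BAAI."]

q_emb = model.encode_queries(queries)   # the instruction is prepended to queries
p_emb = model.encode_corpus(passages)
print(q_emb @ p_emb.T)                  # similarity scores (embeddings are normalized)
```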
Inference Optimization
- Implemented multi-GPU support.
- Introduced dynamic batch sizing to prevent out-of-memory (OOM) issues.
- Optimized padding to improve efficiency.
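A hedged sketch of how these optimizations surface at the call site; the `devices` argument and the exact dynamic-batching behavior are assumptions based on the new embedder interface, not a documented contract:

```python
from FlagEmbedding import FlagAutoModel

corpus = [f"passage {i}: FlagEmbedding now supports multi-GPU encoding." for i in range(1000)]

# Multi-GPU inference: pass a list of devices (assumed `devices` parameter);
# encoding work is distributed across them.
model = FlagAutoModel.from_finetuned(
    "BAAI/bge-base-en-v1.5",
    devices=["cuda:0", "cuda:1"],
)

# `batch_size` acts as an upper bound: with dynamic batch sizing the library
# can shrink the effective batch when long inputs would otherwise cause OOM.
embeddings = model.encode_corpus(corpus, batch_size=256, max_length=512)
print(embeddings.shape)
```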
Evaluation Code
- Integrated support for common evaluation datasets to enhance user convenience.
- Provided a custom evaluation interface: datasets organized according to the specified data standard can be evaluated directly, simplifying the evaluation process.
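Purely as illustration of what a data organization standard of this kind looks like — the file names and JSON fields below are assumptions, not the documented layout; consult the evaluation README for the actual standard:

```python
import json
import pathlib

# Hypothetical layout: one JSONL file each for the corpus, the queries, and
# the relevance judgments. Names and fields are illustrative assumptions only.
data_dir = pathlib.Path("my_eval_dataset")
data_dir.mkdir(exist_ok=True)

(data_dir / "corpus.jsonl").write_text(
    json.dumps({"id": "doc-0", "text": "BGE is an embedding model series from BAAI."}) + "\n"
)
(data_dir / "queries.jsonl").write_text(
    json.dumps({"id": "q-0", "text": "what is bge"}) + "\n"
)
(data_dir / "qrels.jsonl").write_text(
    json.dumps({"qid": "q-0", "docid": "doc-0", "relevance": 1}) + "\n"
)
```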
Project Structure Organization
- Reorganized the project to streamline processes related to inference, fine-tuning, and evaluation.
Release BGE-M3 and Activation Beacon
BGE-M3
A new member of the BGE model series! BGE-M3 stands for Multi-Linguality, Multi-Granularity (input length up to 8192), and Multi-Functionality (unification of dense, lexical, and multi-vector retrieval). It is the first embedding model that supports all three retrieval methods.
For more details, please refer to the Technical Report and Code.
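A minimal sketch of requesting all three representations at once, assuming the `BGEM3FlagModel` wrapper and the `BAAI/bge-m3` checkpoint:

```python
from FlagEmbedding import BGEM3FlagModel

model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=True)

output = model.encode(
    ["BGE-M3 unifies dense, lexical, and multi-vector retrieval."],
    return_dense=True,         # one vector per sentence (dense retrieval)
    return_sparse=True,        # token-to-weight map (lexical matching)
    return_colbert_vecs=True,  # per-token vectors (multi-vector scoring)
)

print(output["dense_vecs"].shape)       # dense embeddings
print(output["lexical_weights"][0])     # sparse lexical weights
print(output["colbert_vecs"][0].shape)  # multi-vector representation
```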
Activation Beacon
An effective, efficient, compatible, and low-cost (to train) method to extend the context length of LLMs by up to 100x. We extend the context length of Llama-2-chat-7b from 4K to 400K.
For more details, please refer to the paper and code.
Feedback is welcome.
Release LM-Cocktail
LM-Cocktail
Merge language models (e.g., Llama, BGE) to improve the general ability of models.
This method can be used to:
- Mitigate the problem of catastrophic forgetting
- Improve performance on new tasks without fine-tuning
- Approximate multi-task learning or model ensembling
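A minimal sketch of weighted model merging, assuming the `mix_models` entry point of the `LM_Cocktail` package; the model paths and weights are illustrative:

```python
from LM_Cocktail import mix_models

# Merge a base embedding model with a fine-tuned variant by weighted
# parameter averaging; the weights are illustrative and should sum to 1.
model = mix_models(
    model_names_or_paths=["BAAI/bge-base-en-v1.5", "./my-finetuned-bge"],
    model_type="encoder",         # "decoder" would target Llama-style models
    weights=[0.5, 0.5],
    output_path="./mixed_model",  # merged weights are saved here
)
```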
FlagEmbedding 1.1.2
Created the first release (#131).
FlagEmbedding
- Updated embedding models `bge-*-v1.5`:
  - alleviate the issue of the similarity distribution
  - the new models can perform retrieval without an instruction; using an instruction is still recommended, as it can yield better performance
- New models `bge-reranker-*`: cross-encoders that can rerank the top-k retrieved results
- Normalization is now specified in the sentence-transformers configuration, thanks to @skirres. Users no longer need to set `normalize_embeddings=True` manually when using sentence-transformers.
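A minimal reranking sketch, assuming the `FlagReranker` class and the `BAAI/bge-reranker-large` checkpoint:

```python
from FlagEmbedding import FlagReranker

reranker = FlagReranker("BAAI/bge-reranker-large", use_fp16=True)

# A cross-encoder scores each (query, passage) pair jointly;
# a higher score means the passage is more relevant to the query.
pairs = [
    ["what is a panda?", "The giant panda is a bear species endemic to China."],
    ["what is a panda?", "pandas is a Python library for data analysis."],
]
scores = reranker.compute_score(pairs)
print(scores)  # sort the top-k candidates by these scores to rerank
```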
C-MTEB
- Added two cross-lingual reranking tasks: T2RerankingZh2En and T2RerankingEn2Zh.
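A hedged sketch of evaluating on the new tasks via the `mteb` package; that importing `C_MTEB` registers these tasks with `mteb`, and the model choice, are assumptions:

```python
from mteb import MTEB
from sentence_transformers import SentenceTransformer

import C_MTEB  # assumed to register the Chinese/cross-lingual tasks with mteb

model = SentenceTransformer("BAAI/bge-base-zh-v1.5")  # illustrative model choice

# Evaluate on the two new cross-lingual reranking tasks.
evaluation = MTEB(tasks=["T2RerankingZh2En", "T2RerankingEn2Zh"])
evaluation.run(model, output_folder="results/bge-base-zh-v1.5")
```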