The results differ from those presented in the technical report #6

Annapreto23 opened this issue Jan 8, 2025 · 0 comments

Hello! Thank you for this valuable work! A month ago I used it to test my chunker and successfully reproduced the results presented in the technical report. However, when I reran the same tests this week, the results were significantly lower. I checked the repository and nothing seems to have changed. Did I do something wrong, or has one of the packages used in the benchmark been updated in a way that could affect the results?

For example, for retrieve = 5, with only Chatlogs.md and with all-MiniLM-L6-v2, here is my script and the results it now produces:

from chunking_evaluation import GeneralEvaluation
from chromadb.utils import embedding_functions
from chunking_evaluation.chunking import RecursiveTokenChunker
from chunking_evaluation.utils import openai_token_count

# Instantiate the custom chunker and evaluation
chunker = RecursiveTokenChunker(chunk_size=250, chunk_overlap=125, length_function=openai_token_count)
evaluation = GeneralEvaluation()

# Choose embedding function (all-MiniLM-L6-v2 here, loaded from a local path)
modelPath: str = "/home/apreto/sentencetransformers/"
model_kwargs: dict = {'device': 'cpu'}
default_ef = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name=modelPath,
    model_kwargs=model_kwargs,
)


# Evaluate the chunker
results = evaluation.run(chunker, default_ef, retrieve=5)

print(results)
# {'iou_mean': 0.00855291792458815, 'iou_std': 0.021362467789861218,
#  'recall_mean': 0.12840199977367153, 'recall_std': 0.32191636533879947}
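One way to check whether a silent dependency update caused the drop is to snapshot the installed versions of the benchmark's key dependencies and compare them between the run that reproduced the report and the run that did not. The sketch below is a hypothetical helper, not part of the benchmark; the package list is my assumption about what the evaluation pulls in.

```python
# Hypothetical helper: record the versions of suspected dependencies so two
# benchmark runs can be diffed for version drift. The package names passed in
# are assumptions, not a list maintained by the benchmark itself.
from importlib.metadata import version, PackageNotFoundError

def dependency_versions(packages):
    """Return {distribution_name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = version(name)
        except PackageNotFoundError:
            versions[name] = None  # package absent in this environment
    return versions

if __name__ == "__main__":
    # Save this output alongside each benchmark run, then diff the two.
    print(dependency_versions(["chromadb", "sentence-transformers", "chunking-evaluation"]))
```

If any version differs between the two snapshots, pinning the older version (e.g. `pip install chromadb==<old version>`) and rerunning the evaluation would confirm or rule out that package as the cause.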