datasets: add test to show hash collisions
Bug 7209
inashivb committed Nov 20, 2024
1 parent 760b402 commit cef77c3
Showing 4 changed files with 65,570 additions and 0 deletions.
21 changes: 21 additions & 0 deletions tests/dataset-hash-collisions/README.md
@@ -0,0 +1,21 @@
Test Description
================

Datasets use a static DJB2 hash function to hash all dataset types. The hashed
entries are stored in a THash table, which has no randomization in place. As a
result, the hash table can be exploited: with enough colliding keys, an operation
degrades to its worst case of O(n), where n is the total number of entries in the
table, because of excessive chaining in a single row.
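
To make the collision problem concrete, here is a minimal sketch in Python. It
assumes the classic 32-bit DJB2 variant (`h = h * 33 + c`, seeded with 5381); the
exact variant used by the datasets code is not shown in this README, so treat
`djb2` and `colliding_keys` as illustrative names only. It relies on the fact that
the two-byte strings `Ez` and `FY` hash identically under this variant, so any
concatenation of those blocks collides as well.

```python
from itertools import product


def djb2(data: bytes) -> int:
    """Classic DJB2 (assumed variant): h = h * 33 + byte, truncated to 32 bits."""
    h = 5381
    for b in data:
        h = (h * 33 + b) & 0xFFFFFFFF
    return h


def colliding_keys(blocks: int) -> list:
    """Build 2**blocks distinct keys from the equal-hash blocks b"Ez" and b"FY"."""
    return [b"".join(parts) for parts in product((b"Ez", b"FY"), repeat=blocks)]


keys = colliding_keys(10)            # 1024 distinct keys
hashes = {djb2(k) for k in keys}     # set of all hash values produced
print(len(keys), "keys ->", len(hashes), "distinct hash value(s)")
```

Because the hash has no per-table seed in this sketch, the same key set collides
on every run, which is what makes such input reusable by an attacker.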

The test shows that the THash API takes excessive time to load the datasets from the
file because many of them evaluate to the exact same hash, and this is not even the
worst case scenario. With a bigger dataset and lower system specs or fewer available
resources, this can be worse. Note that it is not just about the number of datasets,
as a test that loads 1m+ datasets already exists.
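
The difference between "many entries" and "many colliding entries" can be sketched
with a toy chained hash table. This is not the THash implementation: `load`, the
bucket count, and the key sets are hypothetical, and the hash is the assumed
classic 32-bit DJB2 variant. The sketch counts key comparisons instead of
measuring time, so the result is deterministic.

```python
from itertools import product


def djb2(data: bytes) -> int:
    """Classic DJB2 (assumed variant): h = h * 33 + byte, truncated to 32 bits."""
    h = 5381
    for b in data:
        h = (h * 33 + b) & 0xFFFFFFFF
    return h


def load(keys, buckets: int = 4096) -> int:
    """Insert keys into a chained table; return the number of key comparisons."""
    table = [[] for _ in range(buckets)]
    comparisons = 0
    for key in keys:
        row = table[djb2(key) % buckets]
        for existing in row:         # walk the chain looking for a duplicate
            comparisons += 1
            if existing == key:
                break
        else:
            row.append(key)          # not found: append to the end of the chain
    return comparisons


# 1024 keys that all share one hash value, built from the blocks b"Ez" and b"FY".
colliding = [b"".join(p) for p in product((b"Ez", b"FY"), repeat=10)]
# 1024 ordinary keys of similar size.
benign = [f"{i:04d}".encode() for i in range(1024)]

print("colliding:", load(colliding))
print("benign:   ", load(benign))
```

With every key in one row, inserting the k-th key walks the k-1 entries already
chained there, so the total work grows quadratically with the number of entries
rather than linearly.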

Test data procured from: https://bugs.php.net/bug.php?id=70644

Redmine Ticket
==============

https://redmine.openinfosecfoundation.org/issues/7209