Allow AtriumDB to output time information in its native format and add further optimizations. #93

Merged 11 commits on Oct 10, 2024

@WilliamDixon (Contributor) commented Jul 26, 2024

A deeper explanation for the following changes can be found in my comment in discussion #85.

AtriumDB 2.2.4

This PR uses features from AtriumDB 2.2.4, which is currently in prerelease.

Native AtriumDB Output

To let one of our AtriumDB classes output data in its native format, this PR adds code to benchmark.py and utils.py that detects data in time-value pair format and converts it to WFDB's NaN-based format outside of the benchmarked region.
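The conversion described above can be sketched roughly as follows. This is a minimal illustration, not the code added to benchmark.py or utils.py: the function name `pairs_to_nan_samples`, its parameters, and the use of seconds for timestamps are all assumptions made for the example, and plain lists are used only to keep it dependency-free.

```python
import math


def pairs_to_nan_samples(times_s, values, freq_hz, start_s, end_s):
    """Place (time, value) pairs on a regular sample grid.

    Hypothetical helper: grid slots with no matching sample stay NaN,
    which is the gap convention the NaN-based format relies on.
    """
    n = int(round((end_s - start_s) * freq_hz))
    samples = [math.nan] * n
    for t, v in zip(times_s, values):
        # Map each timestamp to its nearest grid index.
        i = int(round((t - start_s) * freq_hz))
        if 0 <= i < n:
            samples[i] = float(v)
    return samples


# A 10 Hz signal with a gap between 0.2 s and 0.5 s.
times = [0.0, 0.1, 0.2, 0.5, 0.6]
vals = [1, 2, 3, 4, 5]
dense = pairs_to_nan_samples(times, vals, freq_hz=10.0, start_s=0.0, end_s=0.7)
# dense -> [1.0, 2.0, 3.0, nan, nan, 4.0, 5.0]
```

Doing this step outside the timed region means the native time-value output is measured on its own, without the cost of filling gaps.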

New NaN Adapter Class

A second AtriumDB class, waveform_benchmark.formats.atriumdb.NanAdaptedAtriumDB, has been created. It converts the data from AtriumDB's native format to the benchmark's preferred format inside the benchmarked region of code.

No sort / check

In normal operation, AtriumDB has to check whether data is out of order, either interblock (across two or more blocks) or intrablock (within a single block). This benchmark does not need that check, so it is skipped here.
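For readers unfamiliar with the distinction, the kind of ordering check being skipped might look something like the sketch below. It is purely illustrative: `is_time_sorted` and the list-of-lists block layout are assumptions, not AtriumDB's actual implementation.

```python
def is_time_sorted(blocks):
    """Return True when timestamps never decrease within or across blocks.

    `blocks` is a hypothetical list of per-block timestamp lists.
    """
    prev = None
    for block in blocks:
        for t in block:
            # One running comparison covers both cases: a drop inside a
            # block (intrablock) and a drop at a block boundary (interblock).
            if prev is not None and t < prev:
                return False
            prev = t
    return True


in_order = is_time_sorted([[0, 1, 2], [3, 4]])
intrablock_disorder = is_time_sorted([[0, 2, 1], [3, 4]])
interblock_disorder = is_time_sorted([[2, 3], [0, 1]])
```

A check like this touches every timestamp, which is why skipping it is worthwhile when the data is already known to be in order.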

Buffered Writes

AtriumDB has a new write_buffer, which lets it piece together multiple small segments efficiently, without needing code outside the library to batch them for best efficiency.
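The general idea behind a write buffer can be sketched as below. This is not AtriumDB's write_buffer API: the class `SegmentWriteBuffer`, its `write_fn` callback, and the `max_values` threshold are hypothetical names chosen for the example.

```python
class SegmentWriteBuffer:
    """Collect small segments and hand them to the writer in larger batches.

    Hypothetical sketch: `write_fn(times, values)` stands in for whatever
    performs the real, expensive write.
    """

    def __init__(self, write_fn, max_values=1000):
        self._write_fn = write_fn
        self._max_values = max_values
        self._times = []
        self._values = []

    def add(self, times, values):
        self._times.extend(times)
        self._values.extend(values)
        # Flush once enough values have accumulated.
        if len(self._values) >= self._max_values:
            self.flush()

    def flush(self):
        if self._values:
            self._write_fn(self._times, self._values)
            self._times, self._values = [], []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Write whatever is still buffered when the block ends.
        self.flush()
        return False


writes = []
with SegmentWriteBuffer(lambda t, v: writes.append((list(t), list(v))), max_values=4) as buf:
    buf.add([0, 1], [10, 11])
    buf.add([2, 3], [12, 13])  # threshold reached: one write of four values
    buf.add([4], [14])         # written when the block exits
```

Three small segments turn into two underlying writes here. With many short segments, the saving comes from the reduced number of calls to the expensive write path.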

File metadata loading

AtriumDB can now read its time index and file metadata for the entire file once and keep that information in memory, rather than requerying the file metadata for each small read. This significantly increases the performance of small reads in exchange for a very small increase in memory usage.
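A rough sketch of this load-once pattern follows. It assumes a hypothetical index of `(start, end, filename)` rows and a `load_index_fn` callback that stands in for the metadata query. None of these names come from AtriumDB.

```python
import bisect


class FileIndexCache:
    """Load the whole time index once, then answer range lookups from memory."""

    def __init__(self, load_index_fn):
        self._load_index_fn = load_index_fn  # hypothetical metadata query
        self._rows = None
        self._starts = None

    def _ensure_loaded(self):
        if self._rows is None:
            self._rows = sorted(self._load_index_fn())
            self._starts = [row[0] for row in self._rows]

    def files_overlapping(self, start, end):
        """Return filenames whose [start, end) span overlaps the query span."""
        self._ensure_loaded()
        # Only rows that begin before `end` can overlap the query.
        hi = bisect.bisect_left(self._starts, end)
        return [name for (_, e, name) in self._rows[:hi] if e > start]


calls = []


def load_index():
    calls.append(1)  # count how often the slow metadata query runs
    return [(0, 10, "a.tsc"), (10, 20, "b.tsc"), (20, 30, "c.tsc")]


cache = FileIndexCache(load_index)
first = cache.files_overlapping(5, 12)    # ["a.tsc", "b.tsc"]
second = cache.files_overlapping(25, 26)  # ["c.tsc"]
```

Both lookups are served by a single call to `load_index`. The index rows held in memory are the cost paid for that saving.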

Mac OS Error

AtriumDB now gives a descriptive error when you try to use it on macOS. (I hope to have Mac support in 2.5.0.)

@WilliamDixon (Contributor, Author) commented

After looking at the cached PR, I thought it would be better if the default AtriumDB version didn't use any metadata caching and instead stored that data in a header-like file, so that it compares more directly with the other formats.

So I've made that change to this PR.

@briangow (Collaborator) commented

Thanks @WilliamDixon, this looks good!

@briangow briangow merged commit 6edb890 into chorus-ai:main Oct 10, 2024
1 of 2 checks passed