Commit
Add TextDistance class for calculating distance between texts. (#113)
* Add TextDistance class for calculating cosine distance between texts
* Refactor TextDistance class to use Manhattan distance and normalize character counters
* Add pytest markers to test functions
* Skip loading prostate and leukemia big datasets due to issue #118
* Skip leukemia big test due to issue #118
* Add counted_char_set to TextDistance class
* Normalize counter and calculate Manhattan distance
* Add additional tests for TextDistance class

  This commit adds three new test cases to the `test_text.py` file. They verify the `TextDistance` class by checking the distance calculation for different input texts: orthogonal texts, extended texts, and an exception case.
* Add validation for max_dimensions parameter in TextDistance class
* Add test cases for TextDistance class
* Update TextDistance class documentation and add type hints
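
The commit message describes normalizing per-character counters and taking the Manhattan distance between them. The following is a minimal sketch of that idea only, not the repository's `TextDistance` implementation: the function names `normalized_char_counts` and `manhattan_text_distance` are hypothetical, the empty-text `ValueError` is an assumed stand-in for the "exception case", and `counted_char_set` and `max_dimensions` are left out because the message does not say how they behave.

```python
from collections import Counter


def normalized_char_counts(text: str) -> dict[str, float]:
    """Map each character of `text` to its relative frequency.

    Hypothetical helper; raising on empty input is an assumption,
    not behavior confirmed by the commit.
    """
    if not text:
        raise ValueError("text must not be empty")
    counts = Counter(text)
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}


def manhattan_text_distance(a: str, b: str) -> float:
    """Sum of absolute differences between normalized character frequencies."""
    freq_a = normalized_char_counts(a)
    freq_b = normalized_char_counts(b)
    # Compare over every character seen in either text; a character
    # missing from one text contributes a frequency of 0.0 for that text.
    chars = set(freq_a) | set(freq_b)
    return sum(abs(freq_a.get(ch, 0.0) - freq_b.get(ch, 0.0)) for ch in chars)


if __name__ == "__main__":
    print(manhattan_text_distance("aaa", "bbb"))   # no shared characters: 2.0
    print(manhattan_text_distance("ab", "abab"))   # same frequencies: 0.0
```

Because the counters are normalized before comparison, repeating a text does not change its distance to the original, and two texts with no characters in common sit at the maximum distance of 2.0 under this sketch.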
Showing 3 changed files with 190 additions and 1 deletion.