ASCII digits only for versions (#652)
* port new tests from ComparableVersion
* ASCII digits only for version numbers
* docs
elharo authored Feb 15, 2025
1 parent 9b6adb9 commit 508504d
Showing 3 changed files with 38 additions and 13 deletions.
@@ -235,8 +235,8 @@ public boolean next() {
index++;
break;
} else {
- int digit = Character.digit(c, 10);
- if (digit >= 0) {
+ if (c >= '0' && c <= '9') { // only ASCII digits
+ int digit = c - '0';
if (state == -1) {
end = index;
terminatedByNumber = true;
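The practical difference between the removed and added checks can be seen in a small standalone sketch. It is not part of the commit; the class name `AsciiDigitCheck` and the helper `isAsciiDigit` are hypothetical names introduced only for illustration. `Character.digit(c, 10)` recognises any Unicode decimal digit, while the range comparison accepts only `'0'` through `'9'`.

```java
// Illustrative sketch only; AsciiDigitCheck and isAsciiDigit are hypothetical names.
class AsciiDigitCheck {

    // Mirrors the range comparison added by the commit: true only for '0'..'9'.
    static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    public static void main(String[] args) {
        char arabicEight = '\u0668'; // ARABIC-INDIC DIGIT EIGHT, as used in testNonAsciiDigits

        // The old check: Character.digit understands non-ASCII decimal digits.
        System.out.println(Character.digit(arabicEight, 10)); // prints 8

        // The new check: the same character is no longer a digit.
        System.out.println(isAsciiDigit(arabicEight)); // prints false
        System.out.println(isAsciiDigit('7'));         // prints true
    }
}
```

Under the old check, a character such as U+0668 would have started or extended a numeric segment; under the new one it falls through to the non-digit branch.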
@@ -18,7 +18,7 @@
* under the License.
*/
/**
- * Ready-to-use version scheme for parsing/comparing versions and utility classes.
+ * Version scheme for parsing/comparing versions and utility classes.
* <p>
* Contains the "generic" scheme {@link org.eclipse.aether.util.version.GenericVersionScheme}
* that serves the purpose of "factory" (and/or parser) for all corresponding elements (all those are package private).
@@ -28,12 +28,13 @@
* <p>
* Below is the <em>Generic Version Spec</em> described:
* <p>
- * Version string is parsed into version according to these rules below:
+ * Version string is parsed into version according to these rules:
* <ul>
* <li>The version string is parsed into segments, from left to right.</li>
- * <li>Segments are explicitly delimited by single {@code "." (dot)}, {@code "-" (hyphen)} or {@code "_" (underscore)} character.</li>
- * <li>Segments are implicitly delimited by transition between digits and non-digits.</li>
- * <li>Segments are classified as numeric, string, qualifiers (special case of string) and min/max.</li>
+ * <li>Segments are explicitly delimited by a single {@code "." (dot)}, {@code "-" (hyphen)}, or {@code "_" (underscore)} character.</li>
+ * <li>Segments are implicitly delimited by a transition between ASCII digits and non-digits.</li>
+ * <li>Segments are classified as numeric, string, qualifiers (special case of string), and min/max.</li>
+ * <li>Numeric segments are composed of the ASCII digits 0-9. Non-ASCII digits are treated as letters.
* <li>Numeric segments are sorted numerically, ascending.</li>
* <li>Non-numeric segments may be qualifiers (predefined) or strings (non-empty letter sequence). All of them are interpreted as being case-insensitive in terms of the ROOT locale.</li>
* <li>Qualifier segments (strings listed below) and their sort order (ascending) are:
@@ -48,7 +49,7 @@
* </ul>
* </li>
* <li>String segments are sorted lexicographically and case-insensitively per ROOT locale, ascending.</li>
- * <li>There are two special segments, {@code "min"} and {@code "max"}, they represent absolute minimum and absolute maximum in comparisons. They can be used only as trailing segment.</li>
+ * <li>There are two special segments, {@code "min"} and {@code "max"} that represent absolute minimum and absolute maximum in comparisons. They can be used only as the trailing segment.</li>
* <li>As last step, trailing "zero segments" are trimmed. Similarly, "zero segments" positioned before numeric and non-numeric transitions (either explicitly or implicitly delimited) are trimmed.</li>
* <li>When trimming, "zero segments" are qualifiers {@code "ga"}, {@code "final"}, {@code "release"} only if being last (right-most) segment, empty string and "0" always.</li>
* <li>In comparison of same kind segments, the given type of segment determines comparison rules.</li>
@@ -57,11 +58,11 @@
* <li>It is common that a version identifier starts with numeric segment (consider this "best practice").</li>
* </ul>
* <p>
- * Note: this version spec does not document (nor cover) many corner cases, that we believe are "atypical" or not
+ * Note: this version spec does not document (or cover) many corner cases that we believe are "atypical" or not
* used commonly. None of these are enforced, but in future implementations they probably will be. Some known examples are:
* <ul>
- * <li>Using "min" or "max" special segments as non-trailing segment. This yields in "undefined" behaviour and should be avoided.</li>
- * <li>Having non-number as first segment of version. Versions are expected (but not enforced) to start with numbers.</li>
+ * <li>Using "min" or "max" special segments as a non-trailing segment. This yields in "undefined" behaviour and should be avoided.</li>
+ * <li>Having a non-number as the first segment of a version. Versions are expected (but not enforced) to start with numbers.</li>
* </ul>
*
*/
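The segmentation rules in this Javadoc (explicit delimiters plus implicit splits at ASCII digit/non-digit transitions) can be sketched as follows. This is a simplified illustration under stated assumptions, not the tokenizer in the resolver: `SegmentSketch`, `segments`, and `flush` are hypothetical names, and the sketch simply drops empty segments instead of applying the "zero segment" trimming rules, qualifier handling, or min/max handling.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; all names here are hypothetical.
class SegmentSketch {

    static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Splits a version string on '.', '-', '_' and on every transition
    // between ASCII digits and anything else.
    static List<String> segments(String version) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < version.length(); i++) {
            char c = version.charAt(i);
            if (c == '.' || c == '-' || c == '_') {
                flush(current, result); // explicit delimiter
                continue;
            }
            if (current.length() > 0
                    && isAsciiDigit(current.charAt(current.length() - 1)) != isAsciiDigit(c)) {
                flush(current, result); // implicit delimiter at a digit/non-digit transition
            }
            current.append(c);
        }
        flush(current, result);
        return result;
    }

    // Simplification: empty segments are dropped rather than treated as zero segments.
    private static void flush(StringBuilder current, List<String> result) {
        if (current.length() > 0) {
            result.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(segments("1alpha10"));   // prints [1, alpha, 10]
        System.out.println(segments("1.alpha.10")); // prints [1, alpha, 10]
        // A non-ASCII digit counts as a non-digit, so it forms its own segment.
        System.out.println(segments("1\u066810").size()); // prints 3
    }
}
```

With this splitting, `1alpha10` and `1.alpha.10` yield the same segments, which is what `testTransitionFromDigitToLetterAndViceVersaIsEquivalentToDelimiter` relies on.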
@@ -282,7 +282,7 @@ void testUnlimitedNumberOfDigitsInNumericComponent() {
}

@Test
- void testTransitionFromDigitToLetterAndViceVersaIsEqualivantToDelimiter() {
+ void testTransitionFromDigitToLetterAndViceVersaIsEquivalentToDelimiter() {
assertOrder(X_EQ_Y, "1alpha10", "1.alpha.10");
assertOrder(X_EQ_Y, "1alpha10", "1-alpha-10");

@@ -495,8 +495,32 @@ void testMaximumSegment() {
assertOrder(X_LT_Y, "1.max", "2.min");
}

+ @Test
+ void testCompareLettersToNumbers() {
+ assertOrder(X_GT_Y, "1.7", "J");
+ }
+
+ @Test
+ void testCompareDigitToLetter() {
+ assertOrder(X_GT_Y, "7", "J");
+ assertOrder(X_GT_Y, "7", "c");
+ }
+
+ @Test
+ void testNonAsciiDigits() { // These should not be treated as digits.
+ String arabicEight = "\u0668";
+ assertOrder(X_GT_Y, "1", arabicEight);
+ assertOrder(X_GT_Y, "9", arabicEight);
+ }
+
+ @Test
+ void testLexicographicOrder() {
+ assertOrder(X_GT_Y, "zebra", "aardvark");
+ assertOrder(X_GT_Y, "ζέβρα", "zebra");
+ }
+
/**
- * UT for <a href="https://issues.apache.org/jira/browse/MRESOLVER-314">MRESOLVER-314</a>.
+ * Test for <a href="https://issues.apache.org/jira/browse/MRESOLVER-314">MRESOLVER-314</a>.
*
* Generates random UUID string based versions and tries to sort them. While this test is not as reliable
* as {@link #testCompareUuidVersionStringStream()}, it covers broader range and in case it fails it records
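The expectations in `testLexicographicOrder` follow from the documented rule that string segments sort lexicographically and case-insensitively per the ROOT locale. A minimal sketch of that rule, assuming a plain lower-case-then-compare approach, is shown below. `StringSegmentOrderSketch` and `compareStringSegments` are hypothetical names; the resolver's own comparison code may differ, and the special ordering of qualifiers is not modelled.

```java
import java.util.Locale;

// Illustrative sketch only; names are hypothetical and qualifiers are not modelled.
class StringSegmentOrderSketch {

    // Case-insensitive lexicographic comparison using the ROOT locale.
    static int compareStringSegments(String a, String b) {
        return a.toLowerCase(Locale.ROOT).compareTo(b.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(compareStringSegments("zebra", "aardvark") > 0); // prints true
        // Greek zeta (U+03B6) sorts after Latin 'z' (U+007A) by code unit.
        System.out.println(compareStringSegments("ζέβρα", "zebra") > 0);    // prints true
        System.out.println(compareStringSegments("Zebra", "zebra") == 0);   // prints true
    }
}
```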