-
Notifications
You must be signed in to change notification settings - Fork 163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed IntegerFormatStyle for values > Int64.max. #262
Commits on Sep 16, 2023
-
Fixed IntegerFormatStyle for values > Int64.max.
If I understand the NumberFormatter implementation in the ICU library, this _should_ handle arbitrary-precision numbers, as long as they have a `description` which returns their POSIX-formatted representation, although that's not in the unit tests because it requires a working arbitrary-precision type, and I assume it's not acceptable to pull in a BigInt package for this purpose, and my efforts to make a quick-and-dirty one for testing purposes led only to tears and threats against Xcode's life.
Configuration menu - View commit details
-
Copy full SHA for 84b6e5a - Browse repository at this point
Copy the full SHA 84b6e5aView commit details
Commits on Sep 26, 2023
-
Removed the reliance on
BinaryInteger.description
for `IntegerForma……tStyle`. This patch introduces an explicit `numericStringRepresentation` property for `BinaryInteger`, that can be relied on to produce exactly the decimal string expected by the ICU library rather than relying on `BinaryInteger`'s default `description` (which happens to produce the same format today but isn't guaranteed to in future, and can be overridden by 3rd party implementations of `BinaryInteger`). This uses a non-trivial algorithm, nominally for performance although it has not been benchmarked. The trivial (relatively-speaking) algorithm is to just iteratively divide `self` by 10, which produces one digit of the result per iteration. This is in fact what is done for all values that fit into a natural machine word (`UInt`), as a fast path optimisation. But for `BinaryInteger`s which are larger than a machine word - and especially for those that have unbounded size, such as arbitrary-precision integers, dividing that entire number by 10 may be quite expensive. So a potential optimisation - as used in this patch - is to leverage the fact that every `BinaryInteger` is defined in terms of one or more `UInt`s, by (essentially) processing a whole `UInt` at a time. There's still a divide-by-ten loop involved per UInt, but integer division on a natural machine word is likely much faster than those same divisions on the full arbitrary-precision number.
Configuration menu - View commit details
-
Copy full SHA for 1ded056 - Browse repository at this point
Copy the full SHA 1ded056View commit details -
Configuration menu - View commit details
-
Copy full SHA for fed358b - Browse repository at this point
Copy the full SHA fed358bView commit details
Commits on Sep 27, 2023
-
Configuration menu - View commit details
-
Copy full SHA for 778a6d0 - Browse repository at this point
Copy the full SHA 778a6d0View commit details
Commits on Sep 29, 2023
-
Configuration menu - View commit details
-
Copy full SHA for dac9328 - Browse repository at this point
Copy the full SHA dac9328View commit details -
Use the
.zero
property ofBinaryInteger
than a literal0
as it ……may be faster. It depends on how the compiler treats the literal - e.g. it might already replace that with a simple reference to the `.zero` constant - but this way removes the possibility of an inefficiency (e.g. actually creating a duplicate instance of the zero constant).
Configuration menu - View commit details
-
Copy full SHA for cbb3bf2 - Browse repository at this point
Copy the full SHA cbb3bf2View commit details -
Renamed the helpers from
test
tocheck
to avoid any confusion wit……h standalone test cases. Technically not an issue when they're nested inside test cases, but still potentially confusing to a human. Plus, this way they can be moved up a level without having to be renamed (if later tests would like to use the same helpers).
Configuration menu - View commit details
-
Copy full SHA for cb4c237 - Browse repository at this point
Copy the full SHA cb4c237View commit details -
Renamed the decimalDigitsAndMagnitudePerWord helper to
check
as wel……l (from `test`), moved it up a level, and changed its `magnitude` parameter to be `UInt`. This way it can be used from additional test cases in future. The parameter type change is helpful for any future tests that want to use this with types that might not be initialisable from the necessary integer literal (e.g. an arbitrary-precision integer which doesn't support ExpressibleByIntegerLiteral from StaticBigInt and is thus limited to Int, yet its word magnitude will still be UInt as required by the BinaryInteger protocol and its word magnitude value will not fit into an Int).
Configuration menu - View commit details
-
Copy full SHA for 4312195 - Browse repository at this point
Copy the full SHA 4312195View commit details
Commits on Sep 30, 2023
-
Switched to using UnsafeMutableBufferPointer.initializeElement(at:to)…
… rather than simple assignment via subscripting, as the latter mysteriously crashes once the buffer gets large enough. I wasn't able to precisely root-cause the crash, but it does seem like it's something to do with ContiguousArray's assignment operator implementation assuming `self` is valid, which of course it isn't (yet) in this case. I didn't explore how `initializeElement(at:to)` does the (figuratively) same assignment without running afoul of this, just observed that it evidently does.
Configuration menu - View commit details
-
Copy full SHA for f72a106 - Browse repository at this point
Copy the full SHA f72a106View commit details -
Corrected the calculation for the (maximum) number of wordStrings nee…
…ded. The previous algorithm worked pretty amazingly well, considering how baseless it was in mathematical correctness (it was inspired by examples found in _many_ arbitrary-precision numerics libraries, suggesting many people before me also cargo-culted the calculation - but at least I *checked* it). It took me a while to find an example case that would actually break it (using a BigInt library) and I wasn't able to find an example that was less than hundreds of digits. I'm sure there's a much smaller value that would expose the problem with the prior code, I just didn't feel it a valuable exercise to go full maths nerd in order to find it.
Configuration menu - View commit details
-
Copy full SHA for 87c866c - Browse repository at this point
Copy the full SHA 87c866cView commit details -
Configuration menu - View commit details
-
Copy full SHA for b78c9db - Browse repository at this point
Copy the full SHA b78c9dbView commit details -
Moved the Word == UInt precondition to a location more directly relev…
…ant to that invariant.
Configuration menu - View commit details
-
Copy full SHA for 9efae3d - Browse repository at this point
Copy the full SHA 9efae3dView commit details -
Added unit tests for numericStringRepresentation and decimalDigitsAnd…
…MagnitudePerWord on a real BigInt implementation.
Configuration menu - View commit details
-
Copy full SHA for 854b010 - Browse repository at this point
Copy the full SHA 854b010View commit details -
Optimised the implementation to use just one buffer to build the output.
This doesn't make the code shorter per-se but it does make it a bit simpler in most respects. It does make the buffer indexing slightly more complicated. But the benefit, of course, is that it no longer allocates twice as much memory as actually needed, and does less work at runtime to assemble the result. Because the algorithm emits the resulting digits in reverse order (least significant to most), the buffers get filled from end to front. Since the estimate of required buffer size is only an upper bound, not a precise value, that means sometimes there's some space at the front of the buffer that ends up unused. Rather than doing a memmove to shunt the contents of the buffer up to its front - which would be pretty expensive for non-trivially-sized numbers - the `numericStringRepresentation` property now simply returns an `ArraySlice` instead of a `ContiguousArray`. The callers don't care - same interface as far as they're concerned - and they quickly dispose of the `ArraySlice` which in turn disposes off the underlying `ContiguousArray` anyway.
Configuration menu - View commit details
-
Copy full SHA for b979c25 - Browse repository at this point
Copy the full SHA b979c25View commit details
Commits on Oct 2, 2023
-
Corrected the
digits
count returned by maximumDecimalDigitsForUnsig……ned (it was one too high). There _was_ actually a reason for this, originally, I vaguely recall… I just don't know now why. What it was (before this patch) returning was actually the _maximum_ number of decimal digits for values of the given type, which is _not_ what it's actually needed for.
Configuration menu - View commit details
-
Copy full SHA for 069a6eb - Browse repository at this point
Copy the full SHA 069a6ebView commit details -
Simplified the specialisation of
numericStringRepresentation
re. ex……tension type - now just "UInt" outright. Of course this is equivalent to the prior extension point of "BinaryInteger where Self == UInt". I think I just wrote it that way because of the particular frame of reference I had at the time. Thankfully testing shows that the compiler doesn't require that more obtuse way of writing "UInt" in order to correctly detect and use the specialisation.
Configuration menu - View commit details
-
Copy full SHA for e5ae0e4 - Browse repository at this point
Copy the full SHA e5ae0e4View commit details -
Added a fast path to
decimalDigitsAndMagnitudePerWord
that computes…… the result directly using pretty trivial maths. This does require doing a power operation, though, so it requires Darwin or equivalent (e.g. Glibc). And since thats now required I removed the previously hard-coded constant for log10(2) in favour of just calling `log10`. Although I'm not sure if the compiler is clever enough to simplify that to the actual value at compile time, to improve runtime performance. 🤔 It is possible to hard-code the guess values for a finite number of bit widths (of which we really only care about 32 & 64), which would avoid the maths and the depending on Darwin etc, but I couldn't find a way to do that which didn't result in the compiler complaining about unreachable code. It only issues warnings in that case, but still, having pointless warnings forever attached to this code is unpalatable.
Configuration menu - View commit details
-
Copy full SHA for 703a6ee - Browse repository at this point
Copy the full SHA 703a6eeView commit details -
Configuration menu - View commit details
-
Copy full SHA for 342513f - Browse repository at this point
Copy the full SHA 342513fView commit details -
Removed outdated precondition check, that Words.Element == UInt.
If that isn't true the code won't compile anyway, as there's no definition of the required helper methods on any type other than UInt.
Configuration menu - View commit details
-
Copy full SHA for 71d661b - Browse repository at this point
Copy the full SHA 71d661bView commit details -
Configuration menu - View commit details
-
Copy full SHA for 087deac - Browse repository at this point
Copy the full SHA 087deacView commit details -
Removed unnecessary use of Word.max.bitWidth when Word.bitWidth retur…
…ns the same thing, but probably more efficiently.
Configuration menu - View commit details
-
Copy full SHA for c73c7cc - Browse repository at this point
Copy the full SHA c73c7ccView commit details -
Replaced UnsafeMutableBufferPoint.Index.advance(by:) with direct arit…
…hmetic, per the directions in that function's definition. …which say to only use `advance(by:)` for Indexes which are not integers. I'm a bit surprised by this advice - it then hard-codes the integer index assumption for types that ever used them, like UnsafeMutableBufferPointer - but it does seem fairly safe in this case as fast manipulation of indices in UnsafeMutableBufferPointers is important, and so likely to remain simple integers.
Configuration menu - View commit details
-
Copy full SHA for 28bb1f5 - Browse repository at this point
Copy the full SHA 28bb1f5View commit details -
Removed an unnecessary temporary variable ('nextWordInsertPoint').
And I only just realised I'd typoed its name anyway - it should have been `nextWordInsertionPoint`. 🙃
Configuration menu - View commit details
-
Copy full SHA for e6f0b18 - Browse repository at this point
Copy the full SHA e6f0b18View commit details -
Made the
numericStringRepresentation
unit tests work when UInt is 3……2-bit… in theory. I don't have any way to build or execute the tests in 32-bit, so I cannot test this.
Configuration menu - View commit details
-
Copy full SHA for 71e8244 - Browse repository at this point
Copy the full SHA 71e8244View commit details
Commits on Oct 4, 2023
-
Commented out the import of attaswift/BigInt and put the tests that d…
…epend on it behind canImport guards. @itingliu recommended (swiftlang#262 (comment)) not adding this as a dependency (by default) as it's a 3rd party library that could change at any time, in a way that erroneously breaks these unit tests. But it's still handy to have the tests available if someone wants to go out of their way to try them, as they did prove quite useful in finding bugs during the initial implementation.
Configuration menu - View commit details
-
Copy full SHA for 0714bfa - Browse repository at this point
Copy the full SHA 0714bfaView commit details -
Configuration menu - View commit details
-
Copy full SHA for 45e44e9 - Browse repository at this point
Copy the full SHA 45e44e9View commit details -
Revised the explanatory comment on why
description
cannot be used i……nstead, despite having the same output for many types of BinaryIntegers. While the prior comment was correct that the output of `description` can change at any time, which is also a problem, it's not the root problem. `description`'s output is fundamentally not a defined format and there's no guarantee (nor requirement) that any particular BinaryInteger-conforming type use any particular format. It doesn't matter the the built-in types happen to use the desired format, this code needs to work for _all_ BinaryIntegers.
Configuration menu - View commit details
-
Copy full SHA for ac4f8f4 - Browse repository at this point
Copy the full SHA ac4f8f4View commit details -
Corrected comment about the UInt specialisation of `numericStringRepr…
…esentation`. Technically it doesn't need to exist now, because the part the generic implementation relies on is actually the `numericStringRepresentation(intoEndOfBuffer:)` method. But the specialisation is still significantly faster, being much simpler, so it's worth keeping even only for that reason.
Configuration menu - View commit details
-
Copy full SHA for dbae775 - Browse repository at this point
Copy the full SHA dbae775View commit details -
Configuration menu - View commit details
-
Copy full SHA for 9c45109 - Browse repository at this point
Copy the full SHA 9c45109View commit details
Commits on Oct 5, 2023
-
Added rudimentary detection of broken quotientAndRemainder(dividingBy…
…:) implementations. As the included comment explains, it'd be entirely understandable for an implementor to use e.g. T-division instead of F-division, since the documentation for quotientAndRemainder(dividingBy:) doesn't directly specify what is expected (it merely hints, obtusely, that it's F-division because that's the only way to satisfy its requirement that the remainder's sign match the original value's sign). It's easy to catch this with a precondition, and it simplifies the code by removing the need for a second "isPositive" / "isNegative" variable. In doing that consolidation I also inverted the variable from "positive" to "negative" because arguably using the term "positive" wasn't correct (it included where the value is zero, which technically doesn't have a sign).
Configuration menu - View commit details
-
Copy full SHA for 37d9207 - Browse repository at this point
Copy the full SHA 37d9207View commit details -
Removed use of
signum()
in favour of just comparing with the value ……directly. I originally assumed (mistakenly) that the return value was some trivial integer type, e.g. `Int`. But it's actually `Self`, making it potentially fairly expensive to create the result, depending on how the type in question is implemented (e.g. Int512 with a fixed-size words array underneath). It's impossible to say in general which approach is faster, but it seems like performance will be more predictable by just comparing with the original value (especially since its comparison operator implementations should by rights be pretty optimised anyway given how likely they are to be frequently used).
Configuration menu - View commit details
-
Copy full SHA for 37932ac - Browse repository at this point
Copy the full SHA 37932acView commit details -
Corrected inaccurate wording in the `decimalDigitsAndMagnitudePerWord…
…` documentation.
Configuration menu - View commit details
-
Copy full SHA for df74622 - Browse repository at this point
Copy the full SHA df74622View commit details -
Added some comments in
decimalDigitsAndMagnitudePerWord
delineating…… the fast vs slow paths and explaining why the fast path isn't universally applicable.
Configuration menu - View commit details
-
Copy full SHA for 371275b - Browse repository at this point
Copy the full SHA 371275bView commit details -
Configuration menu - View commit details
-
Copy full SHA for 50c16c7 - Browse repository at this point
Copy the full SHA 50c16c7View commit details -
Configuration menu - View commit details
-
Copy full SHA for 1b0e1f5 - Browse repository at this point
Copy the full SHA 1b0e1f5View commit details -
Added explanatory error messages to the asserts.
These help explain the issue clearly and with relevant details (e.g. ancestor values of the assertion expression itself).
Configuration menu - View commit details
-
Copy full SHA for 7ef6563 - Browse repository at this point
Copy the full SHA 7ef6563View commit details -
Configuration menu - View commit details
-
Copy full SHA for d20025a - Browse repository at this point
Copy the full SHA d20025aView commit details -
Replaced special-casing of zero in `numericStringRepresentation(intoE…
…ndOfBuffer:)` with an assert, at @glessard's request (swiftlang#262 (comment)).
Configuration menu - View commit details
-
Copy full SHA for 72cd8f0 - Browse repository at this point
Copy the full SHA 72cd8f0View commit details -
Moved some comments from the formal method & property documentation i…
…nto the methods themselves. The comments in question pertain to implementation details, that probably aren't of interest to mere users of these methods & properties. At @iCharlesHu's request (swiftlang#262 (comment)).
Configuration menu - View commit details
-
Copy full SHA for 3413e63 - Browse repository at this point
Copy the full SHA 3413e63View commit details
Commits on Oct 6, 2023
-
Removed the commented-out additions of attaswift/BigInt as a dependen…
…cy, at @iCharlesHu's request (swiftlang#262 (comment)).
Configuration menu - View commit details
-
Copy full SHA for 9f90a3f - Browse repository at this point
Copy the full SHA 9f90a3fView commit details -
Changed the "fast path should handle the zero case" assert to a preco…
…ndition, at @iCharlesHu's request (swiftlang#262 (comment)). It should never trip as long as `UInt(exactly:)` and the implementation of the concrete type's words property aren't buggy. If either is, then it's likely that the result of this method will be wrong anyway. So the benefit of the check is probably just a faster and clearer failure. But the downside of the precondition check seems quite small - slight increase in code size and a [hopefully very small] runtime cost.
Configuration menu - View commit details
-
Copy full SHA for b2e2931 - Browse repository at this point
Copy the full SHA b2e2931View commit details -
Made
numericStringRepresentation(intoEndOfBuffer:)
fileprivate (fro……m the default of internal). It's not used outside the file and is a fairly specific implementation detail of the `numericStringRepresentation` property, so it would be wise to discourage casual use.
Configuration menu - View commit details
-
Copy full SHA for 31c585d - Browse repository at this point
Copy the full SHA 31c585dView commit details -
Configuration menu - View commit details
-
Copy full SHA for 9a7854b - Browse repository at this point
Copy the full SHA 9a7854bView commit details -
Removed trailing commas that were originally absent, before swiftlang…
…#262. Arguably they should already have been there, to avoid exactly this kind of spurious noise when adding or removing dependencies, but, 🤷♂️.
Configuration menu - View commit details
-
Copy full SHA for b848a28 - Browse repository at this point
Copy the full SHA b848a28View commit details
Commits on Oct 8, 2023
-
Configuration menu - View commit details
-
Copy full SHA for 152c4b8 - Browse repository at this point
Copy the full SHA 152c4b8View commit details
Commits on Oct 9, 2023
-
Added a broken implementation of
actualBitWidth
.This is attempting to work around `bitWidth` not working properly (IMO) for FixedWidthIntegers (they return the bit width of the _type_, not the _actual value at hand_). This is non-trivial to implement because signed numbers. 🙁 Specifically, the use of two's complement (for `BinaryInteger`'s underlying storage, in `words`) - the edge cases of -(every power of two) which fits into a signed type one bit smaller than +(every power of two). So we need to detect powers of two - easy, right? No. We only have a `Numeric` value (for the `magnitude`), which is next to useless because it's _such_ a high-level protocol with very little API. It _might_ be possible to make this work using just the `words`, by basically reinventing all the necessary arithmetic operations on `words`, but it'll be a bit hairy and duplicative of functionality already in the _actual_ `BinaryInteger` we're operating on (but blocked from us by the type system). I'm checking this in for posterity - in case for some reason this turns out to be the only viable path - but I think all this is unnecessary… what we really want is the number of bits required to represent the _magnitude_, because that's what we're actually going to convert to a numeric string…
Configuration menu - View commit details
-
Copy full SHA for 426e11e - Browse repository at this point
Copy the full SHA 426e11eView commit details -
Revert "Replaced special-casing of zero in `numericStringRepresentati…
…on(intoEndOfBuffer:)` with an assert, at @glessard's request (swiftlang#262 (comment))." We'd both overlooked the fact that it _can_ be called on zero because it's used for the individual words of the BinaryInteger, any of which can actually be zero. 72cd8f0
Configuration menu - View commit details
-
Copy full SHA for 4826709 - Browse repository at this point
Copy the full SHA 4826709View commit details
Commits on Oct 10, 2023
-
Finally found a solution for determining the minimum bit width of a g…
…iven `BinaryInteger`. I resisted just deriving it from `words` at the outset because I thought surely that's naive; surely there's a more elegant if not more efficient way. But there isn't, as far as I can tell. And this actually isn't complicated nor all that expensive anyway, it turns out (at least for `BinaryInteger`s which natively store their values in two's complement form, and thus don't have to specially generate `words` on demand). An important aspect of this is that what we care about is _not_ the two's complement form, which is what all the built-ins (e.g. `bitWidth`) nominally return, but that size of the _unsigned_ value that is the _magnitude_. That's what we're actually working on; converting to its constituent decimal digits. So the whole off-by-one problems of two's complement representation, with negative powers of two, etc, are all irrelevant really.
Configuration menu - View commit details
-
Copy full SHA for bc2499f - Browse repository at this point
Copy the full SHA bc2499fView commit details -
Configuration menu - View commit details
-
Copy full SHA for eacacc6 - Browse repository at this point
Copy the full SHA eacacc6View commit details -
Added tests using the Numberick 3rd party package (if it's available)…
…, to cover fixed-width 128-bit and 256-bit integers. These are conditionalised on the inclusion of Numberick in Package.swift, which is not the default. Like the existing BigInt tests. This proved useful as it did reveal a bug - one I'd actually discovered previously but apparently the fix got lost during Git patch shuffling or something - in `magnitudeBitWidth`, whereby it was forgetting to check if the current word is actually non-zero before treating it as relevant (meaning not just zero padding). This wasn't caught by the BigInt tests because those BigInt / BigUInt are flexible-width and only use as many words as strictly necessary to contain the value.
Configuration menu - View commit details
-
Copy full SHA for 75bb993 - Browse repository at this point
Copy the full SHA 75bb993View commit details -
Corrected precondition failure message and comment: T ⇄ F.
I knew what I meant when I wrote it - I got the actual description correct regarding the rounding mode - but I inverted the two letters. T (for truncating, a la `.rounded(.towardZero)`) vs F (for flooring, a la `.rounded(.down)`).
Configuration menu - View commit details
-
Copy full SHA for efb3f08 - Browse repository at this point
Copy the full SHA efb3f08View commit details -
Added a comment on the two's complement of
remainder
noting that wr……ap-around is intentional. The comment might seem excessive, but in reviewing this I had already forgotten that I was using the ~0 + 1 == 0 trick, and I tried to replace '&+' with plain '+', which then crashes (thankfully I'd also had the foresight to make a unit test case which checks this exact situation). So, for future readers who aren't even me, a comment might save them some time (or anguish). 🙂
Configuration menu - View commit details
-
Copy full SHA for 5b9a221 - Browse repository at this point
Copy the full SHA 5b9a221View commit details