Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support unaligned GC array accesses in Prim #409

Open
raehik opened this issue Mar 11, 2024 · 8 comments
Open

Support unaligned GC array accesses in Prim #409

raehik opened this issue Mar 11, 2024 · 8 comments

Comments

@raehik
Copy link

raehik commented Mar 11, 2024

GHC exposes a set of (read|write|index)Word8ArrayAs[ty]# primops for safe unaligned accesses to GC pointers. I think the Prim class could sensibly expose these.

Also note that similar primops for Addr# non-GC pointers will be coming in GHC 9.10 .

@raehik
Copy link
Author

raehik commented Mar 11, 2024

@raehik raehik changed the title Support unaligned array accesses in Prim Support unaligned GC array accesses in Prim Mar 11, 2024
@andrewthad
Copy link
Collaborator

A different approach is taken at https://hackage.haskell.org/package/primitive-unaligned-0.1.1.2/docs/Data-Primitive-ByteArray-Unaligned.html

In primitive-unaligned, the typeclass for unaligned access is a totally different typeclass. Adding it to Prim itself is a breaking change for all user-defined instances downstream. It's worth considering though because its not that hard to update any hand-written instances to support this. Do you know which GHC release introduced support for the unaligned primops? We need to consider primitive's GHC support window for something like this.

@raehik
Copy link
Author

raehik commented Mar 26, 2024

Regarding unaligned ByteArray primops, I can find writeWord8ArrayAsWord32# going all the way back to base-4.12.0.0, which is GHC 8.6.1.

Thanks for the primitive-unaligned link. That's effectively what I'm doing in my own code, though I need unaligned Addr# access primops instead (which I fake for now). I believe all this sensibly belongs in primitive (eventually, no particular rush). With that in mind, should I keep this issue open to track the merge?

On a related note, would you take a primitive-unaligned update for unaligned Addr# accesses once GHC 9.10 is out?
(And a note related to user-defined Prim instances, what about a wrapper that defines explicit endianness on supported types? Messy code here. Does this belong in/around primitive?)

@andrewthad
Copy link
Collaborator

What's interesting about your type that makes endianness explicit in bytezap is that it's the same exact way I approached the problem in the byte-order library in System.ByteOrder. You named it ByteOrdered and I named it Fixed, and we've got different names for a bunch of other things, but the interfaces are nearly identical. The fact that we independently invented the same interface suggests that there is something fundamental about it. It's hard to say whether or not it belongs in primitive. Personally, I think it seems within the scope of primitive. Additionally, such a consolidation improves discoverability and builds trust that the implementation is correct, which prevents people from having to reinvent this all the time (I doubt we're the only two people who have written this interface). But there are maintainers whose opinions on this I value, and this would need to be discussed in a separate issue.

We might be able to drop support for GHC < 8.6 soon. Currently we support all the way back to GHC 8.0. I'll open a separate thread discussing when older GHCs can be dropped.

I'll take any PRs to primitive-unaligned even before GHC 9.10 is properly released. If you're playing around with a release candidate of it and you're able to incorporate the unaligned Addr# primops into it, go ahead and PR it. The hardest thing about it will be figuring out how to handle the GHC < 9.10 case. I think they can actually be shimmed by reading one 8-bit word at a time and assembling those into a larger word, but that shim code is a little tedious to write.

@raehik
Copy link
Author

raehik commented Mar 26, 2024

Fab! Thanks for letting me know about byte-order. Agreed that Fixed needs a discussion on precisely how/where to implement it. I strongly believe these additions would be great for discoverability/"feature completion" and would gladly assist where I can.

I might have a think about safely emulating unaligned Addr# accesses for GHC < 9.10 . bytestring should have some relevant code in its builder. I remember store has some weird stuff too.

@raehik
Copy link
Author

raehik commented Mar 26, 2024

bytestring shims unaligned writes using C memcpys. We could maybe do the same, but would have to convert IO (due to FFI) to ST, which I'd need some help with proving safety. Do we require a pure Haskell version?

@andrewthad
Copy link
Collaborator

Using the FFI for the shim isn't necessary. For reading/indexing, see Data.Bytes.Parser.BigEndian:word64 for an example of how to do this:

word64 :: e -> Parser e s Word64
word64 e = uneffectful $ \chunk ->
  if length chunk >= 8
    then
      let wa = PM.indexByteArray (array chunk) (offset chunk) :: Word8
          wb = PM.indexByteArray (array chunk) (offset chunk + 1) :: Word8
          wc = PM.indexByteArray (array chunk) (offset chunk + 2) :: Word8
          wd = PM.indexByteArray (array chunk) (offset chunk + 3) :: Word8
          we = PM.indexByteArray (array chunk) (offset chunk + 4) :: Word8
          wf = PM.indexByteArray (array chunk) (offset chunk + 5) :: Word8
          wg = PM.indexByteArray (array chunk) (offset chunk + 6) :: Word8
          wh = PM.indexByteArray (array chunk) (offset chunk + 7) :: Word8
       in Success
            ( unsafeShiftL (fromIntegral wa) 56
                .|. unsafeShiftL (fromIntegral wb) 48
                .|. unsafeShiftL (fromIntegral wc) 40
                .|. unsafeShiftL (fromIntegral wd) 32
                .|. unsafeShiftL (fromIntegral we) 24
                .|. unsafeShiftL (fromIntegral wf) 16
                .|. unsafeShiftL (fromIntegral wg) 8
                .|. fromIntegral wh
            )
            (offset chunk + 8)
            (length chunk - 8)
    else Failure e

And for the other direction, writing, see Data.Bytes.Builder.Bounded:word64BE:

word64BE :: Word64 -> Builder 8
word64BE w = Unsafe.construct $ \arr off -> do
  writeByteArray arr (off) (fromIntegral @Word64 @Word8 (unsafeShiftR w 56))
  writeByteArray arr (off + 1) (fromIntegral @Word64 @Word8 (unsafeShiftR w 48))
  writeByteArray arr (off + 2) (fromIntegral @Word64 @Word8 (unsafeShiftR w 40))
  writeByteArray arr (off + 3) (fromIntegral @Word64 @Word8 (unsafeShiftR w 32))
  writeByteArray arr (off + 4) (fromIntegral @Word64 @Word8 (unsafeShiftR w 24))
  writeByteArray arr (off + 5) (fromIntegral @Word64 @Word8 (unsafeShiftR w 16))
  writeByteArray arr (off + 6) (fromIntegral @Word64 @Word8 (unsafeShiftR w 8))
  writeByteArray arr (off + 7) (fromIntegral @Word64 @Word8 w)
  pure (off + 8)

Admittedly, that's for ByteArray# and not Addr#, but it illustrates the algorithm.

@raehik
Copy link
Author

raehik commented Mar 26, 2024

Thanks and understood, shame because it's got poor performance but it is a compat layer. I'll start a PR on prim-unaligned later :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants