read/write bits #5

Open
MonteShaffer opened this issue Aug 18, 2022 · 1 comment

@MonteShaffer

Hello, thank you for your efforts. It is very fast. I tried it on a prime sieve, where the indices set to TRUE are the prime numbers.

To get n = 1000 primes, I overshoot by sieving up to the upper bound gn = ceiling( n * log(n) + n * log(log(n)) ), which is 8841 for n = 1000.

Here is the format of the output. I truncate the function result to exactly n = 1000 primes but keep the full calculation; in this case there are 1102 primes up to gn.
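For reference, the bound can be checked with a minimal base-R sketch. This is not the code behind this report: the names `n`, `gn`, `is_prime`, and `first_n` are illustrative, and it uses a plain logical vector rather than a bit or bitwhich object.

```r
# Sieve of Eratosthenes up to the overshoot bound for the n-th prime.
n  <- 1000
gn <- ceiling(n * log(n) + n * log(log(n)))  # 8841 for n = 1000

is_prime <- rep(TRUE, gn)
is_prime[1] <- FALSE                         # 1 is not prime

for (p in 2:floor(sqrt(gn))) {
  if (is_prime[p]) {
    # Cross out multiples of p, starting at p^2.
    is_prime[seq(p * p, gn, by = p)] <- FALSE
  }
}

primes  <- which(is_prime)      # all primes <= gn
first_n <- primes[seq_len(n)]   # truncate to exactly n primes

length(primes)                  # 1102, the count shown in the output below
first_n[n]                      # 7919, the 1000th prime
```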

bitwhich: 1102/8841 occupying only 1102 int32 in 1 representation
    1     2     3     4     5     6     7     8        8834  8835  8836  8837  8838  8839  8840  8841 
FALSE  TRUE  TRUE FALSE  TRUE FALSE  TRUE FALSE    .. FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE FALSE 

My question is about storage. From my understanding, if I just save/load or saveRDS/readRDS the 8841 elements, the file size should only be about 1.1KB (8841 bits is roughly 1105 bytes). Trying both methods (save and saveRDS), I get 2.6KB, and writeBin throws an error when I try it.

I didn't see any dedicated read/write functions in the docs. Maybe I missed them.

Any ideas or suggestions?

Again thank you. I wonder if DNA researchers could use a two-bit version of this for "AGCT"? Very powerful.
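To make the two-bit idea concrete, here is a rough base-R sketch. It is only an illustration under assumptions: `encode_dna()` and `decode_dna()` are hypothetical helpers, not part of the bit package, and the A/C/G/T to 0..3 mapping is an arbitrary choice.

```r
# Pack a DNA string into 2 bits per base using base R's packBits().
encode_dna <- function(s) {
  bases <- strsplit(s, "", fixed = TRUE)[[1]]
  codes <- match(bases, c("A", "C", "G", "T")) - 1L   # 0..3
  stopifnot(!anyNA(codes))                            # only A/C/G/T allowed
  # Interleave low bit and high bit of each code: lo1, hi1, lo2, hi2, ...
  bits <- as.vector(rbind(codes %% 2L == 1L, codes %/% 2L == 1L))
  pad  <- (8L - length(bits) %% 8L) %% 8L             # pad to whole bytes
  list(n = length(bases),
       bytes = packBits(c(bits, logical(pad)), type = "raw"))
}

# Reverse the packing; enc$n tells us how many bases to keep.
decode_dna <- function(enc) {
  bits  <- as.integer(rawToBits(enc$bytes))[seq_len(2L * enc$n)]
  codes <- bits[c(TRUE, FALSE)] + 2L * bits[c(FALSE, TRUE)]
  paste(c("A", "C", "G", "T")[codes + 1L], collapse = "")
}

enc <- encode_dna("GATTACA")
length(enc$bytes)   # 7 bases * 2 bits = 14 bits, stored in 2 bytes
decode_dna(enc)
```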

@MichaelChirico
Collaborator

I'm afraid there's some context lost here. But I think the overall point is we'd like an equivalent of readBin()/writeBin() for I/O of bit objects.
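As a rough sketch of what such an equivalent might look like, the following uses only base R (`packBits()`, `rawToBits()`, `writeBin()`, `readBin()`). The helpers `write_bits()` and `read_bits()` are hypothetical, not functions from the bit package, and they assume the object has first been converted to a plain logical vector without NAs. The 4-byte length header is also just one possible file layout.

```r
# Write a logical vector as 1 bit per element, preceded by its length.
write_bits <- function(x, path) {
  stopifnot(is.logical(x), !anyNA(x))
  n   <- length(x)
  pad <- (8L - n %% 8L) %% 8L                 # pad to a whole number of bytes
  bytes <- packBits(c(x, logical(pad)), type = "raw")
  con <- file(path, "wb")
  on.exit(close(con))
  writeBin(as.integer(n), con, size = 4L, endian = "little")  # length header
  writeBin(bytes, con)
  invisible(path)
}

# Read the vector back, dropping the padding bits.
read_bits <- function(path) {
  con <- file(path, "rb")
  on.exit(close(con))
  n     <- readBin(con, "integer", n = 1L, size = 4L, endian = "little")
  bytes <- readBin(con, "raw", n = ceiling(n / 8))
  as.logical(rawToBits(bytes))[seq_len(n)]
}

# Usage with a vector the same length as the one in the report.
x <- rep(c(TRUE, FALSE, FALSE), length.out = 8841)
tmp <- tempfile()
write_bits(x, tmp)
file.size(tmp)                # 4-byte header + ceiling(8841 / 8) = 1110 bytes
identical(read_bits(tmp), x)
```

With this layout the file is close to the 1.1KB expected in the report, at the cost of skipping the compression and attribute handling that saveRDS() provides.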

Could you give a reprex of your saveRDS() experiment? I don't see the same result; actually, I get a lot of compression:

library(bit)

x <- bit(1e8)
x[sample(1e8, 1e6)] <- TRUE

tmp <- tempfile()
saveRDS(x, tmp)

file.size(tmp)
# [1] 1601882
as.double(object.size(x))/file.size(tmp)
# [1] 7.804265
