issue 1093
This work is being done in branch `dk-1093-large-rdi`, not in an older branch called `dk-1093`. (I am experimenting with the idea of more informative branch names, of the form `developerInitials-issueNumber-WordsSeparatedWithHyphens`.)
- 2017 Jan 4. I think things are working now, for blocks where `from` and `to` yield a subset that is small enough to fit into R. However, I do not think this is the common use case. When I work with data, I would likely prefer to work with the `by` argument, to get a rough overview of the whole timeseries, before focusing on smaller time intervals. I need to write more C code to handle `by` in this way, and so I would say the work is only 1/4 done. Remaining tasks:
  - Handle `by` better, by filling up an `unsigned char` array with the results of a series of `seek` and `fread` calls.
  - Handle the case of numeric `from` and `to` faster (hand these arguments to the existing C function -- easy peasy).
  - See whether the present scheme of determining the segment pointers is inefficient. The present code reads the whole file twice: a first pass merely counts pointers (for a memory allocation) and the second pass stores them into the allocated memory. Another approach would be to have a growable allocation, so I will try that, now that I have a 6 GB file as a test case. (The worry with growable allocation is that time will be spent copying that memory, especially if the growth factor is small, but that we can still run out of memory if the growth factor is large.)
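The first and third tasks above can be sketched together. This is only an illustration under stated assumptions, not the code in the branch: the names `OffsetList`, `offset_list_push`, `find_markers` and `read_by` are hypothetical, segments are assumed to start with a two-byte marker and to have a fixed length `len`, and the growth factor of 1.5 is an arbitrary middle ground between the two worries mentioned above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical growable list of file offsets (illustration only). */
typedef struct {
    long *data;
    size_t n;        /* entries in use */
    size_t capacity; /* entries allocated */
} OffsetList;

/* Append one offset, growing the storage by `factor` when it is full.
 * Returns 0 on success, -1 if memory runs out. */
static int offset_list_push(OffsetList *list, long offset, double factor)
{
    assert(factor > 1.0);
    if (list->n == list->capacity) {
        size_t new_capacity = (size_t)(list->capacity * factor) + 1;
        long *p = realloc(list->data, new_capacity * sizeof *p);
        if (p == NULL)
            return -1;
        list->data = p;
        list->capacity = new_capacity;
    }
    list->data[list->n++] = offset;
    return 0;
}

/* One pass over the file: store the offset of each two-byte marker b1,b2.
 * No separate counting pass is needed, because the list grows on demand. */
static int find_markers(FILE *fp, int b1, int b2, OffsetList *list)
{
    int prev = EOF, c;
    long pos = 0;
    while ((c = fgetc(fp)) != EOF) {
        if (prev == b1 && c == b2) {
            if (offset_list_push(list, pos - 1, 1.5) != 0)
                return -1;
            prev = EOF; /* this byte must not start an overlapping match */
        } else {
            prev = c;
        }
        pos++;
    }
    return 0;
}

/* Copy every `by`-th segment (assumed to be `len` bytes long) into a new
 * unsigned char array, with one fseek/fread pair per kept segment.
 * Returns the array (caller frees) and sets *nkept; NULL on failure. */
static unsigned char *read_by(FILE *fp, const OffsetList *list,
                              size_t by, size_t len, size_t *nkept)
{
    assert(by > 0 && len > 0);
    size_t nkeep = (list->n + by - 1) / by;
    *nkept = 0;
    if (nkeep == 0)
        return NULL;
    unsigned char *buf = malloc(nkeep * len);
    if (buf == NULL)
        return NULL;
    for (size_t i = 0, k = 0; i < list->n; i += by, k++) {
        if (fseek(fp, list->data[i], SEEK_SET) != 0 ||
            fread(buf + k * len, 1, len, fp) != len) {
            free(buf);
            return NULL;
        }
    }
    *nkept = nkeep;
    return buf;
}
```

With this layout, only the kept segments are ever held in memory, so a coarse `by` gives an overview of a large file at a small fraction of its size. The offset list itself still scales with the number of segments, which is where the choice of growth factor matters.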
- 2017 Jan 8. I think this is working now. The code now does all of the profile-finding work in C, not in R. (I was really only weaving back and forth with R so that I could interpret times in R ... but then I realized that we can handle that by assuming GMT times, so standard C libraries work fine for constructing time as numeric.) The code has been tested quite thoroughly: it passes the build suite, the local test suite and my private test suite, and it reproduces the old `data(adp)` properly. I merged `dk-1093-large-rdi` into the `dk` branch, and have asked Clark to try this out for a while. If things are seen to be okay, we might get a merge into `develop` within a few days. (I don't want to leave the branch hanging for too long because we have other rdi work to do.)
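As a rough sketch of why assuming GMT removes the need to go back to R for times: with no time zone or daylight-saving rules involved, a numeric time (seconds since 1970-01-01 00:00:00 GMT) follows from plain integer arithmetic on the calendar fields. The functions `days_from_civil` and `gmt_seconds` below are hypothetical names used for illustration, and the day-count formula is a well-known proleptic Gregorian calculation, not necessarily what the branch uses.

```c
#include <assert.h>

/* Days between 1970-01-01 and the given Gregorian date (negative before 1970).
 * The year is shifted so that it starts in March, which puts the leap day
 * at the end of the shifted year. */
static long days_from_civil(long y, int m, int d)
{
    assert(m >= 1 && m <= 12 && d >= 1 && d <= 31);
    y -= (m <= 2);
    long era = (y >= 0 ? y : y - 399) / 400;                    /* 400-year cycle */
    long yoe = y - era * 400;                                   /* year of era, 0..399 */
    long doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1;  /* day of shifted year */
    long doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;           /* day of era */
    return era * 146097 + doe - 719468;
}

/* Numeric time in seconds since the epoch, treating all fields as GMT.
 * Fractional seconds are carried through in the double. */
static double gmt_seconds(long year, int month, int day,
                          int hour, int minute, double second)
{
    double days = (double)days_from_civil(year, month, day);
    return days * 86400.0 + hour * 3600.0 + minute * 60.0 + second;
}
```

The returned value uses the same seconds-since-epoch convention as a numeric time in R, so it can be handed back from C and given a time class with a GMT/UTC time zone on the R side.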