KNETFILE_HOOKS for using alternate network library #5
The UCSC Genome Browser code base includes a network library with an interface analogous to knetfile.c's, but with a few extra features (HTTPS, username/password authentication, sparse file caching of complete URL paths). To use UCSC's network library within samtools, I have added a layer of indirection to the knetfile functions. An alternate implementation of the knetfile functions can be registered by passing in function pointers. Then, when a knetfile function is called and an alternate function pointer has been registered, the knetfile function calls the alternate function and returns its result. The new code is all inside "#ifdef KNETFILE_HOOKS", so if -DKNETFILE_HOOKS is omitted, samtools is compiled without the new layer.
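For readers unfamiliar with the pattern, here is a minimal sketch of this kind of function-pointer indirection. It is not the code from the patch: `demo_file`, `demo_read`, `demo_set_hooks`, and `alt_demo_read` are hypothetical names made up for illustration, and only the KNETFILE_HOOKS macro name comes from the description above. The real knetfile functions have different types and signatures.

```c
#include <stddef.h>
#include <string.h>

/* Normally supplied on the compiler command line as -DKNETFILE_HOOKS. */
#ifndef KNETFILE_HOOKS
#define KNETFILE_HOOKS 1
#endif

/* Hypothetical stand-in for a knetfile handle: an in-memory buffer. */
typedef struct {
    const char *data;
    size_t pos;
    size_t len;
} demo_file;

#ifdef KNETFILE_HOOKS
/* Signature that an alternate read implementation must match. */
typedef long (*demo_read_fn)(demo_file *fp, void *buf, long len);

/* NULL means "no alternate registered; use the built-in code". */
static demo_read_fn alt_read = NULL;

/* Register (or, with NULL, unregister) an alternate implementation. */
void demo_set_hooks(demo_read_fn read_fn)
{
    alt_read = read_fn;
}
#endif

/* Public entry point: defers to the alternate if one is registered. */
long demo_read(demo_file *fp, void *buf, long len)
{
    size_t n;

    if (len < 0)
        return -1;
#ifdef KNETFILE_HOOKS
    if (alt_read)
        return alt_read(fp, buf, len);
#endif
    /* Built-in behaviour: copy up to len bytes from the buffer. */
    n = fp->len - fp->pos;
    if ((size_t)len < n)
        n = (size_t)len;
    memcpy(buf, fp->data + fp->pos, n);
    fp->pos += n;
    return (long)n;
}

/* Example alternate implementation: fills buf with 'x' and counts calls. */
int alt_calls = 0;

long alt_demo_read(demo_file *fp, void *buf, long len)
{
    (void)fp; /* the alternate ignores the built-in handle state */
    alt_calls++;
    memset(buf, 'x', (size_t)len);
    return len;
}
```

Calling `demo_set_hooks(alt_demo_read)` routes every later `demo_read` call to the alternate, and `demo_set_hooks(NULL)` restores the built-in path. Because the hook check sits inside the `#ifdef`, building without the macro leaves the original function body unchanged.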
Adding the hooks required a little extra abstraction in a couple of other places in the code. For example, in bam_index_load_core, when KNETFILE_HOOKS is defined, the index file is accessed through knetfile instead of being saved to the local directory and read with fread.
This patch also #if 0's out the EOF check in bam.c, because I considered it costly and unnecessary for the Genome Browser's purposes. I'd understand if you want to exclude that part, though a separate #ifdef for it would be nice.
This patch has been in use for over a year in the UCSC Genome Browser and several of our mirror sites. Versions of the patch with line numbers adjusted for different samtools releases, along with instructions for applying the patch, are available at http://genomewiki.ucsc.edu/index.php/KNETFILE_HOOKS
Some mirror maintainers have asked that I submit this, in the hope that they won't have to patch manually every time they get the latest version of samtools.
I hope you'll consider incorporating this into samtools. Thanks!