Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add async alignment writer #292

Merged
merged 8 commits into from
Aug 26, 2024

Conversation

mbhall88
Copy link
Contributor

@mbhall88 mbhall88 commented Aug 22, 2024

See #286 for more details.

Checklist

  • async Writer
  • Builder
  • Documentation examples
  • examples/ entry

Currently, I cannot compile the async Writer due to

error[E0308]: mismatched types
   --> noodles-util/src/alignment/async/io/writer.rs:80:63
    |
80  |             Self::Cram(writer) => writer.write_record(header, record).await,
    |                                          ------------         ^^^^^^ expected `Record`, found `&dyn Record`
    |                                          |
    |                                          arguments to this method are incorrect
    |
    = note: expected struct `noodles_cram::Record`
            found reference `&dyn noodles_sam::alignment::Record`
note: method defined here
   --> /Users/michael/Projects/noodles/noodles-cram/src/async/io/writer.rs:181:18
    |
181 |     pub async fn write_record(
    |                  ^^^^^^^^^^^^

Which is coming from the fact that cram::async::io::Writer::write_record takes a cram::Record type rather than a &dyn sam::alignment::Record trait.

mut record: Record,

Is that a change that should be made in a separate PR?

In addition, I also noticed some slight differences in function naming schemes between SAM/BAM and CRAM. For example, the SAM and BAM async Writers have a write_header function, however, the reciprocal CRAM function is write_file_header - not sure if this was intended though. Also, the CRAM async builder has the function build_with_writer rather than the build_from_writer. I appreciate these are minor things, but thought I would raise them in case you want to move this library towards consistent names across modules/crates. I guess for backwards compatibility we could add the #[deprecated attribute if you want to introduce the consistent named version.

@zaeleus zaeleus added the util label Aug 22, 2024
@zaeleus zaeleus linked an issue Aug 22, 2024 that may be closed by this pull request
@zaeleus
Copy link
Owner

zaeleus commented Aug 22, 2024

    = note: expected struct `noodles_cram::Record`
            found reference `&dyn noodles_sam::alignment::Record`

Is that a change that should be made in a separate PR?

For now, convert &dyn sam::alignment::Record to a cram::Record using cram::Record::try_from_alignment_record. I can normalize the interfaces after this patch.

In addition, I also noticed some slight differences in function naming schemes between SAM/BAM and CRAM. For example, the SAM and BAM async Writers have a write_header function, however, the reciprocal CRAM function is write_file_header - not sure if this was intended though.

Yes, this is intentional. I/O is modeled after the physical structure for the format (see § 5 "File structure" (2024-03-21) for a visual representation). The file header is named to differentiate itself from the many other headers in the format, e.g., the container header, the slice header, the block header, etc. I did add a convenience method to the blocking reader in 2da4eae, which simply needs to be synchronized with the async reader later. Use the currently available methods to write a default file definition, and use the SAM header to write the file header, e.g.,

writer.write_file_definition().await?;
writer.write_file_header(&header).await?;

Also, the CRAM async builder has the function build_with_writer rather than the build_from_writer.

Good catch; it's definitely a typo. Use the methods that are currently available. I can also fix this after this patch.

@mbhall88
Copy link
Contributor Author

Further to the above, when adding the async Writer example I realised I had missed adding Writer::finish. Again, the individual BAM and CRAM methods that are analogous are called shutdown instead of finish. I took the name finish to mirror the sync alignment API. Not sure if these too should be harmonised?

Aside from that, I think this PR is done and ready for review.

noodles-util/src/alignment/async/io/writer.rs Outdated Show resolved Hide resolved
noodles-util/src/alignment/async/io.rs Outdated Show resolved Hide resolved
noodles-util/src/alignment/async/io/writer/builder.rs Outdated Show resolved Hide resolved
@mbhall88 mbhall88 requested a review from zaeleus August 26, 2024 04:35
Copy link
Owner

@zaeleus zaeleus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unified async alignment reader/writer
2 participants