Added the HiC QC report to the final report #171

Merged 2 commits on Oct 31, 2024
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -3,7 +3,7 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

- ## v2.2.0dev - [25-Oct-2024]
+ ## v2.2.0dev - [31-Oct-2024]

### `Added`

@@ -14,6 +14,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
5. Added `text/html` as content mime type for the report file [#146](https://github.com/Plant-Food-Research-Open/assemblyqc/issues/146)
6. Added a sequence labels table below the HiC contact map [#147](https://github.com/Plant-Food-Research-Open/assemblyqc/issues/147)
7. Added parameter `hic_samtools_ext_args` and set its default value to `-F 3852` [#159](https://github.com/Plant-Food-Research-Open/assemblyqc/issues/159)
+ 8. Added the HiC QC report to the final report so that users don't have to navigate to the results folder [#162](https://github.com/Plant-Food-Research-Open/assemblyqc/issues/162)

### `Fixed`

19 changes: 15 additions & 4 deletions bin/report_modules/parsers/hic_parser.py
@@ -15,28 +15,38 @@ def parse_hic_folder(folder_name="hic_outputs"):
          return {}

      list_of_hic_files = hic_folder_path.glob("*.html")
+     list_of_hic_files = [
+         x for x in list_of_hic_files if re.match(r"^\w+\.html$", x.name)
+     ]

      data = {"HIC": []}

      for hic_path in list_of_hic_files:
          hic_file_name = os.path.basename(str(hic_path))

-         file_tokens = re.findall(
+         tag = re.findall(
              r"([\w]+).html",
              hic_file_name,
          )[0]

-         labels_table = pd.read_csv(f"{folder_name}/{file_tokens}.agp.assembly", sep=" ")

+         # Get the labels table
+         labels_table = pd.read_csv(f"{folder_name}/{tag}.agp.assembly", sep=" ")
          labels_table = labels_table[labels_table.iloc[:, 0].str.startswith(">")].iloc[
              :, [0, 2]
          ]
          labels_table.columns = ["Sequence", "Length"]
          labels_table.Length = labels_table.Length.astype(int)

+         # Get the HiC QC report
+         hicqc_report = [
+             x
+             for x in hic_folder_path.glob("*.pdf")
+             if re.match(rf"[\S]+\.on\.{tag}_qc_report\.pdf", x.name)
+         ][0]

          data["HIC"].append(
              {
-                 "hap": file_tokens,
+                 "hap": tag,
                  "hic_html_file_name": hic_file_name,
                  "labels_table": labels_table.to_dict("records"),
                  "labels_table_html": tabulate(
@@ -46,6 +56,7 @@ def parse_hic_folder(folder_name="hic_outputs"):
                      numalign="left",
                      showindex=False,
                  ),
+                 "hicqc_report_pdf": os.path.basename(str(hicqc_report)),
              }
          )
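
Review note on the lookup added in this hunk: the list comprehension is indexed with `[0]`, so a tag with no matching PDF would raise `IndexError`, and `re.match` anchors only at the start of the file name. The sketch below shows one way the same matching rules could be written with an explicit "not found" result. It is only an illustration under assumptions: the helper names `list_contact_map_tags` and `find_qc_report` and the demo file names are hypothetical and not part of this PR, and it uses `fullmatch` where the PR uses `re.match`.

```python
import re
import tempfile
from pathlib import Path
from typing import List, Optional


def list_contact_map_tags(hic_dir: Path) -> List[str]:
    """Return <tag> for every file named <tag>.html, where <tag> is word characters only."""
    tags = []
    for path in sorted(hic_dir.glob("*.html")):
        # Same filter as the PR: names with extra dots (e.g. a.b.html) are skipped.
        match = re.match(r"^(\w+)\.html$", path.name)
        if match:
            tags.append(match.group(1))
    return tags


def find_qc_report(hic_dir: Path, tag: str) -> Optional[str]:
    """Return the file name of the QC PDF for `tag`, or None when there is none."""
    # Pattern taken from the PR; fullmatch (an assumption of this sketch)
    # also rejects trailing characters after ".pdf".
    pattern = re.compile(rf"\S+\.on\.{re.escape(tag)}_qc_report\.pdf")
    names = sorted(p.name for p in hic_dir.glob("*.pdf") if pattern.fullmatch(p.name))
    return names[0] if names else None


# Demo with made-up file names in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    hic_dir = Path(tmp)
    for name in (
        "hap1.html",
        "hap1.extra.html",              # rejected by the \w+ filter
        "reads.on.hap1_qc_report.pdf",  # hypothetical QC report name
        "hap2.html",                    # no QC report for this tag
    ):
        (hic_dir / name).touch()

    tags = list_contact_map_tags(hic_dir)
    reports = {tag: find_qc_report(hic_dir, tag) for tag in tags}
    print(reports)
```

With a `None` result the caller can decide whether to skip the QC section for that tag or to fail with a clearer message; which behaviour suits the report is a design choice for the maintainers.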

12 changes: 12 additions & 0 deletions bin/report_modules/templates/header.html
@@ -213,6 +213,18 @@

  .iframe-wrapper {
      text-align: center;
+     width: 90%;
+     margin-left: auto;
+     margin-right: auto;
+     margin-bottom: 32px;
+ }
+
+ .iframe-wrapper-hic {
+     width: 700px;
+     height: 850px;
+     margin-left: auto;
+     margin-right: auto;
+     margin-bottom: 32px;
  }

.tab {
13 changes: 11 additions & 2 deletions bin/report_modules/templates/hic/report_contents.html
@@ -5,14 +5,23 @@
<div class="section-heading-wrapper">
<div class="section-heading">{{ all_stats_dicts['HIC'][item]['hap'] }}</div>
</div>
- <div class="iframe-wrapper">
- <iframe src="./hic/{{ all_stats_dicts['HIC'][item]['hic_html_file_name'] }}" width="100%" height="100%"></iframe>
+ <div class="iframe-wrapper-hic">
+ <iframe src="./hic/{{ all_stats_dicts['HIC'][item]['hic_html_file_name'] }}" width="700px" height="850px"></iframe>
</div>
</div>
<div class="results-section">
<div class="section-para-wrapper">
<p class="section-para"><b>Sequence labels and lengths</b></p>
</div>
<div class="table-outer">
<div class="table-wrapper">{{ all_stats_dicts['HIC'][item]['labels_table_html'] }}</div>
</div>
+ <div class="section-para-wrapper">
+ <p class="section-para"><b>HiC QC report</b></p>
+ </div>
+ <div class="iframe-wrapper">
+ <iframe src="./hic/hicqc/{{ all_stats_dicts['HIC'][item]['hicqc_report_pdf'] }}" width="100%" height="100%"></iframe>
+ </div>
</div>
</div>
{% if vars.update({'is_first': False}) %} {% endif %} {% endfor %}
Binary file added docs/images/hicqc.png
7 changes: 6 additions & 1 deletion docs/output.md
@@ -198,7 +198,12 @@ Kraken2 [assigns taxonomic labels](https://ccb.jhu.edu/software/kraken2/) to seq

Hi-C contact mapping experiments measure the frequency of physical contact between loci in the genome. The resulting dataset, called a “contact map,” is represented using a [two-dimensional heatmap](https://github.com/igvteam/juicebox.js) where the intensity of each pixel indicates the frequency of contact between a pair of loci.

- <div align="center"><img src="images/hic_map.png" alt="AssemblyQC - HiC interactive contact map" width="50%"><hr><em>AssemblyQC - HiC interactive contact map</em></div>
+ <div align="center">
+ <img src="images/hicqc.png" alt="AssemblyQC - HiC QC report" width="44.59%">
+ <img src="images/hic_map.png" alt="AssemblyQC - HiC interactive contact map" width="40%">
+ <hr>
+ <em>AssemblyQC - HiC results</em>
+ </div>

### Synteny

2 changes: 2 additions & 0 deletions subworkflows/local/fq2hic.nf
@@ -64,6 +64,7 @@ workflow FQ2HIC {

HICQC ( ch_bam_and_ref.map { meta3, bam, fa -> [ meta3, bam ] } )

+ ch_hicqc_pdf = HICQC.out.pdf
ch_versions = ch_versions.mix(HICQC.out.versions)

// MODULE: MAKEAGPFROMFASTA | AGP2ASSEMBLY | ASSEMBLY2BEDPE
@@ -95,6 +96,7 @@ workflow FQ2HIC {
ch_versions = ch_versions.mix(HIC2HTML.out.versions.first())

emit:
+ hicqc_pdf = ch_hicqc_pdf
hic = ch_hic
html = HIC2HTML.out.html
assembly = AGP2ASSEMBLY.out.assembly
4 changes: 4 additions & 0 deletions workflows/assemblyqc.nf
@@ -590,12 +590,16 @@ workflow ASSEMBLYQC {
params.hic_skip_fastqc
)

+ ch_hicqc_pdf = FQ2HIC.out.hicqc_pdf
ch_hic_html = FQ2HIC.out.html
ch_hic_assembly = FQ2HIC.out.assembly
ch_hic_report_files = ch_hic_html
| mix(
ch_hic_assembly.map { tag, assembly -> assembly }
)
+ | mix(
+ ch_hicqc_pdf.map { meta, pdf -> pdf }
+ )
ch_versions = ch_versions.mix(FQ2HIC.out.versions)

// SUBWORKFLOW: FASTA_SYNTENY