[WIP] Support DOT inputs (and multigraphs); much improved boundary node splitting in pattern identification; add back frayed rope support; lots of other improvements #244

Open: wants to merge 379 commits into base: master
40d4b95
MNT/DOC: minor node "name"/"ID" chgs
fedarko Apr 14, 2023
d42eed5
MNT: start adding node/edge objs
fedarko Apr 15, 2023
6c7a397
MNT/DOC: continue setup of node/edge objs #204
fedarko Apr 16, 2023
6860ca1
MNT: more data model refactoring #204
fedarko Apr 16, 2023
149d79c
TST: adjust check_attrs test
fedarko Apr 16, 2023
4e654bf
MNT: continue refactor; update scale_nodes() #204
fedarko Apr 16, 2023
9d8c1eb
TST/DOC: update node scaling tests; a comment
fedarko Apr 16, 2023
e657bdf
MNT/DOC: this was located in the wrong folder
fedarko Apr 17, 2023
f38676f
MNT: rename nodes in FASTG graphs
fedarko Apr 17, 2023
78f2aa0
MNT: update edge scaling re: refactor
fedarko Apr 17, 2023
9f3e103
DOC/MNT: slight AsmGraph updates re: refactor
fedarko Apr 17, 2023
b7972d6
MNT: move get_edge_weight_field() down
fedarko Apr 17, 2023
f59cd42
DOC: to_dot() plans
fedarko Apr 17, 2023
7bdd319
MNT: relabel nodes (and kinda edges) in nx graph
fedarko Apr 17, 2023
a14c0c7
MNT: graph __init__
fedarko Apr 17, 2023
fd6f0e7
MNT: hierarchical decomposition plans
fedarko Apr 17, 2023
299f8bf
DOC: link to Nijkamp 2013 at README start [ci skip]
fedarko Apr 17, 2023
f19b77a
DOC: bungled it [ci skip]
fedarko Apr 17, 2023
73b1581
MNT: internal AG get _ prefix; layout separately
fedarko Apr 17, 2023
8941bb8
MNT/ENH: move logging to AG, and output stats
fedarko Apr 17, 2023
f7c2e9a
MNT: gracefully fail if layout not done [ci skip]
fedarko Apr 17, 2023
e2f5b19
STY: black changed its mind abt ** spacing
fedarko Apr 18, 2023
8e3c5cc
MNT/DOC: submodules in mgsc stuff; layout docs
fedarko Apr 18, 2023
d3b787b
MNT/BUG: store node names; setup edgeid2obj right
fedarko Apr 18, 2023
e09aab7
ENH: configurable node splitting names
fedarko Apr 18, 2023
6980a9f
STY: lint [ci skip]
fedarko Apr 18, 2023
50481cb
MNT: refactoring; patterns subclass nodes again
fedarko Apr 18, 2023
a05adb3
BUG/MNT: fix (circular) import crap
fedarko Apr 18, 2023
e8657ca
MNT/DOC: fun ID stuff
fedarko Apr 18, 2023
e369c6d
DOC: notes on too-large-component removal fn
fedarko Apr 18, 2023
9102e9e
MNT: tiny shortening
fedarko Apr 18, 2023
6149307
DOC/MNT: AsmGraph docs; _add_pattern() work; ...
fedarko Apr 18, 2023
6a5c0e6
MNT: frustrating work on decomp stuff
fedarko Apr 18, 2023
e294642
MNT: let validators specify pattern type (bubbles)
fedarko Apr 18, 2023
b71c53a
TST/DOC: fixing tests re bubble/VR chg; upd8 vdocs
fedarko Apr 18, 2023
458f1f8
BUG: fix PT_FRAYEDROPE name
fedarko Apr 18, 2023
7c653ad
MNT: _hierarchically_...() tentatively ready
fedarko Apr 18, 2023
b5db28f
MNT: splitting refactor changes, etc #204 #167
fedarko Apr 18, 2023
24d3e92
DOC: patt decomp TODOs, and Pattern repr
fedarko Apr 18, 2023
56ced2b
DOC: note on "while True" [ci skip]
fedarko Apr 18, 2023
c2e7867
BUG: add back outer "while True" to h.decomp
fedarko Apr 20, 2023
020d893
MNT: abstracting reused code in node splitting
fedarko Apr 20, 2023
8f42f00
DOC/ENH: comments re splitting, etc; err on psplit
fedarko Apr 20, 2023
9699863
DOC/BUG: edge repr; FR discarding bug; plans
fedarko Apr 20, 2023
23db67a
BUG/MNT: fix splitting stuff
fedarko Apr 20, 2023
403a04f
MNT: missing f-strings in Pattern code
fedarko Apr 20, 2023
77739d4
MNT: work on disallowing "trivial" chains
fedarko Apr 20, 2023
c2ec90b
DOC: slight comments
fedarko Apr 20, 2023
5a7be3f
DOC/PERF: note abt uncommon h.decomp perf thing
fedarko Apr 20, 2023
4e43848
MNT: initial untested vsn: no-ETFE chain validator
fedarko Apr 21, 2023
92103d6
MNT: rm ETFEs from chains; fix patt collections
fedarko Apr 21, 2023
e78a2cd
BUG: fix start node assigning in ETFE trimming
fedarko Apr 21, 2023
7404bd1
MNT: rm debugging prints from prev commit
fedarko Apr 21, 2023
f4e2219
DOC/STY: pass_objs code
fedarko Apr 21, 2023
c57453e
BUG/DOC: fix something_collapsed flag; patt repr
fedarko Apr 21, 2023
b37845c
TST/BUG: fix hierarchical test graph
fedarko Apr 21, 2023
23481d6
BUG/ENH: don't split patts; DOT dumping
fedarko Apr 21, 2023
78c7424
TST: update top h decomp test
fedarko Apr 21, 2023
c06e874
DOC: display -> visualize in --no-patterns desc
fedarko Apr 21, 2023
4347d05
DOC/ENH: more AG initialization logging messages
fedarko Apr 21, 2023
92d7492
STY [ci skip]
fedarko Apr 25, 2023
72e48b1
MNT/DOC: tidy up dump_dots() for general use
fedarko Apr 25, 2023
7c82a98
MNT: update no-ETFE chain function (more lenient)
fedarko Apr 25, 2023
01427f0
MNT/DOC: rename _no_etfes to _trimmed_etfes, & doc
fedarko Apr 25, 2023
b34415f
MNT: update pattid2obj, pcolls in _add_pattern()
fedarko Apr 26, 2023
2f48a66
DOC: note about edge parents
fedarko Apr 26, 2023
428cd62
DOC: notes about edge replacing in decomposition
fedarko Apr 26, 2023
cad32f4
MNT: Pass Edges directly to Pattern (not subgraph)
fedarko Apr 27, 2023
4e0158d
DOC: maybe-obvious note about AG stuff
fedarko Apr 27, 2023
fd90598
DOC: _add_pattern() splitting
fedarko Apr 27, 2023
73d1b7b
TST: add "1-in bubble" test
fedarko Apr 27, 2023
0b095b3
DOC: use PT2HR in ValidationResults repr()
fedarko Apr 27, 2023
316faa4
BUG/DOC/MNT: fix decomposed-graph edge routing
fedarko Apr 27, 2023
da96cfa
TST: fix a test i broke when updating VR repr()
fedarko Apr 27, 2023
50f167f
DOC: Edge "levels" -- improve descriptions
fedarko Apr 27, 2023
812e0ad
MNT: remove comment about now-fixed issue :)
fedarko Apr 27, 2023
ca33c42
MNT: refactor Pattern to use Node/Edge obj lists
fedarko Apr 27, 2023
fcc97d0
DOC: accidentally a )
fedarko Apr 27, 2023
dc8d5a6
MNT: removing unnecessary "self" from some calls
fedarko Apr 27, 2023
9426053
BUG/MNT: layout fixes
fedarko Apr 27, 2023
e76ebf5
DOC: "list of" in Pattern constructor
fedarko Apr 27, 2023
585b641
REL: remove pkg imports from top-level __init__
fedarko Apr 27, 2023
48b403d
REL/MNT: restore __all__ imports in t.l. __init__
fedarko Apr 27, 2023
cbfdb1b
MNT: simply Pattern constructor: get ptype from VR
fedarko Apr 27, 2023
bed21a6
DOC: check off completed task (patt subg stuff)
fedarko Apr 27, 2023
3856bca
MNT: ugly & almost-working impl of chain merging
fedarko Apr 28, 2023
4ae1b78
MNT: flip chain-merging func in Pattern
fedarko Apr 28, 2023
e1a69c6
MNT: do away with Pattern.node_ids attr
fedarko Apr 28, 2023
3472dbe
MNT: use node IDs in pattern repr()
fedarko Apr 28, 2023
f6b2224
MNT: chain merging working???
fedarko Apr 29, 2023
c8ce401
DOC/MNT: comments abt chain merging, & err catch
fedarko Apr 29, 2023
d595434
BUG: fix chain merging pbms
fedarko Apr 29, 2023
5d2c253
BUG: forgor to use pred instead of adj
fedarko Apr 29, 2023
bc71b3e
TST: updating h-decomp tests re chain splitting
fedarko Apr 29, 2023
25976c3
BUG: remove merged chains from candidate nodes
fedarko Apr 29, 2023
44a4fad
TST: validators.fail_if_not_single_edge()
fedarko Apr 29, 2023
01c2b17
TST: more validators utils stuff
fedarko Apr 29, 2023
d753c09
DOC: chain merging is done!
fedarko Apr 29, 2023
2ff5aaf
TST: chain-into-cyclic-chain merging :)
fedarko Apr 29, 2023
f10e8bd
MNT: initial impl of removing unnecessary splits
fedarko Apr 29, 2023
26b1289
MNT: "un-split" node names
fedarko Apr 29, 2023
0e79dc6
BUG/MNT: fix unnecessary split removal graph upd8s
fedarko Apr 29, 2023
d070078
STY
fedarko Apr 29, 2023
53ecb7d
BUG: oops, import WeirdError into edge.py
fedarko Apr 29, 2023
033d177
BUG: allow unnec. split nodes to be s/e of patt
fedarko Apr 29, 2023
bff5fa1
BUG/MNT: fix & simplify unnec split rm
fedarko Apr 29, 2023
d4a4906
DOC: rm'ing unnecessary split nodes is done :)
fedarko Apr 29, 2023
29d7f10
MNT: cleaner repr() for Patterns
fedarko Apr 29, 2023
cec9078
BUG/MNT: check *ancestors* for unnec splits
fedarko Apr 29, 2023
58e893f
DOC/MNT: rm incorrect comment
fedarko Apr 29, 2023
563aa7f
BUG/MNT: separate (dec) graph traversal in unnecrm
fedarko Apr 29, 2023
f52333f
ENH: allow (splitting) multiple start/end nodes
fedarko May 1, 2023
c125058
STY/TST: fix bubble tests & restyle code
fedarko May 1, 2023
acfe782
TST: fix chain tests
fedarko May 1, 2023
5f4120a
TST: fix cyclic chain tests
fedarko May 1, 2023
e703b41
TST/MNT: VR sanity chks; fix VR utils tsts
fedarko May 1, 2023
0bf769b
TST: start/end node lists in FR validation
fedarko May 1, 2023
c99df7e
TST: misc_utils
fedarko May 1, 2023
f09fd7e
DOC: rm outdated comment
fedarko May 1, 2023
4e9786f
MNT: rename start_nodes -> start_node_ids in VR
fedarko May 1, 2023
1f34732
BUG: use correct candidate nodes set name
fedarko May 1, 2023
a515a1c
TST: FR splitting :); add nx2 (mc-style) gml func
fedarko May 1, 2023
cd4f758
DOC: rm unnec split comments
fedarko May 1, 2023
5a331d5
BUG: separate (dec)graph traversals in splitting
fedarko May 1, 2023
9d7810f
MNT: rm old prints
fedarko May 1, 2023
dca90c3
STY
fedarko May 1, 2023
4f94f04
BUG: correct key var name
fedarko May 1, 2023
e71f58a
BUG: fix outgoing node ID switchup in splitting
fedarko May 1, 2023
04aa932
BUG: oof that outgoing thing had another pbm
fedarko May 1, 2023
40e19d1
MNT: rm now-unused debug prints
fedarko May 1, 2023
584bfc1
TST: aug1 bug regression test
fedarko May 1, 2023
750d445
TST: another chain-Y test
fedarko May 1, 2023
029c0e3
TST: more FR stuff
fedarko May 1, 2023
73f58eb
TST: rm debug dump_dots
fedarko May 1, 2023
f0ad67e
MNT: simplify not_single_edge() & use in unnec rm
fedarko May 2, 2023
e5bbb28
BUG: use newline to separate patt ct & layout msgs
fedarko May 2, 2023
dbacff6
BUG/MNT: fix&simplify fake edge check in unnec rm
fedarko May 2, 2023
96495a6
STY/BUG: fake_edge_uid -> fe_id; missing import
fedarko May 2, 2023
4f73239
TST: uncollapsed chains within FRs; upd8 test code
fedarko May 2, 2023
d51cf3f
ENH: allow uncollapsed chains in FR middle
fedarko May 2, 2023
5281272
TST/DOC: test & doc misc_utils.verify_single()
fedarko May 2, 2023
35d0727
MNT: rename VR.nodes --> VR.node_ids
fedarko May 2, 2023
b607465
ENH/TST: disallow FRs in FRs
fedarko May 12, 2023
d369656
TST: rename & flesh out attr conflict tests
fedarko May 13, 2023
6971fcd
TST: rename two test files
fedarko May 13, 2023
eec8d56
DOC: minor comment updates in AG codebase
fedarko May 13, 2023
4386857
DOC: get_connected_components docstring formatting
fedarko May 13, 2023
a7f5893
TST: fix cc sorting tests re: refactor
fedarko May 13, 2023
903558d
STY: lint
fedarko May 13, 2023
50ca756
TST: fix edge scaling tests
fedarko May 13, 2023
282ce9e
TST/MNT: trying-to-scale-fake-edges
fedarko May 13, 2023
9db9d31
TST: fix node-scaling tests
fedarko May 13, 2023
f60527b
TST: call layout() in to_dict() test
fedarko May 13, 2023
2cb28f0
ENH: add AssemblyGraph.to_dot()
fedarko May 13, 2023
4012cc1
MNT/ENH: add to_dot() to dump_dots()
fedarko May 13, 2023
62cd418
DOC: unnecessary #
fedarko May 13, 2023
667cf5f
ENH: modify to_dot() to draw graph from L --> R
fedarko May 13, 2023
617ba77
ENH: smaller circles for no-orientation nodes
fedarko May 13, 2023
5522352
MNT: for now, just incl node name in to_dot()
fedarko May 13, 2023
3619b83
ENH: cc stats; disabling maxn/maxe
fedarko May 15, 2023
cd2727c
MNT/TST: merge arg_utils into file_utils; upd8s
fedarko May 15, 2023
1320a4c
DOC: tidy up README intro
fedarko May 15, 2023
6648e20
DOC: Vignettes
fedarko May 15, 2023
dbb9a25
DOC: vignettes ctd; maxn/maxe stuff [ci skip]
fedarko May 15, 2023
c06fd22
DOC: `` -> <code> in details headers [ci skip]
fedarko May 15, 2023
d134794
DOC: dot option & {RMD} stuff skeleton
fedarko May 16, 2023
951f4bf
ENH/DOC: add DOT CLI option; outputs optional; QOL
fedarko May 16, 2023
4beba5e
DOC: vignette tweaks [ciskip]
fedarko May 16, 2023
ef6d555
DOC: typo [ci skip]
fedarko May 16, 2023
11a6899
DOC/ENH: version option & "Documentation" section
fedarko May 16, 2023
ea37d12
DOC: CLI usage & header fix
fedarko May 16, 2023
612bf46
DOC: to_tsv() description tidying [ci skip]
fedarko May 16, 2023
9486940
MNT: some WIP layout stuff in Pattern
fedarko May 16, 2023
4d042f4
DOC: capitalize an S [ci skip]
fedarko May 16, 2023
f7e0574
DOC: trim down & tidy param descriptions [ci skip]
fedarko May 16, 2023
ae92aaa
ENH: node/edge metadata skeleton #243
fedarko May 17, 2023
db68752
ENH: include pattern type in cluster names in DOT
fedarko May 17, 2023
71727f8
ENH: use nice, nested indentation in DOT export
fedarko May 17, 2023
4ee8808
TST/DOC: test --no-patterns & rm old comment
fedarko May 18, 2023
0867678
MNT: remove "graphname" in DOT headers
fedarko May 18, 2023
79cd205
MNT/BUG: simplify DOT export text; PT2HR_NOSPACE
fedarko May 18, 2023
5e1d720
MNT: consistent node width/height for L->R
fedarko May 18, 2023
372adb2
MNT/PERF: more elegant indentation in to_dot()
fedarko May 18, 2023
8419c15
DOC: Pattern.to_dot() return
fedarko May 18, 2023
41b0dc2
TST: fix node scaling tests: re w/h switcheroo
fedarko May 18, 2023
cfee858
MNT: remove a lotta unused stuff, esp in config.py
fedarko May 18, 2023
3126656
TST: beef up some of the h-decomp tests: specifics
fedarko May 18, 2023
9e44bf4
TST: done beefing up decomp tests re: #nodes/edges
fedarko May 18, 2023
34d850b
DOC: better h*(w/2) explanation
fedarko May 19, 2023
e41b05d
TST: AssemblyGraph.to_dot()
fedarko May 19, 2023
7046ea4
TST: misc_utils.verify_subset() msg param
fedarko May 19, 2023
3c0e568
TST: graph.node.get_node_name()
fedarko May 19, 2023
ea811a7
TST: Node constructor tests
fedarko May 19, 2023
3b89530
MNT: Node counterpart sanity check & simplifying
fedarko May 19, 2023
c507db0
MNT: Node constructor handles counterpart splittin
fedarko May 19, 2023
07ec5c3
MNT: simplify Node making-into-split stuff
fedarko May 19, 2023
07822df
DOC: note abt Node constructor, etc
fedarko May 19, 2023
409bbdd
TST: Node() split-but-no-counterpart
fedarko May 19, 2023
dc79e93
DOC: consistent Node() err msgs
fedarko May 19, 2023
6c6fae5
TST: Node(), with counterpart already "just" split
fedarko May 19, 2023
c456282
TST: graph.node.get_opposite_split()
fedarko May 19, 2023
e10b048
MNT: delegate more work to make_into_split()
fedarko May 19, 2023
5785c4c
TST: make_into_split but already split
fedarko May 19, 2023
c040d52
MNT: do away with superfluous ""s in Node err msgs
fedarko May 19, 2023
f2eefd7
TST: make_into_split with existing counterpart
fedarko May 19, 2023
92bffcb
MNT/TST: simplify Node() & add unsplit() tsts
fedarko May 19, 2023
d1722c4
TST: more Node methods
fedarko May 19, 2023
4d91986
TST/MNT: Node.to_dot() tests; catch unsetdims case
fedarko May 19, 2023
07124df
TST: AssemblyGraph.to_tsv()
fedarko May 19, 2023
208d57f
TST: to_tsv() with remaining splits/fakes
fedarko May 19, 2023
68365ea
DOC: note that --output-ccstats incls splits/fakes
fedarko May 19, 2023
c48928e
TST: rename & fix cc max node/edge ct tests
fedarko May 19, 2023
f0756bd
TST: more large cc removal tests
fedarko May 19, 2023
024d568
TST: test to_dict() fails if layout not done
fedarko May 19, 2023
26268c5
STY: update cc removal tests
fedarko May 19, 2023
8674e14
MNT: while we're at it, rename this test file
fedarko May 19, 2023
70e4735
TST: remove "rm output dir" command
fedarko May 19, 2023
7b7a161
TST/MNT: PatternStats tests (& fix its __repr__())
fedarko May 20, 2023
ca18ba6
TST: shorten pattern test filenames
fedarko May 20, 2023
3712f27
ENH/MNT: Component objs; store in AG; better TSV
fedarko May 20, 2023
55af3ab
MNT/TST: adjust&test PS.update(); test PS.sum()
fedarko May 20, 2023
330b70c
STY: spacing & unused imports
fedarko May 20, 2023
b5303c0
DOC: update CLI re: --output-ccstats new fmt
fedarko May 20, 2023
89f5950
TST/MNT/DOC: test&doc Pattern.get_descendant_info
fedarko May 20, 2023
ff4655c
DOC: ComponentSizeRank -> ComponentNumber
fedarko May 20, 2023
2010663
TST: AssemblyGraph.to_tsv() on a multi-cc graph
fedarko May 20, 2023
ec9c196
TST: start testing pattern utils directly
fedarko May 20, 2023
08fadb1
MNT: just incl make_into_split() for Patterns
fedarko May 20, 2023
49bed01
TST: more Pattern utils
fedarko May 20, 2023
a1af78f
DOC: make the first line of the CLI a bit nicer
fedarko May 20, 2023
b381a02
MNT: add Component to mgsc.graph imports; repr()
fedarko May 22, 2023
165e1d6
BUG: catch "strict"/undirected DOT graphs & report
fedarko May 23, 2023
0ed8fa0
MNT: AssemblyGraph.__repr__()
fedarko May 25, 2023
5bb082e
TST: add chr21 test input & acks
fedarko May 25, 2023
509c3fb
TST: rename chr21mat test file, and basic test
fedarko May 25, 2023
af3cc04
BUG: fix chr21 splits-not-being-merged bug
fedarko May 26, 2023
4ca023a
TST: beef up chr21 test
fedarko May 26, 2023
f9686bf
TST: chr21 test - more detail & update re fixed
fedarko May 26, 2023
dee7c74
TST/BUG: reproduce cool and new chr21 bug
fedarko May 26, 2023
facd434
BUG: fix the other chr21 test - dual chain merging
fedarko May 26, 2023
ef1b173
STY
fedarko May 26, 2023
d3296e7
TST: more thorough chr21 test 2
fedarko Jun 19, 2023
0b0b360
STY
fedarko Jun 19, 2023
42bb7df
TST: better edge obj testing; reproduce chr15 bug
fedarko Jun 19, 2023
4df3f97
DOC: better comments in unnec removal func
fedarko Jun 19, 2023
e64d940
MNT: explicitly label fake edges in __repr__
fedarko Jun 19, 2023
a60fd29
DOC: remove outdated comment re: FR splitting
fedarko Jun 21, 2023
0c03540
DOC: note about node IDs defined on multiple lines
fedarko Aug 16, 2023
3667504
MNT: fix typo
fedarko Oct 16, 2023
be8bd66
DOC: some readme tweaking
fedarko Jan 3, 2024
11 changes: 10 additions & 1 deletion .github/workflows/python.yml
@@ -8,7 +8,10 @@ jobs:

     strategy:
       matrix:
-        python-version: [3.6]
+        # Gotta specify 3.10 as a string to avoid it being converted to 3.1:
+        # https://github.com/actions/setup-python/issues/160,
+        # https://github.com/fedarko/pyfastg/blob/master/.github/workflows/main.yml
+        python-version: [3.6, 3.7, 3.8, 3.9, "3.10"]

     steps:

@@ -27,7 +30,13 @@ jobs:
       - name: Install MetagenomeScope (and its pip dependencies)
         run: conda run -n mgsc pip install -e .[dev]

+      # I think later versions of black (e.g. 23.1) changed things in ways that
+      # will cause this to fail. Since not all black versions support all
+      # python versions, the easiest solution is to just only do stylechecking
+      # when we're working with python 3.6.
+      # See https://stackoverflow.com/q/73598359 re: "if" expressions here.
       - name: Lint and stylecheck the Python code
+        if: ${{ matrix.python-version == 3.6 }}
         run: conda run -n mgsc make pystylecheck

       - name: Run Python tests
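The matrix comment about quoting "3.10" comes down to how an unquoted 3.10 is read as a number rather than as a version string. A minimal sketch in plain Python, using `float()` as a stand-in for a YAML parser's numeric conversion (an assumption for illustration; the workflow itself is parsed by GitHub Actions, and `needs_quotes` is a hypothetical helper, not part of this PR):

```python
# Stand-in for a YAML parser: an unquoted scalar that looks like a number is
# read as a float, so the trailing zero in 3.10 cannot survive.
unquoted = float("3.10")

# A quoted scalar stays a string, so the version is preserved exactly.
quoted = "3.10"

print(unquoted)  # the float's repr drops the trailing zero
print(quoted)


def needs_quotes(version: str) -> bool:
    """Hypothetical check: would this version change if read as a float?"""
    return str(float(version)) != version


for v in ["3.6", "3.9", "3.10"]:
    print(v, needs_quotes(v))
```

Only versions whose text differs from their float representation are affected, which is why 3.6 through 3.9 can stay unquoted in the matrix while "3.10" cannot.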
1 change: 0 additions & 1 deletion .gitignore
@@ -145,7 +145,6 @@ js_coverage.json
 metagenomescope/tests/js_tests/instrumented_js/*
 dist/
 big-list-of-naughty-strings/
-*.gv
 *.xdot
 *.db
 mg2/
37 changes: 37 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,37 @@
# MetagenomeScope development documentation

Thanks for your interest in this project! Or, at least, I'm assuming you're
interested, because you clicked on this document, and you're reading it now,
and wow, you're still reading this sentence -- aren't you a persistent person?
Okay, you've made it to the third sentence of this document; if you've made it
this far, our fates are now intertwined. You're now officially a maintainer of
this project.

## Code structure

MetagenomeScope's code is composed of two main components:

### 1. Preprocessing script

MetagenomeScope's **preprocessing script** (contained in the
`metagenomescope/` directory of this repository) is a mostly-Python script that
takes as input an assembly graph file and produces a directory containing an
HTML visualization of the graph. Once installed, it can be run from the command
line using the `mgsc` command.

### 2. Viewer interface

MetagenomeScope's **viewer interface** (contained in the
`metagenomescope/support_files/` directory
of this repository) is a client-side web application that visualizes laid-out
assembly graphs using [Cytoscape.js](https://js.cytoscape.org/). This interface
includes various features for interacting with the graph and the
identified structural patterns within it.

You should be able to load visualizations created by MetagenomeScope
in most modern web browsers (mobile browsers probably will also work, although
using a desktop browser is recommended).

## That was the worst developer documentation I've read in my life

Sorry -- I'll try to add more stuff here later `._.`
9 changes: 6 additions & 3 deletions Makefile
@@ -19,15 +19,18 @@

 .PHONY: pytest jstest test pystylecheck jsstylecheck stylecheck pystyle jsstyle style demo

-PYTEST_COMMAND = python3 -B -m pytest metagenomescope/tests/ --cov-report xml --cov-report term --cov metagenomescope
 PYLOCS = metagenomescope/ setup.py
 JSLOCS = metagenomescope/support_files/js/*.js metagenomescope/tests/js_tests/*.js docs/js/extra_functionality.js .jshintrc
 HTMLCSSLOCS = metagenomescope/support_files/index.html metagenomescope/tests/js_tests/*.html metagenomescope/support_files/css/viewer_style.css docs/404.html docs/index.html docs/css/mgsc_docs_style.css

 # -B: don't create __pycache__/ directories
 pytest:
-	$(PYTEST_COMMAND)
-	rm -f metagenomescope/tests/output/*
+	python3 -B -m pytest \
+		metagenomescope/tests/ \
+		--cov-report xml \
+		--cov-report term \
+		--cov-report html \
+		--cov metagenomescope

 jstest:
 	nyc instrument metagenomescope/support_files/js/ metagenomescope/tests/js_tests/instrumented_js/
426 changes: 355 additions & 71 deletions README.md

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion environment.yml
@@ -3,7 +3,7 @@ channels:
   - conda-forge
   - defaults
 dependencies:
-  - python = 3.6
+  - python >= 3.6
   - pip
   - numpy
   - graphviz
14 changes: 10 additions & 4 deletions metagenomescope/__init__.py
@@ -18,23 +18,29 @@

# Import submodules so they're easy to see from REPL
from . import (
arg_utils,
assembly_graph_parser,
config,
errors,
file_utils,
graph,
input_node_utils,
layout_utils,
misc_utils,
msg_utils,
parsers,
)

# ... And explicitly declare them in __all__. This will stop flake8 from
# yelling at us about these imports being unused.
__all__ = [
"arg_utils",
"assembly_graph_parser",
"config",
"errors",
"file_utils",
"graph",
"input_node_utils",
"layout_utils",
"misc_utils",
"msg_utils",
"parsers",
]

__version__ = "0.1.0-dev"
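The `__all__` comment in this file relies on `__all__` being the list of names a module publicly exports; flake8 treats names listed there as used, which is why the submodule imports stop being flagged. A small sketch of the runtime side of that behavior, using a throwaway in-memory module (`demo_pkg` and its attributes are hypothetical names for illustration, not part of MetagenomeScope):

```python
import sys
import types

# Build a throwaway module in memory (hypothetical name "demo_pkg").
demo = types.ModuleType("demo_pkg")
exec(
    "config = 'config submodule'\n"
    "errors = 'errors submodule'\n"
    "scratch = 'not meant to be exported'\n"
    "__all__ = ['config', 'errors']\n",
    demo.__dict__,
)
sys.modules["demo_pkg"] = demo

# A star-import only pulls in the names listed in __all__.
namespace = {}
exec("from demo_pkg import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)
```

Here `scratch` exists in the module but is left out of the star-import because it is not named in `__all__`.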
179 changes: 83 additions & 96 deletions metagenomescope/_cli.py
@@ -21,13 +21,19 @@
# https://github.com/biocore/qurro/blob/master/qurro/scripts/_plot.py.

import click
from . import __version__
from .config import MAXN_DEFAULT, MAXE_DEFAULT
from .main import make_viz
from ._param_descriptions import (
INPUT,
OUTPUT_DIR,
OUTPUT_DOT,
OUTPUT_CCSTATS,
NODE_METADATA,
EDGE_METADATA,
MAXN,
MAXE,
PATTERNS_FLAG,
)


@@ -36,129 +42,110 @@
context_settings={"help_option_names": ["-h", "--help"]},
no_args_is_help=True,
)
@click.option("-i", "--input-file", required=True, help=INPUT)
@click.option("-o", "--output-dir", required=True, help=OUTPUT_DIR)
# @click.option(
# "-ao",
# "--assume-oriented",
# required=False,
# default=False,
# help=ASSUME_ORIENTED,
# )
@click.option(
"-i",
"--input-file",
type=click.Path(exists=True, dir_okay=False, readable=True),
required=True,
help=INPUT,
)
@click.option(
"-o",
"--output-viz-dir",
type=click.Path(exists=False),
default=None,
show_default=True,
help=OUTPUT_DIR,
)
@click.option(
"-od",
"--output-dot",
type=click.Path(dir_okay=False, writable=True),
default=None,
show_default=True,
help=OUTPUT_DOT,
)
@click.option(
"-os",
"--output-ccstats",
type=click.Path(dir_okay=False, writable=True),
default=None,
show_default=True,
help=OUTPUT_CCSTATS,
)
@click.option(
"-n",
"--node-metadata",
type=click.Path(exists=True, dir_okay=False, readable=True),
required=False,
default=None,
show_default=True,
help=NODE_METADATA,
)
@click.option(
"-e",
"--edge-metadata",
type=click.Path(exists=True, dir_okay=False, readable=True),
required=False,
default=None,
show_default=True,
help=EDGE_METADATA,
)
@click.option(
"-maxn",
"--max-node-count",
type=click.IntRange(min=0),
required=False,
default=MAXN_DEFAULT,
help=MAXN,
show_default=True,
help=MAXN,
)
@click.option(
"-maxe",
"--max-edge-count",
type=click.IntRange(min=0),
required=False,
default=MAXE_DEFAULT,
show_default=True,
help=MAXE,
)
@click.option(
"--patterns/--no-patterns",
is_flag=True,
default=True,
show_default=True,
help=PATTERNS_FLAG,
)
# @click.option(
# "-mbf", "--metacarvel-bubble-file", required=False, default=None, help=MBF
# )
# @click.option(
# "-up", "--user-pattern-file", required=False, default=None, help=UP
# )
# @click.option(
# "-spqr",
# "--compute-spqr-data",
# required=False,
# is_flag=True,
# default=False,
# help=SPQR,
# )
# @click.option(
# "-sp",
# "--save-structural-patterns",
# is_flag=True,
# required=False,
# default=False,
# help=STRUCTPATT,
# )
# @click.option(
# "-pg",
# "--preserve-gv",
# is_flag=True,
# required=False,
# default=False,
# help=PG,
# )
# @click.option(
# "-px",
# "--preserve-xdot",
# required=False,
# is_flag=True,
# default=False,
# help=PX,
# )
# @click.option(
# "-nbdf",
# "--save-no-backfill-dot-files",
# is_flag=True,
# required=False,
# default=False,
# help=NBDF,
# )
# @click.option(
# "-npdf",
# "--save-no-pattern-dot-files",
# is_flag=True,
# required=False,
# default=False,
# help=NPDF,
# )
@click.version_option(__version__, "-v", "--version")
def run_script(
input_file: str,
output_dir: str,
# assume_oriented: bool,
output_viz_dir: str,
output_dot: str,
output_ccstats: str,
node_metadata: str,
edge_metadata: str,
max_node_count: int,
max_edge_count: int,
# metacarvel_bubble_file: str,
# user_pattern_file: str,
# compute_spqr_data: bool,
# save_structural_patterns: bool,
# preserve_gv: bool,
# preserve_xdot: bool,
# save_no_backfill_dot_files: bool,
# save_no_pattern_dot_files: bool,
patterns: bool,
) -> None:
"""Visualizes an assembly graph and the structural patterns in it.

This generates a folder containing an interactive HTML/JS visualization of
the graph. The folder's index.html file can be opened in a web browser to
access the visualization.

There are many options available to customize the visualization / output,
but the only two you probably need to worry about are the input file and
output directory: generating a visualization can be as simple as
"""Visualizes an assembly graph.

mgsc -i graph.gfa -o viz
MetagenomeScope supports multiple types of output (-o, -od, -os);
you will probably want to start with -o.

...which will generate an output directory named "viz". (You'll need to
replace "graph.gfa" with whatever the path to your assembly graph is.)
Please check out https://github.com/marbl/MetagenomeScope if you have any
questions, suggestions, etc. about this tool.
"""
make_viz(
input_file,
output_dir,
# assume_oriented,
max_node_count,
max_edge_count,
# metacarvel_bubble_file,
# user_pattern_file,
# compute_spqr_data,
# save_structural_patterns,
# preserve_gv,
# preserve_xdot,
# save_no_backfill_dot_files,
# save_no_pattern_dot_files,
patterns,
output_viz_dir,
output_dot,
output_ccstats,
node_metadata,
edge_metadata,
)
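
The option declarations in this diff validate values at parse time: the input path must be an existing file, the node/edge limits must be non-negative integers, and `--patterns/--no-patterns` is a paired boolean flag. A rough sketch of the same checks using the standard library's `argparse` instead of click (a deliberate swap so the example runs without third-party packages). The program name, the helper names, and the default of 1000 are placeholders, not values from MetagenomeScope:

```python
import argparse
import os
import tempfile


def existing_file(path):
    """Approximates click.Path(exists=True, dir_okay=False)."""
    if not os.path.isfile(path):
        raise argparse.ArgumentTypeError(f"{path!r} is not an existing file")
    return path


def non_negative_int(text):
    """Approximates click.IntRange(min=0)."""
    value = int(text)  # argparse reports a ValueError as a usage error
    if value < 0:
        raise argparse.ArgumentTypeError(f"{value} is smaller than 0")
    return value


PLACEHOLDER_MAX = 1000  # placeholder; the real MAXN_DEFAULT is not shown here


def build_parser():
    parser = argparse.ArgumentParser(prog="mgsc-sketch")
    parser.add_argument("-i", "--input-file", type=existing_file, required=True)
    # Outputs are optional and default to None, as in the diff.
    parser.add_argument("-o", "--output-viz-dir", default=None)
    parser.add_argument("--output-dot", default=None)
    parser.add_argument("--output-ccstats", default=None)
    parser.add_argument(
        "--max-node-count", type=non_negative_int, default=PLACEHOLDER_MAX
    )
    # Paired on/off flag, like click's "--patterns/--no-patterns" (Python 3.9+).
    parser.add_argument(
        "--patterns", action=argparse.BooleanOptionalAction, default=True
    )
    return parser


# Usage: parse a command line that points at a real (temporary) file.
with tempfile.NamedTemporaryFile(suffix=".gfa") as handle:
    args = build_parser().parse_args(
        ["-i", handle.name, "-o", "viz", "--no-patterns", "--max-node-count", "5"]
    )

print(args.output_viz_dir, args.output_dot, args.patterns, args.max_node_count)
```

Doing the validation in the option types means the function that receives the arguments can assume they are already well-formed, which is the same division of labor the click decorators give `run_script()`.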

