What if I want to merge genomes with a single reference genome just like how you do with microsynteny? #728

Closed
Triiumpher opened this issue Dec 14, 2024 · 4 comments

Comments

@Triiumpher

Triiumpher commented Dec 14, 2024

Hi Tang,
Thank you for the inspiration. I am trying to make my MCscan plots fancier and would like to ask for some help.
Following your wiki, I can merge two microsynteny comparisons. But what if I want to merge macrosynteny, say two genomes aligned to one reference genome? I tried to follow the microsynteny merging procedure by analogy, but it did not work, because that procedure involves additional steps that generate .blocks files. How can I adapt it to this case, for example by producing a .blocks file the same way I do for microsynteny merging? I am not sure I have made myself clear, but I did make several unsuccessful attempts.
Looking forward to your reply.
Ivan

[Image: https://www.dropbox.com/s/imdou9ukntt4tam/grape-peach-cacao-blocks.png?raw=1]

@tanghaibao
Owner

@Triiumpher

For macrosynteny, similar plots are available and you just need to use the pairwise .anchors file. No merging is needed. We have an example:

https://github.com/tanghaibao/jcvi/wiki/MCscan-(Python-version)#macrosynteny-getting-fancy

@Triiumpher
Author

Thank you so much. One more question, about the --iter option: could you please explain it in more technical detail? For example, how does it change the output, and what is the impact of omitting it or of using different --iter values? I think that would help us grasp the exact idea behind MCscan. Thank you!

@tanghaibao
Owner

@Triiumpher

The --iter option organizes the pairwise blocks (in .anchors) neatly into a multi-column format.

Let's say you want to align maize to sorghum, where maize has undergone a genome duplication, so it is effectively 2x sorghum. Researchers then want a spreadsheet with sorghum as the reference (1st column) and two maize columns (2nd and 3rd columns). In that case you can use --iter 2. MCscan will try to pack as many blocks as possible, in 2 iterations, and place the results in 2 columns. Anything beyond 2, for example small or weak duplications that don't fit the genome duplication picture, gets filtered out.
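To make the packing idea concrete, here is a toy sketch in Python. It is not MCscan's actual implementation: the greedy highest-score-first rule, the `pack_blocks` function, and the gene names are all assumptions made up for illustration. Each iteration fills one column with non-overlapping blocks, and whatever is still unplaced after the last iteration is dropped.

```python
from typing import Dict, List, Tuple

# Hypothetical block representation: (score, {reference_gene_index: query_gene}).
Block = Tuple[int, Dict[int, str]]


def pack_blocks(n_ref: int, blocks: List[Block], iterations: int) -> List[List[str]]:
    """Toy greedy packing: one output column per iteration, '.' where nothing aligns."""
    remaining = sorted(blocks, key=lambda b: b[0], reverse=True)
    columns = []
    for _ in range(iterations):
        column = ["."] * n_ref
        occupied = [False] * n_ref  # reference positions already covered in this column
        leftover = []
        for score, pairs in remaining:
            span = range(min(pairs), max(pairs) + 1)
            if any(occupied[i] for i in span):
                leftover.append((score, pairs))  # try again in the next iteration
                continue
            for i in span:
                occupied[i] = True
            for i, gene in pairs.items():
                column[i] = gene
        remaining = leftover
        columns.append(column)
    return columns  # blocks still in `remaining` are filtered out


# Made-up example: 6 reference (sorghum-like) genes, two strong maize-like
# copies (A and B) plus one weak extra block (X).
reference = ["s1", "s2", "s3", "s4", "s5", "s6"]
blocks = [
    (100, {0: "mA1", 1: "mA2", 2: "mA3"}),
    (90, {0: "mB1", 1: "mB2", 2: "mB3", 3: "mB4"}),
    (80, {3: "mA4", 4: "mA5", 5: "mA6"}),
    (20, {1: "mX1", 2: "mX2"}),
]
columns = pack_blocks(len(reference), blocks, iterations=2)
for row in zip(reference, *columns):
    print("\t".join(row))
```

In this toy run, two iterations keep both strong copies and drop the weak X block, while a third iteration would give X a column of its own that is mostly dots.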

If you take a look at the generated .blocks file in the wiki, you'll see what I mean.

How can we set the --iter option? This requires prior knowledge of genome duplications. Most often, this is already known. But for novel species or lineages, you'll need to study the duplication patterns, by looking at the dot plot. The wiki shows a few examples on how to analyze the dot plot and the syntenic depth.
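As a rough illustration of what "syntenic depth" means, the sketch below counts how many blocks cover each reference gene and tallies the result. The `syntenic_depth` function and the block spans are hypothetical, and reading the most common depth as a starting guess for --iter is only a heuristic assumed here; it does not replace inspecting the dot plot as described in the wiki.

```python
from collections import Counter
from typing import List, Tuple


def syntenic_depth(n_ref: int, block_spans: List[Tuple[int, int]]) -> List[int]:
    """Count, for each reference gene index, how many blocks cover it."""
    depth = [0] * n_ref
    for start, end in block_spans:  # inclusive reference index ranges
        for i in range(start, end + 1):
            depth[i] += 1
    return depth


# Made-up block spans over 6 reference genes.
spans = [(0, 5), (0, 2), (3, 5), (2, 3)]
depth = syntenic_depth(6, spans)
histogram = Counter(depth)
# Heuristic only: the most common depth as a first guess for --iter.
suggested_iter = histogram.most_common(1)[0][0]
print(depth, dict(histogram), suggested_iter)
```

Here most reference genes are covered by two blocks, which would be consistent with one duplication in the query lineage; the few genes at depth 3 are the kind of extra signal that a given --iter setting would leave out.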

Hope this helps.

@Triiumpher
Author

Triiumpher commented Dec 22, 2024

Thank you for your prompt reply!
I see! The option is for genome duplication. If we consider a single copy of the genome and focus mainly on gene orthology among species and on how orthologs evolve, we keep --iter=1 as the default; we change the number after --iter= when we study the structure of the whole genome.
I had deliberately omitted the --iter= option, or assigned arbitrary integers, to see what differences there would be. I found that the duplicated genes are listed in new columns, and where there was a dot initially, more dots appear as I increase the integer. As a result, I reckon the option does not matter much if I mainly focus on gene orthologs, for example on how many orthologs can be aligned among several species.
Thank you again for the further explanation, and best wishes to you!
Ivan
