Unable to build clustering annotation from the command line #207
Comments
Dear Mike,
Thank you very much for your prompt reply.
The files you requested are attached.
I am using AutoAnnotate v1.4.1 with Cytoscape 3.10.2 and Java 10.0.12 on Ubuntu 20.04.
You can see the problem, e.g., in the network "Left_Hemisphere_fMRI_NQ-EF". The command I was using is:
autoannotate annotate-clusterBoosted clusterAlgorithm=MCL labelColumn=EnrichmentMap::GS_DESCR maxWords=3 network=current
Looking forward to your further help.
Best,
Yaron Caspi
BTW, it was very hard, or even impossible, to find in the documentation which values the clusterAlgorithm argument accepts in place of MCL.
On 06/09/2024 00:09, Mike Kucera wrote:
What version of AutoAnnotate are you using?
Can you please send me your framework-cytoscape.log file, found in the <user-home>/CytoscapeConfiguration/3 folder? That should contain the entire exception trace. And if possible, please send me your session file.
Thanks!
|
Hi, It looks like GitHub didn't attach your files. Can you please send them to me directly at [email protected]. Thanks. |
Dear Mike,
Thank you so much. After updating to version 1.5.1, it indeed seems to work.
Two more unrelated questions.
A. Is there a simple command to get the list of clusters and the number of nodes each one includes (like the menu item used to export clusters to a file)?
B. Is there a way to add words to the "excluded words" list permanently? I mean, is there a file or something similar that I can edit to add several words for good?
Best,
Yaron
On 11/09/2024 22:18, Mike Kucera wrote:
Hi, there are two things that should help here...
1. Try updating AutoAnnotate to the latest version (currently 1.5.1). I don't get the same error with the latest version.
2. You must use a numeric column for the edgeWeightColumn attribute. Using the 'name' column, which has type String, causes an error in clusterMaker. Try edgeWeightColumn=EnrichmentMap::similarity_coefficient
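To make the second point concrete, here is a minimal Python sketch that assembles the corrected command string and refuses a non-numeric edge weight column before anything is sent to Cytoscape. The helper name build_annotate_command, the NUMERIC_TYPES set and the column_types mapping are hypothetical and exist only for this illustration; the command arguments are the ones quoted in this thread.

```python
# Column type names treated as numeric here (an assumption for this sketch).
NUMERIC_TYPES = {"Integer", "Long", "Double"}


def build_annotate_command(edge_weight_column, column_types,
                           cluster_algorithm="MCL",
                           label_column="EnrichmentMap::GS_DESCR",
                           max_words=3, network="current"):
    """Return the autoannotate command string.

    column_types is a hypothetical mapping of edge-table column name to its
    type name. A ValueError is raised when the chosen edge weight column is
    missing from the mapping or is not numeric (for example 'name', a String).
    """
    col_type = column_types.get(edge_weight_column)
    if col_type not in NUMERIC_TYPES:
        raise ValueError(
            f"edgeWeightColumn must be numeric, got {edge_weight_column!r} "
            f"of type {col_type!r}"
        )
    return (
        "autoannotate annotate-clusterBoosted"
        f" clusterAlgorithm={cluster_algorithm}"
        f" labelColumn={label_column}"
        f" maxWords={max_words}"
        f" network={network}"
        f" edgeWeightColumn={edge_weight_column}"
    )
```

Called with edge_weight_column="EnrichmentMap::similarity_coefficient" and a mapping that lists that column as Double, the helper returns the original command with only the edgeWeightColumn argument changed; called with "name" it raises instead of letting clusterMaker fail later.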
|
Dear Ruth,
Thank you so much.
I use R.
When doing it manually (at least for AutoAnnotate), I did not find a mechanism to get it stored. This is why I thought there might be an excluded-words file somewhere that I could just edit.
I was mainly interested in adding excluded words to the AutoAnnotate clustering algorithm and not to WordCloud (to get the cluster labeling to fit my purposes).
Thanks again.
Best,
Yaron
On 12/09/2024 20:54, Ruth Isserlin wrote:
Hi Yaron,
I know you are running commands, but are you running them through R or Python?
If you are running commands through R or Python: with regard to your first question, there isn't a simple command to get the info, but what I usually do after autoannotating the network is get the node table (I use RCy3 from R and the function getTableColumns):
default_node_table <- getTableColumns(table= "node",network = network_suid)
with that table you can use the column __mclCluster to get the number of nodes in the cluster and their names.
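For readers working in Python rather than R, the counting step described above can be sketched as follows. This is an illustration only: the function name summarize_clusters is hypothetical, and it assumes the node table has already been fetched and converted to a list of plain dictionaries, one per node, containing the name and __mclCluster columns mentioned in the thread.

```python
from collections import defaultdict


def summarize_clusters(rows, cluster_column="__mclCluster", name_column="name"):
    """Group node names by cluster id and report the size of each cluster.

    rows is a list of dicts, one per node. Nodes whose cluster value is
    missing (None or NaN) are treated as unclustered and skipped.
    """
    clusters = defaultdict(list)
    for row in rows:
        cluster_id = row.get(cluster_column)
        # NaN is the only value that is not equal to itself.
        if cluster_id is None or cluster_id != cluster_id:
            continue
        clusters[cluster_id].append(row[name_column])
    return {
        cluster_id: {"size": len(names), "nodes": sorted(names)}
        for cluster_id, names in clusters.items()
    }
```

If the node table arrives as a pandas DataFrame, DataFrame.to_dict("records") yields the list of dictionaries this helper expects.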
With regard to adding words to the exclusion list permanently: WordCloud has a mechanism to add words to the list, and I believe the list gets stored and reloaded, but I prefer to run the following command prior to annotating:
wordcloud ignore add value="wordtoignore" network=SUID:1234
Embedded in one of my R workflows I have:
# add the set of words to ignore (network_suid is the SUID of the network being annotated)
words2ignore <- c("pid", 1:10)
responses <- lapply(words2ignore, function(x) {
  wordcloud2_url <- paste("wordcloud ignore add value=\"", x, "\" ", "network=SUID:", network_suid, sep = "")
  commandsGET(wordcloud2_url)
})
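The same loop can be sketched in Python for anyone not using R. The helper below only builds the command strings in the form quoted above; build_ignore_commands is a hypothetical name, and how the strings are sent to Cytoscape (for example through a Python client's command function) is left to the caller.

```python
def build_ignore_commands(words, network_suid):
    """Return one 'wordcloud ignore add' command string per word to ignore."""
    return [
        f'wordcloud ignore add value="{word}" network=SUID:{network_suid}'
        for word in words
    ]


# Mirrors the R example: "pid" plus the numbers 1 to 10.
words_to_ignore = ["pid"] + [str(n) for n in range(1, 11)]
commands = build_ignore_commands(words_to_ignore, network_suid=1234)
```

Keeping the string construction separate from the call that sends the commands makes it easy to inspect exactly what will be run before annotating.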
Thanks,
Ruth
|
Dear Ruth,
Thanks again. I will follow these instructions.
I was mainly referring to the Gene Ontology prefixes in the gene-set names, namely GOCC, GOMF and GOBP. GSEA adds these to the node names, so when clustering there is a bias toward these words in the cluster names.
It might be reasonable to exclude these words (or to give an option to exclude them and similar words that GSEA adds) in future distributions, since they are general rather than specific.
Best,
Yaron
On 12/09/2024 21:26, Ruth Isserlin wrote:
Hi Yaron,
AutoAnnotate uses WordCloud to compute the labels, so if you want to exclude words you have to make the change in WordCloud.
There is a file in the WordCloud jar (which you can find in your CytoscapeConfiguration/3/apps/installed directory) called FlaggedWords.txt that you can add words to.
You would need to run the following commands to do it. (This is very hacky, sorry)
mv WordCloud-v3.1.4.jar WordCloud-v3.1.4.zip
create a FlaggedWords.txt file which looks like this:
kegg
reactome
react
biocarta
go
nci
msigdb
my_new_word1
my_new_word2
And then run:
zip -u WordCloud-v3.1.4.zip FlaggedWords.txt
mv WordCloud-v3.1.4.zip WordCloud-v3.1.4.jar
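The rename, edit and zip steps above can also be scripted. The following Python sketch is one possible way to do it, not part of WordCloud or AutoAnnotate: add_flagged_words is a hypothetical helper, the entry name FlaggedWords.txt at the archive root is taken from the commands in this thread, and UTF-8 encoding of that file is an assumption. Unlike the manual steps, it appends to the existing list instead of replacing the file. Back up the jar before trying it.

```python
import os
import tempfile
import zipfile


def add_flagged_words(jar_path, new_words, entry_name="FlaggedWords.txt"):
    """Append words to the flagged-word list inside a jar; return the merged list.

    The zip format cannot be edited in place with the standard library, so the
    archive is rewritten to a temporary file that then replaces the original.
    """
    jar_dir = os.path.dirname(os.path.abspath(jar_path))
    with zipfile.ZipFile(jar_path) as src:
        try:
            merged = src.read(entry_name).decode("utf-8").split()
        except KeyError:
            merged = []  # the entry does not exist yet
        for word in new_words:
            if word not in merged:
                merged.append(word)
        fd, tmp_path = tempfile.mkstemp(suffix=".jar", dir=jar_dir)
        os.close(fd)
        with zipfile.ZipFile(tmp_path, "w", zipfile.ZIP_DEFLATED) as dst:
            # Copy every other entry unchanged, then write the updated list.
            for item in src.infolist():
                if item.filename != entry_name:
                    dst.writestr(item, src.read(item.filename))
            dst.writestr(entry_name, "\n".join(merged) + "\n")
    os.replace(tmp_path, jar_path)
    return merged
```

Cytoscape would presumably need to be restarted before a modified jar is picked up; that behaviour is not confirmed in this thread.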
Alternatively, depending on the words, you can ask @mikekucera<https://github.com/mikekucera> to add the words to the distribution, but words are often very specific to the dataset or data sources you are using, so we try to avoid that.
Thanks,
Ruth
|
There is no global list of excluded words you can edit. The only way to do it is to modify the default list of words stored in the app jar, as Ruth suggested. Excluded words are saved in the session file and can only be set on a per-network basis. If you are using R then the easiest thing to do is have a series of commands of the form |
Dear Ruth,
I am using C5.all.v2024.1.Hs.symbols.gmt, which is distributed with GSEA. That results in EnrichmentMap GS_DESCR values like https://www.gsea-msigdb.org/gsea/msigdb/human/geneset/GOBP_ELECTRON_TRANSPORT_CHAIN and EnrichmentMap node names like GOBP_ELECTRON_TRANSPORT_CHAIN. AutoAnnotate then uses these to produce labels that include words such as GOBP ...
Naturally, these can be removed with a Python or R script, but doing so manually is cumbersome.
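As an illustration of the scripted clean-up mentioned above, here is a minimal Python sketch that strips the GOBP, GOMF and GOCC prefixes from names of the form shown in this message. The function names are hypothetical, and writing the cleaned label back to a node column for AutoAnnotate to use as its label column is an assumption that is not covered here.

```python
import re

# Gene Ontology prefixes that GSEA's C5 collection puts in front of each name.
GO_PREFIX = re.compile(r"^(GOBP|GOMF|GOCC)_")


def strip_go_prefix(gene_set_name):
    """Remove a leading GOBP_, GOMF_ or GOCC_ from a gene-set name."""
    return GO_PREFIX.sub("", gene_set_name)


def label_from_description(description):
    """Turn a GS_DESCR value (URL or bare name) into a prefix-free label."""
    # Keep only the last path segment when the value is a URL.
    name = description.rsplit("/", 1)[-1]
    return strip_go_prefix(name).replace("_", " ")
```

Names without one of the three prefixes pass through unchanged, so the helpers can be applied to a mixed set of gene-set sources.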
Best,
Yaron
On 9/12/24 21:46, Ruth Isserlin wrote:
Hi Yaron,
Which geneset files are you using? Are you using the one supplied by GSEA? (word cloud weights the words based on occurrence in the network so if GOBP and GOMF are everywhere they shouldn't be coming up in the cluster tag). I don't see them coming up in my networks but I use the baderlab genesets and not the ones supplied with GSEA so I am curious if there is an issue.
Thanks,
Ruth
|
Hello,
I was trying to build an AutoAnnotate clustering from the command line using the command:
autoannotate annotate-clusterBoosted clusterAlgorithm=MCL labelColumn=EnrichmentMap::GS_DESCR maxWords=3 network=current edgeWeightColumn=name
However, I get an error message:
Cannot invoke "org.baderlab.autoannotate.internal.model.AnnotationSetBuilder.getClusters()" because "this.builder" is null
Clustering using the Cytoscape AutoAnnotate menu works just fine. Only the command line produces the error message. In addition, if I increase the similarity cutoff of the network so that fewer edges are formed, clustering works perfectly well from both the command line and the AutoAnnotate menu.
What can be the source of the problem?
Best,
Yaron Caspi