Post analysis edges + annotations very buggy: losing nodes from the main network, randomly creating duplicates of nodes #166

Open
risserlin opened this issue Apr 19, 2021 · 10 comments
@risserlin

1. Take an annotated network.
2. Collapse all clusters.
3. Run Post Analysis (PA) on the collapsed network so we can get the true post analysis edges. After running it, the PA nodes don't show up. Toggle the PA set off and then back on again to see the nodes. Result: collapsed nodes are missing, and random nodes from some of the collapsed sets are duplicated in the left corner.
4. Expand the clusters to get the missing clusters back, then re-collapse. The edges are now there but they are empty, and random duplicates appear.

@risserlin risserlin added the bug label Apr 19, 2021
@risserlin
Author

risserlin commented Apr 19, 2021

If you un-select the PA nodes that you created while the network was collapsed, it removes the collapsed nodes as well as the PA nodes. Re-selecting them only brings back the post analysis nodes, not the collapsed nodes. To get the collapsed nodes back you have to go back and expand all nodes.

@mikekucera mikekucera self-assigned this Apr 19, 2021
@mikekucera mikekucera added this to the 1.3.4 milestone Apr 19, 2021
@risserlin
Author

There is definitely a conflict between collapse/expand and hiding/unhiding post analysis.

@risserlin
Author

I really need this to work. Is this a Cytoscape group issue or an AA issue? Will moving to a previous version of AA or Cytoscape fix this problem? (I tried using the summary network but that had the same issue.)

@mikekucera
Collaborator

OK I'll make this my top priority.

@risserlin
Author

Thanks!

@mikekucera
Collaborator

Hi Ruth,
I've never tried running PA on a collapsed network so I'm not at all surprised it doesn't work properly.

Group nodes in Cytoscape are notorious for being a buggy mess. That's the reason we added the summary network function to AA. In fact I may not be able to fix this in EM/AA; it's more likely a problem with Cytoscape itself.

Could you please explain exactly what you are trying to accomplish, and list the steps you are trying to take (assuming they actually worked properly)? We may have to brainstorm an alternate solution that doesn't involve group nodes.

@risserlin
Author

I have a network with many different clusters and post analysis. I have also added additional annotations to the PA edges with the genes responsible for overlap.
Here is an anonymized view of the network:
anon_image

The large clusters don't really add anything to the view of the network and all those extra PA edges are redundant so I want to collapse the network (and in so doing collapse the PA edges as well).

What I have tried:
- Collapse all nodes: this is how it all started.
- Collapse just the large nodes: didn't fix anything.
- Start with the network without the PA and run the post analysis on the collapsed network: doesn't work, because the collapsed nodes don't have the gene list, so there is nothing to calculate the overlap with.
- Start with the network with the PA, create a summary network, and run post analysis on the summary network: can't. It isn't an EM, so the functionality isn't there, and the nodes and edges don't contain the list of genes.
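To illustrate why a collapsed node needs a gene list before any overlap can be computed, here is a minimal base R sketch on toy data. The gene set names, cluster ids, genes and the `signature_genes` vector are all hypothetical; nothing here talks to Cytoscape or reproduces how EnrichmentMap itself computes post analysis.

```r
# Toy node table: three gene sets assigned to two clusters (all values hypothetical).
nodes <- data.frame(
  name    = c("GS_A", "GS_B", "GS_C"),
  cluster = c(1, 1, 2),
  stringsAsFactors = FALSE
)
nodes$genes <- list(c("TP53", "MDM2"), c("MDM2", "CDKN1A"), c("EGFR", "KRAS"))

# Hypothetical signature (post analysis) gene set.
signature_genes <- c("MDM2", "KRAS", "BRAF")

# Union of member genes per cluster: the gene list a collapsed node would need to carry.
rows_by_cluster <- split(seq_len(nrow(nodes)), nodes$cluster)
cluster_genes <- lapply(rows_by_cluster,
                        function(idx) unique(unlist(nodes$genes[idx])))

# Overlap between each collapsed cluster and the signature set.
cluster_overlap <- lapply(cluster_genes, intersect, y = signature_genes)
```

Without the per-cluster union there is nothing to intersect with the signature set, which matches the failure described for the collapsed and summary networks.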

That gets us into hiding and unhiding PA, with collapse and expand messing everything up, because all of a sudden I have missing nodes or extra nodes.

Maybe we can set up a meeting and I can show what my session is doing or if it is easier I can send it to you?

Thanks,
Ruth

@mikekucera
Collaborator

Send me your session file for now please.

@risserlin
Author

sent

@risserlin
Author

Workaround using CyREST from R: manually collapse the attributes and send them back to Cytoscape.

#RCy3 provides setCurrentNetwork, getTableColumns, commandsGET and loadTableData
library(RCy3)

current_baderlab_network <- setCurrentNetwork(params$network_name)

current_baderlab_nodetable <- getTableColumns(table="node")

current_baderlab_edgetable <- getTableColumns(table="edge")

#get the cluster numbers
all_clusters <- unique(current_baderlab_nodetable$'__mclCluster')
labels <- c()
genes <- list()
signature_edges <- list()

meta_nodes_info <- c()
meta_edges_info <- c()


for(i in seq_along(all_clusters)){
  if(!is.na(all_clusters[i])){
    
      current_nodes <- which(current_baderlab_nodetable$'__mclCluster' == all_clusters[i])
      #build the AutoAnnotate command that computes a label for this cluster
      aa_label_command <- paste('autoannotate label-clusterBoosted labelColumn="',
                                colnames(current_baderlab_nodetable)[grep(pattern = "GS_DESCR", colnames(current_baderlab_nodetable))],
                                '" nodeList=',
                                paste("SUID:", paste(current_baderlab_nodetable$SUID[current_nodes], collapse = ",SUID:"), sep = ""),
                                sep = "")
      
      #calculate the current label
      current_label <- commandsGET(aa_label_command)
      labels <- c(labels, current_label)
      
      #get all the genes for this cluster
     genes[i]<-list(unique(unlist(current_baderlab_nodetable$`EnrichmentMap::Genes`[current_nodes])))
     
     #for the set of nodes in this cluster get all the signature edges
     current_signature_edges <- c()
     for(j in 1:length(current_nodes)){
       current_signature_edges <- c(current_signature_edges,intersect(grep(current_baderlab_edgetable$name,
                                                           pattern=current_baderlab_nodetable$name[current_nodes[j]]),
                                                      which(current_baderlab_edgetable$interaction=="sig")))
     }
     
     signature_edges[i] <- list(current_signature_edges)
     
     #calculate the summary node stats - 
     nodes_set_to_collapse <- current_baderlab_nodetable[current_nodes,]
     meta_nodes_info <- rbind(meta_nodes_info, cbind( current_label,
                                                   min(nodes_set_to_collapse[,grep(colnames(nodes_set_to_collapse), pattern="pvalue")]),
                                                   min(nodes_set_to_collapse[,grep(colnames(nodes_set_to_collapse), pattern="fdr_qvalue")]),
                                                   max(nodes_set_to_collapse[,grep(colnames(nodes_set_to_collapse), pattern="NES")]),
                                                   paste(unique(unlist(nodes_set_to_collapse[,grep(colnames(nodes_set_to_collapse), pattern="EnrichmentMap::Genes")])),collapse = ",")))
     
     edge_subset <- current_baderlab_edgetable[unlist(signature_edges[i]),]
  
  if(dim(edge_subset)[1] > 0){
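      #edge names look like "nodeA (interaction) nodeB"; split the name column directly
      #instead of using apply(), which may hand each row over as an atomic vector where x$name fails
      nodeA <- vapply(strsplit(edge_subset$name, split = " \\("), `[`, character(1), 1)
      #we don't really care about the nodeB as they are all part of the cluster and the cluster is going to be collapsed to one node. 
      nodeB <- vapply(strsplit(edge_subset$name, split = "\\) "), `[`, character(1), 2)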
      
      #get the unique PA nodes 
      unique_PA_nodes <- unique(nodeA)
      
      for(j in 1:length(unique_PA_nodes)){
        #get each edge that has this PA node
        set_to_collapse <- edge_subset[which(nodeA == unique_PA_nodes[j]),]
        
        #currently only interested in the overlapping genes and p-values so just collapse those
        # get the union of all the genes in the overlap
        # get the minimum p-value for mann_whit_greater
        # get the minimum p-value for mann_whit_less
        
        meta_edges_info <- rbind(meta_edges_info, cbind( unique_PA_nodes[j], current_label,
                                                   min(set_to_collapse[,grep(colnames(set_to_collapse), pattern="Overlap_Mann_Whit_greater_pVal")]),
                                                   min(set_to_collapse[,grep(colnames(set_to_collapse), pattern="Overlap_Mann_Whit_less_pVal")]),
                                                   paste(unique(unlist(set_to_collapse[,grep(colnames(set_to_collapse), pattern="Overlap_genes")])),collapse = ",")))
      }
  }
     
     
  }
  else{
    labels <- c(labels, "NA")
    genes[i] <- ""
    signature_edges[i] <- ""
  }
}

meta_edges <- data.frame(pa_node = meta_edges_info[,1], collapsed_node = meta_edges_info[,2],
                             as.numeric(meta_edges_info[,3]),as.numeric(meta_edges_info[,4]), meta_edges_info[,5])

colnames(meta_edges)[3:5] <- c(
                               colnames(current_baderlab_edgetable)[grep(colnames(current_baderlab_edgetable), pattern="Overlap_Mann_Whit_greater_pVal")],
                               colnames(current_baderlab_edgetable)[grep(colnames(current_baderlab_edgetable), pattern="Overlap_Mann_Whit_less_pVal")],
                               colnames(current_baderlab_edgetable)[grep(colnames(current_baderlab_edgetable), pattern="Overlap_genes")]
)

meta_edges$`shared name` <- paste(meta_edges$pa_node, " (meta) ", meta_edges$collapsed_node,sep="")

rownames(meta_edges) <- meta_edges$`shared name`

#make sure the overlap genes are a list.
 meta_edges$`EnrichmentMap::Overlap_genes` <- strsplit(meta_edges$`EnrichmentMap::Overlap_genes`,split = ",")

#collapse the network.
#NOTE: this call is NOT WORKING from R, so manually collapse the network before running loadTableData below.
#(running "autoannotate collapse" from the command line window in Cytoscape does work, though)
collapse_command <- 'autoannotate collapse'
coll_response <- commandsGET(collapse_command)
 
 
loadTableData(data=meta_edges,data.key.column = "shared name",table = "edge")
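
The edge-collapsing step of the script above can be tried without a running Cytoscape session. The following is a minimal base R sketch on toy data: the short column names (`greater_pval`, `less_pval`, `overlap_genes`) and all values are hypothetical stand-ins for the EnrichmentMap edge columns, and the grouping is done with `split()` rather than the nested loops used in the script.

```r
# Toy expanded signature edges: one row per (PA node, gene set) edge (all values hypothetical).
edges <- data.frame(
  pa_node       = c("SIG1", "SIG1", "SIG1", "SIG2"),
  cluster       = c("cluster A", "cluster A", "cluster B", "cluster A"),
  greater_pval  = c(0.04, 0.01, 0.20, 0.50),
  less_pval     = c(0.90, 0.95, 0.60, 0.03),
  overlap_genes = c("TP53,MDM2", "MDM2,CDKN1A", "EGFR", "KRAS"),
  stringsAsFactors = FALSE
)

# One group per (PA node, cluster) pair; drop = TRUE skips combinations with no edges.
groups <- split(edges, list(edges$pa_node, edges$cluster), drop = TRUE)

# Collapse a group to a single meta edge: minimum p-values, union of overlap genes.
collapse_group <- function(g) {
  data.frame(
    pa_node        = g$pa_node[1],
    collapsed_node = g$cluster[1],
    greater_pval   = min(g$greater_pval),
    less_pval      = min(g$less_pval),
    overlap_genes  = paste(unique(unlist(strsplit(g$overlap_genes, ","))), collapse = ","),
    stringsAsFactors = FALSE
  )
}

meta_edges_toy <- do.call(rbind, lapply(groups, collapse_group))

# Same key convention as the script: "<PA node> (meta) <collapsed node>".
meta_edges_toy$`shared name` <- paste0(meta_edges_toy$pa_node, " (meta) ",
                                       meta_edges_toy$collapsed_node)
rownames(meta_edges_toy) <- meta_edges_toy$`shared name`
```

The resulting data frame is keyed by `shared name`, which is the key column the `loadTableData` call in the script uses to match rows to edges.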

@mikekucera mikekucera modified the milestones: 1.3.4, 1.3.6 Nov 2, 2022