Workflow design
holecm edited this page Dec 17, 2013 · 12 revisions
ExpressionSet introduction by Seth Falcon et al.
Dissecting an ExpressionSet object http://telliott99.blogspot.cz/2011/08/dissecting-expressionset-object.html
Abbreviations: features F, samples S, covariates V.
contains:
- assayData (gene expression matrix FxS)
- phenoData (phenotype data: qualitative and quantitative covariates, SxV) + metadata
- also contains the "class" covariate, named "User_class"
- annotation (e.g., hgu95v2)
- featureData
- experimentData (minimum information about microarray experiment)
- protocolData
methods:
- get feature names
- get sample names
- access phenotype (e.g., gender)
- subsetting
- subset by phenotype (e.g., only male samples)
- transform into R object
Note: a minimal ExpressionSet object can be initialized from a gene expression matrix alone.
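The contents and methods listed above can be illustrated with a small container. This is only a sketch in Python, not the Biobase API: the class name `MiniExpressionSet` and all method names are hypothetical, and only the assay matrix, sample names, and phenotype covariates are modeled.

```python
from dataclasses import dataclass, field


@dataclass
class MiniExpressionSet:
    """Hypothetical stand-in for an ExpressionSet (not the Biobase API)."""
    assay: dict    # feature name -> list of S expression values (F x S matrix)
    samples: list  # S sample names, in column order
    pheno: dict = field(default_factory=dict)  # covariate name -> list of S values

    def feature_names(self):
        return list(self.assay)

    def sample_names(self):
        return list(self.samples)

    def phenotype(self, covariate):
        # Map each sample name to its value for one covariate (e.g., gender).
        return dict(zip(self.samples, self.pheno[covariate]))

    def subset_samples(self, keep):
        # Keep only the columns whose sample name is in `keep`.
        idx = [i for i, s in enumerate(self.samples) if s in keep]
        return MiniExpressionSet(
            assay={f: [row[i] for i in idx] for f, row in self.assay.items()},
            samples=[self.samples[i] for i in idx],
            pheno={v: [col[i] for i in idx] for v, col in self.pheno.items()},
        )

    def subset_by_phenotype(self, covariate, value):
        keep = {s for s, v in self.phenotype(covariate).items() if v == value}
        return self.subset_samples(keep)


es = MiniExpressionSet(
    assay={"geneA": [1.0, 2.0, 3.0], "geneB": [4.0, 5.0, 6.0]},
    samples=["s1", "s2", "s3"],
    pheno={"gender": ["male", "female", "male"]},
)
males = es.subset_by_phenotype("gender", "male")
print(males.sample_names())  # ['s1', 's3']

# A minimal object needs only the expression matrix (plus column names here).
minimal = MiniExpressionSet(assay={"geneA": [1.0]}, samples=["s1"])
```

Storing the matrix as a dictionary of rows keeps the sketch dependency-free; a real implementation would more likely wrap a numeric matrix type.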
(obsolete)
- type: qualitative, quantitative
- 1xS
- allows conversion into a collection of WU sets
- training and testing pair (training ExpressionSet, testing ExpressionSet)
- also includes dependent variable definition
- set (e.g., a set of training and testing pairs)
- ranking
- note that some methods return only a "limited" ranking (e.g., only the top 200 of 4000 possible features)
- wu set
- head type, head description (e.g., URL), tail type, tail members
- in the future, working units may be able to represent "inhibition" or "activation"
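The containers described above (training/testing pair, ranking, WU set) could be modeled roughly as follows. This is a sketch under assumptions: the class and field names are hypothetical and simply mirror the bullets, and the ExpressionSet itself is left as an opaque value.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class TrainTestPair:
    """Pair container: training and testing ExpressionSet plus the dependent variable."""
    training: Any            # training ExpressionSet (opaque in this sketch)
    testing: Any             # testing ExpressionSet
    dependent_variable: str  # name of the phenotype covariate to predict


@dataclass
class Ranking:
    """Feature names ordered from best to worst; may be truncated."""
    features: List[str]
    total_features: int      # number of features that were candidates

    def is_limited(self):
        # True when the method returned fewer features than were available.
        return len(self.features) < self.total_features


@dataclass
class WUSet:
    """Working-unit set: one head (e.g., a pathway) and its tail members."""
    head_type: str
    head_description: str    # free text or a URL
    tail_type: str
    tail_members: List[str]


ranking = Ranking(features=["g7", "g2"], total_features=4000)
wu = WUSet("pathway", "hypothetical description or URL", "gene", ["g2", "g7"])
pair = TrainTestPair(training="ES_tr", testing="ES_tt", dependent_variable="User_class")
print(ranking.is_limited())  # True
```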
The plugin description is here.
- load gene expression data
- collapse into gene symbols or ENTREZ IDs
- normalize data
- show boxplot
- accepts:
- returns: ExpressionSet
- impute/remove missing values
- methods
- accepts: ExpressionSet (also in pair container)
- returns: transformed ExpressionSet (also in pair container)
- apply set aggregation
- mean, median, pca, setsig (generally arguments can be: ES_tr, ES_tt, C_tr, C_tt)
- accepts: annotation, wu set, ExpressionSet (also in pair container)
- returns: transformed ExpressionSet (also in pair container)
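The mean and median variants of the set aggregation step above can be sketched as follows (pca and setsig are not covered). The function name `aggregate_sets` and the plain-dictionary data layout are assumptions made for illustration; set members missing from the expression matrix are skipped, which is also an assumption.

```python
from statistics import mean, median

AGGREGATORS = {"mean": mean, "median": median}


def aggregate_sets(assay, wu_sets, method="mean"):
    """Collapse member features of each set into one row per set.

    assay:   feature name -> list of per-sample values
    wu_sets: set (head) name -> list of member feature names
    """
    agg = AGGREGATORS[method]
    n_samples = len(next(iter(assay.values())))
    out = {}
    for head, members in wu_sets.items():
        rows = [assay[m] for m in members if m in assay]  # skip unknown members
        if not rows:
            continue  # no member measured: the set yields no row
        out[head] = [agg([row[j] for row in rows]) for j in range(n_samples)]
    return out


assay = {"g1": [1.0, 2.0], "g2": [3.0, 6.0], "g3": [5.0, 10.0]}
wu_sets = {"pathwayX": ["g1", "g2"], "pathwayY": ["g2", "g3", "g9"]}
aggregated = aggregate_sets(assay, wu_sets, "mean")
print(aggregated)  # {'pathwayX': [2.0, 4.0], 'pathwayY': [4.0, 8.0]}
```

For a pair container, the same call would be applied to the training and the testing matrix with the same `wu_sets`.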
- apply feature selection
- SVM-RFE (parameter: number n of features to select), information gain, t-test
- accepts: ExpressionSet (also the first member from the pair container)
- returns: ranking
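As an illustration of how a feature selection method produces a ranking, the sketch below ranks features by the absolute Welch t statistic between two classes. It is one possible reading of the "t-test" option, not the plugin's actual implementation; the function names and the `top_n` parameter (which yields a "limited" ranking) are hypothetical.

```python
from math import copysign, sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch t statistic for two samples (each needs at least two values)."""
    diff = mean(a) - mean(b)
    denom = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if denom == 0:
        # Both groups are constant: no difference, or an unbounded statistic.
        return 0.0 if diff == 0 else copysign(float("inf"), diff)
    return diff / denom


def rank_by_t_test(assay, labels, top_n=None):
    """Return feature names ordered by decreasing |t| between the two classes."""
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("exactly two classes are required")
    scores = {}
    for feature, row in assay.items():
        a = [x for x, y in zip(row, labels) if y == classes[0]]
        b = [x for x, y in zip(row, labels) if y == classes[1]]
        scores[feature] = abs(welch_t(a, b))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked if top_n is None else ranked[:top_n]


assay = {
    "g1": [1.0, 2.0, 1.0, 2.0],   # identical class means, |t| = 0
    "g2": [1.0, 2.0, 9.0, 10.0],  # strongly separated classes
    "g3": [1.0, 3.0, 4.0, 6.0],   # moderately separated classes
}
labels = ["a", "a", "b", "b"]
full_ranking = rank_by_t_test(assay, labels)
print(full_ranking)  # ['g2', 'g3', 'g1']
```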
- apply threshold
- percentage or number (n) of selected features
- accepts: ranking, ExpressionSet (also in pair container)
- returns: transformed ExpressionSet (also in pair container)
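The threshold step above can be sketched as a filter that keeps the top of a ranking, given either a count or a percentage. The function name `apply_threshold` is hypothetical, and the percentage is taken relative to the length of the ranking, which is an assumption (with a "limited" ranking this differs from a percentage of all features).

```python
def apply_threshold(ranking, assay, n=None, percent=None):
    """Keep only the top-ranked features of an expression matrix.

    ranking: feature names, best first
    assay:   feature name -> list of per-sample values
    Exactly one of `n` (count) or `percent` (0-100) must be given.
    """
    if (n is None) == (percent is None):
        raise ValueError("give exactly one of n or percent")
    if percent is not None:
        n = int(len(ranking) * percent / 100)  # rounds down
    selected = ranking[:n]
    # Preserve ranking order; ignore ranked names absent from the matrix.
    return {f: assay[f] for f in selected if f in assay}


ranking = ["g2", "g3", "g1", "g4"]
train = {"g1": [1.0], "g2": [2.0], "g3": [3.0], "g4": [4.0]}
test = {"g1": [5.0], "g2": [6.0], "g3": [7.0], "g4": [8.0]}

# For a pair container, the same ranking is applied to both members.
train_top = apply_threshold(ranking, train, percent=50)
test_top = apply_threshold(ranking, test, percent=50)
print(list(train_top), list(test_top))  # ['g2', 'g3'] ['g2', 'g3']
```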
- cross validation (meta plugin)
- set the number of folds, stratification, and the random generator seed
- accepts: dependent variable description, ExpressionSet (also the first member from the pair container)
- returns: set of results
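The three cross-validation settings above (number of folds, stratification, random seed) can be illustrated by a fold generator that returns a set of training and testing index pairs. This is a sketch only; the function name `stratified_folds` and the round-robin assignment are assumptions, not the plugin's actual procedure.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds, seed=0):
    """Split sample indices into (train, test) pairs, balancing classes per fold."""
    rng = random.Random(seed)  # a fixed seed makes the split reproducible
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    folds = [[] for _ in range(n_folds)]
    offset = 0
    for y in sorted(by_class):  # assumes labels are sortable
        indices = by_class[y]
        rng.shuffle(indices)
        for i in indices:       # deal each class round-robin across folds
            folds[offset % n_folds].append(i)
            offset += 1

    pairs = []
    for k in range(n_folds):
        test = sorted(folds[k])
        train = sorted(i for j, f in enumerate(folds) if j != k for i in f)
        pairs.append((train, test))
    return pairs


labels = ["a"] * 6 + ["b"] * 3
pairs = stratified_folds(labels, n_folds=3, seed=42)
for train, test in pairs:
    print(len(train), [labels[i] for i in test])
```

Each test fold here contains two samples of class "a" and one of class "b", and every sample is used for testing exactly once.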
- visualization
- PCA
- accepts: dependent variable description, ExpressionSet (also the first member from the pair container)
- model
- accepts: model
- CV results
- accepts: statistic on set of CV numeric results
- PCA
- ML algorithms
- number of folds in case of just one input dataset
- accepts: dependent variable description, ExpressionSet (also in pair container)
- returns: model, acc, AUC, ...
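The two metrics named above, accuracy and AUC, can be computed from predictions on the testing set as sketched below. The function names are hypothetical, and the AUC is computed by direct pairwise comparison (equivalent to the Mann-Whitney formulation), which is simple but quadratic in the number of samples.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores, positive):
    """Probability that a random positive sample scores above a random negative one."""
    pos = [s for y, s in zip(y_true, scores) if y == positive]
    neg = [s for y, s in zip(y_true, scores) if y != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # Ties between a positive and a negative score count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


y_true = ["tumor", "tumor", "normal", "normal"]
scores = [0.9, 0.4, 0.5, 0.1]  # model score for the "tumor" class
y_pred = ["tumor" if s > 0.45 else "normal" for s in scores]

print(accuracy(y_true, y_pred))      # 0.5
print(auc(y_true, scores, "tumor"))  # 0.75
```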
- Statistical test
- accepts: dependent variable description, ExpressionSet (also the first member from the pair container)
- returns: ranking, result table