---
layout: reference
---
{:auto_ids}
accession
: a unique identifier assigned to each sequence or set of sequences
categorical variable
: a variable that takes on a fixed number of values that are names or labels. Variables can be classified as categorical (also known as qualitative) or quantitative (also known as numerical).
cleaned data
: data that has been manipulated post-collection to remove errors or inaccuracies, introduce desired formatting changes, or otherwise prepare the data for analysis
conditional formatting
: formatting that is applied to a specific cell or range of cells depending on a set of criteria
CSV (comma-separated values) format
: a plain text file format in which values are separated by commas
factor
: a variable that takes on a limited number of possible values (i.e. categorical data)
Gb
: a gigabyte of file storage or file size (strictly, GB denotes gigabyte and Gb denotes gigabit)
Gbase
: a gigabase represents one billion nucleic acid bases (Gbp may indicate one billion base pairs of nucleic acid)
headers
: names at tops of columns that are descriptive about the column contents (sometimes optional)
metadata
: data which describes other data
NGS
: common acronym for "Next Generation Sequencing", a term currently being replaced by "High Throughput Sequencing"
null value
: a value used to record observations missing from a dataset
observation
: a single measurement or record of the object being recorded (e.g. the weight of a particular mouse)
plain text
: unformatted text
quality assurance
: any process which checks data for validity during entry
quality control
: any process which removes problematic data from a dataset
raw data
: data that has not been manipulated and represents actual recorded values
rich text
: formatted text (e.g. text that appears bolded, colored or italicized)
string
: a collection of characters (e.g. "thisisastring")
TSV (tab-separated values) format
: a plain text file format in which values are separated by tabs
variable
: a category of data being collected on the object being recorded (e.g. a mouse's weight)
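
Several of the terms above (CSV, TSV, headers, null value, observation, variable) can be seen together in a small example. The following is a minimal sketch using Python's standard `csv` module. The sample data, the column names `mouse_id` and `weight_g`, and the helper `read_table` are hypothetical, and treating an empty field as a null value is an assumption; real datasets record missing values in different ways.

```python
import csv
import io

# Hypothetical sample data: the same small table as CSV and as TSV.
# The first line holds the headers; each later line is one observation.
csv_text = "mouse_id,weight_g\nm01,21.4\nm02,\n"
tsv_text = csv_text.replace(",", "\t")


def read_table(text, delimiter):
    """Parse delimited plain text into a list of dicts keyed by header."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    rows = []
    for row in reader:
        # Assumption: an empty field stands for a null (missing) value.
        rows.append({k: (v if v != "" else None) for k, v in row.items()})
    return rows


csv_rows = read_table(csv_text, ",")
tsv_rows = read_table(tsv_text, "\t")
```

Under these assumptions, both formats parse to the same observations; only the separator differs. Each key in a row is a variable named by a header, and the missing weight for `m02` is stored as `None`.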