feat: support 3D data from chromeleon ascii format, v0.7.2
ethanbass committed Dec 14, 2024
1 parent 9db3f66 commit ce549b7
Showing 6 changed files with 97 additions and 56 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,6 +1,6 @@
Package: chromConverter
Title: Chromatographic File Converter
Version: 0.7.1
Version: 0.7.2
Authors@R: c(
person(given = "Ethan", family = "Bass", email = "[email protected]",
role = c("aut", "cre"),
9 changes: 8 additions & 1 deletion NEWS.md
@@ -1,3 +1,10 @@
## chromConverter 0.7.2

* Added preliminary support for extraction of peak tables from 'Shimadzu' `.lcd` files.
* Added support for inference of retention times from 'Shimadzu' `.lcd` files lacking `Data Item` streams.
* Added support for raw format File Properties stream in 'Shimadzu' `.lcd` files.
* Added support for parsing 3D data field from 'Chromeleon' ascii files.

## chromConverter 0.7.1

* Fixed automatic file detection for directories (e.g., Waters `.raw` and Agilent `.D`)
@@ -10,7 +17,7 @@
### Major features

* Added preliminary support for 'Varian Workstation' (`.sms`) format through `read_varian_sms` function.
* Added preliminary support for 'Shimadzu QGD' GCMS files through the `read_shimadzu_qgd` function.
* Added preliminary support for 'Shimadzu QGD' GC-MS files through the `read_shimadzu_qgd` function.
* Added preliminary support for 'Allotrope Simple Model' (ASM) 2D chromatography data files.
* Added support for reading multiple files from 'Agilent' `.D` directories through `read_agilent_d` function.
* Added internal parser for 'Agilent ChemStation' MS files through `read_agilent_ms`.
88 changes: 49 additions & 39 deletions R/attach_metadata.R
@@ -288,6 +288,7 @@ attach_metadata <- function(x, meta, format_in, format_out, data_format,
fs::path_ext_remove(basename(source_file)),
meta[["SampleInfo.smpl_name"]]),
sample_id = get_metadata_field(meta, "SampleInfo.smpl_id"),
vial = get_metadata_field(meta, 'SampleInfo.smpl_vial'),
sample_type = get_metadata_field(meta, "SampleInfo.smpl_type"),
sample_dilution = get_metadata_field(meta, "SampleInfo.dil_factor"),
sample_injection_volume = get_metadata_field(meta, "SampleInfo.inj_vol"),
@@ -309,52 +310,73 @@ attach_metadata <- function(x, meta, format_in, format_out, data_format,
parser = "chromconverter",
format_out = format_out)
}, "chromeleon" = {
datetime.idx <- unlist(sapply(c("Date$", "Time$"), function(str){
grep(str, names(meta))
})
)
datetime <- unlist(meta[datetime.idx])
if (length(datetime) > 1){
datetime <- paste(datetime, collapse = " ")
}
datetime <- as.POSIXct(datetime, format = c("%m/%d/%Y %H:%M:%S",
"%d.%m.%Y %H:%M:%S",
"%m/%d/%Y %H:%M:%S %p %z"),
tz = "UTC")
datetime <- datetime[!is.na(datetime)]
if (is.null(meta$`Inject Time`)){
datetime.idx <- unlist(sapply(c("Date$", "Time$"), function(str){
grep(str, names(meta))
})
)
datetime <- unlist(meta[datetime.idx])
if (length(datetime) > 1){
datetime <- paste(datetime, collapse = " ")
}
datetime <- as.POSIXct(datetime, format = c("%m/%d/%Y %H:%M:%S",
"%d.%m.%Y %H:%M:%S",
"%m/%d/%Y %H:%M:%S %p %z"),
tz = "UTC")
datetime <- datetime[!is.na(datetime)]
} else {
datetime <- sub("(\\+\\d{2}):(\\d{2})$", "\\1\\2", meta$`Inject Time`)
datetime <- as.POSIXct(strptime(datetime, format = "%d/%m/%Y %H:%M:%S %z"),
tz="UTC")

}
time_interval_unit <- tryCatch({
get_time_unit(grep("Average Step", names(meta), value = TRUE)[1],
format_in = "chromeleon")}, error = function(err) NA)
time_unit <- tryCatch({
get_time_unit(grep("Time Min.", names(meta), value = TRUE)[1],
format_in = "chromeleon")}, error = function(err) NA)
if (is.null(meta$Name) && !is.null(meta$Injection)){
meta$Name <- meta$Injection
}
if (is.null(meta$`Signal Unit`)){
unit <- grep("Signal Min", names(meta), value = TRUE)
unit <- sub(".*(?:\\((.*)\\)).*|.*", "\\1", unit)
meta$`Signal Unit` <- unit
}

structure(x, instrument = NA,
detector = meta$Detector,
software = meta$`Generating Data System`,
method = meta$`Instrument Method`,
batch = NA,
batch = meta$Sequence,
operator = meta$`Operator`,
run_datetime = datetime,
# run_date = meta$`Injection Date`,
# run_time = meta$`Injection Time`,
sample_name = ifelse(is.null(meta$Injection),
sample_name = ifelse(is.null(meta$Name),
fs::path_ext_remove(basename(source_file)),
meta$Injection),
meta$Name),
sample_id = NA,
sample_injection_volume = meta$`Injection Volume`,
sample_amount = meta$`Injection Volume`,
time_range = c(meta$`Time Min. (min)`, meta$`Time Max. (min)`),
# start_time = meta$`Time Min. (min)`,
# end_time = meta$`Time Max. (min)`,
time_interval = meta[[grep("Average Step", names(meta))]],
vial = meta$Position,
sample_injection_volume = meta[[which(grepl("Volume",names(meta)))]],
sample_amount = NA,
sample_dilution = meta$`Dilution Factor`,
sample_type = get_metadata_field(meta, "Type"),
time_range = c(get_metadata_field(meta, "Time Min. (min)"),
get_metadata_field(meta, "Time Max. (min)")),
time_interval = tryCatch({
meta[[grep("Average Step", names(meta))]]
}, error = function(err) NA),
time_interval_unit = time_interval_unit,
time_unit = time_unit,
# uniform_sampling = meta$`Min. Step (s)` == meta$`Max. Step (s)`,
detector_range = NA,
detector_range = if (isTRUE(meta$`Spectral Field` == "3DFIELD")){
# `if`/`else` keeps both range endpoints; `ifelse()` would return only the first
c(get_metadata_field(meta, "Scan Min. (nm)"),
get_metadata_field(meta, "Scan Max. (nm)"))
} else NA,
detector_y_unit = meta$`Signal Unit`,
source_file = source_file,
source_file_format = source_file_format,
source_sha1 = digest::digest(source_file, algo="sha1", file=TRUE),
source_sha1 = digest::digest(source_file, algo = "sha1",
file = TRUE),
format_out = format_out,
data_format = data_format,
parser = "chromconverter"
@@ -698,18 +720,6 @@ read_masshunter_metadata <- function(file){
meta_sample
}

#' @name read_chromeleon_metadata
#' @return A list containing extracted metadata.
#' @author Ethan Bass
#' @noRd
read_chromeleon_metadata <- function(x){
meta_fields <- grep("Information:", x)
meta <- do.call(rbind, strsplit(x[(meta_fields[1] + 1):(meta_fields[length(meta_fields)] - 1)], "\t"))
rownames(meta) <- meta[, 1]
meta <- as.list(meta[, -1])
meta
}

#' @name read_waters_metadata
#' @param file file
#' @return A list containing extracted metadata.
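The new `Inject Time` branch in `attach_metadata` strips the colon from a trailing UTC offset before parsing, because `%z` is documented as a signed `+hhmm` offset. Below is a minimal base R sketch of that step. The sample timestamp is hypothetical, and the `[+-]` character class is an assumption that goes beyond the diff, whose pattern only matches a leading `+`.

```r
# Hypothetical 'Inject Time' value; the day/month order and the "+01:00"
# offset style are assumptions based on the format string used in the diff.
inject_time <- "14/12/2024 09:30:00 +01:00"

# %z is documented as "+0100", so drop the colon from the trailing offset.
# Assumption: "[+-]" also covers negative offsets, which the pattern in the
# diff ("\\+") would leave untouched.
normalized <- sub("([+-]\\d{2}):(\\d{2})$", "\\1\\2", inject_time)

# When %z is part of the format, the parsed time is converted to `tz`.
run_datetime <- as.POSIXct(normalized, format = "%d/%m/%Y %H:%M:%S %z",
                           tz = "UTC")

format(run_datetime, "%Y-%m-%d %H:%M:%S", tz = "UTC")
```

Passing `tz = "UTC"` to the parsing call itself means the offset is resolved directly against UTC, so 09:30 at +01:00 is stored as 08:30 UTC regardless of the session's local time zone.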
40 changes: 32 additions & 8 deletions R/read_chromeleon.R
@@ -1,3 +1,4 @@

#' Chromeleon ASCII reader
#'
#' Reads 'Thermo Fisher Chromeleon™ CDS' files into R.
@@ -15,7 +16,8 @@
#' @author Ethan Bass
#' @export

read_chromeleon <- function(path, format_out = c("matrix", "data.frame", "data.table"),
read_chromeleon <- function(path, format_out = c("matrix", "data.frame",
"data.table"),
data_format = c("wide", "long"),
read_metadata = TRUE,
metadata_format = c("chromconverter", "raw")){
@@ -27,21 +29,29 @@ read_chromeleon <- function(path, format_out = c("matrix", "data.frame", "data.t
xx <- readLines(path)
xx <- remove_unicode_chars(xx)
start <- tail(grep("Data:", xx), 1)
x <- read.csv(path, skip = start, sep = "\t", row.names = NULL)
x <- x[,-2, drop = FALSE]
x <- x[,colSums(is.na(x)) < nrow(x)]
if (any(grepl(",",as.data.frame(x)[-1, 2]))){
x <- read.csv(path, skip = start, sep = "\t", row.names = NULL,
check.names = FALSE)
x <- x[, -2, drop = FALSE]
x <- x[, colSums(is.na(x)) < nrow(x)]
if (any(grepl(",", as.data.frame(x)[-1, 2]))){
decimal_separator <- ","
x <- apply(x, 2, function(x) gsub("\\.", "", x))
x <- apply(x, 2, function(x) gsub(",", ".", x))
} else {
decimal_separator <- "."
}
x <- apply(x, 2, as.numeric)
colnames(x) <- c("rt", "intensity")
if (ncol(x) == 2){
colnames(x) <- c("rt", "intensity")
}
if (data_format == "wide"){
rownames(x) <- x[,1]
x <- x[, 2, drop = FALSE]
rownames(x) <- x[, 1]
x <- x[, -1, drop = FALSE]
}
if (data_format == "long" && ncol(x) > 2){
rownames(x) <- x[, 1]
x <- x[, -1, drop = FALSE]
x <- reshape_chrom(x, data_format = "long")
}
x <- convert_chrom_format(x, format_out = format_out)
if (read_metadata){
@@ -57,3 +67,17 @@ read_chromeleon <- function(path, format_out = c("matrix", "data.frame", "data.t
}
x
}
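
The updated `read_chromeleon` no longer assumes a two-column table: a 3D export keeps one column per wavelength, and decimal commas are normalized before conversion to numeric. The following is a rough, self-contained sketch of that flow in base R. The sample lines, the helper `to_numeric`, and the long-format column names (`rt`, `lambda`, `intensity`) are all hypothetical, and the comma check is done per column here, whereas the function in the diff decides once for the whole table.

```r
# Hypothetical 3D data block: the first column is retention time, the
# remaining columns are wavelengths (nm); values use a decimal comma.
lines <- c("Time (min)\t200\t201\t202",
           "0,000\t1,5\t2,5\t3,5",
           "0,010\t1.001,0\t2,0\t3,0")

# check.names = FALSE keeps numeric headers such as "200" unchanged.
x <- read.csv(text = lines, sep = "\t", check.names = FALSE,
              colClasses = "character")

# Hypothetical helper: remove thousands separators, then turn the decimal
# comma into a decimal point before converting to numeric.
to_numeric <- function(v) {
  if (any(grepl(",", v, fixed = TRUE))) {
    v <- gsub(".", "", v, fixed = TRUE)
    v <- gsub(",", ".", v, fixed = TRUE)
  }
  as.numeric(v)
}
m <- vapply(x, to_numeric, numeric(nrow(x)))

# Wide format: retention times become row names, wavelengths stay as columns.
rt <- m[, 1]
m <- m[, -1, drop = FALSE]
rownames(m) <- rt

# Long format: one row per (retention time, wavelength) pair.
long <- data.frame(rt = rep(rt, times = ncol(m)),
                   lambda = rep(as.numeric(colnames(m)), each = nrow(m)),
                   intensity = as.vector(m))
```

Using `-1` rather than `2` when dropping the retention-time column is what lets the same code path serve both 2D chromatograms and 3D fields.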

#' @name read_chromeleon_metadata
#' @return A list containing extracted metadata.
#' @author Ethan Bass
#' @noRd
read_chromeleon_metadata <- function(x){
start <- tail(grep("Data:", x), 1)
meta <- strsplit(x[seq_len(start - 1)], split = '\t')
meta <- meta[which(sapply(meta,length) == 2)]
meta <- do.call(rbind, meta)
rownames(meta) <- meta[, 1]
meta <- as.list(meta[, -1])
meta
}
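
The relocated `read_chromeleon_metadata` now treats every tab-separated line above the final `Data:` marker as a potential key/value pair and keeps only the lines that split into exactly two fields. A small sketch of the same idea on an in-memory header follows. The sample lines and the name `parse_header` are hypothetical, and `setNames()` is used here in place of the row-name approach in the diff.

```r
# Hypothetical header in the "key<TAB>value" layout, followed by the data block.
x <- c("Injection Information:",
       "Injection\tsample_01",
       "Position\tRA1",
       "",
       "Chromatogram Data:",
       "Time (min)\tStep (s)\tValue (mAU)",
       "0.000000\tn.a.\t0.001")

# Hypothetical stand-in for the parser shown in the diff.
parse_header <- function(x) {
  # Everything before the last line matching "Data:" is header material.
  start <- tail(grep("Data:", x), 1)
  fields <- strsplit(x[seq_len(start - 1)], split = "\t")
  # Section titles and blank lines do not split into two fields; drop them.
  fields <- fields[lengths(fields) == 2]
  meta <- do.call(rbind, fields)
  setNames(as.list(meta[, 2]), meta[, 1])
}

meta <- parse_header(x)
names(meta)
```

Filtering on the field count means the parser does not depend on the exact wording of the section titles, only on the two-column layout of the entries themselves.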
12 changes: 6 additions & 6 deletions README.md
@@ -17,17 +17,17 @@ chromConverter aims to facilitate the conversion of chromatography data from var

### Formats

##### ChromConverter
##### ChromConverter (internal parsers)
- 'Agilent ChemStation' & 'OpenLab' `.uv` files (versions 131, 31)
- 'Agilent ChemStation' & 'OpenLab' `.ch` files (versions 30, 130, 8, 81, 179, 181)
- Allotrope® Simple Model (ASM) 2D chromatograms (`.asm`)
- ÅNDI (Analytical Data Interchange) Chromatography & MS formats (`.cdf`)
- ANDI (Analytical Data Interchange) Chromatography & MS formats (`.cdf`)
- 'Allotrope Simple Model' (ASM) 2D chromatograms.
- mzML (`.mzml`) & mzXML (`.mzxml`) (via RaMS).
- mzML (`.mzml`) & mzXML (`.mzxml`) (via *RaMS*).
- 'Shimadzu LabSolutions' ascii (`.txt`)
- 'Shimadzu GCsolution' data files (`.gcd`)
- 'Shimadzu GCMSsolution' data files (`.qgd`)
- 'Shimadzu LabSolutions' `.lcd` (*provisional support* for PDA and chromatogram streams)
- 'Shimadzu LabSolutions' `.lcd` (*provisional support* for PDA, chromatogram, and peak table streams)
- 'Thermo Scientific Chromeleon' ascii (`.txt`)
- 'Varian Workstation' (`.SMS`)
- 'Waters Empower' ascii (`.arw`)
@@ -149,15 +149,15 @@ Thermo RAW files can be converted by calling the [ThermoRawFileParser](https://g

### Further analysis

For downstream analyses of chromatographic data, you can also check out my package [chromatographR](https://ethanbass.github.io/chromatographR/). For interactive visualization of chromatograms, you can check out my new package [ShinyChromViewer](https://github.com/ethanbass/ShinyChromViewer) (alpha release). There is also a vignette providing an introduction to some basic syntax for [plotting mass spectrometry data](https://ethanbass.github.io/chromConverter/articles/plot_ms.html) returned by chromConverter in various R dialects.
For downstream analyses of chromatographic data, you can also check out my package [chromatographR](https://ethanbass.github.io/chromatographR/). For interactive visualization of chromatograms, you can check out my new package [ShinyChromViewer](https://github.com/ethanbass/ShinyChromViewer) (alpha release). There is also a vignette providing an introduction to some basic syntax for [plotting mass spectrometry data](https://ethanbass.github.io/chromConverter/articles/plot_ms.html) returned by chromConverter in various R dialects (e.g., base R, tidyverse, and data.table).

### Contributing

Contributions of source code, ideas, or documentation are always welcome. Please get in touch (preferably by opening a GitHub [issue](https://github.com/ethanbass/chromatographR/issues)) to discuss any suggestions or to file a bug report. Some good reasons to file an issue:

- You've found an actual bug.
- You're getting a cryptic error message that you don't understand.
- You have a file format you'd like to read that isn't currently supported by chromConverter. (Please make sure to attach example files or a link to the files.)
- You have a file format you'd like to read that isn't currently supported by chromConverter. (Please make sure to attach example files or a link to the files).
- There's another new feature you'd like to see implemented.

**Note: Before filing a bug report, please make sure to install the latest development version of chromConverter from GitHub**, in case your bug has already been patched. After installing the latest version, you may also need to refresh your R session to remove the older version from the cache.
2 changes: 1 addition & 1 deletion inst/CITATION
@@ -5,7 +5,7 @@ bibentry(
title = "chromConverter: Chromatographic File Converter",
author = "Ethan Bass",
year = "2024",
version = "version 0.7.1",
version = "version 0.7.2",
doi = "10.5281/zenodo.6792521",
url = "https://ethanbass.github.io/chromConverter/",
textVersion = paste("Bass, E. (2024).",