SURF

The Statistical Utility for RBP Functions (SURF) is an integrative analysis framework for identifying alternative splicing (AS), alternative transcription initiation (ATI), and alternative polyadenylation (APA) events regulated by individual RBPs, and for elucidating the protein-RNA interactions governing these events. We used SURF to analyze data for 104 RBPs (K562 cells, available from ENCODE).

A detailed vignette is available here.

Installation

You can install the development version of surf from GitHub with:

# install.packages("devtools")
devtools::install_github("fchen365/surf")

What can you do with SURF?

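SURF supports several analyses centered on alternative transcriptional regulation (ATR) events, that is, the AS, ATI, and APA events described above. Depending on the data you provide, there are four things you can do with SURF, each adding one data type to the previous task.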

	Data	Format	Task
1	genome annotation	any (gtf, gff, …)	parse ATR events
2	+ RNA-seq	alignment (bam)	detect differential ATR events
3	+ CLIP-seq	alignment (bam)	detect functional association
4	+ external RNA-seq	summarized table	differential transcriptional activity

SURF Pipeline

— One task at one call

The four tasks of the SURF pipeline are streamlined. Once you have the data in hand (see the following sub-section), each task can be performed with a single function call:

library(surf)

event <- parseEvent(anno_file)                              # task 1
drr <- drseq(event, rna_seq_sample)                         # task 2
far <- faseq(drr, clip_seq_sample)                          # task 3
dar <- daseq(far, getRankings(exprMat), ext_sample)         # task 4

Here, anno_file, rna_seq_sample, clip_seq_sample, and ext_sample are data descriptions, and exprMat is a table of external transcriptome quantification (e.g., TCGA, GTEx, …).

— Tell `surf` about your data

Describing your data should be easy. Simply follow the example below.

For task 1, the path to the annotation file will do.

anno_file <- "gencode.v24.annotation.filtered.gtf"

For task 2, surf needs to know where the alignment files (bam) are and the experimental condition for differential analysis (e.g., RBP “knock-down” and “wild-type” control).

rna_seq_sample <- data.frame(
  row.names = c('sample1', 'sample2', 'sample3', 'sample4'),
  bam = paste0("rna-seq/bam/sample", 1:4, ".bam"),
  condition = c('knock-down', 'knock-down', 'wild-type', 'wild-type'),
  stringsAsFactors = FALSE
)
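Before starting a long run, it can help to sanity-check a sample sheet like the one above. The following is a minimal sketch in base R, not part of surf: the helper name check_sample_sheet and its n_conditions argument are hypothetical, and the assumption that a differential comparison needs exactly two conditions is ours. The demo uses empty placeholder files in a temporary directory rather than real alignments.

```r
# Hypothetical helper (not part of surf): check that a sample sheet has the
# columns used in this README and report any BAM paths that do not exist.
check_sample_sheet <- function(samples, n_conditions = 2) {
  stopifnot(is.data.frame(samples))
  missing_cols <- setdiff(c("bam", "condition"), colnames(samples))
  if (length(missing_cols) > 0) {
    stop("missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  n_found <- length(unique(samples$condition))
  if (n_found != n_conditions) {
    stop("expected ", n_conditions, " conditions, found ", n_found)
  }
  # Return the paths that cannot be found (empty if all files exist).
  samples$bam[!file.exists(samples$bam)]
}

# Demo with placeholder files; sample4.bam is deliberately left out.
demo_dir <- tempfile("bam")
dir.create(demo_dir)
bam_files <- file.path(demo_dir, paste0("sample", 1:4, ".bam"))
invisible(file.create(bam_files[1:3]))

demo_sample <- data.frame(
  row.names = paste0("sample", 1:4),
  bam = bam_files,
  condition = c("knock-down", "knock-down", "wild-type", "wild-type"),
  stringsAsFactors = FALSE
)

missing_bam <- check_sample_sheet(demo_sample)
basename(missing_bam)  # "sample4.bam"
```

The helper returns the missing paths instead of stopping, so you can decide whether a missing file is a typo or a sample still being processed.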

Similarly for task 3, surf needs to know where the alignment files (bam) are and the experimental condition (e.g., “IP” and the input control “SMI”).

clip_seq_sample <- data.frame(
  row.names = c('sample5', 'sample6', 'sample7'),
  bam = paste0('clip-seq/bam/', 5:7, '.bam'),
  condition = c('IP', 'IP', 'SMI'),
  stringsAsFactors = FALSE
)

Finally, for task 4, surf assumes that you have transcriptome quantification summarized in a table exprMat, whose rows correspond to genomic features (e.g., genes, transcripts, …) and whose columns correspond to samples. You can use any measure you prefer (e.g., TPM, RPKM, …). Then, let surf know the sample group (condition) of each column:

ext_sample <- data.frame(
  row.names = colnames(exprMat),
  condition = rep(c('TCGA', 'GTEx'), c(173, 337))
)
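Note that rep(c('TCGA', 'GTEx'), c(173, 337)) labels the samples purely by position, so it is only correct if the first 173 columns of exprMat are TCGA samples and the remaining 337 are GTEx samples. The toy sketch below illustrates the same construction on a small made-up matrix and adds a consistency check. The matrix values, feature names, sample names, and group sizes are all invented for illustration.

```r
set.seed(1)

# Toy stand-in for exprMat: 5 features (rows) x 6 samples (columns).
exprMat <- matrix(
  rexp(30), nrow = 5,
  dimnames = list(paste0("gene", 1:5), paste0("s", 1:6))
)

# Hypothetical group sizes, listed in the same order as the columns.
group_sizes <- c(TCGA = 2, GTEx = 4)

ext_sample <- data.frame(
  row.names = colnames(exprMat),
  condition = rep(names(group_sizes), group_sizes),
  stringsAsFactors = FALSE
)

# The labels only make sense if they cover every column exactly once.
stopifnot(
  sum(group_sizes) == ncol(exprMat),
  identical(rownames(ext_sample), colnames(exprMat))
)

table(ext_sample$condition)
```

If the columns of your table are not already grouped by source, reorder them or build the condition vector from sample metadata instead of relying on position.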

Reference

Chen, F., Keleş, S. SURF: integrative analysis of a compendium of RNA-seq and CLIP-seq datasets highlights complex governing of alternative transcriptional regulation by RNA-binding proteins. Genome Biol 21, 139 (2020). doi:10.1186/s13059-020-02039-7
