---
bibliography: vignettes/bibliography.bib
output: 
  github_document
---

<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "inst/vignette-supp/",
  echo=TRUE, 
  warning=FALSE, 
  message=FALSE,
  tidy=TRUE
)
```

<!-- badges: start -->
[![R-CMD-check](https://github.com/gtonkinhill/fastbaps/workflows/R-CMD-check/badge.svg)](https://github.com/gtonkinhill/fastbaps/actions)
[![DOI](https://zenodo.org/badge/137083307.svg)](https://zenodo.org/badge/latestdoi/137083307)
<!-- badges: end -->


# fastbaps

## Installation

`fastbaps` is currently available on GitHub. It can be installed with `devtools`:

```{r, eval = FALSE}
install.packages("devtools")

devtools::install_github("gtonkinhill/fastbaps")
```

To also build the vignette during installation, run:

```{r, eval=FALSE}
devtools::install_github("gtonkinhill/fastbaps", build_vignettes = TRUE)
```

### Conda

`fastbaps` can also be installed using Conda:

```
conda install -c conda-forge -c bioconda -c defaults r-fastbaps
```

## Choice of Prior

fastbaps includes several options for the Dirichlet prior hyperparameters. In order from most to least conservative, these are `symmetric`, `baps`, `optimise.symmetric` and `optimise.baps`. The prior is set using the `optimise_prior` function.
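
As a rough illustration of how the choice of prior can change the result, the sketch below runs the same clustering under each prior and counts the top-level clusters. It assumes the option strings are spelled the way `optimise_prior()` accepts them (`optimise.symmetric` as used in the Quick Start, and `optimise.baps` by analogy); check `?optimise_prior` if a value is rejected. The names `prior.types`, `sparse.prior` and `n.clusters` are introduced here for illustration only, and the block is not evaluated when the README is knitted.

```r
library(fastbaps)

# Example alignment shipped with the package
fasta.file.name <- system.file("extdata", "seqs.fa", package = "fastbaps")
sparse.data <- import_fasta_sparse_nt(fasta.file.name)

# Prior options, listed from most to least conservative (assumed spellings)
prior.types <- c("symmetric", "baps", "optimise.symmetric", "optimise.baps")

# Re-run the clustering under each prior and count the resulting clusters
n.clusters <- sapply(prior.types, function(prior.type) {
  sparse.prior <- optimise_prior(sparse.data, type = prior.type)
  baps.hc <- fast_baps(sparse.prior)
  clusters <- best_baps_partition(sparse.prior, ape::as.phylo(baps.hc))
  length(unique(clusters))
})

# Named vector: one cluster count per prior type
n.clusters
```

A more conservative prior would generally be expected to give fewer clusters, so comparing these counts is a quick way to see how sensitive a dataset is to the choice.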

It is also possible to condition on a pre-existing phylogeny, which allows a user to partition that phylogeny using the fastbaps algorithm. This is described in more detail in the Introduction section below.

## Quick Start

Run fastbaps.

**NOTE:** Replace the value of `fasta.file.name` with the path to your own FASTA file. The `system.file` function is used here only to locate the example data shipped with the package.

```{r, fig.width =8, fig.height=6, fig.align='center'}
# devtools::install_github("gtonkinhill/fastbaps")
library(fastbaps)
library(ape)

# Load the example alignment in sparse format
fasta.file.name <- system.file("extdata", "seqs.fa", package = "fastbaps")
sparse.data <- import_fasta_sparse_nt(fasta.file.name)
# Set the prior hyperparameters
sparse.data <- optimise_prior(sparse.data, type = "optimise.symmetric")
# Build the hierarchical clustering, then select the best partition of it
baps.hc <- fast_baps(sparse.data)
clusters <- best_baps_partition(sparse.data, as.phylo(baps.hc))
```

All of these steps can be combined, and the algorithm run over multiple levels, with:

```{r}
sparse.data <- optimise_prior(sparse.data, type = "optimise.symmetric")
multi <- multi_res_baps(sparse.data)
```

## Command Line Script

The fastbaps package now includes a command line script. Its location can be found by running:

```{r, eval=FALSE}
system.file("run_fastbaps", package = "fastbaps")
```

This script can then be copied to a location on the user's path. If you installed fastbaps using Conda, this has already been done for you.
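
One possible way to do the copy from within R is sketched below. The helper `install_script` and the destination `~/bin` are hypothetical choices made for this example and are not part of fastbaps; the destination is only useful if it is already on your `PATH`. The block is not evaluated when the README is knitted.

```r
# Hypothetical helper: copy a script into a directory and make it executable
install_script <- function(script.path, dest.dir) {
  # Create the destination directory if it does not exist yet
  dir.create(dest.dir, showWarnings = FALSE, recursive = TRUE)
  dest.path <- file.path(dest.dir, basename(script.path))
  copied <- file.copy(script.path, dest.path, overwrite = TRUE)
  if (!copied) stop("Could not copy ", script.path, " to ", dest.dir)
  # Mark the copy as executable (has no effect on Windows)
  Sys.chmod(dest.path, mode = "0755")
  invisible(dest.path)
}

# system.file() returns "" when the package or file cannot be found
script.path <- system.file("run_fastbaps", package = "fastbaps")
if (nzchar(script.path)) {
  # Assumed destination: a personal bin directory that is on PATH
  install_script(script.path, path.expand("~/bin"))
}
```

Copying rather than symlinking keeps the script usable if the R library moves, at the cost of needing to repeat the copy after upgrading the package.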

## Citation

To cite fastbaps, please use:

> Tonkin-Hill,G., Lees,J.A., Bentley,S.D., Frost,S.D.W. and Corander,J. (2019) Fast hierarchical Bayesian analysis of population structure. Nucleic Acids Res., 10.1093/nar/gkz361.


## Introduction

```{r, echo = FALSE}
intro_rmd <- 'vignettes/introduction.Rmd'

raw_rmd <- readLines(intro_rmd)

# remove yaml 
yaml_lines <- grep("---", raw_rmd)

# remove appendix (session info)
appendix <- grep("Appendix", raw_rmd)

compressed_rmd <- raw_rmd[c(-seq(yaml_lines[1], yaml_lines[2], by = 1), 
                            -seq(appendix, length(raw_rmd)))]
writeLines(compressed_rmd, "child.Rmd")
```

```{r, child = 'child.Rmd'}
```

```{r cleanup, echo=FALSE, include=FALSE}
if (file.exists("child.Rmd")) {
  file.remove("child.Rmd")
}
```