%----------%
Fecal contamination in bodies of water frequently plagues public and environmental water supplies.
%----------%
%Often, restricting access to water sources until the contaminants dissipate is the only option natural resource managers have.
%----------%
Finding the source of the fecal matter can help prevent further contamination and allow future contamination to be curbed more quickly.
%----------%
\MSTlong{} (\mst{}) aims to determine the source \spec{} of strains of microorganisms, and \libdep{} \mst{} is one method that can assist in sourcing this fecal matter.
%----------%
Recently, the Biology Department and the Computer Science Department at \cplong{} (\cp{}) teamed up to build a database called \cploplong{} (\cplop{}).
%----------%
Students collect fecal samples, culture \ecoli{} \isols{} from the samples, pyrosequence the two \itslong{} (\itsshort{}) DNA regions of each \isol{}, and insert the resulting data, called pyroprints, into \cplop{}.
%----------%
This work investigates two \mst{} methodologies that use \cplop{}: a bacterial strain-based approach and an \isol{}-based approach.
%----------%
%----------%
%This work investigates two \mst{} methodologies that use \cplop{}: one that uses \dbscan{} to cluster for \bslongs{} and another called the \kraplong{} (\krap{}), which consists of four strategies to resolve the multiple \knnlong{} lists that result from querying the two \itsshort{} regions of an \ecoli{} \isol{}.
%----------%
By using \dbscan{} to build \bslongs{}, we found that between 41\% and 51\% of the clustered data fell into pure strains, while another 34\% to 43\% fell into clusters in which their own \spec{} was dominant, validating the effectiveness of using \ecoli{} in \cplop{}.
%----------%
We also verified the expected existence of transient strains and found them to be few in number.
%----------%
Unfortunately, between 27\% and 53\% of the data remained unclustered.
%----------%
As a fallback, we turn to the \kraplong{} (\krap{}), which consists of four strategies to resolve the multiple \knnlong{} lists that result from querying the two \itsshort{} regions of an \ecoli{} \isol{}.
%----------%
Its resolution strategies achieve between 65\% and 85\% overall accuracy and over 75\% accuracy for well-represented \spec{}.
%----------%
%\vspace{-72pt}