This collection of code examples aims at analysing (DNA) sequence data with Python. It started as a small private "learning by doing" project on a basic level, so the focus is on gaining coding experience within a limited amount of free time, sharing ideas and getting inspiration rather than on finding the most elegant coding solution. The underlying data can be obtained freely from many scientific bioinformatics databases (check out my website www.bio-century.net for further info). More parts are to come...
Ideas and (fully executable) example code snippets are presented in Jupyter Notebook (.ipynb) file format. Jupyter Notebook or an equivalent extension in the IDE of your choice is thus required to modify them.
The SequenceAnalysis.ipynb file contains examples of basic sequence analysis, e.g. sequence identification, classification of mutations (silent, missense and nonsense), graphical representation of genes in sequences and color-coding the different segments of a tRNA.
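To illustrate the mutation classes mentioned above, here is a minimal sketch of how a single-codon substitution could be classified. It is not the notebook's implementation: the names `CODON_TABLE` and `classify_mutation` are hypothetical, and the sketch assumes the standard genetic code and DNA codons written with T rather than U.

```python
from itertools import product

# Standard genetic code, built from the usual TCAG ordering
# (first base changes slowest, third base fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_mutation(ref_codon: str, mut_codon: str) -> str:
    """Classify a codon substitution (hypothetical helper, not from the notebook).

    Returns "none", "silent", "nonsense" or "missense".
    A lost stop codon is not treated separately in this simple sketch.
    """
    ref_codon, mut_codon = ref_codon.upper(), mut_codon.upper()
    if ref_codon == mut_codon:
        return "none"
    ref_aa = CODON_TABLE[ref_codon]
    mut_aa = CODON_TABLE[mut_codon]
    if ref_aa == mut_aa:
        return "silent"      # same amino acid despite a changed base
    if mut_aa == "*":
        return "nonsense"    # premature stop codon
    return "missense"        # different amino acid
```

For example, `classify_mutation("GAG", "GTG")` compares glutamate with valine and therefore reports a missense mutation, while `classify_mutation("TGG", "TGA")` turns tryptophan into a stop codon and is reported as nonsense.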
Two highlights may be the implementation and visualization of the Needleman-Wunsch algorithm for sequence alignment and the graphical user interface for showing multiple sequences of interest (SOIs) within a target sequence.
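For readers unfamiliar with Needleman-Wunsch, the following is a compact sketch of global alignment by dynamic programming. It is an independent illustration, not the code from the notebook: the function name `needleman_wunsch` and the scoring values (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen for the example.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Globally align two sequences; returns (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def substitution(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the matrix row by row
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + substitution(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,                     # gap in b
                score[i][j - 1] + gap,                     # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment
    aligned_a, aligned_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + substitution(i, j):
            aligned_a.append(a[i - 1])
            aligned_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(a[i - 1])
            aligned_b.append("-")
            i -= 1
        else:
            aligned_a.append("-")
            aligned_b.append(b[j - 1])
            j -= 1

    return "".join(reversed(aligned_a)), "".join(reversed(aligned_b)), score[n][m]
```

With these scoring values, `needleman_wunsch("ACGT", "AGT")` returns `("ACGT", "A-GT", 2)`: three matches and one gap. When several alignments share the best score, the traceback order above prefers a diagonal step, so only one of them is reported.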
All you need is a running Jupyter Notebook distribution of some sort as well as a Python installation fulfilling the requirements listed in section Requirements. VS Code with its .ipynb extension is strongly recommended.
Here is room for your inspiration, which is very much appreciated! Please be patient regarding the implementation of your ideas, since the resources (time and personnel) are limited.
- Progress in groundwork towards NGS-sequencing
- Next Idea 1
- Next Idea 2
- ...
Sequence Analysis Repo
|
| LICENSE
| README.md
| SequenceAnalysis.html html-transformed output of the .ipynb-file for representation purposes
| SequenceAnalysis.ipynb Main .ipynb-file explaining tasks and giving example code to solve them
|
+---ExternalPackages
| | TerminalColors.py External package defining the colors used to print sequences in the Jupyter-Notebook-terminal
| \---__pycache__ +++ (COLLAPSED): Auto-generated pycache
|
+---Figures +++ (COLLAPSED): Example images for clarification
|
+---Figures_scientific
|
+---Icons +++ (COLLAPSED): Icons / Logos of bio-century.net
|
+---ModulesExternal
|
+---ModulesOwn Functions / methods developed for Sequence Analysis
| | A_Groundwork.py
| | B_SimpleTabbedGUI.py
| | D_KmerAnalysis.py
| |
| +---A_Groundwork_Data
| +---B_SimpleTabbedGUI_Data
| +---D_KmerAnalysis
| \---__pycache__ (COLLAPSED): Auto-generated pycache
|
+---requirements
| \---requirements.txt List of required Python packages
|
\---_themes +++ (COLLAPSED) Themes for simple GUI in order to make the window look nicer
The required Python packages are listed in ./requirements/requirements.txt.
This work is published under the GPL-2.0 license.
Many thanks to the comber.io admin for inspiration, code reviews and for initializing the bio-century.net website.
Sources are given directly in the respective code sections.