Skip to content

Latest commit

 

History

History
10 lines (7 loc) · 889 Bytes

File metadata and controls

10 lines (7 loc) · 889 Bytes

Data-challenge-for-kernel-method

The goal of the data challenge is to learn how to implement machine learning algorithms, gain understanding about them and adapt them to structural data. For this reason, we have chosen a sequence classification task: predicting whether a DNA sequence region is binding site to a specific transcription factor.

Transcription factors (TFs) are regulatory proteins that bind specific sequence motifs in the genome to activate or repress transcription of target genes. Genome-wide protein-DNA binding maps can be profiled using some experimental techniques and thus all genomic segments can be classified into two classes for a TF of interest: bound or unbound. In this challenge, we will work with three datasets corresponding to three different TFs.

Details of data and objective can be seen on https://www.kaggle.com/c/advanced-learning-models-2020