Fix CRISPR parser #299 (#302)
* fix CRISPR parser #299

The assumed minimal length of CRISPR spacers was reduced from at least 10 down to at least 1 in the regex.
* polish code
* relax CRISPR spacer length regex to 1 or 2 digits
oschwengers authored Jul 16, 2024
1 parent c93c3f1 commit f90c836
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions bakta/features/crispr.py
@@ -11,7 +11,7 @@
import bakta.utils as bu


- RE_CRISPR = re.compile(r'(\d{1,8})\s+(\d{2})\s+(\d{1,3}\.\d)\s+(?:(\d{2})\s+)?([ATGCN]+)?\s+([ATGCN\.-]+)\s*(?:([ATGCN]+))?')
+ RE_CRISPR = re.compile(r'(\d{1,8})\s+(\d{2})\s+(\d{1,3}\.\d)\s+(?:(\d{1,2})\s+)?([ATGCN]+)?\s+([ATGCN\.-]+)\s*(?:([ATGCN]+))?')


log = logging.getLogger('CRISPR')
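
The effect of the regex change in this hunk can be checked in isolation. The sketch below copies both patterns from the diff and runs them against two made-up rows. The row layout (position, repeat length, percent identity, spacer length, left flank, repeat, spacer) is an assumption inferred from the capture groups, and `spacer_length`, `ROW_SHORT`, and `ROW_LONG` are hypothetical names that are not part of Bakta.

```python
import re

# Patterns copied from the diff: before and after this commit.
RE_CRISPR_OLD = re.compile(r'(\d{1,8})\s+(\d{2})\s+(\d{1,3}\.\d)\s+(?:(\d{2})\s+)?([ATGCN]+)?\s+([ATGCN\.-]+)\s*(?:([ATGCN]+))?')
RE_CRISPR_NEW = re.compile(r'(\d{1,8})\s+(\d{2})\s+(\d{1,3}\.\d)\s+(?:(\d{1,2})\s+)?([ATGCN]+)?\s+([ATGCN\.-]+)\s*(?:([ATGCN]+))?')

# Made-up rows (assumed column order: position, repeat length, % identity,
# spacer length, left flank, repeat, spacer).
ROW_SHORT = '1234  36  100.0  9  ACGTACGTAC  ' + '.' * 36 + '  ACGTACGTA'
ROW_LONG = '5678  36  97.2  30  ACGTACGTAC  ' + '.' * 36 + '  ' + 'ACGT' * 7 + 'AC'


def spacer_length(pattern, row):
    """Return the spacer length captured in group 4, or None if the row does not parse."""
    m = pattern.match(row)
    if m is None or m.group(4) is None:
        return None
    return int(m.group(4))


for name, pattern in (('old', RE_CRISPR_OLD), ('new', RE_CRISPR_NEW)):
    print(name, spacer_length(pattern, ROW_SHORT), spacer_length(pattern, ROW_LONG))
```

With these rows, the old pattern fails to parse the row whose spacer length has a single digit, while both patterns parse the row with a two-digit spacer length.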
@@ -98,7 +98,7 @@ def predict_crispr(genome: dict, contigs_path: Path):
spacer_length = len(spacer_seq)
crispr_spacer = OrderedDict()
crispr_spacer['strand'] = bc.STRAND_UNKNOWN
- crispr_spacer['start'] = position + repeat_length - gap_count
+ crispr_spacer['start'] = position + repeat_length - gap_count
crispr_spacer['stop'] = position + repeat_length + spacer_length - 1 - gap_count
crispr_spacer['sequence'] = spacer_seq
crispr_array['spacers'].append(crispr_spacer)
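
The coordinate arithmetic in this hunk can be read as a small standalone function. This is only a sketch: `spacer_coordinates` is a hypothetical helper, not a Bakta function, and it assumes that `position` is the start of the repeat and that `gap_count` is the number of gap characters in the aligned repeat, which the diff context does not show.

```python
def spacer_coordinates(position, repeat_length, spacer_seq, gap_count):
    """Mirror the start/stop arithmetic from the hunk above (inclusive coordinates)."""
    spacer_length = len(spacer_seq)
    start = position + repeat_length - gap_count
    stop = position + repeat_length + spacer_length - 1 - gap_count
    return start, stop


# A 9 bp spacer following a 36 bp repeat that starts at position 100.
print(spacer_coordinates(100, 36, 'ACGTACGTA', 0))  # (136, 144)
print(spacer_coordinates(100, 36, 'ACGTACGTA', 2))  # (134, 142)
```

In both cases `stop - start + 1` equals the spacer length, so the arithmetic also holds for the short spacers that the relaxed regex now admits.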