Pattern-Break detects numeric gaps in filenames or directory names. Designed for developers, data managers, artists, photographers, and archivists, it analyzes sequences, expands ranges, and produces detailed reports. Whether you need to find missing images, manage backups, or enforce naming conventions, Pattern-Break can help identify gaps in many datasets.
- Numeric Gap Detection: Identifies missing numeric sequences in files and directories.
- Multi-Block Analysis: Handles filenames with multiple numeric blocks using advanced policies.
- Range Expansion: Detects and integrates numeric ranges within standalone filenames, like `100-120`, into the analysis.
- Recursive Directory Scanning: Analyzes subdirectories when needed.
- Flexible Grouping: Groups results by directory or across directories for cross-analysis.
- Threshold Splitting: Divides groups based on large numeric gaps.
- Customizable Outputs: Generates results in various formats:
- Summary
- Inline
- CSV
- JSON
- ASCII table
- Rich table (with ANSI support)
- Output Destinations: Redirect results to the console, files, or clipboard.
- Dataset Validation: Ensure sequential file naming without gaps in datasets or archives.
- Backup and Restore: Verify the integrity and completeness of backups by identifying missing files.
- Project Management: Audit files across directories for missing versions or incomplete series.
- Archival Compliance: Enforce numeric naming conventions in long-term storage solutions.
git clone https://github.com/dustinjd/pattern-break.git
cd pattern-break
Install required libraries via pip:
pip install pyperclip rich
- `pyperclip`: Enables clipboard integration.
- `rich`: Provides rich-text table formatting.
pattern-break.py --check both -d . -r -gt 100 --multi-range -fmt summary --stats -o stdout -xd .git -xd .sync -o file -f missing.txt
Searches for pattern breaks in filenames and folder names (`--check both`), starting from the current directory (`-d .`) recursively (`-r`), excluding git and sync folders (`-xd .git -xd .sync`), using a group threshold of 100 (`-gt 100`), looking for in-filename ranges as well (`--multi-range`), formatting the output in a JSON-like summary format (`-fmt summary`), showing statistics (`--stats`), and writing the output to both the console (`-o stdout`) and a file (`-o file -f missing.txt`).
python pattern-break.py -d /path/to/files --format=summary
For a list of files like `001.txt`, `002.txt`, `003.txt`, `004.txt`, `006.txt`, `008.txt`, this would identify `005.txt` and `007.txt` as missing.
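The gap check in this example can be sketched in a few lines of Python. This is only an illustration of the idea, not the tool's actual code: `find_missing` is a hypothetical helper, and it assumes every name shares the same prefix, suffix, and zero-padding and carries a single numeric block.

```python
import re

def find_missing(names):
    """Return the names missing from a numbered sequence of filenames."""
    # Hypothetical helper: only the first numeric block of each name is read.
    pattern = re.compile(r"^(\D*)(\d+)(.*)$")
    present = set()
    width, prefix, suffix = 0, "", ""
    for name in names:
        match = pattern.match(name)
        if not match:
            continue  # names without digits are ignored
        # Assumption: all names share one prefix and suffix, so the last wins.
        prefix, digits, suffix = match.groups()
        present.add(int(digits))
        width = max(width, len(digits))
    if not present:
        return []
    expected = range(min(present), max(present) + 1)
    return [f"{prefix}{n:0{width}d}{suffix}" for n in expected if n not in present]

files = ["001.txt", "002.txt", "003.txt", "004.txt", "006.txt", "008.txt"]
print(find_missing(files))  # ['005.txt', '007.txt']
```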
python pattern-break.py -d /path/to/files --multi-range
For a list of files like `001.txt`, `002-004.txt`, `006.txt`, `008.txt`, this would identify `005.txt` and `007.txt` as missing, because the range `002-004` counts as covering 002, 003, and 004.
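A minimal sketch of how a ranged filename might be expanded before the gap check. The helper name `covered_numbers` and the exact `NNN-MMM` stem format are assumptions made for illustration; the tool's own parsing may differ.

```python
import re

# Assumed range format: the whole stem is "low-high", e.g. "002-004".
RANGE = re.compile(r"^(\d+)-(\d+)$")

def covered_numbers(names):
    """Collect every number covered by plain or ranged filenames."""
    covered = set()
    for name in names:
        stem = name.rsplit(".", 1)[0]  # drop the extension, if any
        match = RANGE.match(stem)
        if match:
            low, high = int(match.group(1)), int(match.group(2))
            covered.update(range(low, high + 1))  # inclusive range
        elif stem.isdigit():
            covered.add(int(stem))
    return covered

files = ["001.txt", "002-004.txt", "006.txt", "008.txt"]
covered = covered_numbers(files)
missing = [n for n in range(min(covered), max(covered) + 1) if n not in covered]
print(missing)  # [5, 7]
```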
python pattern-break.py -d /dir1 /dir2 --cross-dir-grouping
As the name suggests, this finds patterns even across folders, so if `005.txt` or `007.txt` existed in `/dir2`, it would count them as part of the sequence.
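The difference between per-directory and cross-directory analysis can be illustrated as follows. The directory contents here are made-up sample data already reduced to numeric values, and `missing_numbers` is a hypothetical helper, not part of Pattern-Break.

```python
def missing_numbers(numbers):
    """Numbers absent between the smallest and largest value seen."""
    if not numbers:
        return []
    return [n for n in range(min(numbers), max(numbers) + 1) if n not in numbers]

# Hypothetical listings, already reduced to their numeric values.
per_dir = {
    "/dir1": {1, 2, 3, 4, 6, 8},
    "/dir2": {5, 7},
}

# Per-directory analysis: each folder is checked on its own.
separate = {path: missing_numbers(nums) for path, nums in per_dir.items()}

# Cross-directory analysis: coverage is merged before looking for gaps.
merged = missing_numbers(set().union(*per_dir.values()))

print(separate)  # {'/dir1': [5, 7], '/dir2': [6]}
print(merged)    # []
```

With merged coverage, the files in `/dir2` fill the holes in `/dir1`, so no gaps are reported.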
python pattern-break.py -d /path/to/files --format=csv -o file --filename gaps.csv
python pattern-break.py -d /path/to/files --group-threshold 50
This means that if there are files named `001.txt` to `100.txt`, it would treat `001.txt` to `050.txt` as one logical group and `051.txt` to `100.txt` as a distinct group when trying to identify missing files.
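The argument table describes `--group-threshold` as "Split groups if numeric gap > threshold". A minimal sketch of that gap-based reading is shown here; it is an assumption about the behavior, not the tool's actual implementation, and `split_on_gaps` is a hypothetical name.

```python
def split_on_gaps(numbers, threshold):
    """Split sorted numbers into groups wherever the gap exceeds threshold."""
    groups = []
    for n in sorted(numbers):
        if groups and n - groups[-1][-1] <= threshold:
            groups[-1].append(n)  # close enough: same group
        else:
            groups.append([n])    # first value, or gap too large: new group
    return groups

numbers = [1, 2, 4, 500, 501, 503]
for group in split_on_gaps(numbers, threshold=50):
    present = set(group)
    missing = [n for n in range(group[0], group[-1] + 1) if n not in present]
    print(group, "missing:", missing)
# [1, 2, 4] missing: [3]
# [500, 501, 503] missing: [502]
```

Under this reading, the jump from 4 to 500 starts a new group instead of reporting 495 missing files.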
| Argument | Description |
|---|---|
| `--check` | Analyze `files`, `dirs`, or `both` (default: `files`). |
| `--pattern` | Only include names matching this regex filter. |
| `-xd`, `--exclude` | Exclude patterns (e.g. `*.txt`). |
| `-d`, `--dir` | Directories to scan. |
| `-r`, `--recursive` | Include subdirectories in the scan. |
| `--multi-range` | Expand numeric ranges in filenames. |
| `--block-policy` | Policy for handling numeric blocks (e.g., `first`, `all`). |
| `--group-threshold` | Split groups if numeric gap > threshold. |
| `--cross-dir-grouping` | Merge coverage from multiple dirs if numeric values align. |
| `--start-num` | Force sequence start. |
| `--end-num` | Force sequence end. |
| `--mod-boundary` | Consider missing up to next boundary (e.g. mod=100). |
| `-inc`, `--increment` | Step between consecutive numbers (default: 1). |
| `--format` | Output format (`summary`, `inline`, `CSV`, `JSON`, etc.). |
| `-o`, `--output` | Destination for output (`stdout`, `file`, `clipboard`). |
| `--range` | Display missing ranges (`compact`, or `all` in detail). |
| `--range-fmt` | Whether to put blank lines between segments (`spacing`, `nospace`). |
| `-v`, `--verbose` | Provide additional debugging information. |
| `-q`, `--quiet` | Suppress stdout output. |
Run `pattern-break.py --help` for a complete list of arguments.
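To illustrate how `--start-num`, `--end-num`, and `--increment` could interact, here is a small sketch. It assumes forced bounds simply replace the observed minimum and maximum and that the increment is the step of the expected sequence; `missing_with_bounds` is a hypothetical helper and the tool's real semantics may differ.

```python
def missing_with_bounds(present, start=None, end=None, increment=1):
    """Missing values of a stepped sequence with optional forced bounds."""
    if not present and (start is None or end is None):
        return []  # nothing observed and no full bounds to fall back on
    first = min(present) if start is None else start
    last = max(present) if end is None else end
    return [n for n in range(first, last + 1, increment) if n not in present]

present = {10, 20, 40}

# Bounds inferred from the data, step of 10.
print(missing_with_bounds(present, increment=10))  # [30]

# Forced bounds widen the expected sequence on both sides.
print(missing_with_bounds(present, start=0, end=60, increment=10))  # [0, 30, 50, 60]
```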
Contributions to Pattern-Break are welcome! To contribute:
- Fork this repository and clone it.
- Create a new branch for your feature or bugfix.
- Submit a pull request with a clear description of your changes.
pattern-break.py, Copyright (C) 2025 Dustin Darcy
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.