catbridge_tools

Tools for working with MARC data in Catalogue Bridge.

Borrows heavily from PyMarc (https://pypi.org/project/pymarc/).

Requirements

Requires the regex module from https://bitbucket.org/mrabarnett/mrab-regex. The built-in re module is not sufficient.

Installation

From GitHub:

python -m pip install git+https://github.com/victoriamorris/CatBridge.git@main

To create stand-alone executable (.exe) files for individual scripts from downloaded source code:

python -m PyInstaller bin/<script_name>.py -F

Executable files will be created in the folder \dist, and should be copied to an executable path.

Both of the above commands can be carried out by running the shell script:

compile_catbridge_tools.sh

Scripts

The scripts listed below can be run from anywhere, once the package is installed and the .exe files have been copied to an executable path.

Correspondence with original Catalogue Bridge tools

Original Catalogue Bridge tool	New tool	Original syntax	Corresponding new syntax
cn-find	cn_find	CN-FIND <infile> <outfile> <configfile>	cn_find -i <input_file> [<input_file> ...] -o <output_file> -c <config_file>
cn-tidy	cn_find	CN-TIDY <infile>	cn_find -i <input_file> [<input_file> ...] -o <output_file> -c <config_file> --tidy
del-fld	keep_fld	DEL-FLD <infile> <configfile>	keep_fld -i <input_file> [<input_file> ...] -c <config_file> --delete
del-fld2	keep_fld	DEL-FLD2 <infile> <configfile>	keep_fld -i <input_file> [<input_file> ...] -c <config_file> --delete
fix-fmt	fix_fmt	FIX-FMT <marcfile>	fix_fmt -i <input_file> [<input_file> ...]
keep-fld	keep_fld	KEEP-FLD <infile> <configfile>	keep_fld -i <input_file> [<input_file> ...] -c <config_file>
keep-fld2	keep_fld	KEEP-FLD2 <infile> <configfile>	keep_fld -i <input_file> [<input_file> ...] -c <config_file>
marc-chk	marc_check	MARC-CHK <infile>	marc_check -i <input_file> [<input_file> ...]
marccount	marc_count	MARCCOUNT <infile> [<infile>]	marc_count -i <input_file> [<input_file> ...]

Features common to all scripts

File formats

Unless otherwise specified, MARC files are in MARC 21 format, with .lex file extensions. Unless otherwise specified, text files are UTF-8-encoded, with .txt, .csv or .tsv file extensions. Config files are also text files, but may have the file extension .cfg for convenience.

Help

For any script, use the option --help, or run the script without arguments/options, to display help text.

Logs and debugging

Logs are written to catbridge.log in the working directory. This is a UTF-8-encoded text file and can be read in any text editor. The default logging level is INFO; if the option --debug is set, the logging level is changed to DEBUG. See https://docs.python.org/3/library/logging.html#levels for information about logging levels.
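
The behaviour described above can be approximated with the standard logging module. The following is a minimal sketch, not the package's actual code: the function name configure_logging, the logger name and the message format are all assumptions made for illustration.

```python
import logging


def configure_logging(debug: bool = False, path: str = "catbridge.log") -> logging.Logger:
    """Log to a UTF-8 text file at INFO, or at DEBUG when debug is True."""
    logger = logging.getLogger("catbridge_sketch")  # hypothetical logger name
    logger.setLevel(logging.DEBUG if debug else logging.INFO)
    # Remove handlers left over from an earlier call, so lines are not duplicated.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()
    handler = logging.FileHandler(path, encoding="utf-8")
    # Assumed format; the real log layout may differ.
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

With debug=False, a call such as log.debug("...") is dropped and log.info("...") is written; with debug=True both are written.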

Command line arguments

Command line arguments may be provided in any order.
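
Order-independence of this kind is what Python's argparse module provides for named options. The sketch below is an illustration only, assuming the -i, -o, -c and --debug options shown elsewhere in this README; the function name build_parser and the dest names are hypothetical and are not taken from the package.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser resembling the cn_find syntax (illustrative only)."""
    parser = argparse.ArgumentParser(prog="cn_find")
    # -i accepts one or more input files.
    parser.add_argument("-i", dest="input_files", nargs="+", required=True)
    parser.add_argument("-o", dest="output_file")
    parser.add_argument("-c", dest="config_file")
    parser.add_argument("--debug", action="store_true")
    return parser


# The same options in two different orders give the same parsed values.
first = build_parser().parse_args(
    ["-i", "a.lex", "b.lex", "-o", "out.txt", "-c", "my.cfg", "--debug"])
second = build_parser().parse_args(
    ["--debug", "-c", "my.cfg", "-o", "out.txt", "-i", "a.lex", "b.lex"])
```

Here first and second hold identical values, because each value is bound to its option flag and not to its position on the command line.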

Control fields

For the purposes of these scripts, a field tag is interpreted as a control field tag if and only if it (a) takes a numerical value starting with two zeros, or (b) is either of the Aleph control fields "DB " or "SYS".
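
The rule above can be written as a short predicate. This is a sketch under stated assumptions: the name is_control_field is hypothetical, and the check that a tag is exactly three characters long is an added assumption based on MARC tags being three characters.

```python
ALEPH_CONTROL_TAGS = {"DB ", "SYS"}  # note the trailing space in "DB "


def is_control_field(tag: str) -> bool:
    """Return True if tag counts as a control field tag under the rule above."""
    # Assumption: tags are exactly three characters long.
    numeric_control = len(tag) == 3 and tag.isdigit() and tag.startswith("00")
    return numeric_control or tag in ALEPH_CONTROL_TAGS
```

For example, "001" and "SYS" are treated as control field tags, while "010", "245" and "DB" (without the trailing space) are not.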

Malformed records/fields

Missing indicators are recorded as blank spaces (data fields only)
Extra indicators are ignored (data fields only)
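
The two rules above amount to padding or truncating the indicators of a data field to exactly two characters. A minimal sketch follows, assuming that missing indicators are filled in on the right; the function name normalise_indicators is hypothetical.

```python
def normalise_indicators(indicators: str) -> str:
    """Return exactly two indicator characters for a data field."""
    # Pad with blank spaces if indicators are missing (assumed: on the right),
    # then discard anything beyond the second indicator.
    return indicators.ljust(2)[:2]
```

Under this sketch, "1" becomes "1 " (second indicator blank) and "104" becomes "10" (third character ignored).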

Control number specification	Description	Regular expression
ISBN	Any structurally plausible ISBN*	\b(?=(?:[0-9]+[- ]?){10})[0-9]{9}[0-9Xx]\b|\b(?=(?:[0-9]+[- ]?){13})[0-9]{1,5}[- ][0-9]+[- ][0-9]+[- ][0-9Xx]\b|\b97[89][0-9]{10}\b|\b(?=(?:[0-9]+[- ]){4})97[89][- 0-9]{13}[0-9]\b
ISBN10	Any structurally plausible 10-digit ISBN*	\b(?=(?:[0-9]+[- ]?){10})[0-9]{9}[0-9Xx]\b|\b(?=(?:[0-9]+[- ]?){13})[0-9]{1,5}[- ][0-9]+[- ][0-9]+[- ][0-9Xx]\b
ISBN13	Any structurally plausible 13-digit ISBN*	\b97[89][0-9]{10}\b|\b(?=(?:[0-9]+[- ]){4})97[89][- 0-9]{13}[0-9]\b
ISSN	8 digits with a hyphen in the middle, where the last digit may be an X	\b[0-9]{4}[ -]?[0-9]{3}[0-9Xx]\b
BL001	9 digits	\b[0-9]{9}\b
BNB	See https://www.bl.uk/collection-metadata/metadata-services/structure-of-the-bnb-number	\bGB([0-9]{7}|[A-Z][0-9][A-Z0-9][0-9]{4})\b
LCCN	See https://www.loc.gov/marc/bibliographic/bd010.html	\b[a-z][a-z ][a-z ]?[0-9]{2}[0-9]{6} ?\b
OCLC	"(OCoLC)" followed by digits	(OCoLC)[0-9]+\b
ISNI	16 digits separated into groups of 4 with spaces or hyphens	\b[0]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{3}[0-9Xx]\b
FAST	"fst" followed by digits	\bfst[0-9]{8}\b
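
As an illustration of how the expressions in this table behave, the sketch below applies three of them (ISSN, BL001 and FAST, copied from the table) to free text. It uses the built-in re module only because these three patterns need nothing beyond it; the package itself requires the regex module. The names PATTERNS and find_control_numbers are hypothetical and are not part of catbridge_tools.

```python
import re

# Patterns copied from the table above.
PATTERNS = {
    "ISSN": r"\b[0-9]{4}[ -]?[0-9]{3}[0-9Xx]\b",
    "BL001": r"\b[0-9]{9}\b",
    "FAST": r"\bfst[0-9]{8}\b",
}


def find_control_numbers(text: str, spec: str) -> list:
    """Return every substring of text matching the named specification."""
    return re.findall(PATTERNS[spec], text)


issns = find_control_numbers("ISSN 0028-0836 and 2049-3630", "ISSN")
bl_ids = find_control_numbers("record 012345678 held", "BL001")
fast_ids = find_control_numbers("(OCoLC)fst01234567", "FAST")
too_long = find_control_numbers("0123456789", "BL001")
```

The \b word boundaries are what prevent a 10-digit string from being reported as a 9-digit BL001 number, so too_long is empty.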
