nara-scripts

This repository contains scripts used in the work of the US National Archives.

Contributing

Use the issues page to suggest new scripts that others may be able to help with.

1940census.py - Transform 1940 Census metadata for inclusion in the National Archives Catalog (Python 2)
amara.py - Transform Amara video transcriptions for addition to the National Archives Catalog (Python 2)
combinexml-py2.py - Combine the XML files in a directory into output files of 75 MB or less each (Python 2)
csv-add-headers.py - Add headers to a CSV file (Python 3)
csv-to-xml.py - Convert CSV to simple XML (Python 3)
downloadurls-py2.py - Download all files from URLs listed in a text file (Python 2)
downloadurls-py3.py - Download all files from URLs listed in a text file (Python 3)
file-units.py - Convert file unit submission spreadsheet (CSV) into DAS-compliant XML (Python 3)
ocr-jpg.py - Generate OCR data from JPG files (Python 2)
- Dependencies:
pdf.py - Convert PDF documents to JPGs (Python 2)
s3_file_list.py - Generate a CSV listing of files on S3 cloud storage (Python 2)
rename.py - Rename files by replacing specific characters in their names (Python 3)
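
As an illustration of the kind of transformation the CSV scripts above perform, here is a minimal Python 3 sketch that turns CSV text into simple XML. It is not the code in csv-to-xml.py, which is not reproduced here; the function name `csv_to_xml` and the element names `rows` and `row` are assumptions made for this example, and it assumes the CSV header names are valid XML element names.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text, root_tag="rows", row_tag="row"):
    """Convert CSV text (first line = headers) to a simple XML string.

    root_tag and row_tag are hypothetical names chosen for this sketch.
    Each header becomes a child element name, so headers must be valid
    XML names (no spaces, not starting with a digit).
    """
    root = ET.Element(root_tag)
    # DictReader maps each data row to {header: value}.
    for record in csv.DictReader(io.StringIO(csv_text)):
        row = ET.SubElement(root, row_tag)
        for header, value in record.items():
            ET.SubElement(row, header).text = value
    # encoding="unicode" returns a str rather than bytes.
    return ET.tostring(root, encoding="unicode")


sample = "title,year\nCensus schedule,1940\n"
xml_output = csv_to_xml(sample)
print(xml_output)
```

To adapt the sketch to a file on disk, read the file's contents with `open(path, newline="")` and pass the text to `csv_to_xml`.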
