A highly scalable apt repository based on Amazon S3 and SimpleDB

Repoman: a scalable apt server using Amazon SimpleDB and S3

Introduction

Back in 2011, a scrappy young startup called "Weatherbill" found itself with a conundrum. We were using Debian's dpkg and apt tools to package and serve binary software within our own infrastructure, but we were also starting to use Amazon's Elastic MapReduce to create ephemeral Hadoop clusters for geospatial and weather data processing. It turned out that spinning up a 2,000-node Hadoop cluster and downloading several dozen gigabytes of packages from our standalone apt server onto each Hadoop compute node at once was an excellent way to make the apt server vanish in a puff of smoke.

So with the optimism that comes from youth and too much caffeine, we wrote our own apt server, based on Amazon's Simple Storage Service (S3) and SimpleDB offerings. We called it "Repoman", because the life of a repo man, much like that of an engineer at a small startup, is always intense. Since then, Repoman has successfully served millions of packages to various servers and services here at The Climate Corporation (which Weatherbill became), and we're happy to finally get a chance to share it with you.

Requirements

In order to operate an apt repository with Repoman, you will need at a minimum:

  • A working Python 2.7 or 3.5 installation
  • An Amazon Web Services account
  • User credentials in that AWS account, either in your shell environment, a ~/.aws/credentials file, or via an EC2 Instance Profile or ECS Task Role. In general, if the AWS CLI runs successfully in your environment, Repoman should run.
  • An AWS SimpleDB Domain which Repoman will use to store metadata about your packages.
  • An AWS S3 Bucket which Repoman will use to store your actual .deb package files and also the generated metadata files that form an Apt repository.
  • Your user credentials must have sufficient permissions (via Amazon IAM) to manipulate both the SimpleDB domain and the S3 bucket: you will need at a minimum to be able to create and delete keys in both. A sample IAM security policy is provided in the docs folder.
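
As a rough illustration of the last requirement, the following sketch builds an IAM policy document scoped to one bucket and one domain. It is not the sample policy from the docs folder, which remains the authoritative reference. The function name, the bucket and domain names, and the exact list of actions are assumptions chosen for illustration; Repoman may need more (or fewer) actions than shown here.

```python
import json


def build_repoman_policy(bucket, domain, region, account_id):
    """Return an IAM policy document (as a dict) for one bucket and one domain.

    The action lists below are an assumption, not taken from Repoman itself.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level access: create, read and delete keys in the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": "arn:aws:s3:::{}/*".format(bucket),
            },
            {
                # Bucket-level access: list the keys in the bucket.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": "arn:aws:s3:::{}".format(bucket),
            },
            {
                # Item-level access to the SimpleDB domain holding package metadata.
                "Effect": "Allow",
                "Action": [
                    "sdb:GetAttributes",
                    "sdb:PutAttributes",
                    "sdb:DeleteAttributes",
                    "sdb:Select",
                ],
                "Resource": "arn:aws:sdb:{}:{}:domain/{}".format(
                    region, account_id, domain
                ),
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder names and account id; substitute your own values.
    policy = build_repoman_policy(
        "my-apt-bucket", "my-apt-domain", "us-east-1", "123456789012"
    )
    print(json.dumps(policy, indent=2))
```

Generating the document from a function keeps the bucket and domain names in one place, which helps if you run more than one repository.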

Some basic familiarity with the concepts and nomenclature of running an apt server is assumed: if you don't know what distributions, components and architectures are, you may want to review the apt documentation.
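
For readers new to those three terms, the sketch below shows how they combine in the standard Debian repository layout and in a client's sources.list entry. The layout is the general apt convention; that Repoman publishes exactly these keys to S3 is an assumption, and the function names and example URL are hypothetical.

```python
def index_paths(distribution, components, architectures):
    """List the per-architecture Packages index paths for one distribution,
    following the conventional dists/<distribution>/<component>/binary-<arch>/ layout."""
    paths = []
    for component in components:
        for arch in architectures:
            paths.append(
                "dists/{}/{}/binary-{}/Packages".format(distribution, component, arch)
            )
    return paths


def sources_list_line(base_url, distribution, components):
    """Build the 'deb' line a client would put in its sources.list."""
    return "deb {} {} {}".format(base_url, distribution, " ".join(components))


if __name__ == "__main__":
    for path in index_paths("stable", ["main"], ["amd64", "i386"]):
        print(path)
    # The URL is a placeholder, not a real repository.
    print(sources_list_line("https://apt.example.com", "stable", ["main"]))
```

In short: a distribution names a release line, a component groups packages within it, and an architecture selects which binary builds a client downloads.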

Optional

  • Repoman can function with your own user credentials, or it can assume an IAM role before querying AWS APIs; see the installation section for details on how to use roles.
  • Repoman can sign your apt repository's metadata files in order to provide strong assurances that the packages in the repo come from their claimed source. This requires a working GPG keyring; Repoman does not provide key management.
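
To make the signing option more concrete, here is a sketch of the two signatures apt clients conventionally look for on a repository's Release file: a detached, ASCII-armored Release.gpg and a clearsigned InRelease. How Repoman itself invokes GPG is not described here, so treat this purely as an illustration of the general convention. The function name and key ID are hypothetical, and the code only builds the command lines without running them.

```python
def signing_commands(key_id, release_path="Release"):
    """Return gpg argv lists that would sign an apt Release file.

    Nothing is executed here; pass each list to subprocess yourself
    once you have checked it against your own keyring setup.
    """
    detached = [
        "gpg", "--default-key", key_id,
        "--armor", "--detach-sign",
        "--output", release_path + ".gpg",
        release_path,
    ]
    clearsigned = [
        "gpg", "--default-key", key_id,
        "--clearsign",
        "--output", "InRelease",
        release_path,
    ]
    return [detached, clearsigned]


if __name__ == "__main__":
    # "ABCDEF0123456789" is a placeholder key ID.
    for argv in signing_commands("ABCDEF0123456789"):
        print(" ".join(argv))
```

Clients then need the matching public key in their apt keyring before they will trust the repository.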

Commands, flags and help

The repoman-cli tool offers the following commands:

  • setup -- initial setup of the repository
  • checkup -- check that SimpleDB and S3 are configured correctly
  • add -- add packages to the repository
  • rm -- delete packages from the repository
  • cp -- copy packages from one distribution or component in the repository to another distribution or component
  • query -- list packages in the repository based on filters
  • publish -- publish the current SimpleDB repository state to S3
  • backup -- backup the current SimpleDB state to a JSON file
  • restore -- restore SimpleDB state from a JSON file
  • repo -- repository management sub-commands:
    • repo add-distribution -- add a distribution for the repo to serve
    • repo rm-distribution -- remove a distribution from those the repo serves
    • repo add-component -- add a component for the repo to serve
    • repo rm-component -- remove a component from those the repo serves
    • repo add-architecture -- add an architecture for the repo to serve
    • repo rm-architecture -- remove an architecture from those the repo serves
    • repo add-topic -- add an SNS topic to log notifications to
    • repo rm-topic -- remove any configured SNS topic
    • repo show-config -- show the current repository configuration
    • repo add-origin -- set an Origin string for the published repository
    • repo add-label -- set a Label string for the published repository

The repoman-cli utility itself and all of its commands and sub-commands will take a -h or --help flag to show help text and all locally relevant flags.
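
Since every command accepts --help, one quick way to explore the tool is to walk the command list above and print each help text. The sketch below does that. The command names come from the list above; the helper names are hypothetical, and it assumes repoman-cli is on your PATH.

```python
import subprocess

# Command names as listed in this README.
TOP_LEVEL = [
    "setup", "checkup", "add", "rm", "cp",
    "query", "publish", "backup", "restore",
]
REPO_SUBCOMMANDS = [
    "add-distribution", "rm-distribution",
    "add-component", "rm-component",
    "add-architecture", "rm-architecture",
    "add-topic", "rm-topic",
    "show-config", "add-origin", "add-label",
]


def help_commands(executable="repoman-cli"):
    """Build one '--help' argv per command and sub-command."""
    commands = [[executable, "--help"]]
    commands += [[executable, name, "--help"] for name in TOP_LEVEL]
    commands.append([executable, "repo", "--help"])
    commands += [[executable, "repo", sub, "--help"] for sub in REPO_SUBCOMMANDS]
    return commands


def show_all_help(executable="repoman-cli"):
    """Run each help command in turn, printing its output to the terminal."""
    for argv in help_commands(executable):
        subprocess.call(argv)


if __name__ == "__main__":
    show_all_help()
```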

Most commands that mutate the repository will prompt for confirmation; this step can be bypassed by passing the -y or --confirm flag.

Some particularly dangerous commands (e.g. deleting an entire distribution's worth of packages) will prompt for an extra confirmation step; this too can be bypassed by passing the --i-fear-no-evil flag, but doing so is strictly an at-your-own-risk proposition.

To automatically publish the repository to S3 after adding, copying or removing packages, pass the --publish flag.
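
For unattended use (a CI job, say), the flags above can be assembled by a small wrapper. This is a sketch under assumptions: the build_argv helper is hypothetical, placing the flags directly after the command name is a guess rather than documented behavior, and the command's own arguments are passed through untouched because their syntax is not covered here (check each command's --help).

```python
# Commands for which this README documents the --publish flag.
PUBLISHABLE = ("add", "cp", "rm")


def build_argv(command, args=(), confirm=False, publish=False,
               fear_no_evil=False, executable="repoman-cli"):
    """Assemble a repoman-cli command line without running it.

    'args' holds the command's own arguments, passed through as-is.
    """
    if publish and command not in PUBLISHABLE:
        raise ValueError("--publish is documented only for add, cp and rm")
    argv = [executable, command]
    if confirm:
        # Skip the normal confirmation prompt.
        argv.append("--confirm")
    if fear_no_evil:
        # Skip the extra prompt on particularly dangerous commands.
        argv.append("--i-fear-no-evil")
    if publish:
        # Publish to S3 once the change has been made.
        argv.append("--publish")
    argv.extend(args)
    return argv


if __name__ == "__main__":
    print(" ".join(build_argv("rm", confirm=True, publish=True)))
```

Keeping fear_no_evil as an explicit keyword that defaults to False means a script has to opt in to skipping the extra prompt, which fits the at-your-own-risk spirit of the flag.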

Further Documentation
