Repocache Installation

Building and installing repocache

Tools required for the build:

  • OpenJDK 8

  • maven (tested with 3.0.5 and 3.3.9)

Build has been tested under Ubuntu 14.04.3 and 16.04.2 LTS.

Steps specific to Ubuntu 14.04.3

sudo add-apt-repository ppa:openjdk-r/ppa
sudo apt-get update
sudo apt-get install openjdk-8-jdk

Generic steps

Copy the following snippet into ~/.m2/toolchains.xml:

<toolchains>
  <!-- ... -->
  <!-- JDK toolchains -->
  <toolchain>
    <type>jdk</type>
    <provides>
      <id>JavaSE-1.8</id>
    </provides>
    <configuration>
      <jdkHome>/usr/lib/jvm/java-8-openjdk-amd64/jre</jdkHome>
    </configuration>
  </toolchain>
  <!-- ... -->
</toolchains>
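Before relying on the snippet, it may be worth confirming that the jdkHome path exists on your machine. The following is only a sketch: `check_jdk_home` is a hypothetical helper, not part of repocache, and it assumes that a usable Java home contains an executable bin/java.

```shell
#!/bin/sh
# Hypothetical helper (not part of repocache): succeeds if the given
# directory looks like a Java home, i.e. it contains an executable bin/java.
check_jdk_home() {
    [ -x "$1/bin/java" ]
}

# Check the path used in the toolchains.xml snippet.
JDK_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre
if check_jdk_home "$JDK_HOME"; then
    echo "OK: $JDK_HOME"
else
    echo "Not found: $JDK_HOME (adjust <jdkHome> in ~/.m2/toolchains.xml)"
fi
```

If the check fails, the JDK is probably installed under a different directory, and the jdkHome element has to be adjusted accordingly.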

The install script is as follows. Note that it asks for the sudo password when installing the application:

#!/bin/bash

mkdir -p ~/git/qgears
cd ~/git/qgears

git clone --recursive https://github.com/qgears/repocache.git

cd repocache
mvn package
# or 'mvn integration-test' to also run the tests during the build

# The file matching the repocache-$VERSION-$DATE.jar pattern will be found here:
cd hu.qgears.repocache/target/
ls -lah

# Selecting the most recently built jar file
MOST_RECENT=$(ls -Art repocache-*.jar | tail -n 1)
echo "$MOST_RECENT"

INSTALL_DIR=/opt/repocache

# -p and -f make the script safe to run again after a rebuild
sudo mkdir -p "$INSTALL_DIR"
sudo cp "$MOST_RECENT" "$INSTALL_DIR"
sudo ln -sf "$INSTALL_DIR/$MOST_RECENT" "$INSTALL_DIR/repocache.jar"

cd ../..

# Edit and customize the following startup script and configuration files, then copy them to their final locations!

# Upstart startup script suitable for Ubuntu 14 LTS and 16 LTS:
sudo cp hu.qgears.repocache/doc/repocache.conf /etc/init
sudo chown root:root /etc/init/repocache.conf

# Copying configuration:
sudo cp hu.qgears.repocache/doc/sample-config/* /opt/repocache

# Allowing write access to the current user (chown expects owner:group)
sudo chown -Rc "$(id -nu):$(id -ng)" "$INSTALL_DIR"
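After the script has finished, a quick sanity check can confirm that the repocache.jar symlink points to a real file. This is a minimal sketch; `check_install` is a hypothetical helper name and is not shipped with repocache.

```shell
#!/bin/sh
# Hypothetical post-install check: the repocache.jar symlink created by
# the install script must exist and resolve to a regular file.
check_install() {
    dir=$1
    [ -L "$dir/repocache.jar" ] && [ -f "$dir/repocache.jar" ]
}

if check_install /opt/repocache; then
    echo "repocache.jar is in place"
else
    echo "repocache.jar is missing or is a dangling symlink"
fi
```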

Starting repocache

In order to start repocache manually, issue the following command:

sudo service repocache start

You now have a working repocache installation with a running instance that acts as an HTTP proxy, so you can configure the applications that need repository artifact caching, such as Maven.

Assumption: the name repocache.yourdomain.com is a valid, resolvable domain name on your computer. For a single-workstation setup, ensure that the following entry is present in your /etc/hosts file:

127.0.1.1       repocache.yourdomain.com
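Whether the entry is already present can be checked with a few lines of shell. The sketch below uses a hypothetical helper, `has_hosts_entry`, which ignores comments and looks for the name in any column after the address.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a hosts-format file (default: /etc/hosts)
# maps some address to the given host name.
has_hosts_entry() {
    name=$1
    file=${2:-/etc/hosts}
    awk -v n="$name" '
        { sub(/#.*/, "") }    # drop comments; awk re-splits the fields
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }
    ' "$file"
}

if has_hosts_entry repocache.yourdomain.com; then
    echo "hosts entry found"
else
    echo "hosts entry missing"
fi
```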

Web administration interface: http://repocache.yourdomain.com:18081

Note that HTTPS proxying is not available at this stage; see the following chapter for the setup.

HTTPS proxy setup

As repocache is a cache, it has to decrypt encrypted traffic in order to store the downloaded artifacts. This is why its own (self-signed) key set and certificate have to be created and installed: when repocache is used as an HTTPS proxy, the decrypted traffic is re-encrypted with repocache's own generated private key, and clients can accept it by trusting the corresponding self-signed certificate.

Simple scenario: using the pregenerated cert

Scenario: setting up repocache as a MITM (man-in-the-middle) HTTPS proxy with a self signed certificate generated by repocache itself.

  1. Visit the web administration interface: http://repocache.yourdomain.com:18081/config.html

  2. Download the certificate by right-clicking the link titled ''HTTPS root CA''.

  3. Copy it to the /usr/local/share/ca-certificates directory. The file name must end in .crt, otherwise update-ca-certificates ignores it.

  4. Issue the following command:

sudo update-ca-certificates
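If update-ca-certificates reports that no certificates were added, the downloaded file may have the wrong extension or format. The following sketch checks both. `is_installable_cert` is a hypothetical helper, and the check assumes the certificate is downloaded in PEM format, which this document does not state explicitly.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the file name ends in .crt and the file
# contains a PEM certificate header (assumption: the download is PEM-encoded).
is_installable_cert() {
    case $1 in
        *.crt) ;;
        *) return 1 ;;
    esac
    grep -q -e '-----BEGIN CERTIFICATE-----' "$1"
}

# Example usage (the file name is a placeholder):
# is_installable_cert ./downloaded-root-ca.crt && echo "looks installable"
```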

Generating own self signed cert with custom data

Scenario: setting up repocache as a MITM (man-in-the-middle) HTTPS proxy with a self signed certificate, so that the certificate contains data provided by you, such as company name, server hostname and others.

Open http://repocache.yourdomain.com:18081/config.html and press the button titled ''Initialize certs folder''. The /opt/repocache/certs directory will be created.

Customize the certs/template.cert.config with arbitrary data, then execute the rootcerts.sh script as follows, to create a certificate for the repocache.yourdomain.com hostname:

cd /opt/repocache/certs
./rootcerts.sh repocache.yourdomain.com

This gives you a self-signed certificate that you can use on your local workstation after installing it as follows:

cd /opt/repocache/certs/public
sudo cp repocache.yourdomain.com.crt /usr/local/share/ca-certificates
sudo update-ca-certificates
# Update the init script to make repocache utilize the newly
# generated certificate and signing keys
sudo sed -i 's/--repocacheHostName\ repocache.qgears.com/--repocacheHostName\ repocache.yourdomain.com/g' /etc/init/repocache.conf
# Restart repocache
sudo service repocache restart
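The sed command above only matches the default repocache.qgears.com value. If the init script has been edited before, a variant that replaces whatever host name is currently set may be handier. The sketch below prints the result instead of editing in place, so the change can be reviewed first. `preview_hostname_change` is a hypothetical helper, and it assumes that the new host name contains no slash or ampersand.

```shell
#!/bin/sh
# Hypothetical helper: print the given init script with the value of the
# --repocacheHostName option replaced by $2. The file itself is not modified.
preview_hostname_change() {
    file=$1
    new_host=$2
    sed "s/\(--repocacheHostName[[:space:]][[:space:]]*\)[^[:space:]]*/\1$new_host/g" "$file"
}

# Example usage: review the difference before applying it.
# preview_hostname_change /etc/init/repocache.conf repocache.yourdomain.com \
#     | diff /etc/init/repocache.conf -
```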

Now your workstation accepts the certificate and is able to act as an HTTPS proxy client.

Tip
If you want more workstations to use the currently configured repocache instance as an HTTPS repository artifact proxy, don’t forget to copy the repocache.yourdomain.com.crt file to their /usr/local/share/ca-certificates folders and to issue the update-ca-certificates command on each of them.

Testing the HTTPS proxy download

Issue the following command:

wget -e use_proxy=yes -e https_proxy=repocache.yourdomain.com:18083 https://repo1.maven.org/maven2/ant/ant/maven-metadata.xml

If the configuration has been successful, wget is expected to produce output similar to this:

--2018-03-19 15:16:04--  https://repo1.maven.org/maven2/ant/ant/maven-metadata.xml
Resolving repocache.yourdomain.com (repocache.yourdomain.com)... 127.0.0.1
Connecting to repocache.yourdomain.com (repocache.yourdomain.com)|127.0.0.1|:18083... connected.
Proxy request sent, awaiting response... 200 OK
Length: 537 [application/xml]
Saving to: ‘maven-metadata.xml’

100%[=========================================================================================================================================================================>] 537         --.-K/s   in 0s

2018-03-19 15:16:05 (155 MB/s) - ‘maven-metadata.xml’ saved [537/537]

Upstream proxy setup

Repocache can be configured to download artifacts through another proxy server, called 'upstream proxy' henceforth.

The relevant settings can be set either on the web user interface, while repocache is running, or in the repocache.config file, when repocache is not running:

upstreamproxy.hostname=upstreamproxy.yourdomain.com
upstreamproxy.port=3128

Currently, only HTTP upstream proxies are supported, and only without authentication.

Warning
Repocache may delete the above settings from repocache.config file upon exiting, if they are inserted while repocache is running.
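Given the warning above, edits to repocache.config are best scripted and run only while repocache is stopped. A minimal sketch follows. `set_config_key` is a hypothetical helper that sets a key=value line without duplicating the key, and the location of repocache.config in the usage example is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: set key=value in a properties-style file, replacing
# any existing line that starts with "key=". Run it only while repocache
# is stopped.
set_config_key() {
    file=$1
    key=$2
    value=$3
    [ -f "$file" ] || : > "$file"
    tmp=$(mktemp) || return 1
    # Keep every line that does not begin with "key=".
    awk -v k="$key=" 'index($0, k) != 1' "$file" > "$tmp"
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Write back through the original file to keep its owner and permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example usage (assumed file location):
# sudo service repocache stop
# set_config_key /opt/repocache/repocache.config upstreamproxy.hostname upstreamproxy.yourdomain.com
# set_config_key /opt/repocache/repocache.config upstreamproxy.port 3128
# sudo service repocache start
```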

Exceptions from upstream proxying

By default, all requests are routed through the upstream proxy if it is configured. However, there might be cases when the upstream proxy is not able to access a certain set of hosts.

Upstream proxy exceptions can be configured on the web user interface, in the list titled ''Exceptions'' in the ''Upstream proxy configuration'' section, by specifying the full host names, one per line. If repocache clients attempt to download files from any of the specified hosts, the download is performed directly, bypassing the upstream proxy.

Relevant settings in the repocache.config file:

upstreamproxy.hostname=upstreamproxy.yourdomain.com
upstreamproxy.port=3128
upstreamproxy.exceptions=upstreamproxyexception.com\r\ndonotproxyme.intranet.net\r\nlocalhost

Example scenario: repocache serves Maven builds in an enterprise environment where downloading files from the Internet is only possible through a corporate proxy. Build artifacts are downloaded both from the Maven central repository and from mirrors hosted in the local network, and the latter can only be accessed directly. If the corporate proxy cannot reach the local mirrors, they have to be added to the upstream proxy exception list in repocache, so that files are downloaded from them directly.
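In the repocache.config example above, the host names in upstreamproxy.exceptions are separated by the literal four-character sequence \r\n. Typing that by hand is error-prone, so the value can be assembled with a small helper. This is a sketch: `join_exceptions` is a hypothetical name, and the separator is taken from the example rather than from a format specification.

```shell
#!/bin/sh
# Hypothetical helper: join host names with a literal backslash-r
# backslash-n sequence, as seen in the upstreamproxy.exceptions example.
join_exceptions() {
    result=
    for host in "$@"; do
        if [ -z "$result" ]; then
            result=$host
        else
            result="$result\\r\\n$host"
        fi
    done
    # printf with %s prints the backslashes literally.
    printf '%s\n' "$result"
}

echo "upstreamproxy.exceptions=$(join_exceptions donotproxyme.intranet.net localhost)"
```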

Operation modes

The repocache proxy supports different usage scenarios through its operation modes. See the case study for an example of how these settings are applied in practice.

Click on the links to read the related documentation in the source code comments on how repocache can operate, i.e. what caching policies are implemented.

These operation modes can be configured either globally or per-site, in the access.config configuration file.

Format of the access.config file

The access.config file consists of entries of the form [operation-mode whitespace-separators host-and-path-fragment], one per line (without the square brackets).

The host-and-path-fragment is built from the host name and path parts of the URI, mapped into an internal format:

  • prefixed with the /proxy/ string,

  • including the protocol, via which the artifact is accessed (http or https),

  • including a fragment of the host name and path of artifacts as follows:

/proxy/[http|https]/hostname.com/path-fragment
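The mapping described above can be sketched as a small shell function that turns a repository URL into the internal form. `to_proxy_path` is a hypothetical helper. It covers only the scheme, host and path shown in this section; how repocache treats ports or query strings is not covered here.

```shell
#!/bin/sh
# Hypothetical helper: map an http(s) URL to the /proxy/<scheme>/<host>/<path>
# form used in access.config. A trailing slash is dropped.
to_proxy_path() {
    url=$1
    case $url in
        http://*)  scheme=http;  rest=${url#http://} ;;
        https://*) scheme=https; rest=${url#https://} ;;
        *) echo "unsupported URL: $url" >&2; return 1 ;;
    esac
    printf '/proxy/%s/%s\n' "$scheme" "${rest%/}"
}

to_proxy_path http://repo1.maven.org/maven2/
# prints: /proxy/http/repo1.maven.org/maven2
```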

Example:

# The Maven repository is additive: artifacts in it never change.
# New artifacts are downloaded when required; otherwise, all
# artifacts whose URL contains the 'repo1.maven.org/maven2'
# string are served from the cache:
add /proxy/http/repo1.maven.org/maven2

# Repocache will not store the contents of an internal,
# manually maintained download site, as that would be
# unnecessary redundancy:
transparent /proxy/http/your.intranet.net/some-repo
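A rough syntax check can catch typos in access.config before repocache is restarted. The sketch below verifies only the shape described in this section: two whitespace-separated fields, the second starting with /proxy. `check_access_config` is a hypothetical helper. It does not validate operation mode names, and the rules repocache itself applies may differ.

```shell
#!/bin/sh
# Hypothetical helper: report access.config lines that do not consist of
# exactly two fields, or whose second field does not start with /proxy.
# Comment lines and blank lines are skipped.
check_access_config() {
    awk '
        /^[ \t]*#/ || /^[ \t]*$/ { next }
        NF != 2 || index($2, "/proxy") != 1 {
            printf "line %d: unexpected entry: %s\n", NR, $0
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}

# Example usage (assumed file location):
# check_access_config /opt/repocache/access.config && echo "access.config looks fine"
```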

The default access.config configuration file provided in the sample configuration makes repocache work in update mode for all sites. This is likely to be suitable for creating the initial repository contents.

Site aliases

Problem: certain repositories publish their contents…

  • both via the HTTP and the HTTPS protocol, or

  • via more than one domain name, or

  • via domain names and paths that change over time.

So proxy clients may download the same artifacts via different URLs. If they do, repocache will store these artifacts more than once, redundantly.

TODO finish this!