Tools required for the build:
- OpenJDK 8
- Maven (tested with 3.0.5 and 3.3.9)
Build has been tested under Ubuntu 14.04.3 and 16.04.2 LTS.
sudo add-apt-repository ppa:openjdk-r/ppa
sudo apt-get update
sudo apt-get install openjdk-8-jdk
Copy the following snippet into ~/.m2/toolchains.xml:
<toolchains>
  <!-- ... -->
  <!-- JDK toolchains -->
  <toolchain>
    <type>jdk</type>
    <provides>
      <id>JavaSE-1.8</id>
    </provides>
    <configuration>
      <jdkHome>/usr/lib/jvm/java-8-openjdk-amd64/jre</jdkHome>
    </configuration>
  </toolchain>
  <!-- ... -->
</toolchains>
The install script is as follows. Note that it will prompt for the sudo password when installing the application:
#!/bin/bash
mkdir -p ~/git/qgears
cd ~/git/qgears
git clone --recursive https://github.com/qgears/repocache.git
cd repocache
mvn package
# or maybe 'mvn integration-test' to run the build with tests
# The file matching the repocache-$VERSION-$DATE.jar pattern will be found here:
cd hu.qgears.repocache/target/
ls -lah
# Selecting the most recently built jar file
MOST_RECENT=$(ls -Art repocache-*.jar | tail -n 1)
echo "$MOST_RECENT"
INSTALL_DIR=/opt/repocache
sudo mkdir -p "$INSTALL_DIR"
sudo cp "$MOST_RECENT" "$INSTALL_DIR"
# -f replaces a symlink left over from a previous installation
sudo ln -sf "$INSTALL_DIR/$MOST_RECENT" "$INSTALL_DIR/repocache.jar"
cd ../..
# Edit and customize the following startup script, then copy it to its final location!
# Upstart startup script suitable for Ubuntu 14 LTS and 16 LTS:
sudo cp hu.qgears.repocache/doc/repocache.conf /etc/init
sudo chown root:root /etc/init/repocache.conf
# Copying configuration:
sudo cp hu.qgears.repocache/doc/sample-config/* /opt/repocache
# Allowing write access to the current user
sudo chown -Rc "$(id -nu):$(id -ng)" "$INSTALL_DIR"
In order to start repocache manually, issue the following command:
sudo service repocache start
You now have a working repocache HTTP proxy installation and a running instance, so you may configure the applications that require repository artifact caching, such as Maven.
Assumption: the name repocache.yourdomain.com is a valid, resolvable domain name on your computer. For a single-workstation setup, ensure that the following entry is present in your /etc/hosts file:
127.0.1.1 repocache.yourdomain.com
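To check for the entry without opening the file, a small helper can scan a hosts-format file for the name. This is only a minimal sketch: hosts_has_name is a hypothetical helper name, not something shipped with repocache.

```shell
# Sketch: report whether a hosts-format file lists a given host name.
# hosts_has_name is a hypothetical helper, not part of repocache.
hosts_has_name() {
    # $1: path to a hosts-format file, $2: host name to look for
    awk -v name="$2" '
        { sub(/#.*/, "") }    # drop trailing comments
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit found ? 0 : 1 }
    ' "$1"
}

if hosts_has_name /etc/hosts repocache.yourdomain.com; then
    echo "repocache.yourdomain.com is listed in /etc/hosts"
else
    echo "repocache.yourdomain.com is missing from /etc/hosts"
fi
```

The helper only inspects the file. To test actual name resolution, getent hosts repocache.yourdomain.com can be used instead.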
Web administration interface: http://repocache.yourdomain.com:18081
Note that HTTPS proxying is not available at this stage; see the following chapter for the setup.
As repocache is a cache, it has to decrypt encrypted communication to be able to store downloaded artifacts. This is why its own (self-signed) key pair and certificate have to be created and installed: when repocache is used as an HTTPS proxy, it re-encrypts the decrypted traffic with its own generated private key, and clients can accept that traffic by trusting the corresponding self-signed certificate.
Scenario: setting up repocache as a MITM (man-in-the-middle) HTTPS proxy with a self signed certificate generated by repocache itself.
- Visit the web administration interface: http://repocache.yourdomain.com:18081/config.html
- download the certificate by right clicking on the link titled ''HTTPS root CA''
- copy it to the /usr/local/share/ca-certificates directory
- issue the following command:
sudo update-ca-certificates
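update-ca-certificates only picks up PEM-encoded files with a .crt extension from /usr/local/share/ca-certificates, so it can be worth checking the downloaded file before copying it. The following is a minimal sketch. The helper name is_installable_cert and the file name repocache-root-ca.crt are hypothetical; use whatever name the browser saved the certificate under.

```shell
# Sketch: check that a file looks like something update-ca-certificates
# will accept (a .crt extension and a PEM certificate header).
# is_installable_cert is a hypothetical helper, not part of repocache.
is_installable_cert() {
    case "$1" in
        *.crt) ;;
        *) return 1 ;;
    esac
    [ -f "$1" ] && grep -q -e '-----BEGIN CERTIFICATE-----' "$1"
}

# Hypothetical file name; adjust it to the downloaded file.
CERT=${CERT:-./repocache-root-ca.crt}
if is_installable_cert "$CERT"; then
    echo "$CERT can be copied to /usr/local/share/ca-certificates"
else
    echo "$CERT is missing, not PEM-encoded, or lacks the .crt extension"
fi
```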
Scenario: setting up repocache as a MITM (man-in-the-middle) HTTPS proxy with a self signed certificate, so that the certificate contains data provided by you, such as company name, server hostname and others.
Open http://repocache.yourdomain.com:18081/config.html and press the button titled ''Initialize certs folder''. The /opt/repocache/certs directory will be created.
Customize the certs/template.cert.config with arbitrary data, then execute the rootcerts.sh script as follows, to create a certificate for the repocache.yourdomain.com hostname:
cd /opt/repocache/certs
./rootcerts.sh repocache.yourdomain.com
This gives you a self-signed certificate, which you can use on your local workstation after installing it as follows:
cd /opt/repocache/certs/public
sudo cp repocache.yourdomain.com.crt /usr/local/share/ca-certificates
sudo update-ca-certificates
# Update the init script to make repocache utilize the newly
# generated certificate and signing keys
sudo sed -i 's/--repocacheHostName\ repocache.qgears.com/--repocacheHostName\ repocache.yourdomain.com/g' /etc/init/repocache.conf
# Restart repocache
sudo service repocache restart
Your workstation will now accept the certificate and will be able to act as an HTTPS proxy client.
Tip: If you want more workstations to use the currently configured repocache instance as an HTTPS repo artifact proxy, don't forget to copy the repocache.yourdomain.com.crt file to their /usr/local/share/ca-certificates folders and issue the update-ca-certificates command on all workstations.
To verify the HTTPS proxy setup, issue the following command:
wget -e use_proxy=yes -e https_proxy=repocache.yourdomain.com:18083 https://repo1.maven.org/maven2/ant/ant/maven-metadata.xml
If the configuration has been successful, wget is expected to produce output similar to this:
--2018-03-19 15:16:04-- https://repo1.maven.org/maven2/ant/ant/maven-metadata.xml
Resolving repocache.yourdomain.com (repocache.yourdomain.com)... 127.0.0.1
Connecting to repocache.yourdomain.com (repocache.yourdomain.com)|127.0.0.1|:18083... connected.
Proxy request sent, awaiting response... 200 OK
Length: 537 [application/xml]
Saving to: ‘maven-metadata.xml’
100%[=========================================================================================================================================================================>] 537 --.-K/s in 0s
2018-03-19 15:16:05 (155 MB/s) - ‘maven-metadata.xml’ saved [537/537]
Repocache can be configured to download artifacts through another proxy server, called the 'upstream proxy' henceforth.
The relevant settings can be set either on the web user interface, while repocache is running, or in the repocache.config file, when repocache is not running:
upstreamproxy.hostname=upstreamproxy.yourdomain.com
upstreamproxy.port=3128
Currently, only HTTP upstream proxying without authentication is supported.
Warning: Repocache may delete the above settings from the repocache.config file upon exiting, if they are inserted while repocache is running.
By default, all requests are routed through the upstream proxy if it is configured. However, there might be cases when the upstream proxy is not able to access a certain set of hosts.
Upstream proxy exceptions can be configured on the web user interface, in the list titled ''Exceptions'' in the ''Upstream proxy configuration'' section, by specifying the full host names, one in each line. If repocache clients attempt downloading files from any of the specified hosts, the download will be performed directly, ignoring the upstream proxy.
Relevant settings in the repocache.config file:
upstreamproxy.hostname=upstreamproxy.yourdomain.com
upstreamproxy.port=3128
upstreamproxy.exceptions=upstreamproxyexception.com\r\ndonotproxyme.intranet.net\r\nlocalhost
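In the sample above, the exception hosts are joined with the literal four-character sequence \r\n rather than with real line breaks. Assuming that this is the format repocache expects, the line can be generated from a list of host names as sketched below. exceptions_line is a hypothetical helper name, not part of repocache.

```shell
# Sketch: build the upstreamproxy.exceptions line from host names.
# exceptions_line is a hypothetical helper, not part of repocache.
# Assumption: hosts are separated by a literal \r\n, as in the sample.
exceptions_line() {
    line=""
    for host in "$@"; do
        if [ -z "$line" ]; then
            line="$host"
        else
            # Single quotes keep \r\n as four literal characters.
            line="${line}"'\r\n'"${host}"
        fi
    done
    # %s prints the value verbatim, without interpreting backslashes.
    printf '%s\n' "upstreamproxy.exceptions=${line}"
}

exceptions_line upstreamproxyexception.com donotproxyme.intranet.net localhost
```

Append the generated line to repocache.config only while repocache is stopped, since settings inserted while it is running may be deleted on exit.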
Example scenario: repocache is used for Maven builds in an enterprise environment where downloading files from the Internet is only permitted and possible through a corporate proxy, but build artifacts are downloaded both from the Maven Central repository and from mirrors hosted in the local network, which can only be accessed directly. If the corporate proxy is not able to access the mirrors in the local network, they have to be added to the upstream proxy exception list in repocache, so that files can be downloaded from the local mirrors directly.
The repocache proxy supports different usage scenarios through several operation modes. See the case study for an example of how these settings are applied in practice.
The source code comments of each operation mode document how repocache can operate, i.e. which caching policies are implemented.
These operation modes can be configured either globally or per-site, in the access.config configuration file.
The access.config file consists of [operation-mode whitespace-separators host-and-path-fragment] triplets, one in each line (without the square brackets).
The host-and-path-fragment is a part of the host name and path parts of the URI, mapped into an internal format,
- prefixed with the /proxy/ string,
- including the protocol, via which the artifact is accessed (http or https),
- including a fragment of the host name and path of artifacts as follows:
/proxy/[http|https]/hostname.com/path-fragment
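The mapping from a repository URL to this internal format can be sketched as a small shell function. url_to_access_path is a hypothetical helper name, not part of repocache. It handles plain http and https URLs only, and stripping a trailing slash is merely a cosmetic assumption that mirrors the documented form.

```shell
# Sketch: map a repository URL to the /proxy/<protocol>/<host>/<path>
# form used in access.config.
# url_to_access_path is a hypothetical helper, not part of repocache.
url_to_access_path() {
    case "$1" in
        http://*)  scheme=http;  rest=${1#http://} ;;
        https://*) scheme=https; rest=${1#https://} ;;
        *) echo "unsupported URL: $1" >&2; return 1 ;;
    esac
    # Assumption: a trailing slash is dropped to mirror the documented form.
    printf '/proxy/%s/%s\n' "$scheme" "${rest%/}"
}

# Prints an access.config entry: add /proxy/http/repo1.maven.org/maven2
echo "add $(url_to_access_path http://repo1.maven.org/maven2)"
```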
Example:
# Maven is additive: artifacts in it do not change.
# If new ones are required, those are downloaded; otherwise,
# all artifacts whose URL contains the 'repo1.maven.org/maven2'
# string are served from the cache:
add /proxy/http/repo1.maven.org/maven2
# Repocache will not store the contents of an internal,
# manually maintained download site, as that would be
# unnecessary redundancy:
transparent /proxy/http/your.intranet.net/some-repo
The default access.config configuration file provided in the sample configuration makes repocache work in update mode for all sites. This is likely to be suitable for creating the initial repository contents.
Problem: certain repositories publish their contents…
- via both the HTTP and HTTPS protocols, or
- via more than one domain name, or
- via domain names and paths that change over time.
So proxy clients may download the same artifacts via different URLs. If they do, repocache will store these artifacts more than once, redundantly.
TODO finish this!