mercator is a tool that finds packages on the filesystem and extracts metadata about them.
The easiest way to install mercator is from the RPM package built in COPR.
On Fedora:
- Enable the COPR repository:
dnf copr enable msrb/mercator
- Install mercator:
dnf install mercator
The mercator binary is distributed as an RPM package. To release a new version, bump the release number in mercator.spec:
rpmdev-bumpspec -c 'short description' mercator.spec
Then commit the change with git and push it.
Note: install the rpmdevtools package if you don't have the rpmdev-bumpspec binary on your system.
To build RPMs in CI, an API token for COPR is required. The token is valid for 6 months.
When the token expires, the CI build fails with the following error message:
Error: Login invalid/expired. Please visit https://copr.fedorainfracloud.org/api to get or renew your API token.
Visit https://copr.fedorainfracloud.org/api, generate a new token, and then update the token in CI.
Note: you need permission to collaborate on the mercator COPR project.
Go to https://copr.fedorainfracloud.org/coprs/msrb/mercator/permissions/ and request access (msrb and msehnout are admins there).
Language | Ecosystem | Valid Manifest Files |
---|---|---|
Python | PyPI | 1. setup.py 2. PKG-INFO 3. requirements.txt |
Ruby | Gems | 1. Gemspec 2. Gemfile.lock |
Node | NPM | 1. package.json 2. package-lock.json 3. npm-shrinkwrap.json |
Java | Maven | 1. JAR file 2. pom.xml 3. build.gradle |
Rust | Cargo | 1. Cargo.toml 2. Cargo.lock |
.NET | Nuget | 1. .sln files 2. .dll 3. .nupkg file 4. .nuspec file 5. AssemblyInfo.cs file |
Haskell | Hackage | 1. .cabal file |
Golang | Golang | 1. glide.yaml 2. glide.lock 3. Gopkg.toml 4. Godeps.json |
Point Mercator at a directory and it will walk down all child directories and collect information about every package manifest it encounters. The output is always a JSON document describing what has been found. Note that the key/value layout of each result depends on the package ecosystem that produced it, so if you want to do further processing or analytics, you may want to normalize the data first.
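As a rough illustration of such normalization, here is a minimal Python sketch. The top-level keys path, ecosystem, and result are taken from the sample output in this README; the normalize helper itself is hypothetical, and the assumption that a handler reports name and version inside result is only confirmed for the Python example, so other ecosystems may need their own mapping.

```python
import json


def normalize(entries):
    """Reduce Mercator entries to a flat, ecosystem-independent shape."""
    normalized = []
    for entry in entries:
        result = entry.get("result") or {}
        normalized.append({
            "ecosystem": entry.get("ecosystem"),
            "path": entry.get("path"),
            # Assumption: the handler exposes "name" and "version".
            # Other ecosystems may use different keys or nest them.
            "name": result.get("name"),
            "version": result.get("version"),
        })
    return normalized


# A trimmed-down stand-in for real Mercator output.
sample = json.loads("""
[{"path": "/tmp/jsl/setup.py", "ecosystem": "Python",
  "result": {"name": "jsl", "version": "0.2.1", "license": "BSD"}}]
""")
print(normalize(sample))
```

Missing keys become None rather than raising, which keeps the sketch tolerant of ecosystems whose layout differs.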
See our contributing guidelines for more info.
Mercator uses native libraries/tools whenever possible, but because of that it has quite a lot of external dependencies.
Dependencies required by mercator itself:
openssl-devel git golang make
Per handler dependencies:
Ruby:
ruby
Java:
java-devel maven
Python:
python3 python3-devel
Javascript:
nodejs
Dotnet:
mono-devel nuget
Golang:
glide python34-toml
For Dotnet, you have to run this command first:
yes | certmgr -ssl https://go.microsoft.com && yes | certmgr -ssl https://nuget.org
All handler dependencies together:
ruby java-devel python3 python3-devel nodejs mono-devel nuget glide
If you have all the packages installed, make sure that your GOPATH is set.
You can set it to, for example, $(pwd) or /tmp, like this: export GOPATH=/tmp
Then invoke make:
make build
sudo make install
Some handlers are built and installed by default; others need to be explicitly enabled (see the beginning of the Makefile).
For example, to build the dotnet handler (which is disabled by default), either change DOTNET=NO to DOTNET=YES in the Makefile or build it like this:
make build DOTNET=YES
Note: You can also take a look at our spec file, which we use to build RPMs.
After that, mercator is ready to be used:
$ mercator jsl/
[
{
"path": "/home/podvody/Repos/jsl/setup.py",
"ecosystem": "Python",
"result": {
"author": "Anton Romanovich",
"author_email": "[email protected]",
"classifiers": [
"Development Status :: 4 - Beta",
"Intended Audience :: Developers",
"License :: OSI Approved :: BSD License",
"Operating System :: OS Independent",
"Programming Language :: Python",
"Programming Language :: Python :: 2",
"Programming Language :: Python :: 2.6",
"Programming Language :: Python :: 2.7",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.3",
"Programming Language :: Python :: 3.4",
"Programming Language :: Python :: Implementation :: CPython",
"Programming Language :: Python :: Implementation :: PyPy",
"Topic :: Software Development :: Libraries :: Python Modules"
],
"description": "A Python DSL for defining JSON schemas",
"ext_modules": [],
"license": "BSD",
"long_description": "JSL\n===\n\n.. image:: https://travis-ci.org/aromanovich/jsl.svg?branch=master\n :target: https://travis-ci.org/aromanovich/jsl\n :alt: Build Status\n\n.. image:: https://coveralls.io/repos/aromanovich/jsl/badge.svg?branch=master\n :target: https://coveralls.io/r/aromanovich/jsl?branch=master\n :alt: Coverage\n\n.. image:: https://readthedocs.org/projects/jsl/badge/?version=latest\n :target: https://readthedocs.org/projects/jsl/\n :alt: Documentation\n\n.. image:: http://img.shields.io/pypi/v/jsl.svg\n :target: https://pypi.python.org/pypi/jsl\n :alt: PyPI Version\n\n.. image:: http://img.shields.io/pypi/dm/jsl.svg\n :target: https://pypi.python.org/pypi/jsl\n :alt: PyPI Downloads\n\nDocumentation_ | GitHub_ | PyPI_\n\nJSL is a Python DSL for defining JSON Schemas.\n\nExample\n-------\n\n::\n\n import jsl\n\n class Entry(jsl.Document):\n name = jsl.StringField(required=True)\n\n class File(Entry):\n content = jsl.StringField(required=True)\n\n class Directory(Entry):\n content = jsl.ArrayField(jsl.OneOfField([\n jsl.DocumentField(File, as_ref=True),\n jsl.DocumentField(jsl.RECURSIVE_REFERENCE_CONSTANT)\n ]), required=True)\n\n``Directory.get_schema(ordered=True)`` will return the following JSON schema:\n\n::\n\n {\n \"$schema\": \"http://json-schema.org/draft-04/schema#\",\n \"definitions\": {\n \"directory\": {\n \"type\": \"object\",\n \"properties\": {\n \"name\": {\"type\": \"string\"},\n \"content\": {\n \"type\": \"array\",\n \"items\": {\n \"oneOf\": [\n {\"$ref\": \"#/definitions/file\"},\n {\"$ref\": \"#/definitions/directory\"}\n ]\n }\n }\n },\n \"required\": [\"name\", \"content\"],\n \"additionalProperties\": false\n },\n \"file\": {\n \"type\": \"object\",\n \"properties\": {\n \"name\": {\"type\": \"string\"},\n \"content\": {\"type\": \"string\"}\n },\n \"required\": [\"name\", \"content\"],\n \"additionalProperties\": false\n }\n },\n \"$ref\": \"#/definitions/directory\"\n }\n\nInstalling\n----------\n\n::\n\n pip install 
jsl\n\nLicense\n-------\n\n`BSD license`_\n\n.. _Documentation: http://jsl.readthedocs.org/\n.. _GitHub: https://github.com/aromanovich/jsl\n.. _PyPI: https://pypi.python.org/pypi/jsl\n.. _BSD license: https://github.com/aromanovich/jsl/blob/master/LICENSE\n",
"name": "jsl",
"packages": [
"jsl",
"jsl.fields",
"jsl._compat"
],
"url": "https://jsl.readthedocs.org",
"version": "0.2.1"
}
}
]
To run tests, simply run:
make check
Mercator 1.0 was written mostly in Python. While Python is ubiquitous in some circles, it is less so in others. The main reason for the rewrite was to accommodate cases where Python was not installed and installing it didn't make any sense; expecting a Java dev to install Python is a no-no. Mercator 2.0 is divided into two main components:
- Core
- Handlers
Core is a statically linked binary (and thus has no external dependencies), while Handlers can be written in ecosystem-specific languages. There are two reasons why it might be better to write a handler in an ecosystem-specific language:
- The target language is best equipped for handling its ecosystem, as it already contains all the necessary bits to handle the packaging
  - A good example is Java and pom.xml files. Maven knows how to work with pom.xml files, so letting it extract metadata from them is better than trying to implement the same functionality in other languages
- The target language is already present on the box: if I'm developing in Java, I have no problem running a handler written in Java, since the necessary tooling already has to be there
Another crucial difference is that the handler specification is now declarative, and not some random code in some source file, but more about that below.
Ecosystem-specific handlers are configured via a special file, handlers.yml, which allows for specifying the following criteria:
- name: "Python"
  description: "PyPI Package (source)"
  filepatterns:
    - "^setup\\.py$"
  binary: "python"
  handler: "handlers/python"
- filepatterns selects files by a regular expression matched against the file name
- binary specifies the main executable used to run the handler
- handler is the actual handler; its path is relative to the Mercator data directory (configurable in the same YAML, defaults to /usr/share/mercator/)
- name: "Ruby-Dist"
  description: "Installed RubyGem"
  pathpatterns:
    - "^.*/specifications/.*\\.gemspec$"
  handler: "handlers/ruby"
  binary: "ruby"
- pathpatterns selects files by a regular expression matched against the whole absolute path
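The difference between the two pattern kinds can be sketched in Python as follows. This is only an illustration, not Mercator's actual matching code: the matches helper and the dictionary layout are hypothetical, the patterns are copied from the two configuration entries above, and the path passed in is assumed to be absolute already.

```python
import os
import re


def matches(handler, path):
    """Return True if a filepattern matches the file name or a
    pathpattern matches the whole (absolute) path."""
    name = os.path.basename(path)
    if any(re.search(p, name) for p in handler.get("filepatterns", [])):
        return True
    return any(re.search(p, path) for p in handler.get("pathpatterns", []))


# Patterns as they read once the YAML escaping (\\.) is resolved.
python_handler = {
    "name": "Python",
    "filepatterns": [r"^setup\.py$"],
}
ruby_dist_handler = {
    "name": "Ruby-Dist",
    "pathpatterns": [r"^.*/specifications/.*\.gemspec$"],
}

for candidate in ("/home/user/jsl/setup.py",
                  "/usr/share/gems/specifications/rake-12.0.0.gemspec"):
    for handler in (python_handler, ruby_dist_handler):
        print(candidate, handler["name"], matches(handler, candidate))
```

A file name pattern cannot tell an installed gem from a gemspec sitting in a source checkout, which is presumably why the Ruby-Dist entry matches on the full path instead.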
Finally, because Java always needs special care, the following configuration keys are also available:
- name: "Java"
  description: "Java JAR"
  types:
    - "application/zip"
  inarchive:
    - "META-INF/MANIFEST.MF"
  handler: "handlers/java"
  binary: "java"
  args:
    - "-jar"
- types specifies the MIME type the file should have
- inarchive specifies a file that has to be present if types refers to an archive (currently only ZIP is supported)
- args lists additional arguments passed to the binary
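To make the types and inarchive criteria more concrete, here is a small Python sketch of the kind of check they describe. It is an assumption-laden illustration: archive_contains is a hypothetical helper, and zipfile.is_zipfile stands in for real MIME type detection, which Mercator may well perform differently.

```python
import io
import zipfile


def archive_contains(fileobj, required):
    """Return True if fileobj is a ZIP archive holding every required member."""
    # Stand-in for the "types" check: is this a ZIP archive at all?
    if not zipfile.is_zipfile(fileobj):
        return False
    fileobj.seek(0)
    # Stand-in for the "inarchive" check: are the listed members present?
    with zipfile.ZipFile(fileobj) as archive:
        names = set(archive.namelist())
    return all(member in names for member in required)


# Build a tiny in-memory archive that looks like a JAR.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as jar:
    jar.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")

print(archive_contains(buffer, ["META-INF/MANIFEST.MF"]))
```

Checking for META-INF/MANIFEST.MF is what separates a JAR from an arbitrary ZIP file that happens to share the same MIME type.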
As stated above, each handler is implemented in the language of its ecosystem; that is, the Python handler is written in Python, the Java handler in Java, etc.
If a handler needs any additional dependencies, they are bundled directly in the handlers/ directory (currently the Java and Python handlers have some bundled dependencies).