This package is an effort to port the Chalice packager to a library that can
be used to handle the dependency resolution portion of packaging Python code
for use in AWS Lambda. The scope of this builder is to take an existing
directory containing customer code and a top-level requirements.txt file
specifying third-party dependencies. The builder will examine the dependencies
and use pip to build and include them in the customer code bundle
in a way that makes them importable.
Python is particularly difficult to package for other platforms because it is
heavily coupled to your local environment. This is because setup.py, the
"config" that describes an install, is just a Python file, which means authors can
run code to check things about your current system and then use that to make
decisions about how to install the package. These decisions are not guaranteed
to be valid once we move the built package to a different platform.
Python packaging also has the concept of a wheel, which is a more self-contained package format that can be trivially installed into a Python environment and does not execute any code during installation. Our goal is to build up a set of these that are known to be compatible with AWS Lambda.
The top-level interface is presented by the PythonPipDependencyBuilder
class. It has one public method, build_dependencies, which takes
the provided arguments and builds Python dependencies using pip under
the hood.
def build_dependencies(artifacts_dir_path,
                       requirements_path,
                       runtime,
                       ui=None,
                       config=None,
                       ):
    """Builds a python project's dependencies into an artifact directory.

    :type artifacts_dir_path: str
    :param artifacts_dir_path: Directory to write dependencies into.

    :type requirements_path: str
    :param requirements_path: Path to a requirements.txt file to inspect
        for a list of dependencies.

    :type runtime: str
    :param runtime: Python version to build dependencies for. This can
        either be python3.8, python3.9, python3.10, python3.11 or
        python3.12. These are currently the only supported values.

    :type ui: :class:`lambda_builders.actions.python_pip.utils.UI`
    :param ui: A class that traps all progress information such as status
        and errors. If injected by the caller, it can be used to monitor
        the status of the build process or forward this information
        elsewhere.

    :type config: :class:`lambda_builders.actions.python_pip.utils.Config`
    :param config: To be determined. This is an optional config object
        we can extend at a later date to add more options to how pip is
        called.
    """
The general algorithm for preparing a python package for use on AWS Lambda is as follows.
Let pip choose what to install. This gives us the best chance of getting
a complete closure over all the requirements in our requirements.txt
file.
We will have a mixture of sdists and wheel files after this step. Pip prefers
wheels, so an sdist will be present only when pip could not find a wheel. We now
use this directory full of sdists and wheels as our source of truth for a
complete list of all the dependencies we need.
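A minimal sketch of this first download pass, assuming pip is driven as a subprocess through `python -m pip download`. The helper name `pip_download_args` and the directory name `deps` are hypothetical; `-r` and `--dest` are standard pip options.

```python
import sys


def pip_download_args(requirements_path, download_dir):
    """Return the argv for an unrestricted `pip download`.

    No platform constraints are passed, so pip resolves the full
    dependency closure and fetches a wheel where it can, an sdist otherwise.
    """
    return [
        sys.executable, "-m", "pip", "download",
        "-r", requirements_path,
        "--dest", download_dir,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(pip_download_args("requirements.txt", "deps")))
```

Passing the returned list to `subprocess.run` would perform the download; the contents of the destination directory then serve as the list of required packages.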
Sort the downloaded packages into three categories:
- sdists (Pip could not get a wheel so it gave us an sdist)
- lambda compatible wheel files
- lambda incompatible wheel files
Pip will give us a wheel when it can, but some distributions do not ship wheels at all, in which case we will have an sdist. In some cases a platform-specific wheel file may be available, so pip will have downloaded that. If our platform does not match the platform defined for the Lambda function (linux/manylinux on x86_64 or aarch64), the downloaded wheel file may not be compatible with Lambda. Pure Python wheels will still be compatible because they have no platform-specific dependencies.
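The sorting step can be sketched roughly as below. This is a simplified illustration, not the builder's actual logic: it looks only at the platform tag at the end of the wheel filename and ignores the Python and ABI tags, which a real check would also need to consider. The function names and the package filenames are hypothetical.

```python
def is_lambda_platform(platform, arch):
    """True for pure Python wheels and linux/manylinux wheels for `arch`."""
    if platform == "any":
        return True
    return platform.startswith(("manylinux", "linux")) and platform.endswith("_" + arch)


def classify(filenames, arch="x86_64"):
    """Split downloaded files into sdists, compatible and incompatible wheels."""
    sdists, compatible, incompatible = [], [], []
    for name in filenames:
        if not name.endswith(".whl"):
            # Anything that is not a wheel is treated as an sdist.
            sdists.append(name)
            continue
        # Wheel names end in -{python tag}-{abi tag}-{platform tag}.whl
        platform_tag = name[:-len(".whl")].rsplit("-", 3)[-1]
        # A compressed tag such as "a.b" lists several platforms.
        if any(is_lambda_platform(p, arch) for p in platform_tag.split(".")):
            compatible.append(name)
        else:
            incompatible.append(name)
    return sdists, compatible, incompatible


if __name__ == "__main__":
    downloaded = [
        "purepkg-1.0-py3-none-any.whl",
        "nativepkg-2.0-cp312-cp312-manylinux2014_x86_64.whl",
        "otherpkg-3.1-cp312-cp312-macosx_11_0_arm64.whl",
        "legacy-0.4.tar.gz",
    ]
    for label, group in zip(("sdists", "compatible", "incompatible"), classify(downloaded)):
        print(label, group)
```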
Next we need to go through the downloaded packages and pick out any dependencies that do not have a compatible wheel file downloaded. For these packages we need to explicitly try to download a compatible wheel file. A compatible wheel file is one that is explicitly marked as supporting the function's architecture.
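One way to request a compatible wheel explicitly is through pip's target-selection options. The sketch below assumes the `manylinux2014` platform tag and the CPython implementation; both are illustrative assumptions, and the helper name `compatible_wheel_args` is hypothetical. The pip options themselves (`--only-binary`, `--no-deps`, `--platform`, `--implementation`, `--abi`, `--python-version`, `--dest`) are standard.

```python
import sys


def compatible_wheel_args(requirement, runtime, arch, download_dir):
    """Return the argv asking pip for a Lambda-compatible wheel only.

    `runtime` is a value such as "python3.12"; `arch` is "x86_64" or "aarch64".
    """
    version = runtime[len("python"):]          # "python3.12" -> "3.12"
    abi = "cp" + version.replace(".", "")      # "3.12" -> "cp312"
    return [
        sys.executable, "-m", "pip", "download",
        "--only-binary=:all:",                 # never fall back to an sdist
        "--no-deps",                           # the closure is already known
        "--platform", "manylinux2014_" + arch, # assumed platform tag
        "--implementation", "cp",
        "--abi", abi,
        "--python-version", version,
        "--dest", download_dir,
        requirement,
    ]


if __name__ == "__main__":
    print(" ".join(compatible_wheel_args("nativepkg==2.0", "python3.12", "aarch64", "deps")))
```

If pip exits with an error for a given requirement, no compatible wheel was published for it and the package stays in the "needs building" set.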
Re-count the wheel files after the second download pass. Anything that has an sdist but no valid wheel file is still not going to work on AWS Lambda, so we must now try to build the sdist into a wheel file ourselves, as none was available on PyPI in a compatible format.
Re-count the wheel files after the custom compile pass. If there are still dependencies that only have incompatible wheels or sdists, all hope is not lost.
There is still the case where the package has optional C dependencies for speedups. In this case the wheel file will have been built above with the C dependencies if the build managed to find a C compiler. If we are on an incompatible architecture, the generated wheel file will not be compatible. Our last-ditch effort is to build the package again while severing its ability to find the C compiler. If the C dependencies were optional, the build will fall back to pure Python and produce a valid pure Python wheel.
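The two build passes (a normal build, then a retry with no C compiler) might look like the sketch below. It assumes the package's build honours the `CC` environment variable, which is common for setuptools-based builds but not guaranteed, so pointing `CC` at a nonexistent path is only one possible way to hide the compiler. The helper names and the path `/nonexistent/cc` are hypothetical; `pip wheel`, `--no-deps` and `--wheel-dir` are standard.

```python
import os
import sys


def pip_wheel_args(sdist_path, wheel_dir):
    """Return the argv that builds a single sdist into a wheel."""
    return [
        sys.executable, "-m", "pip", "wheel",
        "--no-deps",
        "--wheel-dir", wheel_dir,
        sdist_path,
    ]


def build_env(allow_c_compiler=True, base_env=None):
    """Return a copy of the environment, optionally hiding the C compiler."""
    env = dict(os.environ if base_env is None else base_env)
    if not allow_c_compiler:
        # Assumption: the build reads CC; a bogus path makes compilation fail,
        # so packages with optional C extensions fall back to pure Python.
        env["CC"] = "/nonexistent/cc"
    return env


if __name__ == "__main__":
    print(" ".join(pip_wheel_args("legacy-0.4.tar.gz", "wheels")))
    print(build_env(allow_c_compiler=False, base_env={"PATH": "/usr/bin"}))
```

A caller would run the command once with the default environment, classify the resulting wheel, and only if it is incompatible run it again with `env=build_env(allow_c_compiler=False)`.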
There is still the case where the setup.py has been written in a way that is incompatible with Python's setuptools, causing the resulting wheel to lie about its compatibility. To handle this we keep a hand-curated list of packages that will work despite claiming otherwise.
At this point there is nothing we can do about any missing wheel files. We tried downloading a compatible version directly and building from source. All we can do here is report that we could not build this dependency and the bundle is missing some dependencies.
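The bookkeeping behind each "re-count" and the final report can be sketched as a set difference between what was downloaded and what now has a compatible wheel. This is an illustrative sketch with hypothetical function names; it assumes simple `name-version` filenames and normalizes project names in the style used by package indexes.

```python
import re


def normalize(name):
    """Lower-case a project name and collapse runs of -, _ and . into -."""
    return re.sub(r"[-_.]+", "-", name).lower()


def identify(filename):
    """Return (normalized name, version) for a wheel or sdist filename."""
    if filename.endswith(".whl"):
        # Wheel names start with {name}-{version}-...
        name, version = filename.split("-")[:2]
    else:
        stem = re.sub(r"\.(tar\.gz|tar\.bz2|zip)$", "", filename)
        name, version = stem.rsplit("-", 1)
    return normalize(name), version


def missing_dependencies(downloaded, compatible_wheels):
    """Packages from the first pass that still lack a compatible wheel."""
    needed = {identify(f) for f in downloaded}
    satisfied = {identify(f) for f in compatible_wheels}
    return sorted(needed - satisfied)


if __name__ == "__main__":
    downloaded = ["Native_Pkg-2.0.tar.gz", "purepkg-1.0-py3-none-any.whl", "other-3.1.zip"]
    wheels = [
        "purepkg-1.0-py3-none-any.whl",
        "native_pkg-2.0-cp312-cp312-manylinux2014_x86_64.whl",
    ]
    print(missing_dependencies(downloaded, wheels))
```

Anything still returned after the last build attempt is what gets reported to the user as missing from the bundle.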
For each wheel file that has been built we install it into the target bundle
directory at the top level. This makes it importable, assuming the top-level
bundle has an __init__.py and is on the PYTHONPATH.
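Because a wheel is a zip archive, the install step can be approximated by extracting each wheel into the bundle directory. The sketch below is deliberately minimal and the function names are hypothetical; a complete installer would also need to relocate the contents of any `.data` directory inside the wheel, which is not handled here.

```python
import os
import zipfile


def install_wheel(wheel_path, target_dir):
    """Extract one wheel into the top level of the bundle directory."""
    os.makedirs(target_dir, exist_ok=True)
    with zipfile.ZipFile(wheel_path) as wheel:
        wheel.extractall(target_dir)


def install_wheels(wheel_paths, target_dir):
    """Extract every built or downloaded wheel into the same directory."""
    for wheel_path in wheel_paths:
        install_wheel(wheel_path, target_dir)
```

After extraction, each package directory sits directly under the target directory, next to the customer code.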
The dependencies should now be successfully installed in the target directory. All the temporary/intermediate files can now be deleted, including all the wheel files and sdists.