Does not work in Amazon Linux 2018.03 OS version #3

suriya opened this issue May 31, 2019 · 9 comments

@suriya

suriya commented May 31, 2019

In https://aws.amazon.com/blogs/compute/upcoming-updates-to-the-aws-lambda-execution-environment/ AWS announced an upgrade to the Lambda execution environment from Amazon Linux version 2017.03 to version 2018.03.

lambdalatex does not work in the newer environment 2018.03. The latexmk command in lambdalatex needs perl. However, /usr/bin/perl which was present in 2017.03 Lambda images is removed from 2018.03 images.

To use lambdalatex we need to come up with a way to make perl available in the Lambda function. I tried the layer mentioned in https://github.com/moznion/aws-lambda-perl5-layer
However, I got signal 11 while running latexmk. It is possible that that layer is built against 2017.03. I tried to build my own layer but got exit code 127 while invoking latexmk --version. I am not an expert in perl. Nor am I an expert in the texlive distribution. I am unable to make further progress.

Do you have any thoughts on how to make lambdalatex work on Amazon Linux 2018.03 OS version?
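
One way to confirm the diagnosis from inside the function is to check whether perl resolves on PATH before latexmk is invoked. This is a minimal sketch, not part of lambdalatex; `find_interpreter` is a hypothetical helper name. For context, exit code 127 is what a POSIX shell reports when a command cannot be found.

```python
import shutil


def find_interpreter(name="perl", path=None):
    """Return the full path of `name` if it is an executable on PATH, else None.

    `path` is an optional PATH-style string; when None, the process PATH is used.
    """
    return shutil.which(name, path=path)


if __name__ == "__main__":
    perl = find_interpreter()
    if perl is None:
        print("perl not found on PATH; latexmk cannot start")
    else:
        print("perl found at", perl)
```

Calling this at the top of the handler and printing the result puts the answer in the function's log output.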

@johnstrickler

johnstrickler commented Jun 20, 2019

That may explain why I'm getting an error that "document.pdf" is not found.

/usr/bin/env: perl: No such file or directory

{
  "errorMessage": "[Errno 2] No such file or directory: 'document.pdf'",
  "errorType": "FileNotFoundError",
  "stackTrace": [
    [
      "/var/task/lambda_function.py",
      47,
      "lambda_handler",
      "event['output_key'])"
    ],
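
The FileNotFoundError above hides the real failure: latexmk never ran, so no PDF was written. A possible way to surface the underlying error is to check the exit code and the output file before touching the PDF. This is a sketch under assumptions; `run_and_check` is a hypothetical helper and not part of the project's handler.

```python
import os
import subprocess


def run_and_check(cmd, expected_output, cwd=None):
    """Run `cmd` and return its combined stdout/stderr log.

    Raises RuntimeError carrying the log if the command exits non-zero
    or the expected output file does not exist afterwards.
    """
    result = subprocess.run(cmd, cwd=cwd,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    log = result.stdout.decode("utf-8", errors="replace")
    out_path = os.path.join(cwd or os.getcwd(), expected_output)
    if result.returncode != 0 or not os.path.exists(out_path):
        raise RuntimeError(
            f"{cmd[0]} failed (exit code {result.returncode}); log follows:\n{log}"
        )
    return log
```

In the handler this could wrap the latexmk call, for example `run_and_check(["latexmk", "-pdf", "document.tex"], "document.pdf", cwd="/tmp/latex")`, so the "perl: No such file or directory" message ends up in the error instead of a missing-file traceback.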

@wzard

wzard commented Jun 24, 2019

Ran into the same issue. The previous Lambda function works just fine. @suriya Were you able to figure out how to add perl support to the new env?

@suriya
Author

suriya commented Jun 24, 2019 via email

@Karandaras

@wzard @suriya same problem here, but I think I am on the right track.
I built a custom Perl layer, based on the linked Perl layer, that seems to work.

here is my Dockerfile for it:

FROM lambci/lambda:build

ARG PERL_VERSION
RUN yum install -y zip curl
RUN curl -L https://raw.githubusercontent.com/tokuhirom/Perl-Build/master/perl-build > /tmp/perl-build
RUN perl /tmp/perl-build ${PERL_VERSION} /opt/ -des -Dcf_by="Red Hat, Inc." -Darchname=x86_64-linux-thread-multi -Dusethreads -Duseithreads -Dusesitecustomize


WORKDIR /opt

And then just build and package the stuff:

docker build --rm --build-arg PERL_VERSION=5.16.3 -t perllayer .
docker run --rm -it -v "YOUR OUTPUT DIRECTORY HERE":/var/host perllayer zip --symlinks -r -9 /var/host/PerlLayer.zip .

Tried some simple cases and it did the job, but needs some more testing after the weekend. Feel free to try it out.
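
Before uploading, it may be worth checking that the archive has the layout Lambda expects. The sketch below assumes Lambda unpacks layer contents under /opt, so an entry named bin/perl would end up at /opt/bin/perl. `layer_contains` is a hypothetical helper, and PerlLayer.zip is the file name from the command above.

```python
import zipfile


def layer_contains(zip_path, member="bin/perl"):
    """Return True if the archive has an entry named `member`.

    A leading './' on entry names is ignored so both 'bin/perl' and
    './bin/perl' match.
    """
    with zipfile.ZipFile(zip_path) as zf:
        names = {n[2:] if n.startswith("./") else n for n in zf.namelist()}
    return member in names


if __name__ == "__main__":
    print(layer_contains("PerlLayer.zip"))
```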

Have a nice weekend

@nhoffman

@samoconnor - thanks a lot for this - it was just what I needed. And thanks to @Karandaras for the comment above. I can confirm that adding the perl runtime in this way works. Layers would be better, but in the short term I just added the perl dependencies directly to the same zip file. My minor modifications:

Remove /man from the docker image to reduce the zip file size:

RUN rm -r /opt/man

and include the paths to the perl executable and libs in the script:

    os.environ['PATH'] += ":/var/task/bin/"
    os.environ['PATH'] += ":/var/task/texlive/2019/bin/x86_64-linux/"

    os.environ['PERL5LIB'] = '/var/task/lib/perl5/5.16.3/'
    os.environ['PERL5LIB'] += ":/var/task/texlive/2019/tlpkg/TeXLive/"

As an aside, the approach used in this project to install texlive does not pin to a version, so the user needs to be sure to update the path to texlive accordingly. In my version I'll plan to parameterize both this and the perl version.
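
A rough idea of what that parameterization could look like: build the environment from the TeX Live year and the perl version instead of hard-coding them. This is only a sketch; `latex_env` and its parameter names are hypothetical, and the directory layout is assumed to match the paths shown above.

```python
def latex_env(base_env, task_root="/var/task",
              texlive_year="2019", perl_version="5.16.3"):
    """Return a copy of `base_env` with PATH and PERL5LIB set for latexmk.

    Paths follow the layout used in this thread; ':' is used as the
    separator because the Lambda environment is Linux.
    """
    env = dict(base_env)
    extra_path = [
        f"{task_root}/bin",
        f"{task_root}/texlive/{texlive_year}/bin/x86_64-linux",
    ]
    # Keep any existing PATH first, then append the bundled tools.
    parts = [p for p in [env.get("PATH", "")] + extra_path if p]
    env["PATH"] = ":".join(parts)
    env["PERL5LIB"] = ":".join([
        f"{task_root}/lib/perl5/{perl_version}",
        f"{task_root}/texlive/{texlive_year}/tlpkg/TeXLive",
    ])
    return env
```

The result can be passed straight to the compile step, for example `subprocess.run([...], env=latex_env(os.environ))`, which also avoids mutating `os.environ` on every invocation.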

@johnstrickler

Thanks @Karandaras, I was able to get my lambda function to work again. Also thanks @nhoffman for the optimizations.

My lambda execution time was 7.6 seconds using 128mb for a simple latex document. Not too shabby.

@suriya
Author

suriya commented Aug 16, 2019

@Karandaras Worked for me as well. Thank you!

@kevcam4891

kevcam4891 commented Sep 4, 2020

Thanks to all for this support in this area specifically. It took me about 3 hours to go from git clone to producing a PDF of my own in lambda/latex. The perl issue was certainly the toughest nut to crack, particularly because you can go about it in a number of ways (standalone layer, include in the latex lambda itself) but this guidance was very helpful.

Including both latex and perl in one lambda is difficult. I already had to strip down latex to NO extra packages, etc to get everything in under the 50MB limit. I'm thinking splitting this up into Perl in its own layer will ultimately be the way to go.
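
When trimming a bundle to fit under the limit, it helps to see which directories are largest before deciding what to delete. A small sketch, with hypothetical helper names; it reports uncompressed sizes, so the zipped package will be smaller than the totals shown.

```python
import os


def dir_size(path):
    """Total size in bytes of all regular files under `path` (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def largest_subdirs(path, top=10):
    """Return up to `top` (size_in_bytes, name) pairs for the immediate subdirectories, largest first."""
    sizes = []
    for entry in os.scandir(path):
        if entry.is_dir(follow_symlinks=False):
            sizes.append((dir_size(entry.path), entry.name))
    return sorted(sizes, reverse=True)[:top]


if __name__ == "__main__":
    # /var/task is the bundle root used in this thread.
    for size, name in largest_subdirs("/var/task"):
        print(f"{size / 1e6:8.1f} MB  {name}")
```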

To anyone who wants to just get going with LaTeX and Perl together to prove out the concept, I'm attaching a Dockerfile below. It is simply a cookbook based on all the helpful comments above. It can probably be optimized even more, but this "works":

Dockerfile

FROM lambci/lambda:build-python3.6

# Install Perl
ARG PERL_VERSION
RUN yum install -y zip curl
RUN curl -L https://raw.githubusercontent.com/tokuhirom/Perl-Build/master/perl-build > /tmp/perl-build
RUN perl /tmp/perl-build ${PERL_VERSION} /opt/ -des -Dcf_by="Red Hat, Inc." -Darchname=x86_64-linux-thread-multi -Dusethreads -Duseithreads -Dusesitecustomize

# The TeXLive installer needs md5 and wget.
RUN yum -y install perl-Digest-MD5 && \
    yum -y install wget

RUN mkdir /var/src
WORKDIR /var/src

# Download TeXLive installer.
ADD http://mirror.ctan.org/systems/texlive/tlnet/install-tl-unx.tar.gz /var/src/
#RUN pwd && ls -lah /var/src
#COPY install-tl-unx.tar.gz /var/src/

# Minimal TeXLive configuration profile.
COPY texlive.profile /var/src/

# Install base TeXLive system.
RUN tar xf install*.tar.gz
RUN cd install-tl-* && \
    ./install-tl --profile ../texlive.profile
    # --location http://ctan.mirror.norbert-ruehl.de/systems/texlive/tlnet


ENV PATH=/var/task/texlive/2017/bin/x86_64-linux/:$PATH

# Install extra packages.
#RUN tlmgr install xcolor \
#                  tcolorbox \
#                  pgf \
#                  environ \
#                  trimspaces \
#                  etoolbox \
#                  booktabs \
#                  lastpage \
#                  pgfplots \
#                  marginnote \
#                  tabu \
#                  varwidth \
#                  makecell \
#                  enumitem \
#                  setspace \
#                  xwatermark \
#                  catoptions \
#                  ltxkeys \
#                  framed \
#                  parskip \
#                  endnotes \
#                  footmisc \
#                  zapfding \
#                  symbol \
#                  lm \
#                  sectsty \
#                  stringstrings \
#                  koma-script \
#                  multirow \
#                  calculator \
#                  adjustbox \
#                  xkeyval \
#                  collectbox \
#                  siunitx \
#                  l3kernel \
#                  l3packages \
#                  helvetic \
#                  charter

# Install latexmk.
RUN tlmgr install latexmk

# Remove LuaTeX.
RUN tlmgr remove --force luatex

# Remove large unneeded files.
RUN rm -rf /var/task/texlive/2017/tlpkg/texlive.tlpdb* \
           /var/task/texlive/2017/texmf-dist/source/latex/koma-script/doc \
           /var/task/texlive/2017/texmf-dist/doc 

RUN mkdir -p /var/task/texlive/2017/tlpkg/TeXLive/Digest/ && \
    mkdir -p /var/task/texlive/2017/tlpkg/TeXLive/auto/Digest/MD5/ && \
    cp /usr/lib64/perl5/vendor_perl/Digest/MD5.pm \
      /var/task/texlive/2017/tlpkg/TeXLive/Digest/ && \
    cp /usr/lib64/perl5/vendor_perl/auto/Digest/MD5/MD5.so \
      /var/task/texlive/2017/tlpkg/TeXLive/auto/Digest/MD5

# Remove perl libraries that don't get used so we can get under the 50 MB limit
RUN rm -rf /opt/lib/perl5/5.16.3/x86_64-linux-thread-multi/auto/Encode/CN
RUN rm -rf /opt/lib/perl5/5.16.3/x86_64-linux-thread-multi/auto/Encode/JP
RUN rm -rf /opt/lib/perl5/5.16.3/x86_64-linux-thread-multi/auto/Encode/KR
RUN rm -rf /opt/lib/perl5/5.16.3/x86_64-linux-thread-multi/auto/Encode/TW

FROM lambci/lambda:build-python3.6

WORKDIR /var/task

ENV PATH=/var/task/texlive/2017/bin/x86_64-linux/:/opt/:$PATH
ENV PERL5LIB=/var/task/texlive/2017/tlpkg/TeXLive/

COPY --from=0 /var/task/ /var/task/
COPY --from=0 /opt/bin/perl /var/task/bin/
COPY --from=0 /opt/lib /var/task/lib
COPY lambda_function.py /var/task

RUN ls -lah /var/task

lambda_function.py
Mostly like the author's, but adding the extra PERL5LIB and PATH paths.

import os
import io
import shutil
import subprocess
import base64
import zipfile
import boto3

def lambda_handler(event, context):
    
    # Extract input ZIP file to /tmp/latex...
    shutil.rmtree("/tmp/latex", ignore_errors=True)
    os.mkdir("/tmp/latex")

    print(event)

    if 'input_bucket' in event:
        r = boto3.client('s3').get_object(Bucket=event['input_bucket'],
                                          Key=event['input_key'])
        zip_bytes = r["Body"].read()
    else:
        zip_bytes = base64.b64decode(event["input"])

    z = zipfile.ZipFile(io.BytesIO(zip_bytes))
    z.extractall(path="/tmp/latex")

    os.environ['PATH'] += ":/var/task/bin"
    os.environ['PATH'] += ":/var/task/texlive/2017/bin/x86_64-linux/"
    os.environ['HOME'] = "/tmp/latex/"

    os.environ['PERL5LIB'] = "/var/task/lib/perl5/5.16.3/"
    os.environ['PERL5LIB'] += ":/var/task/texlive/2017/tlpkg/TeXLive/"

    os.chdir("/tmp/latex/")

    # Run latexmk (which drives pdflatex)...
    r = subprocess.run(["latexmk",
                        "-verbose",
                        "-interaction=batchmode",
                        "-pdf",
                        "-output-directory=/tmp/latex",
                        "document.tex"],
                       stdout=subprocess.PIPE,
                       stderr=subprocess.STDOUT)
    print(r.stdout.decode('utf_8'))

    if "output_bucket" in event:
        boto3.client('s3').upload_file("document.pdf",
                                       event['output_bucket'],
                                       event['output_key'])
        return {
            "stdout": r.stdout.decode('utf_8')
        }

    else:
        # Read "document.pdf"...
        with open("document.pdf", "rb") as f:
            pdf = f.read()

        # Return base64 encoded pdf and stdout log from pdflatex...
        return {
            "output": base64.b64encode(pdf).decode('ascii'),
            "stdout": r.stdout.decode('utf_8')
        }

Test Event: Once you've uploaded your lambda, paste this into the "Configure test event" panel in the Lambda console.

{
  "input": "UEsDBBQACAAIAHM5JFEAAAAAAAAAAFgAAAAMACAAZG9jdW1lbnQudGV4VVQNAAfqIFJfOiNSX/5NUl91eAsAAQT1AQAABBQAAACLSclPLs1NzStJzkksLo42NCwo0clJLSlJLSpILEgtiq1OLCrJTM5JreWKSUpNz8yrhqmv5fJIzcnJVwjPL8pJUeSKSc1LQZLjAgBQSwcIlMFj5UsAAABYAAAAUEsBAhQDFAAIAAgAczkkUZTBY+VLAAAAWAAAAAwAIAAAAAAAAAAAAKSBAAAAAGRvY3VtZW50LnRleFVUDQAH6iBSXzojUl/+TVJfdXgLAAEE9QEAAAQUAAAAUEsFBgAAAAABAAEAWgAAAKUAAAAAAA=="
}
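
The "input" value above is a base64-encoded ZIP archive containing document.tex, which is what the handler decodes and extracts. To test with your own document, an event of the same shape can be generated along these lines. `make_event` is a hypothetical helper and the sample LaTeX source is only an illustration.

```python
import base64
import io
import json
import zipfile


def make_event(tex_source, name="document.tex"):
    """Zip `tex_source` in memory under `name` and return a test event dict."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, tex_source)
    return {"input": base64.b64encode(buf.getvalue()).decode("ascii")}


if __name__ == "__main__":
    tex = ("\\documentclass{article}\n"
           "\\begin{document}\n"
           "Hello World!\n"
           "\\end{document}\n")
    print(json.dumps(make_event(tex), indent=2))
```

The file must be named document.tex because that is the name the handler passes to latexmk.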

@lpinilla

I would like to point out a quicker solution:

Using the layer from ARN arn:aws:lambda:us-east-1:445285296882:layer:perl-5-34-runtime-al2-x86_64:4 (grabbed from here) works.

The repo's input example took about 1 minute to run. I don't know whether that is the layer's fault or Python's, but it worked.

I would also like to point out that this layer ARN comes from the internet, and you shouldn't trust it blindly even though it works. Please don't use it for sensitive documents; for a long-term solution, build your own perl layer.
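
As a small aid for vetting a third-party layer, the ARN itself shows which account publishes it and which region it lives in. As far as I know a layer has to be in the same region as the function that uses it, so the region field is worth checking too. The sketch below just splits the ARN; `parse_layer_arn` is a hypothetical helper, not an AWS API.

```python
def parse_layer_arn(arn):
    """Split a Lambda layer version ARN into its fields.

    Expected shape: arn:<partition>:lambda:<region>:<account>:layer:<name>:<version>
    """
    parts = arn.split(":")
    if (len(parts) != 8 or parts[0] != "arn"
            or parts[2] != "lambda" or parts[5] != "layer"):
        raise ValueError(f"not a Lambda layer version ARN: {arn!r}")
    return {
        "region": parts[3],
        "account_id": parts[4],
        "name": parts[6],
        "version": int(parts[7]),
    }


if __name__ == "__main__":
    info = parse_layer_arn(
        "arn:aws:lambda:us-east-1:445285296882:layer:perl-5-34-runtime-al2-x86_64:4")
    print(info)
```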
