Skip to content

Latest commit

 

History

History
61 lines (40 loc) · 3.82 KB

README.md

File metadata and controls

61 lines (40 loc) · 3.82 KB

No more index.html mess with AWS CloudFront/S3

Problem

Consider you have a statically generated site — a bunch of usual resources, including html files. You test this site locally with a development web server for convenience, and everything works. You set up a private S3 bucket plus a CloudFront distribution authorized to access this bucket to expose site at your domain and get caching benefits.

But once you upload your resources, you run into an issue: your site relies on relative links like /section/, expecting that you'll get contents of /section/index.html file in response — after all, lots of web servers implement this logic as their default behavior — but accessing such relative link over CloudFront returns a 403 error page. After some troubleshooting you figure out that configuring index.html as a default root object on the CloudFront distribution only really works for root object, and does not work with subdirectories.

If this scenario looks familiar, then this tool can help you.

Solution

Command cloudfront-autoindex is an AWS Lambda that processes S3 ObjectCreated events looking for objects with /index.html suffix in their name, and makes copies of those objects with /index.html and index.html suffixes stripped.

This way if you have a directory doc with an index.html file in it, and you upload that directory to an S3 bucket fronted by CloudFront, you can then see your page not only by accessing https://example.org/doc/index.html, but also at https://example.org/doc and https://example.org/doc/, as this Lambda creates two copies of doc/index.html key under doc and doc/ keys.

Setup

Build and compress:

GOOS=linux GOARCH=amd64 go build -o main
zip -9 lambda.zip main

Create a new AWS Lambda, picking the "Go 1.x" runtime. Change its handler name from default "hello" to "main" (binary name you built above), and upload lambda.zip file.

It requires the usual permissions, e.g. AWSLambdaBasicExecutionRole AWS-managed role, plus these permissions to work with S3 (optionally limit with your bucket names):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "CopyObject",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject"
            ],
            "Resource": "arn:aws:s3:::*/*"
        }
    ]
}

Once Lambda is created and configured, go to S3 bucket settings, Properties → Event notifications → Create event notification. Enter index.html as a Suffix, and select s3:ObjectCreated events except for the s3:ObjectCreated:Copy. Pick your Lambda function created above as event destination.

Caveats

Relative Links

If an index.html page has relative links to other resources, such links may appear broken depending on which path you access such a page.

For example, if you have an /years/index.html page that links to a 2021.html file, such link will be resolved to /years/2021.html when you access the page /years/index.html or its automatically created copy at /years/. But if you access an automatically created copy at /years path, the browser will resolve such a relative link to just /2021.html.

Consider using the <base> element to lock the base path for relative links.

Canonical Path

Search engines may get confused on which page address is the primary one in the presence of such multiple copies. Their primary source choice may be different from yours. Use the canonical link element on your page to set its main address, like <link rel="canonical" href="…">.