AWS SPA routing — The bad, the ugly, and the uglier

October 5, 2022

TL;DR: Routing with AWS is hard.

There are many kinds of web applications: backend, frontend, full-stack, e-commerce, blogs, serverless, and the list goes on. While each is unique in its own right, a large majority of them need the same common functionality.

To this day, accomplishing some of this common “boilerplate” functionality using major cloud providers remains shrouded in mystery…

Let’s begin our journey to full-stack (API + SPA) application routing on AWS.

Context

  • We have a full-stack application, consisting of an API-only backend service and a single-page application (SPA) frontend.
  • All services should be accessible from the same domain but under different paths (e.g. /api/)
  • The frontend application will be served from an S3 bucket

Objectives

  • Frontend 404 bucket errors should ultimately result in a 200 status and serve the app index
  • Backend 404 errors should not be modified

We’re starting with a CloudFront distribution that has two origins: one is the S3 bucket (the frontend SPA) and the other is the backend API service. In this state routing works, but we see 404 “key not found” errors from the frontend (at any route other than /index.html).
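As a rough mental model of that starting point, the routing can be pictured as an ordered list of path patterns with a default. The sketch below is a toy simulation, not CloudFront's actual matcher; the origin IDs `backend-api` and `frontend-s3` and the `/api/*` pattern are placeholder names chosen for illustration.

```javascript
// Toy model of path-based routing between two origins.
// Origin IDs and the path pattern are illustrative placeholders.
const behaviors = [
    { pathPattern: '/api/*', targetOriginId: 'backend-api' },
];
const defaultOriginId = 'frontend-s3';

// Simplified matcher: only supports a single trailing "*" wildcard.
function matches(pattern, uri) {
    if (pattern.endsWith('*')) {
        return uri.startsWith(pattern.slice(0, -1));
    }
    return uri === pattern;
}

// First matching behavior wins; everything else goes to the default origin.
function pickOrigin(uri) {
    const hit = behaviors.find((b) => matches(b.pathPattern, uri));
    return hit ? hit.targetOriginId : defaultOriginId;
}

console.log(pickOrigin('/api/users'));  // backend-api
console.log(pickOrigin('/dashboard'));  // frontend-s3
```

A request for /dashboard lands on the bucket, which has no object under that key, and that is where the 404 comes from.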

Attempt #1

CloudFront allows some custom error handling and it seems pretty straightforward, so I tried something like this:
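For illustration, a distribution-wide custom error response along these lines produces the behavior described next. It is a sketch of the `CustomErrorResponses` section of a CloudFront distribution config, written as a JavaScript object; the `/index.html` path and the TTL value are assumptions, not the exact settings used.

```javascript
// Sketch of the CustomErrorResponses section of a CloudFront
// distribution config. Values are assumptions for illustration.
const customErrorResponses = {
    Quantity: 1,
    Items: [
        {
            ErrorCode: 404,                  // status returned by an origin
            ResponsePagePath: '/index.html', // page served instead
            ResponseCode: '200',             // status sent to the viewer
            ErrorCachingMinTTL: 10,          // minimum seconds the error status is cached
        },
    ],
};

console.log(JSON.stringify(customErrorResponses, null, 2));
```

Nothing in this structure names an origin, so the mapping applies to every 404 the distribution sees.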

This does the trick for the frontend, but now backend 404 errors are swallowed and instead we get a 200 response along with the frontend app index.

Attempt #2

Since CloudFront doesn’t allow that custom error behavior to be specified per origin, my next idea was to rely on the S3 bucket settings. I set an error page for the frontend:

This didn’t actually change any routing behavior, and I later learned that bucket-level website rules are mostly ignored when CloudFront uses the bucket as a plain S3 origin.

To work around that limitation, I replaced the S3 origin with a custom origin pointing to the website endpoint of the S3 bucket:
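A bucket-level error page of this kind corresponds to the bucket's static website configuration. A minimal sketch, in the shape accepted by the S3 `PutBucketWebsite` API and assuming the app entry point is `index.html`:

```javascript
// Sketch of an S3 static website configuration.
// Using 'index.html' as both index and error document is an assumption.
const websiteConfiguration = {
    IndexDocument: { Suffix: 'index.html' },
    // Returned by the website endpoint when a key is not found;
    // the HTTP status of that response is still an error status.
    ErrorDocument: { Key: 'index.html' },
};

console.log(JSON.stringify(websiteConfiguration, null, 2));
```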
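In distribution-config terms, the swap looks roughly like this. The bucket name and region in `DomainName` are placeholders and the origin ID is hypothetical. S3 website endpoints accept HTTP only, hence the `http-only` protocol policy.

```javascript
// Sketch of a CloudFront custom origin that targets an S3 *website*
// endpoint instead of the bucket's REST endpoint.
// Bucket name, region and origin ID are placeholders.
const frontendOrigin = {
    Id: 'frontend-s3-website',
    DomainName: 'example-bucket.s3-website-us-east-1.amazonaws.com',
    CustomOriginConfig: {
        HTTPPort: 80,
        HTTPSPort: 443,
        // S3 website endpoints do not support HTTPS
        OriginProtocolPolicy: 'http-only',
    },
};

console.log(JSON.stringify(frontendOrigin, null, 2));
```

Because the website endpoint is what applies the bucket's index and error documents, routing through it lets the error page configured on the bucket take effect.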

This seems closer to what we want:

  • Frontend 404s ultimately return the frontend app index
  • Backend 404s remain unmodified

The issue here is that while the frontend returns the index correctly, it still returns a 404 status code.

Attempt #3

One thing I was trying to avoid here was adding unnecessary complexity… 

After a decent amount of research, I came to the conclusion that I couldn’t avoid using a Lambda function for this. Lambda functions can act as a sort of “middleware” for CloudFront requests and responses. Since this is fairly common functionality, there were plenty of examples of what such a function would look like:


'use strict';

// HTTPS client; the name `http` is kept because httpGet below refers to it
const http = require('https');

const indexPage = 'index.html';

exports.handler = async (event) => {
    const cf = event.Records[0].cf;
    const request = cf.request;
    const response = cf.response;
    const statusCode = response.status;

    // Only replace 403 and 404 responses, which are typically received
    // when loading a page for a SPA that uses client-side routing
    const doReplace = request.method === 'GET'
                    && (statusCode === '403' || statusCode === '404');

    // An async handler resolves with its return value,
    // so the callback argument is not needed
    return doReplace
        ? await generateResponseAndLog(cf, request, indexPage)
        : response;
};

async function generateResponseAndLog(cf, request, indexPage){
    
    const domain = cf.config.distributionDomainName;
    const appPath = getAppPath(request.uri);
    const indexPath = `/${appPath}/${indexPage}`;
    
    const response = await generateResponse(domain, indexPath);
    
    console.log('response: ' + JSON.stringify(response));
    
    return response;
}

async function generateResponse(domain, path){
    try {
        // Load HTML index from the CloudFront cache
        const s3Response = await httpGet({ hostname: domain, path: path });

        const headers = s3Response.headers || 
            {
                'content-type': [{ value: 'text/html;charset=UTF-8' }]
            };
            
        return {
            status: '200',
            headers: wrapAndFilterHeaders(headers),
            body: s3Response.body
        };
    } catch (error) {
        return {
            status: '500',
            headers:{
                'content-type': [{ value: 'text/plain' }]
            },
            body: 'An error occurred loading the page'
        };
    }
}

function httpGet(params) {
    return new Promise((resolve, reject) => {
        http.get(params, (resp) => {
            console.log(`Fetching ${params.hostname}${params.path}, status code : ${resp.statusCode}`);
            let result = {
                headers: resp.headers,
                body: ''
            };
            resp.on('data', (chunk) => { result.body += chunk; });
            resp.on('end', () => { resolve(result); });
        }).on('error', (err) => {
            console.log(`Couldn't fetch ${params.hostname}${params.path} : ${err.message}`);
            reject(err);
        });
    });
}

// Get the app path segment e.g. candidates.app, employers.client etc
function getAppPath(path){
    if(!path){
        return '';
    }
    
    if(path[0] === '/'){
        path = path.slice(1);
    }
    
    const segments = path.split('/');
    
    // will always have at least one segment (may be empty)
    return segments[0];
}

// Cloudfront requires header values to be wrapped in an array
function wrapAndFilterHeaders(headers){
    const allowedHeaders = [
        'content-type',
        'content-length',
        'last-modified',
        'date',
        'etag'
    ];
    
    const responseHeaders = {};
    
    if(!headers){
        return responseHeaders;
    }
    
    for (const propName in headers) {
        // only include allowed headers
        if (allowedHeaders.includes(propName.toLowerCase())) {
            const header = headers[propName];

            if (Array.isArray(header)) {
                // assume already in the 'wrapped' format
                responseHeaders[propName] = header;
            } else {
                // wrap a plain value in the required format
                responseHeaders[propName] = [{ value: header }];
            }
        }
    }

    return responseHeaders;
}

Wow. Just wow. So it turns out the body of the response is not exposed to the Lambda function.

This means that we’ll need to replace the 404 status with a 200 AND make a separate request to fetch the frontend app index (the example above requests it back through the CloudFront distribution), then use it to populate the response body.

I’m sure this would have worked but it just seemed a bit much. Lots of moving pieces to accomplish what I felt should be a fairly simple thing.

Attempt #4 (The solution)

After some more research, I learned that although the body of the response is not exposed to the Lambda function, it passes through unchanged as long as the function doesn’t modify it in any way.

The final solution for me was a combination of attempt #2 and attempt #3. The main issue with #2 was that the frontend still returned a 404 status, so now the Lambda function can be simplified to handle just the status:

'use strict';

exports.handler = async (event) => {
    const response = event.Records[0].cf.response;

    // Rewrite only the status line. The body is left untouched, so the
    // index document returned by the S3 website endpoint passes through.
    if (response.status === '404') {
        response.status = '200';
        response.statusDescription = 'OK';
    }

    console.log('response: ' + JSON.stringify(response));

    // An async handler resolves with its return value; no callback needed
    return response;
};
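
One way to sanity-check this logic locally is to feed it a hand-built event. The snippet below is a sketch under stated assumptions: `rewriteNotFound` is a synchronous stand-in copy of the handler logic so the example runs on its own, and `mockEvent` builds a hypothetical, heavily trimmed event that carries only the fields the handler reads.

```javascript
'use strict';

// Synchronous stand-in for the status-only handler logic.
function rewriteNotFound(event) {
    const response = event.Records[0].cf.response;
    if (response.status === '404') {
        response.status = '200';
        response.statusDescription = 'OK';
    }
    return response;
}

// Hypothetical, heavily trimmed event: only the fields used above.
function mockEvent(status) {
    return {
        Records: [{
            cf: {
                request: { method: 'GET', uri: '/dashboard' },
                response: { status, statusDescription: '', headers: {} },
            },
        }],
    };
}

const rewritten = rewriteNotFound(mockEvent('404'));
const untouched = rewriteNotFound(mockEvent('500'));

console.log(rewritten.status);    // 200
console.log(untouched.status);    // 500
console.log('body' in rewritten); // false
```

The function never sets a `body` field, which is the condition under which the origin's body is passed through to the viewer.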


Recap

This story is one of frustration and persistence. Don’t get me wrong, AWS is great and provides seemingly endless solutions to meet your infrastructure needs. These numerous solutions come with the caveat that the “best” or “correct” way isn’t always clear, even for problems that are far from unique (e.g. routing). 

At Coherence, we’re doing this kind of work across the development lifecycle so you can focus on what really matters: your actual application and business logic.