lamplightdev

A simple ServiceWorker app

The lack of offline access for the web is often touted as one of the major drawbacks for web apps over native apps. The appcache has been around for a while and does offer offline capability but is well documented to have its limitations (to put it politely).

However, a newly proposed standard called Service Workers is here to enable more powerful offline access for web apps (amongst other things, such as push notifications and background data synchronization).

A lot of helpful material has already been written on what Service Workers can achieve and how to use them.

The app - R3SEARCH

I thought it might be helpful to build my own simple example app to see how it all works in practice. It’s a search interface to wikipedia that shows a snippet of an article along with an image if available. The articles and images are cached using a service worker so you can view them at a later date without needing network access. Granted it’s not terribly useful but it should help explore the basic concepts.

There’s a working demo, and the source code is available on github. To make things as simple as possible I’ve written the app with no server side code (all API calls are made client side), and with no external dependencies, preprocessors or build tools - just standard HTML, CSS and JavaScript - so should you wish to play with it you can just clone the code and get running straight away on a local web server (or as I’ve done on github pages.)

Prerequisites

  • HTTPS must be running on your server - for security reasons Service Workers must be served over a secure connection. Alternatively servers accessed locally (on localhost, 127.0.0.1, etc.) are whitelisted to simplify local testing so you could use http://localhost/appname for example.

  • Download the cache-polyfill. This is a polyfill for part of the spec that Chrome doesn’t yet ship.

The Service Worker

The Service Worker acts as a proxy to the network - in other words all network requests made by the app (css, images, local javascript, remote 3rd party javascript, web fonts and so on) can be passed to the worker before being passed on to the browser (if necessary). This is very powerful as it means we can do anything with the request, but in terms of offline access it means the first time a resource is requested we can get the worker to pass it on to the network, cache the result and send that result back to our app. On subsequent requests we can get the worker to send the cached version back without having to ask the network for the same resource again.

Browser support

At the time of writing, only Chrome 40+ (current Beta and Canary editions) and Opera 28+ (current Developer edition) have an implementation of Service Workers, but full support should be landing in the stable versions early this year. Firefox should also have an implementation soon, while IE and Safari are yet to decide if it’s for them. More information is available at ‘is service worker ready?’.

I’ve used Chrome as a reference in this post, but the app should also work in Opera Developer edition and will work (without offline access) in the latest versions of all browsers as the service worker is just ignored and all network requests will be handled by the browser as normal.

The service worker lives in its own file and is referenced from the rest of our script as follows:

// Register our ServiceWorker
if (navigator.serviceWorker) {
  navigator.serviceWorker.register("/r3search/worker.js", {
    scope: "/r3search/"
  }).then(function (reg) {
    console.log("SW register success", reg);
  }, function (err) {
    console.log("SW register fail", err);
  });
}

First we check that the browser supports service workers, then we register the worker ensuring we pass the correct path and scope. The register function returns a promise that takes success and failure callbacks. Note that the path to the worker file is based on the URL that it is accessed at, rather than its location in the file system.

On success we don’t need to do anything since the register function automatically kicks off the installation of the worker on completion.

Scope

The scope of a Service Worker limits which pages the worker controls, and so which requests it will intercept. It is specified as a parameter when registering the worker, and the worker file itself must be served from the root of that scope (or above). A few examples are in order:

  1. an app running directly at https://domain.com/ that wants to intercept any request will need the scope parameter set to / (or omitted) and the worker to be served at https://domain.com/worker.js.

  2. an app running at https://domain.com/appname that wants to only intercept requests from the appname subdirectory will need the scope parameter set to /appname/ and the worker served at https://domain.com/appname/worker.js (or https://domain.com/worker.js as this is ‘above’ the scope).
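To make the scope rules concrete, here is a minimal sketch of the two checks involved. It is not the browser's implementation: defaultMaxScope, isInScope and scopeAllowed are hypothetical helper names, and the sketch assumes that scope matching is a same-origin prefix comparison and that no server header widens the allowed scope.

```javascript
// Hypothetical helpers, for illustration only (not a browser API).

// The widest scope a worker script may claim by default: its own directory.
function defaultMaxScope(scriptURL) {
  return new URL("./", scriptURL).href;
}

// A URL falls within a scope when it is same-origin and starts with the scope URL.
function isInScope(scopeURL, url) {
  var scope = new URL(scopeURL);
  var target = new URL(url);
  return target.origin === scope.origin &&
    target.href.indexOf(scope.href) === 0;
}

// A requested scope is allowed when it sits at or below the script's directory.
function scopeAllowed(scriptURL, scopeURL) {
  return isInScope(defaultMaxScope(scriptURL), scopeURL);
}
```

With these, scopeAllowed("https://domain.com/worker.js", "https://domain.com/appname/") is true because the worker lives above the scope, while scopeAllowed("https://domain.com/appname/worker.js", "https://domain.com/") is false because the worker lives below it.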

Errors

  • If the path to the worker file is incorrect, Chrome will issue a ‘The Service Worker failed by network’ error.

  • If you serve the worker from a non-https location (other than localhost etc.), Chrome will issue an ‘Only secure origins are allowed’ error.

  • If the worker is served from a location that’s ‘below’ the scope (e.g. scope set to / and worker served from https://domain.com/appname/worker.js), Chrome will issue a ‘The scope must be under the directory of the script URL’ error.

  • If the worker fails to install, you won’t receive an error (yet.)

Life cycle

Once successfully registered the service worker goes through the following stages:

  • Install - occurs when the worker has successfully registered, but is not yet active (i.e. it’s not intercepting requests yet.) Any previously activated service workers will still be in control at this point. If there is any change in the service worker file between page reloads then the service worker is considered new and will go through this installation step.

  • Activate - occurs the first time the worker becomes active (i.e. is now able to intercept requests.) This doesn’t occur immediately after the install event, instead it will occur when the page is refreshed (this must be a hard refresh - shift + reload or close the tab and reopen it.) The spec does include a way to force the activate step to occur directly after the install event and so not requiring the hard refresh, but it isn’t implemented yet.

  • Fetch - occurs when a request is made within the current worker’s scope.

  • Message - occurs when a message is sent to it from external javascript (not covered in this post.)

  • Terminate - occurs when the browser needs to reclaim memory, and can happen at any time (outside of a request). The worker will be restarted as needed when a new request is made or message received (but won’t go back through the activate step.) The important thing to realise here is that while the worker will always intercept a request it is registered to catch (even if it needs to be restarted to do so), you can’t guarantee it will be around for any length of time. The consequence of this is that global state will not be preserved, so don’t rely on global vars within the worker file (instead use IndexedDB or the cache itself for persistence - localStorage is not available inside a worker.)
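One way to watch these stages from the page is to listen for statechange events on the worker object that registration hands back. The following is a minimal sketch rather than part of the app: watchWorkerState is a hypothetical helper name, and the sketch assumes the worker exposes a state property and fires statechange as it moves between stages.

```javascript
// Hypothetical helper: report the current state, then every later change.
function watchWorkerState(worker, onChange) {
  onChange(worker.state);
  worker.addEventListener("statechange", function () {
    onChange(worker.state);
  });
}

// Usage in a page, guarded so it only runs where service workers exist.
if (typeof navigator !== "undefined" && navigator.serviceWorker) {
  navigator.serviceWorker.register("/r3search/worker.js", {
    scope: "/r3search/"
  }).then(function (reg) {
    // Whichever worker the registration currently holds.
    var worker = reg.installing || reg.waiting || reg.active;
    if (worker) {
      watchWorkerState(worker, function (state) {
        console.log("SW state:", state);
      });
    }
  });
}
```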

The contents of the worker file is where we define how we want to process all network requests under our scope:

// Include SW cache polyfill
importScripts("/r3search/js/serviceworker-cache-polyfill.js");

First we import the cache polyfill mentioned previously. importScripts is a Service Worker function that loads the given URL into the current file. This step will become unnecessary in future releases of Chrome.

// Cache name definitions
var cacheNameStatic = "r3search-static-v4";
var cacheNameWikipedia = "r3search-wikipedia-v1";
var cacheNameTypekit = "r3search-typekit-v1";

var currentCacheNames = [
  cacheNameStatic,
  cacheNameWikipedia,
  cacheNameTypekit
];

Next we define our cache names. Service Workers enable multiple caches to be created to store the responses we want to cache. In this case we’ll use:

  • cacheNameStatic to cache our static assets - usually anything that’s needed to render the app in its initial state (html, js, css, images, etc.)
  • cacheNameWikipedia to cache responses from the wikipedia api, as well as images requested by those responses.
  • cacheNameTypekit to cache our webfonts.

By having separate caches we are able to handle different groups of requests in different ways.

// A new ServiceWorker has been registered
self.addEventListener("install", function (event) {
  event.waitUntil(
    caches.open(cacheNameStatic)
      .then(function (cache) {
        return cache.addAll([
          "/r3search/",
          "/r3search/js/app.js",
          "/r3search/css/app.css",
          "/r3search/img/loading.svg"
        ]);
      })
  );
});

Here we register a handler for the install event, which fires at the install stage. This is where we cache all the assets that we’ll keep in cacheNameStatic. event.waitUntil keeps the worker in the install stage until the promise passed to it has settled. In this case we pass the result of caches.open, which takes the cache name and returns a promise for the cache object, to which we then add all our assets. If all these assets are successfully downloaded then the service worker waits to be activated. If this step fails it will currently do so silently and the worker will be discarded.

// A new ServiceWorker is now active
self.addEventListener("activate", function (event) {
  event.waitUntil(
    caches.keys()
      .then(function (cacheNames) {
        return Promise.all(
          cacheNames.map(function (cacheName) {
            if (currentCacheNames.indexOf(cacheName) === -1) {
              return caches.delete(cacheName);
            }
          })
        );
      })
  );
});

Our activate function will be called after the worker has been installed and the page refreshed (using shift + reload) and before any requests are made. It’s best to do any clear up of old caches here, but not much else - any long running processes will delay the rendering of the page.

Here we check to see if there are any caches present that are not in our current list, and remove them if there are. Caches are persistent across service worker updates, so just having a new service worker will not invalidate any cached files - you must explicitly remove them as above if you do not want to reuse them. With the code above, changing a cache name from r3search-wikipedia-v1 to r3search-wikipedia-v2 will therefore force the old v1 cache to be removed, and create a new empty v2 cache.
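The selection step can be pulled out into a small function, which makes it easy to see what a version bump will remove. This is a sketch, not the app's code: staleCacheNames is a hypothetical helper, and the optional prefix argument is an assumption added here so that caches belonging to other apps on the same origin (possible on a shared host such as github pages) are left alone.

```javascript
// Hypothetical helper: cache names that exist but are no longer wanted.
// When a prefix is given, only names starting with it are considered ours.
function staleCacheNames(existingNames, currentNames, prefix) {
  return existingNames.filter(function (name) {
    var ours = !prefix || name.indexOf(prefix) === 0;
    return ours && currentNames.indexOf(name) === -1;
  });
}

// Example: the wikipedia cache has been bumped from v1 to v2.
var existing = [
  "r3search-static-v4",
  "r3search-wikipedia-v1",
  "r3search-typekit-v1",
  "otherapp-v1" // assumed to belong to a different app on the same origin
];
var current = [
  "r3search-static-v4",
  "r3search-wikipedia-v2",
  "r3search-typekit-v1"
];

console.log(staleCacheNames(existing, current, "r3search-"));
```

In the activate handler, each name returned would then be passed to caches.delete inside Promise.all, as in the code above.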

Instances when you might want to update a cache are:

  • You have updated your logo, css, layout or javascript - update the cacheNameStatic version.
  • You have changed your external webfont - update the cacheNameTypekit version.

// The page has made a request
self.addEventListener("fetch", function (event) {
  var requestURL = new URL(event.request.url);

  event.respondWith(
    caches.match(event.request)
      .then(function (response) {

        // we have a copy of the response in our cache, so return it
        if (response) {
          return response;  //no network request necessary
        }

        var fetchRequest = event.request.clone();

        return fetch(fetchRequest).then(  //
          //handle the response from the network (see next code block)
        );

      })
  );
});

Now we get to the heart of the worker - the fetch event that is fired every time a request is made under our scope. First we create a new URL object so we can analyse it later. Then respondWith takes a promise and uses whatever it resolves to as the response to the request. If respondWith is not called for a request then the browser will continue as normal and fetch the response itself.

We then check all of our caches (caches.match) for a matching request. If we find a match then great - we return it and don’t need to bother the network for another request. If we don’t find a match then we clone the request (so that we work on a copy of it, and leave the original request intact so the browser still has access to it) and fetch it fresh from the network. How we handle the response to that request depends on what the request and response look like:
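The cache-then-network decision can also be separated from the event wiring, which makes it possible to try out with stand-in objects outside a browser. This is a minimal sketch under that assumption: cacheFirst is a hypothetical helper, cacheStore is anything with a match method that returns a promise, and fetchFn is anything with a fetch-like signature. It deliberately leaves out the step of storing the network response.

```javascript
// Hypothetical helper: answer from the cache when possible,
// otherwise fall back to the network.
function cacheFirst(request, cacheStore, fetchFn) {
  return cacheStore.match(request).then(function (cached) {
    if (cached) {
      return cached; // cache hit, the network is never contacted
    }
    return fetchFn(request); // cache miss, ask the network
  });
}

// Wiring inside a worker, guarded so the file also loads outside one.
if (typeof self !== "undefined" && typeof caches !== "undefined") {
  self.addEventListener("fetch", function (event) {
    event.respondWith(
      cacheFirst(event.request, caches, function (request) {
        return fetch(request);
      })
    );
  });
}
```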

function (response) {

  var shouldCache = false;

  if (response.type === "basic" && response.status === 200) {
    shouldCache = cacheNameStatic;
  } else if (response.type === "opaque") {
    // if response isn't from our origin / doesn't support CORS

    if (requestURL.hostname.indexOf(".wikipedia.org") > -1) {
      shouldCache = cacheNameWikipedia;
    } else if (requestURL.hostname.indexOf(".typekit.net") > -1) {
      shouldCache = cacheNameTypekit;
    } else {
      // just let response pass through, don't cache
    }

  }

  if (shouldCache) {
    var responseToCache = response.clone();

    caches.open(shouldCache)
      .then(function (cache) {
        var cacheRequest = event.request.clone();
        cache.put(cacheRequest, responseToCache);
      });
  }

  return response;
}

We have two main ways to deal with the response - we either cache it or we don’t. First we check the response type.

Response types

  • Basic - a response from a resource hosted on our domain. We have access to all of the response including full headers. For example our index page, CSS and JavaScript.

  • CORS - a response from an external domain that supports CORS. Again full access to the response is available.

  • Opaque - a response from any other source. In this case we use this for both the wikipedia api calls and the typekit calls that don’t support CORS, and the wikipedia images. The issue with opaque responses is that we don’t get access to the full response, so we can’t tell its status. Successful responses (status 200) can’t be told apart from error responses (statuses in the 400s and 500s), so we might cache an error response. There’s not much that can be done in this case apart from waiting for wider support for CORS.

More information on types is available at the fetch standard.

If the type is basic, then we check that the response was successful (status 200) - if it’s not successful we’ll simply let the response pass through without storing it in the cache. Otherwise we indicate we want it stored in cacheNameStatic.

If the type is opaque then we can’t check the status of the response and have to assume it’s successful and cache it. Depending on the form of the URL we’ll either put it in cacheNameWikipedia which will cover both the API responses and the wikipedia images, or in cacheNameTypekit which will contain the web fonts.
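The same decision can be written as a standalone function that returns a cache name, or null when the response should pass through uncached. This is a sketch and a slight variation on the code above, not a drop-in replacement: pickCacheName and hostEndsWith are hypothetical names, and the host check is anchored to the end of the hostname (an assumption made here) so that a host such as en.wikipedia.org.example.com is not treated as wikipedia.

```javascript
var cacheNameStatic = "r3search-static-v4";
var cacheNameWikipedia = "r3search-wikipedia-v1";
var cacheNameTypekit = "r3search-typekit-v1";

// True when hostname ends with the given suffix, e.g. ".wikipedia.org".
function hostEndsWith(hostname, suffix) {
  return hostname.slice(-suffix.length) === suffix;
}

// Hypothetical helper: the cache to store a response in, or null to skip caching.
function pickCacheName(response, hostname) {
  if (response.type === "basic") {
    // Same-origin: the status is readable, so only cache successes.
    return response.status === 200 ? cacheNameStatic : null;
  }
  if (response.type === "opaque") {
    // Status is hidden, so the hostname is all we have to go on.
    if (hostEndsWith(hostname, ".wikipedia.org")) {
      return cacheNameWikipedia;
    }
    if (hostEndsWith(hostname, ".typekit.net")) {
      return cacheNameTypekit;
    }
  }
  return null; // anything else passes through uncached
}
```

Inside the fetch handler this would be called as pickCacheName(response, requestURL.hostname).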

If we’ve determined that we should cache the response, then we clone the response and request, open the specified cache and place the request/response pair into it. We don’t have to wait for the promise to resolve to continue - we just pass back the response that we already have and let the cache do its thing (if the caching operation fails then it will be reattempted the next time we request the resource.)

Debugging

As the service worker runs in its own process we can’t inspect it in the regular dev tools window. We can however open up a separate dev tools window to inspect the code in two ways:

  • Open up a tab at chrome://serviceworker-internals. This gives details about registered workers such as their current status and scope, and lets you open up a new dev tools window by clicking the ‘Inspect’ button from where you can set breakpoints, see console output etc. If you don’t see the inspect button then the worker isn’t currently started - click ‘Start’ to start it manually and then click ‘Inspect’.

  • Alternatively open up a new tab at chrome://inspect, and look under the ‘Service workers’ section and you should see an entry for the current page (if the service worker is active.) Click on ‘inspect’ and you’ll get a new dev tools window.

Under the ‘Resources’ tab of these dev tools you’ll also see a ‘Service Worker Cache’ option under which you can see each of your caches and their contents.

And there we have it - a simple standalone app that can work offline.

Improvements

  • While the app enables you to remove your history of viewed articles, it doesn’t actually clear the items from the cache. This is a problem since the size of the cache is limited by the browser so it’s good practice to only keep the items you need. I’ll be implementing this soon.

  • The app should work as a standalone app on Android (using the ‘Add to home screen’ option on Chrome), but this hasn’t been fully tested yet.
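As a starting point for the first improvement, removing cached entries could look something like the sketch below. It isn't implemented in the app: removeFromCache is a hypothetical helper, cacheStorage is anything with an open method that returns a promise for a cache object (in a worker that would be the global caches), and the sketch assumes the URLs to remove are already known from the history list.

```javascript
// Hypothetical helper: delete the given URLs from one named cache.
// Resolves to an array of booleans, one per URL, true where an entry was removed.
function removeFromCache(cacheStorage, cacheName, urls) {
  return cacheStorage.open(cacheName).then(function (cache) {
    return Promise.all(urls.map(function (url) {
      return cache.delete(url);
    }));
  });
}
```

It could then be called with the cache name used for articles, for example removeFromCache(caches, "r3search-wikipedia-v1", urlsToForget), where urlsToForget is a hypothetical array of the request URLs behind the history entries being cleared.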

Any other suggestions or improvements? Please comment below, or add an issue / send me a pull request on github.

Edited 09/01/2015 - Updated supported browsers to include Opera 28+.