WebAssembly is the future, and at PSPDFKit, we truly believe in it. So much so that in 2017 we launched a serverless version of our PDF viewer and framework that is completely based on WebAssembly.

Thanks to this technology, we are able to ship our C++ rendering engine and PDF processor to browsers and give our customers the opportunity to deliver a standalone solution that doesn’t require a complex backend infrastructure, in addition to the existing server-based product.

Since that 2017 release, our efforts have been focused on reducing WebAssembly startup time, because it is the very first aspect that impacts our end users who opt for standalone deployment.

Don’t miss the first part of our series: WebAssembly: A New Hope

Bottlenecks

Depending on the size of the WebAssembly module, the process of creating an instance of it can often take anywhere from several seconds to tens of seconds.

More specifically, the initialization process consists of three steps (download, compilation, and instantiation) that each affect the overall load time.

Optimizations

Fortunately, with relatively little work, it is possible to get a significant speed improvement.

To demonstrate this, we will provide an overview of four simple but effective solutions we currently employ in PSPDFKit for Web to speed up each individual step.

Download

Since .wasm modules are regular files, it is important to make sure they are cached by the browser. This is a very common — and basic if you will — practice when serving assets over the network. In doing so, repeated requests to the same resource can be served from the local resources cache immediately rather than being transferred over the network.

In order for this to work, each server response needs to include the correct Cache-Control HTTP header for the transferred resource, which tells the browser how long the resource should be cached. In an excellent blog post, Ilya Grigorik from Google does a great job at explaining how HTTP caching works. I highly recommend you read it in case you are not familiar with the topic.
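As a rough illustration, a server could attach a header like the following to a versioned .wasm file. The helper name `wasmCacheHeaders` and the one-year default lifetime are assumptions made for this sketch; the right lifetime depends on how often the module changes and whether its URL is versioned.

```javascript
// Hypothetical helper: builds the caching header for a .wasm response.
// The default lifetime (one year, in seconds) is an assumption that suits
// files whose URL changes whenever their content changes.
function wasmCacheHeaders(maxAgeSeconds = 31536000) {
  if (!Number.isInteger(maxAgeSeconds) || maxAgeSeconds < 0) {
    throw new RangeError("maxAgeSeconds must be a non-negative integer");
  }
  return {
    // "public" lets shared caches store the file; "max-age" is in seconds.
    "Cache-Control": `public, max-age=${maxAgeSeconds}`,
  };
}

console.log(wasmCacheHeaders()["Cache-Control"]);
// prints: public, max-age=31536000
```

The returned object can be merged into whatever header mechanism the server framework provides.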

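Determining whether or not assets are being served from the browser cache is fairly simple. After opening the developer tools, select the “Network” panel and refresh the page. Assets served from the local cache are marked as cached (for example, “from disk cache” in Chrome), while a 304 status code means the browser revalidated its cached copy with the server and did not download the body again.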

Developer Tools Network tab showing cached requests

Please make sure the cache is not disabled when the developer tools are open. Otherwise, resources will never be served from the local cache.

Once our cache is properly configured, we might want to make sure that our server compresses each resource to reduce its size before transmitting it over the network. This is commonly referred to as gzipping, and it is supported by all modern browsers. In fact, browsers automatically negotiate GZIP compression for all HTTP requests.
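To get a feel for how much compression could save before changing any server configuration, the following sketch gzips a byte array and reports both sizes. It relies on the standard `CompressionStream` API; the function name `gzipSavings` is made up for this example, and real servers usually compress through their own configuration rather than in application code.

```javascript
// Hypothetical helper: gzip some bytes in memory and compare sizes.
async function gzipSavings(bytes) {
  // Wrap the bytes in a Blob so they can be piped through the compressor.
  const compressedStream = new Blob([bytes])
    .stream()
    .pipeThrough(new CompressionStream("gzip"));
  // Response is used here only as a convenient way to drain the stream.
  const compressed = await new Response(compressedStream).arrayBuffer();
  return { original: bytes.byteLength, gzipped: compressed.byteLength };
}

// Highly repetitive input compresses very well; real modules will vary.
gzipSavings(new Uint8Array(10000)).then(({ original, gzipped }) => {
  console.log(`original: ${original} bytes, gzipped: ${gzipped} bytes`);
});
```

Passing the bytes of an actual .wasm file instead of the zero-filled array gives a more realistic estimate.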

Finally, with service workers, it is possible to serve .wasm modules offline and immediately. The Google Developers site has a worthwhile article on the topic, which you should definitely check out.
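A cache-first service worker for .wasm files could look roughly like the sketch below. The cache name `wasm-v1` and the helper `cacheFirst` are assumptions for this example, and the listener is only registered when the script actually runs inside a service worker.

```javascript
// Cache name is an assumption for this sketch.
const WASM_CACHE = "wasm-v1";

// Cache-first lookup: answer from the cache when possible, otherwise go to
// the network and store a copy of successful responses for next time.
async function cacheFirst(request, cache, fetchFn) {
  const cached = await cache.match(request);
  if (cached) return cached;
  const response = await fetchFn(request);
  if (response.ok) await cache.put(request, response.clone());
  return response;
}

// Only wire up the listener when running inside a service worker.
if (
  typeof ServiceWorkerGlobalScope !== "undefined" &&
  self instanceof ServiceWorkerGlobalScope
) {
  self.addEventListener("fetch", (event) => {
    if (new URL(event.request.url).pathname.endsWith(".wasm")) {
      event.respondWith(
        caches.open(WASM_CACHE).then((cache) => cacheFirst(event.request, cache, fetch))
      );
    }
  });
}
```

Keeping the lookup logic in a plain function makes it easy to exercise outside a service worker with a stand-in cache.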

Compiling and Caching

Once the WebAssembly binary is downloaded, we need to compile it to a WebAssembly.Module before we can create instances of it.

This can be done using the WebAssembly.compile() function, which takes a typed array or ArrayBuffer containing the binary code of the .wasm module we want to compile, and returns a Promise that resolves to a WebAssembly.Module object representing the compiled module.
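As a minimal sketch of that call, the snippet below compiles the smallest valid module (just the binary header) instead of a real .wasm file. The names `EMPTY_MODULE` and `compileBytes` are chosen for this example only.

```javascript
// The smallest valid module: the "\0asm" magic number followed by version 1.
const EMPTY_MODULE = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);

// Compile binary code (typed array or ArrayBuffer) into a WebAssembly.Module.
async function compileBytes(bytes) {
  const compiledModule = await WebAssembly.compile(bytes);
  return compiledModule;
}

compileBytes(EMPTY_MODULE).then((compiledModule) => {
  console.log(compiledModule instanceof WebAssembly.Module); // prints: true
});
```

In an application, the bytes would typically come from `response.arrayBuffer()` after fetching the .wasm file.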

WebAssembly.instantiate() can also compile our WebAssembly binary code and then create an instance immediately. We will look into this in the next section.

Regardless of which method we use to compile our .wasm module, this step is the major bottleneck when using WebAssembly, and it can be very time-consuming. In our case, it averages 4 seconds on a modern machine with 16 GB of memory.

The good news is that modern browsers ship with IndexedDB, an API for client-side storage, which we can use to cache compiled WebAssembly modules.

As of January 2018, Firefox and Chrome (Canary only, behind a feature flag; see chrome://flags) support structured cloning of WebAssembly modules, which allows for storing the compiled WebAssembly.Module in IndexedDB. In Microsoft Edge, IndexedDB seems to work only on the main thread (not in web workers).

Although IndexedDB is a transactional database system like an SQL-based RDBMS, it does not use fixed-column tables. Instead, it allows you to store and retrieve JavaScript objects that are indexed with a key.

In our case, the key will be the version of our .wasm module, and the value will be the compiled module itself. Whenever we change the module version, we can ignore the cached value and recompile.

In a real application, another option is to use a checksum of the .wasm file instead of a module version.

It is also important to delete outdated versions of the module from the database. For simplicity’s sake, we suggest deleting every record from the database before caching a new version of the module.

Keep in mind that for security reasons, browsers don’t allow local files to access IndexedDB databases. As such, our application needs to run on a web server.

During the rest of this post, we will use a getCache helper that uses IndexedDB under the hood. You can take a look at its implementation on GitHub. Please keep in mind that getCache is a simplified and non-production-ready abstraction.
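For readers without the repository at hand, here is a hypothetical minimal stand-in for such a helper. It is not the implementation linked above: the database name, store name, and the `get`/`set` interface are assumptions for this sketch. It only works in a browser context where IndexedDB is available, and storing a WebAssembly.Module additionally requires a browser that supports structured cloning of modules.

```javascript
// Open (or create) a database with a single object store.
function openDatabase(dbName, storeName) {
  return new Promise((resolve, reject) => {
    const request = indexedDB.open(dbName, 1);
    request.onupgradeneeded = () => request.result.createObjectStore(storeName);
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });
}

// Run one request in its own transaction and resolve with its result
// once the transaction has completed.
function runInStore(db, storeName, mode, makeRequest) {
  return new Promise((resolve, reject) => {
    const transaction = db.transaction(storeName, mode);
    const request = makeRequest(transaction.objectStore(storeName));
    transaction.oncomplete = () => resolve(request.result);
    transaction.onerror = () => reject(transaction.error);
    transaction.onabort = () => reject(transaction.error);
  });
}

// Hypothetical getCache: names and defaults are assumptions.
async function getCache(dbName = "wasm-cache", storeName = "modules") {
  const db = await openDatabase(dbName, storeName);
  return {
    // Resolves to the stored value, or undefined when the key is missing.
    get: (key) => runInStore(db, storeName, "readonly", (store) => store.get(key)),
    // Deletes every existing record first, then stores the new one.
    set: (key, value) =>
      runInStore(db, storeName, "readwrite", (store) => {
        store.clear();
        return store.put(value, key);
      }),
  };
}
```

Clearing the store inside `set` mirrors the suggestion above to delete outdated versions before caching a new one.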

Once we have a simple interface to IndexedDB, caching our compiled module is straightforward.

Using our helper and the MODULE_VERSION, we look up the cache to see if we have a compiled module. If not, we fetch the actual .wasm module, compile it, and put it in the cache before returning it. Finally, we can call WebAssembly.instantiate with the compiledModule.
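The steps above could be sketched as follows. To keep the example self-contained, the cache and the download are passed in as parameters: `cache` is assumed to expose asynchronous `get` and `set` methods, and `fetchBytes` is assumed to resolve to the binary code. Both names, and `loadCompiledModule` itself, are illustrative.

```javascript
const MODULE_VERSION = 1;

// Return a compiled module, compiling and caching it only on a cache miss.
async function loadCompiledModule(cache, fetchBytes, version = MODULE_VERSION) {
  const cached = await cache.get(version);
  if (cached instanceof WebAssembly.Module) {
    return cached; // cache hit: skip download and compilation
  }
  const bytes = await fetchBytes();
  const compiledModule = await WebAssembly.compile(bytes);
  await cache.set(version, compiledModule);
  return compiledModule;
}

// When given a Module (rather than bytes), instantiate() resolves directly
// to a WebAssembly.Instance.
async function createInstance(cache, fetchBytes, imports = {}) {
  const compiledModule = await loadCompiledModule(cache, fetchBytes);
  return WebAssembly.instantiate(compiledModule, imports);
}
```

In a browser, `fetchBytes` might be something like `() => fetch(url).then((response) => response.arrayBuffer())`, with `url` pointing at the application's own .wasm file.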

Let’s have a look at some numbers.

[Figures: initialization timings with and without IndexedDB caching]

When reading from cache, initialization is 5 times faster!

Instantiation

Instantiation is the process of creating an instance of a WebAssembly.Module.

When this step is separate from compilation, it might be worth measuring its impact on the overall loading time before doing any (micro)optimization. In fact, in many cases, instantiation is fast enough.
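One simple way to take that measurement is to wrap the call with `performance.now()`. The helper name `timeInstantiation` is an assumption for this sketch, and the result is a single sample, so it is worth repeating the measurement a few times.

```javascript
// Measure how long it takes to instantiate an already compiled module.
async function timeInstantiation(compiledModule, imports = {}) {
  const start = performance.now();
  const instance = await WebAssembly.instantiate(compiledModule, imports);
  const milliseconds = performance.now() - start;
  return { instance, milliseconds };
}
```

Comparing this number with the time spent in compilation shows quickly whether instantiation deserves any optimization effort.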

But as we mentioned above, WebAssembly.instantiate() can also compile .wasm modules. The method takes the WebAssembly binary code in the form of a typed array or ArrayBuffer and performs both compilation and instantiation in one step.

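In this case, the caching approach needs to be adapted to work with WebAssembly.instantiate(): when it receives binary code, it resolves to both the compiled module and an instance, so the module can be cached from that result.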
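A sketch of that adaptation is shown below. As before, `cache` (with asynchronous `get` and `set`) and `fetchBytes` are assumed interfaces passed in as parameters, and the function name is illustrative.

```javascript
// Instantiate from the cached module when possible; otherwise compile and
// instantiate from bytes in one step and cache the resulting module.
async function instantiateWithCache(cache, fetchBytes, imports = {}, version = 1) {
  const cachedModule = await cache.get(version);
  if (cachedModule instanceof WebAssembly.Module) {
    // Given a Module, instantiate() resolves to an Instance.
    return WebAssembly.instantiate(cachedModule, imports);
  }
  // Given bytes, instantiate() resolves to { module, instance }.
  const bytes = await fetchBytes();
  const { module, instance } = await WebAssembly.instantiate(bytes, imports);
  await cache.set(version, module);
  return instance;
}
```

Note the two different result shapes of `WebAssembly.instantiate()`, which is why the cache hit and cache miss branches handle the return value differently.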

Streaming Instantiation: Combining Download and Instantiation

The ultimate optimization nowadays is streaming instantiation, which allows WebAssembly to compile as the payload is downloaded.

This improvement has a significant impact on the initial compile time, and it allows applications to hide the cost of compiling behind download costs.

As of February 2018, only Firefox and Chrome support WebAssembly.instantiateStreaming, with Chrome shipping support on web workers soon.

At PSPDFKit, we use streaming compilation and fall back to traditional compilation when it is not supported.

When the compiled module is not cached, we create a fetch promise to download the file, and if WebAssembly.instantiateStreaming is supported, we invoke it with the promise and the module imports. When streaming instantiation is not supported, we use the fetch promise regularly and then instantiate the module separately.
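The feature detection and fallback could be sketched like this. The cache lookup is left out for brevity, the function name `instantiateWasm` is an assumption, and `fetchFn` is a parameter only so that the download can be swapped out.

```javascript
// Use streaming instantiation when available, otherwise download the whole
// file first and instantiate from its bytes. Both paths resolve to
// { module, instance }.
async function instantiateWasm(url, imports = {}, fetchFn = fetch) {
  const responsePromise = fetchFn(url);
  if (typeof WebAssembly.instantiateStreaming === "function") {
    // Compilation starts while the response body is still downloading.
    return WebAssembly.instantiateStreaming(responsePromise, imports);
  }
  const response = await responsePromise;
  const bytes = await response.arrayBuffer();
  return WebAssembly.instantiate(bytes, imports);
}
```

This sketch does not retry with the non-streaming path when the streaming call rejects, so a misconfigured server still surfaces as an error.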

As a side note, streaming instantiation requires .wasm files to be served with the correct Content-Type header, which is application/wasm. When the server is not configured to do so, you might get an error such as “Unhandled promise rejection TypeError: Response has unsupported MIME type”.
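When the server configuration is outside your control, one option is to inspect the header before deciding how to instantiate. The helper below is a hypothetical sketch; it deliberately requires the bare value without parameters, on the assumption that the browser's own check is just as strict.

```javascript
// Returns true when the response declares the MIME type that streaming
// compilation expects.
function hasWasmContentType(response) {
  const contentType = response.headers.get("Content-Type") || "";
  return contentType.trim().toLowerCase() === "application/wasm";
}

console.log(
  hasWasmContentType(new Response(null, { headers: { "Content-Type": "application/wasm" } }))
); // prints: true
```

A loader could call this on the fetched response and fall back to `response.arrayBuffer()` plus `WebAssembly.instantiate()` when it returns false.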

Mozilla recommends using instantiateStreaming where possible because WebKit’s JSC engine has some performance issues with compileStreaming.

We gathered some numbers to compare the initialization time of PSPDFKit for Web with and without streaming. On average, streaming initialization is 1.8 times faster in Firefox.

[Figures: initialization timings with and without streaming instantiation]

Object Pooling — Caching Instances

In the single-page application era, it might be common to create and destroy WebAssembly.Module instances multiple times during the lifecycle of an application.

This process can add some overhead, especially when compiled modules are not cached permanently using IndexedDB. In the worst case, an application will start the entire initialization process (download, compile, instantiate) from scratch every time it needs to use the .wasm module.

When this occurs, an object pool can be a decent solution to speed up the subsequent creation of instances. After the initial download, compilation, and instantiation, the warmed-up WebAssembly.Module instances can be kept in memory and reused at any time.

The object pool simply holds a fixed number of instances in memory and returns one when we ask for it. If none are available, then it creates a new instance and returns it.

When we no longer need an instance, we can put it back in the pool, which will either recycle it (do some cleanup) and keep it alive or, when the pool is full, destroy it (that is, leave the object to the garbage collector for cleanup).

Each instance must implement a simple Recyclable interface that defines a recycle method and a destroy method.
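A pool along these lines could be sketched as below. The class name `ObjectPool`, its `acquire` and `release` methods, and the default size are assumptions for this example; pooled objects are expected to provide `recycle()` and `destroy()` as described above.

```javascript
// Minimal object pool. `create` is a factory (possibly asynchronous) that
// builds a new Recyclable object when no idle one is available.
class ObjectPool {
  constructor(create, maxSize = 2) {
    this.create = create;
    this.maxSize = maxSize;
    this.idle = [];
  }

  // Hand out an idle object if there is one, otherwise create a new one.
  async acquire() {
    return this.idle.length > 0 ? this.idle.pop() : this.create();
  }

  // Keep the object for reuse while there is room, otherwise let it go.
  release(item) {
    if (this.idle.length < this.maxSize) {
      item.recycle(); // clean up state before the next use
      this.idle.push(item);
    } else {
      item.destroy(); // pool is full: drop it and let GC reclaim it
    }
  }
}
```

For instance, the factory could create a wrapper around a web worker or a WebAssembly instance, with `recycle()` resetting its state and `destroy()` terminating the worker.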

An object pool can be used at any level in an application to keep things in memory and ready to use. For example, it could be employed to cache the entire web worker in which we run our WebAssembly code, or it could just cache an instance of a WebAssembly module.

At PSPDFKit, we use object pooling to cache our WebAssembly backend, which runs in a web worker. This allows for fast instance creation when opening new PDFs.

Conclusion

Although WebAssembly is a fairly new and cutting-edge technology, it has already been employed in production applications like PSPDFKit for Web.

Getting the startup time of our application down to 200–300 ms (best case) is already achievable, since most of the following optimizations are already possible:

  • Making sure the .wasm file is properly cached
  • Using streaming instantiation
  • Caching compiled WebAssembly modules in IndexedDB
  • Using object pooling to cache warmed-up instances

In fact, with our most recent release, we implemented IndexedDB caching and streaming instantiation in PSPDFKit for Web and our customers are already benefitting from these improvements.

During the past year, Mozilla has invested a lot into making WebAssembly awesome and fast, and we believe other vendors will follow along. We are looking forward to it!

As a final note to our customers, we also offer the possibility of using asm.js. While some of the optimizations above cannot be implemented when using asm.js, the startup time is usually faster than (unoptimized) .wasm.





