Turbocharge your front end using WebAssembly

Authors: Vaibhav Aparimit & Arun Nalla


JavaScript has evolved into the lingua franca of front end engineering. Having just one language for the front end is pretty dope as you develop once and run your code on any browser. Of course, you still need to test your web app on different browsers, but that’s okay as the behavioral idiosyncrasies across browsers are limited.

Meanwhile, the web has exploded and performance demands from web apps are insanely high. We are essentially asking JavaScript to do a lot more. Highly performant, compute-optimized apps are written in languages like C/C++/Rust. If only we could combine the power of a statically typed & compiled language like C++ with JavaScript. 


Note that ahead-of-time compiled code is generally faster than interpreted code. Imagine you implement a “for” loop that runs for a million steps. A plain interpreter for a language like JavaScript has to interpret the body of this “for” loop a million times. This isn’t the case for a compiled language, where the loop is translated to machine code once.

Browsers have implemented JIT (just-in-time) compiling, in which the JavaScript engine monitors the code as it runs. If a section of code is used enough times, the engine will attempt to compile that section into machine code. This is great, but the code still has to be monitored many times before any optimizations can be applied, because JavaScript is a dynamically typed language. The JavaScript engine does not know beforehand what type a variable will hold, and therefore how much memory to allocate for it.

Also, compilers have a lot of cool optimizations built in. Imagine you write s = a + b, and the compiler can prove that a is always 0. There is no point in performing this computation. The compiler can simply use b wherever s is read. Without compiler optimizations, there is a lot of overhead: reserving memory for s, loading a and b, then performing the addition. Imagine doing this a million times in a loop. Not cool, not cool at all!
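To make the s = a + b example concrete, here is a minimal C++ sketch. The function name and types are our own illustration, and whether the addition is really removed depends on the compiler and the optimization level you build with.

```cpp
#include <cstdint>
#include <vector>

// Adds a + b for every element b of `values`, where `a` is a compile-time
// constant equal to 0. An optimizing compiler is free to drop the addition
// and use b directly; the result is the same either way.
std::int64_t sum_with_zero_offset(const std::vector<std::int32_t>& values) {
    constexpr std::int32_t a = 0;       // provably 0 at compile time
    std::int64_t total = 0;
    for (std::int32_t b : values) {
        const std::int32_t s = a + b;   // may be simplified to s = b
        total += s;
    }
    return total;
}
```

Calling sum_with_zero_offset on a vector of a million ones gives one million with or without optimizations; only the amount of work done per iteration can differ.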

Enter WebAssembly

WebAssembly is, well, an assembly-like language that you write for the web. I mean, you don’t write assembly code directly! Instead, it acts as a compilation target for languages like C, C++ and Rust. What this means is that you can take a C++ codebase, compile it to WebAssembly, and run it in the browser at “near-native” speeds.

WebAssembly was designed to be a compiler target from the beginning, so developers who want to use a particular language for web development will be able to do so without having to transpile their code into JavaScript.

WebAssembly isn’t tied to the JavaScript language, and browsers support it natively, so it is a native feature of the web and not a plugin or any arcane workaround (jugaad, as we call it in India). According to caniuse.com, 91.84% of users worldwide currently run browsers that support WebAssembly.

Who uses WebAssembly


A lot of companies use WebAssembly to power up their front end experience. Notable examples include:

  1. Google Earth, a large C++ codebase, now runs on the web because of WebAssembly
  2. AutoCAD ported their 30-year-old codebase to the web using WebAssembly
  3. Doom 3 was ported to the web with WebAssembly
  4. 1Password used WebAssembly to speed up their plugin
  5. Figma, a prototyping tool for designers, used WebAssembly to improve load time

How does WebAssembly work

A browser can run on a number of different processors from desktop computers to smartphones and tablets. Distributing a compiled version of the WebAssembly code for each potential processor would be a pretty bad strategy. 

Instead, what happens is that the compiler’s frontend converts the high level code written in C++ into an intermediate representation (IR), and the backend then emits a WebAssembly binary (a .wasm file) rather than machine code for one specific processor.

The bytecode in the wasm binary isn’t machine code yet. It’s a set of virtual instructions that browsers supporting WebAssembly understand. When the wasm binary is loaded into such a browser, the binary is compiled into the machine code of the device the browser is running on.

By the way, some of you might have realised that this is the LLVM way of doing things. That’s so astute of you, as Emscripten (the most mature toolchain in the wasm world) is built on LLVM, with Clang as its C/C++ frontend. The idea of a portable intermediate form also echoes Java’s bytecode. Ah! the beauty, when ideas cross-pollinate ❤️.

Our own WASM experiment in the NetOpt pod

The NetOpt pod within Locus is building the next generation intelligent supply chain optimization product that helps plan the flow and inventory of products across a complex network of factories, container ships, warehouses and retailers around the world, without any human intervention.

The NetOpt pod has use cases that entail uploading and validating large CSV files (close to a million records). We did a small PoC and achieved CSV upload + validation at that scale in 16 sec. Yup, 16 seconds. All in the browser!

And that was without faster C++ regex libraries, faster parsers, C++ multithreading or web workers, any of which might have reduced the time significantly further.

Our Approach

We chose C++ for writing the WebAssembly module primarily because we were familiar with the language and we love C++ :). We used Emscripten as the toolchain for compiling our C++ code to a WebAssembly module. Emscripten helped us bind symbols across JavaScript and C++ without really worrying about C++ name mangling issues. It also provides a lot of plumbing JS functions that are actually low-level C APIs under the hood. So, we could happily focus on writing the application code.
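As a rough illustration of what exposing a C++ function through Emscripten can look like, here is a minimal sketch. The function sum_of_squares and the file name in the comment are hypothetical and not taken from our codebase. extern "C" switches off name mangling for the symbol, and EMSCRIPTEN_KEEPALIVE (from emscripten.h) asks the toolchain to keep and export it. The macro is stubbed out so the same file also builds with a regular compiler.

```cpp
// Possible build command (module.cpp is an assumed file name):
//   emcc module.cpp -O2 -o module.js
#ifdef __EMSCRIPTEN__
#include <emscripten/emscripten.h>
#else
#define EMSCRIPTEN_KEEPALIVE  // no-op outside Emscripten builds
#endif

#include <cstdint>

extern "C" {

// Hypothetical example function: returns 1*1 + 2*2 + ... + n*n.
// Thanks to extern "C", the exported symbol keeps its plain name.
EMSCRIPTEN_KEEPALIVE
std::uint32_t sum_of_squares(std::uint32_t n) {
    std::uint32_t total = 0;
    for (std::uint32_t i = 1; i <= n; ++i) {
        total += i * i;
    }
    return total;
}

}  // extern "C"
```

With a build like the one in the comment, the generated JavaScript glue would typically expose the function to the page under an underscore-prefixed name such as Module._sum_of_squares. Treat the exact command and call style as assumptions to check against the Emscripten version you use.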

We identified three different avenues for passing data from JavaScript to C++.

  1. Allocate heap memory from JavaScript that C++ can read: We did not take this approach as a lot of custom code had to be written to calculate the memory offsets for different kinds of entities. Adding a new entity or modifying the schema of an existing one would have meant rewriting the low-level heap allocation code on the JavaScript side.
  2. JavaScript writes to IndexedDB and C++ reads IndexedDB as a file: Good approach, but we barely found any good support and references for it online. It involved mounting IndexedDB as a file and dealing with array buffers, which seemed too low-level an implementation.
  3. Convert CSV to JSON, serialize the JSON and pass it to WebAssembly: We chose this approach because it seemed simple enough. It delegated the entire validation and parsing of data to C++, which resulted in a more loosely coupled system. Also, there were a ton of libraries like Papa Parse already available that helped us implement this approach.
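To show why the first avenue felt brittle, here is a small sketch of the offset bookkeeping it implies. The Shipment struct is a made-up entity, not our real schema. The JavaScript side would have to write every field at exactly these byte offsets into the module's memory, and any schema change moves them.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical entity, for illustration only.
struct Shipment {
    std::int32_t id;
    double       weight_kg;
    std::int32_t warehouse_id;
};

// Byte offsets and record size that hand-written JavaScript would
// have to mirror. Padding inserted by the compiler is part of them.
constexpr std::size_t kIdOffset        = offsetof(Shipment, id);
constexpr std::size_t kWeightOffset    = offsetof(Shipment, weight_kg);
constexpr std::size_t kWarehouseOffset = offsetof(Shipment, warehouse_id);
constexpr std::size_t kStride          = sizeof(Shipment);

// Reads record number `index` from a raw byte buffer laid out as an
// array of Shipment, the way C++ would read memory filled by JavaScript.
Shipment read_shipment(const unsigned char* heap, std::size_t index) {
    const unsigned char* base = heap + index * kStride;
    Shipment s{};
    std::memcpy(&s.id, base + kIdOffset, sizeof s.id);
    std::memcpy(&s.weight_kg, base + kWeightOffset, sizeof s.weight_kg);
    std::memcpy(&s.warehouse_id, base + kWarehouseOffset, sizeof s.warehouse_id);
    return s;
}
```

Adding a field to Shipment, or reordering the existing ones, changes the offsets and the stride, and the matching JavaScript writer would have to be updated by hand each time.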

So, the final WASM approach we used had the following steps:

  1. User uploads data on the UI
  2. Convert CSV to JSON 
  3. Serialize JSON and pass it to WASM
  4. Deserialize JSON in WASM
  5. Run all validations. Validation logic was written in C++
  6. WASM returns data after validating, classifying into VALID, TEMPORARY, INVALID
  7. JavaScript inserts temporary data into IndexedDB and makes the API call for valid data
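Steps 5 and 6 above can be sketched roughly as follows. Everything specific here is an assumption made for illustration: the Record fields, the regex patterns, and in particular the rule that a well-formed record with a missing quantity counts as TEMPORARY. The sketch also starts from already deserialized records, so the JSON handling of step 4 is left out.

```cpp
#include <regex>
#include <string>
#include <vector>

enum class Status { Valid, Temporary, Invalid };

// Hypothetical record shape; fields arrive as raw text from the CSV.
struct Record {
    std::string sku;
    std::string quantity;
};

// The three buckets handed back to JavaScript in step 6.
struct Classified {
    std::vector<Record> valid;
    std::vector<Record> temporary;
    std::vector<Record> invalid;
};

// Assumed rules: a malformed SKU is INVALID, a missing quantity is
// TEMPORARY (can be fixed later), a non-numeric quantity is INVALID.
Status classify(const Record& r) {
    static const std::regex sku_pattern("[A-Z]{3}-[0-9]{4}");
    static const std::regex quantity_pattern("[0-9]+");
    if (!std::regex_match(r.sku, sku_pattern)) return Status::Invalid;
    if (r.quantity.empty()) return Status::Temporary;
    if (!std::regex_match(r.quantity, quantity_pattern)) return Status::Invalid;
    return Status::Valid;
}

// Runs classify over every record and sorts the records into buckets.
Classified classify_all(const std::vector<Record>& records) {
    Classified out;
    for (const Record& r : records) {
        switch (classify(r)) {
            case Status::Valid:     out.valid.push_back(r);     break;
            case Status::Temporary: out.temporary.push_back(r); break;
            case Status::Invalid:   out.invalid.push_back(r);   break;
        }
    }
    return out;
}
```

In a flow like ours, the temporary bucket is what JavaScript would park in IndexedDB, while the valid bucket goes to the API call.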

One ‘gotcha’ you ought to internalize before embarking on your WASM journey is that the time taken by WASM varies based on the client hardware. Consequently, your performance numbers can never be totally deterministic. So, it’s best to benchmark your WASM application’s performance across a representative sample of your customers’ machines.

The other thing to keep in mind is the optimization flag. LLVM/Clang has a lot of cool optimizations built in. An important reason why we got so much performance improvement is that our C++ code used regex, which can generate thousands of lines of assembly code when built without optimizations. So, LLVM optimizations helped a lot here.
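One way to see both effects, hardware variance and optimization level, is a tiny timing harness like the sketch below. The pattern, the names and the file name in the comment are hypothetical. The idea is to build the same file with different optimization flags and run it on a few representative machines.

```cpp
// Possible builds to compare (bench.cpp is an assumed file name):
//   emcc bench.cpp -O0 -o bench_O0.js
//   emcc bench.cpp -O2 -o bench_O2.js
#include <chrono>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

struct BenchResult {
    std::size_t matches;     // rows that matched the pattern
    double      elapsed_ms;  // wall-clock time for the whole pass
};

// Validates every row against a fixed pattern and reports how long
// the pass took on the machine it runs on.
BenchResult time_regex_validation(const std::vector<std::string>& rows) {
    static const std::regex pattern("[A-Z]{3}-[0-9]{4}");
    const auto start = std::chrono::steady_clock::now();
    std::size_t matches = 0;
    for (const std::string& row : rows) {
        if (std::regex_match(row, pattern)) ++matches;
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return {matches, elapsed.count()};
}
```

The match count should be identical across builds and machines; only elapsed_ms should move, and that spread is the number worth collecting before committing to a performance target.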

Overall WASM gave us great offline capabilities. No need for any backend API calls.

Ultimately, in this case, we did not use WASM for large CSV validations. This was only because we also had to persist the entities that failed validation. We had initially thought of keeping those dirty, invalid entities in IndexedDB, but that would have exposed us to unintended side effects, like someone clearing their browser storage and losing the data.

Having said that, we thoroughly enjoyed our WebAssembly journey and have identified areas in the product where WebAssembly can dramatically improve our frontend performance. 


As part of our initial research, we were surprised to learn that not many tech companies have adopted WASM as part of their front end stack. Through this blog, we wanted to evangelize this tech to the dev community.

One last thing, we are always looking for great engineering talent to join us. Do check out our careers page. We would love to hear your story.

Stay Tuned for More Updates!
