A high-level overview of JS engines and WebAssembly
Introduction:
How was the first JS engine born?
Until 1995, all web pages were static. Netscape decided to come up with a scripting language that would give the browser the ability to behave dynamically. After 10 days, JS and its interpreter implementation were released. JS was not designed to be a fast and efficient language from a performance point of view. It just had to run in the browser and perform simple DOM manipulations.
Due to the lack of performance requirements and the short time that was allocated to this project, the interpreter solution was chosen.
Interpreter vs. compiler
An interpreter is software that translates source code or bytecode line by line into machine code and executes it at run time. It does not need code that has been optimized or pre-compiled for a specific machine.
A compiler reads your entire program, performs some optimizations, and then produces optimized machine code.
Why was the interpreter approach chosen?
- Compilers are much more complicated pieces of software.
- They are hard to port to different CPU and operating system architectures.
- They mean a slower development cycle.
Why is JS considered slow?
JS is a dynamically typed language, so information about variable types and sizes is not available in advance, which makes it difficult to make accurate decisions about memory allocation. For example, when executing an arithmetic operation such as + between two variables, the JS engine has to take into account that + can mean adding two numbers, concatenating two strings, or concatenating a string with a number. As a result, the memory allocation is different in each case. Statically typed languages can compile and optimize code much better because of the information they have in advance.
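As a quick illustration of why the engine cannot decide ahead of time what + should do (plain JS, nothing engine-specific):

function add(a, b) {
  return a + b;
}

add(1, 2);          // 3, numeric addition
add('1', '2');      // '12', string concatenation
add('total: ', 3);  // 'total: 3', the number is coerced to a string

The same source line needs different machine code and different memory depending on the argument types, and the engine only discovers those types at run time.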
What is a JS engine?
The most basic idea behind a JS engine is: receive a JS source file, turn it into binary instructions (machine code), and serve it to the CPU.
Before going into details, let’s take a short overview of the first JS engine, written for the Netscape browser in 1995.
The first step of the engine is receiving a JS source file. The baseline compiler then compiles the JS into bytecode (an intermediate representation that the interpreter can work with). The compilation is done as fast as possible in order to prevent a major delay during the application’s bootstrap.
The bytecode produced by the compiler is passed to the interpreter.
The interpreter’s job is to translate this unoptimized bytecode line by line into machine code that the operating system can execute.
This engine was very slow and inefficient because of the unoptimized code that the interpreter received. As web pages became more complicated, a more efficient JS engine had to be developed. Let’s take a look at how the interpreter reads a few simple lines of code.
function sum(a, b) {
  return a + b;
}

for (let i = 0; i < 1000; i++) {
  console.log(sum(1, 2));
}
This code is executed immediately, line by line, until the for loop ends, meaning that the sum function is called 1000 times. This is called hot code; we’ll get back to this definition later in this article.
V8, Google Chrome’s JS engine
We can see a similar mechanism: a baseline compiler that spits out unoptimized bytecode as fast as possible for the interpreter, which translates it into machine code and executes it.
There is one major difference: the JIT compiler.
JIT stands for Just In Time, meaning that, unlike a compiled language such as C, where compilation is done ahead of time, JavaScript is compiled during execution.
How does the JIT work?
While the interpreter executes the code, a separate thread keeps tracking and identifying “hot code”, i.e., detecting the parts of the code that are used the most. These parts are optimized (or deoptimized, if the optimization assumptions turn out to be wrong) and translated into machine code.
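As a rough sketch of the idea (the behavior described in the comments is the general JIT pattern, not an exact description of V8 internals):

function square(x) {
  return x * x;
}

// Hot code: after many calls with numbers, the JIT assumes x is always
// a number and emits optimized machine code specialized for that case.
for (let i = 0; i < 100000; i++) {
  square(i);
}

// Calling it with a string breaks that assumption; the engine may throw
// away (deoptimize) the specialized code and fall back to the generic path.
square('5');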
How is the optimization done?
- AST (Abstract Syntax Tree): a tree representation of the code. To bootstrap our app, we don’t need to compile and execute everything up front, for example the handler for a user’s button click, or code that lazy loads after scrolling to the bottom of the page. Using the code tree, the AST lets the compiler identify the code that is most relevant for immediate execution.
- Unrolling loops.
- Inlining functions (a sketch of these last two follows this list).
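A rough sketch of the last two techniques, written out by hand purely for illustration (the optimizer does this internally; you would not write it yourself):

// Original: a small function called inside a loop.
function double(x) { return x * 2; }
let total = 0;
for (let i = 0; i < 4; i++) {
  total += double(i);
}

// Roughly what the optimizer can produce: the call is inlined and the
// loop is unrolled, removing the call overhead and the loop bookkeeping.
let totalOptimized = 0;
totalOptimized += 0 * 2;
totalOptimized += 1 * 2;
totalOptimized += 2 * 2;
totalOptimized += 3 * 2;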
What is WebAssembly?
WebAssembly is a new type of code that can run in modern web browsers. Developers don’t write WebAssembly directly; they write in the language of their choice, which is then compiled into WebAssembly bytecode. The bytecode is then run in the web browser, where it’s translated into native machine code and executed.
Why WebAssembly?
- Speed: WebAssembly is compiled into a binary file, which is much smaller than JS source code, so the browser downloads it much faster. WASM is also a statically typed language, which lets the JS engine compile it and make decisions much faster, since it does not have to speculate about which types will be used. The compilation is done before the code even reaches the browser.
- More languages: it’s now possible to write code for the web in languages other than JS.
How does the browser read WASM?
When the JS engine receives a WASM file, the main difference in the process is the Liftoff component. The goal of Liftoff is to reduce startup time for WebAssembly-based apps by generating code as fast as possible. Code quality is secondary, as hot code is eventually recompiled with TurboFan anyway. Liftoff avoids the time and memory overhead of constructing an IR and generates machine code in a single pass over the bytecode of a WebAssembly function.
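For context, this is roughly how a page hands a WASM binary to the engine from JS; the file name module.wasm and the exported add function are placeholders, not part of any real module:

// Fetch, compile, and instantiate a WebAssembly module in the browser.
WebAssembly.instantiateStreaming(fetch('module.wasm'))
  .then(({ instance }) => {
    // Exported WASM functions are callable like regular JS functions.
    console.log(instance.exports.add(1, 2));
  });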