Javascript From Source to Browser

Many frontend starter kits include Webpack or Rollup configurations to make it easy to deploy code to the browser. But have you ever wondered what happens under the hood?

30 October 2020

# Javascript modules

As developers, we split our source files into multiple modules for two reasons: it is easier to reason about less code at a time, and Javascript variables (which include functions) are scoped to module boundaries, allowing two or more modules to use the same variable name without collision. There is no debate that this is a good thing. For example, consider module 0 and module 1 below.

```js
/* module 0 */
import { myfunc, myvar } from "./module1";

myfunc(myvar + 5);
```

```js
/* module 1 */
export const myfunc = x => { console.log(x); };
export const myvar = 42;
```

# Why we should bundle

With code split into multiple modules, loading the complete application means traversing the dependency tree. From the main entry point, we check for and load its immediate dependencies, and from there the next level of dependencies, and so on. In a synchronous environment like Node.js, where modules are read from local disk, this waterfall of dependencies loads very quickly and often imperceptibly. In a browser, however, each level of dependencies costs a round trip to the server, so this waterfall can be very slow. For real-world applications, the tree can get very deep, and thus very expensive, very quickly.

To overcome this, we bundle all our modules together into a single file, so that the browser can load everything in a single round trip. Here is an example of what a simple bundle may look like. Each module is wrapped in a function to limit the scope of the variables within it, and the bundle ships with helpers to import/export variables across these scopes.

```js
(function () {
  const modules = [
    function (bundler) { /* module 0 */
      const { myfunc, myvar } = bundler.import(1); /* import module 1 */
      myfunc(myvar + 5);
    },
    function (bundler) { /* module 1 */
      const myfunc = x => { console.log(x); };
      const myvar = 42;
      bundler.export({ myfunc, myvar });
    }
  ];

  const bundler = {
    init() { /* ... */ },
    import() { /* ... */ },
    export() { /* ... */ }
  };

  bundler.init(modules);
})();
```
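
It may help to see those helpers fleshed out. The following is a minimal sketch of my own (not Webpack's actual runtime): exports are cached per module id, so each module function runs at most once no matter how many times it is imported.

```js
const bundler = {
  modules: [],
  cache: {},     // module id -> exported bindings
  current: null, // id of the module currently executing
  init(modules) {
    this.modules = modules;
    this.import(0); // run the entry module; its imports pull in the rest
  },
  import(id) {
    if (!(id in this.cache)) {
      this.cache[id] = {};
      const previous = this.current;
      this.current = id;
      this.modules[id](this); // the module fills cache[id] via export()
      this.current = previous;
    }
    return this.cache[id];
  },
  export(bindings) {
    Object.assign(this.cache[this.current], bindings);
  }
};
```

Running the two modules from the example above through this bundler prints 47, just as the unbundled source would.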

Indeed, this is how many bundlers, like Webpack, work. The bundle is essentially an array of modules, plus some overhead to orchestrate the imports/exports. We can optimise this further: a more sophisticated bundler can analyse the Abstract Syntax Tree (AST) to resolve and flatten the dependencies, renaming variables as necessary to avoid collisions. In our example, the bundler could automatically rewrite the code as the following. Producing such zero-overhead bundles is a feature of Rollup.

```js
const myfunc = (x) => { console.log(x); };
const myvar = 42;

myfunc(myvar + 5);
```

# Why we shouldn't bundle

The main problem with bundling is that whenever any module changes, the entire bundle is invalidated and must be redownloaded. To maximise cache reuse, the bundle should be split into smaller chunks, so that only the chunk containing a changed module needs to be redownloaded. For example, external code from npm is usually not updated as often as application code, so it is a good candidate for its own chunk. Chunking is also useful for splitting rarely used, or to-be-used-later, code into its own chunk that can be dynamically loaded as necessary. Chunks are bundler specific, and possibly even specific to a version of the application.
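
With Webpack, for instance, splitting npm code into its own chunk can be configured along these lines (a sketch; the chunk name "vendors" is my own choice):

```js
// webpack.config.js (excerpt): put everything imported from node_modules
// into a separate "vendors" chunk that changes less often than app code.
module.exports = {
  optimization: {
    splitChunks: {
      cacheGroups: {
        vendor: {
          test: /[\\/]node_modules[\\/]/,
          name: "vendors",
          chunks: "all"
        }
      }
    }
  }
};
```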

Chunking reintroduces the waterfall problem, which can be solved with preloading. We can preload every dependency by adding the directive <link rel="preload" href="chunk.js" as="script"> to the HTML, by adding the header Link: <chunk.js>;rel="preload";as="script" to the HTTP response, or by using HTTP Early Hints. Any of these methods informs the browser to speculatively fetch dependent chunks into its cache ahead of time, thus avoiding the waterfall. This is just an optimization; it doesn't actually load the Javascript into memory. That requires the bundle to explicitly request the chunk at runtime, or a <script src="chunk.js"> added to the HTML.
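
To make the parallelism concrete, here is a toy sketch of my own (hypothetical filenames, not part of any bundler) that walks a dependency graph once and emits a preload tag for every transitive dependency, so the browser can fetch all levels together rather than one level per round trip:

```js
// Walk the dependency graph depth-first, collecting every module reachable
// from the entry point, then emit one preload directive per dependency.
function preloadTags(graph, entry) {
  const seen = new Set();
  const visit = (mod) => {
    if (seen.has(mod)) return;
    seen.add(mod);
    (graph[mod] || []).forEach(visit);
  };
  visit(entry);
  seen.delete(entry); // the entry itself is loaded by a <script> tag
  return [...seen].map(
    (mod) => `<link rel="preload" href="${mod}" as="script">`
  );
}

// e.g. main.js depends on a.js and b.js, and a.js depends on c.js:
// preloadTags({ "main.js": ["a.js", "b.js"], "a.js": ["c.js"] }, "main.js")
// yields preload tags for a.js, b.js, and c.js.
```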

# ES Modules

Recent innovations in browser technology mean that modules can now be loaded natively, removing the need for bundler-specific chunking. These scripts are preloaded with <link rel="modulepreload"> and the main entry point is activated with <script type="module">. Subsequent modules are loaded much the way you would expect. To avoid the waterfall of dependencies, all child modules should be preloaded ahead of time.

But this does not mean that bundlers are no longer useful. Bundlers understand the import/export AST, and can provide cache-busting imports by renaming modules as well as the references to them; for example, import { myvar } from "./mymodule.0123456789.mjs". This ensures maximum cacheability, yet with the ability to cache-bust whenever a module changes. Understanding the AST also means that the bundler can perform tree-shaking: recognising unused exports and removing them from the output module.

Code downloaded from npm is often in CommonJS form, as it was originally meant for Node. Rollup has plugins to import CommonJS modules and re-export them as ES modules, allowing native ES Modules to be used everywhere. Webpack 5 can target ES Modules, enabled by setting experiments.outputModule = true, but does not produce output modules as clean. We can continue to use chunking with ES Modules: a group of modules that are delivered together can be chunked into a single file to improve compression, reducing the overall download size and the number of files that need to be preloaded.
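
A Rollup configuration for this might look roughly like the following, using the official @rollup/plugin-node-resolve and @rollup/plugin-commonjs plugins (entry and output paths are hypothetical):

```js
// rollup.config.js: resolve bare npm specifiers, convert CommonJS
// dependencies, and emit native ES modules.
import resolve from "@rollup/plugin-node-resolve";
import commonjs from "@rollup/plugin-commonjs";

export default {
  input: "src/main.js",
  output: { dir: "dist", format: "es" },
  plugins: [resolve(), commonjs()]
};
```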

# Transpilers & polyfills

There are many languages that transpile to Javascript. Even Javascript (the living standard) transpiles down to Javascript (a previous standard). Babel transpiles from next-generation Javascript, Typescript transpiles from Typescript, and so on. This transpilation process is not to be confused with the module dependency / bundling process described above. Bundlers only handle the imports/exports and tree-shaking, deferring to transpilers (or loaders) to transpile.

Transpilation merely changes the syntax to match the target environment; it does not add missing capabilities, like Promises and Generators. For that, a polyfill like CoreJS is needed. This is usually so tightly integrated with the transpiler that the process is transparent.
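
With Babel, for example, the pairing looks something like this (a sketch; the browser targets are arbitrary): @babel/preset-env picks the syntax transforms for the targets, and with useBuiltIns set, injects CoreJS polyfills only where the code actually needs them.

```js
// babel.config.js (excerpt)
module.exports = {
  presets: [
    ["@babel/preset-env", {
      targets: "> 0.5%, not dead", // transpile for these environments
      useBuiltIns: "usage",        // inject polyfills where features are used
      corejs: 3                    // polyfill implementation: core-js v3
    }]
  ]
};
```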

# Optimization

Once the output chunks, bundles, and polyfills are assembled, we can optimize the lot further. Usually, this means optimising for size with libraries such as Terser, by renaming functions and variables to shorter names, and removing optional syntax such as whitespace and comments.
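
As an illustration (the "after" is roughly what a minifier produces, not byte-exact Terser output):

```js
// Before: names, comments, and whitespace are meaningful to humans only.
function addNumbers(firstValue, secondValue) {
  // the engine needs none of this ceremony
  return firstValue + secondValue;
}

// After minification, the same logic might come out roughly as:
//   function addNumbers(n,r){return n+r}
```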

It is theoretically possible to optimize for other properties, such as performance. This is common in compiled languages where the target is assembly; GCC, for instance, applies techniques like loop unrolling and loop-invariant code motion that preserve the logic but reorder it for performance. Such optimizing transpilers don't really exist for Javascript, as far as I know, but could in theory be useful in scenarios where network speed is not a problem, but computational performance is.

Indeed, the optimization process is basically a transpilation process too: it takes unoptimized code and produces optimized code.

# Sourcemaps

With all the conversions from source to output, the mapping between source and output code is lost, making debugging difficult. Sourcemaps are JSON files, in a particular format, that provide such a mapping. The original source can be embedded within the sourcemap, or omitted for source anonymity.
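
The shape of a (v3) sourcemap looks roughly like this; the mappings string is a Base64-VLQ encoding relating each output position back to a source position (its value is elided here, and the filenames are hypothetical):

```js
// A sketch of the sourcemap JSON structure.
const map = {
  version: 3,                            // sourcemap format version
  file: "bundle.min.js",                 // the generated output file
  sources: ["module0.js", "module1.js"], // original source files
  // sourcesContent: [...],              // optionally embed the sources
  names: ["myfunc", "myvar"],            // original identifier names
  mappings: "..."                        // Base64-VLQ segments (elided)
};
```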

Each transpilation step, including optimization, produces its own sourcemap, but we are typically interested only in the net sourcemap from original source to final output.

# Summary

There is fair complexity in converting source code to production Javascript. Bundling is necessary to combine multiple modules into a single file, but comes at a cache cost. Splitting the bundle into chunks improves caching, but is bundler specific. ES Modules provide a native way to solve both of these issues.

Transpilers and polyfills will likely never go away completely, because new features will always be introduced faster than standards and browsers can keep up with. Optimising for size is important for transfer speed. Finally, sourcemaps provide a mechanism to debug the original source code.