An epic Webpack mystery

For reasons I’ll explain elsewhere, I’m building a desktop app that stores its data locally. It’s built using Electron, a toolkit for writing desktop apps using web technologies, and it uses a pure-JavaScript database called NeDB for persistence. Pretty quickly, I ran into a headscratcher of a problem: my data wasn’t actually being saved to a file, but I wasn’t getting any errors or warnings.

As it turned out, it wasn’t a bug, but a complex situation involving Webpack defaults. Understanding and solving this issue took me waaaaaaay down a rabbit hole, and I thought it would be informative to share the story.

What the heck?

At first, I thought my tooling was gaslighting me. Or maybe I was just doing something wrong. I’m also using a package called nedb-promises, which wraps NeDB in a more modern Promise-based interface, so perhaps something was going wrong with that.

But in the Node shell, I was able to confirm that NeDB could, in fact, create files in the way I was using it. I wasn’t doing anything wrong. It was just not working in Electron.

Next, I enabled the Electron Dev Tools and traced execution all the way down to the actual filesystem calls. It turned out that something was substituting different implementations of two files within the NeDB library, namely storage.js and customUtils.js, such that no actual filesystem storage was happening.

Note storage.js and customUtils.js not appearing as siblings of persistence.js.

You can see that in persistence.js, there’s a relative import of ./storage. That should be in the same directory. But that file is nowhere to be seen. Instead, a substitute version of it in a different directory has been bundled.

As a side note, even this tracing wasn’t easy because Babel was configured to convert async functions to generators and generators to regenerator functions, and so I had to put breakpoints in the callbacks of every callback-style function in order to trace down the stack, rather than being able to step into async function calls.

Something was causing different files to be included than the source would indicate. Searching for this curious browser-version directory led me to a "browser" key in NeDB’s package.json, which causes these files to be substituted when running in a browser context.

NeDB’s documentation mentions a browser mode, but describes an explicit way of invoking it, not this implicit behavior. Likewise, this package.json behavior isn’t directly documented in the Webpack docs, let alone how to prevent it.

Quick fix: fork the packages

I assumed that there was probably some simple Webpack configuration fix (more to come on this), but in the meantime, I wanted to get on with my life. I figured I could do this by forking nedb and deleting the problematic browser key.

I tried doing this, but the problem didn’t go away. Tracing through the code again, I realized the original version of NeDB was still being used. Further investigation showed that nedb-promises had its own nested node_modules/nedb.

This is because its package.json pinned a specific version of nedb that didn’t match my fork. So I had to fork nedb-promises, too, and point it at my nedb fork.

Messy, but it did the trick.

Why is this happening?

While this whole situation is a bit nutty, it’s worth taking some paragraphs to explore why this is happening. When Node.js came out, its de facto package manager, NPM, exploded in popularity. JavaScript’s lack of a substantial standard library created a vacuum that was filled with a rich ecosystem of libraries of all shapes and sizes.

It was quickly apparent that many of these utilities would also be useful in JavaScript’s native environment: the Web. Another package manager, Bower, sprang up to fill this need. It has its own manifest file, bower.json, which mirrors NPM’s package.json. However, many library authors and users found it cumbersome to work with two package managers, especially as the concept of universal JavaScript began to gain popularity. But this separateness was seen as a necessary evil, considering that Node and the browser have significant differences, and even had two totally different module systems (CommonJS and AMD, respectively).

The reason for these module systems is twofold. First, JavaScript had no official module system for most of its existence. Secondly, a Node app can expect to have all its necessary modules nearby on the filesystem, whereas a web app needs to retrieve them from a remote server, which can be very costly. For the web, JavaScript needs to be strategically bundled or unbundled, depending on the needs of a given application. This generally requires more sophisticated module and bundling systems.

But for a lot of simple web apps, a single bundle is fine. A project called Browserify provided a simple way to achieve this. It allowed people to structure web projects like Node projects, using Node dependencies pulled from NPM, building a single bundle. A core challenge here was that a lot of NPM packages had dependencies on Node-specific APIs that aren’t available in the browser, even if sometimes, these APIs had browser-appropriate alternatives.

Can you guess what the solution was?

The solution was to create an entry in package.json called "browser", which would target specific files for replacement with browser-specific alternatives.
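To make that concrete, here is a minimal sketch of how a bundler might apply such a mapping. The package contents and the applyBrowserField helper are hypothetical, written only for illustration; real bundlers handle more cases than this (for instance, non-string values in the mapping are simply skipped here).

```javascript
const path = require("path")

// Hypothetical package.json contents, for illustration only.
const examplePackage = {
  name: "some-library",
  main: "index.js",
  browser: {
    "./lib/storage.js": "./browser-version/storage.js"
  }
}

// Given a package root, its parsed package.json, and the absolute path of a
// file inside that package, return the path a browser-targeted bundler
// would load instead. Files without a mapping are returned unchanged.
function applyBrowserField(packageRoot, packageJson, filePath) {
  const mapping = packageJson.browser
  if (!mapping || typeof mapping !== "object") return filePath
  for (const [from, to] of Object.entries(mapping)) {
    if (typeof to !== "string") continue
    if (path.posix.resolve(packageRoot, from) === filePath) {
      return path.posix.resolve(packageRoot, to)
    }
  }
  return filePath
}

const root = "/app/node_modules/some-library"
const swapped = applyBrowserField(root, examplePackage, root + "/lib/storage.js")
const untouched = applyBrowserField(root, examplePackage, root + "/lib/model.js")
```

Here swapped points into the browser-version directory while untouched keeps its original path, which is the same silent substitution I saw in the Dev Tools.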

Browserify was nice for smaller projects, but it lacked the flexibility needed for larger projects, especially ones for which a single CommonJS-based bundle wouldn’t work. Enter Webpack.

Webpack quickly became the standard for web app bundling, despite a steep learning curve. Ultimately, it caught on because it’s general enough to grow to any level of required complexity, but you can also stand up a simple build with very little effort, or just use a premade boilerplate.

In 2014, Webpack adopted the "browser" standard with little fanfare or documentation. I’m sure Browserify compatibility here was seen as an easy win. In 2015, NeDB took advantage of this, in what appears to have been an uncontroversial PR.

What no one could have anticipated was the meteoric rise of a new JavaScript runtime, called Electron, which provides access to Node and browser APIs at the same time in its renderer processes. Webpack has a build target for this called "electron-renderer", but this still recognizes the "browser" field.

Chasing down a Webpack fix

Patching the issue with forks worked fine, but I couldn’t help but think there was probably a simple Webpack configuration fix.

Narrator voiceover: There wasn’t.

This project is largely educational at this point, so I figured it would be worth my time to finally gain some Webpack knowledge. It’s something I’ve put off for at least a couple years, as I’ve always been able to skate by on the occasional incantation learned from Stack Overflow or someone else on the team having the right answer.

I tried a couple of fixes using Webpack’s resolve.alias configuration option, but with no joy.

My brother put me on to the most likely location for a fix, Webpack’s enhanced-resolve subproject. This is the piece of code that identifies what actual file should be read in when Webpack is collecting all the code that should belong in the bundle and wiring it all together. The process of figuring out exactly what file to use for your import leftPad from "left-pad" is not at all simple. There is an incredible number of options and conventions to implement and prioritize.

Webpack’s compilation process famously has a system of loaders and plugins, which can be used to configure your build. The enhanced-resolve subsystem is complex enough to have its own system of plugins. In fact, the entire resolution process is implemented with plugins.

The resolution process, in detail

I’m going to skip over all the trial and error that led to my understanding of how this process works and skip to explaining it. If I’m in foo.js and I’m doing an import bar from "./bar", the resolution process starts with a request object containing two basic pieces of data:

  • path: the directory containing foo.js
  • request (a confusingly named property of the request object): the actual argument of the require or import.

The goal of the process is to rewrite the request object such that path points to the actual path of the bar script. That might be bar.js, but as you might know, there are a bunch of other possible extensions.
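As a rough illustration of that before-and-after shape, here is a toy sketch. The toyResolve function, the file set, and the paths are all made up for this example; real request objects in enhanced-resolve carry many more fields, and the real resolver consults the filesystem rather than a Set.

```javascript
const path = require("path")

// The request as it might look when resolution starts for
// `import bar from "./bar"` inside /project/src/foo.js.
const initialRequest = { path: "/project/src", request: "./bar" }

// Toy resolver: try each extension in order against a set of files that
// "exist", and return a copy of the request whose `path` points at the
// first match. Returns null when nothing matches.
function toyResolve(request, extensions, existingFiles) {
  const base = path.posix.join(request.path, request.request)
  for (const ext of extensions) {
    const candidate = base + ext
    if (existingFiles.has(candidate)) {
      return Object.assign({}, request, { path: candidate })
    }
  }
  return null
}

const existingFiles = new Set(["/project/src/bar.json"])
const resolved = toyResolve(initialRequest, [".js", ".json"], existingFiles)
```

In this sketch, resolved.path ends up as the full path of bar.json because no bar.js exists, while initialRequest itself is left unmodified.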

The plugins register themselves in an execution graph, using a library called tapable. The nodes in this graph are called hooks and the plugins define taps, which navigate from one hook to the next hook, carrying some state along for the ride. The graph has a starting point, a hook called "resolve". All of these hooks pass around a callback, a familiar pattern if you’ve ever written Node code in the days before promises. This callback is used in three possible ways:

  • A plugin can call callback(null, request) to terminate the resolution process with request in the state described above. By convention, the only plugin that does this is the ResultPlugin, which only runs when another plugin navigates to the "resolved" hook, and it is the only plugin to tap into that hook.
  • A plugin can call callback(err, null) to terminate the resolution process with a fatal error, causing the Webpack build to fail.
  • Most interestingly, a plugin can call callback(), which bails. What this means is that the tapable library returns to the previous hook and tries the next plugin to tap that hook.

In algorithms/data structures terminology, the plugins form something like a directed acyclic graph (DAG), where the hooks are nodes and the plugins declare taps that connect one hook to the next. The resolution process executes a depth-first search for the first valid path from "resolve" to "resolved". Bailing is what causes backtracking in this search.
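The search-with-backtracking behavior can be modeled in a few lines of plain JavaScript. To be clear, this is a toy model and not tapable’s actual API: createToyResolver, tap, and run are names I made up, and the hook called "file" is invented for the example. Only the callback convention (result, error, or bail) mirrors what is described above.

```javascript
function createToyResolver() {
  const hooks = new Map() // hook name -> array of taps, in registration order

  function tap(hookName, fn) {
    if (!hooks.has(hookName)) hooks.set(hookName, [])
    hooks.get(hookName).push(fn)
  }

  // Try each tap on a hook in turn. A tap may continue the search by
  // calling `goTo` (which is `run` itself) on another hook.
  function run(hookName, request, callback) {
    const taps = hooks.get(hookName) || []
    let index = 0
    function next() {
      // Every tap bailed: bail ourselves, so the caller backtracks.
      if (index >= taps.length) return callback()
      const fn = taps[index++]
      fn(request, run, (err, result) => {
        if (err) return callback(err) // fatal error ends the search
        if (result) return callback(null, result) // success ends the search
        next() // this tap bailed, so try the next tap on the same hook
      })
    }
    next()
  }

  return { tap, run }
}

const toy = createToyResolver()
const knownFiles = new Set(["/src/bar.json"])

// Two taps on "resolve", each trying a different extension.
for (const ext of [".js", ".json"]) {
  toy.tap("resolve", (request, goTo, callback) => {
    const path = request.path + ext
    goTo("file", Object.assign({}, request, { path }), callback)
  })
}
// Bail when the candidate file does not exist; otherwise move on.
toy.tap("file", (request, goTo, callback) => {
  if (!knownFiles.has(request.path)) return callback()
  goTo("resolved", request, callback)
})
// The only tap that finishes the search with a result.
toy.tap("resolved", (request, goTo, callback) => callback(null, request))

let outcome
toy.run("resolve", { path: "/src/bar" }, (err, result) => {
  outcome = { err, result }
})
```

The ".js" branch reaches the "file" hook, bails, and control returns to "resolve", where the ".json" tap gets its turn and succeeds. Everything here runs synchronously, so outcome is already populated after the call to run.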

Webpack allows you to define your own resolver plugins, which you can register by adding them to the resolve.plugins array in your Webpack config. Because the default graph of hooks and taps is already defined, it provides a little syntactic sugar for making sure a tap your plugin adds to a hook is either tried first or last.

Weren’t you trying to fix a problem, or something?

I can, at last, take you to the solution I settled upon, with my brother’s help.

// Prevent nedb from substituting browser storage when running from the
// Electron renderer process.
const fixNedbForElectronRenderer = {
  apply(resolver) {
    resolver
      // Plug in after the description file (package.json) has been
      // identified for the import, because we'll depend on it for some of
      // the logic below.
      .getHook("beforeDescribed-relative")
      .tapAsync(
        "FixNedbForElectronRenderer",
        (request, resolveContext, callback) => {
          // Detect that the import is from NeDB via the description file
          // detected for the import. Calling `callback` with no parameters
          // "bails", which proceeds with the normal resolution process.
          if (request.descriptionFileData.name !== "nedb") {
            return callback()
          }

          // When a require/import matches the target files from nedb, we
          // can form the paths to the Node-specific versions of the files
          // relative to the location of the description file. We can then
          // short-circuit the Webpack resolution process by calling the
          // callback with the finalized request object -- meaning that
          // the `path` is pointing at the file that should be imported.
          let relativePath
          if (
            request.path.startsWith(
              resolver.join(request.descriptionFileRoot, "lib/storage")
            )
          ) {
            relativePath = "lib/storage.js"
          } else if (
            request.path.startsWith(
              resolver.join(
                request.descriptionFileRoot,
                "lib/customUtils"
              )
            )
          ) {
            relativePath = "lib/customUtils.js"
          } else {
            // Must be a different file from NeDB, so bail.
            return callback()
          }

          const path = resolver.join(
            request.descriptionFileRoot,
            relativePath
          )
          const newRequest = Object.assign({}, request, { path })
          callback(null, newRequest)
        }
      )
  }
}

// Register the resolver plugin in the webpack config
const config = {
  resolve: {
    plugins: [fixNedbForElectronRenderer]
  }
}

In the resolution process, a hook called "described-relative" is called after the package.json for the imported file is located (as descriptionFilePath) and loaded (as descriptionFileData). The "before" prefix in "beforeDescribed-relative" is the syntactic sugar mentioned earlier: it asks for our tap to be tried ahead of the default taps on that hook. For reasons I don’t know, Webpack calls package.json the “description file”. This file is important, because it identifies the Node module a required script belongs to. It’s how we can differentiate two requires that happen to have the same name. It also gives us an anchor point on our filesystem (as descriptionFileRoot) from which we can resolve the file we really want.

This is exactly what we do. Once we have the path of the files we want, we terminate the resolution process without calling any more hooks. This breaks the convention that only ResultPlugin does this…

…but YOLO.

For all imports that aren’t the file we’re looking for, we bail, which causes the rest of the resolution process to proceed as normal.

In reflection

What did we learn here?

Well, this was quite an adventure to solve a problem, when it seems like that problem shouldn’t exist in the first place. But the thing is, it’s tough to identify who’s actually at fault here. Every decision seems rational, considering few people would have imagined this use case for this combination of software several years ago. I suppose you could say this shows how the history of the ecosystem is still with us, even as the technology becomes ever more advanced.

I’ve spent a lot of time trying to figure out what the ideal solution here would be, preserving the idea of a single project for an increasingly diverse set of JavaScript runtimes. My best idea so far is that Webpack should look for an "electron-renderer" key in package.json files when in "target": "electron-renderer" mode, which should be taken to override any "browser" key that may exist. And then library authors can update accordingly.
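Here is a sketch of what that lookup could look like. This is purely hypothetical: neither the substitutionsFor function nor an "electron-renderer" package.json key exists in Webpack today, and the package contents are invented for the example.

```javascript
// Hypothetical: pick which package.json field supplies file substitutions
// for a given build target. An "electron-renderer" key, when present,
// overrides "browser" for the "electron-renderer" target.
function substitutionsFor(packageJson, target) {
  const hasKey = key =>
    Object.prototype.hasOwnProperty.call(packageJson, key)

  if (target === "electron-renderer") {
    if (hasKey("electron-renderer")) return packageJson["electron-renderer"]
    return packageJson.browser || {}
  }
  if (target === "web") return packageJson.browser || {}
  return {} // e.g. "node": no substitutions
}

// A library that wants browser substitutions on the web but none in an
// Electron renderer could declare an empty override.
const libraryPackage = {
  name: "some-database",
  browser: { "./lib/storage.js": "./browser-version/storage.js" },
  "electron-renderer": {}
}

const forWeb = substitutionsFor(libraryPackage, "web")
const forElectron = substitutionsFor(libraryPackage, "electron-renderer")
```

With this arrangement, packages that never add the new key would keep behaving exactly as they do now, so the change would be opt-in for library authors.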

Lastly, I guess it just goes to show that none of this stuff is magic. Somewhere down the stack, the strange things you see on the surface are happening because of actual code written by a human being, and if you pull at the thread long enough, eventually you’ll figure it out.

Huge thanks to Brian Johnson who hammered out the outline of this fix and helped nail it down.
