
Guide to Monorepos for Front-end Code

Alexander is an experienced front-end developer who has built eCommerce and enterprise websites as well as web and mobile applications.

Monorepos are a hot topic for discussion. There have been a lot of articles recently about why you should and shouldn’t use this type of architecture for your project, but most of them are biased in one way or another. This series is an attempt to gather and explain as much information as possible to understand how and when to use monorepos.

A monorepository is an architectural concept whose meaning is right there in the name: instead of managing multiple repositories, you keep all your isolated code parts inside one repository. Keep in mind the word isolated—it means that a monorepo has nothing in common with a monolithic app. You can keep many kinds of logical apps inside one repo; for example, a website and its iOS app.

Comparison of a monorepo, single repo, and multi-repo

This concept is relatively old and appeared about a decade ago. Google was one of the first companies to adopt this approach for managing its codebases. You may ask: if it has existed for a decade, why is it such a hot topic only now? Mostly because, over the last five or six years, many things have undergone dramatic changes: ES6, CSS preprocessors like SCSS, task runners, npm, etc. Nowadays, to maintain even a small React-based app, you have to deal with project bundlers, test suites, CI/CD scripts, Docker configurations, and who knows what else. Now imagine that instead of a small app, you need to maintain a huge platform consisting of many functional areas. If you are thinking about architecture, you will want to do two main things: separate concerns and avoid code duplication.

To make this happen, you will probably want to isolate large features into packages and then use them via a single entry point in your main app. But how do you manage those packages? Each package needs its own workflow environment configuration, which means that every time you create a new package, you have to configure a new environment, copy over all the configuration files, and so on. And if you have to change something in your build system, you have to go over each repo, make a commit, create a pull request, and wait for each build, which slows you down a lot. This is where monorepos come in.

Instead of having a lot of repositories with their own configs, we will have only one source of truth—the monorepo: one test suite runner, one Docker configuration file, and one configuration for Webpack. And you still have scalability, opportunity to separate concerns, code sharing with common packages, and a lot of other pros. Sounds nice, right? Well, it is. But there are some drawbacks as well. Let’s take a close look at the exact pros and cons of using the monorepo in the wild.

Monorepo Advantages:

  • One place to store all configs and tests. Since everything is located inside one repo, you can configure your CI/CD and bundler once and then just re-use configs to build all packages before publishing them to remote. Same goes for unit, e2e, and integration tests—your CI will be able to launch all tests without having to deal with additional configuration.
  • Easily refactor global features with atomic commits. Instead of doing a pull request for each repo and figuring out in which order to build your changes, you just make one atomic pull request containing all the commits related to the feature you are working on.
  • Simplified package publishing. If you plan to implement a new feature inside a package that depends on another package with shared code, you can do it with a single command. This function needs some additional configuration, which is discussed in the tooling review part of this article. Currently, there is a rich selection of tools, including Lerna, Yarn Workspaces, and Bazel.
  • Easier dependency management. Only one package.json. No need to re-install dependencies in each repo whenever you want to update your dependencies.
  • Re-use code with shared packages while still keeping them isolated. A monorepo allows you to reuse your packages from other packages while keeping them isolated from one another. You can use a reference to the remote package and consume it via a single entry point. To use the local version instead, you can use local symlinks. This feature can be implemented via bash scripts or by introducing additional tools like Lerna or Yarn.

Monorepo Disadvantages:

  • No way to restrict access only to some parts of the app. Unfortunately, you can’t share only the part of your monorepo—you will have to give access to the whole codebase, which might lead to some security issues.
  • Poor Git performance when working on large-scale projects. This issue starts to appear only on huge applications with more than a million commits and hundreds of devs doing their work simultaneously every day over the same repo. This becomes especially troublesome as Git uses a directed acyclic graph (DAG) to represent the history of a project. With a large number of commits, any command that walks the graph could become slow as the history deepens. Performance also slows down because of the number of refs (i.e., branches or tags—solvable by removing refs you don’t need anymore) and the number of files tracked (as well as their size, although the heavy-file issue can be resolved with Git LFS).

    Note: Nowadays, Facebook is trying to resolve VCS scalability issues by patching Mercurial, so this probably won’t be such a big issue for much longer.

  • Higher build time. Because you will have a lot of source code in one place, it will take way more time for your CI to run everything in order to approve every PR.

Tool Review

The set of tools for managing monorepos is constantly growing, and it’s easy to get lost in the variety of build systems currently available. You can always stay aware of the popular solutions by using this repo. But for now, let’s take a quick look at the tools that are heavily used with JavaScript nowadays:

  • Bazel is Google’s monorepo-oriented build system. More on Bazel: awesome-bazel
  • Yarn is a JavaScript dependency management tool that supports monorepos through workspaces.
  • Lerna is a tool for managing JavaScript projects with multiple packages, built on Yarn.

Most of the tools use a really similar approach, but there are some nuances.

Illustration of the monorepo git repository's CI/CD process

We will dive deeper into the Lerna workflow as well as into the other tools in Part 2 of this article since it is a rather large topic. For now, let’s just get an overview of what’s inside:

Lerna

This tool really helps when dealing with semantic versioning, setting up a build workflow, publishing your packages, etc. The main idea behind Lerna is that your project has a packages folder, which contains all of your isolated code parts. Besides packages, you have the main app, which can live, for example, in the src folder. Almost all operations in Lerna follow a simple rule: iterate through all of your packages and perform some action on them, e.g., increase the package version, update the dependencies of all packages, build all packages, etc.
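As a rough sketch (the folder names here are the conventional ones, not requirements), a Lerna-managed project tends to look like this:

```
my-platform/
├── lerna.json        # e.g. { "version": "1.0.0", "packages": ["packages/*"] }
├── package.json      # root-level dev dependencies and scripts
├── src/              # the main app
└── packages/
    ├── ui-kit/
    │   └── package.json
    └── api-client/
        └── package.json
```

The "packages" glob in lerna.json is what tells Lerna which folders to iterate over.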

With Lerna, you have two options on how to use your packages:

  1. Without pushing them to a remote registry (npm)
  2. Pushing your packages to a remote registry

With the first approach, you are able to use local references for your packages and basically don’t need to care about symlinks to resolve them.

But if you are using the second approach, you are forced to import your packages from the remote (e.g., import { something } from '@yourcompanyname/packagename';), which means that you will always get the remote version of your package. For local development, you will have to create symlinks in the root of your folder to make the bundler resolve local packages instead of those inside your node_modules/. That’s why, before launching Webpack or your favorite bundler, you will have to run lerna bootstrap, which will automatically link all packages.
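Assuming Lerna is installed in the repo, that local development loop boils down to something like the following (the "build" script name is just an example):

```shell
npx lerna bootstrap    # install dependencies and symlink local packages
npx lerna run build    # run the "build" npm script in every package
npx webpack            # the bundler now resolves local package versions
```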

An illustration of namespacing your modules inside a single node package

Yarn

Yarn is primarily a dependency manager for npm packages, and it was not initially built to support monorepos. But in version 1.0, the Yarn developers released a feature called Workspaces. At release time, it wasn’t that stable, but after a while, it became usable for production projects.

A workspace is basically a package with its own package.json and possibly some specific build rules (for example, a separate tsconfig.json if you use TypeScript in your project). You could actually manage without Yarn Workspaces, using bash to get the exact same setup, but this tool eases the process of installing and updating dependencies per package.

At a glance, Yarn with its workspaces provides the following useful features:

  1. Single node_modules folder in the root for all packages. For example, if you have packages/package_a and packages/package_b—with their own package.json—all dependencies will be installed only in the root. That is one of the differences between how Yarn and Lerna work.
  2. Dependency symlinking to allow local package development.
  3. Single lockfile for all dependencies.
  4. Focused dependency updates, in case you want to re-install dependencies for only one package. This can be done using the --focus flag.
  5. Integration with Lerna. You can easily make Yarn handle all the installation/symlinking and let Lerna take care of publishing and version control. This is the most popular setup so far since it requires less effort and is easy to work with.
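To turn a repo into a workspace root, the root package.json needs just two fields; "private": true is required because the root itself should never be published (the glob below assumes the packages/ layout from earlier):

```json
{
  "private": true,
  "workspaces": ["packages/*"]
}
```

After that, a single yarn install at the root installs and symlinks everything.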


Bazel

Bazel is a build tool for large-scale applications, which can handle multi-language dependencies and support a lot of modern languages (Java, JS, Go, C++, etc.). In most cases, using Bazel for small-to-medium JS applications is overkill, but on a large scale, it may provide a lot of benefit because of its performance.

By its nature, Bazel looks similar to Make, Gradle, Maven, and other tools that build projects based on a file containing a description of the build rules and project dependencies. This file in Bazel is called BUILD and is located inside the workspace of the Bazel project. BUILD files are written in Starlark, a human-readable, high-level build language that looks a lot like Python.

Usually, you won’t be dealing with BUILD files much, because there is a lot of boilerplate that can easily be found on the web, already configured and ready for development. Whenever you build your project, Bazel basically does the following:

  1. Loads the BUILD files relevant to the target.
  2. Analyzes the inputs and their dependencies, applies the specified build rules, and produces an action graph.
  3. Executes the build actions on the inputs until the final build outputs are produced.
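For a flavor of Starlark, here is a deliberately tiny BUILD file using only Bazel’s built-in filegroup and glob; real JS builds would pull in rules from something like rules_nodejs, which is out of scope here:

```python
# BUILD — collects the app's JS sources into a named target that
# other rules (bundling, testing) can then depend on.
filegroup(
    name = "js_srcs",
    srcs = glob(["src/**/*.js"]),
)
```

Building it is then a matter of running `bazel build //path/to/package:js_srcs`.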


Conclusion

Monorepos are just a tool. There are a lot of arguments about whether they have a future, but the truth is that in some cases, this tool does its job efficiently. Over the past few years, it has evolved, gained much more flexibility, overcome a lot of issues, and shed a layer of configuration complexity.

There are still a lot of issues to figure out, like poor Git performance, but hopefully, this will be resolved in the near future.

If you’d like to learn to build a robust CI/CD pipeline for your app, I recommend How to Build an Effective Initial Deployment Pipeline with GitLab CI.

Understanding the basics

What is meant by monolithic architecture?

Monolithic architecture in software development is an approach where you implement your application as a set of tightly coupled components/features composed into one piece. You can’t use any of its components outside of the app’s scope.

What are source code repositories?

A repository is a place to store and retrieve source code to install/develop it. Repos store all versions of data with related metadata, which allows for history revisions or working on separate parallel versions of the same project.

Why is modularity important in programming?

Loosely coupled code has proven to be more reliable because it allows us to introduce changes without being afraid to break things in other places. This makes it safer to develop and simplifies refactoring.

What are symlinks?

Symlinks (symbolic links) are files containing a reference to an original folder or file, in the form of a relative or absolute path rather than actual content. Here, they're used to redirect pathname resolution to the proper place.
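For example, on a POSIX system (the file names here are throwaway examples):

```shell
# The link stores only a path; reading it follows that path to the original.
echo "hello" > original.txt
ln -sf original.txt link.txt
readlink link.txt    # prints the stored path: original.txt
cat link.txt         # resolves to the original content: hello
```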

What’s the most common way to make modules in JS?

The most popular way to isolate code into separate modules nowadays is by using npm packages (managed with Yarn or npm). Each package is wrapped into a folder with its own config files and then pushed to a remote registry.
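A package in this sense is just a folder with a manifest; a minimal one looks like the following (the scoped name reuses the placeholder from earlier in the article):

```json
{
  "name": "@yourcompanyname/packagename",
  "version": "0.1.0",
  "main": "index.js"
}
```

Running `npm publish` (or `yarn publish`) from that folder then pushes it to the registry.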