Codemods with jscodeshift

How many times have you used the find-and-replace functionality across a directory to make changes to JavaScript source files? If you’re good, you’ve gotten fancy and used regular expressions with capturing groups, because it’s worth the effort if your code base is sizable. Regex has limits, though. For non-trivial changes you need a developer who understands the code in context and is also willing to take on the long, tedious, and error-prone process.

This is where “codemods” come in.

Codemods are scripts used to rewrite other scripts. Think of them as a find and replace functionality that can read and write code. You can use them to update source code to fit a team’s coding conventions, make widespread changes when an API is modified, or even auto-fix existing code when your public package makes a breaking change.

The jscodeshift toolkit is great for working with codemods.


In this article, we’re going to explore a toolkit for codemods called “jscodeshift” while creating three codemods of increasing complexity. By the end, you will have broad exposure to the important aspects of jscodeshift and will be ready to start writing your own codemods. The three exercises cover some basic, but awesome, uses of codemods, and you can view their source code in my GitHub project.

What Is jscodeshift?

The jscodeshift toolkit allows you to pump a bunch of source files through a transform and replace them with what comes out the other end. Inside the transform, you parse the source into an abstract syntax tree (AST), poke around to make your changes, then regenerate the source from the altered AST.

The interface that jscodeshift provides is a wrapper around the recast and ast-types packages. recast handles the conversion from source to AST and back, while ast-types handles the low-level interaction with the AST nodes.

Setup

To get started, install jscodeshift globally from npm.

npm i -g jscodeshift

There are runner options you can use and an opinionated test setup that makes running a suite of tests via Jest (an open source JavaScript testing framework) really easy, but we’re going to bypass that for now in favor of simplicity:

jscodeshift -t some-transform.js input-file.js -d -p

This will run input-file.js through the transform some-transform.js and print the results without altering the file.

Before jumping in, though, it is important to understand three main object types that the jscodeshift API deals with: nodes, node-paths, and collections.

Nodes

Nodes are the basic building blocks of the AST, often referred to as “AST nodes.” These are what you see when exploring your code with AST Explorer. They are simple objects and do not provide any methods.

Node-paths

Node-paths are wrappers around an AST node, provided by ast-types as a way to traverse the abstract syntax tree (AST, remember?). In isolation, nodes do not have any information about their parent or scope, so node-paths take care of that. You can access the wrapped node via the node property, and there are several methods available to change the underlying node. Node-paths are often referred to as just “paths.”

Collections

Collections are groups of zero or more node-paths that the jscodeshift API returns when you query the AST. They have all sorts of useful methods, some of which we will explore.

Collections contain node-paths, node-paths contain nodes, and nodes are what the AST is made of. Keep that in mind and it will be easy to understand the jscodeshift query API.
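To make that nesting concrete, here is a minimal sketch of stepping down from a collection to its node-paths and then to the nodes. It assumes collection is whatever a query such as root.find(j.CallExpression) returned inside a transform; the helper name describeCollection is made up for illustration, and the sketch relies only on the collection methods size() and paths() and on the node property of a path.

```javascript
// Hypothetical helper (not part of jscodeshift): summarize what a
// collection holds. `collection` is assumed to be the result of a query
// such as root.find(j.CallExpression) inside a transform.
function describeCollection(collection) {
  // A collection wraps zero or more node-paths.
  const paths = collection.paths();

  // Each node-path wraps one AST node, exposed as `path.node`.
  // The node is a plain object with a `type` property.
  const nodeTypes = paths.map((path) => path.node.type);

  return { count: collection.size(), nodeTypes };
}
```

Logging the result of a helper like this while developing a transform is a quick way to check which of the three object types you are holding at any point.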

It can be tough to keep track of the differences between these objects and their respective API capabilities, so there’s a nifty tool called jscodeshift-helper that logs the object type and provides other key information.

Knowing the difference between nodes, node-paths, and collections is important.


Exercise 1 - Remove Calls To console

To get our feet wet, let’s start with removing calls to all console methods in our code base. While you can do this with find and replace and a little regex, it starts to get tricky with multiline statements, template literals, and more complex calls, so it’s an ideal example to start with.

First, create two files, remove-consoles.js and remove-consoles.input.js:

//remove-consoles.js

export default (fileInfo, api) => {
};

//remove-consoles.input.js

export const sum = (a, b) => {
  console.log('calling sum with', arguments);
  return a + b;
};
  
export const multiply = (a, b) => {
  console.warn('calling multiply with',
    arguments);
  return a * b;
};

export const divide = (a, b) => {
  console.error(`calling divide with ${ arguments }`);
  return a / b;
};

export const average = (a, b) => {
  console.log('calling average with ' + arguments);
  return divide(sum(a, b), 2);
};

Here’s the command we’ll be using in the terminal to push it through jscodeshift:

jscodeshift -t remove-consoles.js remove-consoles.input.js -d -p

If everything’s set up correctly, when you run it you should see something like this.

Processing 1 files... 
Spawning 1 workers...
Running in dry mode, no files will be written! 
Sending 1 files to free worker...
All done. 
Results: 
0 errors
0 unmodified
1 skipped
0 ok
Time elapsed: 0.514seconds

OK, that was a bit anticlimactic since our transform doesn’t actually do anything yet, but at least we know it’s all working. If it doesn’t run at all, make sure you installed jscodeshift globally. If the command to run the transform is incorrect, you’ll either see an “ERROR Transform file … does not exist” message or “TypeError: path must be a string or Buffer” if the input file cannot be found. If you’ve fat-fingered something, it should be easy to spot with the very descriptive transformation errors.

Our end goal though, after a successful transform, is to see this source:

export const sum = (a, b) => {
  return a + b;
};
  
export const multiply = (a, b) => {
  return a * b;
};

export const divide = (a, b) => {
  return a / b;
};

export const average = (a, b) => {
  return divide(sum(a, b), 2);
};

To get there, we need to convert the source into an AST, find the consoles, remove them, and then convert the altered AST back into source. The first and last steps are easy:

//remove-consoles.js

export default (fileInfo, api) => {
  const j = api.jscodeshift;
  const root = j(fileInfo.source);

  return root.toSource();
};

But how do we find the consoles and remove them? Unless you have some exceptional knowledge of the Mozilla Parser API, you’ll probably need a tool to help understand what the AST looks like. For that you can use the AST Explorer. Paste the contents of remove-consoles.input.js into it and you’ll see the AST. There is a lot of data even in the simplest code, so it helps to hide location data and methods. You can toggle the visibility of properties in AST Explorer with the checkboxes above the tree.

We can see that calls to console methods are referred to as CallExpressions, so how do we find them in our transform? We use jscodeshift’s queries, remembering our earlier discussion on the differences between Collections, node-paths and nodes themselves:

//remove-consoles.js

export default (fileInfo, api) => {
  const j = api.jscodeshift;
  const root = j(fileInfo.source);

  return root.toSource();
};

The line const root = j(fileInfo.source); returns a collection of one node-path, which wraps the root AST node. We can use the collection’s find method to search for descendant nodes of a certain type, like so:

const callExpressions = root.find(j.CallExpression);

This returns another collection of node-paths containing just the nodes that are CallExpressions. At first blush, this seems like what we want, but it is too broad. We might end up running hundreds or thousands of files through our transforms, so we have to be precise to have any confidence that it will work as intended. The naive find above would not just find the console CallExpressions, it would find every CallExpression in the source, including:

require('foo')
bar()
setTimeout(() => {}, 0)

To force greater specificity, we provide a second argument to .find: an object of additional properties that each node must match to be included in the results. We can look at the AST Explorer to see that our console.* calls have this form:

{
  "type": "CallExpression",
  "callee": {
    "type": "MemberExpression",
    "object": {
      "type": "Identifier",
      "name": "console"
    }
  }
}

With that knowledge, we know to refine our query with a specifier that will return only the type of CallExpressions we’re interested in:

const callExpressions = root.find(j.CallExpression, {
  callee: {
    type: 'MemberExpression',
    object: { type: 'Identifier', name: 'console' },
  },
});
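To build some intuition for how that second argument narrows the results, here is a simplified model of the matching written against plain objects. It is an assumption-laden sketch, not jscodeshift’s actual implementation: the function name matchesFilter and the trimmed-down sample nodes are invented for illustration. The idea it shows is that a node matches when every property named in the filter is present on the node with an equal value, and properties the filter does not mention are ignored.

```javascript
// Simplified model of filter matching; NOT jscodeshift's real code.
// A node matches when every property named in the filter exists on the
// node with an equal value, recursing into nested objects.
function matchesFilter(node, filter) {
  if (filter === null || typeof filter !== 'object') {
    return node === filter;
  }
  if (node === null || typeof node !== 'object') {
    return false;
  }
  return Object.keys(filter).every((key) =>
    matchesFilter(node[key], filter[key])
  );
}

const consoleFilter = {
  callee: {
    type: 'MemberExpression',
    object: { type: 'Identifier', name: 'console' },
  },
};

// console.log('hi') as an AST node, trimmed to the relevant properties
const consoleCall = {
  type: 'CallExpression',
  callee: {
    type: 'MemberExpression',
    object: { type: 'Identifier', name: 'console' },
    property: { type: 'Identifier', name: 'log' },
  },
  arguments: [{ type: 'Literal', value: 'hi' }],
};

// bar() as an AST node: the callee is a bare Identifier
const plainCall = {
  type: 'CallExpression',
  callee: { type: 'Identifier', name: 'bar' },
  arguments: [],
};

console.log(matchesFilter(consoleCall, consoleFilter)); // true
console.log(matchesFilter(plainCall, consoleFilter)); // false
```

Because unmentioned properties are ignored, the filter matches console.log, console.warn, and console.error alike; adding a property entry to the callee would narrow it to a single method.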

Now that we’ve got an accurate collection of the call sites, let’s remove them from the AST. Conveniently, the collection object type has a remove method that will do just that. Our remove-consoles.js file will now look like this:

//remove-consoles.js

export default (fileInfo, api) => {
  const j = api.jscodeshift;

  const root = j(fileInfo.source)

  const callExpressions = root.find(j.CallExpression, {
      callee: {
        type: 'MemberExpression',
        object: { type: 'Identifier', name: 'console' },
      },
    }
  );

  callExpressions.remove();

  return root.toSource();
};

Now, if we run our transform from the command line using jscodeshift -t remove-consoles.js remove-consoles.input.js -d -p, we should see:

Processing 1 files... 
Spawning 1 workers...
Running in dry mode, no files will be written! 
Sending 1 files to free worker...

export const sum = (a, b) => {
  return a + b;
};
  
export const multiply = (a, b) => {
  return a * b;
};

export const divide = (a, b) => {
  return a / b;
};

export const average = (a, b) => {
  return divide(sum(a, b), 2);
};

All done. 
Results: 
0 errors
0 unmodified
0 skipped
1 ok
Time elapsed: 0.604seconds

It looks good. Now that our transform alters the underlying AST, using .toSource() generates a different string from the original. The -p option from our command displays the result, and a tally of dispositions for each file processed is shown at the bottom. Removing the -d option from our command would replace the content of remove-consoles.input.js with the output from the transform.

Our first exercise is complete… almost. The code is bizarre looking and probably very offensive to any functional purists out there, and so to make transform code flow better, jscodeshift has made most things chainable. This allows us to rewrite our transform like so:

// remove-consoles.js

export default (fileInfo, api) => {
  const j = api.jscodeshift;

  return j(fileInfo.source)
    .find(j.CallExpression, {
        callee: {
          type: 'MemberExpression',
          object: { type: 'Identifier', name: 'console' },
        },
      }
    )
    .remove()
    .toSource();
};
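One caveat before relying on this across a large code base: the query matches every console.* CallExpression, including calls nested inside a larger expression (for example, on the right-hand side of an assignment), where removing just the call may not leave valid code behind. A more conservative variant, sketched below, only removes calls that make up a whole statement. The helper name removeConsoleStatements is hypothetical; it assumes j is api.jscodeshift and root is j(fileInfo.source).

```javascript
// Sketch: remove only console.* calls that form a complete statement,
// e.g. `console.log(x);`, and leave calls embedded in larger expressions
// alone. `j` is assumed to be api.jscodeshift and `root` to be
// j(fileInfo.source); the helper name is made up for this example.
function removeConsoleStatements(j, root) {
  return root
    .find(j.ExpressionStatement, {
      expression: {
        type: 'CallExpression',
        callee: {
          type: 'MemberExpression',
          object: { type: 'Identifier', name: 'console' },
        },
      },
    })
    .remove();
}
```

Inside a transform, you would call removeConsoleStatements(j, root) and then return root.toSource() as before. For the input file in this exercise the result should be the same, since every console call there is its own statement.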

Much better. To recap exercise 1, we wrapped the source, queried for a collection of node-paths, changed the AST, and then regenerated the source. We’ve gotten our feet wet with a pretty simple example and touched on the most important aspects. Now, let’s do something more interesting.

Exercise 2 - Replacing Imported Method Calls

For this scenario, we’ve got a “geometry” module with a method named “circleArea” that we’ve deprecated in favor of “getCircleArea.” We could easily find and replace these with /geometry\.circleArea/g, but what if the user has imported the module and assigned it a different name? For example:

import g from 'geometry';
const area = g.circleArea(radius);

How would we know to replace g.circleArea instead of geometry.circleArea? We certainly cannot assume that all circleArea calls are the ones we’re looking for; we need some context. This is where codemods start showing their value. Let’s start by making two files, deprecated.js and deprecated.input.js.

//deprecated.js

export default (fileInfo, api) => {
  const j = api.jscodeshift;
  const root = j(fileInfo.source);

  return root.toSource();
};

//deprecated.input.js

import g from 'geometry';
import otherModule from 'otherModule';

const radius = 20;
const area = g.circleArea(radius);

console.log(area === Math.pow(g.getPi(), 2) * radius);
console.log(area === otherModule.circleArea(radius));

Now run this command to run the codemod.

jscodeshift -t ./deprecated.js ./deprecated.input.js -d -p

You should see output indicating the transform ran, but has not changed anything yet.

Processing 1 files... 
Spawning 1 workers...
Running in dry mode, no files will be written! 
Sending 1 files to free worker...
All done. 
Results: 
0 errors
1 unmodified
0 skipped
0 ok
Time elapsed: 0.892seconds

We need to know what our geometry module has been imported as. Let’s look at the AST Explorer and figure out what we’re looking for. Our import takes this form.

{
  "type": "ImportDeclaration",
  "specifiers": [
    {
      "type": "ImportDefaultSpecifier",
      "local": {
        "type": "Identifier",
        "name": "g"
      }
    }
  ],
  "source": {
    "type": "Literal",
    "value": "geometry"
  }
}

We can pass a node type and a filter object to find the matching import as a collection, like this:

const importDeclaration = root.find(j.ImportDeclaration, {
    source: {
      type: 'Literal',
      value: 'geometry',
    },
  });

This gets us the ImportDeclaration used to import “geometry”. From there, we dig down to find the local name used to hold the imported module. Since this is the first time we’ve done it, let’s call out a point that is both important and confusing when you are first starting.

Note: It’s important to know that root.find() returns a collection of node-paths. From there, the .get() method returns the first node-path in that collection, and to get the actual node, we use .node. (The .get(0) call in the code below is a common idiom for reaching that first node; .get() does not index into the collection, so use .at(n) or .paths()[n] to reach other positions.) The node is basically what we see in AST Explorer. Remember, the node-path is mostly information about the scope and relationships of the node, not the node itself.

// find the Identifiers
const identifierCollection = importDeclaration.find(j.Identifier);

// get the first NodePath from the Collection
const nodePath = identifierCollection.get(0);

// get the Node in the NodePath and grab its "name"
const localName = nodePath.node.name;
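Grabbing the first Identifier works for a default import like the one in our input file. In other files it is more fragile: with a named import, the first Identifier found is not guaranteed to be the local binding, and a file that does not import the module at all has nothing to grab. The sketch below is one possible, more defensive lookup. The helper name getDefaultImportName is hypothetical; it assumes j is api.jscodeshift and root is j(fileInfo.source), and it uses the collection’s nodes() method to get plain nodes back. It also matches the import source by value only, which is an assumption meant to keep the query independent of the literal node type a given parser produces.

```javascript
// Sketch: look up the local name bound by `import <name> from '<module>'`.
// `j` is assumed to be api.jscodeshift and `root` to be j(fileInfo.source).
// Returns null when the file has no default import of that module.
function getDefaultImportName(j, root, moduleName) {
  const specifiers = root
    // match the import by its source value only
    .find(j.ImportDeclaration, { source: { value: moduleName } })
    // a default import is represented by an ImportDefaultSpecifier
    .find(j.ImportDefaultSpecifier)
    // nodes() returns the plain AST nodes rather than node-paths
    .nodes();

  return specifiers.length > 0 ? specifiers[0].local.name : null;
}
```

In a transform, you could call getDefaultImportName(j, root, 'geometry') and return early when the result is null. As we saw with the empty transform in exercise 1, a transform that returns nothing leaves the file alone and is counted as skipped.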

This allows us to figure out dynamically what our geometry module has been imported as. Next, we find the places it is being used and change them. By looking at AST Explorer, we can see that we need to find MemberExpressions that look like this:

{
  "type": "MemberExpression",
  "object": {
    "name": "geometry"
  },
  "property": {
    "name": "circleArea"
  }
}

Remember, though, that our module may have been imported with a different name, so we have to account for that by making our query look like this instead:

root.find(j.MemberExpression, {
  object: {
    name: localName,
  },
  property: {
    name: 'circleArea',
  },
})

Now that we have a query, we can get a collection of all the call sites to our old method and then use the collection’s replaceWith() method to swap them out. The replaceWith() method iterates through the collection, passing each node-path to a callback function. The AST Node is then replaced with whatever Node you return from the callback.


Again, understanding the difference between collections, node-paths and nodes is necessary for this to make sense.

Once we’re done with the replacement, we generate the source as usual. Here’s our finished transform:

//deprecated.js
export default (fileInfo, api) => {
  const j = api.jscodeshift;
  const root = j(fileInfo.source);

  // find declaration for "geometry" import
  const importDeclaration = root.find(j.ImportDeclaration, {
    source: {
      type: 'Literal',
      value: 'geometry',
    },
  });

  // get the local name for the imported module
  const localName =
    // find the Identifiers
    importDeclaration.find(j.Identifier)
    // get the first NodePath from the Collection
    .get(0)
    // get the Node in the NodePath and grab its "name"
    .node.name;

  return root.find(j.MemberExpression, {
      object: {
        name: localName,
      },
      property: {
        name: 'circleArea',
      },
    })

    .replaceWith(nodePath => {
      // get the underlying Node
      const { node } = nodePath;
      // change to our new prop
      node.property.name = 'getCircleArea';
      // replaceWith should return a Node, not a NodePath
      return node;
    })

    .toSource();
};

When we run the source through the transform, we see that the call to the deprecated method in the geometry module was changed, but the rest was left unaltered, like so:

import g from 'geometry';
import otherModule from 'otherModule';

const radius = 20;
const area = g.getCircleArea(radius);

console.log(area === Math.pow(g.getPi(), 2) * radius);
console.log(area === otherModule.circleArea(radius));

Exercise 3 - Changing A Method Signature

In the previous exercises we covered querying collections for specific types of nodes, removing nodes, and altering nodes, but what about creating altogether new nodes? That’s what we’ll tackle in this exercise.

In this scenario, we’ve got a method signature that’s gotten out of control with individual arguments as the software has grown, and so it has been decided it would be better to accept an object containing those arguments instead.

Instead of

const suv = car.factory('white', 'Kia', 'Sorento', 2010, 50000, null, true);

we’d like to see

const suv = car.factory({
  color: 'white',
  make: 'Kia',
  model: 'Sorento',
  year: 2010,
  miles: 50000,
  bedliner: null,
  alarm: true,
});

Let’s start by making the transform and an input file to test with:

//signature-change.js

export default (fileInfo, api) => {
  const j = api.jscodeshift;
  const root = j(fileInfo.source);

  return root.toSource();
};

//signature-change.input.js

import car from 'car';

const suv = car.factory('white', 'Kia', 'Sorento', 2010, 50000, null, true);
const truck = car.factory('silver', 'Toyota', 'Tacoma', 2006, 100000, true, true);

Our command to run the transform will be jscodeshift -t signature-change.js signature-change.input.js -d -p and the steps we need to perform this transformation are:

  • Find the local name for the imported module
  • Find all call sites to the .factory method
  • Read all arguments being passed in
  • Replace that call with a single argument which contains an object with the original values

Using the AST Explorer and the process we used in the previous exercises, the first two steps are easy:

//signature-change.js

export default (fileInfo, api) => {
  const j = api.jscodeshift;
  const root = j(fileInfo.source);

  // find declaration for "car" import
  const importDeclaration = root.find(j.ImportDeclaration, {
    source: {
      type: 'Literal',
      value: 'car',
    },
  });

  // get the local name for the imported module
  const localName =
    importDeclaration.find(j.Identifier)
    .get(0)
    .node.name;

  // find where `.factory` is being called
  return root.find(j.CallExpression, {
      callee: {
        type: 'MemberExpression',
        object: {
          name: localName,
        },
        property: {
          name: 'factory',
        },
      }
    })
    .toSource();
};

To read the arguments currently being passed in, we use the replaceWith() method on our collection of CallExpressions to swap each of the nodes. The new nodes will replace node.arguments with a single new argument, an object.


Let’s give it a try with a simple object to make sure we know how this works before we use the proper values:

    .replaceWith(nodePath => {
      const { node } = nodePath;
      node.arguments = [{ foo: 'bar' }];
      return node;
    })

When we run this (jscodeshift -t signature-change.js signature-change.input.js -d -p), the transform will blow up with:

 ERR signature-change.input.js Transformation error
Error: {foo: bar} does not match type Printable

Turns out we can’t just jam plain objects into our AST nodes. Instead, we need to use builders to create proper nodes.

Node Builders

Builders allow us to create new nodes properly; they are provided by ast-types and surfaced through jscodeshift. They rigidly check that the different types of nodes are created correctly, which can be frustrating when you’re hacking away on a roll, but ultimately, this is a good thing. To understand how to use builders, there are two things you should keep in mind:

  • All of the available AST node types are defined in the def folder of the ast-types GitHub project, mostly in core.js.
  • There are builders for all the AST node types, but they use the camel-cased version of the node type, not Pascal case. (This isn’t explicitly stated, but you can see this is the case in the ast-types source.)

If we use AST Explorer with an example of what we want the result to be, we can piece this together pretty easily. In our case, we want the new single argument to be an ObjectExpression with a bunch of properties. Looking at the type definitions mentioned above, we can see what this entails:

def("ObjectExpression")
    .bases("Expression")
    .build("properties")
    .field("properties", [def("Property")]);

def("Property")
    .bases("Node")
    .build("kind", "key", "value")
    .field("kind", or("init", "get", "set"))
    .field("key", or(def("Literal"), def("Identifier")))
    .field("value", def("Expression"));

So, the code to build an AST node for { foo: 'bar' } would look like:

j.objectExpression([
  j.property(
    'init',
    j.identifier('foo'),
    j.literal('bar')
  )  
]);

Take that code and plug it into our transform like so:

    .replaceWith(nodePath => {
      const { node } = nodePath;
      const object = j.objectExpression([
        j.property(
          'init',
          j.identifier('foo'),
          j.literal('bar')
        )
      ]);
      node.arguments = [object];
      return node;
    })

Running this gets us the result:

import car from 'car';

const suv = car.factory({
  foo: "bar"
});
const truck = car.factory({
  foo: "bar"
});

Now that we know how to create a proper AST node, it’s easy to loop through the old arguments and generate a new object to use, instead. Here’s what our signature-change.js file looks like now:

//signature-change.js

export default (fileInfo, api) => {
  const j = api.jscodeshift;
  const root = j(fileInfo.source);

  // find declaration for "car" import
  const importDeclaration = root.find(j.ImportDeclaration, {
    source: {
      type: 'Literal',
      value: 'car',
    },
  });

  // get the local name for the imported module
  const localName =
    importDeclaration.find(j.Identifier)
    .get(0)
    .node.name;

  // current order of arguments
  const argKeys = [
    'color',
    'make',
    'model',
    'year',
    'miles',
    'bedliner',
    'alarm',
  ];

  // find where `.factory` is being called
  return root.find(j.CallExpression, {
      callee: {
        type: 'MemberExpression',
        object: {
          name: localName,
        },
        property: {
          name: 'factory',
        },
      }
    })
    .replaceWith(nodePath => {
      const { node } = nodePath;

      // use a builder to create the ObjectExpression
      const argumentsAsObject = j.objectExpression(

        // map the arguments to an Array of Property Nodes
        node.arguments.map((arg, i) =>
          j.property(
            'init',
            j.identifier(argKeys[i]),
            j.literal(arg.value)
          )
        )
      );

      // replace the arguments with our new ObjectExpression
      node.arguments = [argumentsAsObject];

      return node;
    })

    // specify print options for recast
    .toSource({ quote: 'single', trailingComma: true });
};

Run the transform (jscodeshift -t signature-change.js signature-change.input.js -d -p) and we’ll see the signatures have been updated as expected:

import car from 'car';

const suv = car.factory({
  color: 'white',
  make: 'Kia',
  model: 'Sorento',
  year: 2010,
  miles: 50000,
  bedliner: null,
  alarm: true,
});
const truck = car.factory({
  color: 'silver',
  make: 'Toyota',
  model: 'Tacoma',
  year: 2006,
  miles: 100000,
  bedliner: true,
  alarm: true,
});
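One limitation worth knowing about: j.literal(arg.value) only carries over arguments that are literals. If a call site passes a variable or an expression, such as car.factory(color, make), those argument nodes have no value property to copy. A possible refinement, sketched below, is to reuse the original argument nodes as the property values instead of rebuilding them. The helper name argsToObjectExpression is hypothetical; it assumes j is api.jscodeshift, argKeys is the ordered list of parameter names, and args is node.arguments. Returning null for calls it cannot convert (spread arguments, or more arguments than known keys) is a design choice made for this sketch.

```javascript
// Sketch: turn positional call arguments into one ObjectExpression,
// reusing the original argument nodes as the property values so that
// identifiers and expressions survive, not just literals.
// `j` is assumed to be api.jscodeshift, `argKeys` the parameter names
// in order, and `args` the `node.arguments` array of a CallExpression.
// Returns null when the call cannot be converted safely.
function argsToObjectExpression(j, argKeys, args) {
  const hasSpread = args.some((arg) => arg.type === 'SpreadElement');
  if (hasSpread || args.length > argKeys.length) {
    return null;
  }

  return j.objectExpression(
    args.map((arg, i) =>
      // same builders as in the transform above; only the value differs
      j.property('init', j.identifier(argKeys[i]), arg)
    )
  );
}
```

Inside the replaceWith callback, you would call argsToObjectExpression(j, argKeys, node.arguments), assign [result] to node.arguments only when the result is not null, and return node either way so that unconvertible calls are left unchanged.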

Codemods With jscodeshift Recap

It took a little time and effort to get to this point, but the benefits are huge when faced with mass refactoring. Distributing groups of files to different processes and running them in parallel is something jscodeshift excels at, allowing you to run complex transformations across a huge codebase in seconds. As you become more proficient with codemods, you’ll start repurposing existing scripts (such as those in the react-codemod GitHub repository) or writing your own for all sorts of tasks, and that will make you, your team, and your package users more efficient.

About the author

Jeremy Greer, United States
member since February 24, 2016
Jeremy is a senior software engineer with a passion for modern JavaScript--client and server-side--including React, Redux, Angular, and Express. He believes in clean code, testing, and reading the manual. Making cool software makes him giddy, and he is deeply moved every time he sees his work being used by others.

Comments

Jeremy Greer
@dhayanandhan:disqus, thanks for reading. Send me a link to any cool codemods you whip up.
Dhayanandhan Raju
Thanks for the article. Helped me understand codeshift
Diego De Santiago Ruiz
Great article, thanks, there is another way to override source code.
Pieter Vanderwerff
Also check out https://github.com/cpojer/js-codemod. It has some great general codemods you can use as well as some useful jscodeshift extensions.