An Array of Possibilities: A Guide to Ruby Pattern Matching
Pattern matching is a powerful tool commonly found in functional programming languages. The Ruby 2.7 release is going to include this feature.
In this article, Toptal Ruby Developer Noppakun Wongsrinoppakun provides a breakdown of what this addition will include and why it matters.
Noppakun is a Tokyo-based full-stack software engineer with extensive experience using Vue.js and Ruby on Rails.
Pattern matching is the big new feature coming to Ruby 2.7. It has been committed to the trunk, so anyone who is interested can install Ruby 2.7.0-dev and check it out. Please bear in mind that none of this is finalized and the dev team is looking for feedback, so if you have any, you can let the committers know before the feature is actually released.
I hope you will understand what pattern matching is and how to use it in Ruby after reading this article.
What Is Pattern Matching?
Pattern matching is a feature that is commonly found in functional programming languages. According to Scala documentation, pattern matching is “a mechanism for checking a value against a pattern. A successful match can also deconstruct a value into its constituent parts.”
This is not to be confused with regex, string matching, or pattern recognition. Pattern matching has nothing to do with strings; it works on data structures. The first time I encountered pattern matching was around two years ago when I tried out Elixir. I was learning Elixir and trying to solve algorithm problems with it. I compared my solutions to others and realized they used pattern matching, which made their code a lot more succinct and easier to read.
Because of that, pattern matching really made an impression on me. This is what pattern matching in Elixir looks like:
[a, b, c] = [:hello, "world", 42]
a #=> :hello
b #=> "world"
c #=> 42
The example above looks very much like a multiple assignment in Ruby. However, it is more than that. It also checks whether or not the values match:
[a, b, 42] = [:hello, "world", 42]
a #=> :hello
b #=> "world"
In the example above, the number 42 on the left-hand side isn’t a variable that is being assigned. It is a value used to check that the element at that index matches the one on the right-hand side.
[a, b, 88] = [:hello, "world", 42]
** (MatchError) no match of right hand side value
In this example, instead of the values being assigned, a MatchError is raised. This is because the number 88 does not match the number 42.
It also works with maps (which are similar to hashes in Ruby):
%{"name": "Zote", "title": title } = %{"name": "Zote", "title": "The mighty"}
title #=> "The mighty"
The example above checks that the value of the key name is Zote, and binds the value of the key title to the variable title.
This concept works very well when the data structure is complex. You can assign your variable and check for values or types all in one line.
Furthermore, it also allows a dynamically typed language like Elixir to have method overloading:
def process(%{"animal" => animal}) do
  IO.puts("The animal is: #{animal}")
end
def process(%{"plant" => plant}) do
  IO.puts("The plant is: #{plant}")
end
def process(%{"person" => person}) do
  IO.puts("The person is: #{person}")
end
Depending on which key the map passed as the argument contains, a different method gets executed.
Hopefully, that shows you how powerful pattern matching can be. There have been many attempts to bring pattern matching into Ruby with gems such as noaidi, qo, and egison-ruby.
Ruby 2.7 also has its own implementation not too different from these gems, and this is how it’s being done currently.
Ruby Pattern Matching Syntax
Pattern matching in Ruby is done through a case statement. However, instead of the usual when, the keyword in is used. It also supports the use of if or unless statements:
case [variable or expression]
in [pattern]
...
in [pattern] if [expression]
...
else
...
end
The case statement can accept a variable or an expression, which is matched against the patterns provided in the in clauses. An if or unless statement can also be provided after the pattern. The equality check here also uses === like the normal case statement. This means you can match subsets and instances of classes. Here is an example of how you use it:
Matching Arrays
translation = ['th', 'เต้', 'ja', 'テイ']
case translation
in ['th', orig_text, 'en', trans_text]
  puts "English translation: #{orig_text} => #{trans_text}"
in ['th', orig_text, 'ja', trans_text]
  # this will get executed
  puts "Japanese translation: #{orig_text} => #{trans_text}"
end
In the example above, the variable translation gets matched against two patterns: ['th', orig_text, 'en', trans_text] and ['th', orig_text, 'ja', trans_text]. Each pattern is checked to see whether its values match the values in the translation variable at each index. If they do, the values in the translation variable are assigned to the variables in the pattern at the corresponding indices.
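As a minimal sketch of the guard form from the syntax outline, the example below adds an if condition after an array pattern. The describe_translation method and its return strings are made up for illustration. It also includes an else branch, because in released Ruby versions a case/in with no matching pattern and no else raises NoMatchingPatternError.

```ruby
# Hypothetical helper: expects [source_lang, text, target_lang, translation].
def describe_translation(translation)
  case translation
  in ['th', orig_text, 'en', trans_text] if orig_text == trans_text
    # Taken only when the pattern matches AND the guard is true.
    "Untranslated: #{orig_text}"
  in ['th', orig_text, 'en', trans_text]
    "English translation: #{orig_text} => #{trans_text}"
  else
    # Without this branch, an unmatched value raises NoMatchingPatternError.
    'Unknown format'
  end
end

puts describe_translation(['th', 'tae', 'en', 'tae'])
puts describe_translation(['th', 'เต้', 'en', 'tae'])
puts describe_translation(['th', 'เต้', 'ja', 'テイ'])
```

The patterns are tried from top to bottom, so the guarded pattern has to come before the more general one.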
Matching Hashes
translation = {orig_lang: 'th', trans_lang: 'en', orig_txt: 'เต้', trans_txt: 'tae' }
case translation
in {orig_lang: 'th', trans_lang: 'en', orig_txt: orig_txt, trans_txt: trans_txt}
  puts "#{orig_txt} => #{trans_txt}"
end
In the example above, the translation variable is now a hash. It gets matched against another hash in the in clause. The case statement checks that every key in the pattern is present in the translation variable and that the value for each of those keys matches. It then assigns the values to the variables in the pattern.
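One detail worth trying out is that a hash pattern only looks at the keys it names, so extra keys in the value are ignored. The sketch below assumes a hypothetical translation_summary method and a made-up reviewed key. It also uses the orig_txt: shorthand, which binds the value to a variable of the same name.

```ruby
# Hypothetical helper; hash patterns only work with symbol keys.
def translation_summary(translation)
  case translation
  in {orig_lang: 'th', trans_lang: 'en', orig_txt:, trans_txt:}
    # `orig_txt:` is shorthand for `orig_txt: orig_txt`.
    "#{orig_txt} => #{trans_txt}"
  in {orig_lang:, trans_lang:}
    "Unsupported pair: #{orig_lang} -> #{trans_lang}"
  else
    'Not a translation'
  end
end

# The extra :reviewed key does not prevent the first pattern from matching.
puts translation_summary({orig_lang: 'th', trans_lang: 'en', orig_txt: 'เต้', trans_txt: 'tae', reviewed: true})
puts translation_summary({orig_lang: 'th', trans_lang: 'ja'})
puts translation_summary({})
```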
Matching Subsets
The equality check used in pattern matching follows the logic of ===.
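Because each value in a pattern is compared with ===, ranges, classes, and regular expressions can all act as patterns. Here is a small sketch; the classify method and its labels are hypothetical and only exist to show which branch is taken.

```ruby
# Hypothetical helper: each pattern is compared using pattern === value.
def classify(value)
  case value
  in 0
    'zero'
  in 1..9
    # Range#=== checks that the value falls inside the range.
    'single digit'
  in Integer
    # Class#=== checks that the value is an instance of the class.
    'some other integer'
  in /\A\d+\z/
    # Regexp#=== checks that the string matches.
    'string of digits'
  in String
    'some other string'
  else
    'something else'
  end
end

puts classify(7)
puts classify(42)
puts classify('123')
puts classify(:symbol)
```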
Multiple Patterns
| can be used to define multiple patterns for one block.
translation = ['th', 'เต้', 'ja', 'テイ']
case translation
in {orig_lang: 'th', trans_lang: 'ja'} | ['th', _, 'ja', _]
  puts 'Thai to Japanese translation'
end
In the example above, the translation variable is matched against both the {orig_lang: 'th', trans_lang: 'ja'} hash and the ['th', _, 'ja', _] array, and the block runs if either one matches. Note that released versions of Ruby do not allow patterns joined with | to bind named variables, so the values are only checked here, not captured.
This is useful when you have slightly different types of data structures that represent the same thing and you want both data structures to execute the same block of code.
Arrow Assignment
In this case, => can be used to assign the matched value to a variable.
case ['I am a string', 10]
in [Integer, Integer] => a
  # not reached
in [String, Integer] => b
  puts b #=> ['I am a string', 10]
end
This is useful when you want to check values inside the data structure but also bind these values to a variable.
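The arrow can also be placed on an individual element, which checks that element and binds it at the same time. A minimal sketch, with a hypothetical describe method:

```ruby
# Hypothetical helper showing => on single elements and on the whole pattern.
def describe(pair)
  case pair
  in [String => name, Integer => age]
    # Each element is checked against the class and bound to its own variable.
    "#{name} is #{age}"
  in [String, String] => both
    # Here the whole matched array is bound to `both`.
    "two strings: #{both.inspect}"
  else
    'no match'
  end
end

puts describe(['Tae', 30])
puts describe(['a', 'b'])
puts describe([1, 2])
```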
Pin Operator
Here, the pin operator prevents variables from getting reassigned.
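case [1,2,2]
in [a,a,a]
  puts a #=> 2
end
In the example above, variable a in the pattern is matched against 1, 2, and then 2. It is assigned 1, then 2, and then 2 again. This isn’t ideal if you want to check that all the values in the array are the same. (Released versions of Ruby reject a pattern that repeats a variable name like this with a SyntaxError, so this first example only runs on the development build.)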
case [1,2,2]
in [a,^a,^a]
  # not reached
in [a,b,^b]
  puts a #=> 1
  puts b #=> 2
end
When the pin operator is used, it evaluates the variable instead of reassigning it. In the example above, [1,2,2] doesn’t match [a,^a,^a] because in the first index, a is assigned to 1. In the second and third, a is evaluated to be 1, but is matched against 2.
However, [a,b,^b] matches [1,2,2]: a is assigned 1 in the first index, b is assigned 2 in the second index, and then ^b, which is now 2, is matched against 2 in the third index, so it passes.
a = 1
case [2,2]
in [^a,^a]
  # not reached
in [b,^b]
  puts b #=> 2
end
Variables from outside the case statement can also be used as shown in the example above.
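The pin operator works inside hash patterns as well. In this sketch, own_message?, sender_id, and body are hypothetical names; the method parameter is pinned so that it is compared rather than rebound.

```ruby
# Hypothetical helper: true when the message was sent by the given user.
def own_message?(current_user_id, message)
  case message
  in {sender_id: ^current_user_id}
    # ^current_user_id compares against the existing value instead of rebinding it.
    true
  else
    false
  end
end

puts own_message?(1, {sender_id: 1, body: 'hi'})
puts own_message?(2, {sender_id: 1, body: 'hi'})
```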
Underscore (_) Operator
Underscore (_) is used to ignore values. Let’s see it in a couple of examples:
case ['this will be ignored',2]
in [_,a]
  puts a #=> 2
end
case ['a',2]
in [_,a] => b
  puts a #=> 2
  puts b #=> ['a',2]
end
In the two examples above, any value that is matched against _ passes. In the second case statement, the => operator captures the whole array, including the value that has been ignored.
Use Cases for Pattern Matching in Ruby
Imagine that you have the following JSON data:
{
  nickname: 'Tae',
  realname: {first: 'Noppakun', last: 'Wongsrinoppakun'},
  username: 'tae8838'
}
In your Ruby project, you want to parse this data and display the name with the following conditions:
- If the username exists, return the username.
- If the nickname, first name, and last name exist, return the nickname, first name, and then the last name.
- If the nickname doesn’t exist, but the first and last name do, return the first name and then the last name.
- If none of the conditions apply, return “New User.”
This is how I would write this program in Ruby right now:
def display_name(name_hash)
  if name_hash[:username]
    name_hash[:username]
  elsif name_hash[:nickname] && name_hash[:realname] && name_hash[:realname][:first] && name_hash[:realname][:last]
    "#{name_hash[:nickname]} #{name_hash[:realname][:first]} #{name_hash[:realname][:last]}"
  elsif name_hash[:realname] && name_hash[:realname][:first] && name_hash[:realname][:last]
    "#{name_hash[:realname][:first]} #{name_hash[:realname][:last]}"
  else
    'New User'
  end
end
Now, let’s see what it looks like with pattern matching:
def display_name(name_hash)
  case name_hash
  in {username: username}
    username
  in {nickname: nickname, realname: {first: first, last: last}}
    "#{nickname} #{first} #{last}"
  in {realname: {first: first, last: last}}
    "#{first} #{last}"
  else
    'New User'
  end
end
Syntax preference can be a little subjective, but I do prefer the pattern matching version. This is because pattern matching allows us to write out the hash we expect, instead of describing and checking the values of the hash. This makes it easier to visualize what data to expect:
`{nickname: nickname, realname: {first: first, last: last}}`
Instead of:
`name_hash[:nickname] && name_hash[:realname] && name_hash[:realname][:first] && name_hash[:realname][:last]`.
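To go from a raw JSON string to a hash that can be pattern matched, one option is JSON.parse with symbolize_names: true, since hash patterns only match symbol keys. The sketch below is one possible variant under that assumption, not the only way to do it. The payload is made up, and the String => checks are an addition so that a key holding null is skipped (a bare variable pattern would match nil too).

```ruby
require 'json'

# Assumed payload; the key names follow the Ruby code in this section.
raw = '{"nickname": "Tae", "realname": {"first": "Noppakun", "last": "Wongsrinoppakun"}}'

def display_name(name_hash)
  case name_hash
  in {username: String => username}
    username
  in {nickname: String => nickname, realname: {first:, last:}}
    "#{nickname} #{first} #{last}"
  in {realname: {first:, last:}}
    "#{first} #{last}"
  else
    'New User'
  end
end

# symbolize_names: true turns "nickname" into :nickname, and so on.
name_hash = JSON.parse(raw, symbolize_names: true)
puts display_name(name_hash)
```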
Deconstruct and Deconstruct_keys
There are two new special methods being introduced in Ruby 2.7: deconstruct and deconstruct_keys. When an instance of a class is being matched against an array or hash, deconstruct or deconstruct_keys is called, respectively.
The results from these methods will be used to match against patterns. Here is an example:
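class Coordinate
  attr_accessor :x, :y
  def initialize(x, y)
    @x = x
    @y = y
  end
  # Called when an instance is matched against an array pattern.
  def deconstruct
    [@x, @y]
  end
  # Called when an instance is matched against a hash pattern.
  # Ruby passes in the keys the pattern asks for (or nil); they are ignored here.
  def deconstruct_keys(keys)
    {x: @x, y: @y}
  end
end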
The code defines a class called Coordinate. It has x and y as its attributes. It also has deconstruct and deconstruct_keys methods defined.
c = Coordinate.new(32,50)
case c
in [a,b]
  p a #=> 32
  p b #=> 50
end
Here, an instance of Coordinate is created and pattern matched against an array. What happens here is that Coordinate#deconstruct is called and the result is used to match against the array [a,b] defined in the pattern.
case c
in {x:, y:}
  p x #=> 32
  p y #=> 50
end
In this example, the same instance of Coordinate is being pattern-matched against a hash. In this case, the Coordinate#deconstruct_keys result is used to match against the hash {x: x, y: y} defined in the pattern.
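For simple value objects, Ruby's built-in Struct already responds to both deconstruct and deconstruct_keys, so no extra methods are needed. The sketch below uses hypothetical Point and quadrant names. It also shows the Point(x:, y:) form, which checks the class of the value before matching its keys.

```ruby
# Struct instances can be matched against array and hash patterns out of the box.
Point = Struct.new(:x, :y)

# Hypothetical helper that describes where a point lies.
def quadrant(point)
  case point
  in [0, 0]
    # Array pattern: matched against the result of point.deconstruct.
    'origin'
  in Point(x:, y:) if x > 0 && y > 0
    # Checks Point === point first, then matches against deconstruct_keys.
    'first quadrant'
  in {x:, y:}
    "somewhere else (#{x}, #{y})"
  end
end

puts quadrant(Point.new(0, 0))
puts quadrant(Point.new(3, 4))
puts quadrant(Point.new(-1, 2))
```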
An Exciting Experimental Feature
Having first experienced pattern matching in Elixir, I had thought this feature might include method overloading and be implemented with a syntax that only requires one line. However, Ruby isn’t a language that was built with pattern matching in mind, so this is understandable.
Using a case statement is probably a very lean way of implementing this and also does not affect existing code (apart from deconstruct and deconstruct_keys methods). The use of the case statement is actually similar to that of Scala’s implementation of pattern matching.
Personally, I think pattern matching is an exciting new feature for Ruby developers. It has the potential to make code a lot cleaner and make Ruby feel a bit more modern and exciting. I would love to see what people make of this and how this feature evolves in the future.
Understanding the basics
What is pattern matching in functional programming?
According to Scala documentation, pattern matching is “a mechanism for checking a value against a pattern. A successful match can also deconstruct a value into its constituent parts.”
Why is pattern matching useful?
It has the potential to make code a lot cleaner and make Ruby feel a bit more modern and exciting.
How does pattern matching work?
Through the case statement in Ruby 2.7, using the keyword “in” instead of the usual “when.”
Bangkok, Thailand
Member since December 14, 2017