Mixed-integer Programming: A Guide to Computational Decision-making
Mixed-integer programming is one of the many ways you can automate and enhance your decision-making processes.
In this article, Toptal Freelance Software Engineer Shanglun Wang explores the power of operations research by walking through a hypothetical problem and shows how mixed-integer programming can be used to solve it.
Mixed-integer programming is one of the many ways you can automate and enhance your decision-making processes.
In this article, Toptal Freelance Software Engineer Shanglun Wang explores the power of operations research by walking through a hypothetical problem and shows how mixed-integer programming can be used to solve it.
Sean is a passionate polyglot: A full-stack wizard, sys admin, and data scientist. He’s also developed market intelligence software.
Expertise
PREVIOUSLY AT
Operations research, the science of using computers to make optimal decisions, has been used and applied in many areas of software. To give a few examples, logistics companies use it to determine the optimal route to get from point A to point B, power companies use it to determine power production schedules, and manufacturing companies use it to find optimal staffing patterns for their factories.
Today, we will explore the power of operations research by walking through a hypothetical problem. Specifically, we will use mixed-integer programming (MIP) to determine the optimal staffing pattern for a hypothetical factory.
Operations Research Algorithms
Before we dive into our example problem, it is helpful to go over some basic concepts in operations research and understand a bit of the tools we will be using today.
Linear Programming Algorithms
Linear programming is an operations research technique used to determine the best outcome in a mathematical model where the objective and the constraints are expressed as a system of linear equations. An example linear programming model might look like this:
Maximize a + b (objective)
Subject to:
a <= 2 (constraint 1)
b <= 3 (constraint 2)
In our very simple example, we can see that the optimal outcome is 5, with a = 2 and b = 3. While this is a rather trivial example, you can probably imagine a linear programming model that utilizes thousands of variables and hundreds of constraints.
A good example might be an investment portfolio problem, where you might end up with something like the below, in pseudo-code:
Maximize <expected profit from all stock investments>
Subject to:
<investment in the technology sector must be between 10% - 20% of portfolio>
<investment in the consumer staples must not exceed investment in financials>
Etc.
...
Which would be rather complex and difficult to solve by hand or trial and error. Operations research software will use several algorithms to solve these problems quickly. If you’re interested in the underlying algorithms, I recommend learning about the simplex method here and the interior point method here.
Integer Programming Algorithms
Integer programming is like linear programming with an additional allowance for some or all of the variables to be integer values. While this may not seem like a large improvement at first, it allows us to solve many problems that could have remained unsolved using linear programming alone.
One such problem is the knapsack problem, where we are given a set of items with assigned values and weights and are asked to find the highest value combination of items we can fit into our knapsack. A linear programming model will not be able to solve this because there is no way to express the idea that you can either put an item into your knapsack or not, but you cannot put part of an item into your knapsack—every variable is a continuous variable!
An example knapsack problem might have the following parameters:
Object | Value (units of $10) | Size (generic units) |
---|---|---|
Camera | 5 | 2 |
Mysterious figurine | 7 | 4 |
Huge bottle of apple cider | 2 | 7 |
French horn | 10 | 10 |
And the MIP model will look like this:
Maximize 5a + 7b + 2c + 10d (objective: maximize value of items take)
Subject to:
2a + 4b + 7c + 10d <= 15 (space constraint)
The optimal solution, in this case, is a=0, b=1, c=0, d=1, with the value of the total item being 17.
The problem we will solve today will also require integer programming since an employee at a factory can either be scheduled for a shift or not—for the sake of simplicity, you cannot schedule an employee for 2/3 of a shift at this factory.
To make integer programming possible, several mathematical algorithms are used. If you are interested in the underlying algorithms, I recommend studying the cutting-planes algorithm and the branch-and-bound algorithm here.
Example Problem: Scheduling
Problem Description
Today, we will explore the problem of staffing a factory. As the management of the factory, we will want to minimize labor costs, but we want to ensure sufficient coverage for every shift to meet production demand.
Suppose we have five shifts with the following staffing demands:
Shift 1 | Shift 2 | Shift 3 | Shift 4 | Shift 5 |
---|---|---|---|---|
1 person | 4 people | 3 people | 5 people | 2 people |
And, suppose we have the following workers:
Name | Availability | Cost per Shift ($) |
---|---|---|
Melisandre | 1, 2, 5 | 20 |
Bran | 2, 3, 4, 5 | 15 |
Cersei | 3, 4 | 35 |
Daenerys | 4, 5 | 35 |
Theon | 2, 4, 5 | 10 |
Jon | 1, 3, 5 | 25 |
Tyrion | 2, 4, 5 | 30 |
Jaime | 2, 3, 5 | 20 |
Arya | 1, 2, 4 | 20 |
To keep the problem simple, let us assume for the moment that availability and cost are the sole concerns.
Our Tools
For today’s problem, we will use a piece of open source branch-and-cut software called CBC. We will interface with this software using PuLP, which is a popular operations research modeling library for Python. You can find information about it here.
PuLP allows us to construct MIP models in a very Pythonic fashion and integrates nicely with existing Python code. This is very useful as some of the most popular data manipulation and analysis tools are written in Python, and most commercial operations research systems require extensive data processing before and after the optimization.
To demonstrate the simplicity and elegance of PuLP, here is the knapsack problem from earlier and solved in PuLP:
import pulp as pl
# declare some variables
# each variable is a binary variable that is either 0 or 1
# 1 means the item will go into the knapsack
a = pl.LpVariable("a", 0, 1, pl.LpInteger)
b = pl.LpVariable("b", 0, 1, pl.LpInteger)
c = pl.LpVariable("c", 0, 1, pl.LpInteger)
d = pl.LpVariable("d", 0, 1, pl.LpInteger)
# define the problem
prob = pl.LpProblem("knapsack", pl.LpMaximize)
# objective function - maximize value of objects in knapsack
prob += 5 * a + 7 * b + 2 * c + 10 * d
# constraint - weight of objects cannot exceed 15
prob += 2 * a + 4 * b + 7 * c + 10 * d <= 15
status = prob.solve() # solve using the default solver, which is cbc
print(pl.LpStatus[status]) # print the human-readable status
# print the values
print("a", pl.value(a))
print("b", pl.value(b))
print("c", pl.value(c))
print("d", pl.value(d))
Run this, and you should get the output:
Optimal
a 0.0
b 1.0
c 0.0
d 1.0
Now onto our scheduling problem!
Coding Our Solution
The coding of our solution is fairly straightforward. First, we will want to define our data:
import pulp as pl
import collections as cl
# data
shift_requirements = [1, 4, 3, 5, 2]
workers = {
"Melisandre": {
"availability": [0, 1, 4],
"cost": 20
},
"Bran": {
"availability": [1, 2, 3, 4],
"cost": 15
},
"Cersei": {
"availability": [2, 3],
"cost": 35
},
"Daenerys": {
"availability": [3, 4],
"cost": 35
},
"Theon": {
"availability": [1, 3, 4],
"cost": 10
},
"Jon": {
"availability": [0, 2, 4],
"cost": 25
},
"Tyrion": {
"availability": [1, 3, 4],
"cost": 30
},
"Jaime": {
"availability": [1, 2, 4],
"cost": 20
},
"Arya": {
"availability": [0, 1, 3],
"cost": 20
}
}
Next, we will want to define the model:
# define the model: we want to minimize the cost
prob = pl.LpProblem("scheduling", pl.LpMinimize)
# some model variables
cost = []
vars_by_shift = cl.defaultdict(list)
for worker, info in workers.items():
for shift in info['availability']:
worker_var = pl.LpVariable("%s_%s" % (worker, shift), 0, 1, pl.LpInteger)
vars_by_shift[shift].append(worker_var)
cost.append(worker_var * info['cost'])
# set the objective to be the sum of cost
prob += sum(cost)
# set the shift requirements
for shift, requirement in enumerate(shift_requirements):
prob += sum(vars_by_shift[shift]) >= requirement
And now we just ask it to solve and print the results!
status = prob.solve()
print("Result:", pl.LpStatus[status])
results = []
for shift, vars in vars_by_shift.items():
results.append({
"shift": shift,
"workers": [var.name for var in vars if var.varValue == 1],
})
for result in sorted(results, key=lambda x: x['shift']):
print("Shift:", result['shift'], 'workers:', ', '.join(result['workers']))
Once you run the code, you should see the following outputs:
Result: Optimal
Shift: 0 workers: Arya_0
Shift: 1 workers: Melisandre_1, Bran_1, Theon_1, Arya_1
Shift: 2 workers: Bran_2, Jon_2, Jaime_2
Shift: 3 workers: Bran_3, Daenerys_3, Theon_3, Tyrion_3, Arya_3
Shift: 4 workers: Bran_4, Theon_4
Crank Up the Difficulty a Bit: Additional Constraints
Even though the previous model was interesting and useful, it does not fully demonstrate the power of mixed-integer programming. We could also easily write a for
loop to find the cheapest x
workers for every shift, where x
is the number of workers needed for a shift. To demonstrate how MIP can be used to solve a problem that would be challenging to solve in an imperative fashion, let us examine what would happen if we add a few extra constraints.
Additional Constraints
Suppose that, due to new labor regulations, no workers can be assigned to more than 2 shifts. To account for the increased work, we have recruited the help of Dothraki Staffing Group, who will supply up to 10 Dothraki workers for each shift at a rate of $40 per shift.
Additionally suppose that, due to some ongoing interpersonal conflicts outside of our factory, Cersei and Jaime are unable to work any shifts alongside either Daenerys or Jon. Additionally, Jaime and Cersei are unable to work any shifts together. Finally, Arya, who has particularly complex interpersonal relationships, is unable to work in the same shift with Jaime, Cersei, or Melisandre.
The addition of these two new constraints and resources by no means makes the problem unsolvable through imperative means, but it makes the solution much harder, as one will have to account for opportunity costs of scheduling a worker to a particular shift.
Solution
With mixed-integer programming, however, it is much easier. We simply need to amend our code like so:
Define the ban list and some constraints:
ban_list = {
("Daenerys", "Jaime"),
("Daenerys", "Cersei"),
("Jon", "Jaime"),
("Jon", "Cersei"),
("Arya", "Jaime"),
("Arya", "Cersei"),
("Arya", "Melisandre"),
("Jaime", "Cersei")
}
DOTHRAKI_MAX = 10
DOTHRAKI_COST = 45
Populate variables for implementing the ban and max shift constraints:
for worker, info in workers.items():
for shift in info['availability']:
worker_var = pl.LpVariable("%s_%d" % (worker, shift), 0, 1, pl.LpInteger)
# store some variable data so we can implement the ban constraint
var_data = (worker,)
vars_by_shift[shift].append((worker_var, var_data))
# store vars by variable so we can implement the max shift constraint
vars_by_worker[worker].append(worker_var)
cost.append(worker_var * info['cost'])
Add the Dothraki variables:
for shift in range(len(shift_requirements)):
dothraki_var = pl.LpVariable('dothraki_%d' % shift, 0, DOTHRAKI_MAX, pl.LpInteger)
cost.append(dothraki_var * DOTHRAKI_COST)
dothrakis_by_shift[shift] = dothraki_var
We will also need a slightly modified loop for computing shift and ban requirements:
# set the shift requirements
for shift, requirement in enumerate(shift_requirements):
prob += sum([var[0] for var in vars_by_shift[shift]]) + dothrakis_by_shift[shift] >= requirement
# set the ban requirements
for shift, vars in vars_by_shift.items():
for var1 in vars:
for var2 in vars:
worker_pair = var1[1][0], var2[1][0]
if worker_pair in ban_list:
prob += var1[0] + var2[0] <= 1
# set the labor standards:
for worker, vars in vars_by_worker.items():
prob += sum(vars) <= 2
And finally, to print the results:
status = prob.solve()
print("Result:", pl.LpStatus[status])
results = []
for shift, vars in vars_by_shift.items():
results.append({
"shift": shift,
"workers": [var[1][0] for var in vars if var[0].varValue == 1],
"dothrakis": dothrakis_by_shift[shift].varValue
})
for result in sorted(results, key=lambda x: x['shift']):
print("Shift:", result['shift'], 'workers:', ', '.join(result['workers']), 'dothrakis hired:', int(result['dothrakis']))
And we should be good to go. Run the code and you should see output as below:
Result: Optimal
Shift: 0 workers: Arya dothrakis hired: 0
Shift: 1 workers: Melisandre, Theon, Tyrion, Jaime dothrakis hired: 0
Shift: 2 workers: Bran, Jon dothrakis hired: 1
Shift: 3 workers: Bran, Daenerys, Theon, Tyrion, Arya dothrakis hired: 0
Shift: 4 workers: Melisandre, Jaime dothrakis hired: 0
And there we have it, a result that respects the banned workers’ list, follows labor regulations, and uses Dothraki workers judiciously.
Conclusion
Today, we explored using mixed-integer programming to make better decisions. We discussed the underlying algorithms and techniques used in operations research and looked at an example problem that is representative of how mixed-integer programming is used in the real world.
I hope this article inspired you to learn more about operations research and made you think about how this technology can be applied to your projects. We’ve only really seen the tip of the iceberg when it comes to the exciting world of optimization algorithms and operations research.
You can find all the code related to this article on GitHub. If you would like to discuss more, share your comments below!
Further Reading on the Toptal Blog:
Understanding the basics
What is linear programming?
Linear programming is an operations research technique used to determine the best outcome in a mathematical model where the objective and the constraints are expressed as a system of linear equations.
What is integer programming?
Integer programming is like linear programming with an additional allowance for some or all of the variables to be integer values.
Shanglun Wang
New York, NY, United States
Member since December 16, 2016
About the author
Sean is a passionate polyglot: A full-stack wizard, sys admin, and data scientist. He’s also developed market intelligence software.
Expertise
PREVIOUSLY AT