What Is Business Intelligence?
Business intelligence (BI), a term nowadays intrinsically associated with information technology, has been evolving for over 150 years. Although its origins predate the invention of computers, it was only after they became widespread that BI grew in relevance, and its development has since been tied to the evolution of computers and databases.
BI Using Pen and Paper
The first use of the term “business intelligence” is widely attributed to Richard Millar Devens, in his book Cyclopædia of Commercial and Business Anecdotes, first published in 1865. He used it to describe how Sir Henry Furnese, a successful banker, profited from information by actively gathering it and acting on it before his competition. This illustrated that it was more reliable to use data and empirical evidence, rather than gut instinct, to develop a business strategy. The idea was further developed by others who saw value in information.
During the last decade of the 1800s, Frederick Taylor introduced the first formalized system of business analytics in the United States. His system of scientific management began with time studies that analyzed production techniques and laborers’ body movements to find greater efficiencies that boosted industrial production.
Taylor ended up becoming a consultant to Henry Ford, who in the early 1900s started measuring the time each component of his Ford Model T took to complete on his assembly line. Ford’s work and success revolutionized the manufacturing industry worldwide. Yet he still did all of this with pen and paper.
Business Intelligence Gets a Boost from Computers
Electronic computers were embryonic in the 1930s but developed quickly during World War II, as part of the effort by the Allies to crack German codes.
Up until the 1950s, computers relied mostly on punchcards or punched tapes to store data. These were huge piles of cards with tiny holes in them, which would store the information to be processed by the computers. In 1956, however, IBM invented the first hard disk drive, making it possible to store large amounts of information with greater flexibility of access.
Not long after that, in 1958, IBM researcher Hans Peter Luhn published a historic paper called A Business Intelligence System. He theorized about the potential of a system for “selective dissemination” of documents to “action points” based on “interest profiles.” His work has remarkable significance even to this day, since he predicted several business intelligence trends that are cutting-edge nowadays, such as the ability of information systems to learn and predict based on user interests. Today we call it machine learning. Luhn is popularly recognized as the father of business intelligence.
Even though the concept proposed by Luhn caught the attention of several interested parties, the idea was considered too expensive at the time to have any practical use. More technological progress was needed to make it an economically viable solution.
In the next decade, computer use exploded, even though each computer was a gigantic machine that occupied the entire floor of a building and had to be managed by several highly skilled engineers to function properly. Experts again tackled the idea of using computers to draw conclusions from data, but the main problem was that there was no centralized method available to bring all the data together in one place. Data, by itself, could not generate any insights. To solve this challenge, the first database management systems were designed. Later, they would simply be called databases. This first generation allowed for the first database searches, using a strategy of binary trees. This strategy, although it solved several problems at the time, is considered too heavy and inefficient nowadays. Even so, for companies that could afford it, this new tool proved its value, finally making it possible to draw conclusions from the available data.
BI Technologies Improve: Big Players Enter the Field
In 1970, Edgar Codd from IBM published a paper called A Relational Model of Data for Large Shared Data Banks. It paved the way for next-generation relational databases, allowing for a much broader capacity to store and manipulate data. In a strange move, however, IBM refrained from implementing Codd’s design in order to preserve revenue from its existing database systems. It was only after competitors started implementing it that IBM followed suit.
By this time, there was enough of a market to allow for the first business intelligence providers to appear. Among those were SAP, Siebel, and JD Edwards. At the time, their products were called decision support systems (DSS).
The big problem at this point was that these databases suffered from “silo” issues. Because they were very one-dimensional, the flexibility of their use was very limited. Even simple issues, like one database coding cities as “OH, NJ, and NY” while another used “Ohio, New Jersey, and New York,” made cross-referencing a daunting task.
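As a minimal sketch of the cross-referencing problem described above, the Python snippet below maps both spellings to one canonical code before combining two record sets. The lookup table, the `normalize_state` function, and the sample records are all hypothetical and only illustrate the idea; they are not taken from any system of the period.

```python
# Hypothetical lookup table: full state names mapped to two-letter codes.
STATE_CODES = {
    "ohio": "OH",
    "new jersey": "NJ",
    "new york": "NY",
}


def normalize_state(value):
    """Return a two-letter state code for either spelling, or None if unknown."""
    cleaned = value.strip()
    if cleaned.upper() in STATE_CODES.values():
        return cleaned.upper()
    return STATE_CODES.get(cleaned.lower())


# Two illustrative "silos" that code the same states differently (made-up data).
sales = [("OH", 120), ("NJ", 75), ("NY", 200)]
customers = [("Ohio", 40), ("New Jersey", 22), ("New York", 61)]

# Cross-reference the two sources on the normalized code.
customers_by_state = {normalize_state(name): count for name, count in customers}
combined = {
    normalize_state(code): (amount, customers_by_state.get(normalize_state(code)))
    for code, amount in sales
}
```

Once both sources agree on a single code, the join itself is trivial; the hard part in practice was agreeing on, and maintaining, tables like `STATE_CODES` across every system.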
Yet more and more successful cases of profitable use of data emerged. One of the most famous at the time came from Nielsen. The marketing tool known as the Nielsen rating gauged how many people were watching a particular TV show at any time, using a device called the Audimeter, which was hooked up to a television set and recorded which channel was being watched.
Nielsen ratings were considered the most looked-at national rating report in the TV industry. However, four times a year, there would be “black weeks”—weeks where Nielsen ratings were not reported. Since there was no confident way to measure ratings in these “black weeks,” the TV networks filled their schedules with reruns.
Both the industry and audience were already used to “black weeks,” but they ended in September 1973. Nielsen introduced its Storage Instantaneous Audimeter (SIA), connecting 1,200 households directly to the company’s business intelligence computer in Florida. It could produce national ratings in just 36 hours, far less than the one to two weeks it took the company’s older system. National ratings would be available every day of the week, every week of the year. There was no longer any need for “black weeks,” and the data was much more available.
Near the end of the 70s, Larry Ellison and two friends released the first commercial version of the Oracle database. It was the first true relational database management system on the market, replacing the hierarchical and network database designs used up until then with a more robust structure, which allowed much more flexible searches. This technology would dictate BI’s history and trends in the decades to come.
Importance of BI Grows: We Need More Room!
Lower prices for storage space and better databases allowed for the next generation of business intelligence solutions. Ralph Kimball and Bill Inmon proposed two different but similar approaches to the problem of having all the data of the business in the same place to be able to analyze it. These were data warehouses (DW). Inmon is recognized by many as the father of the data warehouse.
Data warehouses are databases designed to aggregate lots of data from other sources (mostly other databases), allowing a much deeper analysis with the ability to cross-reference these different sources. They were still, however, too technical and expensive. Reports needed to be run and maintained by a host of expensive IT staff.
Top management at the time would live by the outputs of BI solutions like Crystal Reports and MicroStrategy. And, of course, there was Microsoft Excel (released in 1985). Business intelligence was now an integral part of the tools available for the decision-making process.
In 1989, Howard Dresner, of the Gartner Group, contributed to popularizing the term “business intelligence,” using it as an umbrella term to describe “concepts and methods to improve business decision-making by using fact-based support systems.”
Business Intelligence 1.0
In the 90s, data warehouse costs declined as more competitors entered the market and more IT professionals got acquainted with the technology. This was the period of “Business Intelligence 1.0.”
Data was now commonly accessible to the corporate staff in general, not just top management. However, the problem at this point was that asking new questions was still very expensive. Once a question was “engineered,” the answer would be available quickly, but only for that question.
To reduce this effort, some new tools and “building blocks” were developed to speed up the process of building different queries:
- ETL (extract, transform, and load) was a set of tools, similar to a programming language, that made it easier to design the flow of data within a data warehouse.
- OLAP (online analytical processing) helped to create different visualization options for the queried data, empowering the analysts to extract better conclusions from the information at hand.
To this day, both ETL and OLAP tools are still a crucial part of business intelligence solutions.
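To make the two building blocks above a little more concrete, here is a small Python sketch using only the standard library’s `sqlite3` module. It is an illustration under stated assumptions, not how any particular ETL or OLAP product works: the table name `sales`, the `transform` function, and the sample rows are invented, an in-memory SQLite database stands in for the data warehouse, and plain `GROUP BY` queries stand in for the far richer slicing that real OLAP tools offer.

```python
import sqlite3

# Extract: rows as they might arrive from a source system (made-up data).
raw_orders = [
    {"region": " north ", "product": "widget", "amount": "10.50"},
    {"region": "South", "product": "widget", "amount": "4.00"},
    {"region": "north", "product": "gadget", "amount": "7.25"},
]


def transform(row):
    """Clean one raw row: tidy the region label and convert amount to a number."""
    return (row["region"].strip().title(), row["product"], float(row["amount"]))


# Load: write the cleaned rows into a (here in-memory) warehouse table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
warehouse.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)", [transform(r) for r in raw_orders]
)

# OLAP-style roll-up: the same facts summarized along two different dimensions.
by_region = dict(
    warehouse.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)
by_product = dict(
    warehouse.execute("SELECT product, SUM(amount) FROM sales GROUP BY product")
)
```

The point of the split is that the cleaning logic lives in one place (`transform`), so every later question, whether by region or by product, is asked of data that is already consistent.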
This was also the period where enterprise resource planning (ERP) systems became popular. These are huge management software platforms that integrate applications to manage and automate aspects of a business. They also provided structured data for the data warehouses and in the following years would become the heart of every major company in the world.
In 1995, Microsoft released Windows 95, the first “user-friendly” operating system, and computers became common household items. This would have a profound impact on how people produced and consumed data for the following decades.
BI Disrupted: Data Explosion in the New Millennium
By the year 2000, business intelligence solutions were already established as a “must have” for all medium to large businesses. It was now widely considered a requirement to stay competitive.
From the solution providers’ perspective, the abundance of solutions started to coalesce in the hands of a few large competitors, like IBM, Microsoft, SAP, and Oracle.
A few new concepts emerged during this period. The difficulty of keeping their data warehouses up to date made some companies rethink their approach, transforming their DW into their “single source of truth.” For already existing data, other programs would use the information provided by the DW instead of using their own, thus eliminating most issues of data incompatibility. It was easier said than done and posed many technical challenges. The concept, however, was so useful that in the following years the available solutions on the market would adapt to employ this strategy.
As data became more and more abundant, and BI tools proved their usefulness, the development effort was directed toward increasing the speed at which information became available and reducing the complexity of accessing it. Tools became easier to use, and non-technical people could now gather data and gain insights by themselves, without help from technical support.
In the early 2000s, the boom of social networking platforms paved the way for the general public’s opinion to be freely available on the internet, and interested parties could collect (or “mine”) the data and analyze it. By 2005, the increasing interconnectivity of the business world meant that companies needed real-time information, with data from events incorporated into the data warehouses as they happened.
This is the year Google Analytics was introduced, providing a free way for users to analyze their website data. This is also the year the term big data was first used. Roger Magoulas, from O’Reilly Media, used it to refer to “a large set of data that is almost impossible to manage and process using traditional business intelligence tools.”
To cope with the additional storage space and computing power required to manage this exponentially increasing amount of data, companies began to search for other solutions. Building larger and faster computers was out of the question, so using several machines at once became a better option. These were the seeds of cloud computing.
Contemporary Uses of BI
In the past 10 years, big data, cloud computing, and data science became words known to pretty much anyone. It is hard at this time to say which new advancements were most impactful in these last years. However, there are a few interesting cases that have shown the growing power of modern analytic tools.
Advertising, Cookies, and AdTech
In 2012, The New York Times published an article describing how Target accidentally discovered a high school student’s pregnancy before her parents did. Through analytics, the company identified 25 products that, when purchased together, indicate that a woman is likely pregnant. The value of this information was that Target could send coupons to the pregnant woman at a time when her shopping habits might change.
An enraged father walked into a Target outside Minneapolis and demanded to see the manager. He complained about his daughter receiving coupons for baby clothes, even though she was still in high school. The manager apologized profusely on behalf of the company, but a few days later the father called back to apologize: “It turns out there have been some activities in my house I haven’t been completely aware of. She’s due in August. I owe you an apology.”
This anecdotal example shows the contemporary power of data analytics.
The Obama reelection campaign strategy was heavily built upon analytics. Many specialists point to it as one of the main reasons for the campaign’s success. The strategy, designed by campaign manager Jim Messina, was focused on gathering data on the voters and using it to ensure they would 1) register to vote, 2) be persuaded to vote for Obama, and 3) show up to vote on election day. About 100 data analysts took part in the effort, using an environment running on HP Vertica and coded in R and Stata.
Several initiatives were applied to reach those goals, one of which was Airwolf. Built to integrate the efforts of the field and digital teams, it ensured that once a voter was contacted by the field team in a door-to-door campaign, their interests would be recorded, so that they would get frequent emails from the local organizers tailored specifically to their favorite campaign issues.
With the right tools and data, analysts could answer nearly any question quickly and easily, no matter where the data originally came from. The success of the Obama campaign made big data analytics environments a standard requirement for every campaign since.
The Human Genome Project was completed in 2003 but left many questions unanswered. Despite mapping the entire sequence of nucleotide base pairs that make up human DNA, truly understanding how human genetics work required more intensive study—and it was a perfect application for big data. A typical human genome contains more than 20,000 genes, with each made up of millions of base pairs. Simply mapping a genome requires a hundred gigabytes of data, and sequencing multiple genomes and tracking gene interactions multiplies that number many times—hundreds of petabytes, in some cases.
By applying analytics in their study published in 2016, scientists at the University of Haifa were able to observe what is called the “social character” of genes. What scientists have long wanted to figure out are the inner workings of complex genetic effects that take part in the creation of complex diseases. This goal has been particularly difficult since genetic expressions of certain diseases usually come from the combination of several genetic markers interacting with each other. So not only would researchers have to comb through an entire genetic sequence, but they’d also have to track interactions between multiple different genes.
Even though there is still plenty of data to be analyzed, the way is paved to understand and cure a huge number of genetic defects, large and small.
The Road Ahead
We have now reached a time when Facebook can recognize your face in pictures, when Google can predict which kind of advertisement would best suit your profile, and when Netflix can give you suggestions on which shows to watch. It is a time when you can talk to your phone, not just to someone on the other side of the phone line. Being able to handle and process huge amounts of data was an essential step in making these marvels possible.
Big data is still a growing trend. Roughly 90% of available data has been created in the last two years. At the Techonomy conference, in 2010, Eric Schmidt stated that “there were 5 exabytes of information created by the entire world between the dawn of civilization and 2003. Now that same amount is created every two days.”
Handling such a vast amount of data still presents many challenges. Data quality, one of the first and oldest headaches of business intelligence, is still a demanding field. Analytics, the skill set necessary to help make sense of the towering pile of data companies are gathering, is also in high demand. There are now many flavors of analytics: descriptive analytics, predictive analytics, prescriptive analytics, streaming analytics, automated analytics, etc. Analytics uses several cutting-edge technologies to extract insights from data, such as artificial intelligence, machine learning, and lots of statistical models. It is finally a time when it is cool to be a mathematician.
BI tools are now often designed with a specific industry in mind, be it healthcare, law enforcement, or another field. They now work across multiple devices and use several visualization tools, allowing anyone to apply reasoning to data through interactive visual interfaces. Mobile BI is now a reality.
By combining the strengths of big data, machine learning, and analytics, your life might be very different in the future. Perhaps you might not need to go to the grocery store anymore—your fridge will order what you most likely will need, based on your eating habits. Perchance, you won’t be calling your doctor to say you are ill, because they will call you even before you start feeling the first symptoms.
Humanity now lives in the information age, and business intelligence is a crucial feature of our time, helping us make sense of it all. Business analytics is now even a degree program at many universities. The history of business intelligence is fairly recent, but accelerating and getting denser by the day. The best days of BI are still ahead of us.