The Future of Business Intelligence Part 2: Dismantling the Supply Chain and Planting the Forest.
A new era of Business Intelligence requires new ways of thinking about and delivering insights.
The supply chain of analytics is dead - long live the… what exactly? In part one of this series, we looked at the history of Business Intelligence, identified the two great waves of BI and posited that Wave 2 - conceived of as the ‘supply chain of analytics’ and driven by the dominance of Tableau - is coming to an end. Aka - ‘Tableau is dead!’ If we’re going into a new era of BI and analytics, we’re going to need a new metaphor for how it’s done. And so I present to you, the ‘data tree’ way of delivering insights.
The future of Business Intelligence: Meet the Data Tree
So how does this metaphor actually work? Unlike the one-way supply chain of data, the growth of the data tree is fed by insights generated at the edge. That’s what powers the whole system.
The roots are your source systems, which feed raw data nutrients into the data tree.
The trunk is your centralized data platform and metrics. It supports the system with standardized, trusted data and components.
Each individual branch is a domain or department, which grows autonomously towards the sun.
The leaves are individual data products or uses. They could be data sets, dashboards, data apps or ML models.
The sun is the light, the insight, the business value that drives the whole thing. All aspects of the data tree grow towards the light.
When it’s laid out this way, you can see how the conversion of nutrients to food - raw data to insights - happens at the edge, and how that insight is fed back through the whole system to power growth.
This is why I like the tree metaphor. It’s a living, breathing, organically growing thing that seeks the light autonomously but supports the whole. I envision a future in which many organizations plant small trees and watch them grow into a mighty forest.
Properties of the data tree
If we extend this metaphor to the aspects of a BI tool, some properties of the ideal platform begin to emerge:
Balanced: A data tree that outgrows its roots simply falls over. Wave 3 of business intelligence is about a balanced approach to insight generation and distribution. It is not focused on needless growth and does not derive its value from the sheer number of charts created, but rather from the veracity of those insights and the total value they add.
Circulatory: If sap flows in only one direction the data tree dies. Wave 3 must support bi-directional interaction with decision makers and downstream systems to create feedback loops to drive growth and change. This must be built into the DNA of the tool.
Branching: A data tree takes many branching paths towards the sun. Wave 3 supports many uses via the traditional BI experience, but also composable and embeddable components, API-first integration points and flexible presentation layers to do more than churn out dashboards. It enables individual departments to grow at their own pace and direction.
Rooted: Just as a data tree grows best in great soil, Wave 3 requires an accurate foundation of clearly defined, valuable metrics that can feed any upstream process - whether that’s traditional BI, AI/ML or analytic/operational apps. These metrics are the foundation of balanced self-service; a sketch of what one such metric definition might look like follows this list.
Adaptable: In the forest or in a manicured lawn, a data tree adapts to and exists in harmony with its surroundings. Wave 3 tools are multi-cloud or on-prem. They support SQL and Python, code and UI. They adapt to their environment.
Resilient and self-healing: A healthy data tree withstands the storm and heals any damage. Wave 3 tools require observability to alert when something is broken and proactively heal when possible.
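To make the ‘Rooted’ property concrete, here is a minimal sketch of what a metric defined as code in the trunk might look like. Everything in it is illustrative rather than any particular vendor’s syntax - GovernedMetric, the field names and the analytics.fct_sessions table are placeholders I made up - the point is simply that one clearly defined, version-controlled definition exists for every branch to reuse.

from dataclasses import dataclass

@dataclass(frozen=True)
class GovernedMetric:
    """One trusted metric definition that lives in the trunk (version control)."""
    name: str                 # stable identifier used by every branch
    description: str          # plain-language meaning for business users
    sql_expression: str       # how the metric is computed
    source_table: str         # the governed table the expression runs against
    dimensions: tuple = ()    # approved ways to slice the metric
    owner: str = "data-team"  # who certifies the definition

# The trunk: a single certified definition instead of 53 subtly different ones.
WEEKLY_ACTIVE_USERS = GovernedMetric(
    name="weekly_active_users",
    description="Distinct users with at least one session in the trailing 7 days",
    sql_expression="COUNT(DISTINCT user_id)",
    source_table="analytics.fct_sessions",
    dimensions=("region", "plan_tier"),
)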
The data tree is organic and long-lived. It provides balance between governance and speed by building a circulatory system to easily incorporate self-service insights into the centralized metrics store. The trunk of the system is crafted by the data team while the branches are allowed to grow toward the sun in the most efficient way possible. The nutrients they generate (in the form of important metrics) are fed back into the whole.
Want to talk about this live with me? Tune in Thursday, 2/21/2023 for the YouTube livestream, or watch the replay later.
How does this manifest in BI Wave 3?
So what the heck does this actually mean? The biggest set of changes I see coming for Wave 3 is the backswing of the ‘centralization - distribution’ technology pendulum into a place of balance, where the BI tool is a self-service insight generation platform that easily feeds into other important data processes, instead of being a black-box end point for the data supply chain.
People are always going to do stuff in BI that you didn’t plan for. It’s the easiest place for business people to work with data¹ - they’re not going to be building data pipelines or views in Snowflake. However, you still need their insights to permeate the data practice upstream to deliver value in AI/ML and apps. Therefore, you need a BI tool that can serve as a foundation for that process without having to engage in heavy, centralized data engineering.
This is the virtuous cycle that Wave 3 BI tools need to unlock to break the supply chain of data wide open. I call this the ‘circulatory system of data.’ Insights are generated at the edges and incorporated into the whole with as little friction as possible. The value comes from the ability to take an insight generated in one part of the business and quickly, accurately fold it into the whole. It ensures a proper flow of data up and down the stack and across the org.
To support this, the platform must grow beyond just presenting dashboards. It needs an open, headless metrics store to feed AI/ML and apps. The APIs must be bi-directional so that these downstream uses can feed data back into the system and integrate seamlessly with BI. And the whole thing must be wrapped in analytics-as-code to unlock scale and consistency.
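As a rough sketch of that circulatory pattern, imagine a headless metrics endpoint that any downstream tool can read from, plus a write path that lets an insight generated at the edge be proposed back into the trunk for review. The URL, endpoints and payload fields below are hypothetical placeholders, not any real product’s API; they only illustrate the bi-directional shape.

import requests

BASE_URL = "https://bi-platform.example.com/api"  # hypothetical headless metrics service

# Downstream read: an ML job or embedded app pulls a governed metric instead of re-deriving it.
resp = requests.get(
    f"{BASE_URL}/metrics/weekly_active_users",
    params={"group_by": "region", "start": "2023-01-01", "end": "2023-02-01"},
    timeout=30,
)
resp.raise_for_status()
weekly_active_users_by_region = resp.json()

# Upstream write: an analyst's self-service discovery is proposed back to the trunk,
# where the data team can review and certify it rather than letting it die in a dashboard.
proposal = {
    "name": "trial_to_paid_conversion",
    "description": "Share of trial users who converted to a paid plan",
    "sql_expression": "COUNT(DISTINCT CASE WHEN converted THEN user_id END) * 1.0 / COUNT(DISTINCT user_id)",
    "source_table": "analytics.fct_trials",
    "proposed_by": "growth-team",
}
requests.post(f"{BASE_URL}/metrics/proposals", json=proposal, timeout=30).raise_for_status()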
What about AI?
This is the elephant in the room and probably the thing you expected as the future of BI. Until recently I would have told you that AI is really not going to be a part of Wave 3 BI systems in a meaningful way. I have watched IBM, Microsoft, Tableau and every other major vendor struggle to implement natural language query (NLQ) and data science features that anyone cares about or even uses. They make great demos, and they all fail with real-world users. People just don’t like or trust these types of systems. ThoughtSpot is perhaps the exception, but it doesn’t seem to have set the world on fire.
But then ChatGPT happened, and everything changed. It is very impressive at summarizing existing knowledge and presenting users with confident-sounding, usually correct-ish answers. I do think it will become a powerful tool in the arsenal of BI, but there are a few caveats:
Data quality is going to matter even more than it does today, because of how compelling ChatGPT’s answers sound to humans. If your data sucks, it will very confidently give you sucky responses.
A lot of firms may have very poor training data that results in very poor performance and a very bad initial impression.
Its most impressive initial uses will be to help developers and engineers design and code systems. It can generate usable SQL and DAX, for example. Developers using ChatGPT will be very productive.
There is going to be a major ‘trough of disillusionment’ with this tech when it gets widely implemented in BI and 3% of its answers are egregiously incorrect but sound great. It’s fun and funny as a Google replacement, but any error rate at all is unacceptable for systems that have major business decision making impact.
Fundamentally, AI-driven BI will struggle with Wave 2 data because the just-in-time supply chain of analytics prioritized speed over quality, especially at the pipeline level. How is ChatGPT going to make sense of your 1,700-table Snowflake instance, where the same metric appears in slightly different, nuanced ways across 53 tables? It’s not.
To make the best use of AI-driven BI, you’re going to need the Wave 3 features I described above to feed the system high-quality, up-to-date and consistent metrics. Otherwise it’s going to spit out junk. But I do now see how it’s going to be a major factor going forward, once we get past the growing pains. However, the future is hazy on exactly how this will play out, so I’ll set it aside as its own separate thing for now.
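One way to picture that dependency: instead of pointing a language model at 1,700 raw tables, you hand it the small catalog of certified metric definitions from the trunk and ask it to write SQL only in those terms. The sketch below just assembles such a prompt from hypothetical metric definitions - the actual model call and the guardrails around it are left out.

# Assemble an LLM prompt from certified metric definitions instead of the raw warehouse schema.
CERTIFIED_METRICS = [
    {
        "name": "weekly_active_users",
        "description": "Distinct users with at least one session in the trailing 7 days",
        "sql_expression": "COUNT(DISTINCT user_id)",
        "source_table": "analytics.fct_sessions",
    },
    # ...the rest of the trunk's certified metrics
]

def build_prompt(question: str) -> str:
    """Constrain generation to governed definitions so the model cannot improvise metrics."""
    catalog = "\n".join(
        f"- {m['name']}: {m['description']} "
        f"(defined as {m['sql_expression']} on {m['source_table']})"
        for m in CERTIFIED_METRICS
    )
    return (
        "Answer only with the certified metrics listed below.\n"
        f"{catalog}\n\n"
        f"Write a SQL query that answers: {question}\n"
    )

print(build_prompt("How did weekly active users trend by region last quarter?"))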
Is anyone actually building this stuff now?
Yes! There are bits of this being done across lots of interesting tools. Some of them you know well, others you’ve never heard of. Nobody is doing all of it, but some players are close. For true enterprise adoption, someone is going to have to provide all or most of these capabilities in a single package. The ‘Modern Data Stack’ approach of point solutions for every component of the data supply chain works for that metaphor, but it runs into serious issues when you have a large organization to support. It’s just too hard for a huge firm to onboard 17 best-in-class point solutions to get a functioning analytics practice.
Better to have a single BI platform built of composable, code-first features that can fit into many use cases, both traditional dashboarding and AI/ML apps. And that’s what we’ll discuss in part 3 of this series - what interesting tools and platforms fit into this vision of the future.
¹ You may object - actually the easiest place for people to do data work is Excel. Yes! So let’s find a way to integrate this stuff into the BI system as well. Embrace it.
Build the strongest roots. Allow the most flexibility at the ends of the branches so they can grow towards the most nutrient-rich light (insights) possible. Find the best trunk (metrics) that allows the most types of branches to grow. Make sure the ends of the branches have a way to transfer those increasingly nutritious insights back to the roots. Easy enough?
When is part 3 coming out?