What makes a good BI tool for data mesh?
Most BI tools stink as the front end for data mesh. Here I take a look at GoodData, which does not.
I was recently a guest on the awesome podcast Data Mesh Radio, where host Scott Hirleman and I spent over an hour talking about data mesh and the impact it will have on business intelligence. This is a criminally under-explored topic in my estimation: the data mesh conversation always focuses on the obviously huge impact on data engineering and warehousing, treating BI as a necessary (but definitely not as sexy as ML) endpoint for the mesh. I think that’s a mistake. The lessons of product thinking and domain orientation are just as valid for your BI front end. In fact, one could argue they’re even more important there, because BI is where your analytics practice comes into direct, messy contact with your business audience. Many truly excellent data teams are building elegant, resilient pipelines that feed dashboards absolutely nobody cares about. That’s bad product-market fit right there. I’ve been on that team. It stings.
So in this post I’m going to take a look at how to pair a modern BI tool with your data mesh implementation to seek the BI product-market fit that will make your shiny new mesh obviously superior to what you had before.
While this isn’t sponsored content, in the spirit of full transparency, GoodData is one of my clients.
Traditional BI doesn’t fit well in a data mesh
To understand why the leading BI tools aren’t great fits for a data mesh, we have to consider their purpose: to enable data analysts to generate dashboards and visualizations as quickly as possible. This is a laudable goal, as speed is an important component of the coveted ‘agility’ that all teams seek. However, data mesh isn’t about building useful charts or ML models; it’s about delivering data products that delight people. Products are built, tested and refined with a particular audience in mind, and feedback from that audience drives further development of the product. Churning out dashboards is a poor way to achieve product-market fit for your BI, in part because that ‘supply chain of data’ methodology is typically a one-way street. Data flows to the dashboard, and what happens to it after that is anyone’s guess. It doesn’t evolve over time.
In particular, tools like Tableau and Power BI have the following downsides in a data mesh:
Rigid, manual, UI-based development and management that is hard to automate
APIs that are an incomplete afterthought
Very challenging metrics sharing between BI and ML or business apps
Incomplete integration into CI/CD
A lack of composability that severely limits usefulness
A one-way trip for data, with no built-in feedback mechanisms
Why do these issues matter? Because the way BI fits into a data mesh, as I envision it, is by combining a composable, scalable and flexible front end with code-based automation and a universal metrics layer. This makes it easy to automate the creation of BI assets for existing and new domains and data products as they shift over time, while serving BI, AI/ML and app-dev use cases with high-quality components and metrics.
How this would work is as follows:
A central team creates BI assets for use by any domain. These are data-agnostic features that can be customized by a data product team using code.
Domain teams use these components to present a BI front end alongside their data product, as well as utilizing the metrics layer to provide SQL and API access to their metrics.
Downstream consumers - meaning other domains, data scientists or self-service users - can customize these components for their own uses.
The key thing is that changes can easily flow up and down the stack via the platform itself, not a centralized manual process. Imagine a self-service user develops a new metric. Rather than being locked in the BI layer - as it would be today - it can easily be added back to the data product, where it is immediately available for use in other applications or for federation to other domains. No widely adopted BI tool works this way currently.
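To make that loop concrete, here is a minimal Python sketch of a metric being promoted from a self-service workspace back into its parent data product’s canonical catalog. All names and structures here are illustrative, not any vendor’s actual API:

```python
# A data product's metric catalog, represented as plain data.
# All field names are hypothetical, for illustration only.
data_product = {
    "domain": "sales",
    "metrics": {
        "revenue": {"sql": "SUM(order_amount)", "owner": "sales-domain-team"},
    },
}

# A self-service user defines a new metric in their own workspace.
self_service_metric = {
    "name": "avg_order_value",
    "sql": "SUM(order_amount) / COUNT(DISTINCT order_id)",
    "owner": "analyst@example.com",
}

def promote_metric(product: dict, metric: dict) -> None:
    """Move a self-service metric into the data product's canonical
    catalog, where other domains and applications can discover it."""
    name = metric["name"]
    if name in product["metrics"]:
        raise ValueError(f"metric {name!r} already exists in {product['domain']}")
    product["metrics"][name] = {"sql": metric["sql"], "owner": metric["owner"]}

promote_metric(data_product, self_service_metric)
print(sorted(data_product["metrics"]))  # both metrics now live in the product
```

The point of the sketch is the direction of flow: the analyst’s work ends up in the data product, not stranded in a dashboard.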
What is GoodData?
Which brings us to GoodData and why I think their approach is better suited to data mesh implementations, or to companies adopting product thinking for BI. They first appeared on my radar thanks to their universal semantic layer. Semantic layers are an old idea that has suddenly flared back to life, and something I spent a good deal of my earlier career building. We started working together, and after learning about their overall approach, I felt that GoodData could do a lot of what I envision thanks to the following features:
Everything in GoodData is code and manageable via modern devops practices.
GoodData has workspace inheritance, which allows changes to flow from canonical, domain-managed data products down to self-service environments, and back again.
GoodData’s semantic layer ensures that metrics are consistent across use cases and can be easily federated and calculated between domains.
GoodData supports both API and SQL access, which satisfies your data analysts and data scientists alike.
GoodData can be deployed anywhere via Kubernetes.
Okay, that sounds great - but what do these features actually enable?
GoodData is ‘The Switzerland of Analytics’
One important benefit of data mesh is that, as envisioned, you reduce the reliance on monolithic cloud platform vendors by offering flexibility to your domain teams in terms of what tech they use. GoodData is architected from the ground up using industry-standard devops practices and runs on any cloud or on-prem. It’s not designed to lock you into a specific tech stack, push cloud CPU spend or favor one vendor ecosystem over another. Basically, it plays nice with everyone. Even in situations where GoodData would classically be viewed as a competitor - say, vs. other BI platforms - their open, API-based architecture allows you to combine tools as you see fit. Already use Snowflake and Power BI? Great - maximize those investments by providing industrial-scale metrics, discoverability, composability and an API-first experience with GoodData. You are free to mix and match the tools and approaches that work best for you.
API-first with massive customization at scale
A properly built data mesh requires a BI layer built on modern technology with a modern software development workflow. GoodData supports this today: everything in GoodData is code that you can manage using devops best practices. Metric models, visualizations, dashboards, workspaces, security and capabilities can all be programmatically generated and are ready for CI/CD. Furthermore, their API-first approach using the GoodData.UI React SDK enables your app developers to easily query high-quality metrics or embed visualizations and analytics components into any downstream application. This means you can automate the build-out of BI, metrics, security and more, so that new data products can be supplied with standardized, high-quality front-end components on demand with little to no manual work.
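What ‘BI as code’ looks like in practice: a pipeline step that generates a standardized workspace definition for each new data product, which can then be committed to git and pushed to the BI platform’s API. This is a generic Python sketch with hypothetical field names, not GoodData’s actual declarative format:

```python
import json

# Illustrative only: build a declarative BI workspace definition for a
# new data product. A CI/CD job would render this and PUT it to the BI
# platform's API. All keys are hypothetical.
STANDARD_DASHBOARDS = ["data_quality", "usage", "freshness"]

def build_workspace(domain: str, product: str, metrics: list) -> dict:
    return {
        "workspace_id": f"{domain}-{product}",
        "inherits_from": f"{domain}-base",  # platform-team-managed parent
        "metrics": metrics,                 # exposed via SQL and API alike
        "dashboards": [
            {"id": d, "title": d.replace("_", " ").title()}
            for d in STANDARD_DASHBOARDS
        ],
    }

ws = build_workspace("sales", "orders", ["revenue", "avg_order_value"])
print(json.dumps(ws, indent=2))  # the artifact a pipeline would deploy
```

Because the definition is plain data, it diffs cleanly in pull requests and can be validated in CI before anything reaches production.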
Front-end agnostic, bi-directional metrics layer
One huge challenge for legacy BI applications operating in a data mesh is metrics lock-in. Self-service users may utilize the BI front end to combine domain data and produce valuable new insights, but those insights are stuck in BI, and it’s quite challenging to extract them for use in AI/ML or analytics applications. Not so with GoodData, which has a headless metrics layer and a suite of SQL and API access points. Insights uncovered by data analysts can be easily added to the metrics library and made available to developers across the company through their interface of choice. This significantly eases the burden on data engineering and ensures data quality and federation across the mesh. It can even be integrated into downstream operational apps, so that actions taken by end users are automatically pushed to the metrics layer and made available to all.
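The essence of a headless metrics layer is that one canonical definition serves every consumer. Here is a minimal Python sketch (illustrative names, not a real product’s schema) of a single metric rendered two ways: as SQL for analysts, and as a JSON payload for app developers or ML pipelines:

```python
# One canonical metric definition, stored once. Hypothetical structure.
METRICS = {
    "revenue": {"expr": "SUM(order_amount)", "grain": ["date", "region"]},
}

def as_sql(name: str, table: str) -> str:
    """Render a metric as SQL for analysts and warehouse tools."""
    m = METRICS[name]
    dims = ", ".join(m["grain"])
    return f"SELECT {dims}, {m['expr']} AS {name} FROM {table} GROUP BY {dims}"

def as_api_payload(name: str) -> dict:
    """Render the same metric as JSON for app developers and ML pipelines."""
    m = METRICS[name]
    return {"metric": name, "expression": m["expr"], "dimensions": m["grain"]}

print(as_sql("revenue", "orders"))
print(as_api_payload("revenue"))
```

Because both access paths read from the same definition, a change to the metric propagates everywhere at once - which is exactly what ‘consistent across use cases’ means in practice.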
Simple self-service platform management
Data mesh doesn’t decentralize everything - you still need a team providing the underlying data platform which enables the mesh. Because GoodData is code-based and composable, your platform team can easily create and maintain the core BI elements of the data mesh, while allowing each data domain and data product team to customize as necessary for their unique needs. This creates consistency in data presentation and makes it easy to learn and navigate the mesh while maintaining the highest possible data quality and security. A smaller central team can establish the core presentation and metrics layer best practices and make these available as a set of components to new data domains, new data products and any downstream application.
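The split between a central platform team and customizing domain teams can be sketched as a base configuration plus per-domain overrides. This Python example is purely illustrative (hypothetical keys, not GoodData’s workspace-inheritance API), but it captures the shape of the idea:

```python
# Platform-team defaults that every domain inherits. Keys are hypothetical.
BASE = {
    "theme": "company-standard",
    "row_level_security": True,
    "dashboards": ["data_quality", "usage"],
}

def customize(base: dict, overrides: dict) -> dict:
    """Shallow merge: domain overrides win, everything else inherits."""
    merged = dict(base)
    merged.update(overrides)
    return merged

finance_ws = customize(
    BASE,
    {"theme": "finance-dark", "dashboards": ["data_quality", "usage", "audit"]},
)
print(finance_ws["theme"])               # overridden by the domain
print(finance_ws["row_level_security"])  # inherited from the platform team
```

The design choice worth noting: security defaults live in the base and survive customization unless a domain deliberately overrides them, which is how a small central team keeps quality and security consistent across many domains.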
A flexible front end for Data Mesh
A data mesh is not static - it evolves over time at the speed of business. It is fast and adaptive. But it also requires a bedrock data platform upon which the mesh can be built, like a painter requires a canvas. Fundamentally, the appeal of GoodData in a data mesh is its ability to offer flexibility to domain teams and support all self-service and analytic uses, while making it easy for the data platform team to create high-quality, standardized and reusable components that satisfy the requirements of BI, AI/ML and downstream applications.
Most existing BI suites are not tailored for this kind of environment. They exist solely as an endpoint for distributing charts broadly throughout the organization. They are hard to scale and manage, they are uni-directional and lock insights into proprietary metrics layers, and they don’t play nice with non-BI use cases. Certainly Tableau or Power BI can be a useful front end for a data mesh, but in my opinion they can’t be easily integrated as an important component of the mesh itself. GoodData can, and that’s what really got me excited about its potential going forward.