Living with a margin of error in thriving data-led and data-fed businesses
Within the highly competitive world of data monetisation, what separates a market-leading data analytics business from its rivals is the ability to consistently turn data and insights into revenue.
According to McKinsey, a variety of approaches to monetisation may be used, from adding new services to existing offerings to developing entirely new business models. High performers, the firm notes, "are three times more likely than others to say their monetisation efforts contribute more than 20 percent to company revenues."
What makes a high performer is an ability to consistently produce more accurate results and outputs. But "more accurate" in this context should not be taken to mean "perfect".
It is often said that perfect is the enemy of good. This is particularly true in statistical and data domains. Perfection is an expensive proposition, and its pursuit a costly pastime. Or to put it another way: you can still make money without being perfect.
Few market-leading services operate at scale with 100% perfection, nor is the realisation of customer value predicated on it.
Consider Google Maps Timeline, for example. Google Maps' location accuracy and reliability vary depending on a number of factors, such as the device's location data, Google's inference of the mode of transport (which can be incorrect), and a user's Google account login status or settings. However, it can still be a useful record of where people went, when, and how long each journey took. The value it delivers may be judged acceptable, even against technically superior but more expensive alternatives. It's a balance.
Another reason 100% perfection may not be pursued is the disproportionate amount of time and effort required to make progressively smaller gains in accuracy beyond a certain point.
The accuracy of a data or machine learning model used out of the box, with no additional work, may be around 70%. With some fine-tuning of the model and training over time, an improvement in accuracy from 70% to 90% may be possible. Further improvements above 90% are also possible, but the business case for doing so would have to be carefully considered, as it may take weeks or months to achieve the next 1-2% of accuracy improvement.
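As a rough illustration of that trade-off, the sketch below uses entirely hypothetical effort figures to show how the cost of each additional point of accuracy tends to climb once the easy gains have been made.

```python
# Hypothetical effort (in person-weeks) to reach each accuracy level.
# The figures are invented to illustrate diminishing returns, not drawn
# from any real project.
effort_per_accuracy = {
    0.70: 0,    # out-of-the-box baseline
    0.80: 2,    # some fine-tuning
    0.90: 8,    # sustained tuning and training over time
    0.92: 20,   # weeks or months for the next 1-2 points
}

previous_accuracy = previous_effort = None
for accuracy, effort in effort_per_accuracy.items():
    if previous_accuracy is not None:
        gain_points = (accuracy - previous_accuracy) * 100
        extra_effort = effort - previous_effort
        print(f"{previous_accuracy:.0%} -> {accuracy:.0%}: "
              f"{extra_effort} extra person-weeks for {gain_points:.0f} accuracy points")
    previous_accuracy, previous_effort = accuracy, effort
```

In this invented example, the final step costs the most effort for the smallest gain, which is exactly the point at which the business case needs scrutiny.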
That may be judged worthwhile in certain high-risk, high-reward scenarios. In the resources sector, for example, production is measured in millions of tonnes per annum (mtpa). Small variations – such as a sub-1% improvement – can mean tens of millions of dollars in increased output from a site that can then be sold. Operators have used analytics to tease out inefficiencies in extraction, processing and transport, which can then be addressed to achieve solid gains.
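To make the scale concrete, here is a back-of-the-envelope calculation. The tonnage, price and improvement figures are assumptions for illustration, not data from any operator.

```python
# Back-of-the-envelope only: all figures below are assumed.
annual_output_tonnes = 50_000_000   # a hypothetical 50 mtpa operation
price_per_tonne_usd = 100           # assumed average realised price
throughput_improvement = 0.005      # a sub-1% (0.5%) gain from analytics

additional_revenue = annual_output_tonnes * price_per_tonne_usd * throughput_improvement
print(f"Additional saleable output: ${additional_revenue:,.0f} per year")
# -> Additional saleable output: $25,000,000 per year
```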
This works for a small proportion of large companies. Outside of them, however, the vast majority of businesses – including those whose production output is data or insights – do not operate at this scale. In practical terms, that means they will hit the point at which chasing ever-smaller accuracy gains stops making commercial sense much sooner.
Those that succeed learn to strike the right balance with their business model: determining a margin of error that does not place them at a commercial disadvantage compared to competitors, and that enables best-in-class performance and revenue growth.
Why perfectionism is still pursued
The balance between perfect performance and value is hard to get right, and mistakes can occur. It may not be apparent that a mistake has been made until an accuracy improvement initiative is already underway and the economic effects of that decision – either the amount of investment required, or the higher cost of recouping that investment and achieving a return on it – become better understood.
As we've demonstrated so far, the economics of a data monetisation business are complex, with a number of variables at play. While there are clear success stories of companies unlocking growth and revenue from data monetisation, it is not always outwardly clear how they have made these gains – and, in particular, where their 'sweet spot' is on accuracy versus value.
Mistakes can also occur in an information vacuum: where the intricacies of the business model are not well understood, or where no business metrics are in place to determine the economics of pursuing additional improvements to accuracy.
Good decision-making in any business scenario is context-dependent. Mistakes are less likely to occur in companies that have clearly documented what their business model entails, what level of accuracy is commercially acceptable, and how much every additional dollar of capex adds to – or subtracts from – the profitability of the business.
Updating your data strategy
My view is that, within certain industries, a margin of error is acceptable in many cases. How large that margin is should be negotiated and agreed upon, recorded in a data strategy, and measured and reported against. This ensures project alignment and stakeholder understanding.
The data strategy should describe the nature of the business: how it makes money, what the acceptable margin of error is, the boundaries that teams need to work within when defining error rates, and the process or set of actions that should be initiated to resolve any slippage outside of documented tolerances.
When this is documented, the whole company has guidance from the executive leadership team on how to conduct business effectively. This should reduce the risk of a false alarm being raised or, worse, valuable time and resources being spent resolving something that does not need resolving.
While considered best practice, this kind of information is rarely included in data strategies. This needs to change.
Determining the correct wording and tolerances may be supported by an experienced partner, along with some observability into the data model to understand the nature of the errors it encounters. This is important because within a 10% margin of error there can be multiple causes of error: one may contribute 2% of errors, another 3%, and so on, and each has a different cost associated with reducing or eliminating it. Margin of error is a nuanced internal discussion to have, and is rarely well served by a broad-brush or reactive approach.
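A minimal sketch of that discussion, using invented error causes and remediation costs, might rank each contributor to a 10% margin of error by what it costs to remove a percentage point of error:

```python
# Hypothetical breakdown of a 10% margin of error. Both the causes and
# the remediation costs are invented for illustration.
error_causes = [
    {"cause": "manual entry mistakes", "error_pct": 4.0, "cost_to_fix_usd": 20_000},
    {"cause": "stale reference data",  "error_pct": 3.0, "cost_to_fix_usd": 30_000},
    {"cause": "sensor drift",          "error_pct": 2.0, "cost_to_fix_usd": 80_000},
    {"cause": "model edge cases",      "error_pct": 1.0, "cost_to_fix_usd": 150_000},
]

# Rank causes by the cost of removing one percentage point of error:
# the cheapest reductions are usually where the discussion starts.
for cause in sorted(error_causes, key=lambda c: c["cost_to_fix_usd"] / c["error_pct"]):
    cost_per_point = cause["cost_to_fix_usd"] / cause["error_pct"]
    print(f'{cause["cause"]}: {cause["error_pct"]}% of errors, '
          f'${cost_per_point:,.0f} to remove each point')
```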
Once margins of error are calculated, what's important is a clear articulation of how error rates impact revenue. If the current percentage of errors on a client job that brings in X dollars of revenue is known, the tolerance for a variation of, say, +/- 2% can be described and understood. Delivery teams then know that if they craft a solution resulting in a 0.5% variance in accuracy, it remains within the defined boundary and tolerance, and is unlikely to cause issues.
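As a simple illustration of how that tolerance might be expressed for delivery teams, the sketch below uses hypothetical figures and a simplifying assumption that revenue exposure scales linearly with the variance; it is not a prescribed method.

```python
# Hypothetical check: does a proposed change's accuracy variance sit inside
# the tolerance documented in the data strategy? All figures, and the linear
# revenue-exposure assumption, are illustrative only.
def within_tolerance(variance_pct: float, tolerance_pct: float = 2.0) -> bool:
    """True if the accuracy variance stays within +/- tolerance_pct."""
    return abs(variance_pct) <= tolerance_pct

client_revenue_usd = 500_000    # assumed revenue for the client job ("X dollars")
proposed_variance_pct = 0.5     # the new solution shifts accuracy by 0.5%

revenue_exposed = client_revenue_usd * abs(proposed_variance_pct) / 100
if within_tolerance(proposed_variance_pct):
    print(f"Within tolerance; roughly ${revenue_exposed:,.0f} of revenue exposed.")
else:
    print(f"Outside tolerance; trigger the agreed resolution process "
          f"(~${revenue_exposed:,.0f} exposed).")
```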