Fuzzy Data, SEO Metrics and The Trouble With Tools
In most aspects of life we make important decisions with care, especially when money is involved. We may depend on others in the process, but when someone offers simple solutions to complex problems, we are usually skeptical (or should be). Hard problems are not easy to solve.
Yet some SEO tool vendors have made a big business of selling fuzzy data and promising push-button solutions. This has been on my mind for years, well before I thought to build a tool to do SERP analysis, but it’s no less a problem today.

When Unreliable Data Meets Opaque and Imprecise SEO Tools
In many industries we have come to expect a basic level of transparency from the companies that offer products and services. Think pharmaceutical or investment management firms, which must abide by clear disclosure rules.
Why are users of SEO tools, then, so willing to put blind faith in these companies? Why do we accept as reliable fact the mystery meat metrics and data served up by the major (and usually high-priced) tool providers?
Opaque and Undocumented Tools
Most of the search marketing community would agree that there are many misguided assumptions associated with SEO ‘best practices’. Likewise, the popular SEO tools that are widely used by enterprises and digital agencies on down to solopreneurs are often misunderstood. This can lead people to misuse these tools, with potentially costly results.
Significant time and resources are invested in content creation or link-building campaigns, and whole businesses are sometimes launched based on the data and recommendations provided by these vendors’ tools.
My main concerns here are vendor-provided so-called proprietary (and usually opaque) metrics. These scores aim to put a single, arbitrary numeric measurement on a page, domain or keyword. Another problematic area is the accuracy and origin of the data behind these metrics, such as search volume and backlink data.
I am not referring to the on-page focused correlation analysis tools such as Cora and POP, which emphasize pure data collection and reporting, and provide clear documentation. That is also the general approach for SERP Sonar. Nor am I including the many rank tracking tools in this article, although much could be said about the accuracy issues in that space as well.
Market Data Matters
My background is in finance, in the institutional trading space. For years I have worked with and around information and data that is used for critical decision making.
In trading, understanding the source, reliability and proper use of a number can mean the difference between making or losing a lot of money.

Investors in the capital markets, both institutional and retail alike, depend heavily on financial data and market indicators. And for the listed markets (the stock, options and futures exchanges) the accuracy of market data has historically been high.
Consider, too, that fundamental data such as company earnings reports and shares outstanding must, by law, be accurate. Furthermore, the vendors that scrub and redistribute such financial market data actually compete with one another to be the most accurate and the fastest.
And it is understandable why: billions of dollars in investment decisions are made each day based on this data, so it has to be correct. With SEO, the costs and revenue at risk that often depend on the accuracy of a tool and the data it delivers can be just as impactful.
Marketing Money Matters
The online marketing industry should demand a higher standard of accuracy and transparency. Global digital ad spend in 2022 is projected to be over $602 billion, now representing 66.4% of total global marketing budgets. The SEO industry is (conservatively) approaching $80 billion in size in the US alone.
Yet, marketing professionals routinely make costly decisions based on opaque and potentially inaccurate or incomplete data delivered by the various SEO tool vendors (and Google, of course). Moreover, they put full trust into the various mystery metrics these vendors offer without knowing how the numbers are actually calculated.
With $1.65bn per day in ad budgets and SEO campaigns on the line, I think it’s worth having a little talk about — or with — the marketing technology (martech) companies.

Unfortunately, nothing compels these martech vendors to disclose the quality and freshness of their data and link indexes, or how their metrics are calculated. Consider some of these questions…
- Did they crawl and scrape this data themselves, or use third-party data (or both)?
- How frequently is the data updated, or link databases re-crawled?
- How do their metrics actually perform as predictive measures?
We don’t know, because most of these vendors are unwilling to tell us. So consider how you would approach other kinds of decisions in life if you didn’t know the quality or source of the data involved.
- Would you buy a high-ticket item based only on a rumor that it was good?
- Would you buy milk if the container had no expiration date printed on it?
- Would you buy a stock simply because a tool’s “profit difficulty” number was ‘under 30’?
I’m guessing you said “no” to those questions.
So, why do people pursue certain keywords based only on a ‘difficulty score’ a vendor assigned, without knowing how that score is calculated? Why do they disqualify a keyword or niche based only on estimated search volume data? Or write off a competitor based exclusively on (potentially incomplete or stale) backlink profiles?
Schrödinger’s SEO Tool Box
Given that none of the vendors openly disclose this information, it should be assumed that the data reflected in the tool’s analysis is either incomplete or out of date (or both). You could also choose to trust that it is accurate but, without greater visibility or validation, we can only guess.
I do believe that these tools can be extremely valuable by simplifying complex data collection and aggregation tasks, among other things. But people tend to take as gospel whatever these vendors offer. And the tools are only as good as their underlying data.

Equally important, the predictive power of their mystery metrics is only as reliable as your last documented use suggests, because you have nothing to go on other than your own research and testing.
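If you want something better than gut feel, that testing can be simple. Here is a minimal sketch (in Python) of back-testing a vendor metric yourself. The numbers are purely illustrative; in practice you would log the vendor’s difficulty score at the time you targeted each keyword, then record the best rank your page actually achieved.

```python
# A minimal sketch of validating a vendor's "keyword difficulty" score
# against your own observed outcomes. The data below is illustrative;
# real pairs would come from your own campaign logs.
from scipy.stats import spearmanr

# (vendor_difficulty, best_rank_achieved) pairs from past campaigns
observations = [
    (12, 3), (18, 5), (25, 9), (31, 14),
    (38, 8), (45, 22), (52, 31), (64, 48),
]

difficulty = [d for d, _ in observations]
rank = [r for _, r in observations]

# If the metric has predictive power, higher difficulty should
# correlate with worse (numerically higher) achieved rank.
rho, p_value = spearmanr(difficulty, rank)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
```

A strong, stable correlation over many of your own keywords is the only real evidence that a vendor’s score means anything for your sites.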
Of course, the SEO tool vendors gain little from showing us how the sausage is made. Data collection is not cheap and, in many cases, their data stores may be stale, inaccurate or just underwhelming in coverage.
Also, they often declare their special metrics to be “proprietary”, making performance testing and comparisons across vendors more difficult. Yet we, their users, are supposed to use their tools to make important decisions (involving online business investments of time and money).
We Need a Hero
Perhaps a forward-thinking vendor out there might take the initiative and change how they do business. The bold martech company that more openly shares this important information would surely get rewarded with new clients.
In time, it might become commonplace for clients and agencies to place their trust (and budgets) only with SEO tool vendors that are fully open and transparent about the data behind their tools. Which progressive digital marketing tool provider will lead the charge?

The company that steps up first should set the bar high and fully disclose what constitutes their products. They should openly and regularly declare the size of their link index, and the frequency with which it and their other data is updated.
They should share the primary and third-party sources and the freshness of their search volume data, along with any adjustments or estimates applied. And they should reveal precisely how their various metrics (page, domain and keyword scores, etc.) are calculated.
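To make that concrete, here is a rough sketch of what such a disclosure might look like if published in machine-readable form. Every field name and value here is invented for illustration; no vendor publishes anything like this today, which is rather the point.

```python
# A hypothetical transparency manifest an SEO tool vendor could publish.
# All field names and values are assumptions, not any real vendor's data.
import json

disclosure = {
    "link_index": {
        "pages_indexed": 450_000_000_000,  # total pages in the index
        "recrawl_window_days": 90,         # max age before a URL is recrawled
    },
    "search_volume": {
        "sources": ["clickstream panel", "Google Keyword Planner"],
        "refresh_frequency": "monthly",
        "adjustments": "smoothed 12-month rolling average",
    },
    "keyword_difficulty": {
        "formula": "weighted referring domains of top-10 results",
        "last_validated": "2022-06-01",
    },
}

print(json.dumps(disclosure, indent=2))
```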
Until things change, users of these tools must take more ownership of their own data collection and analysis decisions. Ultimately, a holistic approach will yield the best results. A good SEO tool provides deep insights and boosts efficiency, but scrutiny, common sense and basic research methods should all be used in tandem.
Applying more rigor and taking a DIY approach is certainly not as convenient but can potentially make a huge difference in revenue outcomes. Your business or that of your clients deserves more than blind faith in an off-the-shelf tool.
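As one small example of that DIY rigor: if you export the same keyword list from two different vendors, you can at least flag where their search volume estimates disagree wildly. The file and column names below are assumptions about hypothetical exports, not any real vendor’s format.

```python
# A sketch of a simple cross-check between two hypothetical vendor
# exports: flag keywords where search-volume estimates disagree by 2x+.
import csv

def load_volumes(path):
    """Read a hypothetical export with 'keyword' and 'volume' columns."""
    with open(path, newline="") as f:
        return {row["keyword"].lower(): int(row["volume"])
                for row in csv.DictReader(f)}

vendor_a = load_volumes("vendor_a_export.csv")
vendor_b = load_volumes("vendor_b_export.csv")

# Large disagreements are the numbers to distrust, or to verify by
# other means (e.g. a small paid-search test to observe impressions).
for kw in sorted(vendor_a.keys() & vendor_b.keys()):
    a, b = vendor_a[kw], vendor_b[kw]
    if max(a, b) > 2 * max(min(a, b), 1):
        print(f"{kw}: vendor A = {a}, vendor B = {b}")
```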
I am a data hound, and I love the various tools that leverage all that juicy data that’s out there in the digital marketing space. But some of these tools cost a lot, and we deserve to know more about the quality and freshness of the data they serve up.
What do you think?