From SEO Audits to Web Checkups: Measuring Websites in Context

Can a Tailor Mend Their Own Tears?
Last week, as part of my PhD research, I built a comprehensive Web / SEO Checkup Tool to support my second academic paper. The goal was simple: to evaluate a website not by asking isolated questions like “Is the SEO good?” or “Is it fast?”, but by assessing it as a whole system.
Think of it as a blood test for websites.
The irony was unavoidable. The very first real-world test of the tool was my own website. You know the saying: “The tailor can’t mend their own tears.” This was exactly that moment.
In this post, I’ll walk through the checkup report generated for my own site and examine it comparatively against other sites built on the same foundation. Because one thing became clear very early on:
A single signal, on its own, usually means very little.
What Does This Web Checkup Tool Actually Do?
Before going further, it’s worth clarifying what this tool is not. This is not a classic SEO audit tool, nor is it a one-off Lighthouse report.
The system is designed around a true checkup philosophy.
Much like a medical checkup, it evaluates multiple dimensions together:
- Content structure and readability
- Accessibility (WCAG compliance)
- Performance (Lab & Field data)
- Visual system and contrast balance
- Security headers and technical health
- Domain trust signals
- UX, conversion, and tracking patterns
These signals are not interpreted in isolation, but in combination.
More importantly, instead of judging a single site in a vacuum, the same report is generated for competitor sites and placed side by side. In a checkup approach, the real value lies not in the raw number itself, but in its context.
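To make that concrete, here is a minimal sketch (not the tool's actual code) of how per-site reports could be pivoted into a side-by-side view. The site names and scores below are purely illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class CheckupReport:
    """One site's checkup: a score per dimension (readability, performance, ...)."""
    site: str
    scores: dict[str, float] = field(default_factory=dict)

def side_by_side(reports: list[CheckupReport]) -> dict[str, dict[str, float]]:
    """Pivot per-site reports into {dimension: {site: score}} so every signal
    is read against the other sites rather than in isolation."""
    dimensions = {dim for report in reports for dim in report.scores}
    return {
        dim: {r.site: r.scores.get(dim, float("nan")) for r in reports}
        for dim in sorted(dimensions)
    }

# Illustrative numbers only; not taken from the actual report.
reports = [
    CheckupReport("goker.me", {"readability": 38.0, "performance": 92.0}),
    CheckupReport("competitor-a.example", {"readability": 61.0, "performance": 90.0}),
    CheckupReport("competitor-b.example", {"readability": 55.0, "performance": 89.0}),
]
print(side_by_side(reports))
```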
Why Standalone Metrics Can Be Misleading
Contrast warnings are a good example. The same applies to readability, UX, or performance scores.
A report might say:
- “Low contrast risk”
- “Poor readability”
- “UX needs improvement”
But the real question is:
According to whom? Compared to what? And relative to which market?
That’s why this checkup tool is built around comparative analysis, not isolated scoring.
Same Template, Similar Sites, Very Different Outcomes
For this report, I deliberately chose an interesting scenario:
- The same technical stack
- The same design system
- The same component library
All used across three sites with different purposes.
The results were strikingly consistent on the surface:
- Technical scores were almost identical
- Performance metrics were very close
- The visual system clearly belonged to the same family
Yet the moment we looked at content, everything diverged.
Content Differences Create Perception Differences
This is where Flesch Reading Ease becomes far more interesting than a simple “readable / not readable” score.
When viewed comparatively, it starts to tell a deeper story:
- 20–30 Flesch → Academic, dense, highly selective
- 40–50 Flesch → Professional but accessible
- 60+ Flesch → Broad audience, fast consumption
Without explicitly trying to, we end up performing something broader:
An indirect audience and age-profile analysis.
If a competitor uses simpler language, it may indicate:
- A broader target audience
- Or a deliberate effort to reduce cognitive load
This is difficult to see when analyzing a single site in isolation. Side by side, however, the patterns become obvious.
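For reference, the score behind those bands is the standard Flesch Reading Ease formula: 206.835 − 1.015 × (words / sentences) − 84.6 × (syllables / words). Below is a rough Python sketch using a crude vowel-group syllable counter; the checkup tool itself may tokenize more carefully, and the bucket boundaries are a simplification of the bands above:

```python
import re

def count_syllables(word: str) -> int:
    # Crude heuristic: count vowel groups. Real tools use dictionaries or hyphenation rules.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text: str) -> float:
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    word_count = max(1, len(words))
    syllables = sum(count_syllables(w) for w in words)
    # Standard Flesch Reading Ease formula.
    return 206.835 - 1.015 * (word_count / sentences) - 84.6 * (syllables / word_count)

def audience_band(score: float) -> str:
    # Rounded version of the bands above.
    if score < 40:
        return "academic, dense, highly selective"
    if score < 60:
        return "professional but accessible"
    return "broad audience, fast consumption"

text = "The system evaluates accessibility, performance, and readability together."
score = flesch_reading_ease(text)
print(round(score, 1), "->", audience_band(score))
```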
The Contrast Question: Real Problem or Side Signal?
This is where things get fun.
Despite using a Balanced Theme System on my own site—one that mathematically enforces contrast rules—the report still surfaces contrast-related signals.
This is not a bug.
The visual analysis layer:
- Extracts color distribution from screenshots
- Calculates light/dark ratios
- And flags scenarios that deserve attention
But this does not automatically mean a WCAG violation. It’s a heuristic. Modern, perceptually balanced designs often appear “suspicious” to automated visual analysis systems.
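As a rough sketch of what such a heuristic can look like (assuming Pillow and a hypothetical screenshot file; this is not the tool's implementation):

```python
from PIL import Image  # assumes Pillow is installed

def light_dark_ratio(screenshot_path: str, threshold: float = 0.5) -> float:
    """Share of 'light' pixels in a page screenshot.
    A rough heuristic, not a WCAG contrast check: it only flags pages worth a manual look."""
    img = Image.open(screenshot_path).convert("L")  # grayscale, 0..255
    img = img.resize((200, 200))                    # we care about distribution, not detail
    pixels = list(img.getdata())
    light = sum(1 for p in pixels if p / 255 >= threshold)
    return light / len(pixels)

# Hypothetical usage:
# ratio = light_dark_ratio("goker_home.png")
# Pages that are almost entirely light or entirely dark get flagged for review,
# even when the theme itself is perfectly WCAG-compliant.
```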
Which brings us to an important reminder:
Automated tools can warn you — interpretation is still a human responsibility.
Why Comparative Reports Matter
When entering a new market or building a new product, we’re really trying to answer two questions:
- What are competitors doing technically?
- What are users already accustomed to?
Comparative checkup reports reveal not only competitors, but also implicit user expectations and tolerance thresholds.
A single site being “good” or “bad” is often meaningless. But if everyone behaves the same way, that behavior becomes the norm.
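One way to express that idea in code is to score a site by its deviation from the competitor median rather than against an absolute threshold. A minimal sketch with made-up numbers:

```python
from statistics import median

def deviation_from_norm(site_score: float, competitor_scores: list[float]) -> float:
    """Relative distance from the market norm (competitor median).
    Positive = above the norm, negative = below it."""
    norm = median(competitor_scores)
    return (site_score - norm) / norm if norm else 0.0

# Made-up numbers: a readability score of 38 looks "poor" in isolation,
# but in a market where competitors sit around 40-44 it is close to the norm.
print(round(deviation_from_norm(38.0, [42.0, 44.0, 40.0]), 2))  # -0.1
```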
Small Signals, Big Picture
In summary:
- A contrast signal alone → may be meaningless
- A readability score alone → can be misleading
- A UX score alone → lacks context
But when evaluated comparatively:
- The market’s language becomes visible
- User habits emerge clearly
- “Best practice” reveals itself as contextual, not absolute
In the next post, I’ll share how the identified issues were addressed, how the “tears were mended,” and how updating the shared template shifted these metrics.
A Small Note
If this report caught your interest and you’re curious about your own site’s health, we can run a small experiment.
👉 Anyone who buys me a coffee at https://buymeacoffee.com/goker gets a free Web / SEO Checkup, including:
- Your website
- 2 competitor sites
- 2 search terms
It’s a great way to test the tool in real-world conditions—and to interpret the results together.
SEO/Web Checkup Report Sample PDF for goker.me