How We Score
Every score on PageGuard is produced by automated analysis of publicly accessible page signals. This page explains exactly what we check, how we weight it, and what the results mean.
What a score is — and isn't
A PageGuard score is an automated snapshot of publicly visible signals at the time of the scan. It reflects what our scanner can observe from outside the site: HTML content, HTTP headers, third-party scripts, and publicly linked documents.
A score is not a legal audit, not a certification, and not an assessment of internal systems, server infrastructure, or private data handling practices that are not visible in the page source.
Scores are a starting point for improvement, not a verdict. A site can have a low score while still being operated responsibly — and vice versa. We encourage site owners to run a full scan to see the specific findings behind any score.
Overall score formula
The overall score is a weighted average of six dimension scores, each 0–100. Privacy carries the most weight because it has the most direct legal exposure for most sites.
When the Performance dimension is unavailable (PageSpeed API timeout or missing data), its 15% weight is redistributed proportionally across the remaining five dimensions.
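The redistribution amounts to renormalising the weights over whichever dimensions are present. The sketch below uses the weights from this page, but the function itself is illustrative, not PageGuard's actual implementation:

```python
# Dimension weights as stated on this page; the function is an illustrative sketch.
WEIGHTS = {
    "privacy": 0.30,
    "security": 0.20,
    "accessibility": 0.20,
    "performance": 0.15,
    "ai_readiness": 0.10,
    "schema": 0.05,
}

def overall_score(scores: dict[str, float]) -> float:
    """Weighted average of the available 0-100 dimension scores.

    A missing dimension (e.g. Performance after a PageSpeed timeout)
    simply drops out, and dividing by the sum of the remaining weights
    redistributes its weight proportionally across the rest.
    """
    available = {dim: w for dim, w in WEIGHTS.items() if dim in scores}
    total_weight = sum(available.values())
    return sum(scores[dim] * w for dim, w in available.items()) / total_weight
```

With all six dimensions at 100 the result is 100; with Performance missing, the other five weights (summing to 0.85) are scaled back up to 1.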
What we check per dimension
Privacy (30%)
Detects tracking scripts, analytics SDKs, ad pixels, session recorders, and fingerprinting libraries via HTML pattern matching against 200+ technology signatures. Checks for a linked privacy policy and cookie consent mechanism. Runs a second Puppeteer render to catch scripts that load after consent tools fire. Claude Sonnet analyses the actual privacy policy page (when reachable) for GDPR/CCPA coverage gaps.
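Signature matching of this kind can be sketched as regex search over the fetched HTML. The two entries below are hypothetical examples, not rows from the real 200+ signature table:

```python
import re

# Hypothetical signature table: the real scanner matches 200+ patterns.
TRACKER_SIGNATURES = {
    "google-analytics": re.compile(r"googletagmanager\.com/gtag|google-analytics\.com"),
    "meta-pixel": re.compile(r"connect\.facebook\.net/\S*fbevents\.js"),
}

def detect_trackers(html: str) -> set[str]:
    """Return the names of every signature that matches anywhere in the HTML."""
    return {name for name, pattern in TRACKER_SIGNATURES.items()
            if pattern.search(html)}
```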
Security (20%)
Reads HTTP response headers directly. Scores Content-Security-Policy, Strict-Transport-Security, X-Frame-Options, X-Content-Type-Options, Referrer-Policy, Permissions-Policy, and four others. Each missing or misconfigured header deducts points according to its severity.
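The deduction mechanic can be sketched as follows. The header list matches this page, but the point values are hypothetical, not PageGuard's actual deductions, and misconfiguration checks are omitted for brevity:

```python
# Hypothetical severity deductions per missing header (points off 100).
HEADER_DEDUCTIONS = {
    "content-security-policy": 25,
    "strict-transport-security": 20,
    "x-frame-options": 10,
    "x-content-type-options": 10,
    "referrer-policy": 10,
    "permissions-policy": 10,
}

def security_score(response_headers: dict[str, str]) -> int:
    """Start at 100 and deduct points for each missing security header."""
    present = {name.lower() for name in response_headers}
    score = 100
    for header, deduction in HEADER_DEDUCTIONS.items():
        if header not in present:
            score -= deduction
    return max(score, 0)
```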
Accessibility (20%)
Parses the rendered HTML with static analysis rules that check for missing alt text, form controls without accessible names, broken heading hierarchy, a missing document language, viewport zoom restrictions, missing skip navigation, autoplay media, and positive tabindex values. Checks are weighted by WCAG severity. axe-core browser-based analysis is being integrated to add colour contrast and ARIA validation.
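One such static rule can be sketched with Python's standard-library HTML parser. This covers only the missing-alt check, as a minimal illustration of the approach:

```python
from html.parser import HTMLParser

class MissingAltChecker(HTMLParser):
    """Count <img> tags that carry no alt attribute at all.

    Minimal illustration of one static rule; a real rule set also covers
    labels, headings, document language, tabindex, and more.
    """
    def __init__(self) -> None:
        super().__init__()
        self.missing_alt = 0

    def handle_starttag(self, tag, attrs):
        if tag == "img" and "alt" not in dict(attrs):
            self.missing_alt += 1
```

Note that an empty `alt=""` is valid for decorative images, so only a wholly absent attribute is flagged here.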
Performance (15%)
Calls the Google PageSpeed Insights API. Uses the Lighthouse lab score (0–100) when available. When Lighthouse cannot run (for example on sites with aggressive bot protection), the dimension score is omitted and its weight is redistributed across the other five dimensions. CrUX real-user field data is surfaced separately when available.
AI Readiness (10%)
Fetches and parses robots.txt to check whether major AI crawlers (GPTBot, ClaudeBot, PerplexityBot, etc.) are allowed or disallowed. Checks for an llms.txt file and for structured answers in the page content that AI systems can extract without scraping.
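The robots.txt side of this check can be approximated with Python's standard-library parser. The crawler list below is a subset of the agents this page names:

```python
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot"]

def ai_crawler_access(robots_txt: str,
                      url: str = "https://example.com/") -> dict[str, bool]:
    """Map each AI crawler to whether robots.txt lets it fetch the given URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in AI_CRAWLERS}
```

For example, a file that disallows GPTBot but allows everything else yields `True` for ClaudeBot and PerplexityBot and `False` for GPTBot.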
Schema / Structured Data (5%)
Parses JSON-LD blocks and microdata from the rendered HTML. Checks for valid Schema.org types (Organization, Product, Article, FAQPage, etc.), Open Graph tags, and Twitter card meta tags. Scores based on coverage and validity.
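JSON-LD extraction needs nothing beyond the standard library. This hypothetical extractor collects the @type of each well-formed block, skipping microdata and the meta-tag checks for brevity:

```python
import json
from html.parser import HTMLParser

class JsonLdTypes(HTMLParser):
    """Collect @type values from <script type="application/ld+json"> blocks."""
    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self.types: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_jsonld = False

    def handle_data(self, data):
        if self._in_jsonld and data.strip():
            block = json.loads(data)
            if isinstance(block, dict) and "@type" in block:
                self.types.append(block["@type"])
```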
Score tiers
80–100: Passes our automated threshold. Key signals present.
50–79: Some gaps detected. Worth reviewing before scaling.
0–49: Multiple signals missing. Significant exposure identified.
Tier labels describe the automated risk signal level, not legal compliance status. “Low Risk” does not mean a site is certified compliant with any regulation.
Scan mechanics
Two renders, not one. We run Puppeteer (headless Chromium) twice per scan. Some consent management platforms block tracking scripts on the first render before they fire; running twice and taking the union of detected technologies gives the most complete picture of what the site actually loads.
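The union step itself is simple set logic. The sets below are illustrative; in a real scan each would come from one Puppeteer render run through signature matching:

```python
# Illustrative only: in a real scan each set comes from one Puppeteer render.
first_render = {"google-tag-manager", "meta-pixel"}         # before consent fires
second_render = {"google-tag-manager", "session-recorder"}  # after consent fires

# The union captures everything the site actually loads across both passes.
detected = first_render | second_render
```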
Plain fetch as backup. Some sites block Puppeteer via bot detection (Cloudflare, etc.). We run a plain HTTP fetch in parallel and fall back to it for HTML analysis when Puppeteer is blocked.
Policy analysis via AI. For privacy scoring, we fetch the actual privacy policy page (when linked) and run it through Claude Sonnet to check for GDPR/CCPA-required clauses. This catches sites that have a policy but are missing key legal disclosures.
Point in time. Scores reflect what the scanner observed at the timestamp shown. Sites change. A score from three months ago is not a current assessment.
Publicly visible signals only. We do not access authenticated pages, admin panels, internal APIs, or any content behind a login. A site with strong security in its app but a weak public homepage will be scored on the homepage.
Score disputes and corrections
We take accuracy seriously. If you believe a score for your site is incorrect, we want to hear from you — and we will act on it.
How to dispute a score
- Email hello@getpageguard.com with the subject line "Score Dispute".
- Include your site URL and the specific dimension(s) you believe are inaccurate.
- Describe what you think the scanner missed or got wrong, and provide evidence where possible (e.g. a link to your privacy policy, a screenshot of your cookie banner).
What happens next
- We will acknowledge your dispute within one business day.
- We will re-run the scan and review the specific findings you raised.
- If the score is wrong, we will correct it and update the leaderboard. We will notify you when the correction is live.
- If the scanner is working correctly but you disagree with the methodology, we will note your feedback and explain our reasoning.
We will never refuse to engage with a dispute. If a finding is wrong, it gets fixed — full stop.
Want to see what PageGuard finds on your site?
Scan My Site Free →