SearchScore Agency Spec v2.2 (Build-Ready)

New in v2.2: scan cache semantics, webhook delivery requirements, entitlement re-check, report immutability, downgrade behaviour, billable vs telemetry separation, commercial control rules, white-label tightening

User Requirements & Technical Specification


Version: 2.2

Date: 2026-04-11

Author: Solution Architecture Review

Status: Build-Ready Draft

Updated: v2.2 - commercial control rules, cache semantics, webhook delivery, downgrade behaviour, report immutability




1. Executive Summary


SearchScore is an AI visibility optimisation platform. The Agency Subscription extends the core product to support marketing agencies, SEO consultants and freelancers who manage multiple client websites. Agencies need to demonstrate AI visibility gaps, deliver fixes and prove improvement over time, across all their clients from a single account.


The core product loop remains unchanged: Scan → Understand → Fix → Validate → Track. Every agency feature must support this loop. No scope expansion into social media, email marketing, CRM, campaign management, or generic SEO reporting.


Guiding Principles (non-negotiable)

1. Category focus at all costs - This is NOT an SEO tool, NOT a marketing suite, NOT a generic agency dashboard. It is a focused AI visibility system.

2. Commercially durable - Every architectural decision must support sustainable monetisation.

3. Operationally sane - Usage metering, entitlements, and cost controls are first-class requirements, not afterthoughts.

4. Platform-grade engineering - This is a platform rewrite, not a feature extension. Multi-tenancy, RBAC, queue architecture, and audit logging are structural, not optional.

5. Tool first, platform second - Build the best agency AI visibility tool in the market first. Only expose platform complexity where it increases revenue and defensibility.




2. User Personas


2.1 Primary: Agency Owner / Account Director

Role: Runs a digital marketing agency with 5-50 clients

Goal: Use AI visibility as a new service line or client retention tool

Pain: Needs to quickly show prospects why they're invisible to AI, then deliver measurable improvement

Behaviour: Wants branded reports, fast audits, and a way to manage multiple clients without switching accounts


2.2 Secondary: SEO Consultant / Freelancer

Role: Independent consultant managing 3-15 client sites

Goal: Add AI visibility audits to their service offering

Pain: Needs affordable multi-site access with professional output (reports, share links)

Behaviour: Runs audits during prospect calls, sends branded reports as follow-up


2.3 Tertiary: In-House Marketing Team

Role: Marketing manager overseeing multiple brand properties or subdomains

Goal: Monitor and improve AI visibility across the brand portfolio

Pain: Needs a single view of multiple properties with tracking over time

Behaviour: Weekly monitoring, monthly reporting to leadership




3. User Stories


3.1 Account Management

U-01 As an agency owner, I want to create a single account and manage all my clients under it, so I don't need separate logins

U-02 As an agency owner, I want to invite team members with role-based access (admin, analyst, viewer), so my team can work independently

U-03 As an agency owner, I want visibility into usage by team member, so I can understand who is consuming scans and costs


3.2 Client / Site Management

U-04 As an agency user, I want to add client domains to my workspace within my plan's allowance, so I can scale predictably

U-05 As an agency user, I want to organise clients into groups or folders (e.g. by industry, by account manager), so I can find them quickly

U-06 As an agency user, I want to see a single dashboard showing all client scores at a glance, so I can prioritise attention

U-07 As an agency user, I want to tag clients with custom labels (e.g. "at risk", "high opportunity", "onboarded"), so I can manage my pipeline

U-08 As an agency user, I want to assign a client to a specific team member, so responsibility is clear


3.3 Scanning & Auditing

U-09 As an agency user, I want to run a new scan on any client site from the dashboard, so I can get fresh data on demand

U-10 As an agency user, I want to trigger scans for multiple clients in bulk, so I don't have to do them one by one

U-11 As an agency user, I want to schedule automatic re-scans, so monitoring happens without manual effort

U-12 As an agency user, I want the scan to complete in under 60 seconds for a cached site, so I can run audits live on sales calls


3.4 AI Citation Engine

U-13 As an agency user, I want to see which AI queries mention my client and which don't, so I can identify content gaps

U-14 As an agency user, I want to add custom queries specific to my client's industry, so the citation data is relevant

U-15 As an agency user, I want to compare my client's citation rate against competitors, so I can quantify the gap

U-16 As an agency user, I want to see the actual AI response text, so I can show the client exactly what AI says about them and competitors


3.5 Fix Engine

U-17 As an agency user, I want to see a prioritised list of fixes for each client, sorted by impact, so I know what to tackle first

U-18 As an agency user, I want to mark fixes as in progress or completed, so I can track implementation status

U-19 As an agency user, I want to generate ready-to-implement assets (schema JSON-LD, llms.txt templates, robots snippets), so I can hand off to developers or implement myself

U-20 As an agency user, I want to see the estimated score improvement per fix, so I can prioritise the highest-value work


3.6 Competitor Tracking

U-21 As an agency user, I want to add 3-5 competitor domains per client, so I can benchmark performance

U-22 As an agency user, I want to see a side-by-side comparison of score, citations, and strengths/weaknesses, so I can explain the competitive landscape to clients

U-23 As an agency user, I want to see why a competitor is winning, so I can create a specific action plan


3.7 Reporting

U-24 As an agency user, I want to generate branded reports with my agency logo and colours, so the output looks professional when shared with clients

U-25 As an agency user, I want to export reports as PDF, so I can attach them to emails

U-26 As an agency user, I want to generate a shareable link for each report (with expiry and revoke), so clients can view results in a browser without logging in

U-27 As an agency user, I want before/after comparisons in reports, so I can demonstrate improvement over time

U-28 As an agency user, I want to generate a report for a prospect before they become a client, so I can use the audit as a sales tool

U-29 As an agency user, I want scheduled report delivery, so reporting can run on autopilot


3.8 White-Label

U-30 As an agency owner, I want to use a custom domain for client-facing pages, so clients see my brand

U-31 As an agency owner, I want client-facing pages to use my branding, so the output feels native to my agency

U-32 As an agency owner, I want to control whether "Powered by SearchScore" appears on client-facing outputs


3.9 Notifications & Alerts

U-33 As an agency user, I want to be notified when a client's score changes significantly, so I can respond proactively

U-34 As an agency user, I want to be notified when a new AI citation check shows a change in mention status, so I can update the client

U-35 As an agency user, I want email or Slack notifications for scan completions and scheduled reports, so nothing falls through the cracks


3.10 API & Integrations

U-36 As an agency user, I want API access to pull scan data into my own dashboards or tools, so I can integrate SearchScore into my workflow

U-37 As an agency user, I want to trigger scans via API, so I can build automated pipelines

U-38 As an agency user, I want webhook notifications for scan completion events, so my systems can react automatically


3.11 Usage & Billing

U-39 As an agency owner, I want to see usage by category (scans, citations, reports, API) for the current billing period, so I understand where value is being consumed

U-40 As an agency owner, I want to see approaching-limit warnings before hitting hard caps, so I'm not surprised

U-41 As an agency owner, I want to understand what higher plans or add-ons unlock, so I can decide when to upgrade




4. Functional Requirements


4.1 Account & Authentication


| ID | Requirement | Priority |
| --- | --- | --- |
| F-01 | Email/password authentication with optional SSO (Google, Microsoft) | Must |
| F-02 | Role-based access control: Owner, Admin, Analyst, Viewer | Must |
| F-03 | Team member invitation via email with role assignment | Must |
| F-04 | Password reset and session management | Must |
| F-05 | Two-factor authentication | Should |
| F-06 | Enterprise SSO support | Could |
| F-07 | All role restrictions enforced server-side at API level (not frontend-only) | Must |

4.2 Workspace & Site Management


| ID | Requirement | Priority |
| --- | --- | --- |
| F-08 | Single workspace per agency account containing all client sites | Must |
| F-09 | Add/remove client domains subject to plan entitlements and usage limits | Must |
| F-10 | Client grouping/folders with custom names | Should |
| F-11 | Custom tags per client site | Should |
| F-12 | Client assignment to team members | Should |
| F-13 | Bulk import of client domains via CSV | Should |
| F-14 | Client search and filter by score range, tag, status | Must |

4.3 Scan Cache Semantics


A scan is classified as cached when all of the following conditions are met:

• The domain has been scanned within the **freshness window** (default: 24 hours)

• The scan configuration has not changed (same query set, same model coverage)

• The site's robots.txt or server behaviour has not changed since last scan


Freshness window rules:

• Default window: 24 hours for structural/technical analysis

• Citation check freshness: separate 7-day window (AI responses change less frequently)

• Freshness windows are configurable per plan in the entitlements table


Manual re-scan behaviour:

• A manual "Run scan" from the dashboard or API always triggers a fresh scan (bypasses cache)

• Scheduled scans respect the freshness window and may return cached results

• "Re-scan after fix" always triggers a fresh scan


Usage metering for cached scans:

• Cached scans consume 0 billable scan credits

• Cached citation checks consume 0 billable citation credits

• Fresh scans consume 1 billable scan credit

• The `fresh_scan` boolean on the scans table determines billing


Cache invalidation triggers:

• Manual re-scan request

• Freshness window expiry

• Site content change detected (via HTTP ETag/Last-Modified)

• Plan entitlement change (different model coverage)
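
A minimal sketch of how these rules could combine into a single cache decision, in TypeScript. The function and field names (`resolveScanMode`, `configHash`, `robotsEtag`) are illustrative assumptions; only the rules themselves come from this section.

```typescript
// Illustrative only: decides whether a scan request can be served from cache
// and whether it consumes a billable credit, per the rules above.
type ScanTrigger = 'manual' | 'scheduled' | 'api' | 'bulk' | 'rescan_after_fix';

interface LastScan {
  completedAt: Date;
  configHash: string;          // hash of query set + model coverage (assumed representation)
  robotsEtag: string | null;   // proxy for robots.txt / server behaviour changes
}

interface CacheDecision {
  freshScan: boolean; // maps to scans.fresh_scan
  billable: boolean;  // fresh scans consume 1 credit, cached scans consume 0
}

function resolveScanMode(
  trigger: ScanTrigger,
  lastScan: LastScan | null,
  currentConfigHash: string,
  currentRobotsEtag: string | null,
  freshnessHours: number, // from entitlements.scan_cache_freshness_hours
  now: Date = new Date(),
): CacheDecision {
  // Manual runs, API-triggered runs and post-fix re-scans always bypass the cache.
  const forcesFresh =
    trigger === 'manual' || trigger === 'api' || trigger === 'rescan_after_fix';

  const withinWindow =
    lastScan !== null &&
    now.getTime() - lastScan.completedAt.getTime() < freshnessHours * 60 * 60 * 1000;

  const configUnchanged = lastScan !== null && lastScan.configHash === currentConfigHash;
  const robotsUnchanged = lastScan !== null && lastScan.robotsEtag === currentRobotsEtag;

  const servedFromCache = !forcesFresh && withinWindow && configUnchanged && robotsUnchanged;
  return { freshScan: !servedFromCache, billable: !servedFromCache };
}
```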


4.4 Usage Metering & Entitlements


| ID | Requirement | Priority |
| --- | --- | --- |
| F-15 | Track billable events: fresh scans, citation queries, report generations, API requests | Must |
| F-16 | Track operational telemetry separately: model calls by model, cache hits, queue wait times, report views, webhook retries | Must |
| F-17 | Enforce plan entitlements and usage caps at API level | Must |
| F-18 | Expose current usage and remaining allowance in billing/settings UI | Must |
| F-19 | Entitlement system configurable by plan (not hardcoded in business logic) | Must |
| F-20 | Distinguish fresh scans vs cached scans in metering (only fresh scans are billable) | Must |
| F-21 | Support for soft-limit warnings and hard-limit enforcement | Should |
| F-22 | Support overage usage accounting for future billing integration | Should |
| F-23 | Billable events and operational telemetry must be stored in separate tables or clearly separated by a billable boolean | Must |
| F-24 | Re-check org entitlements and quota at async job execution time (not just at queue time) | Must |

Entitlements to configure per plan:

• max_sites, max_scans_per_month, max_custom_queries_per_site, max_competitors_per_site

• max_team_members, report_generation_quota, api_enabled, webhook_enabled

• white_label_level (none / basic / advanced / full), custom_domain_enabled

• scheduling_frequency_allowed, citation_query_count, model_coverage
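
A minimal sketch of F-17-style enforcement at the API layer, assuming an Express backend (Section 13). The helpers `loadEntitlements` and `countBillableEvents`, the `req.org` property and the 402 status code are assumptions, not part of this spec.

```typescript
import type { NextFunction, Request, Response } from 'express';

interface Entitlements {
  max_scans_per_month: number;
  // ...remaining plan limits from the entitlements table
}

// Assumed helpers: read plan limits and sum billable usage_events for the period.
declare function loadEntitlements(plan: string): Promise<Entitlements>;
declare function countBillableEvents(orgId: string, type: string, period: string): Promise<number>;

export function requireScanQuota() {
  return async (req: Request, res: Response, next: NextFunction) => {
    const org = (req as any).org; // populated by the auth layer (assumed)
    const limits = await loadEntitlements(org.plan);
    const period = new Date().toISOString().slice(0, 7); // "2026-04" billing period format
    const used = await countBillableEvents(org.id, 'scan_fresh', period);

    if (used >= limits.max_scans_per_month) {
      // Hard-limit enforcement; soft-limit warnings (F-21) would be surfaced earlier in the UI.
      return res.status(402).json({ error: 'scan_quota_exceeded', used, limit: limits.max_scans_per_month });
    }
    return next();
  };
}
```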


4.5 Scanning Engine


| ID | Requirement | Priority |
| --- | --- | --- |
| F-26 | On-demand scan for any client domain | Must |
| F-27 | Scan completes in <60s for previously cached domains | Must |
| F-28 | Scan completes in <120s for new domains | Must |
| F-29 | Scheduled recurring scans (daily, weekly, monthly), subject to plan entitlement | Must |
| F-30 | Bulk scan trigger (scan all / scan filtered group) | Should |
| F-31 | Scan queue management (pause, cancel, prioritise) | Could |
| F-32 | Scan history with full data retention (12 months minimum) | Must |
| F-33 | Scan queue with job architecture: retries, dead-letter handling, per-org rate limiting | Must |
| F-34 | Plan-based concurrency limits (one large agency must not starve others) | Must |
| F-35 | Idempotent job execution | Must |

4.6 Scoring


| ID | Requirement | Priority |
| --- | --- | --- |
| F-36 | Overall AI Visibility Score 0-100 with tier labels | Must |
| F-37 | Sub-scores: Structure, Clarity, Authority, Accessibility | Must |
| F-38 | Score trend over time (line chart with selectable date range) | Must |
| F-39 | Score comparison against industry average | Should |
| F-40 | Estimated improvement potential based on unresolved fixes | Must |

Tier Labels: Keep 5 tiers. Adopt naming: Invisible / Low Visibility / Emerging / Competitive / Dominant. Preserves existing data; "Dominant" aligns with aspirational agency positioning.
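
For illustration, a score-to-tier mapping using these labels. The boundary values below are assumptions; the spec does not define the thresholds.

```typescript
type Tier = 'Invisible' | 'Low Visibility' | 'Emerging' | 'Competitive' | 'Dominant';

// Thresholds are illustrative placeholders only.
function tierForScore(score: number): Tier {
  if (score >= 80) return 'Dominant';
  if (score >= 60) return 'Competitive';
  if (score >= 40) return 'Emerging';
  if (score >= 20) return 'Low Visibility';
  return 'Invisible';
}
```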


4.7 AI Citation Engine


| ID | Requirement | Priority |
| --- | --- | --- |
| F-41 | Predefined query set (10-20) per site, auto-generated from site content and industry | Must |
| F-42 | Custom query addition per client site, subject to plan allowance | Must |
| F-43 | Citation check across supported models (MVP minimum 2, target 3) | Must |
| F-44 | Multi-model citation checks configurable by plan or feature flag | Must |
| F-45 | Per-query result: mention status, position, competitors mentioned, AI response snapshot | Must |
| F-46 | Aggressive caching and reusable result windows where valid | Must |
| F-47 | Summary: "Mentioned in X of Y queries" with trend | Must |
| F-48 | Competitor citation comparison | Must |
| F-49 | Insight generation: "Missing from queries related to [topic cluster]" | Should |
| F-50 | Citation history over time | Should |
| F-51 | Optional deferred execution for less critical checks | Could |

4.8 Fix Engine


| ID | Requirement | Priority |
| --- | --- | --- |
| F-52 | Prioritised fix list sorted by impact (High/Medium/Low) | Must |
| F-53 | Each fix includes: title, impact, effort, plain-English explanation, exact steps | Must |
| F-54 | Fix categories: Technical, Content, Entity/Authority | Must |
| F-55 | Fix status tracking: Not Started / In Progress / Completed, with owner field | Must |
| F-56 | Generated assets: schema JSON-LD, llms.txt template, robots.txt snippets | Must |
| F-57 | Estimated score gain per fix (clearly labelled as estimate, not guaranteed) | Must |
| F-58 | "Re-scan after fix" button to validate improvement | Must |
| F-59 | Fix history/change log per site | Must |
| F-60 | Data model must not block future: comments, client approval state, team handoff | Should |
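
As an example of a generated asset (F-56), a sketch of a JSON-LD builder whose output could be stored in `fix_items.generated_asset`. The Organization properties shown are a common schema.org baseline; which properties the real fix engine emits is not defined here.

```typescript
// Builds a copy-paste-ready schema.org Organization snippet for a client site.
function buildOrganizationJsonLd(site: { name: string; domain: string; logoUrl?: string }): string {
  const schema = {
    '@context': 'https://schema.org',
    '@type': 'Organization',
    name: site.name,
    url: `https://${site.domain}`,
    ...(site.logoUrl ? { logo: site.logoUrl } : {}),
  };
  // Returned as text so it can be handed to a developer or pasted into a <head> verbatim.
  return `<script type="application/ld+json">\n${JSON.stringify(schema, null, 2)}\n</script>`;
}
```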

4.9 Competitor Module


| ID | Requirement | Priority |
| --- | --- | --- |
| F-61 | Add up to 5 competitor domains per client site, subject to plan | Must |
| F-62 | Side-by-side score comparison | Must |
| F-63 | Citation rate comparison | Must |
| F-64 | Signal gap analysis: "Competitor has X, you don't" | Must |
| F-65 | Competitive insight: "Why they're being recommended instead of you" | Must |
| F-66 | Competitor score history over time | Should |

4.10 Reporting


Reports are immutable point-in-time snapshots. Once generated, a report must not silently update based on newer scans or changed data. The data embedded in a report is the data as it existed at generation time.


This is critical for:

• Client trust (what they saw yesterday is what they see today)

• Sales workflows (prospect audit remains valid for the sales cycle)

• Before/after proof (baseline cannot retroactively change)

• Auditability (regulatory and internal review)


Implementation: reports store snapshot copies of all referenced data (scores, issues, citation results) rather than foreign keys to live data. PDFs are generated once and stored.
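
A sketch of the snapshot payload a report could embed at generation time. The field names are illustrative, and the reports table in Section 6.2 would need a stored copy of this data (for example a snapshot column) alongside its scan references; the essential point is that nothing in it is recomputed from live tables after generation.

```typescript
interface ReportSnapshot {
  generatedAt: string; // ISO timestamp of generation
  site: { domain: string; name: string };
  scan: {
    overallScore: number;
    tier: string;
    subScores: { structure: number; clarity: number; authority: number; accessibility: number };
  };
  issues: Array<{ title: string; impact: 'high' | 'medium' | 'low' }>;
  citations: Array<{ query: string; model: string; mentioned: boolean }>;
}
// Written once when the report is generated, rendered once into the stored PDF,
// and never updated when newer scans land.
```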


| ID | Requirement | Priority |
| --- | --- | --- |
| F-67 | Executive summary report (score + top issues + recommendation) | Must |
| F-68 | Full diagnostic report (all sub-scores + all fixes + citation data) | Must |
| F-69 | Competitor comparison report | Should |
| F-70 | Before/after progress report | Must |
| F-71 | PDF export | Must |
| F-72 | Shareable link with configurable expiry and revoke support | Must |
| F-73 | Share links must never expose internal operational metadata | Must |
| F-74 | Branded reports (agency logo, colours, custom footer) | Must |
| F-75 | Multiple report templates for different use cases | Should |
| F-76 | Scheduled report generation and email delivery | Should |
| F-77 | Lead-gen audit report (for prospects, before they become clients) | Must |
| F-78 | Prospect-safe report mode (no internal data leakage) | Must |
| F-79 | Reports are immutable point-in-time snapshots (no silent updates from newer data) | Must |

4.11 White-Label (Tiered, Client-Facing Only)


White-label applies to client-facing surfaces only: reports, share pages, optional custom domains. Internal agency workspace UI remains SearchScore-branded unless explicitly enabled for enterprise plans.


White-label is NOT an all-or-nothing toggle. Three independent levels, enabled by plan and feature flag:


Level 1: Basic Branding

| ID | Requirement | Priority | Plan |
| --- | --- | --- | --- |
| F-80 | Agency logo on reports and share pages | Must | Agency |
| F-81 | Primary/secondary colour customisation | Must | Agency |
| F-82 | "Powered by SearchScore" footer toggle | Should | Agency |

Level 2: Advanced Branding

| ID | Requirement | Priority | Plan |
| --- | --- | --- | --- |
| F-83 | Custom domain (CNAME) for client-facing pages | Should | Agency Plus |
| F-84 | Branded report pages with full visual control | Should | Agency Plus |
| F-85 | Removal of SearchScore attribution on external surfaces | Should | Agency Plus |

Level 3: Full White-Label

| ID | Requirement | Priority | Plan |
| --- | --- | --- | --- |
| F-86 | Custom email sender for scheduled reports | Could | Agency Plus |
| F-87 | No SearchScore branding on client-facing surfaces other than optional legal/footer attribution controls | Could | Agency Plus |

4.12 Notifications & Alerts


| ID | Requirement | Priority |
| --- | --- | --- |
| F-88 | Email notification on scan completion (default on, digestible) | Must |
| F-89 | Alert on score change above configurable threshold (default on) | Should |
| F-90 | Alert on citation status change (default off) | Should |
| F-91 | Per-channel notification preferences | Should |
| F-92 | In-app notification centre | Should |
| F-93 | Slack integration (lightweight, not core MVP unless trivial) | Could |

Principle: Do not build a spam machine. Default notifications must be limited and meaningful. Do not overinvest here before core usage is proven.


4.13 API (Premium Platform Layer)


API access must be entitlement-driven and support separate monetisation from dashboard seats.


| ID | Requirement | Priority |
| --- | --- | --- |
| F-94 | REST API with API key authentication | Must |
| F-95 | Separate API permissions from dashboard access | Must |
| F-96 | Separate API usage tracking and rate limits | Must |
| F-97 | Endpoints: list sites, trigger scan, get scan results, get fixes, get citations | Must |
| F-98 | Webhook registration for scan completion events | Should |
| F-99 | API rate limiting enforced per organisation and per key (plan-based) | Must |
| F-100 | API documentation (OpenAPI/Swagger) | Must |
| F-101 | Architectural support for: bundled API for higher plans, standalone paid add-on later | Must |

4.14 Webhook Delivery Requirements


| ID | Requirement | Priority |
| --- | --- | --- |
| F-102 | Signed webhook payloads using per-org HMAC-SHA256 secret | Must |
| F-103 | Exponential backoff retry policy: 1s, 5s, 30s, 5m, 30m (5 attempts max) | Must |
| F-104 | Dead-letter behaviour: after max retries, mark as permanently failed, alert org admin | Must |
| F-105 | Unique event ID (UUID) in every payload for consumer idempotency | Must |
| F-106 | Supported event types: scan.completed, scan.failed, citation.changed, score.threshold_reached, report.generated | Must |
| F-107 | Webhook delivery status visible in org settings (last delivery, success rate, failures) | Should |
| F-108 | Per-webhook secret rotation without downtime | Should |
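
A minimal sketch of payload construction, signing and the retry schedule above, using Node's built-in crypto. The header name and payload shape are assumptions.

```typescript
import { createHmac, randomUUID } from 'node:crypto';

// 1s, 5s, 30s, 5m, 30m - after the fifth failure the delivery is marked dead_letter.
const RETRY_DELAYS_MS = [1_000, 5_000, 30_000, 300_000, 1_800_000];

function buildWebhookEvent(eventType: string, data: unknown) {
  return {
    event_id: randomUUID(),                 // consumer-side idempotency key
    event_type: eventType,                  // e.g. "scan.completed"
    created_at: new Date().toISOString(),
    data,
  };
}

function signPayload(rawBody: string, orgWebhookSecret: string): string {
  // Per-org HMAC-SHA256 over the raw request body, sent alongside the payload
  // (e.g. in an X-SearchScore-Signature header - the header name is an assumption).
  return createHmac('sha256', orgWebhookSecret).update(rawBody).digest('hex');
}
```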

4.15 Downgrade / Over-Entitlement Behaviour


When an account downgrades below its current usage or feature footprint:


| Scenario | Behaviour |
| --- | --- |
| Excess sites (above new plan limit) | All existing sites remain readable; no new sites can be added until under limit |
| Excess scans/quota | Current period usage stands; no new billable actions until period resets or plan upgraded |
| Premium automation (scheduled scans, bulk) | Paused if plan no longer entitled; admin prompted to resolve |
| API keys above plan | Disabled (not deleted); admin prompted to reduce or upgrade |
| White-label features | Revert gracefully for new outputs only; existing reports/PDFs are NOT broken retroactively |
| Historical reports and PDFs | Never broken or retroactively modified by plan change |
| Admin notification | Org admin receives clear summary of overages and actions required |

Principle: Downgrade is graceful, not destructive. Existing data and outputs are always preserved. Only new actions are restricted.
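
A sketch, under assumed helper names, of a downgrade assessment that only reports what must be restricted and never deletes anything, in line with the table above.

```typescript
interface DowngradeAssessment {
  excessSites: number;          // existing sites stay readable; adding new ones is blocked
  apiKeyIdsToDisable: string[]; // disabled, not deleted
  pauseAutomation: boolean;     // scheduled/bulk scans paused if no longer entitled
}

// Assumed lookups against the SaaS database.
declare function countSites(orgId: string): Promise<number>;
declare function listActiveApiKeyIds(orgId: string): Promise<string[]>;

async function assessDowngrade(
  orgId: string,
  newPlan: { max_sites: number; api_enabled: boolean; scheduling_frequencies: string[] },
): Promise<DowngradeAssessment> {
  const siteCount = await countSites(orgId);
  return {
    excessSites: Math.max(0, siteCount - newPlan.max_sites),
    apiKeyIdsToDisable: newPlan.api_enabled ? [] : await listActiveApiKeyIds(orgId),
    pauseAutomation: newPlan.scheduling_frequencies.length === 0,
  };
}
// The caller notifies the org admin with this summary; no rows are deleted.
```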




5. Non-Functional Requirements


5.1 Performance

| ID | Requirement |
| --- | --- |
| NFR-01 | Dashboard loads in <2 seconds after scan |
| NFR-02 | First scan completes in <120 seconds |
| NFR-03 | Cached scan returns in <60 seconds |
| NFR-04 | PDF report generation in <10 seconds |
| NFR-05 | API response time <500ms for non-scan data endpoints |

5.2 Scalability

| ID | Requirement |
| --- | --- |
| NFR-06 | Support 500+ client sites per agency account |
| NFR-07 | Support 50+ concurrent scan requests across the platform with per-org controls |
| NFR-08 | Score history retained for 24 months |
| NFR-09 | Citation snapshots retained for 12 months |

5.3 Security

| ID | Requirement |
| --- | --- |
| NFR-10 | All data encrypted at rest (AES-256) and in transit (TLS 1.3) |
| NFR-11 | API keys hashed, never stored in plaintext |
| NFR-12 | Session tokens expire after 24 hours or per security policy |
| NFR-13 | Role-based access enforced at API level for every endpoint |
| NFR-14 | Audit log for administrative and sensitive actions |
| NFR-15 | Strong tenancy enforcement - org A cannot access org B's data |

5.4 Availability

| ID | Requirement |
| --- | --- |
| NFR-16 | 99.5% uptime target for dashboard |
| NFR-17 | Scan queue resilient to individual failures (retry logic, dead-letter handling) |
| NFR-18 | Graceful degradation: dashboard shows last cached data if scan service unavailable |

5.5 Platform Architecture

| ID | Requirement |
| --- | --- |
| NFR-19 | Clear service boundaries between SaaS layer and scan engine |
| NFR-20 | Idempotent job handling for all async operations |
| NFR-21 | Entitlement checks at API level for every request |
| NFR-22 | Usage instrumentation on every billable event |



6. Data Model


6.1 Entity Relationship Overview


Organisation (1) ──< (M) User
Organisation (1) ──< (M) ClientSite
Organisation (1) ──< (M) ApiKey
Organisation (1) ──< (M) WebhookDelivery
Organisation (1) ──< (1) BrandingConfig
Organisation (1) ──< (M) UsageEvent
Organisation (1) ──< (M) AuditLog
ClientSite (1) ──< (M) Scan
ClientSite (1) ──< (M) Competitor
ClientSite (1) ──< (M) CustomQuery
ClientSite (1) ──< (M) Report
Scan (1) ──< (M) CitationResult
Scan (1) ──< (M) FixItem

6.2 Core Tables


organisations

| Column | Type | Notes |
| --- | --- | --- |
| id | UUID | Primary key |
| name | VARCHAR(255) | Agency name |
| plan | ENUM | pro, agency, agency_plus |
| billing_email | VARCHAR(255) | |
| stripe_customer_id | VARCHAR(255) | |
| created_at | TIMESTAMP | |
| settings_json | JSONB | Feature flags, overrides |

users

| Column | Type | Notes |
| --- | --- | --- |
| id | UUID | Primary key |
| org_id | UUID | FK -> organisations |
| email | VARCHAR(255) | Unique |
| password_hash | VARCHAR(255) | bcrypt |
| role | ENUM | owner, admin, analyst, viewer |
| name | VARCHAR(255) | |
| avatar_url | TEXT | |
| last_login | TIMESTAMP | |
| created_at | TIMESTAMP | |

client_sites

| Column | Type | Notes |
| --- | --- | --- |
| id | UUID | Primary key |
| org_id | UUID | FK -> organisations |
| domain | VARCHAR(255) | Normalised domain |
| name | VARCHAR(255) | Client display name |
| industry | VARCHAR(100) | |
| assigned_user_id | UUID | FK -> users (nullable) |
| tags | TEXT[] | PostgreSQL array |
| folder | VARCHAR(255) | Grouping |
| scan_schedule | JSONB | {"frequency": "weekly", "day": "monday"} |
| created_at | TIMESTAMP | |

competitors (separate table for clean normalisation)

| Column | Type | Notes |
| --- | --- | --- |
| id | UUID | Primary key |
| site_id | UUID | FK -> client_sites |
| domain | VARCHAR(255) | |
| label | VARCHAR(255) | Display name |
| created_at | TIMESTAMP | |

custom_queries (separate table for clean normalisation)

| Column | Type | Notes |
| --- | --- | --- |
| id | UUID | Primary key |
| site_id | UUID | FK -> client_sites |
| query | TEXT | |
| created_by | UUID | FK -> users (nullable) |
| created_at | TIMESTAMP | |

scans

| Column | Type | Notes |
| --- | --- | --- |
| id | UUID | Primary key |
| site_id | UUID | FK -> client_sites |
| triggered_by | ENUM | manual, scheduled, api, bulk |
| triggered_by_user_id | UUID | FK -> users (nullable) |
| status | ENUM | queued, running, completed, failed |
| overall_score | INTEGER | 0-100 |
| tier | VARCHAR(50) | |
| structure_score | INTEGER | 0-100 |
| clarity_score | INTEGER | |
| authority_score | INTEGER | |
| accessibility_score | INTEGER | |
| category_scores | JSONB | Full breakdown |
| issues_json | JSONB | All detected issues |
| improvement_potential | INTEGER | Estimated max score gain |
| fresh_scan | BOOLEAN | True if new crawl, false if cached |
| duration_ms | INTEGER | |
| started_at | TIMESTAMP | |
| completed_at | TIMESTAMP | |
| created_at | TIMESTAMP | |

citation_results

| Column | Type | Notes |
| --- | --- | --- |
| id | UUID | Primary key |
| scan_id | UUID | FK -> scans |
| query | TEXT | The AI prompt tested |
| model | VARCHAR(50) | chatgpt, perplexity, gemini |
| mentioned | BOOLEAN | |
| position | INTEGER | 1st, 2nd, etc. (null if not mentioned) |
| competitors_mentioned | TEXT[] | |
| response_snapshot | TEXT | AI response text |
| cached | BOOLEAN | True if reused from prior check |
| created_at | TIMESTAMP | |

fix_items

| Column | Type | Notes |
| --- | --- | --- |
| id | UUID | Primary key |
| site_id | UUID | FK -> client_sites |
| scan_id | UUID | FK -> scans (source scan) |
| title | VARCHAR(255) | |
| category | ENUM | technical, content, entity |
| impact | ENUM | high, medium, low |
| effort | ENUM | low, medium, high |
| explanation | TEXT | Why it matters (plain English) |
| steps | JSONB | Array of step objects |
| generated_asset | TEXT | Code/template to copy |
| estimated_gain | INTEGER | Estimated score points (labelled as estimate) |
| status | ENUM | not_started, in_progress, completed |
| owner_user_id | UUID | FK -> users (nullable) |
| completed_at | TIMESTAMP | |
| change_history | JSONB | Array of status change events |
| created_at | TIMESTAMP | |

reports

| Column | Type | Notes |
| --- | --- | --- |
| id | UUID | Primary key |
| site_id | UUID | FK -> client_sites |
| type | ENUM | executive, diagnostic, competitor, progress, lead_gen |
| from_scan_id | UUID | FK -> scans |
| to_scan_id | UUID | FK -> scans (for progress reports, nullable) |
| share_token | VARCHAR(255) | For shareable links |
| share_expires_at | TIMESTAMP | Configurable expiry |
| share_revoked | BOOLEAN | Support for revocation |
| pdf_url | TEXT | CDN URL |
| branded | BOOLEAN | |
| template_id | VARCHAR(50) | Report template identifier |
| created_at | TIMESTAMP | |

branding_configs

| Column | Type | Notes |
| --- | --- | --- |
| id | UUID | Primary key |
| org_id | UUID | FK -> organisations (1:1) |
| level | ENUM | none, basic, advanced, full |
| logo_url | TEXT | |
| primary_colour | VARCHAR(7) | Hex |
| secondary_colour | VARCHAR(7) | Hex |
| font_family | VARCHAR(100) | |
| custom_domain | VARCHAR(255) | CNAME |
| domain_verified | BOOLEAN | |
| show_powered_by | BOOLEAN | Default true |
| email_sender | VARCHAR(255) | Custom sender (full white-label only) |
| created_at | TIMESTAMP | |

api_keys

| Column | Type | Notes |
| --- | --- | --- |
| id | UUID | Primary key |
| org_id | UUID | FK -> organisations |
| key_hash | VARCHAR(255) | SHA-256 of API key |
| key_prefix | VARCHAR(8) | First 8 chars for identification |
| name | VARCHAR(100) | User-given name |
| permissions | TEXT[] | Scopes (separate from dashboard access) |
| scopes | TEXT[] | API-specific scopes |
| last_used_at | TIMESTAMP | |
| created_at | TIMESTAMP | |

usage_events

| Column | Type | Notes |
| --- | --- | --- |
| id | UUID | Primary key |
| org_id | UUID | FK -> organisations |
| user_id | UUID | FK -> users (nullable) |
| type | VARCHAR(100) | scan_fresh, scan_cached, citation_query, report_gen, api_request, webhook_delivery, share_view |
| billable | BOOLEAN | True for quota-consuming events; false for operational telemetry |
| quantity | INTEGER | Usually 1 |
| metadata_json | JSONB | {"fresh": true, "model": "chatgpt", "site_id": "...", "cache_hit": false} |
| billing_period | VARCHAR(20) | "2026-04" format |
| created_at | TIMESTAMP | |

Billable events (consume plan quota): scan_fresh, citation_query, report_gen, api_request


Operational telemetry (do NOT consume quota): scan_cached, webhook_delivery, share_view, model_call, cache_hit, queue_wait_time
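
A sketch of how an event could be classified as billable at write time so that quota queries only ever sum billable rows. The event type strings come from the table above; the function itself is illustrative.

```typescript
const BILLABLE_TYPES = new Set(['scan_fresh', 'citation_query', 'report_gen', 'api_request']);

interface UsageEvent {
  org_id: string;
  user_id: string | null;
  type: string;
  billable: boolean;
  quantity: number;
  billing_period: string; // "2026-04" format
  metadata_json: Record<string, unknown>;
}

function buildUsageEvent(
  orgId: string,
  type: string,
  metadata: Record<string, unknown>,
  userId: string | null = null,
): UsageEvent {
  const now = new Date();
  return {
    org_id: orgId,
    user_id: userId,
    type,
    billable: BILLABLE_TYPES.has(type), // everything else is operational telemetry
    quantity: 1,
    billing_period: `${now.getUTCFullYear()}-${String(now.getUTCMonth() + 1).padStart(2, '0')}`,
    metadata_json: metadata,
  };
}
```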


entitlements (plan limits as data, not code)

| Column | Type | Notes |
| --- | --- | --- |
| plan | VARCHAR(50) | Primary key |
| max_sites | INTEGER | |
| max_scans_per_month | INTEGER | Fresh scans only; cached scans are free |
| max_citation_queries_per_site | INTEGER | |
| max_custom_queries_per_site | INTEGER | |
| max_competitors_per_site | INTEGER | |
| max_team_members | INTEGER | |
| report_quota_per_month | INTEGER | |
| api_enabled | BOOLEAN | |
| api_rate_limit_per_hour | INTEGER | |
| webhook_enabled | BOOLEAN | |
| webhook_max_retries | INTEGER | Default 5 |
| white_label_level | ENUM | none, basic, advanced, full |
| custom_domain_enabled | BOOLEAN | |
| scheduling_frequencies | TEXT[] | ["weekly"], ["daily"], etc. |
| citation_models | TEXT[] | ["chatgpt", "perplexity"], etc. |
| scan_cache_freshness_hours | INTEGER | Default 24 |
| citation_cache_freshness_hours | INTEGER | Default 168 (7 days) |

audit_logs

| Column | Type | Notes |
| --- | --- | --- |
| id | UUID | Primary key |
| org_id | UUID | FK -> organisations |
| actor_user_id | UUID | FK -> users (nullable) |
| action | VARCHAR(255) | |
| target_type | VARCHAR(100) | |
| target_id | UUID | Nullable |
| metadata_json | JSONB | |
| created_at | TIMESTAMP | |

webhook_deliveries

| Column | Type | Notes |
| --- | --- | --- |
| id | UUID | Primary key |
| org_id | UUID | FK -> organisations |
| webhook_id | UUID | FK -> webhooks |
| event_id | UUID | Unique event ID for consumer idempotency |
| event_type | VARCHAR(100) | scan.completed, scan.failed, citation.changed, score.threshold_reached, report.generated |
| payload_hash | VARCHAR(255) | |
| signature | VARCHAR(255) | HMAC-SHA256 using per-org webhook secret |
| status | VARCHAR(50) | pending, delivered, failed, dead_letter |
| attempts | INTEGER | Max 5 before dead_letter |
| last_attempt_at | TIMESTAMP | |
| response_code | INTEGER | Nullable |
| created_at | TIMESTAMP | |



7. Commercial Control Rules


These rules are non-negotiable and apply across all implementation layers:


1. No feature with variable infrastructure cost may be implemented without entitlement hooks. Every feature that costs money to run (scans, citations, AI model calls, PDF generation, webhook delivery) must check entitlements before execution.


2. All premium capabilities must support plan-based enablement and later add-on monetisation. API access, white-label, custom domain, and advanced citation models must be architecturally prepared for standalone add-on pricing from day one, even if initially bundled.


3. Async jobs must re-check entitlement at execution time. Scheduled scans, bulk scans, report generation, and citation jobs must verify the org's current plan and quota when the job runs, not just when it was queued. This protects against downgrades, expired billing states, and queued over-usage.


4. Reports are immutable point-in-time snapshots. Generated reports store snapshot copies of all referenced data. They must not silently update based on newer scans or changed data.


5. Excess usage must be meterable even if billing is not yet enabled. Every billable event must be tracked and attributable to an org and billing period, even if Stripe metered billing is not yet connected. This ensures the accounting infrastructure is ready when billing is activated.


6. Downgrade behaviour must be graceful and deterministic. Existing data and outputs are never destroyed by plan changes. Only new actions are restricted. Admin is always notified of overages with clear resolution steps.


Protected commercial surfaces (require explicit entitlement, must not be casually bundled or left unmetered):

• Citation-heavy usage (multi-model, high query counts)

• API access (separate from dashboard)

• White-label / custom domain

• PDF report generation

• Webhook delivery




8. Data Ownership Boundaries


Explicit split between systems. No dual ownership.


SaaS Database (PostgreSQL) owns:

• Organisations, users, permissions, RBAC

• Client sites, tags, folders, assignments

• Competitors, custom queries

• Branding configs, API keys, webhooks

• Reports, share tokens

• Fix status, fix history, fix owners

• Usage events, entitlements, billing state

• Audit logs, notification settings

• Webhook delivery records


Scan System (existing sites.db) owns:

• Raw scan execution, timing, duration

• Raw analysis output (scores, issues, category data)

• Citation run results and AI response snapshots

• Scan artifacts (cached pages, extracted text)

• Scan queue state


Interface: Scan outputs are normalised and persisted into the SaaS layer via a defined write boundary. Scan system produces results; SaaS system consumes and stores them. No shared mutable state.
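
A sketch of what the write boundary could carry from the scan system into the SaaS layer. All field names are assumptions derived from the scans and citation_results tables in Section 6.

```typescript
// The scan engine produces this once per run; the SaaS layer persists it into its
// own tables and never reads the scan system's storage directly (and vice versa).
interface ScanEngineResult {
  siteDomain: string;
  freshScan: boolean;
  overallScore: number; // 0-100
  tier: string;
  subScores: { structure: number; clarity: number; authority: number; accessibility: number };
  issues: Array<{ code: string; title: string; impact: 'high' | 'medium' | 'low' }>;
  citations: Array<{ query: string; model: string; mentioned: boolean; position: number | null }>;
  durationMs: number;
}
```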




9. API Structure


9.1 Authentication

POST   /api/auth/login
POST   /api/auth/register
POST   /api/auth/refresh
POST   /api/auth/forgot-password
POST   /api/auth/reset-password

9.2 Organisation & Users

GET    /api/org
PATCH  /api/org
GET    /api/org/users
POST   /api/org/users/invite
PATCH  /api/org/users/:id
DELETE /api/org/users/:id

9.3 Usage & Billing

GET    /api/org/usage                     # Current period usage summary
GET    /api/org/usage/summary             # Aggregated usage stats
GET    /api/org/usage/by-category         # Breakdown by type
GET    /api/org/usage/history             # Historical usage by period
GET    /api/org/entitlements              # Current plan limits

9.4 Client Sites

GET    /api/sites                         # List all client sites (paginated, filterable)
POST   /api/sites                         # Add client site (entitlement check)
GET    /api/sites/:id                     # Get site details
PATCH  /api/sites/:id                     # Update site
DELETE /api/sites/:id                     # Remove site
POST   /api/sites/import                  # Bulk CSV import
GET    /api/sites/:id/summary             # Quick summary (latest scan, score, tier)

9.5 Scans

POST   /api/sites/:id/scans               # Trigger scan (usage metered)
POST   /api/scans/bulk                    # Trigger bulk scan
GET    /api/sites/:id/scans               # Scan history
GET    /api/scans/:id                     # Full scan results
GET    /api/scans/:id/compare             # Compare two scans
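
For illustration, a client-side call against the scan endpoints above. The host, the API key header name and the response fields are assumptions; only the endpoint paths come from this section.

```typescript
const BASE = 'https://app.searchscore.io/api'; // assumed host

export async function triggerAndAwaitScan(apiKey: string, siteId: string) {
  const headers = { 'X-Api-Key': apiKey, 'Content-Type': 'application/json' }; // header name assumed

  // POST /api/sites/:id/scans - metered as one fresh scan
  const created = await fetch(`${BASE}/sites/${siteId}/scans`, { method: 'POST', headers });
  const { id: scanId } = (await created.json()) as { id: string };

  // Poll GET /api/scans/:id until the job finishes (webhooks are the push alternative)
  for (;;) {
    const res = await fetch(`${BASE}/scans/${scanId}`, { headers });
    const scan = (await res.json()) as { status: string; overall_score?: number };
    if (scan.status === 'completed' || scan.status === 'failed') return scan;
    await new Promise((resolve) => setTimeout(resolve, 5_000)); // wait 5s between polls
  }
}
```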

9.6 Citations

GET    /api/sites/:id/citations           # Latest citation results
POST   /api/sites/:id/queries             # Add custom query (metered)
DELETE /api/sites/:id/queries/:qid        # Remove custom query
GET    /api/citations/:id/response        # Full AI response snapshot

9.7 Fixes

GET    /api/sites/:id/fixes               # List fixes (filterable by status, category)
PATCH  /api/fixes/:id                     # Update fix status/owner
GET    /api/fixes/:id/asset               # Get generated asset (schema, llms.txt etc)
POST   /api/fixes/:id/rescan              # Re-scan after fix

9.8 Competitors

GET    /api/sites/:id/competitors         # List competitors
POST   /api/sites/:id/competitors         # Add competitor (entitlement check)
DELETE /api/sites/:id/competitors/:cid
GET    /api/sites/:id/comparison          # Side-by-side comparison data

9.9 Reports

POST   /api/sites/:id/reports             # Generate report (usage metered)
GET    /api/sites/:id/reports             # List reports
GET    /api/reports/:id                   # View report
GET    /api/reports/:id/pdf               # Download PDF
POST   /api/reports/:id/share             # Create share link (with expiry)
DELETE /api/reports/:id/share             # Revoke share link
GET    /api/shared/:token                 # Public share page (no auth)

9.10 Branding

GET    /api/org/branding                  # Get branding config
PATCH  /api/org/branding                  # Update branding (level-gated)
POST   /api/org/branding/verify-domain    # Verify custom domain

9.11 API Keys

GET    /api/org/api-keys                  # List API keys
POST   /api/org/api-keys                  # Create key
DELETE /api/org/api-keys/:id              # Revoke key

9.12 Webhooks

GET    /api/org/webhooks                  # List webhooks
POST   /api/org/webhooks                  # Register webhook
DELETE /api/org/webhooks/:id              # Remove webhook

9.13 Notifications

GET    /api/notifications                 # List notifications
PATCH  /api/notifications/:id             # Mark read
GET    /api/org/notification-settings     # Get notification preferences
PATCH  /api/org/notification-settings     # Update preferences



10. UI Screen Specification


10.1 Navigation (Agency Mode)


Left sidebar:

Overview - Agency-wide dashboard

Clients - Client site list and management

[Client Name] - Expanded client submenu (when a client is selected):

- Dashboard

- Visibility Score

- AI Mentions

- Fixes

- Competitors

- Reports

Settings - Account, team, branding, API, billing


10.2 Agency Overview Dashboard


Purpose: Single view of all client health. Must answer: What's happening? Why? What do I do next?


Layout:

• Row 1: Agency summary stats

- Total clients, Average score across all clients

- Clients needing attention (score dropped >5 pts or below Emerging)

- Scans run this month / allowance remaining (with approaching-limit visual)

• Row 2: Client score table (sortable, filterable)

- Columns: Client name, Domain, Score, Tier, Trend (delta), Last scan, Status tags

- Click row -> navigate to client dashboard

- Inline "Run scan" button per row

• Row 3: High-priority alerts

- Score drops >5 points, New issues detected, Citation status changes

• Row 4: Usage summary

- Scans used/remaining, Citation usage, Reports generated, API calls


10.3 Client Dashboard (per-site)

• Hero score card

• Recommendation status card

• Opportunity card

• Top 3 fixes

• Competitor snapshot

• Recent scan changes

• AI mentions snapshot


10.4 Client Visibility Score Screen

• Overall score with trend

• 4 sub-score cards: Structure, Clarity, Authority, Accessibility

• Accordion detail per category: score, what's wrong, why it matters, related fixes


10.5 Client AI Mentions Screen

• Summary bar: mentioned in X/Y queries, best/worst topic clusters

• Table: Query, Model, Mention status (chip), Competitors, View response button

• Right-side drawer for AI response snapshot with highlighted mentions

• Insight panel: topic gaps, competitor dominance patterns


10.6 Client Fixes Screen

• Top summary: total issues, high-impact count, estimated score gain (labelled as estimate)

• Two-pane layout:

- Left: prioritised fix queue with status badges and owner

- Right: fix detail panel (title, why it matters, what to do, generated asset, action buttons)


10.7 Client Competitors Screen

• Score comparison table (you vs competitors)

• Citation rate comparison

• Insight panel: "Competitor A is winning because..."

• CTA: "View detailed comparison report"


10.8 Client Reports Screen

• Generate report button (select type + date range)

• Report list with type, date, status

• Actions: View, Export PDF, Share link (with expiry control), Delete


10.9 Client Management (List)

• Search bar

• Filters: tag, score range, folder, assigned user

• Bulk actions: scan selected, add tag, assign to user

• Each row: name, domain, score chip, tags, last scan, actions


10.10 Settings

• Account settings (org name, billing email)

• Team management (invite, roles, remove)

• Branding (logo upload, colour picker, custom domain - level-gated)

• API keys (create, view prefix, revoke)

• Webhooks (register URL, select events)

• Notification preferences

• Billing (plan, usage, Stripe portal link)

• Usage dashboard (scans, citations, reports, API by category)


10.11 Monetisation Visibility

The product should tastefully show economic value:

• Usage remaining indicators

• What higher plans unlock (inline, not pushy)

• When bulk scanning or API would be valuable

• Report branding locked behind plan

• Custom domain as premium feature

• This is not dark-pattern upselling - it's helping agencies understand upgrade value




11. Pricing Model (Backend Support)


11.1 Three-Layer Pricing Structure


The backend must support this structure, even if initial launch uses flat tiers:


Layer 1: Core Subscription

• Workspace, team, included sites, included scans, included reporting


Layer 2: Usage Layer

• Overage scans, citation expansions, high-volume API usage

• Metered and visible


Layer 3: Premium Add-ons

• API / MCP access

• Advanced white-label

• Custom domain


11.2 Reference Tiers


| Feature | Pro (£49/mo) | Agency (£149/mo) | Agency Plus (£299/mo) |
| --- | --- | --- | --- |
| Sites | 3 | 25 | 100 included |
| Scans/month | 50 | 500 | 2,000 included |
| Competitors/site | 2 | 5 | 5 |
| Team members | 1 | 5 | 25 |
| Reports/month | 10 | 100 | 500 included |
| Branded reports | No | Yes | Yes |
| White-label domain | No | No | Yes |
| API access | No | Optional add-on | Yes or add-on |
| Webhooks | No | Optional add-on | Yes or add-on |
| Scheduled scans | Weekly | Daily | Daily |
| Citation queries/site | 10 | 20 | 50 |
| Custom queries/site | 5 | 20 | 100 included |
| PDF export | Yes | Yes | Yes |
| Share links/month | 10 | 100 | 500 included |
| Priority support | No | Yes | Yes |

Note: Public pricing can use "fair use" language if desired, but internal entitlement architecture must not depend on true unlimited usage. Every "included" tier has an internal hard cap with overage accounting.




12. Implementation Layers


Implementation is split into four product layers. Each layer is a coherent deliverable. Do not start a layer until the previous one is validated.


Layer A: Agency Workspace Core (Weeks 1-3)

The foundation. Multi-tenant data model, auth, client management.


• Multi-tenant SaaS database (PostgreSQL)

• Organisations, users, RBAC (server-side enforced)

• Client site CRUD (add, edit, remove, list, group)

• Entitlement system with plan-defined limits

• Usage metering foundation on every billable event

• Agency overview dashboard (all-clients view)

• Team member invitation with role assignment

• Basic scan scheduling

• JWT authentication, session management


Layer B: Core Product Experience (Weeks 4-6)

The value engine. Scores, fixes, citations, competitors.


• Client dashboard (per-site overview)

• AI Visibility Score breakdown

• AI Citation Engine (predefined queries, 2-model MVP, response snapshots)

• Fix Engine (prioritised list, status tracking, owner, generated assets, change history)

• Competitor module (add competitors, side-by-side comparison, gap analysis)


Layer C: Agency Tooling (Weeks 7-9)

The commercial output. Reports, branding, notifications, bulk operations.


• Report generation (executive, diagnostic, progress, lead-gen)

• PDF export

• Share links with expiry and revoke

• Branded reports (Level 1 white-label)

• Lead-gen prospect audit reports

• Bulk scan triggers

• Notification system (email alerts for score changes, scan completions)

• Usage dashboard in settings

• Monetisation visibility (approaching limits, plan upgrade hints)


Layer D: Platform Layer (Weeks 10-12)

The extensibility. API, webhooks, advanced white-label. Only after demand is proven.


• Custom domain support (CNAME verification)

• Advanced white-label (Level 2-3)

• REST API with key authentication (separate permissions, separate metering)

• API documentation (Swagger/OpenAPI)

• Webhook registration and delivery

• Scheduled report email delivery

• 3-model citation expansion

• Custom query metering as premium resource

• Slack notification integration

• Overage billing support




13. Technical Architecture


13.1 Current Stack (What Exists)

Backend: Node.js / Express on port 3470

Database: SQLite (`sites.db`) via better-sqlite3

Frontend: Static HTML served via Cloudflare Pages (`searchscore/landing/`)

Auth: None (public audit tool with admin panel behind password)

Payments: Stripe (price IDs configured, webhook live)


13.2 Required Architecture Changes


This is a platform rewrite, not a feature extension. Architectural decisions must reflect that.


Authentication Layer:

• JWT-based auth with refresh tokens

• Server-side RBAC enforcement on every endpoint

• Audit logging for admin actions


Database:

• PostgreSQL for SaaS data (users, orgs, clients, reports, fixes, usage, entitlements)

• Keep existing `sites.db` for raw scan data with explicit ownership boundary

• Turso/libSQL acceptable only if concurrency and migration strategy are explicit

• No dual ownership of scan results


Frontend:

• React/Next.js SPA for authenticated agency app

• Public landing page stays on Cloudflare Pages

• Agency app on `app.searchscore.io`


Job Queue:

• Redis/BullMQ with: retries, dead-letter handling, per-org concurrency limits

• Plan-based queue priority

• Idempotent job execution

• Cancellation support
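
A minimal sketch of this queue shape using BullMQ. The queue names, Redis location and the entitlement helper are assumptions; true per-org concurrency limits need an application-level guard (or BullMQ Pro groups) and are only hinted at here.

```typescript
import { Queue, Worker } from 'bullmq';

const connection = { host: 'localhost', port: 6379 }; // assumed Redis location

export const scanQueue = new Queue('scans', { connection });

// jobId makes enqueueing idempotent: re-adding the same scanId is a no-op.
export function enqueueScan(orgId: string, siteId: string, scanId: string, priority: number) {
  return scanQueue.add(
    'run-scan',
    { orgId, siteId, scanId },
    { jobId: scanId, attempts: 5, backoff: { type: 'exponential', delay: 5_000 }, priority },
  );
}

declare function orgHasScanQuota(orgId: string): Promise<boolean>; // assumed entitlement re-check
declare function runScan(siteId: string, scanId: string): Promise<void>; // assumed scan engine call

export const scanWorker = new Worker(
  'scans',
  async (job) => {
    const { orgId, siteId, scanId } = job.data as { orgId: string; siteId: string; scanId: string };
    // Re-check entitlement and quota at execution time (F-24), not just at queue time.
    if (!(await orgHasScanQuota(orgId))) return;
    await runScan(siteId, scanId);
  },
  { connection, concurrency: 10 },
);
```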


Report Generation:

• Server-side PDF via Playwright

• Template engine for branded reports

• CDN-backed storage (R2)
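
A minimal sketch of server-side PDF rendering with Playwright as suggested above; the HTML templating and R2 upload steps are out of scope here and assumed to exist elsewhere.

```typescript
import { chromium } from 'playwright';

export async function renderReportPdf(reportHtml: string): Promise<Buffer> {
  const browser = await chromium.launch();
  try {
    const page = await browser.newPage();
    await page.setContent(reportHtml, { waitUntil: 'networkidle' });
    // Rendered once at report creation, then stored (e.g. in R2); never re-rendered
    // from live data, which keeps reports immutable.
    return await page.pdf({ format: 'A4', printBackground: true });
  } finally {
    await browser.close();
  }
}
```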


Billing/Usage Layer:

• Entitlement service (reads from entitlements table, not hardcoded)

• Usage tracking on every billable event

• Quota enforcement at API level

• Plan-aware access checks


13.3 Infrastructure


Cloudflare CDN
  +-- searchscore.io (landing page, static)
  +-- app.searchscore.io (agency SPA)
        |
API Gateway (Express)
  +-- /api/auth/*   (authentication, RBAC enforcement)
  +-- /api/sites/*  (client management, entitlement checks)
  +-- /api/scans/*  (scan orchestration, usage metering)
  +-- /api/reports/* (report generation, usage metering)
  +-- /api/v1/*     (public API, separate auth, separate metering)
        |
   +----------+----------+
   |                     |
SaaS DB (PostgreSQL)  Scan Engine (existing)
Users, Orgs,          sites.db
Clients, Reports,     Raw scan execution
Fixes, Usage,         Citation runs
Entitlements          Scan artifacts
BrandConfig           Queue state
   |                     |
   +----------+----------+
              |
Job Queue (Redis/BullMQ)
  +-- Scan jobs (on-demand, scheduled, bulk)
  +-- Report generation jobs
  +-- Citation check jobs
  +-- Webhook delivery jobs
  +-- Per-org rate limiting & concurrency



14. Risks & Open Questions


14.1 Open Questions

1. Final tier naming model - Adopt Competitive/Dominant or keep Strong/AI-Ready?

2. Database strategy: PostgreSQL now (recommended), or Turso/libSQL for faster initial delivery?

3. Existing users: How do current Stripe subscribers (Full Report, Monitor) map to new plan structure?

4. Scan cost per model: Citation checks across 3 AI models x 20 queries = 60 API calls per scan. Cost model?

5. Custom domain support burden: DNS verification via CNAME requires agency IT cooperation. Start in Layer D only.

6. White-label leak prevention: What audit process ensures SearchScore identity never leaks on full white-label?

7. API pricing: Should API be bundled into Agency Plus or sold as a standalone add-on?

8. White-label split: Should white-label be split into branding, branded share pages, and custom domain as separate entitlements?

9. Usage pooling: Should scheduled scans and high-volume citation checks consume pooled usage credits?

10. Overage billing: Stripe metered billing integration - implement in Layer D or defer?


14.2 Risks

| Risk | Impact | Mitigation |
| --- | --- | --- |
| Scan queue overload from bulk scans | High | Per-org rate limiting, plan-based concurrency, staggered queue |
| AI API costs at scale | High | Aggressive caching, plan-based query/model limits, usage metering |
| SQLite concurrency in multi-user SaaS | High | Layer A must migrate to PostgreSQL |
| API cannibalises dashboard value | Medium | Entitlement-based API, separate usage metering, possible add-on pricing |
| Scope creep into SEO/marketing | Medium | Strict backlog grooming against core loop |
| One agency degrading others' performance | High | Queue isolation, plan-based priority |
| White-label identity leakage | Medium | Audit checklist for every external surface |
| White-label support burden | Medium | Feature-tiered white-label, self-serve docs, premium pricing |
| "Unlimited" marketing vs actual costs | Medium | Internal controls always enforce caps; marketing is separate concern |



15. Acceptance Criteria


15.1 MVP Acceptance Criteria (Layer A + B)

1. An agency owner can sign up, add 5 clients, invite 2 team members, and run scans on all clients

2. A team member with "analyst" role can view clients and run scans but cannot change billing or manage team

3. Every API endpoint enforces RBAC server-side; role leakage is not possible

4. The agency overview dashboard shows all clients sorted by score with trend indicators

5. Citation checks run against minimum 2 AI models and show mention status

6. Fix engine shows prioritised fixes with generated assets, status tracking, and owner

7. Competitor comparison shows side-by-side scores with gap analysis

8. Usage is tracked per organisation and exposed in the billing/settings area

9. Entitlements are enforced via entitlements table, not scattered business logic

10. An agency can show before/after improvement clearly enough for a client to understand it in under 2 minutes


15.2 Phase 2 Acceptance Criteria (Layer C + D)

1. A user can generate a branded PDF report for any client and share it via link with configurable expiry and revoke

2. A user can view before/after score comparison between any two scans

3. An agency can generate a prospect audit and use it in a sales workflow

4. Scheduled weekly scans run automatically and send email notification on completion

5. API allows triggering a scan and retrieving results programmatically

6. Webhooks can be registered and delivered reliably

7. White-label branding works correctly on client-facing outputs without leaking SearchScore identity

8. One large agency cannot degrade platform performance for others (queue isolation verified)

9. API usage is correctly metered and visible in billing usage

10. Share links support expiry and revoke; reports never expose internal metadata

11. Estimated score gains on fixes are clearly labelled as estimates




16. Strategic Direction: Tool First, Platform Second


This is the most important strategic instruction for the build.


SearchScore must first succeed as:

• A high-value agency tool

• A client reporting and audit engine

• A retention and sales support system


Only then expand into:

• Infrastructure

• Programmatic access

• Deeper automation


Optimise the build for:

• Strongest agency UI

• Fastest time to value

• Best reporting experience

• Strongest fix workflow


NOT for:

• Maximum platform complexity on day one

• Generic dashboard bloat

• Feature completeness over commercial clarity


Non-negotiable guardrails:

• No accidental margin traps

• No fake unlimited plans

• No underpriced white-label

• No casual API giveaway

• No generic dashboard bloat

• No loss of category focus


The final product should feel like:

"The clearest, most commercially useful way for an agency to show why a client is not being recommended by AI, fix it, and prove the result."

Not:

• a generic SEO dashboard

• a broad marketing platform

• a raw analytics console

• a cheap white-label data utility




*End of specification.*