Observability operations

Sitemap Health Observability Guide

This guide covers monitoring for ZartsAlgo public pages, admin runtime, client portal proof, provider syncs, traffic rollups, incident response, and large-volume growth operations.

Monitoring contract

What this guide should protect.

Use this page before connecting uptime checks, page-speed budgets, Search Console syncs, QuickBooks data, Thumbtack or Angi imports, traffic rollups, admin alerts, or client portal proof cards.

Owner

One team owns the signal

Operations owns cadence, threshold review, incident notes, and the client-safe summary rule.

Failure mode

Name what can go wrong

The main risk is exposing private operational logs or provider payloads while trying to show useful client-facing status.

Data boundary

Separate raw logs from proof

Raw provider payloads, admin events, login attempts, and customer private data stay internal; only verified summaries can reach the portal.
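One way to enforce this boundary is an allowlist applied before anything is handed to the portal. The following is a minimal sketch under that assumption; the field names in `PORTAL_SAFE_FIELDS` and the sample record are hypothetical and not taken from any existing schema.

```python
from typing import Any, Mapping

# Hypothetical allowlist: only these fields may reach the client portal.
PORTAL_SAFE_FIELDS = frozenset({"signal", "status", "last_good_at", "summary"})


def to_portal_summary(internal_record: Mapping[str, Any]) -> dict[str, Any]:
    """Copy only allowlisted fields; everything else stays internal."""
    return {k: v for k, v in internal_record.items() if k in PORTAL_SAFE_FIELDS}


# Hypothetical internal record mixing safe fields with private ones.
record = {
    "signal": "report_freshness",
    "status": "stale",
    "last_good_at": "2024-01-01T00:00:00Z",
    "raw_payload": {"access_token": "secret"},  # must never leave the admin side
    "login_attempts": 3,
}
print(to_portal_summary(record))
```

An allowlist fails closed: a newly added internal field stays hidden until someone deliberately marks it as portal-safe.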

Scale rule

Watch aggregates first

At large volume, query rollups, daily facts, and summarized states first; go to raw events or provider payload tables only when the aggregates cannot answer the question.

Signals

Metrics and log events to wire first.

Each signal should be measurable, attributable to one owner, and safe to summarize without exposing raw credentials, payloads, or customer details.

Signal 1

Daily rollup completion

Define source, threshold, owner, silence window, evidence capture, rollback action, and portal-safe status before this signal becomes production monitoring.
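The fields listed above can be captured as a small record so that an incomplete signal is easy to spot before it goes live. This is an illustrative sketch only; the class name, field types, and example values are assumptions.

```python
from dataclasses import dataclass, fields
from datetime import timedelta


@dataclass(frozen=True)
class SignalDefinition:
    """One monitoring signal, mirroring the checklist in this guide."""
    name: str
    source: str
    threshold: str
    owner: str
    silence_window: timedelta
    evidence_capture: str
    rollback_action: str
    portal_safe_status: str


def missing_fields(definition: SignalDefinition) -> list[str]:
    """Return names of empty fields (a zero silence window also counts as missing)."""
    return [f.name for f in fields(definition) if not getattr(definition, f.name)]


# Hypothetical definition with the evidence step still undecided.
rollup = SignalDefinition(
    name="daily_rollup_completion",
    source="traffic_pageviews_daily",
    threshold="no completed rollup by 06:00 UTC",
    owner="operations",
    silence_window=timedelta(hours=1),
    evidence_capture="",
    rollback_action="re-run the rollup job",
    portal_safe_status="Report data is being refreshed",
)
print(missing_fields(rollup))  # ['evidence_capture']
```

A signal whose `missing_fields` list is non-empty would stay out of production monitoring until the gap is filled.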

Signal 2

Report freshness

Define source, threshold, owner, silence window, evidence capture, rollback action, and portal-safe status before this signal becomes production monitoring.
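Freshness signals such as this one usually reduce to comparing a last-good timestamp against a maximum allowed age. A minimal sketch, assuming timezone-aware timestamps; the function name and the 24-hour budget are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def freshness_state(
    last_good_at: datetime,
    max_age: timedelta,
    now: Optional[datetime] = None,
) -> str:
    """Return "fresh" or "stale" for a timezone-aware last-good timestamp."""
    now = now or datetime.now(timezone.utc)
    return "fresh" if now - last_good_at <= max_age else "stale"


# Hypothetical example: a report last built 30 hours ago against a 24-hour budget.
checked_at = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
built_at = checked_at - timedelta(hours=30)
print(freshness_state(built_at, timedelta(hours=24), now=checked_at))  # stale
```

Passing `now` explicitly keeps the check deterministic in tests; the same comparison could serve the backup age signal with a different budget.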

Signal 3

Source coverage freshness

Define source, threshold, owner, silence window, evidence capture, rollback action, and portal-safe status before this signal becomes production monitoring.

Signal 4

Backup age

Define source, threshold, owner, silence window, evidence capture, rollback action, and portal-safe status before this signal becomes production monitoring.

Signal 5

Restore drill result

Define source, threshold, owner, silence window, evidence capture, rollback action, and portal-safe status before this signal becomes production monitoring.

Signal 6

Sitemap integrity

Define source, threshold, owner, silence window, evidence capture, rollback action, and portal-safe status before this signal becomes production monitoring.
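As a starting point for the source of this signal, a sitemap can be parsed and checked for a few structural problems. The sketch below uses only the Python standard library and assumes a single `urlset` sitemap (a sitemap index file would be reported as an unexpected root); the specific checks and the `example.com` host are illustrative choices.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_problems(xml_text: str, expected_host: str) -> list[str]:
    """Return human-readable problems found in a urlset sitemap."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]
    if root.tag != f"{SITEMAP_NS}urlset":
        return [f"unexpected root element: {root.tag}"]

    locs = [el.text.strip() for el in root.iter(f"{SITEMAP_NS}loc") if el.text]
    problems = [] if locs else ["no <loc> entries"]
    seen = set()
    for loc in locs:
        parsed = urlparse(loc)
        if parsed.scheme != "https":
            problems.append(f"non-https URL: {loc}")
        if parsed.netloc != expected_host:
            problems.append(f"unexpected host: {loc}")
        if loc in seen:
            problems.append(f"duplicate URL: {loc}")
        seen.add(loc)
    return problems


sample = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/</loc></url>"
    "<url><loc>http://example.com/guide</loc></url>"
    "<url><loc>https://example.com/</loc></url>"
    "</urlset>"
)
print(sitemap_problems(sample, "example.com"))
```

An empty list can map to a healthy portal-safe status, while any entry gives the owner concrete evidence to capture.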

Signal 7

Robots and noindex integrity

Define source, threshold, owner, silence window, evidence capture, rollback action, and portal-safe status before this signal becomes production monitoring.
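One possible source for this signal is a pair of checks: whether robots.txt blocks a URL that should be crawlable, and whether a page carries a `noindex` robots meta tag. The sketch below uses the standard library's `urllib.robotparser` and `html.parser`; the helper names and sample URLs are hypothetical, and it does not inspect the `X-Robots-Tag` HTTP header.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


def blocked_by_robots(robots_txt: str, urls: list[str], agent: str = "*") -> list[str]:
    """Return the URLs that the given robots.txt text disallows for the agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(agent, u)]


class _RobotsMetaFinder(HTMLParser):
    """Records whether a <meta name="robots"> tag contains "noindex"."""

    def __init__(self) -> None:
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            if "noindex" in (attr.get("content") or "").lower():
                self.noindex = True


def has_noindex(html: str) -> bool:
    finder = _RobotsMetaFinder()
    finder.feed(html)
    return finder.noindex


robots = "User-agent: *\nDisallow: /admin/\n"
pages = ["https://example.com/admin/login", "https://example.com/guides/sitemap-health"]
print(blocked_by_robots(robots, pages))
print(has_noindex('<head><meta name="robots" content="noindex, nofollow"></head>'))
```

Which direction counts as a failure depends on the page: a public guide that is blocked or noindexed is a problem, while an admin or portal route that is not may be one too.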

Signal 8

Database probe result

Define source, threshold, owner, silence window, evidence capture, rollback action, and portal-safe status before this signal becomes production monitoring.

Signal 9

Prepared query error rate

Define source, threshold, owner, silence window, evidence capture, rollback action, and portal-safe status before this signal becomes production monitoring.
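For a rate-based signal like this one, the threshold is easier to act on when it also carries a minimum sample size, so a handful of failures on a quiet day does not page anyone. A small sketch; the 1% rate and 100-query floor are placeholder values, not recommendations from this guide.

```python
def error_rate_breached(
    errors: int,
    total: int,
    max_rate: float = 0.01,   # hypothetical threshold: 1% of prepared queries
    min_samples: int = 100,   # hypothetical floor below which the rate is ignored
) -> bool:
    """Return True when the error rate exceeds max_rate on a large enough sample."""
    if total < min_samples:
        return False
    return errors / total > max_rate


print(error_rate_breached(errors=5, total=1000))   # False: 0.5% is under 1%
print(error_rate_breached(errors=20, total=1000))  # True: 2% exceeds 1%
print(error_rate_breached(errors=5, total=10))     # False: too few samples
```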

Signal 10

Client-safe proof publication state

Define source, threshold, owner, silence window, evidence capture, rollback action, and portal-safe status before this signal becomes production monitoring.

Baseline checks

Minimum checks before this becomes live monitoring.

  • Confirm the target URL, owner, and expected status before adding an alert.
  • Set a tight, explicit threshold that can trigger a human decision, not a vague warning.
  • Store raw event payloads separately from portal-safe report summaries.
  • Record last good timestamp, affected client, affected provider, and recovery action.
  • Keep alert routes separate for frontend, backend, data quality, security, and client success.
  • Run desktop and mobile smoke checks after any public, admin, or portal layout change.
  • Use traffic rollups and provider summaries before expensive raw-table queries.
  • Document what the client can safely see if a report or provider source is stale.
  • Pause publication of suspect proof until a clean sync, import, or recalculation completes.
  • Keep screenshots and logs free of tokens, credentials, and private customer details.
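The last check in this list can be partly automated for structured log records by redacting values whose keys look sensitive before they are stored or shared. This is a sketch under the assumption that logs are dictionaries; the key fragments in `SENSITIVE_KEY_PARTS` are illustrative and would need review against real payloads, and it does not catch secrets embedded in free-text values.

```python
from typing import Any

# Hypothetical key fragments that mark a value as sensitive.
SENSITIVE_KEY_PARTS = ("token", "secret", "password", "authorization", "api_key")


def redact(value: Any) -> Any:
    """Return a copy with values under sensitive-looking keys replaced."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]"
            if any(part in str(key).lower() for part in SENSITIVE_KEY_PARTS)
            else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


event = {
    "provider": "angi",
    "access_token": "abc123",
    "headers": {"Authorization": "Bearer xyz"},
    "items": [{"password": "hunter2", "row": 1}],
}
print(redact(event))
```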

Topic checks

Specific checks for this topic.

  • Daily rollup completion
  • Report freshness
  • Source coverage freshness
  • Backup age
  • Restore drill result
  • Sitemap integrity
  • Robots and noindex integrity
  • Database probe result
  • Prepared query error rate
  • Client-safe proof publication state

Data model hooks

Where these signals can connect once MySQL is live.

The public page stays static. Runtime reads should happen inside admin or portal routes through prepared queries and summarized tables.
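To illustrate a prepared read against a summarized table, the sketch below uses Python's built-in `sqlite3` as a stand-in for MySQL so it runs without a server. The columns of `traffic_pageviews_daily` and the sample rows are assumptions; MySQL drivers such as PyMySQL and mysql-connector-python typically use `%s` placeholders rather than `?`.

```python
import sqlite3

# In-memory stand-in for the live database; column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE traffic_pageviews_daily (client_id INTEGER, day TEXT, pageviews INTEGER)"
)
conn.executemany(
    "INSERT INTO traffic_pageviews_daily VALUES (?, ?, ?)",
    [(1, "2024-01-01", 120), (1, "2024-01-02", 80), (2, "2024-01-01", 999)],
)


def pageviews_for_client(db: sqlite3.Connection, client_id: int, start_day: str, end_day: str) -> int:
    """Sum daily rollups for one client; values are bound, never string-formatted."""
    row = db.execute(
        "SELECT COALESCE(SUM(pageviews), 0) FROM traffic_pageviews_daily "
        "WHERE client_id = ? AND day BETWEEN ? AND ?",
        (client_id, start_day, end_day),
    ).fetchone()
    return row[0]


print(pageviews_for_client(conn, 1, "2024-01-01", "2024-01-31"))  # 200
```

Scoping every query by `client_id` through a bound parameter keeps one client's rollups from leaking into another client's portal view.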

Data hook 1

metric_snapshots

Use this table or runtime source as a summarized input. Keep raw records behind prepared queries and expose only client-safe fields.

Data hook 2

report_metrics

Use this table or runtime source as a summarized input. Keep raw records behind prepared queries and expose only client-safe fields.

Data hook 3

integration_accounts

Use this table or runtime source as a summarized input. Keep raw records behind prepared queries and expose only client-safe fields.

Data hook 4

integration_sync_runs

Use this table or runtime source as a summarized input. Keep raw records behind prepared queries and expose only client-safe fields.

Data hook 5

integration_sync_errors

Use this table or runtime source as a summarized input. Keep raw records behind prepared queries and expose only client-safe fields.

Data hook 6

webhook_events

Use this table or runtime source as a summarized input. Keep raw records behind prepared queries and expose only client-safe fields.

Data hook 7

provider_import_batches

Use this table or runtime source as a summarized input. Keep raw records behind prepared queries and expose only client-safe fields.

Data hook 8

traffic_pageviews_daily

Use this table or runtime source as a summarized input. Keep raw records behind prepared queries and expose only client-safe fields.

Runbook steps

How the team should respond when a signal changes.

Small, repeatable response steps keep alerts useful when volume grows past thousands of clients and millions of events.

Step 1

Alert routing

Keep alert routes separate for frontend, backend, data quality, security, and client success.

Step 2

Layout smoke checks

Run desktop and mobile smoke checks after any public, admin, or portal layout change.

Step 3

Rollups before raw tables

Use traffic rollups and provider summaries before expensive raw-table queries.

Step 4

Stale-data disclosure

Document what the client can safely see if a report or provider source is stale.

Step 5

Publication pause

Pause publication of suspect proof until a clean sync, import, or recalculation completes.
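The pause rule above can be expressed as a small gate: proof that has been flagged as suspect stays unpublished until a clean run finishes after the flag was raised. A minimal sketch; the `ProofCard` fields are hypothetical, and treating never-flagged proof as publishable is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ProofCard:
    client_id: int
    flagged_suspect_at: Optional[datetime]  # None when never flagged
    last_clean_run_at: Optional[datetime]   # last clean sync, import, or recalculation


def can_publish(card: ProofCard) -> bool:
    """Allow publication only if no flag is newer than the last clean run."""
    if card.flagged_suspect_at is None:
        return True
    return (
        card.last_clean_run_at is not None
        and card.last_clean_run_at > card.flagged_suspect_at
    )


flagged = datetime(2024, 1, 2, 9, 0, tzinfo=timezone.utc)
before = datetime(2024, 1, 2, 8, 0, tzinfo=timezone.utc)
after = datetime(2024, 1, 2, 10, 0, tzinfo=timezone.utc)

print(can_publish(ProofCard(1, flagged, before)))  # False: clean run predates the flag
print(can_publish(ProofCard(1, flagged, after)))   # True: clean run completed after the flag
```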

Step 6

Evidence redaction

Keep screenshots and logs free of tokens, credentials, and private customer details.

Step 7

Alert target confirmation

Confirm the target URL, owner, and expected status before adding an alert.

Step 8

Actionable threshold

Set a tight, explicit threshold that can trigger a human decision, not a vague warning.

Integration hooks

Connect the guide to admin, portal, provider, and traffic operations.

These hooks turn a static guide into a runtime workflow once the database-backed admin and client portal are connected.

Hook 1

Admin audit trail

Connect this guide to the admin audit trail so the admin view can explain when the signal changed and who verified it.

Hook 2

Portal report period

Connect this guide to the portal report period so the admin view can explain when the signal changed and who verified it.

Hook 3

Provider sync run

Connect this guide to the provider sync run so the admin view can explain when the signal changed and who verified it.

Hook 4

Traffic daily rollup

Connect this guide to the traffic daily rollup so the admin view can explain when the signal changed and who verified it.

Hook 5

Import error summary

Connect this guide to the import error summary so the admin view can explain when the signal changed and who verified it.

Hook 6

Metric snapshot

Connect this guide to the metric snapshot so the admin view can explain when the signal changed and who verified it.

Hook 7

Incident record

Connect this guide to the incident record so the admin view can explain when the signal changed and who verified it.

Hook 8

Maintenance task

Connect this guide to the maintenance task so the admin view can explain when the signal changed and who verified it.

Hook 9

Backup manifest

Connect this guide to the backup manifest so the admin view can explain when the signal changed and who verified it.

Hook 10

Deployment smoke result

Connect this guide to the deployment smoke result so the admin view can explain when the signal changed and who verified it.