PingOps · Story · 2026-06
Monitoring that finds you, not the other way around.
By Paolo Javier · Founder, Arkynate Labs
Why PingOps exists, what it does today, and where API latency, cross-service traversal, and query-slowness detection take it next.
UPTIME · iOS · CLOUDFLARE WORKERS · APNs · OPERATIONAL MONITORING
§01
The cost of reactive monitoring
Every outage has a hidden second clock running underneath it. The first clock starts when the service breaks. The second clock starts when somebody who can do something about it actually finds out.
In a healthy engineering org, those two clocks are seconds apart. In most orgs — especially smaller teams, multi-vendor stacks, and anywhere "operations" is somebody's second job — they're minutes, sometimes hours. A scheduled probe misses, a dashboard nobody is watching turns red, an email lands in a shared inbox, a chat channel pings the wrong on-call rotation, a customer files the first ticket. By the time the right person has a laptop open, the damage is already in the metrics.
PingOps was built to collapse that gap to a single notification, on the device its owner already carries.
§02
Why this app, from this builder
PingOps started as an architect's tool for an architect's problem. The recurring frustration was the same across every IT landscape: reactive monitoring with delay baked into every link of the chain — detection delay, alerting delay, channel-routing delay, escalation delay. Even great enterprise stacks suffer from it, because the moment a critical website or upstream API misbehaves, somebody still has to be in front of a screen for any of it to matter.
The pitch is narrow on purpose. PingOps doesn't try to replace a synthetic-monitoring platform or an APM suite. It answers the one question that, in a real outage, almost always comes first: is the thing up, right now, and did I know about it before the customer did? It answers that question from the phone in your pocket, with no dashboard to bookmark and no console to log into.
§03
What PingOps does today
Two ways to run checks. Local runs the checks on-device — manual or automatic while the app is open. No account, no telemetry, nothing leaves the phone. Cloud runs the same checks on a server-side scheduler with Sign in with Apple, so they keep firing while the phone is closed. Cloud cadence scales by tier: 5-minute for Cloud Free, 1-minute for Cloud Pro and Archon.
Add an endpoint in seconds. Paste a cURL command and PingOps parses the method, headers, and body. Copy a request to the clipboard and the app surfaces it as a one-tap import. Or open a pingops:// deep link a teammate shared: the endpoint, its assertions, and its tier-appropriate cadence land already configured.
Validation, not just a ping. Every check supports GET / POST / PUT / DELETE / HEAD, and three layers of assertion: HTTP status code match, response-body keyword search, and JSON-field equality. A 200 OK that ships an error blob in the body isn't healthy, and PingOps treats it that way. A failure classifier turns generic network errors into specific reasons (DNS resolution, TLS handshake, connection refused, bot-defence reject, status mismatch) so the alert tells you where to look.
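The three assertion layers can be sketched roughly as follows. This is an illustration under stated assumptions, not PingOps's actual implementation: the `Assertions` dataclass, the `evaluate` function, and the dotted JSON path syntax are hypothetical names invented for the example.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Assertions:
    # Hypothetical shape; every field name here is an assumption.
    expected_status: int = 200
    body_keyword: Optional[str] = None   # substring that must appear in the body
    json_path: Optional[str] = None      # dotted path, e.g. "data.status"
    json_expected: Any = None            # value the JSON field must equal


def evaluate(status: int, body: str, a: Assertions) -> Optional[str]:
    """Return None when the response is healthy, else a short failure reason."""
    # Layer 1: HTTP status code match.
    if status != a.expected_status:
        return f"status mismatch: expected {a.expected_status}, got {status}"
    # Layer 2: response-body keyword search.
    if a.body_keyword is not None and a.body_keyword not in body:
        return f"keyword not found: {a.body_keyword!r}"
    # Layer 3: JSON-field equality.
    if a.json_path is not None:
        try:
            node = json.loads(body)
            for key in a.json_path.split("."):
                node = node[key]
        except (ValueError, KeyError, TypeError):
            return f"JSON field missing: {a.json_path}"
        if node != a.json_expected:
            return f"JSON field {a.json_path} is {node!r}, expected {a.json_expected!r}"
    return None
```

With a rule such as `Assertions(json_path="status", json_expected="ok")`, a 200 response whose body is `{"status": "error"}` comes back with a failure reason rather than `None`, which is the "200 OK with an error blob" case described above.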
Lock-screen visibility. When an endpoint enters a failing streak, PingOps starts a Live Activity. The active incident is glanceable from the lock screen without unlocking the phone, with Snooze 10m / Snooze 1h / Open actions wired through App Intents. The Dynamic Island carries the same surface on supported devices.
Alerts that respect attention. Notifications default to aggregate, not per-endpoint — one rolling count rather than a spam cascade. Pro and Archon unlock per-endpoint pushes with reminders for the duration of the incident. Streak thresholds (alert after N consecutive failures) are configurable per endpoint. A weekly SLO digest lands every Sunday for Pro and Archon, summarising the fleet's last seven days without anyone having to open a dashboard.
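A streak threshold of this kind can be sketched in a few lines. The `StreakTracker` class and its default threshold are hypothetical, chosen only to illustrate "alert after N consecutive failures"; the real product's state handling is not described in this article.

```python
from dataclasses import dataclass


@dataclass
class StreakTracker:
    threshold: int = 3   # assumed default: alert after 3 consecutive failures
    streak: int = 0      # current run of consecutive failures

    def record(self, ok: bool) -> bool:
        """Record one check result; return True exactly when the alert should fire."""
        if ok:
            # Any success resets the streak.
            self.streak = 0
            return False
        self.streak += 1
        # Fire once, at the moment the streak reaches the threshold.
        return self.streak == self.threshold
```

Firing only when the streak equals the threshold, rather than on every failure past it, is one way to keep a long incident from producing a cascade of identical pushes.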
Two-level groups, SLOs, and drag-and-drop. Production, Staging, Personal — with optional inner groups. Each group carries a tinted accent rail, an aggregate health pill, and an SLO target with 1h / 6h / 24h / 7d windows. Fleet-wide trends — MTTR, p50 / p95 latency, incident timeline — live one tap away. Endpoints reparent by drag.
Public status pages. Cloud Pro and above can publish a tokenized status page at arkynate.com/pingops/status/<token>: read-only, sparkline + uptime + last incident, instantly revocable. The token rotates on republish; the old URL stays dead.
Frictionless sharing. A checked-and-validated endpoint can be copied as cURL, JSON, or a pingops:// deep link. One tap on the receiving phone imports it. Auth-token redaction is on by default, so accidental shares don't leak credentials.
Private by design. Custom request headers and bodies are encrypted at rest with per-user AES-256-GCM keys, derived via HKDF from the Sign in with Apple identifier. An operator with raw database access still can't read the tokens. No analytics SDK, no ad SDK, no telemetry. Available in English, Japanese, Spanish, and German.
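The key-derivation step can be illustrated with a standard-library HKDF (RFC 5869) sketch. How PingOps actually feeds the Sign in with Apple identifier into HKDF is not specified here, so this example makes loud assumptions: a server-held `master_secret` as input keying material, the user identifier as the `info` context, and a made-up `pingops-example:` label. The AES-256-GCM encryption itself is omitted because it needs a third-party library.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 output size in bytes


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF extract-then-expand with HMAC-SHA256, following RFC 5869."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested key material is too long")
    # Extract: an empty salt defaults to HASH_LEN zero bytes.
    prk = hmac.new(salt or b"\x00" * HASH_LEN, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output is available.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_user_key(master_secret: bytes, apple_user_id: str) -> bytes:
    """Hypothetical helper: one 256-bit key per user, bound to the user id."""
    info = b"pingops-example:" + apple_user_id.encode("utf-8")  # assumed label
    return hkdf_sha256(master_secret, salt=b"", info=info, length=32)
```

The point of a per-user derivation is that the same master secret yields unrelated keys for different users, so one user's ciphertext cannot be opened with another user's key.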
Cloud Archon adds on-device SSL/TLS expiry tracking, 90-day history, and Family Sharing across five devices.
Built on edge infrastructure. The Cloud backend is a Cloudflare Workers proxy on a single global D1 database, dispatching checks through Cloudflare Queues every minute and delivering alerts via APNs HTTP/2. There's no provisioning to lose sleep over, and there's no third-party analytics in the path between an endpoint failing and a notification arriving.
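A once-a-minute dispatcher that honours per-tier cadence could look something like the sketch below. The cadences (5-minute for Cloud Free, 1-minute for Cloud Pro and Archon) come from the tiers described earlier; the tier keys, the endpoint dictionaries, and the `due_endpoints` function are hypothetical and say nothing about how the real Workers and Queues code is organised.

```python
# Cadence per tier in minutes. Tier key names are assumptions.
CADENCE_MINUTES = {"cloud_free": 5, "cloud_pro": 1, "archon": 1}


def due_endpoints(endpoints, minute_tick):
    """Return ids of endpoints whose cadence divides the current minute tick.

    `minute_tick` is assumed to be a counter that increases by one each
    time the scheduler fires (for example, minutes since the epoch).
    """
    return [
        e["id"]
        for e in endpoints
        if minute_tick % CADENCE_MINUTES[e["tier"]] == 0
    ]


fleet = [
    {"id": "marketing-site", "tier": "cloud_free"},
    {"id": "checkout-api", "tier": "cloud_pro"},
]
print(due_endpoints(fleet, 10))
print(due_endpoints(fleet, 11))
```

On tick 10 both endpoints are due; on tick 11 only the 1-minute endpoint is, so the free-tier check simply waits for the next multiple of five.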
Honest about limits. Local mode only runs while the app is open: iOS doesn't allow third parties to poll continuously in the background, which is exactly the gap Cloud closes. Bot-defended hosts (Cloudflare-protected, Akamai-protected, large platforms) often reject automated cloud-IP traffic; PingOps surfaces a tailored failure reason and recommends pointing at a /health or /status endpoint.
§04
What's coming next
The next phase of PingOps moves from is it up to is it fast enough, end-to-end — keeping the same glance-from-your-phone constraint. Fleet-wide latency percentiles already exist inside the Trends surface; the next train turns them into a first-class operational signal.
Per-endpoint latency detectors. Configurable p50 / p95 / p99 thresholds per endpoint, with a sustained-breach alert path that reuses the same Live Activity and aggregate-push pipeline today's failure alerts ride on. The unit of regression becomes the endpoint, not the fleet — so a single slow upstream surfaces immediately instead of disappearing into a healthy global average.
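Since this feature is still on the roadmap, the following is only a sketch of what a sustained-breach detector might look like. It assumes a nearest-rank percentile over a rolling window of recent samples; the `LatencyDetector` class and its `window` and `sustain` parameters are hypothetical.

```python
import math
from collections import deque


def percentile(samples, p):
    """Nearest-rank percentile of a non-empty sequence (p between 0 and 100)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


class LatencyDetector:
    def __init__(self, threshold_ms, p=95, window=20, sustain=3):
        self.threshold_ms = threshold_ms
        self.p = p
        self.samples = deque(maxlen=window)  # rolling window of latencies
        self.sustain = sustain               # consecutive breaches needed to alert
        self.breaches = 0

    def record(self, latency_ms):
        """Add one sample; return True once the percentile has been over the
        threshold for `sustain` consecutive checks."""
        self.samples.append(latency_ms)
        if percentile(self.samples, self.p) > self.threshold_ms:
            self.breaches += 1
        else:
            self.breaches = 0
        return self.breaches >= self.sustain
```

Requiring several consecutive breaches before alerting mirrors the streak threshold used for failures: a single slow response does not page anyone, a sustained regression does.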
Cross-service traversal latency. Chain monitoring: model a request that fans out through N services (auth → API → AI provider → storage) and watch the cumulative latency budget. PingOps highlights which hop slowed down, so you don't have to guess whether the regression is in your code or in an upstream you don't control. The same status-page surface that today renders uptime will render traversal latency for chains you choose to publish.
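As a rough illustration of the idea, and not a description of the planned design, a chain check could sum per-hop latencies against a budget and point at the hop that moved furthest from its baseline. The function name, the hop names, and all of the numbers below are invented for the example.

```python
def slowest_regression(baseline_ms, observed_ms, budget_ms):
    """Compare per-hop latencies against a baseline.

    Returns (total_ms, over_budget, worst_hop), where worst_hop is the hop
    whose latency grew the most relative to its baseline.
    """
    total = sum(observed_ms.values())
    deltas = {
        hop: observed_ms[hop] - baseline_ms.get(hop, 0)
        for hop in observed_ms
    }
    worst_hop = max(deltas, key=deltas.get)
    return total, total > budget_ms, worst_hop


# Illustrative numbers only.
baseline = {"auth": 40, "api": 120, "ai_provider": 800, "storage": 60}
observed = {"auth": 45, "api": 130, "ai_provider": 1900, "storage": 55}
print(slowest_regression(baseline, observed, budget_ms=1500))
```

Here the chain is over its 1500 ms budget and the hop that slowed down is the AI provider, which is the "your code or an upstream you don't control" distinction the feature is meant to surface.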
Query-slowness detection. For checks that hit a database-backed endpoint, PingOps will track per-query response-time distributions on the server side and alert on percentile drift — no agent install, no schema rewrites, no APM-grade integration required. Opt-in per endpoint; works against the same simple HTTP contract the rest of the product already uses.
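Percentile drift can be sketched as a comparison between a baseline window and a recent window of response times. The 1.5x ratio, the choice of p95, and the function names are assumptions made for the example; the article does not say how drift will actually be measured.

```python
import math


def p95(samples):
    """Nearest-rank 95th percentile of a non-empty sequence."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(95 * len(ordered) / 100))
    return ordered[rank - 1]


def has_drifted(baseline, recent, max_ratio=1.5):
    """True when the recent p95 exceeds the baseline p95 by more than max_ratio."""
    return p95(recent) > max_ratio * p95(baseline)


# Illustrative response times in milliseconds.
last_week = [100] * 19 + [200]
last_hour = [100] * 10 + [400] * 10
print(has_drifted(last_week, last_hour))
```

Comparing against the endpoint's own history, rather than a fixed threshold, is what lets this kind of check work over plain HTTP timings with no agent on the database side.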
All three are being designed to the same brief as the original product: operational, glanceable, honest about its limits, and aimed at the engineer who'd rather get a notification than open a dashboard.
§05
Glance from your phone
The motivating belief hasn't changed. The first person to know about an outage should be the person who can do something about it, and they shouldn't need a laptop to find out. PingOps is what that belief looks like when you ship it.
Glance from your phone. Free to start.
iPhone · iPad · Mac (M1+) · Vision Pro · iOS 18+.
Download on the App Store
Android in development, coming to the Play Store