Open & auditable — documented directly from source code

How Captxa actually works

No black boxes. The following is a complete technical walkthrough of the backend, documented directly from the production C source. Every constant, every check, every algorithm — exactly as it runs.

Two verification paths

Every request is triaged on arrival. Clean traffic goes through an invisible path. Suspicious signals escalate to a human-visible challenge.

Path A — Invisible PoW

~95% of real users · zero interaction required

  1. Browser fingerprint collected & sent to /challenge/simp
  2. Server runs triage: IP, JA4, rate-limit, bot-detector
  3. 18-bit SHA-256 PoW challenge issued in encrypted token
  4. Browser solves in Web Worker (<50 ms on modern hardware)
  5. Solution + mouse trajectory sent to /solve/simp
  6. Ed25519 pass token returned in x-captcha-token header

Path B — Sliding Puzzle

Suspicious signals only · user drags a piece

  1. Escalation triggered by suspicious IP / JA4 / rate-limit / bot-score
  2. Server generates randomised puzzle image + 19-bit PoW challenge
  3. Solution coordinates encrypted into COMP|… token
  4. Browser solves PoW in Worker + user drags piece to gap
  5. PoW + puzzle position + drag trajectory sent to /solve/complex
  6. Multi-stage ML trajectory analysis; Ed25519 pass token issued on pass
Step 1 — Triage

What happens at POST /challenge/simp

Before any PoW is issued, the server runs four independent checks. Failing any one silently escalates the request to the complex path — the client is never told which check triggered it.

Check 1 — Suspicious IP

The client IP is looked up in a probabilistic Bloom filter of known bad actors. The filter is reset hourly to prevent false-positive accumulation. Marked IPs are hard-blocked from the simple path.

ip_bloom_is_suspicious(user_ip)
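
A Bloom filter answers "possibly seen" or "definitely not seen" from a fixed bit array, which is why it can produce false positives but never false negatives. The following is a minimal JavaScript sketch of the idea behind a lookup such as ip_bloom_is_suspicious. The class name, bit-array size, hash count, and the FNV-1a double hashing are illustrative assumptions, not the production C implementation.

```javascript
// Minimal Bloom filter sketch (sizes and hash functions are assumptions).
class BloomFilter {
  constructor(bits = 1 << 16, hashes = 4) {
    this.bits = bits;
    this.hashes = hashes;
    this.data = new Uint8Array(bits >> 3);
  }

  // Seeded 32-bit FNV-1a over the UTF-16 code units of the key.
  static fnv1a(key, seed) {
    let h = (0x811c9dc5 ^ seed) >>> 0;
    for (let i = 0; i < key.length; i++) {
      h ^= key.charCodeAt(i);
      h = Math.imul(h, 0x01000193) >>> 0;
    }
    return h;
  }

  // Derive the i-th bit index by double hashing: h1 + i * h2.
  index(key, i) {
    const h1 = BloomFilter.fnv1a(key, 0);
    const h2 = BloomFilter.fnv1a(key, 0x9e3779b9) | 1;
    return ((h1 + Math.imul(i, h2)) >>> 0) % this.bits;
  }

  add(key) {
    for (let i = 0; i < this.hashes; i++) {
      const idx = this.index(key, i);
      this.data[idx >> 3] |= 1 << (idx & 7);
    }
  }

  // true means "possibly present"; false means "definitely absent".
  mightContain(key) {
    for (let i = 0; i < this.hashes; i++) {
      const idx = this.index(key, i);
      if ((this.data[idx >> 3] & (1 << (idx & 7))) === 0) return false;
    }
    return true;
  }

  // Periodic reset, so accumulated bits do not inflate the false-positive rate.
  reset() {
    this.data.fill(0);
  }
}

const suspicious = new BloomFilter();
suspicious.add("203.0.113.7"); // documentation-range IP, illustrative only
```

Calling suspicious.mightContain("203.0.113.7") now reports a possible match, and suspicious.reset() clears every bit again.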

Check 2 — JA4 TLS Fingerprint

Every TLS handshake produces a JA4 fingerprint computed directly from the OpenSSL SSL struct. The fingerprint is matched against the declared User-Agent. A Chrome JA4 hash paired with a Firefox UA string is an immediate escalation signal.

ja4_matches_ua(client_browser, ja4)

Check 3 — Rate Limiter (CMS)

Request frequency is tracked with a Count-Min Sketch keyed on ip | ja4 | ja4o | domain. Exceeding the threshold does not block — it silently routes to the complex path. The CMS is reset on a periodic timer.

cms_increment_and_get(rl_key) > CMS_LIMIT
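
A Count-Min Sketch keeps several rows of counters and reports the minimum across rows, so it can overestimate a key's count but never underestimate it. The sketch below shows one way an increment-and-get operation could look. The width, depth, hash function, key values, and the CMS_LIMIT value are all assumptions made for illustration.

```javascript
// Count-Min Sketch: `depth` rows of `width` counters; the estimate is the row minimum.
class CountMinSketch {
  constructor(width = 2048, depth = 4) {
    this.width = width;
    this.depth = depth;
    this.rows = Array.from({ length: depth }, () => new Uint32Array(width));
  }

  // Seeded 32-bit FNV-1a; each row uses a different seed.
  static hash(key, seed) {
    let h = (0x811c9dc5 ^ seed) >>> 0;
    for (let i = 0; i < key.length; i++) {
      h ^= key.charCodeAt(i);
      h = Math.imul(h, 0x01000193) >>> 0;
    }
    return h;
  }

  incrementAndGet(key) {
    let min = Infinity;
    for (let r = 0; r < this.depth; r++) {
      const i = CountMinSketch.hash(key, r + 1) % this.width;
      const count = ++this.rows[r][i];
      if (count < min) min = count;
    }
    return min;
  }

  // Periodic reset of all buckets.
  reset() {
    for (const row of this.rows) row.fill(0);
  }
}

const CMS_LIMIT = 5; // hypothetical threshold, the real value is not stated
const cms = new CountMinSketch();

// Key layout follows the text: ip | ja4 | ja4o | domain (placeholder values).
const rlKey = ["203.0.113.7", "ja4-placeholder", "ja4o-placeholder", "example.com"].join("|");

// Over the limit means "route to the complex path", not "block".
function shouldEscalate(key) {
  return cms.incrementAndGet(key) > CMS_LIMIT;
}
```

With this threshold the first five requests for a key stay on the simple path and the sixth escalates.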

Check 4 — Browser Environment

The browser reports a fingerprint bundle. The server inspects: navigator.webdriver, missing Chrome runtime (ischromeruntimemissing), an error stack tripwire for automation frameworks (Puppeteer/Playwright/Selenium), screen dimensions, WebGL renderer, device memory, and hardware concurrency.

browser fingerprint fields collected by the JS widget
// Sent as JSON body to POST /challenge/simp
{
  "webglrenderer":           "NVIDIA GeForce RTX 4090/PCIe/SSE2",  // GPU string from WebGL
  "timezone":                "Europe/Barcelona",
  "hardwareconcurrency":     16,
  "innerw":                  1920,   "innerh": 1080,
  "availw":                  1920,   "availh": 1040,
  "devicememory":            8,
  "webdriver":               false,                // navigator.webdriver
  "ischromeruntimemissing":  false,                // chrome.runtime absent in Chrome = headless
  "errorstacktripwire":      false                 // stack trace contains puppeteer/playwright/selenium
}
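
As a rough illustration of how such a bundle might be inspected, the sketch below collects the signals that would trigger escalation. The three boolean flags come from the bundle above. The viewport and core-count sanity checks, and all function and signal names, are assumptions, since the actual server rules for those fields are not described here.

```javascript
// Returns the list of triggered signals; an empty list means nothing looked suspicious.
function environmentSignals(fp) {
  const signals = [];
  if (fp.webdriver === true) signals.push("webdriver");
  if (fp.ischromeruntimemissing === true) signals.push("chrome_runtime_missing");
  if (fp.errorstacktripwire === true) signals.push("error_stack_tripwire");

  // Hypothetical sanity checks (not taken from the source).
  if (!(fp.innerw > 0 && fp.innerh > 0)) signals.push("no_viewport");
  if (!(fp.hardwareconcurrency >= 1)) signals.push("no_cpu_cores");

  return signals;
}

// Any signal routes the request to the complex path.
function shouldEscalateOnEnvironment(fp) {
  return environmentSignals(fp).length > 0;
}
```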
Step 2 — Proof of Work

SHA-256 Proof of Work

PoW is the economic firewall. A bot must spend real CPU time on every challenge, so the cost of mass automation grows with every request instead of staying close to zero.

18 bits: simple path difficulty, ≈ 262,144 SHA-256 hashes on average

19 bits: complex path difficulty, ≈ 524,288 SHA-256 hashes on average

180 s: challenge TTL, token rejected after 3 minutes
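
The hash counts above follow from treating each SHA-256 output as random: one attempt succeeds with probability 2^-bits, so the number of attempts is geometrically distributed with mean 2^bits. A small sketch of that arithmetic, with illustrative function names:

```javascript
// Mean number of hashes needed to find `bits` leading zero bits.
function expectedHashes(bits) {
  return 2 ** bits;
}

// Probability that a solution is found within `attempts` hashes.
function solveProbability(bits, attempts) {
  return 1 - Math.pow(1 - 2 ** -bits, attempts);
}
```

Note that the mean is not a guarantee: after exactly 2^18 attempts at 18 bits, the chance of having found a solution is only about 63%.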

PoW algorithm — runs in a Web Worker (JavaScript)
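// Worker receives: { challengeHex: "32 hex chars (16 bytes)", difficulty: 18 }
// Target: SHA256 of the 40-byte buffer built below must have at least
//         `difficulty` leading zero bits.
// sha256() and leadingZeroBits() are defined elsewhere in the Worker blob.

self.onmessage = ({ data: { challengeHex, difficulty } }) => {
  const challenge_bytes = Uint8Array.from(challengeHex.match(/../g), (h) => parseInt(h, 16));

  const buf  = new Uint8Array(40);        // challenge at bytes 0–15, nonce at 32–35, all other bytes zero
  const view = new DataView(buf.buffer);  // used to write the nonce in little-endian order
  buf.set(challenge_bytes, 0);            // challenge occupies bytes 0–15

  let nonce = 0;
  while (nonce < 0x100000000) {
    view.setUint32(32, nonce, true);      // little-endian nonce at bytes 32–35
    const hash = sha256(buf);
    if (leadingZeroBits(hash) >= difficulty) {
      self.postMessage({ type: "done", nonce });   // ✓ found
      return;
    }
    nonce++;
    if ((nonce & 8191) === 0) {
      // Progress relative to the expected 2^difficulty attempts (assumed formula), capped at 99.
      const pct = Math.min(99, Math.round((100 * nonce) / 2 ** difficulty));
      self.postMessage({ type: "progress", pct });
    }
  }
};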

SHA-256 is implemented entirely in JavaScript inside the Worker blob — no crypto.subtle API, no external dependencies. The Worker is compiled on-the-fly from a Blob URL.

Step 3 — Token Cryptography

Challenge token design

The challenge token is a tamper-evident, IP-bound, time-limited container. It travels from server to browser and back — but the browser never knows what is inside it.

Encryption — ChaCha20-Poly1305

  • 32-byte ephemeral key, generated with RAND_bytes() at server startup
  • 12-byte random nonce per token, prepended to ciphertext
  • 16-byte Poly1305 authentication tag — any byte flip is detected
  • Wire format: nonce(12) ‖ ciphertext ‖ tag(16) → base64url
  • Per-thread EVP_CIPHER_CTX lazy-initialised once, reused forever — zero allocation per request
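
The wire format above can be reproduced with Node.js's built-in crypto module. This is a sketch of the same construction in JavaScript, not the server's C/OpenSSL code. The function names sealToken and openToken and the sample plaintext are illustrative assumptions.

```javascript
const nodeCrypto = require("node:crypto");

// Ephemeral 32-byte key, generated once at process start.
const KEY = nodeCrypto.randomBytes(32);

// Seal: nonce(12) ‖ ciphertext ‖ tag(16), encoded as base64url.
function sealToken(plaintext, key = KEY) {
  const nonce = nodeCrypto.randomBytes(12);
  const cipher = nodeCrypto.createCipheriv("chacha20-poly1305", key, nonce, {
    authTagLength: 16,
  });
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return Buffer.concat([nonce, ciphertext, cipher.getAuthTag()]).toString("base64url");
}

// Open: returns the plaintext, or null if the token is malformed or fails authentication.
function openToken(token, key = KEY) {
  const raw = Buffer.from(token, "base64url");
  if (raw.length < 12 + 16) return null;
  const nonce = raw.subarray(0, 12);
  const ciphertext = raw.subarray(12, raw.length - 16);
  const tag = raw.subarray(raw.length - 16);
  try {
    const decipher = nodeCrypto.createDecipheriv("chacha20-poly1305", key, nonce, {
      authTagLength: 16,
    });
    decipher.setAuthTag(tag);
    return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
  } catch {
    return null; // Poly1305 tag mismatch: the token was altered or sealed with another key
  }
}
```

Because the key lives only in memory, tokens sealed before a restart can no longer be opened afterwards, which is acceptable for challenges that expire within minutes.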

Plaintext content

Simple token

SIMP|<ip>|<ja4>|<ja4o>|<unix_ts>|<pow_hex32>

Complex token

COMP|<ip>|<ja4>|<ja4o>|<unix_ts>|<pow_hex32>|<sol_x>|<sol_y>

The complex token embeds the correct puzzle solution coordinates — the client never sees sol_x / sol_y. Server decrypts and checks |user_x − sol_x| ≤ 7 pixels.
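
Once decrypted, the plaintext is a pipe-delimited record. A minimal parsing sketch follows, together with the 7-pixel tolerance check described above. The function and property names are assumptions, and only the x coordinate is compared because that is the only check the text describes.

```javascript
// Parse the decrypted plaintext of a challenge token into named fields.
// Returns null for anything that does not match the SIMP or COMP layout.
function parseChallengeToken(plaintext) {
  const parts = plaintext.split("|");
  const [kind, ip, ja4, ja4o, ts, powHex] = parts;
  const base = { kind, ip, ja4, ja4o, issuedAt: Number(ts), powHex };

  if (kind === "SIMP" && parts.length === 6) return base;
  if (kind === "COMP" && parts.length === 8) {
    return { ...base, solX: Number(parts[6]), solY: Number(parts[7]) };
  }
  return null;
}

const PUZZLE_TOLERANCE_PX = 7;

// |user_x − sol_x| ≤ 7, evaluated server-side against the decrypted solution.
function puzzleSolved(token, userX) {
  return token.kind === "COMP" && Math.abs(userX - token.solX) <= PUZZLE_TOLERANCE_PX;
}
```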

What the server verifies on every /solve request (in order)

① MAC authentication

ChaCha20-Poly1305 AEAD tag verified — any tampering fails here

② Bloom replay check

Raw ciphertext bytes checked & inserted into Bloom filter — token reuse is blocked

③ IP binding

Token IP must exactly match the current request's IP — no cross-device reuse

④ JA4 binding

Both ja4 and ja4o must match challenge-time fingerprints

⑤ TTL check

Token age > 180 s → rejected. Future timestamps also rejected

⑥ PoW solution

Server recomputes SHA-256(challenge‖nonce) and checks leading zero bits
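
Checks ⑤ and ⑥ can be sketched in Node.js as follows. The 40-byte buffer layout is taken from the Worker listing (challenge at byte 0, little-endian nonce at byte offset 32) and is an assumption about what the server hashes. All function names are illustrative.

```javascript
const nodeCrypto = require("node:crypto");

const CHALLENGE_TTL_S = 180;

// ⑤ TTL: reject tokens older than the TTL and tokens with future timestamps.
function tokenFresh(issuedAt, now = Math.floor(Date.now() / 1000)) {
  const age = now - issuedAt;
  return age >= 0 && age <= CHALLENGE_TTL_S;
}

// Count the leading zero bits of a byte sequence.
function leadingZeroBits(bytes) {
  let bits = 0;
  for (const b of bytes) {
    if (b === 0) {
      bits += 8;
      continue;
    }
    // Math.clz32 counts zeros in a 32-bit word; a byte occupies the low 8 bits.
    return bits + Math.clz32(b) - 24;
  }
  return bits;
}

// Rebuild the hashed buffer (assumed layout, see lead-in).
function powInput(challengeHex, nonce) {
  const buf = Buffer.alloc(40);
  Buffer.from(challengeHex, "hex").copy(buf, 0);
  buf.writeUInt32LE(nonce >>> 0, 32);
  return buf;
}

// ⑥ PoW: a single hash is enough to verify what took the client many hashes to find.
function verifyPow(challengeHex, nonce, difficulty) {
  const hash = nodeCrypto.createHash("sha256").update(powInput(challengeHex, nonce)).digest();
  return leadingZeroBits(hash) >= difficulty;
}
```

The asymmetry is the point of the scheme: verification costs one SHA-256 call regardless of difficulty.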

Step 4 — Trajectory Analysis

Multi-stage mouse trajectory ML pipeline

Every solve request includes the raw pointer trajectory as [x, y, timestamp_ms] triplets. This goes through five analysis stages. A bot score in [0.0, 1.0] is returned — lower is more human.

Simple threshold bot_score < 0.30 to pass
Complex threshold bot_score < 0.50 to pass (more lenient — user already solved puzzle)

1. Integrity filters

A minimum point count is enforced. Trajectories with fewer than the required number of points, with no movement at all, or with zero time duration are rejected outright before any statistical work begins.

ERR_INTEGRITY → {"valid":false,"error":"integrity_filters"}
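
A minimal sketch of such pre-checks, operating on [x, y, timestamp_ms] triplets. The MIN_POINTS value and the function name are assumptions, since the actual minimum is not stated.

```javascript
const MIN_POINTS = 10; // hypothetical minimum, the real value is not documented

// points: array of [x, y, timestamp_ms]
function passesIntegrity(points) {
  if (!Array.isArray(points) || points.length < MIN_POINTS) return false;

  const [x0, y0, t0] = points[0];
  const tEnd = points[points.length - 1][2];
  if (tEnd - t0 <= 0) return false; // zero (or negative) duration

  // At least one point must differ from the first, otherwise the pointer never moved.
  return points.some(([x, y]) => x !== x0 || y !== y0);
}
```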

2. Burstiness analysis

Bot-generated trajectories often arrive in time-aligned bursts — many points at uniform intervals, or large gaps followed by dense clusters. This stage measures temporal inter-event variance to detect scripted mouse injection.

ERR_BURSTINESS → {"valid":false,"error":"burstiness_failed"}
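
One common way to quantify this is the burstiness parameter B = (σ − μ) / (σ + μ) computed over inter-event intervals, where μ is the mean and σ the standard deviation. B is −1 for perfectly regular timing and approaches 1 for highly bursty timing. Whether the server uses this exact statistic is not stated, so the sketch below is only one plausible reading, with illustrative names.

```javascript
// points: array of [x, y, timestamp_ms]
function interEventIntervals(points) {
  const dts = [];
  for (let i = 1; i < points.length; i++) {
    dts.push(points[i][2] - points[i - 1][2]);
  }
  return dts;
}

// Burstiness parameter in [-1, 1]; NaN when it cannot be computed.
function burstiness(points) {
  const dts = interEventIntervals(points);
  if (dts.length === 0) return NaN;

  const mean = dts.reduce((sum, v) => sum + v, 0) / dts.length;
  const variance = dts.reduce((sum, v) => sum + (v - mean) ** 2, 0) / dts.length;
  const sd = Math.sqrt(variance);

  if (sd + mean === 0) return NaN;
  return (sd - mean) / (sd + mean);
}
```

A trajectory sampled at exactly uniform intervals yields −1, which is the kind of machine-regular timing this stage looks for.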

3. Sample entropy + jerk

Human movement has measurable complexity. Sample entropy of the velocity signal detects suspiciously smooth or repetitive paths. Jerk (third derivative of position) measures abruptness — humans exhibit natural micro-corrections that automated movement lacks.

ERR_ENTROPY → {"valid":false,"error":"sample_entropy_failed"}
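
Sample entropy, SampEn(m, r), is the negative logarithm of the ratio between the number of template pairs that match over m + 1 samples and the number that match over m samples, within tolerance r. Low values indicate a regular, predictable signal. The sketch below applies the standard definition to a speed series. The defaults for m and r, the use of an absolute tolerance, and the helper names are assumptions.

```javascript
// Speed between consecutive [x, y, timestamp_ms] points, in pixels per millisecond.
function speeds(points) {
  const out = [];
  for (let i = 1; i < points.length; i++) {
    const dt = points[i][2] - points[i - 1][2];
    if (dt <= 0) continue; // skip duplicate or out-of-order timestamps
    const dx = points[i][0] - points[i - 1][0];
    const dy = points[i][1] - points[i - 1][1];
    out.push(Math.hypot(dx, dy) / dt);
  }
  return out;
}

// SampEn(m, r) with Chebyshev distance and absolute tolerance r.
// Returns Infinity when no template pairs match.
function sampleEntropy(series, m = 2, r = 0.2) {
  const n = series.length;
  let matchesM = 0;
  let matchesM1 = 0;

  for (let i = 0; i < n - m; i++) {
    for (let j = i + 1; j < n - m; j++) {
      let k = 0;
      while (k < m && Math.abs(series[i + k] - series[j + k]) <= r) k++;
      if (k < m) continue; // templates differ within the first m samples
      matchesM++;
      if (Math.abs(series[i + m] - series[j + m]) <= r) matchesM1++;
    }
  }

  if (matchesM === 0 || matchesM1 === 0) return Infinity;
  return -Math.log(matchesM1 / matchesM);
}
```

A constant-speed drag produces a sample entropy of 0, the signature of a perfectly repetitive signal.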

4. Fitts' law validation

Fitts' law predicts that the time to reach a target grows with the logarithm of the ratio between movement distance and target width, so farther or smaller targets take longer. Human drag trajectories follow this psychomotor law; bot trajectories (linear interpolation, constant velocity, etc.) do not. The score penalises deviations from the expected human movement model.

ERR_FITTS → {"valid":false,"error":"fitts_law_failed"}
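
In its widely used Shannon formulation, Fitts' law reads MT = a + b · log2(D / W + 1), where D is the distance to the target, W is the target width, and a and b are empirically fitted constants. The sketch below computes the index of difficulty and a predicted movement time. The constants passed in the test are hypothetical, and how the server turns the deviation into a score is not described here.

```javascript
// Index of difficulty in bits (Shannon formulation).
function fittsIndexOfDifficulty(distance, width) {
  return Math.log2(distance / width + 1);
}

// Predicted movement time in milliseconds for fitted constants a (ms) and b (ms per bit).
function predictedMovementTimeMs(distance, width, a, b) {
  return a + b * fittsIndexOfDifficulty(distance, width);
}

// Relative deviation of an observed duration from the prediction (0 means a perfect fit).
function fittsDeviation(observedMs, distance, width, a, b) {
  const predicted = predictedMovementTimeMs(distance, width, a, b);
  return Math.abs(observedMs - predicted) / predicted;
}
```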

5. Velocity coefficient of variation

The coefficient of variation (CV) of the instantaneous velocity signal distinguishes human deceleration profiles from scripted constant-speed drags. Humans naturally accelerate and decelerate; bots don't. Final bot score is aggregated across all stages.

ERR_VELOCITY → {"valid":false,"error":"velocity_check_failed"}
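
The coefficient of variation is the standard deviation of a signal divided by its mean. A minimal sketch over [x, y, timestamp_ms] triplets follows, with illustrative function names. The threshold the server applies to this value is not stated, so none is assumed here.

```javascript
// Speed between consecutive points, in pixels per millisecond.
function speeds(points) {
  const out = [];
  for (let i = 1; i < points.length; i++) {
    const dt = points[i][2] - points[i - 1][2];
    if (dt <= 0) continue; // skip duplicate or out-of-order timestamps
    const dx = points[i][0] - points[i - 1][0];
    const dy = points[i][1] - points[i - 1][1];
    out.push(Math.hypot(dx, dy) / dt);
  }
  return out;
}

// CV = standard deviation / mean; NaN when the mean speed is zero or no speeds exist.
function velocityCV(points) {
  const v = speeds(points);
  if (v.length === 0) return NaN;
  const mean = v.reduce((sum, s) => sum + s, 0) / v.length;
  if (mean === 0) return NaN;
  const variance = v.reduce((sum, s) => sum + (s - mean) ** 2, 0) / v.length;
  return Math.sqrt(variance) / mean;
}
```

A scripted constant-speed drag gives a CV of exactly 0, while a drag that accelerates and then slows gives a clearly positive value.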
Step 5 — Pass Token

Ed25519 signed pass token

A passing solve emits a signed pass token in the x-captcha-token response header. Your frontend stores it and your backend validates it against POST /api/validate.

POST /api/validate — server-side call
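// Never call this from the browser.
// `req` and `res` are the request and response objects of your own route handler.
const captxaRes = await fetch('https://api.captxa.com/api/validate', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    captcha_token: req.body.captchatoken,
    secret_key:   process.env.CAPTXA_SECRET_KEY
  })
});
const data = await captxaRes.json();
// Reject invalid tokens and over-used tokens (HTTP 429 / RequestLimit).
if (captxaRes.status === 429 || data.RequestLimit || !data.Is_Correct) {
  return res.status(403).json({ error: 'Bot detected' });
}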
API response reference
// ✅ Valid
{ "Is_Correct":true, "RequestLimit":false, "requests":1 }

// ❌ Invalid / tampered
{ "Is_Correct":false, "reason":"invalid_token" }

// ⚠️ Reused too many times (HTTP 429)
{ "Is_Correct":true, "RequestLimit":true, "requests":12 }

Token usage counting

Each token is tracked server-side by a usage counter. requests tells you how many times this token has been validated. RequestLimit: true (HTTP 429) means the token has been reused beyond the allowed maximum — this is a signal of token harvesting or replay. Your backend should treat a 429 response as a rejection.
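
Because an over-used token still reports Is_Correct: true, checking that field alone is not enough. The helper below is a sketch of an accept/reject decision based on the HTTP status and parsed body. The function name is hypothetical, and treating any 2xx status as success is an assumption.

```javascript
// Decide whether to accept a request, given the status and parsed body of POST /api/validate.
function acceptValidation(status, body) {
  // Over-used token: treat as a rejection even though Is_Correct may be true.
  if (status === 429 || body.RequestLimit === true) return false;
  return status >= 200 && status < 300 && body.Is_Correct === true;
}
```

Applied to the three documented responses, only the first one is accepted.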

Infrastructure

Server architecture

Written from scratch in C on the H2O HTTP server with a libuv event loop backend. No GC, no managed runtime — every request is handled in a tight zero-allocation path.

8 worker threads: each pinned to a CPU core via pthread_setaffinity_np

TLS 1.3 only: SSLv2/3, TLS 1.0, TLS 1.1 explicitly disabled. Session cache: 20,480 entries

5 min UDP telemetry flush: events batched in groups of 15, sent to analytics server via UDP

1 hr Bloom filter reset: replay-prevention Bloom filter cleared hourly by worker-0 timer

Route table — registered in main()

POST /challenge/simp
POST /solve/simp
GET /challenge/complex
POST /solve/complex
GET /api/stats
POST /api/validate

Per-worker timer schedule (worker-0 only)

1 hour Bloom filter reset — clears replay-prevention state
5 min UDP telemetry flush — sends batched CAPTCHA events to analytics
periodic CMS rate-limit counter reset — resets Count-Min Sketch buckets
periodic Token validate counter reset — clears per-token usage counts
Privacy & Data

What data is collected — and what isn't

Captxa is designed to answer one binary question: is this request from a human? — and then discard everything it collected to answer it.

Collected (transiently, during verification only)

  • Client IP — bound into token, checked on solve, not stored
  • JA4 TLS fingerprint — bound into token, checked on solve, not stored
  • Browser environment bundle — checked once, never persisted
  • Mouse trajectory — analysed in memory, immediately discarded
  • UDP telemetry: domain, ip, passed/failed, timestamp — aggregated statistics only

Never collected

  • No persistent cookies set on the end-user's browser
  • No cross-site tracking pixels or third-party iframes
  • No user identifiers, account data, or behavioural profiles
  • No data sold to or shared with third parties
  • No external CDN calls — the ~3 KB script is fully self-contained

All processing happens on EU-hosted servers in Nuremberg, Germany. Full Data Processing Agreement and GDPR compliance documentation available.

Ready to integrate?

Three lines of HTML and one server-side fetch. That's the entire integration.