How Captxa actually works
No black boxes. The following is a complete technical walkthrough of the backend, documented directly from the production C source. Every constant, every check, every algorithm — exactly as it runs.
Two verification paths
Every request is triaged on arrival. Clean traffic goes through an invisible path. Suspicious signals escalate to a human-visible challenge.
Path A — Invisible PoW
~95% of real users · zero interaction required
- 1. Browser fingerprint collected & sent to /challenge/simp
- 2. Server runs triage: IP, JA4, rate-limit, bot-detector
- 3. 18-bit SHA-256 PoW challenge issued in encrypted token
- 4. Browser solves in Web Worker (<50 ms on modern hardware)
- 5. Solution + mouse trajectory sent to /solve/simp
- 6. Ed25519 pass token returned in x-captcha-token header
Path B — Sliding Puzzle
Suspicious signals only · user drags a piece
- 1. Escalation triggered by suspicious IP / JA4 / rate-limit / bot-score
- 2. Server generates randomised puzzle image + 19-bit PoW challenge
- 3. Solution coordinates encrypted into COMP|… token
- 4. Browser solves PoW in Worker + user drags piece to gap
- 5. PoW + puzzle position + drag trajectory sent to /solve/complex
- 6. Multi-stage ML trajectory analysis; Ed25519 pass token issued on pass
What happens at POST /challenge/simp
Before any PoW is issued, the server runs four independent checks. Failing any one silently escalates the request to the complex path — the client is never told which check triggered it.
Check 1 — Suspicious IP
The client IP is looked up in a probabilistic Bloom filter of known bad actors. The filter is reset hourly to prevent false-positive accumulation. Marked IPs are hard-blocked from the simple path.
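The lookup described above can be sketched as a small Bloom filter. This is an illustration only: the bit count, the number of hash functions, and the seeded FNV-1a hashing are assumptions, not values taken from the production server.

```javascript
// Minimal Bloom filter sketch. Sizes and hash scheme are illustrative assumptions.
class BloomFilter {
  constructor(bits = 1 << 20, hashes = 4) {
    this.bits = bits;
    this.hashes = hashes;
    this.words = new Uint32Array(Math.ceil(bits / 32));
  }

  // FNV-1a (32-bit) with a seed mixed into the offset basis.
  hash(key, seed) {
    let h = (0x811c9dc5 ^ seed) >>> 0;
    for (let i = 0; i < key.length; i++) {
      h ^= key.charCodeAt(i);
      h = Math.imul(h, 0x01000193) >>> 0;
    }
    return h % this.bits;
  }

  add(key) {
    for (let i = 0; i < this.hashes; i++) {
      const bit = this.hash(key, i * 0x9e3779b1);
      this.words[bit >>> 5] |= 1 << (bit & 31);
    }
  }

  // May report a false positive, never a false negative.
  has(key) {
    for (let i = 0; i < this.hashes; i++) {
      const bit = this.hash(key, i * 0x9e3779b1);
      if ((this.words[bit >>> 5] & (1 << (bit & 31))) === 0) return false;
    }
    return true;
  }

  // Periodic reset (hourly in the text): clear every bit.
  reset() {
    this.words.fill(0);
  }
}
```

Because a Bloom filter can only accumulate set bits, clearing it on a timer is the simplest way to keep the false-positive rate from drifting upward.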
Check 2 — JA4 TLS Fingerprint
Every TLS handshake produces a JA4 fingerprint computed directly from the OpenSSL SSL struct. The fingerprint is matched against the declared User-Agent. A Chrome JA4 hash paired with a Firefox UA string is an immediate escalation signal.
Check 3 — Rate Limiter (CMS)
Request frequency is tracked with a Count-Min Sketch keyed on ip | ja4 | ja4o | domain. Exceeding the threshold does not block; it silently routes the request to the complex path. The CMS is reset on a periodic timer.
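A Count-Min Sketch rate limiter of this kind might look like the following. The table dimensions, the hash function, the threshold, and the `routeFor` helper are all hypothetical; only the key composition and the "route instead of block" behaviour come from the text.

```javascript
// Count-Min Sketch: `depth` rows of `width` counters. The estimate for a key is the
// minimum counter across rows, so it can over-count but never under-count.
class CountMinSketch {
  constructor(width = 2048, depth = 4) {
    this.width = width;
    this.depth = depth;
    this.table = new Uint32Array(width * depth);
  }

  // Seeded FNV-1a (32-bit); one seed per row.
  hash(key, row) {
    let h = (0x811c9dc5 ^ (row * 0x9e3779b1)) >>> 0;
    for (let i = 0; i < key.length; i++) {
      h ^= key.charCodeAt(i);
      h = Math.imul(h, 0x01000193) >>> 0;
    }
    return h % this.width;
  }

  // Increment the key and return its new estimated count.
  increment(key) {
    let estimate = Infinity;
    for (let row = 0; row < this.depth; row++) {
      const idx = row * this.width + this.hash(key, row);
      this.table[idx] += 1;
      estimate = Math.min(estimate, this.table[idx]);
    }
    return estimate;
  }

  reset() {
    this.table.fill(0);
  }
}

// Hypothetical routing helper: over-threshold traffic is escalated, not blocked.
function routeFor(sketch, ip, ja4, ja4o, domain, threshold) {
  const key = [ip, ja4, ja4o, domain].join("|");
  return sketch.increment(key) > threshold ? "complex" : "simple";
}
```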
Check 4 — Browser Environment
The browser reports a fingerprint bundle. The server inspects: navigator.webdriver, missing Chrome runtime (ischromeruntimemissing), an error stack tripwire for automation frameworks (Puppeteer/Playwright/Selenium), screen dimensions, WebGL renderer, device memory, and hardware concurrency.
// Sent as JSON body to POST /challenge/simp
{
  "webglrenderer": "NVIDIA GeForce RTX 4090/PCIe/SSE2", // GPU string from WebGL
  "timezone": "Europe/Barcelona",
  "hardwareconcurrency": 16,
  "innerw": 1920, "innerh": 1080,
  "availw": 1920, "availh": 1040,
  "devicememory": 8,
  "webdriver": false,               // navigator.webdriver
  "ischromeruntimemissing": false,  // chrome.runtime absent in Chrome = headless
  "errorstacktripwire": false       // stack trace contains puppeteer/playwright/selenium
}
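As a rough illustration of how a server could turn this bundle into escalation signals, consider the sketch below. The field names come from the JSON above; the `environmentSignals` function, the signal labels, and the SwiftShader heuristic (software WebGL rendering is common in headless Chrome) are assumptions, not the production rules.

```javascript
// Returns a list of hypothetical signal labels; an empty list means nothing suspicious.
function environmentSignals(fp) {
  const signals = [];
  if (fp.webdriver === true) signals.push("webdriver");
  if (fp.ischromeruntimemissing === true) signals.push("chrome-runtime-missing");
  if (fp.errorstacktripwire === true) signals.push("automation-stack");
  // Zero or missing dimensions suggest a windowless environment.
  if (!(fp.innerw > 0 && fp.innerh > 0 && fp.availw > 0 && fp.availh > 0)) {
    signals.push("bad-screen-dimensions");
  }
  if (!(fp.hardwareconcurrency >= 1)) signals.push("bad-hardware-concurrency");
  // Assumed heuristic: a SwiftShader renderer string indicates software rendering.
  if (typeof fp.webglrenderer === "string" && /swiftshader/i.test(fp.webglrenderer)) {
    signals.push("software-webgl");
  }
  return signals;
}
```

Any non-empty result would route the request to the complex path rather than reject it, in line with the silent-escalation behaviour described earlier.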
SHA-256 Proof of Work
PoW is the economic firewall. A bot must do real CPU work for every challenge, so the cost of mass automation grows with every request it sends, while a single human visitor barely notices the delay.
18 bits
Simple path difficulty
≈ 262,144 SHA-256 hashes on average
19 bits
Complex path difficulty
≈ 524,288 SHA-256 hashes on average
180 s
Challenge TTL
Token rejected after 3 minutes
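// Worker receives: { challengeHex: "32 hex chars (16 bytes)", difficulty: 18 }
// Target: SHA-256 of the buffer below (challenge bytes, little-endian uint32
// nonce, zero padding) must have at least `difficulty` leading zero bits.
// challenge_bytes, sha256() and leadingZeroBits() are assumed to be defined
// earlier in the Worker.
const buf = new Uint8Array(40);          // 40-byte hash input; unused bytes stay zero
const view = new DataView(buf.buffer);   // lets us overwrite the nonce in place
buf.set(challenge_bytes, 0);             // challenge occupies bytes 0–15
let nonce = 0;
while (nonce < 0x100000000) {
  view.setUint32(32, nonce, true);       // little-endian nonce at bytes 32–35
  const hash = sha256(buf);
  if (leadingZeroBits(hash) >= difficulty) {
    self.postMessage({ type: "done", nonce }); // ✓ found
    return;
  }
  nonce++;
  if ((nonce & 8191) === 0) {
    // Assumed progress estimate: share of the expected 2^difficulty hashes done
    const pct = Math.min(99, Math.round((100 * nonce) / 2 ** difficulty));
    self.postMessage({ type: "progress", pct });
  }
}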
SHA-256 is implemented entirely in JavaScript inside the Worker blob: no crypto.subtle API, no external dependencies. The Worker is created on the fly from a Blob URL.
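The difficulty test itself is small enough to show in full. The sketch below is one plausible way to count leading zero bits of a digest and to compute the expected work for a given difficulty; the function names are illustrative and the production Worker may differ.

```javascript
// Count leading zero bits of a digest given as a Uint8Array (or array of bytes).
function leadingZeroBits(hash) {
  let bits = 0;
  for (const byte of hash) {
    if (byte === 0) {
      bits += 8;
      continue;
    }
    // Math.clz32 counts zeros in a 32-bit word; subtract the 24 bits above the byte.
    bits += Math.clz32(byte) - 24;
    break;
  }
  return bits;
}

// Expected number of hashes before a digest with `difficulty` leading zero bits appears.
function expectedHashes(difficulty) {
  return 2 ** difficulty;
}
```

Each extra bit of difficulty doubles the expected work, which is why the complex path at 19 bits costs about twice as much as the simple path at 18 bits.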
Challenge token design
The challenge token is a tamper-evident, IP-bound, time-limited container. It travels from server to browser and back — but the browser never knows what is inside it.
Encryption — ChaCha20-Poly1305
- ▸32-byte ephemeral key, generated with RAND_bytes() at server startup
- ▸12-byte random nonce per token, prepended to ciphertext
- ▸16-byte Poly1305 authentication tag; any byte flip is detected
- ▸Wire format: nonce(12) ‖ ciphertext ‖ tag(16) → base64url
- ▸Per-thread EVP_CIPHER_CTX lazy-initialised once, reused forever; zero allocation per request
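The wire format above can be reproduced with Node's built-in crypto module. This is a sketch for illustration, not the production C code: `sealToken` and `openToken` are hypothetical names, and the key here is simply generated when the module loads.

```javascript
const nodeCrypto = require("node:crypto");

// Ephemeral 32-byte key, regenerated on every process start (assumption for this sketch).
const KEY = nodeCrypto.randomBytes(32);

// Encrypt plaintext and emit nonce(12) || ciphertext || tag(16) as base64url.
function sealToken(plaintext, key = KEY) {
  const nonce = nodeCrypto.randomBytes(12);
  const cipher = nodeCrypto.createCipheriv("chacha20-poly1305", key, nonce, {
    authTagLength: 16
  });
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return Buffer.concat([nonce, ciphertext, cipher.getAuthTag()]).toString("base64url");
}

// Returns the plaintext, or null if the token is malformed or fails authentication.
function openToken(token, key = KEY) {
  const raw = Buffer.from(token, "base64url");
  if (raw.length < 12 + 16) return null;
  const nonce = raw.subarray(0, 12);
  const ciphertext = raw.subarray(12, raw.length - 16);
  const tag = raw.subarray(raw.length - 16);
  const decipher = nodeCrypto.createDecipheriv("chacha20-poly1305", key, nonce, {
    authTagLength: 16
  });
  decipher.setAuthTag(tag);
  try {
    return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
  } catch {
    return null; // tag mismatch: the token was tampered with or sealed under another key
  }
}
```

Because the key lives only in memory, tokens issued before a restart can no longer be opened, which is acceptable given the short challenge lifetime.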
Plaintext content
Simple token
SIMP|<ip>|<ja4>|<ja4o>|<unix_ts>|<pow_hex32>
Complex token
COMP|<ip>|<ja4>|<ja4o>|<unix_ts>|<pow_hex32>|<sol_x>|<sol_y>
The complex token embeds the correct puzzle solution coordinates; the client never sees sol_x / sol_y. The server decrypts the token and checks |user_x − sol_x| ≤ 7 pixels.
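Once decrypted, the pipe-delimited plaintext is straightforward to parse. The sketch below assumes the field order shown above; `parseChallenge` and `puzzleSolved` are illustrative names, and the output property names are hypothetical.

```javascript
// Parse a decrypted SIMP or COMP plaintext into an object, or return null if malformed.
function parseChallenge(plaintext) {
  const f = plaintext.split("|");
  const isSimple = f[0] === "SIMP" && f.length === 6;
  const isComplex = f[0] === "COMP" && f.length === 8;
  if (!isSimple && !isComplex) return null;
  const challenge = {
    kind: f[0],
    ip: f[1],
    ja4: f[2],
    ja4o: f[3],
    issuedAt: Number(f[4]), // unix timestamp in seconds
    powHex: f[5]
  };
  if (isComplex) {
    challenge.solX = Number(f[6]);
    challenge.solY = Number(f[7]);
  }
  return challenge;
}

// Tolerance check from the text: the dragged x position must be within 7 px of sol_x.
function puzzleSolved(userX, solX, tolerance = 7) {
  return Math.abs(userX - solX) <= tolerance;
}
```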
What the server verifies on every /solve request (in order)
① MAC authentication
ChaCha20-Poly1305 AEAD tag verified — any tampering fails here
② Bloom replay check
Raw ciphertext bytes checked & inserted into Bloom filter — token reuse is blocked
③ IP binding
Token IP must exactly match the current request's IP — no cross-device reuse
④ JA4 binding
Both ja4 and ja4o must match challenge-time fingerprints
⑤ TTL check
Token age > 180 s → rejected. Future timestamps also rejected
⑥ PoW solution
Server recomputes SHA-256(challenge‖nonce) and checks leading zero bits
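Steps ③ to ⑤ reduce to a few comparisons once the token has been authenticated and decrypted. The sketch below assumes a parsed challenge object and a request object with `ip`, `ja4`, and `ja4o` fields; the function name and the reason strings are hypothetical.

```javascript
const CHALLENGE_TTL_SECONDS = 180;

// Runs the binding and freshness checks in the documented order.
// Returns "ok" or the name of the first failing check.
function checkBindings(challenge, request, nowSeconds) {
  if (challenge.ip !== request.ip) return "ip_mismatch";
  if (challenge.ja4 !== request.ja4 || challenge.ja4o !== request.ja4o) {
    return "ja4_mismatch";
  }
  const age = nowSeconds - challenge.issuedAt;
  if (age < 0) return "future_timestamp";
  if (age > CHALLENGE_TTL_SECONDS) return "expired";
  return "ok";
}
```

Running the cheap comparisons before the PoW recomputation means a stolen or stale token is rejected without spending a hash on it.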
Multi-stage mouse trajectory ML pipeline
Every solve request includes the raw pointer trajectory as [x, y, timestamp_ms] triplets. This goes through five analysis stages. A bot score in [0.0, 1.0] is returned; lower is more human.
Integrity filters
A minimum point count is enforced. Trajectories that are too short, completely static, or have zero time duration are rejected outright before any statistical work begins.
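These pre-checks can be sketched in a few lines. The minimum of 10 points is a placeholder, since the text does not state the real threshold, and `passesIntegrity` is an illustrative name.

```javascript
// points: array of [x, y, timestamp_ms] triplets.
// minPoints is a hypothetical default; the production minimum is not documented.
function passesIntegrity(points, minPoints = 10) {
  if (!Array.isArray(points) || points.length < minPoints) return false;
  const [x0, y0, t0] = points[0];
  const tEnd = points[points.length - 1][2];
  if (tEnd - t0 <= 0) return false; // zero (or negative) duration
  // Static trajectory: every sample sits on the first sample's position.
  return points.some(([x, y]) => x !== x0 || y !== y0);
}
```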
Burstiness analysis
Bot-generated trajectories often arrive in time-aligned bursts — many points at uniform intervals, or large gaps followed by dense clusters. This stage measures temporal inter-event variance to detect scripted mouse injection.
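One common way to quantify this is the burstiness coefficient B = (σ − μ) / (σ + μ) over inter-event intervals, where μ is the mean gap and σ its standard deviation. The text does not say which statistic the server uses, so the sketch below is an assumed stand-in, not the production measure.

```javascript
// points: array of [x, y, timestamp_ms] triplets.
// Returns B in [-1, 1]: -1 for perfectly uniform intervals, values near 1 for
// a few long gaps surrounded by dense clusters.
function burstiness(points) {
  const gaps = [];
  for (let i = 1; i < points.length; i++) {
    gaps.push(points[i][2] - points[i - 1][2]);
  }
  if (gaps.length === 0) return 0;
  const mean = gaps.reduce((a, g) => a + g, 0) / gaps.length;
  const variance = gaps.reduce((a, g) => a + (g - mean) ** 2, 0) / gaps.length;
  const sd = Math.sqrt(variance);
  return sd + mean === 0 ? 0 : (sd - mean) / (sd + mean);
}
```

Under this measure both extremes are suspicious: a value at −1 points to timer-driven injection, while a value close to 1 points to batched replay.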
Sample entropy + jerk
Human movement has measurable complexity. Sample entropy of the velocity signal detects suspiciously smooth or repetitive paths. Jerk (third derivative of position) measures abruptness — humans exhibit natural micro-corrections that automated movement lacks.
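Both measures can be sketched compactly. The version below uses the standard sample entropy definition with an absolute tolerance `r` and finite differences for jerk; the parameter defaults (`m = 2`, `r = 0.2`) are conventional choices, not values from the production pipeline.

```javascript
// Sample entropy: -ln(A / B), where B counts template pairs of length m that match
// within tolerance r and A counts those that still match at length m + 1.
function sampleEntropy(series, m = 2, r = 0.2) {
  const n = series.length;
  let matchesM = 0;
  let matchesM1 = 0;
  for (let i = 0; i < n - m; i++) {
    for (let j = i + 1; j < n - m; j++) {
      let k = 0;
      while (k < m && Math.abs(series[i + k] - series[j + k]) <= r) k++;
      if (k === m) {
        matchesM++;
        if (Math.abs(series[i + m] - series[j + m]) <= r) matchesM1++;
      }
    }
  }
  if (matchesM === 0 || matchesM1 === 0) return Infinity;
  return -Math.log(matchesM1 / matchesM);
}

// One finite-difference step: returns derivative values and their timestamps.
function diff(values, times) {
  const d = [];
  const t = [];
  for (let i = 1; i < values.length; i++) {
    const dt = times[i] - times[i - 1];
    if (dt <= 0) continue; // skip duplicate timestamps
    d.push((values[i] - values[i - 1]) / dt);
    t.push(times[i]);
  }
  return [d, t];
}

// Mean jerk magnitude: third time derivative of position, per axis, combined.
function meanJerk(points) {
  const axisJerk = (axis) => {
    let v = points.map((p) => p[axis]);
    let t = points.map((p) => p[2]);
    for (let step = 0; step < 3; step++) [v, t] = diff(v, t);
    return v;
  };
  const jx = axisJerk(0);
  const jy = axisJerk(1);
  if (jx.length === 0) return 0;
  let sum = 0;
  for (let i = 0; i < jx.length; i++) sum += Math.hypot(jx[i], jy[i]);
  return sum / jx.length;
}
```

A perfectly regular velocity signal yields an entropy of zero and a straight constant-speed drag yields zero jerk, which are the two patterns this stage is meant to catch.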
Fitts' law validation
Fitts' law predicts that movement time to a target grows with the logarithm of the distance-to-width ratio: farther or smaller targets take longer to reach. Human drag trajectories follow this psychomotor law; bot trajectories (linear interpolation, constant velocity, etc.) do not. The score penalises deviations from the expected human movement model.
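In its widely used Shannon formulation, Fitts' law reads MT = a + b · log2(D / W + 1), where D is the distance to the target, W its width, and a and b are empirically fitted constants. The sketch below uses that formulation; the constants passed in are placeholders and the deviation measure is an assumption about how a penalty could be derived.

```javascript
// Index of difficulty in bits (Shannon formulation).
function fittsIndexOfDifficulty(distance, width) {
  return Math.log2(distance / width + 1);
}

// Predicted movement time in ms for fitted constants a (ms) and b (ms per bit).
function fittsMovementTime(distance, width, a, b) {
  return a + b * fittsIndexOfDifficulty(distance, width);
}

// Relative deviation of an observed drag from the prediction; 0 means a perfect fit.
function fittsDeviation(observedMs, distance, width, a, b) {
  const predicted = fittsMovementTime(distance, width, a, b);
  return Math.abs(observedMs - predicted) / predicted;
}
```

For example, with placeholder constants a = 100 and b = 150, a 70 px drag to a 10 px gap has an index of difficulty of 3 bits and a predicted time of 550 ms; a scripted drag that finishes in 50 ms deviates by roughly 91%.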
Velocity coefficient of variation
The coefficient of variation (CV) of the instantaneous velocity signal distinguishes human deceleration profiles from scripted constant-speed drags. Humans naturally accelerate and decelerate; bots don't. Final bot score is aggregated across all stages.
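The coefficient of variation is the standard deviation of speed divided by its mean. A minimal sketch, assuming speed is computed between consecutive samples (the `velocityCV` name is illustrative):

```javascript
// points: array of [x, y, timestamp_ms] triplets.
// Returns sd(speed) / mean(speed); 0 means perfectly constant speed.
function velocityCV(points) {
  const speeds = [];
  for (let i = 1; i < points.length; i++) {
    const dt = points[i][2] - points[i - 1][2];
    if (dt <= 0) continue; // skip duplicate timestamps
    const dx = points[i][0] - points[i - 1][0];
    const dy = points[i][1] - points[i - 1][1];
    speeds.push(Math.hypot(dx, dy) / dt);
  }
  if (speeds.length === 0) return 0;
  const mean = speeds.reduce((a, s) => a + s, 0) / speeds.length;
  if (mean === 0) return 0;
  const variance = speeds.reduce((a, s) => a + (s - mean) ** 2, 0) / speeds.length;
  return Math.sqrt(variance) / mean;
}
```

A drag that accelerates and then slows near the gap produces a clearly positive CV, while a linearly interpolated drag produces a value at or near zero.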
Ed25519 signed pass token
A passing solve emits a signed pass token in the x-captcha-token response header. Your frontend stores it and your backend validates it by calling POST /api/validate.
// Never call this from the browser.
// Runs inside a server-side route handler with (req, res) in scope.
const apiRes = await fetch('https://api.captxa.com/api/validate', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    captcha_token: req.body.captchatoken,
    secret_key: process.env.CAPTXA_SECRET_KEY
  })
});
const data = await apiRes.json();
// Reject invalid tokens and tokens reused past the limit (HTTP 429 / RequestLimit)
if (!data.Is_Correct || data.RequestLimit) {
  return res.status(403).json({ error: 'Bot detected' });
}
// ✅ Valid
{ "Is_Correct":true, "RequestLimit":false, "requests":1 }
// ❌ Invalid / tampered
{ "Is_Correct":false, "reason":"invalid_token" }
// ⚠️ Reused too many times (HTTP 429)
{ "Is_Correct":true, "RequestLimit":true, "requests":12 }
Token usage counting
Each token is tracked server-side by a usage counter. requests tells you how many times this token has been validated. RequestLimit: true (HTTP 429) means the token has been reused beyond the allowed maximum, which is a signal of token harvesting or replay. Your backend should treat a 429 response as a rejection.
Server architecture
Written from scratch in C on the H2O HTTP server with a libuv event loop backend. No GC, no managed runtime — every request is handled in a tight zero-allocation path.
8
Worker threads
Each pinned to a CPU core via pthread_setaffinity_np
TLS 1.3
Only
SSLv2/3, TLS 1.0, TLS 1.1 explicitly disabled. Session cache: 20,480 entries
5 min
UDP telemetry flush
Events batched in groups of 15, sent to analytics server via UDP
1 hr
Bloom filter reset
Replay-prevention Bloom filter cleared hourly by worker-0 timer
Route table — registered in main()
/challenge/simp
/solve/simp
/challenge/complex
/solve/complex
/api/stats
/api/validate
Per-worker timer schedule (worker-0 only)
What data is collected — and what isn't
Captxa is designed to answer one binary question: is this request from a human? — and then discard everything it collected to answer it.
Collected (transiently, during verification only)
- ✓Client IP — bound into token, checked on solve, not stored
- ✓JA4 TLS fingerprint — bound into token, checked on solve, not stored
- ✓Browser environment bundle — checked once, never persisted
- ✓Mouse trajectory — analysed in memory, immediately discarded
- ✓UDP telemetry: domain, ip, passed/failed, timestamp; aggregated statistics only
Never collected
- ✗No persistent cookies set on the end-user's browser
- ✗No cross-site tracking pixels or third-party iframes
- ✗No user identifiers, account data, or behavioural profiles
- ✗No data sold to or shared with third parties
- ✗No external CDN calls — the ~3 KB script is fully self-contained
All processing happens on EU-hosted servers in Nuremberg, Germany. Full Data Processing Agreement and GDPR compliance documentation available.
Ready to integrate?
Three lines of HTML and one server-side fetch. That's the entire integration.