
Methodology

How it works.

Three stages. Read. Cross-check. Call it.

01
Read

A header parser pulls out From, Reply-To, Return-Path, the Received chain IPs, and the Authentication-Results verdicts. A regex pass over the body extracts URLs, bare domains, email addresses, and IPs.

The sender is tagged primary. Everything else is secondary. Each check is capped at 20 unique identifiers.
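The body pass and the 20-identifier cap can be sketched roughly as follows. This is a minimal illustration, not the project's parser: `extractIdentifiers`, the `Identifier` shape, and the three regexes are assumptions made for this example, and bare-domain extraction is left out for brevity.

```typescript
// Hypothetical types and names; not taken from the isitspam source.
type Role = "primary" | "secondary";
type Kind = "email" | "url" | "ip";

interface Identifier {
  value: string;
  kind: Kind;
  role: Role;
}

const MAX_IDENTIFIERS = 20; // the cap stated in the text

// Deliberately simple patterns; production parsing needs more care.
const URL_RE = /https?:\/\/[^\s<>"')]+/gi;
const EMAIL_RE = /[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}/gi;
const IPV4_RE = /\b(?:\d{1,3}\.){3}\d{1,3}\b/g;

function extractIdentifiers(sender: string | null, body: string): Identifier[] {
  const seen = new Set<string>();
  const out: Identifier[] = [];

  const push = (value: string, kind: Kind, role: Role): void => {
    const key = value.toLowerCase(); // dedupe case-insensitively
    if (seen.has(key) || out.length >= MAX_IDENTIFIERS) return;
    seen.add(key);
    out.push({ value, kind, role });
  };

  // The sender goes first so it always survives the cap.
  if (sender) push(sender, "email", "primary");
  for (const m of body.match(URL_RE) ?? []) push(m, "url", "secondary");
  for (const m of body.match(EMAIL_RE) ?? []) push(m, "email", "secondary");
  for (const m of body.match(IPV4_RE) ?? []) push(m, "ip", "secondary");

  return out;
}
```

Pushing the sender before anything from the body is one way to make sure the primary identifier is never dropped when a long message hits the cap.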

02
Cross-check

Every identifier is placed on the Whisper graph in parallel. Each lookup returns registrar, MX, SPF, DNSSEC, ASN, country, threat-feed history, and co-located hostnames.
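A parallel fan-out like this might look as follows. The sketch assumes a hypothetical `GraphLookup` function standing in for whatever client actually queries the Whisper graph; the `GraphFacts` fields are illustrative and do not reflect the real response schema.

```typescript
// Hypothetical result shape: a few of the facts named in the text.
interface GraphFacts {
  inGraph: boolean;
  hasMx?: boolean;
  hasSpf?: boolean;
  dnssecSigned?: boolean;
  threatFeeds?: string[];
}

// Stand-in for a real graph client; the actual Whisper API is not shown here.
type GraphLookup = (identifier: string) => Promise<GraphFacts>;

async function crossCheck(
  identifiers: string[],
  lookup: GraphLookup,
): Promise<Map<string, GraphFacts | null>> {
  // Start every lookup at once. A failed lookup resolves to null
  // so one bad identifier does not sink the whole batch.
  const pairs = await Promise.all(
    identifiers.map(async (id): Promise<[string, GraphFacts | null]> => {
      try {
        return [id, await lookup(id)];
      } catch {
        return [id, null];
      }
    }),
  );
  return new Map(pairs);
}
```

With at most 20 identifiers per check, firing all lookups at once keeps total latency close to that of the slowest single query.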

3.67B nodes · 30.8B relationships · sub-millisecond query

03
Call it

The signals compose into a single verdict against published thresholds. The rules are below. The graph is the authority.

Below 30 → clean · 30–79 → mixed · 80+ → hostile
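The thresholds translate directly into a small function. The name `verdictFor` and the `Verdict` type are assumptions for this sketch; only the three bands come from the text.

```typescript
type Verdict = "clean" | "mixed" | "hostile";

// Bands as published: below 30 clean, 30-79 mixed, 80 and above hostile.
function verdictFor(score: number): Verdict {
  if (score >= 80) return "hostile";
  if (score >= 30) return "mixed";
  return "clean";
}
```

Because some rules carry negative weights, a score can fall below zero; under this sketch that still reads as clean.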

The rules
+30  Couldn't identify any sender from the headers
+30  From domain uses a TLD not in the IANA root zone
+15  From domain TLD is on the high-abuse list (.tk, .top, .click, etc.)
+25  DKIM signer apex differs from From apex (spoofing pattern)
+25  Sender header apex differs from From apex
+20  Reply-To apex differs from From apex
+20  Return-Path apex differs from From apex
+50  Sender domain registered less than 7 days ago
+30  Sender domain registered less than 30 days ago
+15  Sender domain registered less than 90 days ago
-20  Sender domain over 5 years old with zero threat-feed listings
+25  Sender registrable apex is not in the Whisper graph
+25  Body URL registrable apex is not in the Whisper graph
+30  Sender domain exists as a graph node but has zero IPs, MX, and inbound links
+15  Sender domain has zero inbound web links across Common Crawl
+60  Sender domain has no MX record
-10  Sender uses Google Workspace, Microsoft 365, or another major provider
+15  Sender domain has no SPF
-5   Sender domain is DNSSEC-signed
+20  Authentication-Results says SPF, DKIM, or DMARC failed (each)
+60  Sender domain on phishing threat feeds
+50  Sender domain on malware threat feeds
+30  Sender domain on spam threat feeds
+60  Body URL host on phishing feeds
+50  Body URL host on malware feeds
+25  Body URL host on spam feeds
+35  Sending IP on phishing or malware feeds
+20  Sending IP on spam feeds
+40  Display name impersonates a brand the domain does not match
+40  Sender domain is a typosquat (1-2 character edit distance from a known brand)
-50  Sender is a known legit brand domain with major email infrastructure
+30  Body URL host registered less than 30 days ago and unrelated to sender
+25  Body URL host on object storage (storage.googleapis.com, s3, blob.core.windows.net)
+20  Body URL host on free SaaS (vercel.app, netlify.app, replit.app, etc.)
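As a rough illustration of how triggered rules could add up, the sketch below sums a handful of the weights listed above. The key names, the `TriggeredSignal` shape, and `scoreSignals` are invented for this example; plain summation without clamping is also an assumption, and the real logic in src/lib/score.ts may differ.

```typescript
// Weights copied from the rules list; the key names are hypothetical.
const WEIGHTS = {
  replyToApexMismatch: 20,
  returnPathApexMismatch: 20,
  noMx: 60,
  majorProvider: -10,
  noSpf: 15,
  dnssecSigned: -5,
  authFailure: 20, // applied once per failed mechanism (SPF, DKIM, DMARC)
  senderOnPhishingFeed: 60,
} as const;

type SignalName = keyof typeof WEIGHTS;

interface TriggeredSignal {
  name: SignalName;
  count?: number; // defaults to 1; used for "(each)" rules
}

// Assumption: the score is a plain sum of weights, with no clamping.
function scoreSignals(signals: TriggeredSignal[]): number {
  return signals.reduce(
    (total, s) => total + WEIGHTS[s.name] * (s.count ?? 1),
    0,
  );
}

const example = scoreSignals([
  { name: "noSpf" },
  { name: "replyToApexMismatch" },
  { name: "authFailure", count: 2 }, // e.g. SPF and DMARC both failed
  { name: "dnssecSigned" },
]);
// 15 + 20 + 40 - 5 = 70
```

In this worked case the total of 70 falls in the 30–79 band, so the message would read as mixed.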

Weights live in src/lib/score.ts. Templates that turn each triggered signal into plain English live in src/lib/templates.ts. Both are open source under github.com/whisper-sec/isitspam.

Built on Whisper

The same graph and queries are available to your team. Set up the MCP client, read the Cypher query guide, or browse the full developer documentation.