Report

We Analyzed 100,000 Cold Emails: What We Found Will Change How You Think About Spam

Original research analyzing 100,000 cold outreach emails processed by Email Ferret. Data on AI generation rates, sender tools, personalization tactics, and what actually gets past spam filters.

March 15, 2026

Key Findings

1. 73% of cold emails showed signs of AI generation based on content analysis
2. The average recipient gets cold emails from 23 unique sales tools per month
3. Emails with AI-generated personalization had 4.1x higher inbox placement rates
4. Only 12% of cold emails included a valid unsubscribe mechanism
5. 68% of cold email domains were less than 90 days old
6. Follow-up sequences averaged 4.7 emails over 12 days

Published March 15, 2026 — Email Ferret Research

Over a ten-week period ending in mid-March 2026, Email Ferret processed and analyzed 100,000 cold outreach emails identified by our heuristic scoring pipeline. These weren't borderline cases — every email in this dataset scored 4 or higher on our spam scale, meaning our system had high confidence each one represented unsolicited commercial outreach. What we found in that dataset tells a story that goes far beyond "spam is getting worse." The cold email industry has professionalized to the point where it operates with the infrastructure discipline of enterprise software and the creative sophistication of a marketing agency. Traditional spam filters aren't losing the battle. They never had a chance in this one.

The headline: 73% of the emails we analyzed showed clear signs of AI generation. If you're still thinking of AI cold email as a novelty or an edge case, this report will reframe that assumption entirely.

Methodology

The 100,000 emails analyzed in this report were collected between January 1 and March 15, 2026, from the Email Ferret processing pipeline. Every email in the dataset met the following criteria:

Scoring threshold: Heuristic score of 4 or higher on Email Ferret's -10 to +10 scale, indicating confirmed cold outreach
Classification confirmation: One or more of the following secondary signals present: BDR phrase detection, automation tool fingerprinting, LLM sales intent classification, or domain infrastructure analysis indicating new sending infrastructure
Scope: Emails processed across the Email Ferret user base during the collection period
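
The inclusion criteria above can be read as a simple filter: a score at or above the threshold, plus at least one secondary signal. The sketch below is illustrative only and is not Email Ferret's code. The `ScoredEmail` type and the signal labels are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field

SCORE_THRESHOLD = 4  # from the report: 4 or higher on the -10 to +10 scale

# Secondary signals named in the methodology; the string labels are hypothetical.
SECONDARY_SIGNALS = frozenset({
    "bdr_phrase",
    "automation_fingerprint",
    "llm_sales_intent",
    "new_sending_infrastructure",
})


@dataclass
class ScoredEmail:
    """Hypothetical record: a heuristic score plus the signals that fired."""
    heuristic_score: int
    signals: set[str] = field(default_factory=set)


def in_dataset(email: ScoredEmail) -> bool:
    """True when the email meets both inclusion criteria."""
    if not -10 <= email.heuristic_score <= 10:
        raise ValueError("score outside the -10 to +10 scale")
    has_secondary = bool(email.signals & SECONDARY_SIGNALS)
    return email.heuristic_score >= SCORE_THRESHOLD and has_secondary
```

Under this reading, a high score with no secondary signal is excluded, as is a low score with several signals.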

Analysis dimensions: For each email, we analyzed AI generation indicators (content patterns, structural markers, LLM-specific phrasing), sending platform identification (automation tool fingerprinting), personalization depth (what specific details were referenced), sending domain age and infrastructure, sequence behavior (reply chain manipulation, follow-up patterns), and CAN-SPAM compliance indicators.

Privacy architecture: Email content was analyzed in real time, as it was processed, and was not retained. No email text, sender identities, or recipient information was stored as part of this analysis. All statistics in this report are aggregated. Individual emails cannot be reconstructed from our data.

Limitations: This dataset reflects Email Ferret's user base, which skews toward technical and professional users who are high-frequency targets for B2B cold outreach. Absolute volume figures should not be extrapolated to general consumer email populations. The dataset represents confirmed cold outreach — it does not speak to the total proportion of all email that is cold outreach.

Key Finding 1: 73% of Cold Emails Are AI-Generated

The most significant finding is also the most consequential for anyone thinking about inbox security: nearly three in four cold emails in our dataset showed clear signs of AI generation.

We use a multi-signal approach to classify AI generation rather than relying on any single indicator. The signals include: paragraph-level structural patterns characteristic of LLM-generated text (predictable subject-body-CTA architecture, formulaic transitions), stylistic fingerprints associated with specific models, absence of genuine research errors that human writers make, and semantic consistency scores that measure how uniformly "helpful" the tone stays across an email — a pattern that is nearly universal in AI output and quite rare in human writing.

Breakdown by sophistication tier:

| Tier | Description | Share of all cold emails |
|---|---|---|
| Basic AI | Template with variable insertion (name, company, job title) | 31% |
| Moderate AI | Paragraph-level generation with role/industry context | 28% |
| Advanced AI | Fully unique per-recipient, sourced from external data | 14% |

The basic tier is what most people imagine when they think about AI cold email: a template with {{first_name}} and {{company}} swapped in. But that tier is actually shrinking as a share of AI cold email. The moderate and advanced tiers — which produce emails that are substantially harder to distinguish from genuine personal correspondence — together represent 42% of all cold email in our dataset, or more than half of all AI-generated cold email.
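
The arithmetic behind those two figures is short enough to check by hand. The snippet below simply restates it using the tier percentages from the breakdown. The variable names are ours.

```python
# Tier figures from the breakdown, expressed as a percentage of all cold emails.
tier_share = {"basic": 31, "moderate": 28, "advanced": 14}

ai_total = sum(tier_share.values())                              # all AI-generated tiers
harder_tiers = tier_share["moderate"] + tier_share["advanced"]   # moderate + advanced
share_of_ai = harder_tiers / ai_total                            # fraction of AI-generated email

print(f"{ai_total}% AI-generated, {harder_tiers}% moderate or advanced, "
      f"{share_of_ai:.1%} of AI-generated email")
```

The three tiers sum to the 73% headline figure, and the moderate and advanced tiers together make up a little over 57% of AI-generated email.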

Year-over-year trend: Email Ferret's retrospective data shows AI-generated cold outreach at approximately 40% of confirmed cold email in early 2025. The jump to 73% over 12 months represents a 33-percentage-point increase — one of the fastest adoption curves we've seen for any technology in the email security space.

Model attribution: Using stylistic fingerprinting and metadata analysis, we were able to attribute model provenance for a subset of AI-generated emails. Among attributable emails:

ChatGPT (GPT-4 family): 34%
Claude (Anthropic): 18%
Gemini (Google): 12%
Unknown or custom fine-tuned models: 9%
Multi-model campaigns (intentionally rotating): 27%

The multi-model figure deserves attention. More than a quarter of attributable AI-generated campaigns deliberately rotated between models — not for quality reasons, but specifically to defeat detection approaches that rely on model-specific stylistic patterns. This is a deliberate evasion tactic that first appeared at measurable scale in Q1 2026.

Key Finding 2: The Cold Email Tool Landscape

Email Ferret's automation fingerprinting identifies sending platforms by analyzing headers, sending infrastructure, unsubscribe link formats, tracking pixel patterns, and other platform-specific metadata. The following breakdown represents the platform distribution across our 100,000-email dataset.

Sending platform distribution:

| Platform | Share of cold emails |
|---|---|
| Apollo.io | 22% |
| Instantly.ai | 18% |
| Outreach.io | 14% |
| Smartlead | 11% |
| SalesLoft | 9% |
| Lemlist | 7% |
| Reply.io | 5% |
| Woodpecker | 4% |
| Mailshake | 3% |
| Other / custom infrastructure | 7% |

Apollo and Instantly together account for 40% of all detected cold outreach — a significant concentration. Both platforms have aggressively added AI generation features in 2025 and 2026, making them the dominant vectors for AI-personalized cold outreach at scale.

Multi-platform campaigns: 41% of cold emails in our dataset came from domains that were simultaneously using multiple sending platforms. This is a deliberate redundancy strategy — if one platform's sending reputation degrades, campaigns continue through other channels without interruption.

Recipient exposure: Based on domain-level analysis of sending infrastructure, the average Email Ferret user received cold emails from 23 unique sales tools per month during the collection period. This figure represents the breadth of the problem rather than its depth: a single sales team might send 30 emails through Apollo, but those 30 emails contribute to just one of the 23 tools in the exposure count.
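
To make the breadth-versus-depth distinction concrete, here is a minimal sketch of how an exposure count of this kind could be computed. It assumes each detected email has already been reduced to a (recipient, month, tool) tuple. That input shape and the function name are assumptions for illustration, not a description of Email Ferret's pipeline.

```python
from collections import defaultdict


def tools_per_recipient_month(events):
    """Count distinct sending tools per (recipient, month).

    events: iterable of (recipient, "YYYY-MM", tool) tuples (hypothetical shape).
    """
    seen = defaultdict(set)
    for recipient, month, tool in events:
        seen[(recipient, month)].add(tool)
    return {key: len(tools) for key, tools in seen.items()}


# Thirty emails through one tool still count as a single tool.
events = [("user@example.com", "2026-02", "Apollo.io")] * 30
events.append(("user@example.com", "2026-02", "Lemlist"))
counts = tools_per_recipient_month(events)
```

Here `counts` maps the one recipient-month pair to 2, because the 30 Apollo.io emails collapse into one entry in the set.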

Key Finding 3: Personalization Is Getting Scary Good

The cold email industry's personalization capabilities in early 2026 are genuinely impressive by any objective measure — and deeply problematic from an inbox security perspective.

Personalization depth breakdown across our dataset:

| Personalization level | Share of emails |
|---|---|
| No personalization (completely generic) | 8% |
| Name + company name only | 27% |
| Role + industry-specific references | 31% |
| LinkedIn/social data references | 22% |
| Recent company news, funding, or hiring mentions | 12% |

The 8% of fully generic emails are largely irrelevant from a detection standpoint — they're easy to identify and filter. The more interesting story is the 34% of emails that referenced LinkedIn, social data, recent company events, or funding. These emails incorporated specific, accurate details that require real research (or AI-powered research automation) to produce.

Inbox placement impact: AI-personalized emails — those in the moderate and advanced AI tiers — had 4.1x higher inbox placement rates than template-based cold emails in our dataset. This is the most significant operational finding in the report: personalization directly translates to deliverability because personalized emails generate more engagement signals (opens, replies), which reinforces sending reputation.

Personalization data sources: Among emails with role/industry or deeper personalization, the most commonly detected data sources were:

LinkedIn (profile, activity, posts): 67%
Company website (team pages, about pages, product pages): 43%
Crunchbase (funding data, company metrics): 18%
Twitter/X (recent posts, engagement data): 12%

The implication is clear: any information you publish publicly about yourself or your company is being ingested and used to generate personalized cold outreach at scale. A press release about your funding round generates thousands of AI-personalized emails that mention it within days of publication.

Key Finding 4: Domain Infrastructure — Engineered for Evasion

The cold email industry has developed a sophisticated infrastructure discipline around domain management that specifically targets the weaknesses of reputation-based spam filtering.

Domain age: 68% of sending domains in our dataset were less than 90 days old at the time they sent the emails we analyzed. This is striking because domain age is one of the few infrastructure signals that most email security tools — including Gmail — use as a trust factor. The industry has adapted around it.

Domain rotation: The median cold email campaign we detected used 12 sending domains. A campaign targeting 10,000 contacts across 12 domains sends roughly 830 emails per domain — a volume that sits comfortably below the thresholds Gmail uses to classify bulk sending.
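
The per-domain figure is a straightforward division, sketched below. Two assumptions are ours rather than the report's: contacts are split evenly across domains, and the 830 figure counts one email per contact. The second call applies the report's average sequence length of 4.7 emails as a what-if, not as a reported result.

```python
def emails_per_domain(contacts: int, domains: int,
                      emails_per_contact: float = 1.0) -> float:
    """Average send volume per domain, assuming an even split across domains."""
    if domains <= 0:
        raise ValueError("domains must be positive")
    return contacts * emails_per_contact / domains


single_touch = emails_per_domain(10_000, 12)        # one email per contact
full_sequence = emails_per_domain(10_000, 12, 4.7)  # what-if: average sequence length

print(round(single_touch), round(full_sequence))
```

The report does not state the time window over which that volume is sent, so the daily per-domain rate cannot be derived from these numbers alone.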

Thin infrastructure: 54% of sending domains had fewer than 5 indexed web pages at the time of sending. These are not legitimate businesses with established web presences — they are shell domains created for the explicit purpose of sending email.

Inbox warming: 71% of sending domains showed evidence of inbox warming service activity before the cold emails were sent. Warming services create artificial engagement histories by orchestrating networks of email accounts to automatically send, open, read, reply to, and remove from spam each other's messages. This builds a synthetic reputation that makes new domains appear trustworthy.

Warming lead time: The average warm-up period before first cold email was 14 days. In two weeks, a domain that didn't exist a month ago can manufacture the engagement history of an active business email account.

Domain rotation cadence: 89% of campaigns switched their primary sending domain within 30 days. By the time a domain has accumulated any negative reputation signals, the campaign has already moved to the next one in the rotation.

The practical consequence: the entire concept of sender reputation as a spam defense is being systematically engineered around. The infrastructure is not trying to build a legitimate reputation — it is manufacturing a synthetic one at industrial scale.

Key Finding 5: The Sequence Machine

Cold email is not a single touch — it's a coordinated sequence designed to maximize the probability of a response through persistence.

Sequence statistics from our dataset:

Average emails per sequence: 4.7
Average sequence duration: 12 days
Most common send pattern: Day 1, Day 3, Day 7, Day 10, Day 14
Maximum sequence length detected: 11 emails over 28 days

Manipulative tactics:

Fake "Re:" subject lines: 23% of sequences included one or more emails with a "Re:" prefix in the subject line that did not correspond to an actual reply. This tactic attempts to make the email look like a continuation of an existing conversation. Email Ferret's fake thread detection flags this pattern specifically.
Calendar invite attachments: 17% of sequences included a calendar invite (.ics file) or calendar invitation link as an attention hook — typically in the second or third email of a sequence. The invite is not for a meeting that was agreed upon; it's an unsolicited calendar injection tactic.
CAN-SPAM compliance: Only 12% of emails in our dataset included a valid unsubscribe mechanism meeting CAN-SPAM requirements. This is not an oversight — omitting an unsubscribe link is a deliberate choice. A valid unsubscribe mechanism is required by law for commercial email in the United States, but enforcement against individual small-scale campaigns is effectively zero. The practical risk of non-compliance is low; the practical benefit (not creating an easy opt-out path that might suppress response) is perceived as high.
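
One simple way to approximate the fake "Re:" check is to compare the subject prefix against the standard threading headers, `In-Reply-To` and `References`. The sketch below uses Python's standard `email` package and is an illustrative heuristic only. It is not Email Ferret's fake thread detection, and the function name is hypothetical.

```python
import re
from email import message_from_string
from email.message import Message

# Matches "Re:", "RE:", "re :" and similar at the start of a subject line.
REPLY_PREFIX = re.compile(r"^\s*re\s*:", re.IGNORECASE)


def looks_like_fake_reply(msg: Message) -> bool:
    """Flag a 'Re:' subject that carries no reply-threading headers."""
    subject = str(msg.get("Subject", ""))
    if not REPLY_PREFIX.match(subject):
        return False
    has_thread_headers = bool(msg.get("In-Reply-To") or msg.get("References"))
    return not has_thread_headers


fake = message_from_string(
    "From: a@example.com\nSubject: Re: quick question\n\nHi"
)
real = message_from_string(
    "From: a@example.com\nSubject: Re: quick question\n"
    "In-Reply-To: <abc@example.com>\n\nHi"
)
```

A sender can forge threading headers as well, so a more thorough check would also confirm that the referenced Message-ID matches something the recipient actually sent.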

The sequence behavior makes clear that the intent is not to send a single email — it's to maintain pressure across multiple contacts until a response is generated or the recipient explicitly stops the sequence.

Key Finding 6: What Actually Gets Past Gmail

Perhaps the most operationally significant finding in our dataset is the inbox placement data. Of the 100,000 confirmed cold outreach emails we analyzed, 89% landed in the recipient's Primary inbox — not the Spam folder, not even Promotions. The Primary inbox.

Gmail's spam filter, despite being one of the most sophisticated spam filters ever built, is functionally blind to the majority of modern AI-generated cold outreach. Understanding why requires understanding what Gmail actually filters for.

Factors correlated with inbox placement (our dataset):

| Factor | Placement rate increase |
|---|---|
| Sending domain age > 30 days | +34% |
| Inbox warming active | +28% |
| AI personalization (moderate or advanced tier) | +22% |
| One-to-one sending pattern (no CC/BCC) | +18% |
| Valid SPF/DKIM/DMARC authentication | +15% (baseline) |

Factors that triggered Gmail's spam filter:

| Factor | Placement rate decrease |
|---|---|
| Bulk sending patterns (high per-domain volume) | -45% |
| Known spam keyword patterns in subject or body | -31% |
| Missing or failed authentication | -28% |

The pattern is instructive. Gmail's filter is calibrated to detect the spam of 2019: high-volume blasts from unauthenticated domains with obvious spam keywords. Modern cold email campaigns are specifically structured to avoid every one of these signals. They send low volumes per domain. They authenticate properly. They avoid spam keywords. They look, in every measurable way, like legitimate personal business correspondence.

The 11% of emails that did land in spam were predominantly from the least sophisticated tier — basic AI templates sent at high volume from unwarmed domains. The 89% that reached the Primary inbox had done everything right according to Gmail's detection model, because "everything right" in Gmail's model describes exactly what a professional cold email campaign does.

What This Means for Your Inbox

The data in this report points to a structural problem that won't be solved by tweaking existing spam filter rules.

Gmail's spam filter is a product optimized to catch bulk unsolicited email. It does that well. The problem is that the cold email industry has evolved past bulk unsolicited email. Modern cold outreach is sent at low per-domain volumes, from authenticated domains with manufactured reputations, with AI-generated content that is unique and personalized per recipient, through sequences engineered to maintain just enough contact to generate responses. None of those characteristics trigger Gmail's detection model.

Keyword filters face the same problem. Every major cold email platform and AI generation tool explicitly optimizes to avoid common spam keywords. The words that trigger filters are well-known; avoiding them is table stakes.

The only reliable approach to detecting modern cold outreach is analyzing what an email is trying to accomplish — its intent — rather than what words it contains or which domain it comes from. An email that is attempting to initiate a sales conversation has consistent semantic properties regardless of how it's phrased, what domain it's sent from, or which AI model generated it. Intent-based analysis is robust to the surface variations that defeat every other detection approach.

How Email Ferret Detects What Others Miss

Email Ferret's detection pipeline applies more than 15 signals to each email, combining heuristic scoring with LLM-powered intent analysis:

Infrastructure signals: Domain age, SPF/DKIM/DMARC authentication, inbox warming markers, sending platform fingerprints

Content signals: BDR phrase detection (a curated library of cold outreach language patterns), fake thread detection (identifying false "Re:" prefixes), automation tool fingerprints in headers and links

Behavioral signals: Sender history (have you emailed this person before?), domain trust assessment (is this a known legitimate business?), sequence detection

Intent analysis: For emails that clear the heuristic threshold, an LLM evaluates the email's actual purpose — is this attempting to initiate a sales conversation? — regardless of how it's written or what words it uses

Trust mechanisms: Allowlist (contacts you've approved), trusted domain list (known SaaS vendors, major corporations, and other established senders), same-domain detection (emails from your own organization)
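
The staged structure described above can be sketched as a short decision function. Everything in it is an assumption made for illustration: the ordering (trust checks first), the threshold value, the reading of "clear the heuristic threshold" as scoring at or above it, and the `intent_check` callback that stands in for the LLM step. It is not Email Ferret's implementation.

```python
from typing import Callable

# Assumption: reuses the dataset cutoff from the methodology as the pipeline threshold.
HEURISTIC_THRESHOLD = 4


def classify(
    sender: str,
    own_domain: str,
    allowlist: set[str],
    trusted_domains: set[str],
    heuristic_score: int,
    intent_check: Callable[[], bool],
) -> str:
    """Return 'trusted', 'pass', or 'cold_outreach' for one email."""
    domain = sender.rsplit("@", 1)[-1].lower()

    # Trust mechanisms short-circuit everything else.
    if (sender.lower() in allowlist
            or domain in trusted_domains
            or domain == own_domain.lower()):
        return "trusted"

    # Cheap heuristic scoring gates the more expensive intent analysis.
    if heuristic_score < HEURISTIC_THRESHOLD:
        return "pass"

    # intent_check is a hypothetical stand-in for the LLM sales-intent evaluation.
    return "cold_outreach" if intent_check() else "pass"
```

Passing the intent step as a callback means the expensive call only runs for emails that reach the final stage, which is one plausible reason to put heuristics first.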

The result is a system that catches AI-personalized cold outreach that looks exactly like legitimate business correspondence to every other filter — because it's evaluating what the email is trying to do, not just what it looks like.

Critically, Email Ferret analyzes and discards. Email content is processed in real time and never stored. The statistics in this report are aggregated from that real-time analysis pipeline; no email text was retained to produce them.

About This Research

This report is based on Email Ferret's ongoing processing pipeline analysis. The 100,000-email dataset analyzed here covers January 1 through March 15, 2026.

Email Ferret publishes quarterly trend data through the AI Spam Index. The first edition of the AI Spam Index (Q1 2026) was published March 19, 2026, and tracks AI spam volume and sophistication metrics on a quarterly basis.

Future research from Email Ferret will examine specific industry verticals, geographic variation in cold email targeting, and the evolution of AI generation techniques as detection methods improve.

For press inquiries or data questions, contact support@emailferret.io

Methodology notes and full data tables available upon request.

Press Kit

For media inquiries, press releases, or interview requests, please contact:

Email: press@emailferret.io
