Methodology

How ClaimHit's pipeline works

ClaimHit doesn't score infringement — it surfaces candidates that may infringe. Every patent runs through a five-stage pipeline that adds evidence at each step. Candidates confirmed on a manufacturer's own page surface as Good Match. Candidates the AI ensemble agreed on but couldn't be independently confirmed surface as Possible Match.

Stage 1 · The architecture

Seven models. Zero coordination.

When you submit a patent, ClaimHit runs seven AI models in parallel — two Claude Sonnet instances, one Claude Haiku, two GPT-4o variants, DeepSeek, and Mistral. Each model receives the same patent claims and proposes candidates independently, with no knowledge of what the others found.

A single model producing a confident result is hard to validate — confident hallucinations look identical to confident accurate results. Multiple independent models converging on the same target is a qualitatively different signal. Independence is what makes the consensus meaningful, and where models disagree we still keep candidates so the next four stages can verify them against the open web.

Claude Sonnet (×2) · Anthropic · ACTIVE
Claude Haiku · Anthropic · ACTIVE
GPT-4o (×2) · OpenAI · ACTIVE
DeepSeek · DeepSeek · ACTIVE
Mistral · Mistral AI · ACTIVE

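The fan-out described above can be sketched as follows. This is a minimal illustration, not ClaimHit's actual code: the roster labels, the `propose` stub, and the product names are all hypothetical, and a real call would go to each provider's API.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    model: str       # which ensemble member proposed the candidate
    candidate: str   # product or company name


# Hypothetical labels mirroring the seven models listed above.
ROSTER = [
    "claude-sonnet-a", "claude-sonnet-b", "claude-haiku",
    "gpt-4o-a", "gpt-4o-b", "deepseek", "mistral",
]


async def propose(model: str, claims: str) -> list[Proposal]:
    """Stand-in for a real model call; returns canned, made-up candidates."""
    await asyncio.sleep(0)  # yield control, as a network call would
    canned = {"claude-haiku": ["Acme Lidar X1"], "mistral": []}
    names = canned.get(model, ["Acme Lidar X1", "Bolt Scanner 3"])
    return [Proposal(model, name) for name in names]


async def run_ensemble(claims: str) -> list[Proposal]:
    # Each call receives only the claims, never another model's output.
    batches = await asyncio.gather(*(propose(m, claims) for m in ROSTER))
    return [p for batch in batches for p in batch]


proposals = asyncio.run(run_ensemble("1. A rotating sensor comprising ..."))
```

Because every call is built from the claims alone, no model can anchor on another model's answer, which is the property that makes later agreement informative.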
The pipeline

Five stages, in order

Each stage either adds evidence or removes noise. Candidates that survive all five and confirm on a manufacturer's page surface as Good Match. Candidates the AI ensemble agreed on but couldn't be independently confirmed surface as Possible Match.

01
🤝

Multi-Model Search

Seven AI models run in parallel against your patent’s inventive contribution. Each independently proposes candidate products and companies. Where multiple models agree, we have a strong consensus signal. Where they don’t, we still keep the candidates and verify them in later stages — disagreement isn’t a reason to drop a candidate, only a reason to insist on independent evidence.

Independence is what makes consensus meaningful. Disagreement keeps the net wide.
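One way to turn independent proposals into a consensus signal is to count how many distinct models named each candidate, while keeping single-model candidates in the list. The sketch below assumes proposals arrive as `(model, candidate)` pairs and uses a deliberately crude name normalisation; both are illustrative choices, and the names are made up.

```python
from collections import defaultdict


def tally_consensus(proposals):
    """Map each candidate to the set of models that proposed it."""
    voters = defaultdict(set)
    for model, candidate in proposals:
        # Crude normalisation: lowercase and collapse whitespace.
        key = " ".join(candidate.lower().split())
        voters[key].add(model)
    # Most-agreed first; nothing is dropped, even single-model candidates.
    return sorted(voters.items(), key=lambda kv: (-len(kv[1]), kv[0]))


sample = [
    ("sonnet-a", "Acme Lidar X1"),
    ("gpt-4o-a", "acme  lidar x1"),
    ("mistral", "Acme Lidar X1"),
    ("deepseek", "Bolt Scanner 3"),
]
ranked = tally_consensus(sample)
```

Counting distinct models rather than raw mentions means one model repeating itself cannot inflate the signal.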
02
🌐

Web Grounding

In parallel with the AI ensemble, we run a six-slot web search across two complementary retrieval engines — one optimized for semantic relevance (matches concepts even when keywords differ), the other for keyword precision against Google’s index. Each slot targets a different page type: manufacturer marketing language, feature pages, use-case explainers, competitive comparisons, end-user reviews, and an invention-specific angle. This catches real products the AI ensemble missed.

Six search slots. Two engines. Real products beyond what AI alone can recall.
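As a rough sketch of the search plan, the six slots can be crossed with the two engines and the results merged by URL. The text does not say how slots are assigned to engines, so sending every slot to both is an assumption here, as are the engine labels and the naive query template.

```python
from itertools import product

# The six page types named above.
SLOTS = [
    "manufacturer marketing language",
    "feature pages",
    "use-case explainers",
    "competitive comparisons",
    "end-user reviews",
    "invention-specific angle",
]
ENGINES = ["semantic", "keyword"]  # placeholder labels, not real product names


def build_search_plan(invention_summary: str) -> list[dict]:
    """One query per (slot, engine) pair; the template is illustrative only."""
    return [
        {"slot": slot, "engine": engine, "query": f"{invention_summary} {slot}"}
        for slot, engine in product(SLOTS, ENGINES)
    ]


def merge_hits(hit_lists: list[list[str]]) -> list[str]:
    """Union result URLs across searches, keeping the first occurrence."""
    seen: set[str] = set()
    merged: list[str] = []
    for hits in hit_lists:
        for url in hits:
            key = url.rstrip("/")  # treat a trailing slash as the same page
            if key not in seen:
                seen.add(key)
                merged.append(url)
    return merged


plan = build_search_plan("rotating sensor")
merged = merge_hits([
    ["https://example.com/x1", "https://example.org/a"],
    ["https://example.com/x1/"],
])
```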
03
🚫

Structural Noise Filtering

Before running expensive verification, we drop entries that pattern-match content sites — UGC platforms, blog and news subdomains, editorial paths, patent corpus sites, academic aggregators. These are correctly classified as non-products by later stages anyway, but filtering them upstream saves analysis time without losing signal. The filter is conservative: anything ambiguous passes through.

Cheap noise removal up front. Conservative — borderline candidates always pass through.
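A structural filter of this kind can be approximated with URL pattern checks alone, using Python's standard `urllib.parse`. The host, subdomain, and path lists below are illustrative examples of the categories named above, not ClaimHit's actual lists; anything the rules do not clearly match passes through.

```python
from urllib.parse import urlparse

# Illustrative examples only; the real lists are not published.
NOISY_HOSTS = {"reddit.com", "medium.com", "patents.google.com", "researchgate.net"}
NOISY_SUBDOMAINS = {"blog", "news"}
NOISY_PATH_PARTS = {"blog", "news", "press", "editorial"}


def is_structural_noise(url: str) -> bool:
    """Return True only when the URL clearly matches a content-site pattern."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if not host:
        return False  # unparseable input is ambiguous, so let it through
    if host.startswith("www."):
        host = host[4:]
    if any(host == h or host.endswith("." + h) for h in NOISY_HOSTS):
        return True
    labels = host.split(".")
    # Require a real subdomain so that a bare domain such as news.com passes.
    if len(labels) >= 3 and labels[0] in NOISY_SUBDOMAINS:
        return True
    segments = [s for s in parsed.path.lower().split("/") if s]
    return bool(segments) and segments[0] in NOISY_PATH_PARTS


urls = [
    "https://blog.example.com/post",
    "https://www.example.com/products/x1",
    "https://patents.google.com/patent/US123",
    "https://example.com/news/launch",
    "not a url",
]
kept = [u for u in urls if not is_structural_noise(u)]
```

Only the first path segment is checked, so a product page whose name happens to contain "blog" or "news" deeper in the path is not dropped.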
04
🎯

Category-Fit Check

Each candidate goes through a Claude Sonnet review that asks one focused question: is this product in the same category as the invention? Solid-state sensor pages get dropped from rotating-sensor searches. Mapping software gets dropped from sensor-hardware searches. The check is recall-tuned — borderline products pass through to verification rather than being dropped early. Decisions come back with a confidence band that affects the final ranking.

Category match, not feature match. Recall over precision at this stage.
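A recall-tuned decision rule can be sketched as: drop a candidate only when the reviewer says "different category" with high confidence, and keep everything else. The `CategoryVerdict` shape, the numeric confidence, and both thresholds are assumptions made for illustration; the text only says that decisions carry a confidence band.

```python
from dataclasses import dataclass


@dataclass
class CategoryVerdict:
    same_category: bool
    confidence: float  # assumed 0.0 to 1.0, as reported by the reviewing model


HIGH_BAND = 0.8    # assumed threshold
MEDIUM_BAND = 0.5  # assumed threshold


def decide(verdict: CategoryVerdict) -> tuple[bool, str]:
    """Return (keep, band). Only confident mismatches are dropped."""
    if verdict.confidence >= HIGH_BAND:
        band = "high"
    elif verdict.confidence >= MEDIUM_BAND:
        band = "medium"
    else:
        band = "low"
    keep = verdict.same_category or band != "high"
    return keep, band


confident_mismatch = decide(CategoryVerdict(same_category=False, confidence=0.95))
borderline_mismatch = decide(CategoryVerdict(same_category=False, confidence=0.6))
```

Here the confident mismatch is dropped while the borderline one passes through to verification, with its band available for later ranking.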
05

Manufacturer-Page Verification

For candidates that pass category-fit, we attempt to confirm them on the manufacturer’s own page. We check that the URL resolves and that the page content matches patent-distinctive vocabulary, then run a final language-model pass to confirm the named product is actually hosted on that domain. Candidates that confirm get tagged Good Match. Candidates the AI ensemble agreed on but we couldn’t independently confirm get tagged Possible Match. Both surface in your results.

Confirmation on the manufacturer’s own domain is the strongest evidence we surface.
Match types

Good Match vs Possible Match

Every result is sorted into one of two buckets, by what we could verify.

● GOOD MATCH
Confirmed on the manufacturer's own page and active in the target market. The strongest evidence we surface — but still a candidate, not a finding.
● POSSIBLE MATCH
The AI ensemble agreed but we couldn't independently confirm: the manufacturer page didn't resolve, the content didn't match, or market activity was uncertain. Worth investigating; not yet documented.
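The two buckets can be read as a simple rule over the verification signals. The sketch below is an interpretation of the definitions above, not ClaimHit's code: the field names are hypothetical, and the text does not say what happens to a candidate that is neither confirmed nor agreed on by the ensemble, so that case returns `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VerificationEvidence:
    url_resolves: bool        # manufacturer page loads
    vocabulary_matches: bool  # page content matches patent-distinctive terms
    hosted_on_domain: bool    # product confirmed as hosted on that domain
    market_active: bool       # active in the target market


def match_type(evidence: VerificationEvidence, ensemble_agreed: bool) -> Optional[str]:
    confirmed = (
        evidence.url_resolves
        and evidence.vocabulary_matches
        and evidence.hosted_on_domain
        and evidence.market_active
    )
    if confirmed:
        return "Good Match"
    if ensemble_agreed:
        return "Possible Match"
    return None  # assumption: the source does not describe this case


full = VerificationEvidence(True, True, True, True)
broken_page = VerificationEvidence(False, False, False, True)
tags = [match_type(full, True), match_type(broken_page, True)]
```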
Risk levels

What each level means

HIGH
Multiple models agree, core claim elements match, and specific documented evidence exists for at least three elements. Warrants formal attorney analysis.
MEDIUM
Strong potential requiring further investigation. May reflect proprietary specifications, undisclosed implementations, or markets where public documentation is limited.
LOW
Filtered from results by default. The rating rests on inference only; treat it as a directional lead at most.
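Read as a rule, the three levels might look like the sketch below. Only the HIGH criteria are spelled out above; treating "inference only" as zero documented elements, and everything in between as MEDIUM, is an assumption, as are the parameter names.

```python
def risk_level(models_agreeing: int, core_elements_match: bool,
               documented_elements: int) -> str:
    """Map evidence counts to a level; the MEDIUM/LOW boundary is assumed."""
    if models_agreeing >= 2 and core_elements_match and documented_elements >= 3:
        return "HIGH"
    if documented_elements == 0:
        return "LOW"  # inference only, no documented evidence
    return "MEDIUM"


levels = [
    risk_level(4, True, 3),
    risk_level(3, True, 1),
    risk_level(1, False, 0),
]
```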
⚖️ Legal Disclaimer: ClaimHit results are for preliminary research only. They surface candidates that may infringe — not infringement findings. Results do not constitute legal advice and should not be relied upon as the basis for legal action without formal analysis by a qualified patent attorney.

See it in practice

3 free searches. No credit card. Results in about 90 seconds.
