Methodology

How ClaimHit's pipeline works

ClaimHit doesn't score infringement; it surfaces candidates that may infringe. Every patent runs through a five-stage pipeline in which each stage either adds evidence or removes noise. Candidates confirmed on a manufacturer's own page surface as Good Match. Candidates that the models agreed on but that couldn't be independently confirmed surface as Possible Match.

Stage 1 · The architecture

Six models. Zero coordination.

When you submit a patent, ClaimHit runs six models in parallel — multiple frontier models from different providers, including several run in distinct configurations. Each model receives the same patent claims and independently proposes candidates, with no knowledge of what the others have found. This is the discovery step; everything they propose is verified in later stages.

A single model producing a confident result is hard to validate — confident hallucinations look identical to confident accurate results. Multiple independent models converging on the same target is a qualitatively different signal. Independence is what makes the consensus meaningful, and where models disagree we still keep candidates so the next four stages can verify them against the open web.
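To make the consensus idea concrete, here is a minimal sketch that counts how many models independently proposed each candidate. The model names, product names, and the `tally_consensus` helper are hypothetical; the text does not describe how ClaimHit actually aggregates proposals.

```python
from collections import Counter

def tally_consensus(proposals: dict[str, set[str]]) -> Counter:
    """Count how many models independently proposed each candidate."""
    votes = Counter()
    for candidates in proposals.values():
        # A set per model means each model casts at most one vote per candidate.
        votes.update(candidates)
    return votes

# Hypothetical output from three of the six models.
proposals = {
    "model_a": {"Acme Sensor X1", "Borealis Scan"},
    "model_b": {"Acme Sensor X1"},
    "model_c": {"Acme Sensor X1", "Cirrus Mapper"},
}
votes = tally_consensus(proposals)
```

Candidates with a single vote stay in the tally: in this sketch, low agreement changes how much independent evidence is needed, not whether the candidate is kept.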

Frontier model · Provider A · ACTIVE
Frontier model · Provider A · ACTIVE
Frontier model · Provider A · ACTIVE
Frontier model · Provider B · ACTIVE
Frontier model · Provider B · ACTIVE
Frontier model · Provider C · ACTIVE

The pipeline

Five stages, in order

Each stage either adds evidence or removes noise. Candidates that survive all five stages and are confirmed on a manufacturer's page surface as Good Match. Candidates that the models agreed on but that couldn't be independently confirmed surface as Possible Match.

01 · 🤝 Parallel Candidate Discovery

Six models run in parallel against your patent’s inventive contribution. Each independently proposes candidate products and companies. Where multiple models agree, we have a strong consensus signal. Where they don’t, we still keep the candidates and verify them in later stages — disagreement isn’t a reason to drop a candidate, only a reason to insist on independent evidence.

Independence is what makes consensus meaningful. Disagreement keeps the net wide.
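The parallel, uncoordinated fan-out can be sketched with `asyncio`. The `propose` coroutine is a stand-in for a real provider call, and the model names are placeholders; the point is only that every task receives the same claims and never sees another model's output.

```python
import asyncio

async def propose(model_name: str, claims: str) -> set[str]:
    """Stand-in for one model call; a real version would call a provider API."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return {f"candidate-from-{model_name}"}

async def discover(claims: str, model_names: list[str]) -> dict[str, set[str]]:
    # Each task is given only the claims, so proposals stay independent.
    results = await asyncio.gather(*(propose(m, claims) for m in model_names))
    # gather() returns results in the order the awaitables were passed in.
    return dict(zip(model_names, results))

models = [f"model_{i}" for i in range(1, 7)]
proposals = asyncio.run(discover("claim text", models))
```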

02 · 🌐 Web Grounding

In parallel with the models, we run a six-slot web search across two complementary retrieval engines — one optimized for semantic relevance (matches concepts even when keywords differ), the other for keyword precision against Google’s index. Each slot targets a different page type: manufacturer marketing language, feature pages, use-case explainers, competitive comparisons, end-user reviews, and an invention-specific angle. This catches real products the models missed.

Six search slots. Two engines. Real products beyond what AI alone can recall.
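One way to picture the search plan is as a grid of page-type slots and engines. The slot names below come from the description above; the engine labels, the query template, and the assumption that every slot runs on both engines are illustrative guesses, not documented behavior.

```python
from itertools import product

# Page types named in the text.
SLOTS = [
    "manufacturer marketing",
    "feature page",
    "use-case explainer",
    "competitive comparison",
    "end-user review",
    "invention-specific angle",
]
# Placeholder labels for the two retrieval engines.
ENGINES = ["semantic", "keyword"]

def build_search_plan(invention_summary: str) -> list[dict[str, str]]:
    """Pair every slot with every engine (an assumption about the layout)."""
    return [
        {"engine": engine, "slot": slot, "query": f"{invention_summary} {slot}"}
        for slot, engine in product(SLOTS, ENGINES)
    ]

plan = build_search_plan("rotating sensor")
```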

03 · 🚫 Structural Noise Filtering

Before running expensive verification, we drop entries that pattern-match content sites: user-generated-content (UGC) platforms, blog and news subdomains, editorial paths, patent corpus sites, and academic aggregators. Later stages would classify these as non-products anyway, so filtering them upstream saves analysis time without losing signal. The filter is conservative: anything ambiguous passes through.

Cheap noise removal up front. Conservative — borderline candidates always pass through.
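A conservative structural filter of this kind might look like the sketch below. The host list, subdomain labels, and path prefixes are invented examples; the real pattern lists are not given in this text.

```python
from urllib.parse import urlparse

# Illustrative patterns only (hypothetical hosts and prefixes).
CONTENT_HOSTS = {"patents.example.org", "forum.example.net"}
CONTENT_SUBDOMAINS = {"blog", "news"}
EDITORIAL_PATH_PREFIXES = ("/blog/", "/news/", "/press/")

def is_structural_noise(url: str) -> bool:
    """Return True only on a clear content-site pattern; ambiguous URLs pass."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host in CONTENT_HOSTS:
        return True
    labels = host.split(".")
    # Only treat the first label as a subdomain when there are at least three labels.
    if len(labels) > 2 and labels[0] in CONTENT_SUBDOMAINS:
        return True
    return parsed.path.lower().startswith(EDITORIAL_PATH_PREFIXES)

urls = [
    "https://blog.example.com/top-sensors",
    "https://example.com/products/sensor-x1",
    "https://example.com/news/launch",
    "https://patents.example.org/patent/123",
]
kept = [u for u in urls if not is_structural_noise(u)]
```

Anything that fails to match a pattern is kept, which mirrors the rule that borderline candidates pass through.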

04 · 🎯 Category-Fit Check

Each candidate goes through a focused model review that asks one question: is this product in the same category as the invention? Solid-state sensor pages get dropped from rotating-sensor searches. Mapping software gets dropped from sensor-hardware searches. The check is recall-tuned — borderline products pass through to verification rather than being dropped early. Decisions come back with a confidence band that affects the final ranking.

Category match, not feature match. Recall over precision at this stage.
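A recall-tuned gate with a confidence band could be sketched as follows. The `fit_probability` input stands in for the model reviewer's judgment, and all thresholds are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class CategoryDecision:
    keep: bool
    band: str  # "high", "medium", or "low" confidence that the category fits

def category_fit(fit_probability: float, drop_below: float = 0.2) -> CategoryDecision:
    """Drop only clear mismatches; borderline products continue to verification."""
    if fit_probability >= 0.7:
        band = "high"
    elif fit_probability >= 0.4:
        band = "medium"
    else:
        band = "low"
    # The drop threshold sits well below the band cut-offs, favoring recall.
    return CategoryDecision(keep=fit_probability >= drop_below, band=band)

borderline = category_fit(0.3)
mismatch = category_fit(0.05)
```

Here the borderline product is kept with a low band, so it can still be verified but ranks lower, while the clear mismatch is dropped.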

05 · ✓ Manufacturer-Page Verification

For candidates that pass category-fit, we attempt to confirm them on the manufacturer’s own page. We check that the URL resolves, that page content matches patent-distinctive vocabulary, and that a final language-model pass confirms the named product is actually hosted on that domain. Candidates that confirm get tagged Good Match. Candidates the models agreed on but we couldn’t independently confirm get tagged Possible Match — both surface in your results.

Confirmation on the manufacturer’s own domain is the strongest evidence we surface.
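The three checks described above can be sketched as a short chain. The page fetcher and the final confirmation step are passed in as callables so the example runs offline; the overlap threshold, function names, and sample page are assumptions rather than ClaimHit's implementation.

```python
from typing import Callable, Optional

def vocabulary_overlap(page_text: str, distinctive_terms: set[str]) -> float:
    """Fraction of patent-distinctive terms that appear in the page text."""
    if not distinctive_terms:
        return 0.0
    text = page_text.lower()
    hits = sum(1 for term in distinctive_terms if term.lower() in text)
    return hits / len(distinctive_terms)

def verify_on_manufacturer_page(
    fetch_page: Callable[[str], Optional[str]],  # returns None if the URL doesn't resolve
    confirm_hosted: Callable[[str, str], bool],  # stand-in for the final model pass
    url: str,
    product: str,
    distinctive_terms: set[str],
    min_overlap: float = 0.5,  # invented threshold
) -> bool:
    page_text = fetch_page(url)
    if page_text is None:
        return False  # check 1: URL must resolve
    if vocabulary_overlap(page_text, distinctive_terms) < min_overlap:
        return False  # check 2: content must match distinctive vocabulary
    return confirm_hosted(product, page_text)  # check 3: product is hosted here

# Hypothetical page store standing in for live HTTP fetches.
pages = {"https://example.com/x1": "The X1 uses a rotating mirror assembly and a brushless motor."}
confirmed = verify_on_manufacturer_page(
    pages.get,
    lambda product, text: product.lower() in text.lower(),
    "https://example.com/x1",
    "X1",
    {"rotating mirror", "brushless motor"},
)
```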

Match types

Good Match vs Possible Match

Every result is sorted into one of two buckets, by what we could verify.

● GOOD MATCH

Confirmed on the manufacturer's own page and active in the target market. The strongest evidence we surface — but still a candidate, not a finding.

● POSSIBLE MATCH

The models agreed but we couldn't independently confirm the candidate: the manufacturer page didn't resolve, the content didn't match, or market activity was uncertain. Worth investigating; not yet documented.
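Read as a decision rule, the two buckets reduce to a small function. This is a sketch under the assumption that page confirmation and market activity are the only two inputs; the names are hypothetical.

```python
from enum import Enum

class MatchType(Enum):
    GOOD = "Good Match"
    POSSIBLE = "Possible Match"

def match_type(page_confirmed: bool, market_active: bool) -> MatchType:
    """Good Match needs both confirmations; anything less stays a Possible Match."""
    if page_confirmed and market_active:
        return MatchType.GOOD
    return MatchType.POSSIBLE
```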

Risk levels

What each level means

HIGH

Multiple models agree, core claim elements match, and specific documented evidence exists for at least three elements. Warrants formal attorney analysis.

MEDIUM

Strong potential requiring further investigation. May reflect proprietary specifications, undisclosed implementations, or markets where public documentation is limited.

LOW

Filtered from results by default. Rating based on inference only; treat as a directional lead at most.
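The levels can be sketched as a small classifier plus a default filter. The HIGH rule follows the three conditions stated above, reading "multiple models" as two or more; the boundary between MEDIUM and LOW is not specified in the text, so the rule used here is an assumption, as are all the names.

```python
from dataclasses import dataclass

@dataclass
class Evidence:
    agreeing_models: int
    core_elements_match: bool
    documented_elements: int  # claim elements backed by specific documented evidence

def risk_level(e: Evidence) -> str:
    """HIGH follows the stated conditions; the MEDIUM/LOW split is assumed."""
    if e.agreeing_models >= 2 and e.core_elements_match and e.documented_elements >= 3:
        return "HIGH"
    if e.core_elements_match or e.documented_elements >= 1:
        return "MEDIUM"
    return "LOW"  # inference only

def visible(results: list[Evidence], include_low: bool = False) -> list[Evidence]:
    """LOW results are hidden unless explicitly requested."""
    return [e for e in results if include_low or risk_level(e) != "LOW"]

results = [Evidence(3, True, 3), Evidence(2, True, 1), Evidence(1, False, 0)]
shown = visible(results)
```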

⚖️ Legal Disclaimer: ClaimHit results are for preliminary research only. They surface candidates that may infringe — not infringement findings. Results do not constitute legal advice and should not be relied upon as the basis for legal action without formal analysis by a qualified patent attorney.

See it in practice

Search your patent in about 90 seconds.

