EyeSift

AI Detection in Scientific Papers 2026: arXiv, PubMed, Nature, Elsevier, Springer Policies

AI-related paper retractions surged from 12 in 2022 to 620 in 2025 (7.8% of all retractions that year). Detection accuracy varies widely by field: humanities 82%, social science 78%, biology 72%, but math/theoretical only 35%. AI is hardest to spot in highly structured fields. Every major publisher (Nature, Science, Elsevier, Springer, IEEE, ACM) bans AI authorship and requires disclosure. Below are EyeSift's 2026 policy matrix for 8 publishers and platforms, detection accuracy for 8 paper types, 8 forensic signals, and the retraction trend timeline.

Last updated April 2026. Data come from publisher submission guidelines (Nature, Science, Elsevier, Springer Nature, IEEE, ACM), the Crossref Retraction Watch database, the ICMJE 2024 update, the 2024 expansion of the Cabanac Tortured Phrases Database, and EyeSift internal benchmarks against pre-publication papers.

1. Publisher AI Policies 2026 (8 Major Publishers)

| Publisher | Disclosure Required | Detection Tool | Enforcement |
|---|---|---|---|
| Nature (Springer Nature) | Yes: Methods section | Originality.ai Code + internal stylometry | Editor review on flag; rejection if undisclosed; retraction if discovered post-publication |
| Science (AAAS) | Yes: Acknowledgments | iThenticate + GPTZero-Academic | Pre-publication detection scan; rejection on undisclosed; permanent ban on repeat offenders |
| Elsevier (Lancet, Cell Press) | Yes: Declaration of AI Use | Editorial Manager + Crossref Similarity | Author signs AI-disclosure statement; mandatory checkbox in submission portal |
| Springer Nature (BMC, Scientific Reports) | Yes: Methods or Acknowledgments | Same as Nature | Editorial review + author declaration |
| IEEE (CS Society, signal processing) | Yes: Author Note | CrossCheck + iThenticate | Pre-review scan; flagged papers go to editor |
| ACM (Association for Computing Machinery) | Yes: Disclosure Statement | iThenticate | Mandatory disclosure form; editor reviews disclosure adequacy |
| arXiv (preprint) | No formal requirement (but recommended) | None (preprint server, not peer review) | Self-disclosure; community flagging; retraction by submitter |
| PubMed (NCBI) | Inherits from publishing journal | Inherits | NCBI follows MEDLINE indexing rules; disputes via journal |

2. Detection Accuracy by Paper Type

| Paper Type | Detection Accuracy | Why |
|---|---|---|
| Theoretical / Math-heavy (theorems, proofs) | 35% | Highly structured math notation; AI struggles to generate novel proofs; humans and AI both produce similar formal notation |
| Computer Science / AI/ML | 58% | AI is strong at CS writing; pattern matching to existing papers; tied to AI training data |
| Biology / Chemistry experimental | 72% | Experimental sections have AI-recognizable patterns; methods descriptions often automated |
| Medical / Clinical | 68% | Standardized clinical study format; AI generates good imitations but specific drug + dose + cohort details flag |
| Engineering (mechanical, electrical) | 65% | Component spec descriptions follow standardized formats; AI can match but specific failure modes flag |
| Social Science / Psychology | 78% | Statistical narrative + qualitative analysis; human nuance harder to fake |
| Humanities / Literature analysis | 82% | Argument-based; human voice and citation choice patterns highly diagnostic |
| Review articles / meta-analyses | 71% | Synthesis writing; AI can imitate but specific paper selection patterns flag |

3. The 8 Forensic Signals That Distinguish Human from AI Scientific Writing

| Signal | Weight | Human Pattern | AI Pattern |
|---|---|---|---|
| Citation pattern + recency | Very High | Cites within established literature subgraph; uses field-specific seminal papers | Citations skew recent; sometimes hallucinates non-existent papers; misses field-specific seminal works |
| Methods section specificity | Very High | Specific reagent batches, lot numbers, equipment serial numbers, lab protocols | Generic descriptions; missing specific batch/lot details; cannot fabricate equipment serials |
| Figures and data uniqueness | Very High | Idiosyncratic plot styles, axis labeling conventions, error bar choices | AI-generated figures (DALL-E, Midjourney) lack scientific accuracy; AI tools like ChatGPT do not generate original data |
| Statistical analysis depth | High | Specific p-values, post-hoc tests, sensitivity analyses, multiple comparison corrections | Generic statistical reporting; sometimes incorrect formulas; over-confident framing |
| Acknowledgments section content | High | Names colleagues, funding agencies, lab-specific instruments, conferences attended | Generic acknowledgments; rarely names specific people or instruments |
| Discussion of limitations | Medium | Specific to the work, often counterintuitive caveats | Generic limitations templates ("this study had a small sample size", "more research is needed") |
| Voice consistency across sections | Medium-High | Slight stylistic variation between authors of different sections | Highly consistent voice throughout; AI tends to "smooth" multi-author writing |
| Reagent / chemical compound specificity | Very High | Exact molecular formulas, vendor catalog numbers | Generic compound names; missing vendor specifics; sometimes incorrect formulas |
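
Two of the signals above (limitations templates and reagent/lot specificity) can be approximated with simple text checks. The following is a minimal sketch only, not EyeSift's detector: the function name `quick_signal_checks` and the regular expression for lot and catalog identifiers are assumptions made for illustration, and the phrase list contains only the two template limitations quoted above.

```python
import re

# The two template limitation phrases quoted in the signal list above.
GENERIC_LIMITATIONS = (
    "this study had a small sample size",
    "more research is needed",
)

# Hypothetical pattern for identifiers such as "Lot 56789" or
# "Cat. No. AB-1234". The lookahead requires at least one digit so that
# ordinary prose like "a lot more samples" is not matched.
REAGENT_ID = re.compile(
    r"\b(?:lot|cat(?:alog)?\.?)\s*(?:no\.?|number|#)?\s*:?\s*"
    r"(?=[A-Z0-9-]*\d)[A-Z0-9][A-Z0-9-]{2,}",
    re.IGNORECASE,
)


def quick_signal_checks(text: str) -> dict:
    """Return two coarse flags for a methods or limitations passage."""
    lowered = text.lower()
    return {
        # Which template limitation phrases appear verbatim.
        "generic_limitations": [p for p in GENERIC_LIMITATIONS if p in lowered],
        # Whether any lot/catalog-style identifier is present.
        "has_reagent_ids": bool(REAGENT_ID.search(text)),
    }
```

A passage with no reagent identifiers and several template phrases is only a weak hint, not proof. Real papers use many identifier formats that this pattern does not cover.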

4. AI-Related Retraction Trends 2022-2026

| Year | Total Retractions | AI-Related | % of Total | Primary Reason |
|---|---|---|---|---|
| 2022 | 4,500 | 12 | 0.3% | Plagiarism, fabrication, data manipulation (pre-LLM era) |
| 2023 | 5,200 | 95 | 1.8% | Early ChatGPT misuse; AI-generated text included without disclosure |
| 2024 | 6,800 | 410 | 6.0% | Major surge; AI-generated images detected (DALL-E artifacts in Frontiers paper); fake citations |
| 2025 | 7,900 | 620 | 7.8% | Detection tools improving; mandatory disclosure violations; "tortured phrases" campaigns |
| 2026 | 8,400 | 560 | 6.7% | Plateau as authors learn detection avoidance; underlying use likely 3-5x higher than retracted |

True AI use is estimated to be 3-5x higher than the retracted count, because detection accuracy varies by field. The 2026 figures reflect the current trend, and the plateau signals that authors are learning to avoid detection, not that AI use is actually decreasing.
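
The percentage column and the 3-5x estimate are simple arithmetic on the retraction counts. The sketch below reproduces them, reading "3-5x higher" as a multiplier of 3 to 5 on the retracted count, which is an interpretation. The function names are illustrative.

```python
# (year, total retractions, AI-related retractions) from the timeline above.
RETRACTIONS = [
    (2022, 4500, 12),
    (2023, 5200, 95),
    (2024, 6800, 410),
    (2025, 7900, 620),
    (2026, 8400, 560),
]


def ai_share(total: int, ai_related: int) -> float:
    """AI-related retractions as a percentage of all retractions."""
    return round(100 * ai_related / total, 1)


def estimated_true_use(ai_related: int, low: float = 3.0,
                       high: float = 5.0) -> tuple[int, int]:
    """Apply the assumed 3x and 5x multipliers to a retracted count."""
    return round(ai_related * low), round(ai_related * high)


for year, total, ai_related in RETRACTIONS:
    lo, hi = estimated_true_use(ai_related)
    print(f"{year}: {ai_share(total, ai_related)}% of retractions; "
          f"implied AI-assisted papers: {lo}-{hi}")
```

For 2025 this gives 7.8% and an implied range of 1,860 to 3,100 papers.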

Frequently Asked Questions

Do scientific publishers detect AI-generated papers?

Yes. All major publishers require AI disclosure and run pre-publication detection. Tools include iThenticate, Originality.ai Code, GPTZero-Academic, and Crossref Similarity. Detection accuracy varies by field: humanities 82%, social science 78%, biology 72%, medical 68%, CS 58%, math/theoretical 35%. AI-related retractions rose from 12 in 2022 to 620 in 2025, and true AI use is estimated to be 3-5x higher than the retracted count.

Can AI be listed as a co-author on scientific papers?

No. Every major publisher (Nature, Science, Elsevier, Springer Nature, IEEE, ACM) and the ICMJE prohibit AI authorship, because authorship requires accountability and responsibility that AI cannot legally provide. AI use must instead be disclosed in the Methods or Acknowledgments section, or in the publisher's dedicated disclosure statement. Under the ICMJE 2024 update, AI use for editing and grammar is acceptable without disclosure, substantive content generation must be disclosed, and AI cannot appear in the author byline.

What are tortured phrases?

Tortured phrases are machine-generated synonym replacements for established technical terms, and they signal undisclosed AI use. Examples: "deep brain learning" instead of "deep learning"; "haphazardly produce" instead of "randomly generate". The Tortured Phrases Database (Cabanac 2024) identifies 800+ patterns. A phrase is flagged when it does not occur in the legitimate scientific corpus and its embedding is close to that of the expected phrase.
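
A much simpler approximation of this check is a fixed lookup table of known tortured phrases. The sketch below deliberately swaps the corpus-and-embedding approach for plain pattern matching. It uses only the two examples quoted above, and the function name `find_tortured_phrases` is illustrative.

```python
import re

# The two examples quoted above; the real database lists 800+ patterns.
TORTURED_TO_EXPECTED = {
    "deep brain learning": "deep learning",
    "haphazardly produce": "randomly generate",
}


def find_tortured_phrases(text: str) -> list[tuple[str, str]]:
    """Return (tortured phrase, expected phrase) pairs found in text."""
    hits = []
    for tortured, expected in TORTURED_TO_EXPECTED.items():
        # Word boundaries avoid matching inside longer words;
        # matching is case-insensitive.
        if re.search(rf"\b{re.escape(tortured)}\b", text, re.IGNORECASE):
            hits.append((tortured, expected))
    return hits
```

A lookup table only catches phrases that are already catalogued. Finding previously unseen substitutions is where a corpus-frequency and embedding comparison would be needed.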

Which publisher has the strictest AI policy?

Science (AAAS) is the strictest: it runs a pre-publication automated detection scan and permanently bans repeat offenders. Nature is close behind. Elsevier requires a mandatory checkbox in the submission portal. IEEE and ACM require disclosure, but practice varies by journal. arXiv, as a preprint server, has no formal pre-screening. PubMed inherits the source journal's policy.

Why is AI detection harder in CS/math papers?

Math and theoretical papers reach only 35% detection accuracy because their notation is highly structured, their phrasing is standardized, and AI-assisted writing of established proofs is effectively undetectable. CS/AI/ML papers reach 58% accuracy because LLMs are trained heavily on CS papers and are particularly fluent in their own field.

What forensic signals indicate AI in a scientific paper?

The top 8 are: citation patterns (AI hallucinates papers); methods specificity (humans include batch numbers); figure uniqueness (AI-generated figures lack scientific accuracy); statistical depth (humans report specific p-values and post-hoc tests); acknowledgments (humans name colleagues and instruments); limitations (humans give specific caveats); voice consistency (AI smooths multi-author writing); and reagent specificity (humans use exact molecular formulas).

How many AI-related paper retractions occurred in 2025?

620 in 2025, or 7.8% of 7,900 total retractions. That is up from 410 in 2024 and 12 in 2022. 2026 is trending toward 560 (6.7% of 8,400). True AI use is estimated to be 3-5x higher than the retracted count. The top reasons are AI-generated images, fake citations, tortured phrases, and undisclosed substantive content.

Can researchers legitimately use AI in their work?

Yes, with disclosure. ICMJE 2024 and publisher policies permit: AI for editing and grammar (no disclosure required, including ESL language polishing); literature search and brainstorming (disclose); code generation in computational research (disclose with prompt details); and figure generation only with explicit disclosure and a raw data deposit. Not permitted: listing AI as an author; AI-generated substantive content without disclosure; and AI-generated fake data or images.
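
As a rough aid, the rules in this answer can be written as a lookup table. This is a sketch only: the category keys and the function name `disclosure_requirement` are hypothetical labels, and the values paraphrase the answer above rather than any publisher's official wording.

```python
# Use categories and requirements as summarized in the answer above.
# The keys are hypothetical labels invented for this sketch.
DISCLOSURE_RULES = {
    "editing_grammar": "permitted; no disclosure required",
    "literature_search_brainstorming": "permitted; disclose",
    "code_generation": "permitted; disclose with prompt details",
    "figure_generation": "permitted; explicit disclosure plus raw data deposit",
    "substantive_content": "permitted only with disclosure",
    "ai_as_author": "not permitted",
    "fake_data_or_images": "not permitted",
}


def disclosure_requirement(use: str) -> str:
    """Look up the requirement for one category of AI use."""
    try:
        return DISCLOSURE_RULES[use]
    except KeyError:
        raise ValueError(f"unknown use category: {use!r}") from None
```

Individual journals may be stricter than this summary, so the target journal's own guidelines take precedence.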

Methodology

Publisher policy data was sourced from each publisher's 2026 author submission guidelines (Nature.com author services, Science.org/journals/about, Elsevier author hub, Springer Nature submission, IEEE author center, ACM publishing). Retraction data comes from the Crossref Retraction Watch database for 2022-2026. Detection accuracy was benchmarked against 5,000 known AI-assisted papers vs 5,000 verified pre-2022 human-only papers. Tortured phrases data comes from the Cabanac et al. database (cabanac.fr/tortured-phrases). Forensic signal weights were derived from EyeSift internal stylometric analysis.
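
The methodology does not state how accuracy was computed from the benchmark. One common reading, assumed here, is plain classification accuracy: correctly flagged AI-assisted papers plus correctly cleared human-only papers, divided by all papers. The sketch below shows that arithmetic for a 5,000 vs 5,000 set. The counts in the example call are hypothetical and chosen only to illustrate the formula.

```python
def benchmark_accuracy(true_positives: int, true_negatives: int,
                       n_ai: int = 5000, n_human: int = 5000) -> float:
    """Fraction of all benchmark papers classified correctly.

    true_positives: AI-assisted papers correctly flagged as AI.
    true_negatives: human-only papers correctly left unflagged.
    """
    if not (0 <= true_positives <= n_ai and 0 <= true_negatives <= n_human):
        raise ValueError("counts must lie within the benchmark set sizes")
    return (true_positives + true_negatives) / (n_ai + n_human)


# Hypothetical counts, not EyeSift results.
print(f"{benchmark_accuracy(4000, 3500):.0%}")
```

Because a single accuracy figure hides the split between missed AI papers and falsely flagged human papers, reporting both error rates alongside it would make the per-field numbers easier to interpret.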
