From a clinical researcher who reads citations every week - here is what citation fabrication looks like, why automated identifier checks miss the dominant pattern, and how to verify a citation properly.
1 in 277 biomedical papers in early 2026 contains at least one fabricated reference - a more than 12× increase over 2023. That finding comes from Topaz et al. (Lancet 2026), who audited 2.5 million PubMed Central articles using a pipeline called CITADEL. The trajectory of the increase - explosive, post-2023 - strongly implicates the proliferation of large language models in scientific writing.
What makes this finding load-bearing is not just the rate. It is that the dominant fabrication pattern slips through every basic citation check. The identifier resolves. The DOI is real. The PMID points to a real paper. The citation looks legitimate. But the title in the citation does not correspond to the paper that the identifier actually points to.
If you have only ever checked citations by clicking the DOI to make sure it resolves, you have been missing the most common fabrication pattern in the literature.
Topaz et al.’s Supplementary Appendix 2 publishes three illustrative cases. They are worth reading in full because each one shows a different mechanism by which a fake citation evades simple checks.
A paper on construction-industry safety in Qatar cites a study supporting its ICU-admission finding:
“Impact of enhanced safety protocols on ICU admissions in the construction industry: A longitudinal analysis” - J Doe, R Smith, J Occup Environ Med (2023), PMID 36730737, DOI 10.1097/JOM.0000000000002567.
The PMID and DOI are both real, but each points to a different real paper - and neither matches the cited title. The cited title does not exist anywhere in the indexed literature; the identifiers resolve, but they contradict each other. Both are in the right-sounding journal, which is what makes the confabulation plausible.
A diagnostic-imaging review cites a protocol paper:
“A Protocol for the Use of DMM/PTX-Induced Mouse Models of Osteoarthritis and Rheumatoid Arthritis” - E. Krustev, D. Rioux, J.J. McDougall, Current Protocols (2021), PMID 34767311, DOI 10.1002/cpz1.288.
The PMID and DOI agree with each other and resolve to the same real paper - but the resolved paper is “Three-Dimensional Fruit Tissue Habitats for Culturing Caenorhabditis elegans” (Guisnet et al., Current Protocols 2021). The cited title plausibly fuses two genuine methodologies - DMM (destabilisation of the medial meniscus, an osteoarthritis model) and PTX (pertussis toxin, a rheumatoid arthritis model) - into a protocol paper that has never been published.
A pain-research review cites a microglial paper:
“Microglial Modulation via Cannabinoid Receptor 2 Alleviates Fibromyalgia-Related Central Sensitization and Pain Hypersensitivity” - F. Chen, Y. Liu, H. Wang, X. Zhang, J. Li, K. Yang, Neuroscience (2023), PMID 36813155, DOI 10.1016/j.neuroscience.2023.02.008.
PMID and DOI both resolve to the same real paper - and again, it is something completely different: “ChatGPT in Research: Balancing Ethics, Transparency and Advancement” (Graf & Bernardi, Neuroscience 2023). The fabricated title combines three real neuroscience concepts (microglial modulation, CB2, fibromyalgia pain) into a plausible study that does not exist.
All three cases above pass the only check most researchers ever apply: click the DOI; does it resolve? The DOI resolves. The paper is real. The journal is real. The reference looks legitimate at a glance.
What gets missed is the cross-check between the cited title and the resolved title. If you do not compare what the citation says the paper is called against what the paper at that identifier is actually called, you cannot detect the dominant fabrication pattern.
This is not a problem you can solve by reading more carefully. The titles are designed by an LLM to sound like they fit the surrounding sentence. They reference concepts the reader expects in that context. Eyeballing them as plausible is exactly the failure mode the pattern exploits.
The fix is mechanical: every citation needs its claimed metadata compared against the resolved metadata at its identifier. That is what a verifier does.
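The cross-check is mechanical enough to script directly against a public registry. Here is a minimal sketch using the Crossref REST API and the DMM/PTX case from above - an illustration of the principle, not Scholar Sidekick's implementation, which consults multiple registries and handles informal title variants:

```python
# Minimal sketch of the claimed-vs-resolved title cross-check, using the
# public Crossref REST API directly - an illustration of the principle,
# not Scholar Sidekick's implementation.
import requests

def resolved_title(doi: str) -> str | None:
    """Return the title Crossref has on record for a DOI, or None."""
    r = requests.get(f"https://api.crossref.org/works/{doi}", timeout=30)
    if r.status_code == 404:
        return None  # the DOI does not resolve at all
    r.raise_for_status()
    titles = r.json()["message"].get("title", [])
    return titles[0] if titles else None

def titles_match(claimed: str, resolved: str) -> bool:
    """Crude normalised equality; a real verifier needs fuzzier matching."""
    norm = lambda s: " ".join(s.lower().split())
    return norm(claimed) == norm(resolved)

# The DMM/PTX case from Supplementary Appendix 2:
claimed = ("A Protocol for the Use of DMM/PTX-Induced Mouse Models of "
           "Osteoarthritis and Rheumatoid Arthritis")
actual = resolved_title("10.1002/cpz1.288")
print(actual)                               # the C. elegans paper, not the cited one
print(titles_match(claimed, actual or ""))  # False: the Topaz pattern, caught
```

Clicking the DOI passes; comparing the titles fails. That one extra comparison is the entire detection.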
Two presets seed the form below. Run them to see the verifier flip between Matched and Mismatch, then edit any field to test your own citation - the same call the API would make.
POST /api/verify - no authentication, free anonymous tier. Every call returns one of four verdicts (matched, mismatch, ambiguous, not_found) plus the resolved record, so you can see exactly what the identifier points to. Verification can be done at three levels of effort, from manual to automated; all of them catch the Topaz pattern when applied properly.
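A single point-of-use call, sketched in Python. The host and the request and response field names here are assumptions for illustration; the /api/verify reference is the source of truth for the actual schema:

```python
# Hypothetical single-citation check against POST /api/verify.
# BASE_URL and the request/response field names are assumptions for
# illustration; consult the /api/verify reference for the real schema.
import requests

BASE_URL = "https://scholar-sidekick.example/api/verify"  # hypothetical host

citation = {
    # Claimed metadata, exactly as it appears in the reference list
    # (the fibromyalgia/CB2 case from Supplementary Appendix 2).
    "title": ("Microglial Modulation via Cannabinoid Receptor 2 Alleviates "
              "Fibromyalgia-Related Central Sensitization and Pain "
              "Hypersensitivity"),
    "doi": "10.1016/j.neuroscience.2023.02.008",
    "pmid": "36813155",
}

resp = requests.post(BASE_URL, json=citation, timeout=30)
resp.raise_for_status()
result = resp.json()

print(result["verdict"])            # expected: mismatch
print(result["resolved"]["title"])  # the ChatGPT-ethics paper, not the cited one
```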
Topaz et al.’s CITADEL pipeline and Scholar Sidekick’s verifier are complementary, not competitive. They cover different points in the publication lifecycle and different parts of the citation surface area.
| | CITADEL (Topaz et al.) | Scholar Sidekick verifier |
|---|---|---|
| Timing | Offline, post-publication audit | Online, on-demand at write/review time |
| Source surface | PMC-XML | Live registries (Crossref, PubMed, OpenAlex, arXiv, ADS, others) |
| Identifier coverage | DOI + PMID (2 types) | DOI + PMID + PMCID + ISBN + arXiv + ISSN + ADS bibcode + WHO IRIS URL (8 types) |
| Reported precision | 91% (Topaz et al., internal benchmark) | 1.000 on a 20-entry validation set (see below) |
| Distribution | Research pipeline | Public REST API + MCP tool + (planned) web UI |
CITADEL ran a retrospective audit across 2.5 million biomedical papers. Scholar Sidekick is built to be called at the moment a citation is added - by a peer reviewer, an editor, an author cross-checking their own bibliography, or an LLM grounding its references. The methodology Topaz et al. validated at population scale is what our verifier applies at point-of-use scale, with broader identifier coverage so it works for citations CITADEL was not designed to touch (books via ISBN, ML and physics preprints via arXiv, astrophysics via ADS bibcode, institutional grey literature via WHO IRIS URL).
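To make the broader coverage concrete, here is an illustrative sketch of detecting which of the eight identifier types a raw string is. The patterns are deliberately simplified assumptions, not the verifier's actual rules:

```python
# Simplified identifier-type detection across the eight supported types.
# These regexes are illustrative sketches, not the verifier's actual rules.
import re

PATTERNS = {
    "doi":         re.compile(r"^10\.\d{4,9}/\S+$"),
    "pmcid":       re.compile(r"^PMC\d+$"),
    "pmid":        re.compile(r"^\d{1,8}$"),
    "arxiv":       re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$"),  # modern IDs only
    "issn":        re.compile(r"^\d{4}-\d{3}[\dXx]$"),
    "isbn":        re.compile(r"^(97[89][- ]?)?(\d[- ]?){9}[\dXx]$"),
    "ads_bibcode": re.compile(r"^\d{4}[A-Za-z.&]{5}[\w.]{9}[A-Z.]$"),
    "who_iris":    re.compile(r"^https?://iris\.who\.int/"),
}

def identify(raw: str) -> str | None:
    """Return the first identifier type whose pattern matches, else None."""
    value = raw.strip()
    for kind, pattern in PATTERNS.items():
        if pattern.match(value):
            return kind
    return None

print(identify("10.1002/cpz1.288"))  # doi (the DMM/PTX case above)
print(identify("36813155"))          # pmid (the fibromyalgia/CB2 case)
```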
Every quantitative claim about the verifier on this page is tied to a specific validation run. The fixture is hand-curated, immutable, and published below as evidence. The results JSON files are timestamped receipts - you can inspect them, re-run the harness, and check our numbers against your own.
Twenty hand-curated entries across five categories:
- Legitimate citations whose claimed metadata should return matched.
- Fabricated or wrong-identifier citations that should return mismatch.
- Borderline cases that should return ambiguous.
- Informally abbreviated titles, where the LLM screen should flag informal_abbreviation and upgrade the verdict to matched. These four entries are the only ones we tuned against the live verifier - they were probed to ensure they exercise the LLM-screen path. The LLM’s verdict on them is what we report.
- Invented citations that should return not_found.

All twenty entries were run against the live /api/verify endpoint.
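A sketch of how that harness run can be reproduced. The host and the fixture field names (citation, expected_verdict) are assumptions for illustration; the published fixture JSON is the source of truth:

```python
# Replay the fixture against the live endpoint and compute precision/recall.
# Host and fixture field names ("citation", "expected_verdict") are
# assumptions for illustration; the published JSON is the source of truth.
import json
import requests

FLAGGING = {"mismatch", "not_found"}  # verdicts that count as a flag

with open("validation-set-v1.json") as f:
    entries = json.load(f)

tp = fp = fn = 0
for entry in entries:
    resp = requests.post("https://scholar-sidekick.example/api/verify",
                         json=entry["citation"], timeout=30)
    resp.raise_for_status()
    flagged = resp.json()["verdict"] in FLAGGING
    bad = entry["expected_verdict"] in FLAGGING  # fixture ground truth
    tp += flagged and bad
    fp += flagged and not bad
    fn += (not flagged) and bad

precision = tp / (tp + fp) if (tp + fp) else 1.0
recall = tp / (tp + fn) if (tp + fn) else 1.0
print(f"precision={precision:.3f} recall={recall:.3f}")
```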
Recall of 1.000 in both modes is the line that matters: every actual fabrication, wrong-identifier case, and invented citation was correctly flagged. Nothing got through.
The fixture is marked immutable. When we add entries we create validation-set-v2.json and re-measure; old numbers always cite the specific fixture version they came from.
**Is this an LLM problem or a human-fabrication problem?**

Both. The fabrication pattern is the same regardless of origin: a citation pairs a real, resolvable identifier (DOI or PMID) with a title that does not correspond to the paper at that identifier. Topaz et al. note the steep increase since 2023 strongly implicates LLM authorship, but the verifier checks the structural disconnect - claimed title versus resolved title - not who wrote the citation.
**Does the verifier cover more identifier types than CITADEL?**

Yes. CITADEL (the pipeline Topaz et al. used) covers DOI and PMID - the biomedical identifier surface. The Scholar Sidekick verifier covers DOI, PMID, PMCID, ISBN, arXiv ID, ISSN, NASA ADS bibcode, and WHO IRIS URL - eight identifier types, which extends the same cross-reference methodology into books, computer-science and physics preprints, astrophysics, and institutional grey literature.
**Can I verify a whole bibliography at once?**

Today the verifier is exposed as a single-citation API at /api/verify and as a verifyCitation MCP tool. A batch web UI with .bib/.ris upload is planned (Phase 12i.4). For now, scripts can call the API per reference, as sketched below; rate limits scale with plan tier.
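Until the batch UI ships, a per-reference loop is enough. A sketch assuming a two-column CSV and the same hypothetical host and field names as above:

```python
# Batch-check a bibliography today: one /api/verify call per reference,
# throttled for the anonymous rate limit. Host, field names, and the
# one-second delay are assumptions for illustration.
import csv
import time
import requests

with open("bibliography.csv") as f:  # columns: title, doi
    for row in csv.DictReader(f):
        resp = requests.post("https://scholar-sidekick.example/api/verify",
                             json={"title": row["title"], "doi": row["doi"]},
                             timeout=30)
        verdict = resp.json().get("verdict", "error")
        if verdict != "matched":
            print(f"[{verdict}] {row['doi']} :: {row['title'][:60]}")
        time.sleep(1.0)  # stay under the anonymous rate limit
```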
**Does the verifier also check whether a paper has been retracted?**

Retraction is a different signal. A real, correctly-cited paper can still be retracted. Scholar Sidekick exposes retraction-checking at /tools/retraction-checker (Retraction Watch via Crossref). It is not wired into the verifier endpoint yet - that is a separate planned phase. If you need both signals on a bibliography today, call them separately.
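If you want the retraction signal from a script today, one option is to query Crossref directly for update notices. A sketch, assuming the standard Crossref updates filter carries the Retraction Watch data mentioned above:

```python
# Separate retraction check via the Crossref REST API, which distributes
# Retraction Watch data. A sketch: query for works that declare themselves
# updates (e.g. retractions) to the given DOI.
import requests

def is_retracted(doi: str) -> bool:
    r = requests.get("https://api.crossref.org/works",
                     params={"filter": f"updates:{doi}"}, timeout=30)
    r.raise_for_status()
    for item in r.json()["message"]["items"]:
        for update in item.get("update-to", []):
            if update.get("type") == "retraction":
                return True
    return False

# True only if a published retraction notice exists for this DOI.
print(is_retracted("10.1002/cpz1.288"))
```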
**What does verification cost?**

The /api/verify endpoint is free at the anonymous tier with a published rate limit. The LLM screen - used only when the simple verifier returns mismatch with low confidence - is gated to authenticated first-party callers and paid RapidAPI tiers, since each model call carries a real per-call cost that Scholar Sidekick pays to the model provider. We protect against runaway spend with a server-side daily cap; once that cap is hit, subsequent verifier requests fall back gracefully to the non-LLM verdict.
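In pseudocode, that gating looks roughly like the sketch below; the names and the confidence threshold are assumptions, not Scholar Sidekick's actual implementation:

```python
# Illustrative sketch of the LLM-screen gating and daily-cap fallback.
# Names and the 0.5 threshold are assumptions, not the actual implementation.
def verdict_with_llm_screen(simple, run_llm_screen, cap_has_headroom) -> str:
    """Run the LLM screen only for low-confidence mismatches, and only
    while the server-side daily spend cap has headroom."""
    low_confidence_mismatch = (simple["verdict"] == "mismatch"
                               and simple["confidence"] < 0.5)
    if low_confidence_mismatch and cap_has_headroom():
        return run_llm_screen(simple)  # may upgrade, e.g. informal_abbreviation
    return simple["verdict"]           # graceful fallback to non-LLM verdict
```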
**How were the precision and recall numbers measured?**

We hand-curated a 20-entry fixture set sourced from the Topaz et al. supplementary appendix and from independent registry lookups via Crossref, PubMed, and arXiv. We ran every entry through the live verifier and counted how many actual fabrications were flagged (recall) and how many legitimate citations stayed clean (precision). Numbers and the full fixture are published below; the JSON is immutable for v1.
**How is this different from CITADEL?**

CITADEL is offline, post-publication, PMC-XML-only, and ran retrospectively across 2.5 million papers. Scholar Sidekick is online, on-demand, available at write or review time, and covers six identifier types CITADEL does not. The two surfaces serve different points in the publication lifecycle: CITADEL audits the literature retrospectively; Scholar Sidekick checks the citation as it is being written or peer-reviewed.
Topaz M, Roguin N, Gupta P, Zhang Z, Peltonen L-M. Fabricated citations: an audit across 2·5 million biomedical papers. The Lancet. 2026;407(10541):1779-1781. doi:10.1016/S0140-6736(26)00603-3. Open access. The primary source for this page; the three illustrative cases come from its Supplementary Appendix 2.
See the /api/verify reference for full details.