The citation problem nobody talks about

Author: TurtleTech ehf.

[Figure: OokCite: DOI validation and citation formatting flow]

The format treadmill

A thesis goes through LaTeX with BibTeX. Consistent style, correct DOIs, clean output. Then MethodsX wants Word. A book chapter needs Chicago author-date instead of numeric. CSL (Citation Style Language) gets close but not exact. Every manual fix in one output format introduces a regression in another.

This cycle eats hours that should go toward actual research. The problem recurs in every research group, and nobody treats it as a problem worth solving properly. It just sits there, consuming time.

Adware masquerading as tools

Citation generators exist. Citefast, MyBib, and the rest format references from DOIs or titles. They also plaster every page with ads, inject tracking scripts, and treat search queries as advertising inventory.

The transaction feels wrong. Formatting a DOI in APA style takes one HTTP call to CrossRef and a CSL processor. The infrastructure cost rounds to zero. The ad-to-utility ratio suggests a different business model than “helping researchers.”
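As a rough illustration of how little is involved, the sketch below builds (but does not send) a request for an APA-formatted reference using DOI content negotiation at doi.org. That route hands the CSL step to the resolver's formatting service instead of running a local processor, so it is one possible shape of the call, not a description of how any particular generator works. The function name and the example DOI are placeholders.

```python
from urllib.parse import quote
from urllib.request import Request


def build_citation_request(doi: str, style: str = "apa") -> Request:
    """Build (without sending) a DOI content-negotiation request.

    The Accept header asks the resolver for a formatted bibliography
    entry in the given CSL style instead of the publisher landing page.
    """
    # Keep "/" unescaped: it separates the DOI prefix from the suffix.
    url = "https://doi.org/" + quote(doi.strip(), safe="/")
    return Request(url, headers={"Accept": f"text/x-bibliography; style={style}"})


# Hypothetical usage; the DOI below is a placeholder, not a real record.
req = build_citation_request("10.1234/example.5678")
# urllib.request.urlopen(req).read().decode("utf-8") would perform the call.
```

Nothing here needs an account, an ad network, or a tracking script.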

Language models make it worse

The temptation to ask a chatbot for citation help has an ugly failure mode.

A 2023 study in Scientific Reports checked 300 ChatGPT-generated citations and found that 32.3% were fabricated. Not wrong dates or transposed initials: entirely nonexistent papers. The fabricated references used real author names active in the field, properly formatted DOIs, and genuine journal titles. The papers themselves never existed.

A Deakin University study of GPT-4o found 56% of generated citations contained fabrications or substantive errors. The problem is not confined to academia. In Mata v. Avianca (2023), lawyers submitted a brief with chatbot-generated case law. The cases didn’t exist. The court fined them $5,000.

The mechanism: author names, journal titles, volume numbers, and page ranges follow predictable statistical patterns. Language models generate text that matches those patterns without verifying whether the specific combination refers to a real publication. A fabricated citation can propagate through the literature for years before anyone checks the DOI.
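A small sketch makes the gap concrete. A syntax check in the spirit of the pattern Crossref has recommended for modern DOIs accepts any well-formed string, real or invented. The pattern is an approximation and the example identifier is made up for illustration. Passing this check says nothing about whether a record exists.

```python
import re

# Pattern modeled on Crossref's published recommendation for modern DOIs;
# treat it as an approximation, since some legacy DOIs do not match it.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/[-._;()/:A-Z0-9]+$", re.IGNORECASE)


def looks_like_doi(candidate: str) -> bool:
    """Return True if the string is shaped like a DOI. Existence is not checked."""
    return DOI_PATTERN.match(candidate.strip()) is not None


# An invented identifier that is perfectly well-formed.
invented = "10.1234/jfake.2023.00042"
print(looks_like_doi(invented))      # shape check passes
print(looks_like_doi("not-a-doi"))   # shape check fails
```

Only a lookup against a registry or index can tell the two cases apart.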

OokCite

We built OokCite around a simple constraint: every DOI gets validated against our own search index, backed by CrossRef, before formatting. If the DOI doesn’t resolve to a real record, you get an error, not invented metadata.
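The constraint can be sketched as a gate in front of the formatter. This is an illustration only, not OokCite's code: the in-memory `INDEX`, the `UnknownDOIError` class, the `format_reference` function, and the sample record are all hypothetical stand-ins for a real index and a CSL processor.

```python
class UnknownDOIError(LookupError):
    """Raised when a DOI has no record in the index."""


# Hypothetical stand-in for a search index; the entry is made-up sample data.
INDEX = {
    "10.1234/example.5678": {
        "author": "Doe, J.",
        "year": 2020,
        "title": "An example record",
        "journal": "Journal of Examples",
    },
}


def format_reference(doi: str, index: dict = INDEX) -> str:
    """Format a reference only if the DOI resolves to a known record."""
    key = doi.strip().lower()  # DOIs are case-insensitive
    record = index.get(key)
    if record is None:
        # Refuse to guess: no record means no output.
        raise UnknownDOIError(f"No record found for DOI {doi!r}")
    # Toy formatter; a real system would hand the record to a CSL processor.
    return (f"{record['author']} ({record['year']}). {record['title']}. "
            f"{record['journal']}. https://doi.org/{key}")
```

The design choice is that a failed lookup raises instead of returning a best guess, so a fabricated DOI cannot come back as a tidy-looking reference.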

2,900+ CSL styles. Paste a DOI, pick a style. No ads, no tracking. The free tier handles 30 lookups per day.

The name pays homage to the Librarian from Terry Pratchett’s Discworld.[1] An orangutan who runs the Unseen University library, catalogs every volume, and maintains the chains that keep the more dangerous books from escaping. He communicates almost entirely by saying “Ook.” If anyone understands the importance of getting references right, it’s him.

OokCite integrates with Ridley, so you can format citations on your phone while reading a paper. Try it at ookcite.turtletech.us.


  1. He turns people who call him a monkey into something regrettable.