Listen to Research Papers: arXiv, PMC, bioRxiv, OpenReview [Free, 2026]

My literature-review folder has a name. I call it The Graveyard. It is a flat directory full of PDFs named 1703.03400v3.pdf and 41586-2023-06004-9.pdf and anonymous-iclr2026-3847.pdf, and I promise myself every weekend I'll read them. On Sunday night I move the oldest ones to a subfolder called archive-maybe-later and tell myself this is triage. It is not triage. It is a shrine.

If you do any kind of research — academic, industrial, hobby — you probably have a graveyard too. Preprints from Friday's arXiv dump. PMC articles your advisor forwarded. A bioRxiv link someone dropped in Slack. The NeurIPS OpenReview page with two hundred accepted papers you are going to "skim the abstracts of." None of this fits in a day. None of it fits on a single screen.

Last winter I gave up trying to read them and started listening to them instead. Not as polished podcasts. Not through a $20/month AI service that summarizes and editorializes and flattens the thing into something you can't cite. Just: open the paper in my browser, click a button, hear the text in my ears while I do something else. That's the workflow. That's all of it.

The extension is CastReader. It's free. This post is about the five paper sources it handles best.

Why HTML, Not PDF

Quick detour on the format question, because it matters.

A PDF is a layout. It knows where ink goes on a page. It does not know what a paragraph is. Two-column journals, floating figure captions, hyphens at line breaks, inline citations wedged into the prose: PDF readers have to reconstruct the text flow by guessing. They guess wrong constantly, and when they guess wrong the audio sounds like someone reading a shuffled deck of cards. Paragraph from page 3. Figure caption from page 7. Footer. Page number. Acknowledgement. Nobody lasts three minutes of that.

HTML is structured text. Paragraphs are paragraphs. Headings are headings. The reading order is the order of the DOM. When a paper exists as HTML, a browser-based TTS tool reads it the same way it reads a blog post. When a paper only exists as PDF, you're still better off opening it in the browser (which extracts text on the fly) than running it through a desktop reader, but the experience is noticeably rougher.
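To make "the reading order is the order of the DOM" concrete, here is a minimal sketch using only Python's standard library. It is not CastReader's code. The set of tags treated as readable and the sample markup are assumptions chosen for illustration.

```python
from html.parser import HTMLParser

# Tags treated as spoken blocks (an assumption for this sketch).
READABLE_TAGS = {"h1", "h2", "h3", "p"}


class ReadingOrderParser(HTMLParser):
    """Collects heading and paragraph text in document order."""

    def __init__(self):
        super().__init__()
        self.blocks = []     # finished (tag, text) pairs, in DOM order
        self._tag = None     # readable tag currently open, if any
        self._chunks = []    # text pieces seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag in READABLE_TAGS and self._tag is None:
            self._tag = tag
            self._chunks = []

    def handle_data(self, data):
        if self._tag is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == self._tag:
            # Collapse line breaks and runs of spaces into single spaces.
            text = " ".join("".join(self._chunks).split())
            if text:
                self.blocks.append((tag, text))
            self._tag = None


def reading_order(html):
    """Return (tag, text) blocks in the order they appear in the markup."""
    parser = ReadingOrderParser()
    parser.feed(html)
    parser.close()
    return parser.blocks


# Made-up markup standing in for a paper's HTML version.
sample = """
<article>
  <h1>A Sample Paper</h1>
  <p>We propose an algorithm
     for a sample task.</p>
  <figure><img src="fig1.png"></figure>
  <h2>Method</h2>
  <p>The method trains <em>initial</em> parameters.</p>
</article>
"""

for tag, text in reading_order(sample):
    print(tag, "|", text)
```

Nothing here has to guess at columns or page breaks. The blocks come out in the order the markup lists them, which is the property PDF extraction lacks.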

Almost every major paper source has figured this out by now. Here's where each one sits.

arXiv

Dedicated page: /listen-to-arxiv

arXiv started rendering HTML via ar5iv and LaTeXML in 2023. By 2024 most new submissions had an HTML version; look for the "HTML" link next to "PDF" on the abstract page. Open that page, click CastReader, and you're listening.

Math gets skipped. Inline equations, display blocks, theorem environments — the TTS voice pauses briefly and moves on. For a machine learning paper where the prose describes the architecture and the equations formalize it, you still get 80% of the argument from the text alone. For a pure theory paper where every sentence references Theorem 3.2, audio isn't the right modality. Read that one with your eyes.
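One plausible way to skip math is sketched below. It assumes the HTML version marks equations up as MathML `<math>` elements, which is how LaTeXML output usually looks. The class and function names are hypothetical, and this is not necessarily how CastReader does it.

```python
from html.parser import HTMLParser


class MathSkippingParser(HTMLParser):
    """Collects text while ignoring everything inside <math> elements."""

    def __init__(self):
        super().__init__()
        self._math_depth = 0   # greater than 0 while inside a <math> element
        self._chunks = []      # prose text collected so far

    def handle_starttag(self, tag, attrs):
        if tag == "math":
            self._math_depth += 1

    def handle_endtag(self, tag):
        if tag == "math" and self._math_depth > 0:
            self._math_depth -= 1

    def handle_data(self, data):
        if self._math_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Collapse the gaps left behind by removed equations.
        return " ".join("".join(self._chunks).split())


def strip_math(html):
    """Return the prose of an HTML fragment with MathML content dropped."""
    parser = MathSkippingParser()
    parser.feed(html)
    parser.close()
    return parser.text()


# Made-up sentence with one inline equation.
fragment = (
    "<p>The loss <math><mi>L</mi><mo>(</mo><mi>x</mi><mo>)</mo></math> "
    "is minimized by gradient descent.</p>"
)
print(strip_math(fragment))
```

The sentence still parses as English once the equation is gone, which is why prose-heavy papers survive this treatment and theorem-heavy ones do not.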

I use arXiv listening mostly for triage. Twenty papers in my morning feed become twenty-minute walks where I decide which three deserve desk time.

PMC

Dedicated page: /listen-to-pmc

PubMed Central is NIH's open-access archive for biomedical full text. It's distinct from PubMed — PubMed is the search index (you mostly see abstracts), PMC is the archive (you get the complete paper). Whenever a PubMed result has a "Free full text" link, it usually points to PMC, and that's where the real reading happens.

CastReader reads the abstract, introduction, methods, results, and discussion in order. Inline [1][2] citation markers and the References list are stripped. Funding, Acknowledgements, Author Contributions — skipped. Figure and table captions are read inline because they're prose. Figure images and table data are skipped because audio can't represent them. If you need to actually look at Figure 3, the paragraph highlighting tells you where you are in the paper so jumping back to the visual is one glance.
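The filtering described above can be approximated in a few lines. This is a sketch under assumptions: the paper has already been split into (heading, body) pairs, citations are bracketed numbers, and the skipped headings match by exact name. `spoken_text` and both constants are hypothetical names, not CastReader's API.

```python
import re

# Bracketed numeric citations such as [1], [2,3] or [4-6],
# together with any whitespace right before them.
CITATION = re.compile(r"\s*\[\d+(?:\s*[-,]\s*\d+)*\]")

# Back-matter headings that are not read aloud (list taken from the text above).
SKIPPED_SECTIONS = {"references", "funding", "acknowledgements", "author contributions"}


def spoken_text(sections):
    """Filter (heading, body) pairs down to what should be spoken."""
    spoken = []
    for heading, body in sections:
        if heading.strip().lower() in SKIPPED_SECTIONS:
            continue
        spoken.append((heading, CITATION.sub("", body)))
    return spoken


# Made-up paper content for illustration.
paper = [
    ("Introduction", "Prior work [1][2] reported similar effects [3, 4]."),
    ("Results", "Figure 3 shows the dose response."),
    ("Funding", "Supported by a hypothetical grant."),
    ("References", "1. Example reference."),
]

for heading, body in spoken_text(paper):
    print(heading + ":", body)
```

Author-year citations such as (Smith et al., 2020) would need a different pattern. The numeric style is the one the paragraph above mentions.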

This is the workflow that converted me on biomedical listening. You can get through the argument of a 6,000-word open-access paper in about 25 minutes at 1.5x speed. That's the time it takes to make dinner.
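The 25-minute figure is simple arithmetic once you pick a baseline speaking rate. The sketch below assumes 160 words per minute at 1x, which is my assumption rather than a measured CastReader number. Real TTS voices vary.

```python
def listening_minutes(words, speed, base_wpm=160):
    """Minutes needed to hear `words` words at a playback `speed` multiplier.

    base_wpm is the assumed speaking rate at 1x speed.
    """
    return words / (base_wpm * speed)


print(listening_minutes(6000, 1.5))  # 6000 / (160 * 1.5) = 25.0
print(listening_minutes(6000, 1.0))  # 37.5
print(listening_minutes(6000, 2.0))  # 18.75
```

At 1.5x the effective rate is 240 words per minute, so 6,000 words take 25 minutes. A slower baseline voice pushes that toward half an hour.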

Europe PMC

Dedicated page: /listen-to-europepmc

Europe PMC is the European mirror of PMC, plus it adds European biomedical sources that PMC doesn't index. It's a Vue SPA with a lazy-loaded accordion: the full-text section doesn't appear in the DOM until the user expands it. CastReader waits for that section to render and then reads the JATS structure the same way it reads PMC. Both PMC ID (/article/PMC/PMCxxxxxxx) and MED ID (/article/MED/xxxxxxx) URL schemes work.
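For the two URL schemes, a recognizer might look like the sketch below. The accepted hostnames, the function name, and the sample IDs are assumptions for illustration. Only the two path shapes come from the paragraph above.

```python
import re
from urllib.parse import urlparse

# Path shapes named in the text: /article/PMC/PMCxxxxxxx and /article/MED/xxxxxxx
ARTICLE_PATH = re.compile(r"^/article/(PMC|MED)/([A-Za-z0-9]+)/?$", re.IGNORECASE)

# Hostnames accepted by this sketch (an assumption).
EUROPE_PMC_HOSTS = {"europepmc.org", "www.europepmc.org"}


def parse_europepmc_url(url):
    """Return (source, identifier) for an article URL, or None if it is not one."""
    parsed = urlparse(url)
    if parsed.hostname not in EUROPE_PMC_HOSTS:
        return None
    match = ARTICLE_PATH.match(parsed.path)
    if match is None:
        return None
    source = match.group(1).upper()
    identifier = match.group(2).upper()
    # PMC IDs carry a "PMC" prefix followed by digits; MED IDs are plain digits.
    if source == "PMC" and not re.fullmatch(r"PMC\d+", identifier):
        return None
    if source == "MED" and not identifier.isdigit():
        return None
    return source, identifier


# The IDs below are placeholders, not real articles.
print(parse_europepmc_url("https://europepmc.org/article/PMC/PMC1234567"))
print(parse_europepmc_url("https://europepmc.org/article/MED/12345678"))
print(parse_europepmc_url("https://europepmc.org/search?query=sleep"))
```

Anything that is not an article page returns None, so a caller can fall back to generic page reading.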

If your lab has a Europe PMC collection, this is the same experience as PMC: abstract, introduction, methods, results, discussion. Clean audio, no citation noise, paragraph highlighting.

bioRxiv + medRxiv

Dedicated page: /listen-to-biorxiv

bioRxiv and medRxiv are the preprint servers for life sciences and medicine. They're both built on the HighWire Press platform with JATS rendering, so the same extractor handles both. Open the .full URL — not the abstract-only page — click CastReader, and it reads the preprint top to bottom.
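If you collect links in bulk, turning an abstract link into its .full counterpart is easy to script. A minimal sketch, assuming preprint pages live under /content/ on the hosts listed below. The helper name and the sample DOI are made up.

```python
from urllib.parse import urlsplit, urlunsplit

# Hosts this sketch treats as preprint servers (an assumption).
PREPRINT_HOSTS = {"www.biorxiv.org", "biorxiv.org", "www.medrxiv.org", "medrxiv.org"}


def full_text_url(url):
    """Return the .full variant of a preprint URL, or None for other pages."""
    parts = urlsplit(url)
    if parts.hostname not in PREPRINT_HOSTS:
        return None
    if not parts.path.startswith("/content/"):
        return None
    path = parts.path.rstrip("/")
    if not path.endswith(".full"):
        path += ".full"
    # Query string and fragment are dropped so the result is a clean link.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


# Placeholder DOI, not a real preprint.
print(full_text_url("https://www.biorxiv.org/content/10.1101/2024.01.01.000000v1"))
```

The function is idempotent, so a link that already ends in .full passes through unchanged.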

One thing to say about preprints specifically: they haven't been peer-reviewed. Use them for speed, not for final certainty. The upside of listening is that you can churn through a week of preprint backlog in an evening and identify the three that are worth deep-reading once the reviewed version lands. That's a legitimate use of preprints and a legitimate use of audio.

For preprints that are PDF-only (which is rare on bioRxiv but does happen), see the PDF listening page.

OpenReview

Dedicated page: /listen-to-openreview

OpenReview hosts submissions for conferences such as NeurIPS, ICLR, ICML, COLT, and RLC, for the journal TMLR, and for a long tail of workshops. The forum page (openreview.net/forum?id=...) shows the abstract, authors, keywords, and the review threads. CastReader reads the paper metadata cleanly, skipping the status badges, review tabs, and author-affiliation popups that make React SPAs noisy.
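When building a triage list, it helps to tell forum pages apart from other OpenReview links. A small sketch, assuming the forum URL shape shown above. The function name and the sample IDs are hypothetical.

```python
from urllib.parse import parse_qs, urlparse


def forum_id(url):
    """Return the id of an openreview.net forum page, or None for other URLs."""
    parsed = urlparse(url)
    if parsed.hostname != "openreview.net" or parsed.path != "/forum":
        return None
    ids = parse_qs(parsed.query).get("id")
    return ids[0] if ids else None


# Placeholder ids, not real submissions.
print(forum_id("https://openreview.net/forum?id=abc123XYZ"))
print(forum_id("https://openreview.net/pdf?id=abc123XYZ"))
```

The second call returns None because /pdf links point at the PDF body rather than the forum page with the abstract.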

Full paper bodies on OpenReview are distributed as PDFs. For those, download the PDF and use the PDF listening page or the CastReader desktop app. But for the abstract-triage problem — "which of the 487 ICLR submissions should I actually read?" — listening to a block of abstracts while walking is transformative. I go into poster sessions with a real shortlist instead of a feeling of generalized panic.

Review thread reading is planned for a future release. For now, if you want the reviewer takes, open them visually.

What About the Long Tail

This post covers the five sources with the highest volume and the cleanest HTML. There are dozens more. Semantic Scholar for metadata and linked abstracts. ResearchGate for researcher-shared uploads. Journal publisher sites (Nature, Cell, Science, PLOS, Springer) where individual papers can be read if the full text is accessible. Generic web pages — blog posts, institutional reports, lab wikis — everything CastReader handles for non-academic content also works for these.

The CastReader philosophy is: if the content exists as HTML in the DOM, we read it. If it exists as a PDF you open in your browser, we read what the browser extracts. If it's locked behind a paywall you can't access, we don't pretend to have scraping magic — nobody does.

The Listening Habit

Here is what actually changed for me after six months of paper listening.

The graveyard folder shrank. Not because I read everything (I don't), but because I triaged everything. I know what each paper claims. I know which ones I need to return to. The anxiety of unopened PDFs was mostly the anxiety of not knowing. Listening replaces unknown with known-and-filed.

My commute turned productive. Thirty-five minutes each way. That's seventy minutes daily of focused academic listening at 1.5x speed. About four abstracts-plus-introductions. Over a week that's twenty papers of light coverage — roughly the literature-review pass I used to procrastinate on for a month.

My eyes lasted longer. Paper deadlines used to end with me staring at a screen at 2am trying to read one last reference. Now the last reference gets listened to while I make tea, and the final hours of focused writing happen with a mind that isn't screen-saturated.

The one thing listening can't replace: the close-reading pass on the paper that matters most. When I find a paper that's going to shape my work, I still sit down and read it with a pen and a notebook, marking equations, underlining claims, writing margin notes. Audio is the filter. Eyes are the scalpel.

How to Start

  • Install CastReader from the Chrome Web Store. It's free. No signup. It's also on Edge.
  • Open a paper on any of the five platforms. For arXiv use the HTML link, for bioRxiv use the .full URL, for others the forum or article page.
  • Click the CastReader icon. Listen.
  • Adjust speed with the floating player. I default to 1.5x for familiar topics, 1.1x for dense methods sections, 2x for triage.

No credits. No quotas. No tier. The whole point of CastReader is that the cost-to-try on any given paper is zero, so papers stop being heavy objects that you avoid and become lightweight background that fills your walks and chores.

The graveyard doesn't have to be a graveyard.
