In my previous post, I briefly mentioned Ethan Perlstein’s two posts on publishing in the era of open science (part one and part two). I just wanted to quickly highlight something that I haven’t paid much attention to, but perhaps should have: the importance of what happens to a paper after publication. In his post, Perlstein mentions the ratio of HTML views to PDF downloads, and how he used it as a proxy for the quality of readership (i.e. expert, academic readers versus non-expert readers):
Some of you may be miffed by my apparent conflation of site traffic and readership. Without more sophisticated analytics, I confess that it’s difficult to gauge the background of readers (scientists vs. non-scientists), or how much of the paper they’re actually reading (abstract vs. full text). However, the ratio of HTML views to PDF downloads (HTML/PDF) may be informative here. After the initial surge, HTML/PDF was 1 in 20 and remained there until the second surge, after which it fell to 1 in 30. If we assume that PDF downloads are a proxy for “expert” readership, then the second surge diluted quality readership. Conversely, the third surge lifted the ratio to 1 in 15, with as many as 20% of readers, many of whom were presumably academics, on Day 29 choosing to download a PDF version of the paper.
It certainly rings true for me that I only download the PDFs of papers I intend to reference and make use of on more than one occasion. That said, my intentions sometimes verge on the completely unrealistic, and I’ll download a large number of PDFs that don’t survive past a cursory glance. I guess, for me at least, there is a stronger motivation behind my downloading of PDFs: the cluttered design of many journals’ HTML pages simply drives me insane. Rather than any in-built preference for reading in PDF, the need to get away from the HTML page is more than enough motivation for me to download a small document. Still, with better online reading and annotating tools just around the corner, I wonder how long this variable will hold as a viable proxy?
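For anyone who wants to try this on their own article-level metrics, here is a minimal Python sketch of the arithmetic behind the ratio. It assumes that "1 in 20" means one PDF download for every 20 HTML views, which is my reading of Perlstein's wording rather than something he spells out. The function names and the view and download counts are hypothetical, picked only so that they reproduce the ratios quoted above; they are not his data.

```python
def pdf_share(html_views: int, pdf_downloads: int) -> float:
    """Fraction of HTML views matched by a PDF download (e.g. 0.05 for '1 in 20')."""
    if html_views <= 0:
        raise ValueError("html_views must be positive")
    return pdf_downloads / html_views


def one_in_n(html_views: int, pdf_downloads: int) -> float:
    """The same quantity expressed as 'one PDF download in every N HTML views'."""
    if pdf_downloads <= 0:
        raise ValueError("pdf_downloads must be positive")
    return html_views / pdf_downloads


# Hypothetical counts (NOT Perlstein's data), chosen to match the quoted ratios.
surges = {
    "after first surge": (2000, 100),   # 1 in 20
    "after second surge": (6000, 200),  # 1 in 30
    "after third surge": (3000, 200),   # 1 in 15
}

for label, (html_views, pdf_downloads) in surges.items():
    n = one_in_n(html_views, pdf_downloads)
    share = pdf_share(html_views, pdf_downloads)
    print(f"{label}: 1 in {n:.0f} ({share:.1%} of HTML views)")
```

On this reading, a lower N means a larger share of readers bothered to download the PDF, so the 20% figure for Day 29 would correspond to 1 in 5.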
In short, I certainly think more research needs to be done into readership, with one aim being to develop metrics that accurately capture expert and non-expert audiences.