Open access and citations – do we know anything for sure?


March 23, 2016

For 74 Taiwanese researchers, the mean citation score is 1.29 times higher for their works published in open access journals than for those published in traditional venues. It would be good to confirm this result on a large sample of researchers working in various countries and disciplines.

Citations are the currency of modern, competitive academia. Authors want them, journal editors and publishers want them for their authors, heads of faculties want them for their researchers. “Can open access bring more citations to this particular article?” – this is one of the questions which is frequently asked by representatives of each of these groups.

Intuitively, posting an article online, so that it can be accessed by anyone with an Internet connection, should generate more readership for the paper. This in turn should result in more citations than for articles hidden behind a paywall. Greater citation scores therefore seem to be a logical consequence of openness.

However, when talking about this problem, we should remember that readership does not translate automatically into citations. We can imagine a paper that has a big readership and a small citation score, as well as a paper that has more citations than views (yes, some researchers refer to papers that they have not read). In general, we know that review articles get more citations than original studies, and papers introducing new methods get more citations than those reporting new results. So not every paper with the same readership will have the same citation score. Readership does not determine citations directly, but it is an almost necessary condition for being cited. This makes a possible open access citation advantage an even more interesting subject of research.

46 times “yes”

The hypothetical open access citation advantage has been tested by at least 70 peer-reviewed studies to date (a very good summary of these studies can be found on the SPARC/Europe website), with 46 confirming it, 17 denying it and 7 inconclusive. Only 10 of these studies excluded self-citations; among them, 7 found a citation advantage for open access and 3 denied it. A positive influence was confirmed by studies on works from various fields, including Humanities, Social Sciences, Natural Sciences, Engineering and Mathematics (which is clearly noted in the summary mentioned above). So is it certain that this influence exists? Is it part of the current consensus among researchers? If so, how did the studies written by “deniers” even get published?
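To put these counts in proportion, here is a minimal Python sketch that turns them into percentages. The counts are the ones quoted above; the variable and function names are my own, chosen only for illustration.

```python
# Study counts as quoted in the text (from the SPARC Europe summary).
all_studies = {"confirming": 46, "denying": 17, "inconclusive": 7}
# Subset of studies that excluded self-citations.
no_self_citations = {"confirming": 7, "denying": 3}


def shares(counts):
    """Return each outcome's share of the total, in percent (one decimal)."""
    total = sum(counts.values())
    return {outcome: round(100 * n / total, 1) for outcome, n in counts.items()}


all_shares = shares(all_studies)
strict_shares = shares(no_self_citations)

print(sum(all_studies.values()), all_shares)
print(sum(no_self_citations.values()), strict_shares)
```

Roughly two thirds of all the studies, and seven of the ten that excluded self-citations, report an advantage. A vote count like this says nothing about the quality or size of the individual studies.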

I think it is a good idea to have a closer look at this problem and examine the arguments from both sides. And if you think that analysing open access influence on citations is an easy task, unfortunately I have to warn you that as usual, the reality is not so simple.

The simplest is not the best

There are big databases that store information about citations in research papers and compare the average citation scores of journals. A quick look at these databases is enough to realise that in the majority of research fields, the most cited journals are not open (although in some fields there are open access venues challenging the top of the rankings). Fully open access journals are usually quite new, and most of them are not as glamorous to researchers as established serials, the majority of which are toll access. Glamorous, paywalled journals attract well-known authors, who lift the journal’s average citation scores. So it is pointless to treat a comparison of journals’ impact per paper as a way to estimate a possible open access citation advantage.

The journal centred approach

The majority of studies concentrate on comparing the results of open vs toll access articles published in the same volume of the same journal. This is possible in two cases. The first is so-called hybrid open access, where the author can pay an Article Processing Charge to open his or her article published in a traditional venue. The second is “green open access”, the self-archiving of works published in traditional journals in open access repositories.

Hybrid open access is expensive for authors, so it is used mostly by researchers with external funding, who are sometimes even required to publish their output openly. Articles published this way are more likely to come from a small percentage of high-profile authors with access to external funding, and should not be used in a simple comparison with other works published in the same journals.

Different forms of green

Green open access is the model most frequently considered by studies on the open access citation advantage. Multiple studies compare articles published in the same journal and volume to determine whether those with green open access copies are cited more. But green OA in fact consists of several phenomena. In some disciplines it is popular to publish a non-reviewed version of the work (a pre-print). The pre-print of an article is therefore openly available months before the formal publication of its peer-reviewed version. Some authors also archive the peer-reviewed versions of their works, when this is allowed by the publisher, and these versions are usually available online after an embargo period.

The selection bias and the early advantage

Analysing the influence of green OA copies on citations seems to be an interesting approach to the problem, but several questions emerge around this topic. The first is whether authors tend to self-archive their better papers, in which case the reported correlation between green open access and citations could be a false positive originating from a difference in quality (this problem is called “the selection bias”). The second problem is the difference in publication time. If some green open access copies are pre-prints, they might become available much earlier than the officially published articles, and earlier than articles without an open access copy published in the same venue. This difference may cause a difference in citation patterns (this is called “the early advantage”).

Among studies analysing the importance of these two problems, the existence of an “early advantage” was confirmed by 5 of 6. This is quite an important argument for the “deniers”: at least some of the positive effect of open access on citations found in other studies was a result of the early advantage, not of openness itself, and so it does not apply to all types of open access. Selection bias was confirmed by 3 of the 5 studies analysing this problem, so it may also be a source of error in some studies on green open access. Analyses claiming a very high citation advantage for green open access may present numbers inflated by factors other than openness itself.

Can we say something about gold OA by researching green?

On the other hand, two other factors may limit the potential influence of green open access on citation scores, and these do not apply to other OA publishing models. Firstly, in some cases the copy archived in a repository might be harder for its potential audience to find than the article itself, so the influence on citations might be smaller than if the article were open on the publisher’s website. The second factor is the previously mentioned difference between the version published in an OA repository and the one on the publisher’s website. These differences may make both citing the paper and counting citations more challenging. Therefore, results of studies on green open access should not be directly extrapolated to other models.

As usual, more research is needed

In my opinion, the best way would be to compare the performance of papers published by the same authors in open access journals with that of their works in traditional venues. This has already been done, but only by a few small studies. The most comprehensive one concerns the publications of 74 authors from Taiwan working in Library and Information Sciences. According to this research, the mean citation score was 1.29 times higher for open access papers by these researchers than for their works published in traditional venues. What is more, according to this study, the positive influence of openness was bigger for papers written in Chinese than for ones published in English. These are interesting results, and I think it would be good to confirm them on works from different disciplines and from other countries.
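To make the idea of a same-author comparison concrete, here is a minimal Python sketch. The records are made up purely for illustration, and the calculation (a pooled mean over authors who have published in both kinds of venue) is my assumption about one simple way to do it, not a description of how the Taiwanese study computed its figure.

```python
from statistics import mean

# Hypothetical records: (author, venue type, citation count). Not real data.
papers = [
    ("author_a", "open", 12), ("author_a", "toll", 8),
    ("author_a", "open", 6), ("author_b", "toll", 10),
    ("author_b", "open", 9), ("author_b", "toll", 4),
    ("author_c", "toll", 30),  # no open access paper, so excluded below
]


def mean_citation_ratio(records):
    """Mean citations of open papers divided by mean citations of toll
    papers, using only authors who have published in both venue types."""
    venues_by_author = {}
    for author, venue, _ in records:
        venues_by_author.setdefault(author, set()).add(venue)
    both = {a for a, v in venues_by_author.items() if {"open", "toll"} <= v}

    open_counts = [c for a, v, c in records if a in both and v == "open"]
    toll_counts = [c for a, v, c in records if a in both and v == "toll"]
    return mean(open_counts) / mean(toll_counts)


ratio = mean_citation_ratio(papers)
print(f"open/toll mean citation ratio: {ratio:.2f}")
```

Restricting the comparison to authors with papers in both groups is what keeps author reputation roughly constant. A real analysis would also need to control for publication year and field, which this sketch ignores.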

