The digital landscape has brought new opportunities and challenges to the evaluation of research across different academic disciplines. How does open science influence this already complex landscape? Are there differences in how research is evaluated across disciplines? Milena Dobreva* looks at historical developments and then at some recent projects which seek to better understand the changes which open science is bringing to research evaluation.
The monitoring and measuring of academic outputs developed gradually and took shape in the 1960s, when Scientometrics consolidated as a knowledge area. Our timeline illustrates some of the developments in this domain.
The predecessors of Scientometrics mostly explored how to predict publication activity within a particular area. In 1926, Alfred J. Lotka proposed a formula assessing the frequency of publication across a research domain. While it models the common-sense scenario in which a small number of researchers publish a large number of articles and many scholars publish only a few, the formula was one of the first steps into the domain of modelling and predicting the flow of scientific publications.
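Lotka's inverse-square law can be sketched in a few lines of code. The figures below are purely illustrative, and the exponent of 2 is the approximate value Lotka reported for his data rather than a universal constant:

```python
def lotka_authors(n, single_paper_authors, exponent=2.0):
    """Expected number of authors publishing exactly n papers,
    per Lotka's law: Y(n) = C / n**exponent, with exponent ~ 2."""
    return single_paper_authors / n ** exponent

# A hypothetical field with 1,000 single-paper authors: roughly
# 250 authors publish two papers, ~111 publish three, ~10 publish ten.
counts = {n: round(lotka_authors(n, 1000)) for n in (1, 2, 3, 10)}
```

The steep drop-off is exactly the common-sense pattern described above: a small elite accounts for most of the publication activity.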
This development was followed by Samuel Bradford’s law, which suggests that in any domain journals can be split into three groups: a small number of core journals attracts a third of the articles on a given subject; another set of journals without such a central role publishes a further third; and the remaining third of publications is spread across a large set of remaining journals. The number of core journals differs across domains. The law demonstrates that a researcher looking for the best publications in their domain would find a third of them in some 5-10 journals, with the rest of the good-quality publications spread across less popular journals. It also has implications for assembling such sets of journals, e.g. in online journal access bundles, which are expected to answer the needs of researchers across various disciplines.
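Bradford's law is often stated as a geometric progression: if the core zone contains k journals, the next two zones contain roughly k·n and k·n² journals, with each zone yielding about one third of the articles. A minimal sketch, with the core size and multiplier chosen purely for illustration:

```python
def bradford_zone_sizes(core_journals, multiplier):
    """Journal counts for Bradford's three zones (ratio 1 : n : n**2),
    each zone contributing roughly one third of the articles."""
    return [core_journals * multiplier ** zone for zone in range(3)]

# A hypothetical field with 8 core journals and a Bradford multiplier of 4:
# a third of the articles sits in 8 journals, the next third in 32,
# and the final third is scattered across 128 peripheral journals.
zones = bradford_zone_sizes(8, 4)
```

The widening zones capture why chasing comprehensive coverage of a subject means subscribing to ever more journals for ever fewer relevant articles.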
While these models are based on mathematical formulae, collections of bibliographic data in digital form, which had grown sufficiently by the 1960s, allowed the leap from generic predictions to tracing the visibility of particular publications, authors, journals, or research groups.
Harnessing the power of citations
The gradual adoption of computer technologies and the availability of bibliographic data in digital form enabled a new development which is considered a major milestone for modern Scientometrics. In 1960 Eugene Garfield founded the Institute for Scientific Information (ISI), offering a novel citation indexing service (SCI, Science Citation Index) for papers published in academic journals. As the service advanced, the social sciences and humanities began to be covered by the dedicated Social Sciences Citation Index (SSCI) from 1972 and the Arts & Humanities Citation Index (AHCI) from 1978.
New metrics, novel evaluation frameworks, and the changing roles of stakeholders
In recent years, the evaluation of academic work has been undergoing intensive change. Stakeholders increasingly find that their traditional roles are being redefined as a growing share of scientific communication happens in the digital space. Policy makers and funders, who traditionally introduced and assessed indicators to analyse the influence of their policies on academic performance, now have to create novel frameworks integrating indicators which capture the ongoing change and its consequences. Higher education institutions strive to collect evidence of research excellence which would help attract not only more students but also larger research funding. Academic publishers are undergoing substantial changes in business models to benefit from the digital environment, but also need to factor in the changing research evaluation culture. Last but not least, scholars are keen to see how influential their work is compared to that of their peers, and have to look for more indicators and test which ones really capture substance rather than mere visibility. The last few decades have seen a substantial expansion in new metrics for evaluating research impact, and this has created a favourable climate for new intermediaries to emerge, mostly commercial companies implementing different metrics within the research evaluation landscape.
The wide array of methodologies and analysis techniques available to evaluators presents both an opportunity and a challenge. While practising evaluators have an ever-growing collection of methodologies from which to choose, those seeking to take stock of recent research in order to identify the evaluation methodologies appropriate for a given situation face a daunting task.
This shift in stakeholder roles also needs to accommodate the advance of open science, where publications, research data, and methodologies are all offered openly to researchers and to the wider citizen community. Open science also aims to integrate citizen researchers into scholarly activities. This is yet another change in the evaluation landscape, first because open access publications and their reputation are still blending into citation-based metrics, where the well-established journals hold their positions. If we look again at Bradford’s law, one area which still needs to be studied is whether open access journals manage to enter the small set of core journals in each research domain which account for a third of its publications, or whether they still play the role of absorbing the less popular articles.
Redefining the evaluation frameworks
One noticeable tendency in recent years is that different communities of academics are proposing manifestos appealing for a change in research evaluation practices.
The San Francisco Declaration on Research Assessment (DORA) was initiated in 2012 by the American Society for Cell Biology (ASCB) jointly with a group of editors and publishers of scholarly journals. It includes a set of recommendations aiming to improve the ways in which scholarly outputs are evaluated. Its general recommendation, aimed at all stakeholders, takes a strong position against the use of some quantitative metrics: “Do not use journal-based metrics, such as Journal Impact Factors, as a surrogate measure of the quality of individual research articles, to assess an individual scientist’s contributions, or in hiring, promotion, or funding decisions”. Currently the Declaration has over 12,000 individual supporters and some 900 institutional supporters; only about 6% come from the Humanities, with the majority of supporters coming from the Sciences.
Is the low number of supporters from the Humanities indicative of a slower engagement of this community with the ongoing debate on evaluation of research outputs? Or is it rather indicative of the separation between the networks of scholars in the Humanities and the Sciences?
The Leiden Manifesto of 2014 suggests as its first principle that “Quantitative evaluation should support qualitative, expert assessment”. It also calls for openness and transparency of the evaluation process, and for taking into account the “variation in publication and citation practices” across domains. In general, the spirit of the manifesto supports open science by contributing an openness dimension to evaluation; it also indicates that differences between domains have to be taken into account.
But how do these differences manifest? How much do we know about them? And in particular, how can researchers working in interdisciplinary units be assessed? An active computer scientist working in a digital humanities team – and I have observed such situations myself – could achieve quantitative metrics which outstrip those of many colleagues who come from the Humanities and publish in their ‘native’ journals. Conversely, a linguist working in a computer science team would constantly be in danger of under-delivering on the metrics applied nowadays.
However, the communities of scholars interested in research evaluation and in the best methods for achieving objective and informative evaluations have still not developed any universally accepted models, especially within the Humanities and Social Sciences.
I would like to bring to the attention of the readers of this blog two ongoing networks in Europe which are addressing the specific requirements and opportunities of the Humanities and Social Sciences within the research evaluation domain.
ENRESSH: a systematic look at the changing landscape of evaluation in the Social Sciences and the Humanities
The “European Network for Research Evaluation in the Social Sciences and the Humanities” (ENRESSH) is a COST action which started in 2016 and will run over a four-year period. It focuses on mapping European evaluation practices and identifying best practices.
ENRESSH has the ambition to look at various methods, including qualitative and quantitative evaluation (expert and peer review, indicator-based evaluation, case studies and narratives). It will explore how evaluation relates to journal ratings, publisher rankings, and book scoring, and how it is used within Google Scholar and other web-based resources. Although the action only started recently, it is already seeking viable ways of engaging policy makers from the various countries taking part in the network. The network will organise events and will also periodically offer the opportunity to apply for a short-term scientific mission exploring particular areas of interest.
KNOWeSCAPE: “Same data, different results”
KNOWeSCAPE – Analyzing the dynamics of information and knowledge landscapes – is another COST action, implemented between 2013 and 2017. Although it is not specifically focused on research evaluation, it has hosted activities which looked at ways of analysing and visualising various research metrics. One such example is the short-term scientific mission of Rob Koopman from OCLC, Leiden, to the Department of Library and Information Science, Humboldt University Berlin, a visit hosted by Frank Havemann. The aim of the visit was to look at alternative ways of clustering documents based on citation links and semantic indexing. Since the elements analysed in bibliometrics (papers, journals, authors, and nowadays web resources) are thematically heterogeneous, this research also addresses the essential question of whether the same results are obtained over the same set of resources when the same algorithm is run at different times, or when different algorithms evaluating identical metrics are applied. While this research is not directly focused on evaluation in the Social Sciences and the Humanities, it illustrates one aspect of modern Scientometrics which needs further research and implementation.
Conclusive remarks: The taxonomy tree of FOSTER and the way ahead
FOSTER (Facilitate Open Science Training for European Research) is a two-year EC-funded project which was completed in 2016. Besides its training activities and open educational resources, the project developed a taxonomy tree of open science1 which features Open Science Evaluation.
This domain includes Open metrics and impact (featuring altmetrics, bibliometrics, semantometrics, and webometrics) and Open peer review (see Fig. 2).
Our previous illustrations show that the granularity of this taxonomy in its Evaluation area could be developed in further depth. The historical developments in Scientometrics demonstrate a movement from abstract models to heavy use of data residing in the digital space. The pioneering use of citation data is these days being extended with new metrics. Researchers’ communities are not only following the changes in the publication domain, but are also proactive in searching for new evaluation frameworks which will balance the quality of research outputs with data demonstrating visibility and reputation. We still work at a time when the actual frameworks are being designed and new metrics appear – and while some may see this as yet another manifestation of the old Chinese curse, “May you live in interesting times”, for others this is definitely an exciting time of new developments and of rethinking not only evaluation practices, but also the overall research lifecycle within the open science paradigm.
1 – Lotka, A.J. (1926). “The frequency distribution of scientific productivity”. Journal of the Washington Academy of Sciences. 16 (12): 317–324.
2 – Currently collects data from 3,000 journals across 50 disciplines, according to the webpage.
3 – Collects data from 1,700 journals (source).
4 – For example, altmetrics.com, which allows users to visualise data on the social media presence of publications; and researchgate.net, academia.edu, and Google Scholar, which serve as personal repositories of the research publications of individual scholars but also provide various metrics on visibility and impact.
5 – COST, Cooperation in Science and Technology, is a programme funded by the European Commission which supports trans-European research networking (http://www.cost.eu/).
* – Milena Dobreva is an Associate Professor at the University of Malta. Her major research interests are in digital libraries and digital humanities.