Irrational rationality: critique of metrics-based evaluation of researchers and universities

According to one of the most consulted global university rankings services, the QS World University Rankings 2022, the University of Toronto is the top ranked university in Canada. It should take no more than a brief pause to see the fiction in what is presented as objective empirical information (pseudoscience). In the real world, it is mid-June 2021. The empirical “facts” on which QS is based are still in progress, in a pandemic year of considerable uncertainty; data on 2021 cannot be complete until the year is over. Meanwhile, QS is already reporting stats for 2022; perhaps they are psychic?

Scratching slightly at the surface, anyone with even a little familiarity with universities in Canada is probably aware that the University of Toronto is currently under a rare censure due to a “serious breach of the principles of academic freedom” in a hiring decision. Censure is a “rarely invoked sanction in which academic staff in Canada and internationally are asked to not accept appointments, speaking engagements or distinctions or honours at the University of Toronto, until satisfactory changes are made”. I don’t know the details of the QS algorithms, but I think it’s fair to speculate that neither support for academic freedom nor a university’s ability to attract top faculty for appointments, speeches, distinctions or honours is factored in, or if factored in, weighted appropriately.

Digging just a little deeper, someone with a modicum of understanding of the university system in Canada, and Ontario in particular, would know that the University of Toronto is one of Ontario’s 23 public universities, all of which have programs approved and regularly reviewed for quality by the same government, are funded under the same formulae, and provide the same economic support for students. Degrees at a particular level are considered equivalent locally, and courses are often transferable between institutions. When not under censure, the University of Toronto is indeed a high quality university; so are the University of Ottawa, where I work, Carleton (the other Ottawa-based university), and all the other Ontario universities. Specific programs frequently undergo additional accreditation. My department offers a Master of Information Studies program that is accredited by the American Library Association (ALA). Both the Ontario government and the ALA require actual data in their quality assurance / accreditation processes. This includes evidence of strategic planning, but not guesswork about future output.

If QS is this far off base in its assessment of universities in the largest province of a G7 country (the epitome of the Global North), how accurate are QS and other global university rankings in the Global South? According to Stack (2021) and the authors of the newly released book Global University Rankings and the Politics of Knowledge, global university rankings such as QS and THE, and the push for the Global South to develop globally competitive “world class universities”, are more about reproducing colonial relations, marketizing higher education and commercializing research than assuring high quality education. The attention paid to such rankings distracts universities and even countries from what matters locally. As Chou points out, the focus on rankings leads scholars in Taiwan to publish in English rather than Mandarin, although Mandarin is the local language. A focus on publishing in international, English-language journals creates a disincentive to conduct research of local importance almost everywhere.

My chapter in this work focuses on the intersection of critique of metrics-based evaluation of research and how this feeds into the university rankings system. The first part of the chapter, Dysfunction in knowledge creation and moving beyond, provides a brief history and context of bibliometrics, the development of traditional and new metrics-based approaches, and major critique and advocacy efforts to change practice (the San Francisco Declaration on Research Assessment (DORA) and the Leiden Manifesto). The unique contribution of this chapter is a critique of the underlying belief behind both traditional and alternative metrics-based approaches to assessing research and researchers: the assumption that impact is good and an indicator of quality research, and that it therefore makes sense to measure impact, with the only question being whether particular technical measures of impact are accurate. For example, if impact is necessarily good, then the retracted study by Wakefield et al. that falsely correlated vaccination with autism is good research by any metric: many academic citations both before and after retraction, citations in popular and social media, and arguably a role in the real-world impact of the anti-vaccination movement, the subsequent return of preventable illnesses like measles, and the challenge of fighting COVID through vaccination. An alternative approach is suggested, using the University of Ottawa’s collective agreement with APUO (the union of full-time professors) as a means of evaluation that considers many different types of publications, and that considers quantity of publication in a way that gives evaluators the flexibility to take into account the kind of research and research output.


Morrison, H. (2021). What counts in research? Dysfunction in knowledge creation and moving beyond. In M. Stack (Ed.), Global University Rankings and the Politics of Knowledge (pp. 109–130). Toronto: University of Toronto Press.

Stack, M. (Ed.). (2021). Global University Rankings and the Politics of Knowledge. Toronto: University of Toronto Press.

What counts in research? Dysfunction in knowledge creation & moving beyond

One of the long-term challenges in transitioning scholarly communication to open access is reliance on bibliometrics. Many authors and organizations are working to address this challenge. The purpose of this post is to share some highlights of my work in progress, a book chapter (preprint) designed to explain the current state of bibliometrics in the context of a critique of global university rankings. Some brief reflections that are new and relevant to advocates of open access and of change in the evaluation of scholarly work follow.

  • Impact: it is not logical to equate impact with quality, and it is dangerous to do so. Most approaches to the evaluation of scholarly work assume that impact is a good thing, an indicator of quality research. I argue that this reflects a major logical flaw, and a dangerous one at that. We should be asking whether it makes sense for an individual research study (as opposed to the weight of evidence gained and confirmed over many studies) to have impact. If impact is good and more impact is better, then the since-refuted study that equated vaccination with autism must be an exceptionally high quality study, whether measured by traditional citations or by the real-world impact of the return of diseases such as measles. Irreproducible research, in this sense, is not a fluke but rather a logical outcome of reward systems that favour innovation over caution.
  • New metrics (or altmetrics) serve many purposes and should be developed and used, but should be avoided in the context of evaluating the quality of scholarship, to avoid bias and manipulation. It should be obvious that metrics that go beyond traditional academic citations are likely to reflect and amplify existing social biases (e.g. gender, ethnicity), and non-academic metrics such as tweets are in addition subject to manipulation by interested parties, including industry and special interest groups (e.g. big pharma, big oil, big tobacco).
  • New metrics are likely to change scholarship, but not necessarily in the ways anticipated by the open access movement. For example, the replacement of journal-level citation impact by article-level citations is already well advanced, with Elsevier in a strong position to dominate this market. Scopus metrics data is already in use by university rankings and is being sold by Elsevier to the university market.
  • It is possible to evaluate scholarly research without recourse to metrics. The University of Ottawa’s collective agreement with full-time faculty reflects a model that not only avoids the problems of metrics, but also serves as an excellent model for change in scholarly communication, as it recognizes that scholarly works may take many forms. For details, see the APUO Collective Agreement 2018 – 2021 section 23.3.1 – excerpt:

23.3.1. General Whenever this agreement calls for an assessment of a Faculty Member’s scholarly activities, the following provisions shall apply.

a) The Member may submit for assessment articles, books or contributions to books, the text of presentations at conferences, reports, portions of work in progress, and, in the case of literary or artistic creation, original works and forms of expression

b) Works may be submitted in final published form, as galley proofs, as preprints of material to be published, or as final or preliminary drafts. Material accepted for publication shall be considered as equivalent to actually published material…

h) It is understood that since methods of dissemination may vary among disciplines and individuals, dissemination shall not be limited to publication in refereed journals or any particular form or method.

There may be other models; if so, I would be interested in hearing about them, please add a comment to this post or send an e-mail.

The full book chapter preprint is available here:


This chapter begins with a brief history of scholarly journals and the origins of bibliometrics, and an overview of how metrics feed into university rankings. Journal impact factor (IF), a measure of average citations to articles in a particular journal, was until quite recently the sole universal standard for assessing the quality of journals and articles. IF has been widely critiqued; even Clarivate Analytics, the publisher of the Journal Citation Reports / IF, cautions against the use of IF for research assessment. In the past few years there have been several major calls for change in research assessment: the 2012 San Francisco Declaration on Research Assessment (DORA), the 2015 Leiden Manifesto (translated into 18 languages) and the 2017 Science Europe New vision for meaningful research assessment. Meanwhile, due to rapid change in the underlying technology, practice is changing far more rapidly than most of us realize. IF has already largely been replaced by item-level citation data from Elsevier’s Scopus in university rankings. Altmetrics that capture a wide range of uses beyond citation data, such as downloads and social media mentions, are prominently displayed on publishers’ websites. The purpose of this chapter is to provide an overview of how these metrics work at present, to move beyond technical critique (reliability and validity of metrics) to introduce major flaws in the logic behind metrics-based assessment of research, and to call for even more radical thought and change towards a more qualitative approach to assessment. The collective agreement of the University of Ottawa is presented as one model for change.

Cite as:

Morrison, H. (2019). What counts in research? Dysfunction in knowledge creation & moving beyond. Sustaining the Knowledge Commons / Soutenir Les Savoirs Communs. Retrieved from

What do rankings tell us about higher education? Roundtable at the Peter Wall Centre, Vancouver May 2017

From May 13 – 17 I will be at a roundtable talking about rankings and higher education at the Peter Wall Centre, University of British Columbia. If you’re in Vancouver, join us for one of the public events!

My approach to rankings is a critical one, flowing from my theoretical perspective of irrational (or instrumental) rationality. In brief, we humans have a tendency to develop tools, such as metrics, to help us, and then to become slaves to those tools. The old metrics (e.g. reliance on high impact factor journals) are a barrier to productive change in scholarly communication; but will the new metrics be any better? What are your thoughts on university rankings? Comments are welcome.

Update December 2019: please see this book chapter preprint for my latest work on this topic, and watch for publication of this book by the University of Toronto Press anticipated in 2020.

DOAJ, Impact Factor and APCs

by César Villamizar and Heather Morrison

In May 2015 we conducted a pilot study correlating OA APCs and the journal impact factor, using data from 2010, 2013 and 2014. Here are some early results:

  • about 10% of the journals listed in JCR are DOAJ journals
  • over 10% of the journals listed in DOAJ have an impact factor
  • about 40% of the DOAJ IF journals had an APC as of May 2015 (an estimate; higher than the share of DOAJ journals overall that charge APCs)
  • the average APC of IF journals in 2014 was more than double the overall average APC ($1,948 compared with an overall average of $964)
  • average APCs of IF journals increased by 7% in the 5-month period from December 2013 to May 2014, and by 16% from 2010 to 2014
  • over 80% of APC / IF journals increased price by 6% or more in a 5-month period from December 2013 to May 2014
  • about 20% of APC / IF journals increased price by 10% or more in a 5-month period from December 2013 to May 2014
  • 7% of APC / IF journals increased price by 20% or more in a 5-month period from December 2013 to May 2014

Conclusion: about 10% of DOAJ journals have impact factors, and about 10% of impact factor journals are DOAJ journals. Open access journals (or at least some OA journals) using the APC business model may be exploiting impact factor status as a means to raise prices. Further investigation is warranted.


As of May 3, 2015, Thomson Reuters’ Journal Citation Reports (JCR) listed 11,619 journals with impact factor (IF). Of these, 1,146 are listed in the Directory of Open Access Journals (DOAJ). As of May 15, 2015, 10,532 journals were listed in DOAJ. This means that 9.8% of the titles listed in JCR are DOAJ titles, and 10.8% of DOAJ journals have an IF.
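The overlap figures above reduce to two simple ratios. As a quick sanity check, the arithmetic can be reproduced as follows (a minimal sketch; the journal counts are taken directly from the text):

```python
# Overlap between JCR (impact factor) titles and DOAJ titles, May 2015.
# Counts are the figures cited above.
jcr_total = 11619   # journals with an impact factor in JCR
doaj_total = 10532  # journals listed in DOAJ
overlap = 1146      # DOAJ journals that also have an impact factor

share_of_jcr = overlap / jcr_total    # ≈ 0.0986, i.e. the ~9.8% cited above
share_of_doaj = overlap / doaj_total  # ≈ 0.1088, i.e. the ~10.8% cited above

print(f"{share_of_jcr:.2%} of JCR titles are listed in DOAJ")
print(f"{share_of_doaj:.2%} of DOAJ titles have an impact factor")
```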

The pilot involved selecting half of the DOAJ journals with an IF (572 journals from both the sciences and social sciences, selected alphabetically by abbreviated title, A through J Otolaryngol-Head) and looking up each journal’s quartile and subject ranking. Of these titles, 169 were included in the May 2014 OA APC sample. For 126 journals, data was available for both December 2013 and May 2014; these form the basis of the 2013–2014 calculations. Assuming that the proportion of APC-charging journals is the same for non-sampled journals, this yields an estimate of 229 journals with both an IF and an APC, or 40% of the total. This is higher than the 26% of journals with APCs as of May 2014.

Stats for the 572 DOAJ journals with an impact factor (pilot):

  • 42.1% of the journals are in quartile four (Q4), 27.2% in quartile three (Q3), 18.9% in quartile two (Q2), and 11.8% in quartile one (Q1)
    • 69% of the journals are in Q3 and Q4
    • 31% of the journals are in Q1 and Q2



  • Out of the 572 journals, APC data is available by year and source (S&B = Solomon & Björk; SKC = Sustaining the Knowledge Commons):
    • 2010 (S&B): 176
    • Dec 2013 (SKC): 129
    • May 2014 (SKC): 169
  • 126 journals have APC information collected in both Dec 2013 (SKC) and May 2014 (SKC)
  • 110 journals have APC information collected in all three: 2010 (S&B), Dec 2013 (SKC) and May 2014 (SKC)

Stats for the 126 journals with APC data (Dec 2013 SKC – May 2014 SKC):

  • 17.5% of the journals are in quartile four (Q4), 38.1% in quartile three (Q3), 30.2% in quartile two (Q2), and 14.3% in quartile one (Q1)
    • 55.5% of the journals are in Q3 and Q4
    • 44.5% of the journals are in Q1 and Q2


  • 3.2% of the journals decreased their APC (3 journals, 2 of them Hindawi journals; Hindawi as of May 2014 had a practice of rotating free publication, and these 2 journals had APCs of 0 in 2014 but have substantial prices today: Bioinorganic Chemistry and Applications is now $1,250 and the International Journal of Genomics is now $1,500. The third journal with an apparent small price decrease, Experimental Animals, from $200 to $198 USD, is likely an anomaly due to a weakening of the Japanese Yen against the USD. In other words, all price decreases appear to be temporary anomalies.)
  • 14.3% of the journals maintained their APC
  • 82.5% of the journals increased their APC by at least 6.4%
    • 3.1% increased their APC between 6.4% and 7.49%
    • 54.8% increased their APC between 7.5% and 9.49%
    • 15% increased their APC between 9.5% and 13.9%
    • 7% increased their APC between 14% and 25%

The following figure reflects the 123 titles remaining after removing the 2 anomalous 0 APC titles.


The following chart illustrates the percentage of journals by price increase from 2013 to 2014.


APC summary statistics (USD):

                        2010     2013     2014
Max                     2,165    2,420    2,650
Min                     500
Min greater than zero   500      200      198
Median                  1,825    2,060    2,215
Mode                    1,825    2,060    2,215
Average                 1,637    1,808    1,948
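The five-month changes implied by the summary statistics above can be recomputed directly. A minimal sketch (values taken from the table; "min" here is the minimum greater than zero, so the small decline reflects the Experimental Animals currency anomaly noted earlier):

```python
# Dec 2013 -> May 2014 change in each summary statistic
# (USD values from the table above).
stats_2013 = {"max": 2420, "min": 200, "median": 2060, "average": 1808}
stats_2014 = {"max": 2650, "min": 198, "median": 2215, "average": 1948}

changes = {k: stats_2014[k] / stats_2013[k] - 1 for k in stats_2013}
for name, change in changes.items():
    print(f"{name}: {change:+.1%}")
# average: +7.7% over five months, consistent with the ~7% figure
# reported earlier for the December 2013 - May 2014 interval
```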
  • Medicine and Biology and Life Science account for 81.1% of the journal categories likely to charge APCs
    • 3% of the journals in these two categories increased their APC by at least 6.4%
    • 9% increased their APC between 6.4% and 7.49%
    • 1% increased their APC between 7.5% and 9.49%
    • 50% increased their APC between 9.5% and 13.9%
    • 8% increased their APC between 14% and 25%

Note and references

2010 data courtesy of Solomon, D.J. & Björk, B.C. (2012). A study of open access journals using article processing charges. Journal of the American Society for Information Science and Technology. Retrieved May 31, 2015 from (data unpublished)

2014 data: Morrison, H., Salhab, J., Calvé-Genest, A., & Horava, T. (2015). Open Access Article Processing Charges: DOAJ Survey May 2014. Publications, 3(1), 1–16.

Cite as:

Villamizar, C., & Morrison, H. (2015). DOAJ, Impact Factor and APCs. Sustaining the Knowledge Commons / Soutenir Les Savoirs Communs. Retrieved from