Some limitations of DOAJ metadata for research purposes

by: Xuan Zhao, Luan Borges, & Heather Morrison

Abstract

The Directory of Open Access Journals (DOAJ, http://doaj.org) is an excellent service that fulfills many important functions, in particular facilitating access to a vetted collection of over 15,000 freely available peer-reviewed journals. The DOAJ search services and metadata download are very useful for researchers as well. The purpose of this post is to alert researchers to some limitations of the DOAJ metadata that need to be taken into account to avoid drawing erroneous conclusions.

First, when downloading DOAJ metadata, it is necessary to open the .csv file in Unicode in order to retain non-English characters. We open the file in OpenOffice for this reason, then save it as an Excel file. The nature of the metadata means that some data is inserted in the wrong column; clean-up, as discussed below, is necessary before data analysis. When journal editors or others working on their behalf enter metadata into DOAJ, research is not the primary purpose of the exercise; for this reason, in-depth assessment and corrections may be necessary before analysis. Below, we present publisher size analysis as an example of what researchers may encounter.

Finally, because the main purpose of DOAJ is connecting readers with content, the metadata of interest to a particular research project may not be up to date. As demonstrated below, as of Jan. 5, 2021, only 30% of DOAJ journals have a “last update” date within the previous year (2020). We do not know whether the “last update” date reflects a full or partial metadata review. We illustrate the potential impact on research results with the example of the SKC longitudinal APC study. Of the 4,292 DOAJ journals that responded “yes” to the APC question, only 30% have a last update date of 2020 or 2021. Even for this 30% of journals, we have no way of knowing whether the APC status and/or amount itself was updated, or only other unrelated metadata. This means that if we compare 2019 prices obtained from publisher websites in 2019 with 2021 DOAJ APC metadata, we will almost certainly get incorrect results, for example by falsely assuming that matching APC amounts mean no change in prices.

DOAJ provides rich and useful metadata for the researcher, and the research question “is this journal listed in DOAJ?” is of value in and of itself. For this reason, we intend to continue using DOAJ metadata in addition to data derived from other sources, particularly data derived directly from publisher websites. See below for a link to an open data version of the DOAJ metadata reflecting the corrections explained in this post.
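For readers who work with the download in a script rather than a spreadsheet, the Unicode point can be illustrated with a minimal Python sketch. It assumes the file is UTF-8 encoded (an assumption; check the actual download), and the function name and the sample bytes are made up for illustration.

```python
import csv
import io

def read_metadata(binary_stream):
    """Decode a CSV byte stream as UTF-8 and return its rows as dictionaries.

    "utf-8-sig" also accepts files that start with a byte order mark.
    """
    text = io.TextIOWrapper(binary_stream, encoding="utf-8-sig", newline="")
    return list(csv.DictReader(text))

# Made-up two-column sample standing in for the real download.
sample_bytes = (
    "Journal title,Publisher\n"
    "Hypothetical Journal,Editura Academica Brâncuşi\n"
).encode("utf-8")

rows = read_metadata(io.BytesIO(sample_bytes))
print(rows[0]["Publisher"])
```

With a real file, the same function can be called on `open(path, "rb")`; opening the file with the wrong encoding is what typically damages characters such as « â » or « ş ».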

Details

Correcting for displaced observations

As previously mentioned, the first step to confidently use the DOAJ metadata for analysis and research is identifying and correcting data inserted in the wrong column, herein also called displaced observations. 

Below we can see an example of a displaced observation from the DOAJ metadata. Column BB has no assigned variable while containing some observations, apparently displaced one column to the right. 

Table 1 – An example of misplaced data from 2021 DOAJ metadata

Users may follow different steps to correct for displaced data. Here we explain in more detail how we have identified these displacements and corrected them.  

Before proceeding with any analysis, it is important to become familiar with the DOAJ metadata. We recommend that users read the DOAJ Guide to applying, available online, because the metadata reflects responses to questions asked in the application process. The DOAJ metadata, as of 5 Jan. 2021, has 53 variables ranging from Journal Title to Country to Most recent article added. It may be helpful to start correcting observations in variables with easily identifiable responses, such as « Country » or « Country of Publisher », or in variables that allow only two types of answers (i.e., Yes or No), such as Author holds copyright without restrictions and APC. We recommend creating a pivot table to identify displaced observations and repeating this process until no observations are found in a wrong column.
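The pivot-table check described above can also be scripted. The following is a minimal sketch in Python, assuming the metadata has already been read into a list of dictionaries keyed by column name; the function names and the sample rows are illustrative only.

```python
from collections import Counter

def tally_column(rows, column):
    """Count each distinct value in one column, like a one-field pivot table."""
    return Counter(row.get(column, "") for row in rows)

def unexpected_values(rows, column, allowed):
    """Return (row_index, value) pairs whose value is not in the allowed set."""
    return [
        (index, row.get(column, ""))
        for index, row in enumerate(rows)
        if row.get(column, "") not in allowed
    ]

# Hypothetical rows; "APC" stands for a variable that should hold only Yes or No.
sample = [
    {"Journal title": "A", "APC": "Yes"},
    {"Journal title": "B", "APC": "No"},
    {"Journal title": "C", "APC": "https://example.org/fees"},  # displaced value
]

counts = tally_column(sample, "APC")
flagged = unexpected_values(sample, "APC", {"Yes", "No"})
print(counts)
print(flagged)
```

Any value other than Yes or No in such a column points to a row that needs to be inspected by hand.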

When cleaning up the DOAJ metadata, users will notice that in some cases only one observation was displaced; in other cases, an entire row was displaced beginning at a specific variable. In the example highlighted in yellow below, all observations beginning at the variable Publisher were displaced one column to the right.

Table 2 – Line 36 illustrates an example of an entire row with displaced observations
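As a sketch of how a row like the one in Table 2 could be realigned in code, the Python example below removes a single cell at the point where the displacement begins. It assumes that the displaced row is exactly one cell too long and that the cell at the starting column is empty; both are assumptions that will not hold for every row, so the function refuses to act otherwise. The function name and the sample row are hypothetical.

```python
def realign_row(row, start, width):
    """Remove one empty cell at index `start` from a row that is one cell too long.

    Rows of the expected width are returned unchanged. A ValueError is raised
    when the row has another length or the cell at `start` is not empty, so
    that nothing is deleted silently.
    """
    if len(row) == width:
        return list(row)
    if len(row) != width + 1:
        raise ValueError(f"unexpected row length {len(row)}")
    if row[start].strip():
        raise ValueError("cell at start is not empty; review this row by hand")
    return list(row[:start]) + list(row[start + 1:])

# Hypothetical three-column header and a row displaced from "Publisher" onward.
header = ["Journal title", "Publisher", "Country of publisher"]
displaced = ["Hypothetical Journal", "", "Hypothetical Press", "Canada"]

fixed = realign_row(displaced, header.index("Publisher"), len(header))
print(fixed)
```

Rows that raise an error are the ones where only a single observation was displaced or where the cause is something else, and these still need manual correction.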

Data entry inconsistencies

When correcting for displaced observations, we have also identified some inconsistencies in the way observations are registered in the DOAJ metadata. The table below lists the main visible inconsistencies found for some variables. In the majority of instances, the inconsistencies will not impact DOAJ users looking up information for a particular journal. However, it is important to take into account these inconsistencies before proceeding to any automated statistical analysis. For example, DOAJ metadata as is can be used to identify the number of journals with persistent article identifiers, but automated counting of DOI v. ARK or other approaches would require some advance data manipulation.

Variable | Example
Alternative title | Some journals’ alternative titles are registered as a number. Examples are “2300-6633” and “0”.
Keywords | Some observations contain special characters or extra spacing, as follows: “6.         rheology, tribology, hydrodynamics, thermodynamics, mechanics of structures, mechatronics.”; “           water cycles, water environment, water treatment and reuse, water resource, water quality, hydrology”; “ •          natural sciences, •      environmental sciences, •      social sciences, agricultural sciences, veterinary medicine, medical sciences”
Copyright information URL | Some URLs lack a letter at the beginning or the end. In this example, there should be an « h » at the beginning and an « l » at the end of the link: ttp://www.emeraldgrouppublishing.com/services/publishing/jiuc/authors.htm
Plagiarism information URL | Some URLs lack a letter at the beginning or the end. In this example, there should be an « h » at the beginning and an « l » at the end of the link: ttp://www.emeraldgrouppublishing.com/services/publishing/jiuc/authors.htm
URL for journal’s instructions for authors | Some URLs lack a letter at the beginning. In this example, there should be an « h » at the beginning of the URL: ttps://revistas.unasp.edu.br/LifestyleJournal/about/submissions
Other submission fees information URL | Some URLs have extra letters. This example has an extra letter « i » at the beginning of the URL: ihttps://journals.univie.ac.at/index.php/voebm/m/index. Other URLs lack an « h » at the beginning: ttp://psr.ui.ac.id/index.php/journal/about/submissions#authorGuidelines ; ttps://www.karger.com/Journal/Guidelines/261897#sec62
Preservation Services | Preservation services can be registered as a name or a website.
Preservation Service: national library | Preservation services – national library can be registered as a name or a website.
Preservation information URL | Some URLs lack one or more letters at the beginning: tps://periodicos.uff.br/revistagenero/about/editorialPolicies#focusAndScope ; ttp://ejournal.stkip-pgri-sumbar.ac.id/index.php/economica
Deposit policy directory | Deposit policy directory can be registered as a name or a website.
Persistent article identifiers | Persistent article identifiers can be registered as an acronym (UDC, DOI, ARK), but also as a website, such as dc.identifier.uri (DSpaceUnipr) or NBN http://www.depositolegale.it/national-bibliography-number/. Another example is the occurrences “UDC” and “UDC (Universal decimal Classification)”, which are equivalent but were registered differently.
URL for journal’s Open Access statement | Some URLs lack a letter at the beginning or at the end, or have an extra « h » at the beginning. This example has an extra letter « h » at the beginning of the URL: hhttp://www.revistas.usp.br/gestaodeprojetos/about
Table 3 – Visible inconsistencies identified in the DOAJ metadata
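The URL problems listed in Table 3 follow a few recurring patterns, which makes them candidates for a scripted first pass. The sketch below is a heuristic in Python, not a description of how we corrected the data: it proposes a repair only when the scheme looks like « http » or « https » with leading letters missing or with one extra leading letter, and returns None in every other case so that the URL can be reviewed manually. The function name is hypothetical. A letter missing at the end of a URL cannot be detected this way.

```python
import re

_VALID_SCHEME = re.compile(r"^https?://", re.IGNORECASE)
_ANY_SCHEME = re.compile(r"^([A-Za-z]+)://(.*)$")

def suggest_repair(url):
    """Return the URL with a repaired http(s) scheme, or None if unsure."""
    url = url.strip()
    if _VALID_SCHEME.match(url):
        return url  # already well formed
    match = _ANY_SCHEME.match(url)
    if not match:
        return None
    prefix, rest = match.group(1).lower(), match.group(2)
    for scheme in ("https", "http"):
        missing_lead = len(prefix) >= 2 and scheme.endswith(prefix)  # "ttp", "tps"
        extra_lead = prefix[1:] == scheme                            # "ihttps", "hhttp"
        if missing_lead or extra_lead:
            return f"{scheme}://{rest}"
    return None

examples = [
    "ttps://revistas.unasp.edu.br/LifestyleJournal/about/submissions",
    "ihttps://journals.univie.ac.at/index.php/voebm/m/index",
    "hhttp://www.revistas.usp.br/gestaodeprojetos/about",
]
repaired = [suggest_repair(u) for u in examples]
print(repaired)
```

A suggested repair is only a candidate; whether the repaired link actually resolves still has to be checked.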

Publisher’s names duplicates investigation and clean-up

The purpose of this project is to prepare a rough picture of publisher size for comparison with the findings of Solomon & Björk (2012). To support the publisher size analysis, we specifically investigated publisher name duplicates and corrected most of the obvious errors, such as small differences in punctuation and/or characters, extra spaces at the beginning and/or the end, and minor variations in how the same publisher name was entered (please see the examples in Table 4 – Investigative Strategies – Publisher Names Duplicates).

The clean-up process was divided into three stages. Firstly, we created a pivot table for the publisher column to identify entries that differed only slightly but had not been grouped together. Secondly, when potential duplicates were found, we investigated to confirm the duplicates and/or to decide which name to keep (in priority order: use the name with the most journal entries; use the correct name where there is an obvious typo; use the first name listed). Please see the investigative strategies below:

Table 4 – Investigative Strategies – Publisher Names Duplicates

Thirdly, after identifying inconsistencies in publisher names, we created a table (please see Table 5 – Corrections Gathering – Publisher Names Duplicates) to register all the corrections to the variable Publisher. About 500 inconsistencies were corrected. As a result, the number of publishers in the pivot table decreased from 7218 entries (data source: pivot table based on DOAJ metadata) to 6804 entries (data source: pivot table based on the cleaned-up version of the database).

Table 5 – Corrections Gathering – Publisher Names Duplicates
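The first stage of this process, finding names that differ only in case, punctuation, or spacing, can be approximated in code. The Python sketch below is an illustration under assumptions, not the procedure we followed: it builds a comparison key for each name and, within each group of matching keys, keeps the variant with the most journal entries, falling back to the first one listed. The function names and sample names are hypothetical.

```python
import re
import unicodedata
from collections import Counter, defaultdict

def normalize_name(name):
    """Comparison key: Unicode-normalized, case-folded, punctuation and extra spaces removed."""
    text = unicodedata.normalize("NFKC", name).casefold()
    text = re.sub(r"[^\w\s]", " ", text)  # \w keeps accented letters
    return " ".join(text.split())

def canonical_names(publisher_column):
    """Map every name variant to the variant chosen to represent its group."""
    counts = Counter(publisher_column)  # keeps first-seen order
    groups = defaultdict(list)
    for name in counts:
        groups[normalize_name(name)].append(name)
    mapping = {}
    for variants in groups.values():
        # max() returns the first of equally frequent variants, i.e. the first listed.
        keep = max(variants, key=lambda variant: counts[variant])
        for variant in variants:
            mapping[variant] = keep
    return mapping

# Hypothetical publisher column with near-duplicate entries.
column = [
    "Example University Press",
    "Example University Press ",
    "example university press.",
    "Example University Press",
    "Sample Society",
    "Sample  Society",
]
mapping = canonical_names(column)
print(len(set(column)), "->", len(set(mapping.values())))
```

A key of this kind will not bring together names that begin with different words; duplicates of that type still require manual investigation.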

As illustrated in the two tables above, there were different types of data inconsistencies. In order to respect the metadata to the greatest extent possible, we acted prudently when making decisions. In some cases of minor variation, we followed the URLs to check publisher websites and to collect convincing evidence. However, we encountered some complex challenges.

One of the challenges was language. Because of the large number and wide range of publishers (124 countries, 80 languages; DOAJ, 7 Feb. 2021) [https://doaj.org/], we were unable to identify all of the sources of information. In addition, when URLs were invalid or the information did not match, it was difficult to reach any precise conclusion. Moreover, among the 7218 publisher name entries, some potential duplicates were not grouped together because they begin with different words, for example “Editora da Universidade Estadual de Maringá (Eduem)” vs. “Eduem – Editora da Universidade Estadual de Maringá” and “Academica Brâncuşi” vs. “Editura Academica Brâncuşi”. These entries were usually far apart in the pivot table and hard to detect. More details can be found in Table 6 below:

Different beginning words (examples):

“Academica Brâncuşi” vs. “Editura Academica Brâncuşi”
“Alexandru Ioan Cuza University of Iaşi” vs. “Editura Universităţii ‘Alexandru Ioan Cuza’ Iaşi”
“Editora da Universidade Estadual de Maringá (Eduem)” vs. “Eduem – Editora da Universidade Estadual de Maringá”
Table 6 – (1)

Unmatched publisher names (examples):

Original publisher names | Possible correct names | URLs
Canadian Society for the Study of Education. | The Canadian Association for Curriculum Studies | https://jcacs.journals.yorku.ca/index.php/jcacs/index
Badan Penelitian dan Pengembangan Kesehatan | URL directs to a new web link (https://ejournal2.litbang.kemkes.go.id/index.php/jki/index) whose publisher name is: Pusat Penelitian dan Pengembangan Biomedis dan Teknologi Dasar Kesehatan | http://ejournal.litbang.kemkes.go.id/index.php/jki
Shaheed Beheshti University of Medical Sciences and Health Services | Kowsarmedical | http://journals.sbmu.ac.ir/jme
Table 6 – (2)

Invalid URLs (examples):

Original publisher names | Original URLs (invalid)
Alborz University of Medical Sciences (the URLs wrongly direct to a website whose contents are meaningless; when we searched the journal title, we were directed to this website: https://enterpathog.abzums.ac.ir/) | http://enterpathog.com/?page=home ; https://jehe.abzums.ac.ir/index.php?slc_lang=en&sid=1
Instituto Nacional de Salud (INS) | http://revistas.ins.gov.py/index.php/rspp/
Instituto Superior de Ciências de Educação do Huambo | http://revista.isced-hbo.ed.ao/rop/index.php/ROP/index
Table 6 – (3)

Given the barriers and challenges mentioned above, the publisher name clean-up project has clear limitations. Precision is not possible in this project because the question “who is the publisher” is complex. Instead of making any definitive claims about publisher size, we are primarily interested in whether the long tail effect (a few big publishers, a few more middle-sized, most very small) reported by Solomon & Björk (2012) can still be observed in DOAJ in 2021.
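A rough check for the long tail can be sketched as follows in Python: count journals per publisher, then count publishers per size band. The size bands, the function name, and the sample column are all illustrative assumptions; they are not the categories used by Solomon & Björk (2012).

```python
from collections import Counter

# Illustrative size bands as (label, minimum, maximum); None means no upper limit.
BANDS = (
    ("1 journal", 1, 1),
    ("2-9 journals", 2, 9),
    ("10+ journals", 10, None),
)

def size_distribution(publisher_column, bands=BANDS):
    """Count how many publishers fall into each size band."""
    journals_per_publisher = Counter(publisher_column)
    result = {label: 0 for label, _, _ in bands}
    for n_journals in journals_per_publisher.values():
        for label, low, high in bands:
            if n_journals >= low and (high is None or n_journals <= high):
                result[label] += 1
                break
    return result

# Hypothetical column: one large publisher, one mid-sized, three with a single journal.
column = ["P1"] * 12 + ["P2"] * 3 + ["P3", "P4", "P5"]
distribution = size_distribution(column)
print(distribution)
```

Because the publisher names cannot be cleaned with full precision, counts of this kind are best read as an approximate shape rather than exact figures.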

DOAJ metadata update analysis

The following analysis was conducted to determine whether DOAJ metadata on article processing charges (APCs), that is, charging status and amount, would be sufficient for SKC’s longitudinal study on APC trends over time. The answer is clearly no. The metadata for the vast majority of journals in DOAJ (overall and APC charging) has not been updated for more than a year, and it is unknown whether the most recent update would have included an update to APC or other metadata. We will continue to use DOAJ metadata because it is rich and the question “is this journal listed in DOAJ” is of value in and of itself. For price comparisons, however, we cannot rely on this data, as doing so would likely result in erroneous conclusions.

DOAJ journals by year of last update.

This chart illustrates the percentage of DOAJ journals by year of last update. Detailed figures are in the table below. Note that just under half of the journals were last updated 2 or more years ago (2018 or earlier).

DOAJ last update as of Jan. 5, 2021
Year | # journals last updated | % journals last updated
2015 | 294 | 2%
2016 | 1,469 | 9%
2017 | 2,864 | 18%
2018 | 2,951 | 19%
2019 | 3,412 | 22%
2020 | 4,662 | 30%
2021 | 39 | 0%
Total | 15,691 | 100%
Table 7

DOAJ APC charging journals by year of last update

The chart above illustrates the percentage of journals that answered “yes” to the DOAJ question about charging APCs by year of last update. The table below provides the detailed figures. Note that only 30% of DOAJ journals that charge APCs were updated in the past year (2020 or 2021). It is also unknown whether in these cases the last update was a thorough review of the metadata, or might have been an update of non-APC data.

DOAJ last update APC journals only Jan. 5, 2021
Year of last update | # of journals last updated | % journals last updated
2015 | 47 | 1%
2016 | 238 | 6%
2017 | 499 | 12%
2018 | 930 | 22%
2019 | 1,286 | 30%
2020 | 1,276 | 30%
2021 | 16 | 0%
Total | 4,292 | 100%
Table 8
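The percentages in Tables 7 and 8 can be reproduced from the journal counts. The short Python sketch below uses the counts from the two tables; the function and variable names are illustrative.

```python
def percent_by_year(counts):
    """Share of journals per year of last update, rounded to whole percent."""
    total = sum(counts.values())
    return {year: round(100 * n / total) for year, n in counts.items()}

def share(counts, years):
    """Combined share (in percent) of the given years."""
    total = sum(counts.values())
    return 100 * sum(counts[year] for year in years) / total

# Journal counts by year of last update, from Table 7 (all journals)
# and Table 8 (journals that answered "yes" to the APC question).
all_journals = {2015: 294, 2016: 1469, 2017: 2864, 2018: 2951,
                2019: 3412, 2020: 4662, 2021: 39}
apc_journals = {2015: 47, 2016: 238, 2017: 499, 2018: 930,
                2019: 1286, 2020: 1276, 2021: 16}

all_pct = percent_by_year(all_journals)
apc_pct = percent_by_year(apc_journals)
older_share = share(all_journals, [2015, 2016, 2017, 2018])  # 2018 or earlier
recent_apc_share = share(apc_journals, [2020, 2021])         # 2020 or 2021

print(all_pct)
print(apc_pct)
print(round(older_share, 1), round(recent_apc_share, 1))
```

The combined share for 2018 or earlier comes out just under half, and the combined 2020 and 2021 share for APC-charging journals comes out at about 30%, in line with the notes accompanying the tables.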

A version of the Jan. 5, 2021 DOAJ metadata file reflecting the corrections explained above is available as open data here:

Directory of Open Access Journals; Zhao, Xuan; Borges, Luan; Morrison, Heather, 2021, “DOAJ_metadata_2021_01_05_with_SKC_clean_up”, https://doi.org/10.5683/SP2/G5LEXG, Scholars Portal Dataverse, V1

References

The Directory of Open Access Journals (DOAJ) online: https://doaj.org/

Solomon, D. J., & Björk, B. (2012). A study of open access journals using article processing charges. Journal of the American Society for Information Science and Technology, 63(8), 1485–1495. https://doi.org/10.1002/asi.22673

Cite as: Zhao, X., Borges, L., & Morrison, H. (2021). Some limitations of DOAJ metadata for research purposes. Sustaining the Knowledge Commons. https://sustainingknowledgecommons.org/2021/02/10/some-limitations-of-doaj-metadata-for-research-purposes/

Preservation of Digital Blog-Posts

A Literature Review, January 2021

The goal of this literature review was to gain an understanding of the current status of research on digital blog preservation. A series of searches in the database LISTA (Library, Information Science, and Technology Abstracts) suggests that there have been few to no recent developments in technology or research specifically for the access to and preservation of digital blog posts.

Unsurprisingly, much of the scholarly conversation about blog/microblog preservation took place between 2002 and 2010. 

Thoughts on Blog Preservation

Despite the varying opinions that blogs are either easier or more difficult to preserve than other digital communications, scholars agree that blogs and microblogs have unique qualities that deserve scholarly discussion.  

According to Patsy Baudoin, many blogging websites utilize software that automatically preserves the sequencing of posts (2008). This innate quality of the software supports the archiving principles of “original order” and “provenance”. However intelligent the blogging software appears to be, blogs and other user-generated content are especially vulnerable to link rot (Banks, 2010).

Blogs can become complex to preserve because they may contain various file formats, media, or have several owners (Baudoin, 2008). To add to this sentiment, Grimard (2005) states that the variety of formats adds to the “opaqueness” of digital records (opaqueness referring to the unnatural structure of electronic information that is only computer-readable).

To maintain the integrity of the blog during the preservation process, the digital archivist would have to consider preserving the additional external links within the original blog post. Furthermore, copyright can be an issue in certain blog preservation circumstances, as there have been several cases brought to the US Supreme Court (Chen, 2010).

Preservation Technology

Open-source technological advancements in blog preservation have been disappointing at best. According to Caroline Young, there have been several programs for blog preservation that essentially failed soon after conception (2013).

Some examples are PANDORA by the National Library of Australia, and ArchivePress by the University of London’s Computer Centre and British Library Digital Preservation department. Young mentions a developing blog preservation software called BlogForever, which was still in development in 2013. Now, it seems to be available for use and claims to be a new system to harvest, preserve, manage and reuse blog content.

Young (2013), Banks (2010), Rosenthal (2017), and Chen (2010) all highlight the impact made by the introduction of the Internet Archive’s Wayback Machine. The Wayback Machine has improved the landscape of digital preservation of grey literature like blog posts; however, it is not without its challenges. Much like other archiving software, it has difficulty with images and audio files.

Solutions to the Preservation Problem

Though an older article, Grimard (2005) offers some solutions to digital preservation that are still relevant. One important recommendation is to standardize the format of the information. The recommendation is echoed by Young (2013). Both authors emphasize the importance of converting files to the most usable format. Since file formats are simply a set of conventions that software developers can change and alter, they may become obsolete. Young describes the universal XML format as being hierarchical and organized logically. 

LOCKSS is blog preservation software mentioned in both Leroy (2018) and Rosenthal (2017). It is open-source software designed with libraries in mind. It also claims to preserve animations, data sets, images, audio, and text content.

Conclusion

The scholarly conversation on the preservation and conservation of blog content has slowed in the past decade. This could be because the options currently available are adequate for the needs of blog preservation.

Blogs and microblogs comprise various formats that can contribute to the challenges in digital preservation. According to research in the early 2010s, images, animations, and audio files, which blogs usually contain, are difficult to preserve with the Wayback Machine. This may have improved in more recent years.

There are also preservation software options like LOCKSS and BlogForever that seem to be more targeted toward archiving blog content than the Wayback Machine is.

Reference List

Chen, X. (2010). Blog Archiving Issues: A Look at Blogs on Major Events and Popular Blogs. Internet Reference Services Quarterly, 15(1), 21–33. https://doi.org/10.1080/10875300903529571

Baudoin, P. (2008). On Preserving Blogs for Future Generations. The Serials Librarian, 53(4), 59–61. https://doi.org/10.1300/J123v53n04_04

Banks, M. (2010). Chapter 14. Blog Posts and Tweets: The Next Frontier for Grey Literature. In D. Farace & J. Schöpfel (Eds.), Grey Literature in Library and Information Studies (pp. 217–226). K. G. Saur. https://doi.org/10.1515/9783598441493.2.217

Grimard, J. (2005). Managing the Long-term Preservation of Electronic Archives or Preserving the Medium and the Message. Archivaria, 153–167.

Leroy, A. (2018). LOCKSS Distributed Digital Preservation Networks. Université libre de Bruxelles. Belgium. ISSN, 9. https://nusl.techlib.cz/en/conference/conference-proceedings

Rosenthal, D. S. H. (2017). The medium-term prospects for long-term storage systems. Library Hi Tech, 35(1), 11–31. https://doi.org/10.1108/LHT-11-2016-0128

Young, C. (2013). Oh My Blawg! Who Will Save the Legal Blogs? Law Library Journal, 105(4), 493–503.

Cite as: Pelland, K. (2021). Preservation of digital blog-posts. Sustaining the Knowledge Commons. https://sustainingknowledgecommons.org/2021/01/29/preservation-of-digital-blog-posts/

Dramatic Growth of Open Access September 30, 2020

Cross-posted from The Imaginary Journal of Poetic Economics

While many aspects of our lives and activities have slowed down during the COVID pandemic, this has not been the case with open access! The OA initiatives tracked through this series continue to show  strong growth on an annual and quarterly basis. Important milestones are being reached, and others will be coming soon.

Highlights

The Directory of Open Access Journals now lists over 15,000 fully open access, peer reviewed journals, having added 379 journals (> 4 per day) in the past quarter, and now provides searching for over 5 million articles at the article level.

A PubMed search for “cancer” limited to literature from the past 5 years now links to full-text for over 50% of the articles.

The Bielefeld Academic Search Engine now cross-searches over 8,000 repositories and will soon surpass the milestone of a quarter billion documents.

Anyone worried about running out of cultural materials during the pandemic will be relieved to note that the Internet Archive has exceeded a milestone of 6 million movies in addition to over 27 million texts (plus audio, concerts, TV, collections, webpages, and software).

Analysis of quarterly and annual growth for 39 indicators from 10 services reflecting open access publishing and archiving (Internet Archive, Bielefeld Academic Search Engine, Directory of Open Access Books, bioRxiv, PubMedCentral, PubMed, SCOAP3, Directory of Open Access Journals, RePEC and arXiv) demonstrates ongoing robust growth beyond the baseline growth of scholarly journals and articles of 3 – 3.5% per year. Growth rates for these indicators ranged from 4% to 100% (doubling). 26 indicators had a growth rate of over 10%, 15 had a growth rate of over 20%, and 6 had a growth rate of over 40%. The full list can be found in this table.

Thank you to everyone in the open access movement for continuing the hard work that makes this growth possible.

The open data edition is available here:   

Morrison, Heather, 2020, “Dramatic Growth of Open Access Sept. 30, 2020”, https://doi.org/10.5683/SP2/AVBOW6, Scholars Portal Dataverse, V2 

This post is part of the Dramatic Growth of Open Access Series.  

Cite as: Morrison, H. (2020). Dramatic Growth of Open Access September 30, 2020. The Imaginary Journal of Poetic Economics https://poeticeconomics.blogspot.com/2020/10/dramatic-growth-of-open-access.html

Welcome to C.A.S.A.D.: Centre d’Accès aux Savoirs d’Afrique et de sa Diaspora

Our Tanoh Laurent Kakou has created a blog for his own open access research project, C.A.S.A.D.: Centre d’Accès aux Savoirs d’Afrique et de sa Diaspora.

Some articles will be familiar to readers of Sustaining the Knowledge Commons as the work of the team; others are new research by Tanoh. The video Qu’est-ce que la revue Afroscopie?, an interview with Benoit Awazi, is enlightening for anyone interested in research in francophone Africa.

Thank you and congratulations to our Tanoh Laurent Kakou, a doctoral candidate in communication (and graduate of ÉSIS), on passing his comprehensive exam this summer! Best wishes to Tanoh and his research.

Knowledge and equity: analysis of three models

Update July 15: a 10-minute YouTube video overview of this work by Dr. Rahman and me can be viewed here.

Abstract:

The context of this paper is an analysis of three emerging models for developing a global knowledge commons. The concept of a ‘global knowledge commons’ builds on the vision of the original Budapest Open Access Initiative (2002) for the potential of combining academic tradition and the internet to remove various access barriers to the scholarly literature, thus laying the foundation for an unprecedented public good, uniting humanity in a common quest for knowledge. The global knowledge commons is a universal sharing of the knowledge of humankind, free for all to access (recognizing reasons for limiting sharing in some circumstances such as to protect individual privacy), and free for everyone qualified to contribute to. The three models are Plan S / cOAlition S, an EU-led initiative to transition all of scholarly publishing to an open access model on a short timeline; the Global Sustainability Coalition for Open Science Services (SCOSS), a recent initiative that builds on Ostrom’s study of the commons; and PubMedCentral (PMC) International, building on the preservation and access to the medical research literature provided by the U.S. National Institutes of Health to support other national repositories of funded research and exchange of materials between regions. The research will involve analysis of official policy and background briefing documents on the three initiatives and relevant historical projects, such as the Research Council U.K.’s block grants for article processing charges, the EU-led OA2020 initiative, Europe PMC and the short-lived PMC-Canada. Theoretical analysis will draw on Ostrom’s work on the commons, theories of development, under-development, epistemic / knowledge inequity and the concepts of Chan and colleagues (2011) on the importance of moving beyond north-to-south access to knowledge (charity model) to include south-to-south and south-to-north (equity model). 
This analysis of models contributes to a comparative view of transcontinental efforts to build a global knowledge commons with shared values of open access, sharing and collaboration, in contrast to the growing trend of commodification of scholarly knowledge evident in both traditional subscriptions / purchase-based scholarly publishing and in commercial open access publishing. We anticipate that our findings will indicate that a digital world of inclusiveness and reciprocity is possible, but cannot be taken for granted, and policy support is crucial. Global communication and information policy have much to contribute towards the development of a sustainable global knowledge commons.

Full text: https://ruor.uottawa.ca/handle/10393/40664

Cite as: Morrison, H. & Rahman, R. (2020). Knowledge and equity: analysis of three models. International Association of Communication and Media Researchers (IAMCR) annual conference, July 2020.

SpringerOpen 2019 – 2020

By Anqi Shi & Heather Morrison

Abstract

We studied 307 SpringerOpen titles for which we have data and that were fully open access at some point from 2010 to the present, with a primary focus on pricing and status changes from 2019 – 2020 and a secondary focus on longitudinal status changes. Of the 307 titles, 226 are active, fully open access and still published by SpringerOpen, 40 have ceased publication, 19 were transferred to another publisher, and 18 journals that were formerly open access are now hybrid. 6 of these journals transitioned from free to hybrid in the past year. An additional 2 journals were not found. Of the 226 active journals published by SpringerOpen, 51% charge APCs. The average APC is 1,233 EUR, an increase of 3% over the 2019 average. 46.5% of the 101 journals for which we have 2019 and 2020 data did not change in price; 13.9% decreased in price; and 39.6% increased in price. The extent of change in price was substantial, ranging from a 50% price drop to a 94% price increase.
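The price-change breakdown reported above (share of journals with no change, a decrease, or an increase, plus the range of change) can be computed from paired prices. The Python sketch below shows one way to do this; the function name and the price pairs are hypothetical and are not taken from the SpringerOpen data.

```python
def price_change_summary(pairs):
    """Summarize (old_price, new_price) pairs.

    Returns the share of journals (in percent) whose price increased,
    decreased, or stayed the same, plus the largest drop and largest rise
    as percent changes relative to the old price.
    """
    changes = [100 * (new - old) / old for old, new in pairs]
    total = len(changes)
    increased = sum(change > 0 for change in changes)
    decreased = sum(change < 0 for change in changes)
    unchanged = total - increased - decreased
    return {
        "increased_pct": 100 * increased / total,
        "decreased_pct": 100 * decreased / total,
        "unchanged_pct": 100 * unchanged / total,
        "largest_drop_pct": min(changes),
        "largest_rise_pct": max(changes),
    }

# Hypothetical 2019 and 2020 prices in EUR for four journals.
pairs = [(1000, 1000), (1000, 500), (1000, 1940), (2000, 2100)]
summary = price_change_summary(pairs)
print(summary)
```

Only journals with a price in the same currency for both years can be included in such a comparison.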

Detail – download the PDF: springer open 2019-2020

Data (for DOAJ 2016 – 2019 data for journals that are now hybrid see columns BV – ): Springeropen_2019_2020

Cite as: Shi, A. & Morrison, H. (2020). SpringerOpen pricing trends 2019-2020. Sustaining the Knowledge Commons May 25, 2020 https://sustainingknowledgecommons.org/2020/06/11/springeropen-2019-2020/

BioMedCentral 2020

BioMedCentral (BMC) 2019 – 2020

by Anqi Shi & Heather Morrison

Key points

  • Open access commercial publishing pioneer BMC is now wholly owned by a private company with a portfolio including lines of business that derive revenue from journal subscriptions, book sales, and textbook sales and rentals
  • Two former BMC fully OA journals, listed in DOAJ from 2014 – 2018 as having CC-BY licenses, are now hybrid and listed on the Springer website and have disappeared from the BMC website
  • 67% of BMC journals with APCs in 2019 and 2020 increased in price and 11% decreased in price.
  • Journals with price increases had a higher average APC in 2019, i.e. more expensive journals appear to be more likely to increase in price

Abstract

Founded in 2000, BioMedCentral (BMC) was one of the first commercial open access (OA) publishers and a pioneer of the article processing charge (APC) business model. BMC was acquired by Springer in 2008. In 2015, Springer was acquired by the Holtzbrinck Publishing Group and became part of SpringerNature. In other words, BMC began as an OA publisher and is now one of the imprints or business lines of a company whose other lines of business include sales of journal subscriptions and scholarly books, and textbook sales and rentals. Of the 328 journals actively published by BMC in 2020, 91% charge APCs. The average APC was 2,271 USD, an increase of 3% over 2019. This small overall increase in the average APC masks substantial changes at the individual journal level. As first noted by Wheatley (2016), BMC price changes from one year to the next are a mix of increases, decreases, and retention of the same price. In 2020, 67% of the 287 journals for which we have pricing in USD for both 2019 and 2020 increased in price, 11% decreased in price, and 22% did not change price. It appears that the more expensive journals are more likely to increase in price. The average 2019 price of the journals that increased in 2020 was 2,307 USD, 18% higher than the 2019 average of 1,948 USD for journals that decreased in price. 173 journals increased in price by 4% or more, well above the inflation rate. 39 journals increased in price by 10% or more; 13 journals increased in price by 20% or more. Also in 2020, 11 new journals were added, 11 journals ceased publication, 5 titles were transferred to other publishers, 2 journals changed from no publication fee to having an APC, and 3 journals dropped their APCs. Two journals formerly published fully OA by BMC are no longer listed on the BMC website, but are now listed as hybrid on the Springer website. This is a small portion of the total, but is worth noting because it runs in the opposite direction to the transformation (from subscriptions to OA) officially embraced by SpringerNature.
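
The 18% figure above is a relative difference between two group averages. A minimal sketch of that arithmetic follows; it uses only the two averages quoted in the abstract, and the function name is illustrative rather than taken from the study's own spreadsheet.

```python
def percent_higher(a: float, b: float) -> float:
    """Return how much higher a is than b, as a percentage of b."""
    return (a - b) / b * 100


# 2019 average APCs in USD, as quoted in the abstract
avg_2019_increased = 2307  # journals whose price increased in 2020
avg_2019_decreased = 1948  # journals whose price decreased in 2020

diff = percent_higher(avg_2019_increased, avg_2019_decreased)
print(f"{diff:.0f}% higher")
```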

Details and documentation: download the PDF: BMC_2019_2020_as_hm

Data: BMC_2019_2020

Cite as: Shi, A. & Morrison, H. (2020). BioMedCentral 2020. Sustaining the Knowledge Commons. https://sustainingknowledgecommons.org/2020/06/08/biomedcentral-2020/

Frontiers 2020: a third of journals increase prices by 45 times the inflation rate

Updates June 4:

  1. Frontiers’ comment regarding their pricing transparency below is helpful. It is important for those who support gold OA publishing to understand the cost implications of their demands and expectations. Frontiers states: “As Frontiers’ sole source of income, APCs allow us to subsidize new journals and communities with less research funding, to reinvest in our publishing platform, and to offer a fee support program. More than a third of all articles published in 2017/18 received full or partial waivers as a result of this approach, which we fully intend to continue to offer in the years ahead.” An average APC of 2,170 USD could support hosting a whole journal in North America and, in less affluent countries, could be enough to fund all or part of a year of a highly paid researcher’s salary. If granting agencies were to directly subsidize local publishing in both more and less affluent countries, this would probably cost less and do more (by supporting local development) than expecting publishers like Frontiers to subsidize APCs.
  2. It has come to my attention that this post happens to coincide with negotiations on a national agreement between Frontiers and Germany in the context of PlanS / cOAlition S. Details about the agreement can be found:

A third of the journals published by Frontiers in both 2019 and 2020 (20 of 61 journals) have increased in price by 18% or more (up to 55%). This is quite a contrast with the 0.4% Swiss inflation rate for 2019 according to Worlddata.info; 18% is 45 times the inflation rate. This is an even more marked contrast with the current and anticipated economic impact of COVID; according to Le News, “A team of economic experts working for the Swiss government forecasts a 6.7% fall in GDP” (Frontiers’ headquarters is in Switzerland).
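
The two headline figures in this paragraph can be reproduced with simple arithmetic. The sketch below uses only the numbers quoted above; the variable names are illustrative.

```python
# Figures quoted in the paragraph above
journals_with_large_increase = 20   # journals up 18% or more
journals_priced_both_years = 61     # journals with 2019 and 2020 prices
min_increase_pct = 18.0             # smallest increase in that group
swiss_inflation_2019_pct = 0.4      # per Worlddata.info, as cited

# Share of journals with a large increase ("a third")
share_pct = journals_with_large_increase / journals_priced_both_years * 100

# How many times the inflation rate the smallest large increase is
multiple = min_increase_pct / swiss_inflation_2019_pct

print(f"Share of journals: {share_pct:.0f}%")
print(f"Multiple of inflation rate: {multiple:.0f}x")
```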

This is similar to our 2019 finding that 40% of Frontiers’ journals had increased in price by 18% or more (Pashaei & Morrison, 2019) and our 2018 finding that 40% of Frontiers’ journals had increased in price by 18% – 31% (Morrison, 2018).

The price increases are on top of already high prices. For example, Frontiers in Earth Science increased from 1,900 USD to 2,950 USD, a 55% price increase. Frontiers in Oncology increased from 2,490 to 2,950 USD, an 18% price increase.

This illustrates an inelastic market. Payers of these fees are largely government research funders, either directly or indirectly through university libraries or researchers’ own funds. The payers are experiencing a major downturn and significant challenges such as lab closures, working from home in lockdown conditions, and additional costs to accommodate public health measures, while Frontiers clearly expects ever-increasing revenue and profit.

Following is a list of Frontiers journals with price increases. All pricing is in USD.

Journal title | 2020 APC | 2019 APC | Price change 2019 to 2020 (USD) | Price change 2019 to 2020 (%)
Frontiers in Earth Science | 2,950 | 1,900 | 1,050 | 55%
Frontiers in Veterinary Science | 2,950 | 1,900 | 1,050 | 55%
Frontiers in Cardiovascular Medicine | 2,490 | 1,900 | 590 | 31%
Frontiers in Ecology and Evolution | 2,490 | 1,900 | 590 | 31%
Frontiers in Energy Research | 2,490 | 1,900 | 590 | 31%
Frontiers in Environmental Science | 2,490 | 1,900 | 590 | 31%
Frontiers in Molecular Biosciences | 2,490 | 1,900 | 590 | 31%
Frontiers in Nutrition | 2,490 | 1,900 | 590 | 31%
Frontiers in Physics | 2,490 | 1,900 | 590 | 31%
Frontiers in Surgery | 2,490 | 1,900 | 590 | 31%
Frontiers in Artificial Intelligence | 1,150 | 950 | 200 | 21%
Frontiers in Bioengineering and Biotechnology | 2,950 | 2,490 | 460 | 18%
Frontiers in Cell and Developmental Biology | 2,950 | 2,490 | 460 | 18%
Frontiers in Chemistry | 2,950 | 2,490 | 460 | 18%
Frontiers in Integrative Neuroscience | 2,950 | 2,490 | 460 | 18%
Frontiers in Marine Science | 2,950 | 2,490 | 460 | 18%
Frontiers in Materials | 2,950 | 2,490 | 460 | 18%
Frontiers in Oncology | 2,950 | 2,490 | 460 | 18%
Frontiers in Pediatrics | 2,950 | 2,490 | 460 | 18%
Frontiers in Systems Neuroscience | 2,950 | 2,490 | 460 | 18%
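
For readers who want to check the two change columns, here is a minimal sketch of how they can be derived from the 2019 and 2020 prices. It covers one journal from each price tier in the list above; the function name and the rounding to whole percentages are assumptions made for illustration, not a description of how the original spreadsheet was built.

```python
def apc_change(apc_2019, apc_2020):
    """Return (change in USD, change as a whole-number percentage of the 2019 price)."""
    diff = apc_2020 - apc_2019
    pct = round(diff / apc_2019 * 100)
    return diff, pct


# (2019 APC, 2020 APC) in USD, taken from the list above
prices = {
    "Frontiers in Earth Science": (1900, 2950),
    "Frontiers in Cardiovascular Medicine": (1900, 2490),
    "Frontiers in Artificial Intelligence": (950, 1150),
    "Frontiers in Oncology": (2490, 2950),
}

changes = {title: apc_change(*pair) for title, pair in prices.items()}

for title, (diff, pct) in changes.items():
    print(f"{title}: +{diff} USD ({pct}%)")
```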

The full spreadsheet can be found here:

Frontiers_OA_main_2020

References

Morrison, H. (2018). Frontiers: 40% journals have APC increases of 18 – 31% from 2017 to 2018. Sustaining the Knowledge Commons / Soutenir Les Savoirs Communs. Retrieved from https://sustainingknowledgecommons.org/2018/04/12/frontiers-40-journals-have-apc-increases-of-18-31-from-2017-to-2018/

Pashaei, H., & Morrison, H. (2019). Frontiers in 2019: 3% increase in average APC. Sustaining the Knowledge Commons / Soutenir Les Savoirs Communs. Retrieved from https://sustainingknowledgecommons.org/2019/04/30/frontiers-in-2019-3-increase-in-average-apc/

Cite as: Morrison, H. (2020). Frontiers 2020: a third of journals increase prices by 45 times the inflation rate. Sustaining the Knowledge Commons / Soutenir Les Savoirs Communs. https://sustainingknowledgecommons.org/2020/06/03/frontiers-2020-a-third-of-journals-increase-prices-by-45-times-the-inflation-rate/

CNKI free services during COVID-19 and OA long-term practice

Abstract

Chinese National Knowledge Infrastructure (CNKI), initiated in 1999 by Tsinghua University and Tsinghua Tongfang Co., Ltd., is both the largest institutional repository in China and a near-monopoly provider of for-pay academic databases, among other services, with a higher profit margin than Elsevier or Wiley. With promotion and support from the government, CNKI continues to move towards open access [1]. CNKI offers free access to millions of documents ranging from dissertations and academic articles to popular and party journals. The CNKI Open Access Aggregator (COAA), launched in 2019, makes available more than 10,000 open access journals, although foreign scholars may find it difficult to benefit from this because of the language barrier. CNKI has played an important role in making works on COVID-19 freely available, as well as in expanding access to subscribers at home during lockdown.

Details

CNKI stands for Chinese National Knowledge Infrastructure. It was initiated by Tsinghua University and Tsinghua Tongfang Co., Ltd. and founded in June 1999. According to Tongfang’s annual report, in 2004 the company officially opened the world’s largest Chinese knowledge portal, the ‘CNKI (cnki.net) database’, informally known as ‘Zhiwang’. CNKI is currently China’s largest integrator of academic electronic resources, including more than 95% of officially published Chinese academic resources.

At the end of 2017, CNKI had more than 20,000 institutional users and more than 20 million registered individual users; full-text downloads amounted to 2 billion pages per year, and there were more than 150,000 online users. The market share of CNKI in Chinese undergraduate colleges is 100%. [2]

As most students know, the best way to access databases from off campus is through a VPN. However, in some situations, such as during the COVID-19 lockdown in China, a VPN cannot be used in some places. Some major Chinese database vendors therefore provided limited-time free services. According to the Central China Normal University Library announcement, during the COVID-19 epidemic period (the service period is tentatively from February 1 to March 3, 2020), CNKI provides 4 free services including CNKI database literature acquisition, research learning, and collaborative scientific research services (CNKI OKMS platform). (English translation by the author) At the same time, the university’s students are offered a new online entry point to access the CNKI database.[3]

For Chinese readers, CNKI developed a dedicated online platform to release and promote the latest COVID-19 related study results. The platform name appears in red font on the homepage. The platform includes 2,256 journals in total, including 23 non-Chinese journals.[4]

Source: print screen from https://cnki.net/

At the same time, CNKI announced free access to the CNKI OKMS platform, to help research teams communicate without interruption during this unusual period. “OKMS Huizhi” is office software for collaborative research.

Ms. Dai also stresses that the “OKMS Huizhi” platform was launched in May 2019, and that it is now free because of the COVID-19 epidemic so that everyone can do research from home. The “OKMS Huizhi” platform will be open for free until June 1. (English translation by the author) [5]

In addition to the limited-time free access during the COVID-19 pandemic, CNKI has opened a variety of ongoing services, for example, full-text open access to some Chinese published literature.

This service, which started in November 2015, is aimed at the whole of China. The types of documents served include academic journals, conference papers, doctoral dissertations, master’s theses, and newspapers.

The free service scope for 2020 is all documents published by CNKI in 2011 and before, including 40.89 million articles published in 11,402 journals from 1911 to 2011, accounting for about 59.8% of all documents. These include academic journals; culture, art, and other popular journals; party construction, political newspapers, and other party and government journals; higher education, vocational education, and other educational journals; and economic information journals. From 2000 to 2011 CNKI published 188,000 doctoral dissertations, 1.51 million ancillary papers, and 4.17 million conference papers, accounting for 45.6%, 38.1%, and 67.4% respectively, as well as 18.15 million articles from more than 400 newspapers from 2001 to 2019, totaling 64.908 million articles. (English translation by the author) [6]
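
As a quick check on the total, the component counts listed above can be summed. The sketch below works in thousands of documents so that the arithmetic stays in whole numbers; the category labels follow the translated text, and the variable names are illustrative.

```python
# Free-access document counts from the paragraph above, in thousands
free_documents_thousands = {
    "journal articles (1911-2011)": 40_890,
    "doctoral dissertations (2000-2011)": 188,
    "ancillary papers (2000-2011)": 1_510,
    "conference papers (2000-2011)": 4_170,
    "newspaper articles (2001-2019)": 18_150,
}

total_thousands = sum(free_documents_thousands.values())
print(f"Total: {total_thousands / 1000:.3f} million documents")
```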

For Chinese authors, there is a free service that started in September 2019, aimed at authors who have Chinese publications collected in the CNKI database. On this free online author service platform, authors can download their own published documents for free, manage academic achievements, obtain academic evaluation reports, track academic frontier developments, and submit to journals online.[7] For English readers, CNKI keeps updating its overseas website. At the time of writing this blog post, the open access (OA) online-first publishing platform for COVID-19 is officially online [http://new.oversea.cnki.net/index/] and includes 2,288 Chinese journals and 25 foreign journals.

Source: print screen from http://new.oversea.cnki.net/index/
Source: print screen from http://en.gzbd.cnki.net/GZBT/brief/Default.aspx

What is more, the CNKI Open Access Aggregator (COAA) has been introduced to foreign scholars. COAA was launched in 2019 and currently has more than 10,000 open access journals covering all fields of science, technology, medicine, social sciences, and humanities.

According to the platform introduction on the COAA webpage, COAA will continue to expand its coverage of open resources, adding open access books, papers, conference papers, etc., to provide users with a large number of open access resources. The journals cover 100 countries and regions on five continents, 100 disciplines, and 70 languages. (English translation by the author) [8] Unfortunately, the homepage and all the instructions are in Chinese. The language barrier could be a difficulty for non-Chinese scholars.

Despite all the effort CNKI has made to develop open access (OA), it faces many challenges. One survey of Chinese readers conducted by Wen revealed that 94.5 percent of the respondents were unaware of the existence of OA journals.[9] As mentioned above, the market share of CNKI in Chinese undergraduate colleges is 100%, which gives CNKI a monopolistic stranglehold on the Chinese world of academic publishing. According to Wang Yiwei’s article of July 24, 2019, CNKI has posted an average annual profit margin of nearly 60% in the past decade, almost double that of Wiley [10].


At the end of 2018, the Taiyuan University of Technology, located in Taiyuan, Shanxi province, China, posted a notice on its website regarding the suspension of access to “CNKI” in 2019 [11]. The next day, the university library announced that the budget for the usage contract with CNKI was 588,000 yuan (about $85,500). [12]

Cancellations due to high fees happen around the world. For example, in 2020 SUNY (the State University of New York system) subscribed to approximately 250 Elsevier titles instead of the whole database, an approach that will save SUNY institutions $7 million annually. [13]

CNKI, which has been developed with the strong support of the government, the Ministry of Education, the Ministry of Science and Technology, and other departments, could assume more social responsibility through open access (OA) instead of taking advantage of its leading position to gain more economic benefits. As the rapid development of online services is being promoted by the national government during the COVID-19 pandemic, it is believed that open access (OA) will become the future of academic library exchanges in China.

References:

[1] Zhong, Jing, and Shuyong Jiang. 2016. “Institutional Repositories in Chinese Open Access Development: Status, Progress, and Challenges.” The Journal of Academic Librarianship 42 (6): 739–44. https://doi.org/10.1016/j.acalib.2016.06.015.

[2] 谭捷, 张李义 & 饶丽君. (2010). 中文学术期刊数据库的比较研究 [A comparative study of Chinese academic journal databases]. 图书情报知识 (04), 4-13. doi:10.13366/j.dik.2010.04.015. https://kns8.cnki.net/KCMS/detail/detail.aspx?dbcode=CJFD&dbname=CJFD2010&filename=TSQC201004005&v=MDAwNDFyQ1VSN3FmWStSbUZpL2tVcjNOTVQ3YWJiRzRIOUhNcTQ5RllZUjhlWDFMdXhZUzdEaDFUM3FUcldNMUY=

[3] Central China Normal University Library Announcement (2020). 疫情期间限时免费数据库使用攻略 [Guide to using limited-time free databases during the epidemic]. http://lib.ccnu.edu.cn/info/1071/4595.htm

[4] CNKI 2.0 homepage. https://kns8.cnki.net/nindex/

[5] 本王整理 (2020-02-04). 刚刚!中国知网道歉了,并对免费服务项目做出说明 [Just now! CNKI apologized and issued an explanation of its free service items]. http://www.ecorr.org/news/industry/2020-02-04/176080.html

[6] 《中国学术期刊(光盘版)》电子杂志社有限公司 (2020-02-01). 关于中国知网免费服务项目的说明 [Explanation of CNKI’s free service items]. https://piccache.cnki.net/index/images2009/other/2020/freeservice.html

[7] open-access author service platform. https://expert.cnki.net/Register/AuthorPlat

[8] COAA platform introduction (2019). http://coaa.discovery.cnki.net/public/about

[9] Wen (2008), as cited in Hu (2012). Hu, Dehua. 2012. “The Availability of Open Access Journals in the Humanities and Social Sciences in China.” Journal of Information Science 38 (1): 64–75. https://doi.org/10.1177/0165551511428919.

[10] Wang Yiwei (2019-07-24). Publish or Perish: How China’s Elsevier Made its Fortune. https://www.sixthtone.com/news/1004345/publish-or-perish-how-chinas-elsevier-made-its-fortune

[11] Zhang shumei (2018-12-28). Notice on suspending access to “CNKI series database” in 2019 http://www2017.tyut.edu.cn/info/1026/11127.htm

[12] Tendering and Procurement Center (2018-12-29). 2019 Electronic Periodical Database Renewal Service Project Transaction Announcement http://cgzb.tyut.edu.cn/info/1076/3542.htm

[13] Big Deal Cancellation Tracking. https://sparcopen.org/our-work/big-deal-cancellation-tracking/

 

Cite as: Shi, A. (2020). CNKI free services during COVID-19 and OA long-term practice. Sustaining the Knowledge Commons. https://sustainingknowledgecommons.org/2020/05/05/cnki-free-services-during-covid-19-and-oa-long-term-practice/