Open access article processing charges 2011 – 2021

by: Heather Morrison, Luan Borges, Xuan Zhao, Tanoh Laurent Kakou & Amit Nataraj Shanbhoug

Abstract

This study examines trends in open access article processing charges (APCs) from 2011 – 2021, building on a 2011 study by Solomon & Björk (2012). Two methods are employed, a modified replica and a status update of the 2011 journals. Data is drawn from multiple sources and datasets are available as open data (Morrison et al, 2021). Most journals do not charge APCs; this has not changed. The global average per-journal APC increased slightly, from 906 USD to 958 USD, while the per-article average increased from 904 USD to 1,626 USD, indicating that authors choose to publish in more expensive journals. Publisher size, type, impact metrics and subject affect charging tendencies, average APC and pricing trends. About half the journals from the 2011 sample are no longer listed in DOAJ in 2021, due to ceased publication or publisher de-listing. Conclusions include a caution about the potential of the APC model to increase costs beyond inflation, and a suggestion that support for the university sector, responsible for the majority of journals, nearly half the articles, with a tendency not to charge and very low average APCs, may be the most promising approach to achieve economically sustainable no-fee OA journal publishing.

A preprint of the full article is available here: https://ruor.uottawa.ca/handle/10393/42327

The two base datasets and their documentation are available as open data:

Morrison, Heather et al., 2021, “2011 – 2021 OA APCs”, https://doi.org/10.5683/SP2/84PNSG, Scholars Portal Dataverse, V1

Citation: for the article, cite the original URL rather than this blog post URL; for the data, use the citation above.

Morrison, H., Borges, L., Zhao, X., Kakou, T.L., Shanbhoug, A.N. (2021). Open access article processing charges 2011 – 2021. Preprint. Sustaining the Knowledge Commons. https://ruor.uottawa.ca/handle/10393/42327

Improving the DOAJ metadata – Why and how

by: Xuan Zhao & Heather Morrison

Abstract

The Directory of Open Access Journals (DOAJ, http://doaj.org/) is an essential world-wide open access service (16,134 journals listed, as of March 29, 2021), which promotes quality, peer-reviewed open access journals. Journals that are included gain greater and broader visibility. To make the most of this service, journal editors need to pay attention to the accuracy of their entries in the DOAJ metadata (journal title, publisher information, location information, subject, language, URLs, etc.). This post aims to explain how journals benefit from improving the quality of their metadata and what journal editors can do.

Our discussion is mainly based on recent research by the Sustaining the Knowledge Commons (SKC) team and cites some other researchers’ findings.

For journals, what are the benefits of improving the DOAJ metadata?

As detailed on the DOAJ website (DOAJ, https://doaj.org/apply/why-index/), there are five benefits for journals indexed in DOAJ, and accordingly, five reasons to improve the metadata: 

  1. “Reputation and prominence”

“DOAJ is the most important community-driven, open access service in the world and has a reputation for advocating best practices and standards in open access. By indexing your journal in DOAJ, its reputation and prominence will be enhanced.”

We assume that journals with accurate and precise entries give the impression of being serious and active, which helps them maintain their reputation.

  2. “Standards and best practice”

“DOAJ’s basic criteria for inclusion have become the accepted way of measuring an open access journal’s adherence to standards in scholarly publishing. We can help you adopt a range of ethical and quality standards, making your journals more attractive publishing channels. DOAJ is committed to combatting questionable publishers and questionable publishing practices, helping to protect researchers from becoming trapped by unethical journals.”

Because open access journals are listed in a quality standards system like DOAJ, it is important to make sure that their information is correct, so that they can be clearly distinguished from questionable journals.

  3. “Funding and compliance”

“Open access publication funds often require that authors who want funding must publish in journals that are included in DOAJ. Indexing in DOAJ makes your journals compliant with many initiatives and programmes around the world, for example Plan S in Europe or Capes/Qualis in Brazil.”

With correct entries in metadata, the DOAJ journals can be more easily discovered by foundations, related programmes and organizations.

  4. “Discoverability and visibility”

“DOAJ metadata is free for anyone to collect and use, which means it is easily incorporated into search engines and discovery services. It is then propagated across the internet. If you provide us with article metadata for your journal, this will be supplied to all the major aggregators and the many research organisations and university library portals who use our widgets, RSS feeds, API and other services. Indexing your journal in DOAJ is likely to increase traffic to your website and give greater exposure to your published content. Levels of traffic to a journal website typically increase threefold after inclusion in DOAJ. Your journal’s visibility in search engines, such as Google, will improve.”

Indexing journals in DOAJ means they are more easily discovered and cited by other researchers. Correcting metadata will help raise the chances that people working in the same area will find the relevant research they need.

  5. “International coverage”

“Our database includes more open access journals from a diverse list of countries than any of the other major indexing services. We have a global editorial team via a network of Managing Editors, Ambassadors and volunteers, so we will do our best to offer local support in your language. We promise you that information about your journal will be seen around the world.”

The DOAJ journals are aimed at readers from all over the world and may be seen by people who are not proficient in the journals’ language. In this case, journal editors need to ensure the correctness of data entry so that readers can read with confidence. 

What’s more, a higher quality database will be more valuable for researchers and will benefit the entire OA ecosystem. This is especially true for services such as university libraries, which tend to keep up with the latest content and take advantage of metadata corrections.

In brief, keeping the entries of DOAJ metadata correct reinforces the advantages for journals mentioned above and benefits the users of DOAJ. 

As journal editors, what can we do?

As demonstrated in an SKC study (Zhao, Borges & Morrison, 2021), “as of January 5, 2021, only 30% of DOAJ journals have a ‘last update’ date within the previous year (2020)”, which means that only 30% of DOAJ journals fully or partially updated their information in the DOAJ system during that year. To make the best use of DOAJ, journal editors should regularly check their entries to ensure that their data is correct and up to date. For example, if a journal URL is not kept up to date, the incorrect URL means, at best, that the journal cannot be found. Crawford (2016), in a study of DOAJ journals, found journals that were flagged as malware (or as containing malware) by Malwarebytes, Windows Defender, McAfee Site Advisor or Office 2013.

Most of the visible inconsistencies in the metadata are input errors or location errors (listed below). Most of the input errors are “small differences in punctuation and/or characters, extra spaces at the beginning and/or at the end”, as reported by SKC (Zhao, Borges & Morrison, 2021). Combining these results with the findings of Crawford (2016), we list the data that may need correction, by category:

  • Input error or location error in:

wrong column, journal title, special character, keywords, copyright information URL, plagiarism information URL, URL for journal’s instructions for authors, other submission fees information URL, preservation services, preservation service: national library, preservation information URL, deposit policy directory, persistent article identifiers, URL for journal’s open access statement, etc. 

  • Publisher name duplicates:

Extra or missing spaces, minor details (e.g. a non-English character in one name but not the other), minor differences in punctuation and/or characters (e.g. “Abant İzzet Baysal Üniversitesi” vs. “Abant İzzet Baysal University”), an abbreviation in one name but not the other (e.g. “Asociación Interuniversitaria de Investigación Pedagógía” vs. “Asociación Interuniversitaria de Investigacion Pedagogica (AIDIPE)”), etc.

  • “APC-charging journals that don’t clearly state the amount charged” (Crawford, 2016)

Sometimes it is hard to determine “who is the publisher”. Some of the situations in which this happens are listed below:

  • When there are branch publishers under one publisher, and all of them are recorded in DOAJ, especially when their journals’ websites do not give any clear indication;
  • When a publisher has more than one active name (perhaps due to different sponsors of one publisher, or the nature of commercial publishers), but its journals’ websites do not give any clear indication;
  • When journals changed their websites but didn’t renew the URLs in the DOAJ database;
  • Invalid URLs;
  • Unmatched publisher name/journal name and URLs.

DOAJ also provides article-level search and is working to encourage more journals to provide article-level metadata. It makes both the journal-level and article-level metadata available for anyone to download (DOAJ, https://doaj.org/docs/public-data-dump/). Journal editors should therefore also ensure that the information about their articles is correct.

References

Crawford, W. (2016). Gold Open Access Journals 2011 – 2015. https://waltcrawford.name/goaj1115.pdf

Directory of Open Access Journals. Retrieved March 29, 2021, from http://doaj.org/

Public data dump. Directory of Open Access Journals. Retrieved March 29, 2021, from https://doaj.org/docs/public-data-dump/

Why index your journal in DOAJ? Directory of Open Access Journals. Retrieved March 29, 2021, from https://doaj.org/apply/why-index/

Zhao, X., Borges, L., & Morrison, H. (2021). Some limitations of DOAJ metadata for research purposes. Sustaining the Knowledge Commons. https://sustainingknowledgecommons.org/2021/02/10/some-limitations-of-doaj-metadata-for-research-purposes/

Some limitations of DOAJ metadata for research purposes

by: Xuan Zhao, Luan Borges, & Heather Morrison

Abstract

The Directory of Open Access Journals (http://doaj.org) is an excellent service that fulfills many important functions, in particular facilitating access to a vetted collection of over 15,000 freely available peer-reviewed journals. The DOAJ search services and metadata download are very useful for researchers as well. The purpose of this post is to alert researchers to some of the limitations of the DOAJ metadata that they need to take into account to avoid drawing erroneous conclusions. First, when downloading DOAJ metadata, it is necessary to open the .csv file in Unicode in order to retain non-English characters. We open the file in Open Office for this reason, then save it as an Excel file. The nature of the metadata means that some data is inserted in the wrong column; clean-up, as discussed below, is necessary before data analysis. When journal editors or others working on their behalf enter metadata into DOAJ, research is not the primary purpose of this exercise; for this reason, in-depth assessment and corrections may be necessary before analysis. Below, we present publisher size analysis as an example of what researchers may encounter. Finally, because the main purpose of DOAJ is connecting readers with content, the metadata of interest to a particular research project may not be up to date. As demonstrated below, as of Jan. 5, 2021, only 30% of DOAJ journals have a “last update” date within the previous year (2020). We do not know whether the “last update” date reflects a full or partial metadata review. We illustrate the potential impact on research results with the example of the SKC longitudinal APC study. Of the 4,292 DOAJ journals that responded “yes” to the APC question, only 30% have a last update date of 2020 or 2021. Even with this 30% of journals, we have no way of knowing whether the APC status and/or amount per se was updated, or only other unrelated metadata.
This means that if we compare 2019 prices obtained from publisher websites in 2019 with 2021 DOAJ APC metadata, we will almost certainly get incorrect results, for example by falsely assuming that matching APC amounts mean no change in prices. DOAJ provides rich and useful metadata for the researcher, and the research question “is this journal listed in DOAJ?” is of value in and of itself. For this reason, we intend to continue using DOAJ metadata in addition to data derived from other sources, particularly data derived directly from publisher websites. See below for a link to an open data version of the DOAJ metadata reflecting the corrections explained in this post.

Details

Correcting for displaced observations

As previously mentioned, the first step to confidently use the DOAJ metadata for analysis and research is identifying and correcting data inserted in the wrong column, herein also called displaced observations. 

Below we can see an example of a displaced observation from the DOAJ metadata. Column BB has no assigned variable while containing some observations, apparently displaced one column to the right. 

Table 1 – An example of misplaced data from 2021 DOAJ metadata

Users may follow different steps to correct for displaced data. Here we explain in more detail how we have identified these displacements and corrected them.  

Before proceeding with any analysis, it is important to become familiar with the DOAJ metadata. We recommend that users read the DOAJ Guide to applying, available online, because the metadata reflects responses to questions asked in the application process. The DOAJ metadata, as of 5 Jan. 2021, has 53 variables, ranging from Journal Title to Country to Most recent article added. It may be helpful to start correcting observations from variables with easily identifiable responses, such as « Country » or « Country of Publisher », or variables that allow only two types of answers (i.e., Yes or No), such as Author holds copyright without restrictions and APC. We recommend creating a pivot table to identify displaced observations, and repeating this process until no observations are found in a wrong column.
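The pivot-table check described above can also be scripted. The following is a minimal sketch in Python, not the procedure we used: it counts the distinct values in one column and flags any value outside the expected answers. The three-row sample and the exact column headings are hypothetical stand-ins for the downloaded file.

```python
import csv
import io
from collections import Counter

# Hypothetical three-row extract; the real download has 53 columns,
# and its exact column headings may differ from the ones assumed here.
SAMPLE = """Journal title,Country of publisher,APC
Revista A,Brazil,No
Journal B,Yes,https://example.org/fees
Journal C,Canada,Yes
"""


def value_counts(rows, column):
    """Count each distinct value in one column, like a one-variable pivot table."""
    return Counter(row[column] for row in rows)


def unexpected_values(rows, column, allowed):
    """Return (journal title, value) pairs whose value is not an allowed answer."""
    return [(row["Journal title"], row[column])
            for row in rows if row[column] not in allowed]


# For a real file, open it with an explicit Unicode encoding so that
# non-English characters survive, e.g.
# open("doaj.csv", encoding="utf-8", newline="")  (file name is hypothetical).
rows = list(csv.DictReader(io.StringIO(SAMPLE)))

print(value_counts(rows, "APC"))
print(unexpected_values(rows, "APC", {"Yes", "No"}))
```

A value such as a URL in a Yes/No column suggests that the row has been displaced and should be inspected by hand.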

When cleaning up the DOAJ metadata, users will notice that in some cases only one observation was displaced; in other cases, an entire row was displaced beginning at a specific variable. In the example highlighted in yellow below, all observations beginning at the variable Publisher were displaced one column to the right.

Table 2 – Line 36 illustrates an example of an entire row with displaced observations
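Once a displaced row has been confirmed by inspection, the repair itself is mechanical. The sketch below is illustrative only and rests on an assumption that must be checked for each row: that the cell at the position where the displacement starts is spurious (for example, empty) and can be dropped. The function name, header and sample row are hypothetical.

```python
def shift_left(row, start, width):
    """Drop the cell at index `start`, then pad or trim the row to `width` cells."""
    repaired = row[:start] + row[start + 1:]
    repaired += [""] * (width - len(repaired))  # pad if the row is now too short
    return repaired[:width]


# Hypothetical four-column header and a row in which everything from
# Publisher onward sits one column too far to the right.
header = ["Journal title", "Publisher", "Country of publisher", "APC"]
displaced = ["Journal B", "", "Example Press", "Canada", "No"]

fixed = shift_left(displaced, header.index("Publisher"), len(header))
print(dict(zip(header, fixed)))
```

The repair is only as good as the manual check that precedes it; applying it to a row that is not actually displaced would delete a valid observation.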

Data entry inconsistencies

When correcting for displaced observations, we have also identified some inconsistencies in the way observations are registered in the DOAJ metadata. The table below lists the main visible inconsistencies found for some variables. In the majority of instances, the inconsistencies will not impact DOAJ users looking up information for a particular journal. However, it is important to take into account these inconsistencies before proceeding to any automated statistical analysis. For example, DOAJ metadata as is can be used to identify the number of journals with persistent article identifiers, but automated counting of DOI v. ARK or other approaches would require some advance data manipulation.

Variable | Example
--- | ---
Alternative title | Some journals’ alternative titles are registered as a number. Some examples are “2300-6633” and “0”.
Keywords | Some observations contain special characters, as follows: “6.         rheology, tribology, hydrodynamics, thermodynamics, mechanics of structures, mechatronics.”; “           water cycles, water environment, water treatment and reuse, water resource, water quality, hydrology”; “ •          natural sciences, •      environmental sciences, •      social sciences, agricultural sciences, veterinary medicine, medical sciences”
Copyright information URL | Some URLs lack a letter at the beginning or the end. In this example, there should be an « h » at the beginning and an « l » at the end of the link: ttp://www.emeraldgrouppublishing.com/services/publishing/jiuc/authors.htm
Plagiarism information URL | Some URLs lack a letter at the beginning or the end. In this example, there should be an « h » at the beginning and an « l » at the end of the link: ttp://www.emeraldgrouppublishing.com/services/publishing/jiuc/authors.htm
URL for journal’s instructions for authors | Some URLs lack a letter « h » at the beginning. In this example, there should be an « h » at the beginning of the URL: ttps://revistas.unasp.edu.br/LifestyleJournal/about/submissions
Other submission fees information URL | Some URLs have extra letters. This example has a letter « i » at the beginning of the URL: ihttps://journals.univie.ac.at/index.php/voebm/m/index. Some URLs lack a letter « h » at the beginning. In these examples, there should be an « h » at the beginning of the URL: ttp://psr.ui.ac.id/index.php/journal/about/submissions#authorGuidelines ; ttps://www.karger.com/Journal/Guidelines/261897#sec62
Preservation Services | Preservation services can be registered as a name or a website
Preservation Service: national library | Preservation services – national library can be registered as a name or a website
Preservation information URL | Some URLs lack letters at the beginning. In these examples, the beginning of the URL is incomplete: tps://periodicos.uff.br/revistagenero/about/editorialPolicies#focusAndScope ; ttp://ejournal.stkip-pgri-sumbar.ac.id/index.php/economica
Deposit policy directory | Deposit policy directory can be registered as a name or a website
Persistent article identifiers | Persistent article identifiers can be registered as an acronym (UDC, DOI, ARK), but also as a website, such as dc.identifier.uri (DSpaceUnipr) or NBN http://www.depositolegale.it/national-bibliography-number/. Another example is the occurrences UDC and UDC (Universal decimal Classification), which are equivalent but were registered differently
URL for journal’s Open Access statement | Some URLs lack a letter « h » at the beginning or a letter at the end, or they have an extra « h » at the beginning. This example has an extra letter « h » at the beginning of the URL: hhttp://www.revistas.usp.br/gestaodeprojetos/about
Table 3 – Visible inconsistencies identified in the DOAJ metadata
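Damaged URL prefixes of the kind shown in Table 3 can be flagged automatically. The sketch below is one possible approach rather than the method used for this post; the function name is hypothetical. It proposes a repair only when the text before « :// » is either a truncated « http » or « https », or one of those schemes with extra leading letters. It cannot detect a letter missing at the end of a URL, and a repaired URL still needs to be tested in a browser.

```python
def suggest_url(url):
    """Return a repaired URL, the URL itself if it already looks fine, or None."""
    url = url.strip()
    prefix, separator, rest = url.partition("://")
    if not separator or not prefix.isalpha():
        return None  # no recognisable scheme; needs manual review
    prefix = prefix.lower()
    for scheme in ("https", "http"):
        if prefix == scheme:
            return url  # already well formed
        if prefix.endswith(scheme):
            return scheme + "://" + rest  # extra leading letters, e.g. "ihttps"
    for scheme in ("https", "http"):
        if scheme.endswith(prefix):
            return scheme + "://" + rest  # leading letters missing, e.g. "ttp"
    return None


# Examples taken from Table 3
for damaged in (
    "ttps://revistas.unasp.edu.br/LifestyleJournal/about/submissions",
    "ihttps://journals.univie.ac.at/index.php/voebm/m/index",
    "hhttp://www.revistas.usp.br/gestaodeprojetos/about",
):
    print(damaged, "->", suggest_url(damaged))
```

Returning None for anything ambiguous keeps the check conservative: such entries are left for manual investigation.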

Publisher’s names duplicates investigation and clean-up

The purpose of this project is to prepare a rough picture of publisher size for comparison with the findings of Solomon & Björk (2012). To support the publisher size analysis, we specifically investigated publisher name duplicates and corrected most of the obvious errors, such as small differences in punctuation and/or characters, extra spaces at the beginning and/or at the end, and minor differences in how the same publisher name was entered (please see the examples in Table 4 – Investigative Strategies – Publisher Names Duplicates).

The clean-up process was divided into three stages. Firstly, we created a pivot table for the publisher column to identify entries that differed slightly and therefore were not grouped together. Secondly, when potential duplicates were found, we conducted an investigation to confirm the duplicates and/or to decide which name to keep (in priority order: use the name with the most journal entries; correct a name with an obvious typo; use the first name listed). Please see the investigative strategies below:

Table 4 – Investigative Strategies – Publisher Names Duplicates

Thirdly, after identifying inconsistencies in publisher names, we created a table (please see Table 5 – Corrections Gathering – Publisher Names Duplicates) to register all the corrections to the variable Publisher. About 500 inconsistencies were corrected. As a result, the number of publishers in the pivot table decreased from 7,218 entries (data source: pivot table based on the DOAJ metadata) to 6,804 entries (data source: pivot table based on the cleaned-up version of the database).

Table 5 – Corrections Gathering – Publisher Names Duplicates
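The first stage, finding entries that differ only slightly, can be approximated in code. The sketch below is an illustration under stated assumptions and not the procedure we followed: it builds a comparison key for each publisher name by removing accents, punctuation, case differences and extra spaces, then lists the names that share a key. The function names and sample names are hypothetical. It only proposes candidates; each group still needs the investigation described above, and it will not catch translated names or names that begin with different words.

```python
import unicodedata
from collections import defaultdict


def normalize(name):
    """Build a comparison key: no accents, punctuation, case or extra spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    letters = "".join(ch if ch.isalnum() else " " for ch in no_marks)
    return " ".join(letters.casefold().split())


def candidate_duplicates(names):
    """Group distinct spellings that share the same comparison key."""
    groups = defaultdict(set)
    for name in names:
        groups[normalize(name)].add(name)
    return [sorted(variants) for variants in groups.values() if len(variants) > 1]


# Hypothetical publisher names illustrating the kinds of variation described
publishers = [
    "Example University Press",
    " Example University Press",
    "Example  University Press.",
    "Exámple University Press",
    "Other Press",
]

for group in candidate_duplicates(publishers):
    print(group)
```

Keeping the original spellings in each group, rather than overwriting them with the key, leaves the decision about which name to keep to the investigator.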

As illustrated in the two tables above, there were different types of data inconsistencies. To respect the metadata as far as possible, we acted prudently when making decisions. In some cases of minor variation, we followed the URLs to check the publisher websites and to collect convincing evidence. However, we met some complex challenges.

One of the challenges was language. Given the large number and wide range of publishers (124 countries, 80 languages; DOAJ, 7 Feb. 2021, https://doaj.org/), we were unable to identify all of the sources of information. In addition, when there were invalid URLs or unmatched information, it was difficult to achieve any precision. What’s more, among the 7,218 publisher name entries, some potential duplicates were not grouped together because they begin with different words, for example “Editora da Universidade Estadual de Maringá (Eduem)” vs. “Eduem – Editora da Universidade Estadual de Maringá” and “Academica Brâncuşi” vs. “Editura Academica Brâncuşi”. Such entries were usually far apart in the pivot table and hard to detect. More details can be found in Table 6 below:

Different beginning words (examples):
“Academica Brâncuşi” vs. “Editura Academica Brâncuşi”;
“Alexandru Ioan Cuza University of Iaşi” vs. “Editura Universităţii ‘Alexandru Ioan Cuza’ Iaşi”;
“Editora da Universidade Estadual de Maringá (Eduem)” vs. “Eduem – Editora da Universidade Estadual de Maringá”
Table 6 – (1)

Unmatched publisher names (examples):

Original publisher names | Possible correct names | URLs
--- | --- | ---
Canadian Society for the Study of Education. | The Canadian Association for Curriculum Studies | https://jcacs.journals.yorku.ca/index.php/jcacs/index
Badan Penelitian dan Pengembangan Kesehatan | URL directs to a new web link: https://ejournal2.litbang.kemkes.go.id/index.php/jki/index whose publisher name is: Pusat Penelitian dan Pengembangan Biomedis dan Teknologi Dasar Kesehatan | http://ejournal.litbang.kemkes.go.id/index.php/jki
Shaheed Beheshti University of Medical Sciences and Health Services | Kowsarmedical | http://journals.sbmu.ac.ir/jme
Table 6 – (2)

Invalid URLs (examples):

Original publisher names | Original URLs (invalid)
--- | ---
Alborz University of Medical Sciences (the URLs wrongly direct to a website whose contents are meaningless; when we searched the journal title, we were directed to this website: https://enterpathog.abzums.ac.ir/) | http://enterpathog.com/?page=home ; https://jehe.abzums.ac.ir/index.php?slc_lang=en&sid=1
Instituto Nacional de Salud (INS) | http://revistas.ins.gov.py/index.php/rspp/
Instituto Superior de Ciências de Educação do Huambo | http://revista.isced-hbo.ed.ao/rop/index.php/ROP/index
Table 6 – (3)

Given the barriers and challenges mentioned above, we can draw a conclusion about the limitations of the publisher names clean-up project. Precision is not possible in this project because the question “who is the publisher” is complex. Instead of making any definitive claims about publisher size, we are primarily interested in whether the long tail effect (a few big publishers, a few more middle-sized, most very small) reported by Solomon & Björk (2012) can still be observed in DOAJ in 2021.

DOAJ metadata update analysis

The following analysis was conducted to determine whether DOAJ metadata on article processing charges (APCs), covering charging status and amount, would be sufficient for SKC’s longitudinal study on APC trends over time. The answer is clearly no. The metadata for the vast majority of journals in DOAJ (overall and APC charging) has not been updated for more than a year, and it is unknown whether the most recent update would have included an update to APC or other metadata. We will continue to use DOAJ metadata as it is rich and the question “is this journal listed in DOAJ” is of value in and of itself. However, for price comparisons we cannot rely on this data, as doing so would likely result in erroneous conclusions.

DOAJ journals by year of last update.

This chart illustrates the percentage of DOAJ journals by year of last update. Detailed figures are in the table below. Note that just under half the journals were last updated 2 or more years ago (2018 or earlier).

DOAJ last update as of Jan. 5, 2021
Year | # journals last updated | % journals last updated
--- | --- | ---
2015 | 294 | 2%
2016 | 1,469 | 9%
2017 | 2,864 | 18%
2018 | 2,951 | 19%
2019 | 3,412 | 22%
2020 | 4,662 | 30%
2021 | 39 | 0%
Total | 15,691 | 100%
Table 7
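The percentages in Table 7 follow directly from the counts. As a worked check, the short sketch below recomputes each year’s rounded share and the share of journals last updated in 2018 or earlier, using only the figures reported in the table; the variable names are illustrative.

```python
# Counts of DOAJ journals by year of last update, copied from Table 7
counts = {
    2015: 294,
    2016: 1469,
    2017: 2864,
    2018: 2951,
    2019: 3412,
    2020: 4662,
    2021: 39,
}

total = sum(counts.values())

# Share of all journals per year, rounded to whole percentages as in the table
shares = {year: round(100 * n / total) for year, n in counts.items()}

# Journals last updated two or more years before the Jan. 2021 download
older = sum(n for year, n in counts.items() if year <= 2018)

print(total)
print(shares)
print(older, f"{100 * older / total:.1f}%")
```

The journals last updated in 2018 or earlier account for a little over 48% of the total, which is the basis for the statement that just under half were last updated two or more years ago.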

DOAJ APC charging journals by year of last update

The chart above illustrates the percentage of journals that answered “yes” to the DOAJ question about charging APCs by year of last update. The table below provides the detailed figures. Note that only 30% of DOAJ journals that charge APCs were updated in the past year (2020 or 2021). It is also unknown whether in these cases the last update was a thorough review of the metadata, or might have been an update of non-APC data.

DOAJ last update APC journals only Jan. 5, 2021
Year of last update | # of journals last updated | % journals last updated
--- | --- | ---
2015 | 47 | 1%
2016 | 238 | 6%
2017 | 499 | 12%
2018 | 930 | 22%
2019 | 1,286 | 30%
2020 | 1,276 | 30%
2021 | 16 | 0%
Total | 4,292 | 100%
Table 8

A version of the Jan. 5, 2021 DOAJ metadata file reflecting the corrections explained above is available as open data here:

Directory of Open Access Journals; Zhao, Xuan; Borges, Luan; Morrison, Heather, 2021, “DOAJ_metadata_2021_01_05_with_SKC_clean_up”, https://doi.org/10.5683/SP2/G5LEXG, Scholars Portal Dataverse, V1

References

The Directory of Open Access Journals (DOAJ) online: https://doaj.org/

Solomon, D. J., & Björk, B. (2012). A study of open access journals using article processing charges. Journal of the American Society for Information Science and Technology, 63(8), 1485–1495. https://doi.org/10.1002/asi.22673

Cite as: Zhao, X., Borges, L., & Morrison, H. (2021). Some limitations of DOAJ metadata for research purposes. Sustaining the Knowledge Commons. https://sustainingknowledgecommons.org/2021/02/10/some-limitations-of-doaj-metadata-for-research-purposes/

Preservation of Digital Blog-Posts

A Literature Review, January 2021

The goal of this literature review was to gain an understanding of the current status of research on digital blog preservation. After a series of searches in the database LISTA (Library, Information Science, and Technology Abstracts), one can determine that there have been few or no recent developments in technology or research specifically addressing access to, or preservation of, digital blog posts.

Unsurprisingly, much of the scholarly conversation about blog/microblog preservation took place between 2002 and 2010. 

Thoughts on Blog Preservation

Despite the varying opinions that blogs are either easier or more difficult to preserve than other digital communications, scholars agree that blogs and microblogs have unique qualities that deserve scholarly discussion.  

According to Patsy Baudoin, many blogging websites utilize software that automatically preserves the sequencing of posts (2008). This innate quality of the software supports the archiving principles of “original order” and “provenance”. However intelligent the blogging software appears to be, blogs and other user-generated content are especially vulnerable to link rot (Banks, 2010).

Blogs can become complex to preserve because they may contain various file formats, media, or have several owners (Baudoin, 2008). To add to this sentiment, Grimard (2005) states that the variety of formats adds to the “opaqueness” of digital records (opaqueness referring to the unnatural structure of electronic information that is only computer-readable).

To maintain the integrity of the blog during the preservation process, the digital archivist would have to consider preserving the external links within the original blog post. Furthermore, copyright can be an issue in certain blog preservation circumstances, as there have been several cases brought to the US Supreme Court (Chen, 2010).

Preservation Technology

Open-source technological advancements in blog preservation have been disappointing at best. According to Caroline Young, there have been several programs for blog preservation that have essentially failed soon after conception (2013).

Some examples are PANDORA by the National Library of Australia, and ArchivePress by the University of London’s Computer Centre and British Library Digital Preservation department. Young mentions a developing blog preservation software called BlogForever, which was still in development in 2013. Now, it seems to be available for use and claims to be a new system to harvest, preserve, manage and reuse blog content.

Young (2013), Banks (2010), Rosenthal (2017), and Chen (2010) all highlight the impact made by the introduction of the Internet Archive’s Wayback Machine. The Wayback Machine has improved the landscape of digital preservation of grey literature like blog posts; however, it is not without its challenges. Much like other archiving software, it has difficulty with images and audio files.

Solutions to the Preservation Problem

Though an older article, Grimard (2005) offers some solutions to digital preservation that are still relevant. One important recommendation is to standardize the format of the information. The recommendation is echoed by Young (2013). Both authors emphasize the importance of converting files to the most usable format. Since file formats are simply a set of conventions that software developers can change and alter, they may become obsolete. Young describes the universal XML format as being hierarchical and organized logically. 

LOCKSS is preservation software mentioned in both Leroy (2018) and Rosenthal (2017). It is open-source software designed with libraries in mind. It also claims to preserve animations, data sets, images, audio, and text content.

Conclusion

The scholarly conversation on the preservation and conservation of blog content has slowed in the past decade. This could be because the options currently available are adequate for the needs of blog preservation.

Blogs and microblogs comprise various formats, which can contribute to the challenges of digital preservation. According to research from the early 2010s, images, animations, and audio files, which blogs usually contain, are difficult to preserve with the Wayback Machine. This may have improved in more recent years.

There are also preservation software options, like LOCKSS and BlogForever, that seem to be more targeted toward archiving blog content than the Wayback Machine is.

Reference List

Chen, X. (2010). Blog Archiving Issues: A Look at Blogs on Major Events and Popular Blogs. Internet Reference Services Quarterly, 15(1), 21–33. https://doi.org/10.1080/10875300903529571

Baudoin, P. (2008). On Preserving Blogs for Future Generations. The Serials Librarian, 53(4), 59–61. https://doi.org/10.1300/J123v53n04_04

Farace, D., & Schöpfel, J. (Eds.). (2010). Chapter 14. Blog Posts and Tweets: The Next Frontier for Grey Literature. In Grey Literature in Library and Information Studies (pp. 217–226). K. G. Saur. https://doi.org/10.1515/9783598441493.2.217

Grimard, J. (2005). Managing the Long-term Preservation of Electronic Archives or Preserving the Medium and the Message. Archivaria, 153–167.

Leroy, A. (2018). LOCKSS Distributed Digital Preservation Networks. Université libre de Bruxelles. Belgium. ISSN, 9. https://nusl.techlib.cz/en/conference/conference-proceedings

Rosenthal, D. S. H. (2017). The medium-term prospects for long-term storage systems. Library Hi Tech, 35(1), 11–31. https://doi.org/10.1108/LHT-11-2016-0128

Young, C. (2013). Oh My Blawg! Who Will Save the Legal Blogs? Law Library Journal, 105(4), 493–503.

Cite as: Pelland, K. (2021). Preservation of digital blog-posts. Sustaining the Knowledge Commons. https://sustainingknowledgecommons.org/2021/01/29/preservation-of-digital-blog-posts/

Welcome to C.A.S.A.D.: Centre d’Accès aux Savoirs d’Afrique et de sa Diaspora

Our Tanoh Laurent Kakou has created a blog for his own research project in open access, C.A.S.A.D.: Centre d’Accès aux Savoirs d’Afrique et de sa Diaspora.

Some articles will be familiar to readers of Sustaining the knowledge commons, as the work of the team; others are new research projects by Tanoh. The video Qu’est-ce que la revue Afroscopie?, an interview with Benoit Awazi, is enlightening for anyone who is interested in research in francophone Africa.

Thank you and congratulations to our Tanoh Laurent Kakou, a doctoral candidate in communication (and graduate of ÉSIS), on passing his comprehensive exam this summer! Best wishes to Tanoh and his research.

BioMedCentral 2020

BioMedCentral (BMC) 2019 – 2020

by Anqi Shi & Heather Morrison

Key points

  • Open access commercial publishing pioneer BMC is now wholly owned by a private company with a portfolio including lines of business that derive revenue from journal subscriptions, book sales, and textbook sales and rentals
  • Two former BMC fully OA journals, listed in DOAJ from 2014 – 2018 as having CC-BY licenses, are now hybrid and listed on the Springer website and have disappeared from the BMC website
  • 67% of BMC journals with APCs in 2019 and 2020 increased in price and 11% decreased in price.
  • Journals with price increases had a higher average APC in 2019, i.e. more expensive journals appear to be more likely to increase in price

Abstract

Founded in 2000, BioMedCentral (BMC) was one of the first commercial open access (OA) publishers and a pioneer of the article processing charge (APC) business model. BMC was acquired by Springer in 2008. In 2015, Springer was acquired by the Holtzbrinck Publishing Group and became part of SpringerNature. In other words, BMC began as an OA publisher and is now one of the imprints or business lines of a company whose other lines of business include sales of journal subscriptions and scholarly books, and textbook sales and rentals. Of the 328 journals actively published by BMC in 2020, 91% charge APCs. The average APC was 2,271 USD, an increase of 3% over 2019. This small overall increase in the average APC masks substantial changes at the individual journal level. As first noted by Wheatley (2016), BMC price changes from one year to the next are a mix of increases, decreases, and retention of the same price. In 2020, 67% of the 287 journals for which we have pricing in USD for both 2019 and 2020 increased in price, 11% decreased in price, and 22% did not change price. It appears that the more expensive journals are more likely to increase in price. The average 2019 price of the journals that increased in 2020 was 2,307 USD, 18% higher than the 2019 average of 1,948 USD for journals that decreased in price. 173 journals increased in price by 4% or more, well above the inflation rate; 39 journals increased in price by 10% or more; 13 journals increased in price by 20% or more. Also in 2020, 11 new journals were added, 11 journals ceased publication, 5 titles were transferred to other publishers, 2 journals changed from no publication fee to having an APC, and 3 journals dropped their APCs. Two journals formerly published fully OA by BMC are no longer listed on the BMC website, but are now listed as hybrid on the Springer website. This is a small portion of the total, but is worth noting because it runs in the opposite direction to the transformative shift (from subscriptions to OA) officially embraced by SpringerNature.
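The year-over-year comparison described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' actual analysis: the function name `summarize_price_changes` and the four sample price pairs are hypothetical, and only the two averages (2,307 USD and 1,948 USD) come from the abstract.

```python
from statistics import mean


def summarize_price_changes(prices):
    """Classify (apc_2019, apc_2020) pairs in USD as increased, decreased or unchanged."""
    increased = [p for p in prices if p[1] > p[0]]
    decreased = [p for p in prices if p[1] < p[0]]
    unchanged = [p for p in prices if p[1] == p[0]]
    total = len(prices)
    return {
        "pct_increased": round(100 * len(increased) / total),
        "pct_decreased": round(100 * len(decreased) / total),
        "pct_unchanged": round(100 * len(unchanged) / total),
        # Average 2019 price of each group, to see whether pricier journals rise more often
        "mean_2019_of_increased": mean(p[0] for p in increased) if increased else None,
        "mean_2019_of_decreased": mean(p[0] for p in decreased) if decreased else None,
    }


# Hypothetical sample data, NOT the BMC dataset
sample = [(2000, 2100), (1500, 1400), (1800, 1800), (2500, 2700)]
summary = summarize_price_changes(sample)

# Averages reported in the abstract: how much higher is 2,307 USD than 1,948 USD?
ratio_pct = round(100 * (2307 / 1948 - 1))
print(summary, ratio_pct)
```

Running the same function over the full spreadsheet would, under these assumptions, reproduce the split between increases, decreases and unchanged prices.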

Details and documentation: download the PDF: BMC_2019_2020_as_hm

Data: BMC_2019_2020

Cite as: Shi, A. & Morrison, H. (2020). BioMedCentral 2020. Sustaining the Knowledge Commons. https://sustainingknowledgecommons.org/2020/06/08/biomedcentral-2020/

Frontiers 2020: a third of journals increase prices by 45 times the inflation rate

Updates June 4:

  1. Frontiers’ comment regarding their pricing transparency below is helpful. It is important for those who support gold OA publishing to understand the cost implications of their demands and expectations. Frontiers states: “As Frontiers’ sole source of income, APCs allow us to subsidize new journals and communities with less research funding, to reinvest in our publishing platform, and to offer a fee support program. More than a third of all articles published in 2017/18 received full or partial waivers as a result of this approach, which we fully intend to continue to offer in the years ahead.” An average APC of $2,170 USD could support hosting a whole journal in North America and, in less affluent countries, could be enough to fund a full or partial year of a highly paid researcher’s salary. If granting agencies were to directly subsidize local publishing in both more and less affluent countries, this would probably cost less and do more (by supporting local development) than expecting publishers like Frontiers to subsidize APCs.
  2. It has come to my attention that this post happens to coincide with negotiations on a national agreement between Frontiers and Germany in the context of PlanS / cOAlition S. Details about the agreement can be found:

A third of the journals published by Frontiers in 2019 and 2020 (20 / 61 journals) have increased in price by 18% or more (up to 55%). This is quite a contrast with the 0.4% Swiss inflation rate for 2019 according to Worlddata.info; 18% is 45 times the inflation rate. This is an even more marked contrast with the current and anticipated economic impact of COVID; according to Le News, “A team of economic experts working for the Swiss government forecasts a 6.7% fall in GDP”. (Frontiers’ headquarters is in Switzerland).

This is similar to our 2019 finding that 40% of Frontiers’ journals had increased in price by 18% or more (Pashaei & Morrison, 2019) and our 2018 finding that 40% of Frontiers’ journals had increased in price by 18% – 31% (Morrison, 2018).

The price increases are on top of already high prices. For example, Frontiers in Earth Science increased from 1,900 USD to 2,950 USD, a 55% price increase. Frontiers in Oncology increased from 2,490 to 2,950 USD, an 18% price increase.

This illustrates an inelastic market. Payers of these fees are largely government research funders, either directly or indirectly through university libraries or researchers’ own funds. The payers are experiencing a major downturn and significant challenges such as lab closures, working from home in lockdown conditions, and additional costs to accommodate public health measures, while Frontiers clearly expects ever-increasing revenue and profit.

Following is a list of Frontiers journals with price increases. All pricing is in USD.

| Journal title | 2020 APC | 2019 APC | 2020 – 2019 price change (numeric) | 2020 – 2019 price change (percent) |
|---|---|---|---|---|
| Frontiers in Earth Science | 2,950 | 1,900 | 1,050 | 55% |
| Frontiers in Veterinary Science | 2,950 | 1,900 | 1,050 | 55% |
| Frontiers in Cardiovascular Medicine | 2,490 | 1,900 | 590 | 31% |
| Frontiers in Ecology and Evolution | 2,490 | 1,900 | 590 | 31% |
| Frontiers in Energy Research | 2,490 | 1,900 | 590 | 31% |
| Frontiers in Environmental Science | 2,490 | 1,900 | 590 | 31% |
| Frontiers in Molecular Biosciences | 2,490 | 1,900 | 590 | 31% |
| Frontiers in Nutrition | 2,490 | 1,900 | 590 | 31% |
| Frontiers in Physics | 2,490 | 1,900 | 590 | 31% |
| Frontiers in Surgery | 2,490 | 1,900 | 590 | 31% |
| Frontiers in Artificial Intelligence | 1,150 | 950 | 200 | 21% |
| Frontiers in Bioengineering and Biotechnology | 2,950 | 2,490 | 460 | 18% |
| Frontiers in Cell and Developmental Biology | 2,950 | 2,490 | 460 | 18% |
| Frontiers in Chemistry | 2,950 | 2,490 | 460 | 18% |
| Frontiers in Integrative Neuroscience | 2,950 | 2,490 | 460 | 18% |
| Frontiers in Marine Science | 2,950 | 2,490 | 460 | 18% |
| Frontiers in Materials | 2,950 | 2,490 | 460 | 18% |
| Frontiers in Oncology | 2,950 | 2,490 | 460 | 18% |
| Frontiers in Pediatrics | 2,950 | 2,490 | 460 | 18% |
| Frontiers in Systems Neuroscience | 2,950 | 2,490 | 460 | 18% |
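
The percentages in the list above, and the "45 times the inflation rate" figure in the title, follow from simple arithmetic. The sketch below shows one way to reproduce them. The helper name `percent_change` is hypothetical, only four of the twenty rows are included for brevity, and the 0.4% inflation rate is the Worlddata.info figure quoted in this post.

```python
def percent_change(old, new):
    """Percentage change from the old price to the new price."""
    return 100 * (new - old) / old


# (journal title, 2020 APC, 2019 APC) in USD, copied from four rows of the list above
rows = [
    ("Frontiers in Earth Science", 2950, 1900),
    ("Frontiers in Cardiovascular Medicine", 2490, 1900),
    ("Frontiers in Artificial Intelligence", 1150, 950),
    ("Frontiers in Oncology", 2950, 2490),
]

changes = {
    title: round(percent_change(apc_2019, apc_2020))
    for title, apc_2020, apc_2019 in rows
}

# Smallest increase in the list (18%) compared with 2019 Swiss inflation (0.4%)
swiss_inflation_2019 = 0.4
multiple = 18 / swiss_inflation_2019

for title, pct in changes.items():
    print(f"{title}: {pct}%")
print(f"18% is about {multiple:.0f} times the inflation rate")
```

The percentages are rounded to the nearest whole number, which matches the rounding used in the list.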

The full spreadsheet can be found here:

Frontiers_OA_main_2020

References

Morrison, H. (2018). Frontiers: 40% journals have APC increases of 18 – 31% from 2017 to 2018. Sustaining the Knowledge Commons / Soutenir Les Savoirs Communs. Retrieved from https://sustainingknowledgecommons.org/2018/04/12/frontiers-40-journals-have-apc-increases-of-18-31-from-2017-to-2018/

Pashaei, H., & Morrison, H. (2019). Frontiers in 2019: 3% increase in average APC. Sustaining the Knowledge Commons / Soutenir Les Savoirs Communs. Retrieved from https://sustainingknowledgecommons.org/2019/04/30/frontiers-in-2019-3-increase-in-average-apc/

Cite as: Morrison, H. (2020). Frontiers 2020: a third of journals increase prices by 45 times the inflation rate. Sustaining the Knowledge Commons / Soutenir Les Savoirs Communs. https://sustainingknowledgecommons.org/2020/06/03/frontiers-2020-a-third-of-journals-increase-prices-by-45-times-the-inflation-rate/

Coronavirus: an idea to identify articles that aren’t OA yet, but could be

As posted to the Global Open Access List, scholcomm and the radical open access list, following is a suggestion for how to identify articles on coronavirus that are not yet open access. The majority of these articles will be in journals that allow author self-archiving, and some may be published by authors covered by open access policies. Communication with authors and/or journals may be helpful to improve the percentage of open access.
A PubMed search for “coronavirus” limited to the past 10 years, then limited again to free full text, shows that 55% of the results are free full text. With no date limit, it’s 46%.
This search will get at research on COVID and the next most relevant research, all the other coronaviruses (MERS, SARS, common cold), and will be helpful for researchers and medical practitioners anywhere.
China’s early release of the COVID genetic code, and even traditional publishers scrambling to make COVID resources free, demonstrate that people get at least some of the points of open access and open research.
It would be interesting to compare publisher responses today with earlier epidemics. If I recall correctly, there is a significant change from responding to pressure to proactively making resources free without OA pressure.
This is progress. It’s not 100% OA, but a lot more researchers and practitioners have free access to a lot more of our knowledge than was the case with the 2003 SARS epidemic.
Further pressure might be helpful. Identification and analysis of the 45% of PubMed coronavirus results that are not free full text would identify suitable targets for gentle pressure. Some such articles may have been written by authors covered by an OA policy. Such a results list would likely yield journal lists and individual articles, many of which could be deposited in repositories thanks to the efforts of green OA advocates.
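For anyone who wants to script this search, here is a minimal sketch that builds the query URLs for the NCBI E-utilities ESearch service. It assumes that the `free full text[sb]` subset filter and the `reldate` / `datetype` parameters behave as described in the NCBI documentation. The helper names `esearch_url` and `free_share` are hypothetical, and the code only builds URLs; it does not send any requests.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Base URL of the NCBI E-utilities ESearch service
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term, years=None):
    """Build an ESearch URL for PubMed, optionally limited to the last `years` years."""
    params = {"db": "pubmed", "term": term, "retmode": "json"}
    if years is not None:
        # reldate is expressed in days, counted back from today by publication date
        params["datetype"] = "pdat"
        params["reldate"] = years * 365
    return ESEARCH + "?" + urlencode(params)


def free_share(free_count, total_count):
    """Percentage of results that are free full text."""
    return 100 * free_count / total_count


# All coronavirus results, the free full-text subset, and the remainder to follow up on
all_url = esearch_url("coronavirus", years=10)
free_url = esearch_url("coronavirus AND free full text[sb]", years=10)
not_free_url = esearch_url("coronavirus NOT free full text[sb]", years=10)

print(not_free_url)
```

Fetching `all_url` and `free_url` and passing the two result counts to `free_share` should give a percentage comparable to the 55% above, although the counts will have changed since this post was written. The results of `not_free_url` are the articles that could be candidates for outreach.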
Librarians and others working from home can send e-mails to authors and it should be possible to add items to repositories remotely. Publishers who are green not gold should ideally work with PMC and can also send e-mails to authors reminding them of the green policy.
Although research on coronavirus is urgent, university researchers who are also teachers are likely swamped due to a sudden shift to online teaching this semester. For this group, it might make sense to time communication after the semester ends.
Just some ideas…
Cite as:  Morrison, H. (2020). Coronavirus: an idea to identify articles that aren’t OA yet, but could be. Sustaining the Knowledge Commons. https://sustainingknowledgecommons.org/2020/03/31/coronavirus-an-idea-to-identify-articles-that-arent-oa-yet-but-could-be/

COVID-19 open access and open research: good progress and what is missing

Update April 1: added NISO’s meta-collection of COVID-19 responses by the information community. In future, updates will be moved to the bottom of the post in order to focus on resources.

Update March 31: to avoid confusion, I’ve added a list of the key resources for policy-makers, the general public, researchers and practitioners at the top of this post. The original post is now named “details for open access and scholarly communication specialists” and is intended to help specialists contribute to the fight against COVID-19 and to use COVID-19 as an ad hoc case study to understand why open access to scholarship matters, and to assess and further progress on OA. I’ve also added Emerald, an example of best practice in providing free access to a broader range of information such as social sciences and supply chain management.

Key resources

Details for open access and scholarly communication specialists

Major publishers are making research and data directly related to COVID-19 freely available. This is good news, and may reflect progress towards open access over the past two decades, because the arguments for free sharing of information in the context of pandemic are so compelling, as I touched on in this post.

A few examples, current best practices and gaps, will follow, but first, a few notes to explain why we need to move beyond open sharing of directly related resources to include all resources.

  • Scientists working on COVID: while the greatest need is research and data directly on COVID per se, some pieces of the puzzle of solving any scientific problem can come from any branch of scientific inquiry. For example, basic research on how the respiratory system works, viruses and their transmission, may provide clues that will help COVID scientists. Some of this knowledge may be locked up in the print collections of libraries that are closed to limit spread of the virus.
  • Practitioners dealing with the more severe cases are often dealing with patients who have other health issues. Clinical research on the other issues and relevant co-morbidity studies (e.g. when people with the other illness have other types of pneumonia) might save some lives.
  • Educational institutions and governments that want to speed up training of health professionals to cope with the pandemic need the full range of knowledge relating to the health professions, in addition to COVID-specific resources. This includes all of the basic sciences (biology, chemistry, physics), much of the social sciences, as well as arts and humanities for a well-rounded education (e.g. foster creativity through arts, cultural understanding for clinical care through humanities).
  • The pandemic per se raises a great many major secondary challenges, particularly the social challenges of helping entire populations cope with lock-down and the short and medium-term economic challenges. To address these challenges, we need all of our knowledge about communications, information, psychology, culture and history, along with classical and political economics. Part of the immediate solution to help people cope with lockdown is culture and arts. Like the COVID resources, many arts organizations and individual artists are making their works freely available. This is welcome and useful, but raises questions about economic support for artists and the arts so that this can continue; these are economic questions as well as challenges for the arts. We need open access to all of our knowledge to move forward with these secondary challenges. Right now is an excellent time to do this, because some of these secondary challenges are critical to dealing with the pandemic and limiting short and medium-term damage, and because so many researchers everywhere are working from home and would be able to benefit from this access.
  • Libraries are an essential service and have been providing online services for many resources. In the short term, one way to contribute even further: It should be possible to have people work at scanning stations to digitize material not yet online while maintaining social distancing. Correction: safety is a priority. Staff should not be asked to take this on if travel to work presents a risk of infection, for example. This might have to wait until the pandemic is over.

Examples of major publisher COVID-19 related initiatives for comparative purposes follow. Note that I use parent company names first as part of an ongoing effort to help people understand the nature of these organizations, whether publicly traded corporations or privately held businesses, often with multiple divisions of which scholarly publishing forms just one part.

NISO: COVID-19: Response from the Information Community. NISO is developing and growing a meta-collection of responses that include all of the following, and much more. This site is recommended for those looking for resources. The following analysis is limited to a few select examples of good practices.

RELX (Elsevier +): COVID responses across all company divisions, featured prominently on home page; Novel Coronavirus Center “with the latest medical and scientific information on COVID-19. The center has been set up since the start of the outbreak and is in English and Mandarin. Elsevier has provided full access to this content for PubMed Central”; COVID-19 clinical toolkit; free institutional access to ClinicalKey student platform until the end of June; rapid publication (preprints and data) of COVID-19 related works; data visualization of the impact of the virus on the aviation industry; LexisNexis free, comprehensive COVID-19 related legal news coverage; turned exhibition space in Austria into a functional hospital.

SpringerNature: “As a leading research publisher, Springer Nature is committed to supporting the global response to emerging outbreaks by enabling fast and direct access to the latest available research, evidence, and data.”

informa (Taylor & Francis +): no mention of COVID on parent company home page; Taylor & Francis COVID-19 resource center: microsite that provides “links and references to all relevant COVID-19 research articles, book chapters and information that can be freely accessed on Taylor & Francis Online and Taylor & Francis ebooks in support of the global efforts in diagnosis, treatment, prevention and further research into COVID-19”; prioritizing rapid publication of COVID-19 research.

Wiley offers free access to resources until the end of the Spring 2020 term to help with online education; “making all current and future research content and data on the COVID-19 Resource Site available to PubMed Central”.

Emerald: free access not only to resources directly related to COVID-19, but also other coronaviruses such as SARS, also “explores the wider impact on society and includes research on healthcare, education, homeworking, SCM and tourism.”

Discussion

Some best practices from different companies, beyond making directly relevant resources free, that others could follow:

  • Meta-collection of a discipline-specific list of resources: NISO’s COVID-19 response from the information community
  • Comprehensive, company-wide COVID-19 response: RELX (Elsevier +)
  • Help for educational institutions facing the challenge of suddenly moving online: Wiley
  • Rapid publication: informa (Taylor & Francis +), RELX (Elsevier +)
  • PubMedCentral deposit, facilitating search by researchers and best long-term solution: Wiley, RELX (Elsevier +), Emerald (also available on WHO website)
  • Including wider impact on society: Emerald

Gaps

  • No hospital for countries most in need (another hospital in Austria is welcome, but there are many other countries with greater needs).
  • Resources beyond those most directly and obviously related to COVID-19.
  • Language: RELX / Elsevier is the only publisher that mentions a language other than English, and only Mandarin is mentioned.

Cite as: Morrison, H. (2020). COVID-19, open access and open research: good progress and what is missing. Sustaining the knowledge commons. https://sustainingknowledgecommons.org/2020/03/30/covid-19-open-access-and-open-research-good-progress-and-what-is-missing/

See also:

Additional publisher resources:

Association of American Publishers (AAP): What publishers are doing to help during the coronavirus (thanks to the Open Access Tracking Project).

Related SKC / IJPE posts:

Morrison, H. (2020). COVID-19, open access and open research: good progress and what is missing. Sustaining the knowledge commons. https://sustainingknowledgecommons.org/2020/03/30/covid-19-open-access-and-open-research-good-progress-and-what-is-missing/

Morrison, H. (2007). Needed: open access, open science. The Imaginary Journal of Poetic Economics https://poeticeconomics.blogspot.com/2007/07/needed-open-access-open-science.html