How to download and work with DOAJ journal metadata without messing up the characters in the journal titles and publisher names.

Background: the file containing the sample for May 2014 has incorrect characters which are very difficult to fix because we did not contemplate how to download and open the csv files in the right way.

These are the most important steps to download and work with DOAJ journal metadata without messing up the characters in the journal titles and publisher names:

  • Go to DOAJ and download metadata
  • Save the CSV file on your computer WITHOUT opening it
  • If you’re using a spreadsheet package, first open the application (e.g. Excel or LibreOffice Calc) and then IMPORT the CSV file into the application.

The following is one example of how you can import and work with the file without messing up the characters using Excel 2013.

  • Open Excel and click on the “Data” menu option.
  • Click on the “From Text” icon.
  • Browse the location of the CSV file, and then click on the “Import” button.
  • The Text Import Wizard will prompt, showing Step 1 of 3.
  • Choose “Delimited” on data type.
  • Select the character set as “65001: Unicode (UTF-8)”
  • Click on the “Next” button to display Step 2 of 3.
  • Select the “Comma” character.
  • Click on the Next button to display Step 3 of 3.
  • Choose the appropriate data format for each column
  • Click on the “Finish” button to complete importing your data into MS Excel.
  • Save the file in MS Excel format WITH ANOTHER NAME, ensuring that you preserve a copy of the original CSV file for further verifications.

If you use open software like LibreOffice Calc, you can import and work with the file and save it again without messing up the characters using Excel 2013.

Import file into LibreOffice Calc.

  • Open the application (LibreOffice Calc).
  • Click on “File – Open”.
  • Browse the location of the CSV file and click on the “Open” button.
  • An import file dialog box will prompt
  • Select the character set “Unicode (UTF-8)”
  • On the Separator Options, select “Comma”
  • Choose the appropriate data format for each column
  • Click on the “Finish” button to complete importing your data into LibreOffice Calc.
  • Save the file in LibreOffice Calc format, WITH ANOTHER NAME, ensuring that you preserve a copy of the original CSV file

Now you can work and modify the file, adding more information if needed using any spreadsheet software. However, if you work with Excel and need to save the file again in CSV format, you need to follow another process or you end up messing up the characters in the journal titles and publisher names, or special characters added in another columns.

The easiest way to do this is to follow these steps:

  • Open the Excel file using LibreOffice Calc
  • Choose File – Save as – Text CSV
  • An export file dialog box will prompt
  • Select the character set “Unicode (UTF-8)” and click Ok

If you know other methods or alternatives, please share them with us.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s