Friday, July 11, 2008

08. Downloading from International Itracks Accounts

As noted in a previous post, there are some obstacles to uploading files to the international Itracks accounts when these files are created by Excel. These obstacles can be overcome, by and large, when the researcher ensures that files to be uploaded are first encoded as UTF-8 or UTF-16.

There are challenges with downloading files from international accounts as well, however. Files don't always open in their target applications, and some don't always display accented characters properly. Indeed, some can't even be uploaded back to Itracks.

There are three types of information that can be downloaded from the Itracks software:

  • User files
  • Transcripts, and
  • Discussion guides.

For user files and discussion guides, the options are CSV files or TXT files. For transcripts, the options are CSV, TXT, RTF, and HTM.

I downloaded a sample file for each option and determined that:

  • The RTF and HTM formats of the transcript opened correctly in their respective applications (Word and Firefox)
  • User files, TXT. Encoded as UTF-16 by Itracks. Correctly opened by both Notepad and Excel, with accents displaying properly. Can be re-uploaded
  • User files, CSV. Encoded as ISO 8859. Opened by Excel, but accented characters do not display and are replaced by a question mark. Cannot be uploaded.
  • Discussion guide, TXT. Encoded as UTF-16. Correctly opened by both Notepad and Excel, with accents displaying properly. Can be re-uploaded without further convertsion
  • Discussion guide, CSV. Encoded as ISO 8859. Opened by Excel, but accented characters do not display and are replaced by a question mark. Cannot be uploaded.
  • Transcripts, TXT. Encoded as UTF-16. Correctly opened by both Notepad and Excel, with accents displaying properly.
  • Transcripts, CSV. Encoded as UTF-8. Accented characters rendered as a ? by Excel. Conversion to UTF-16 makes it intelligible in Excel with accents displaying properly.
Some conclusions:


  1. CSV files are not useful, as currently implemented. They cannot be opened with the accents displaying correctly in their target applications (Excel). Only the transcript CSV file can be made useful by converting it to UTF-16


  2. All TXT files are encoded as UTF-16 and all were opened by their target application (Notepad) with the accents displaying properly. Further, even Excel can open these text files directly, with accents displaying correctly.


  3. Researchers should probably be using TXT files in preference to CSV files, at least for downloading.

Wednesday, July 09, 2008

07. Uploading files to Itracks - character encoding

Accented characters can be an issue when using an international Itracks account.

Most of these issues can be resolved by ensuring that all files uploaded to the Itracks software - html whiteboards, user files (CSV or TXT), and discussion guides (CSV or TXT) are encoded using UTF-8 (see Wikipedia for a discussion of UTF-8. Another great discussion is here).


Indeed, in tests I conducted yesterday and today, the international (French) Itracks software only accepted CSV or TXT files created by Excel if they were first encoded as UTF-8 or UTF-16.


The one exception: curiously enough, you can upload a CSV discussion guide created by Excel directly into the Itracks software, even though it is encoded as Windows 1252 (only if it is created entirely in Excel, apparently; if any information is pasted in from another application, the file is saved with ISO-8859!) A user file, Windows 1252 encoding, cannot be uploaded.


Moral: encode to UTF-8 all uploads.

How?

One way to convert a file to UTF-8 encoding is to open it in Notepad, and then save it using the UTF-8 encoding option.

There is another way to convert text files to UTF-8 encoding. And that is by using a nifty (and free) utility called charsc.exe that you can download from the site of the author, Benjamin Kalytta.

The utility automatically detects the current encoding, and provides you with a wide selection of encoding alternatives. It will also work in batch mode, allowing you to convert a number of files at once.

06. Uploading FIles to Itracks - Update

It appears that uploaded TXT files no longer require quotation marks (except in cases where control characters are part of the field being uploaded).

This is great news and means that neither work-around suggested in a recent post will be necessary in the future.

Tuesday, July 08, 2008

05. Uploading Files to Itracks

The Itracks software allows the researcher to upload an entire discussion guide, or all the group participants (and observers) in a project, a significant saving of time and effort over an alternative approach of cutting-and-pasting each field for each record.

For both tasks, the options are to use a CSV (Comma-separated values) or a tab-delimited TXT file. To the best of my knowledge, Microsoft Word does not export to either of these formats, leaving us to work with Excel, which does.

For those who choose to upload a TXT file, there are some hurdles to overcome. Itracks requires quotation marks around all the fields. Excel, however, does not generate TXT files with both a field delimiter and quotation marks.

What to do?

To date, I had been using a rather tedious search-and-replace approach (open the TXT file with Notepad, search-and-replace the tab characters with "TAB", insert a quotation mark at the beginning of the file, and at the end of each line). Tedious & error-prone.

I've discovered a better way. Run a macro (in Excel) that will write selected records to a text file with both tab and quote delimiters.

Microsoft has written such a macro and provides it, as is. You can find it on their support site under the heading Procedure to export a text file with both comma and quote delimiters in Excel.

Wait, you say. we don't want comma and quote, we want tab and quote. No worries. Follow the directions on the Microsoft site to load up the macro in a new worksheet. Once you've pasted in the macro, edit it to replace the line

Print #FileNum, ",";

with

Print #FileNum, chr(9);
CHR(9) is the code for the tab character.

Now, whenever you need to export from Excel to Itracks in a TXT format, select the cells and call up the edited macro to generate the TXT file in the correct format.

Alternatively, you can download this Excel spreadsheet from my site. It contains the macro already loaded. The usual warnings apply.

Don't forget to change the encoding of the file to UTF-8 if you are working in a international-language Itracks account. More on that in another post.