Shoebox 5

2014 Note: This review is maintained as a relic. Currently the tool for interlinearizing is Fieldworks, also available from the SIL website.

Shoebox is widely-used software for linguistic data management. It is described in a number of places on the web, so this page points to the most useful sites. The previous review of Shoebox dealt with an earlier and less powerful version of the software.

The most recent addition to the Shoebox suite is Toolbox, currently available only for Windows. It extends the Shoebox model to include Unicode, and allows users to export in various formats, including a much better form of XML than is exported by Shoebox.

A step-by-step guide to starting with Shoebox (by Jason Lee and Pascale Jacq) can be found here as an rtf or pdf document.

An "Introduction to Shoebox and Toolbox with notes on Econv, Transcriber and Elan" by Andrew Margetts (June 2005) is available as a pdf document. It provides an outline of Shoebox and how it fits with transcription software. Accompanying the pdf file is a zip file of 1 Mb with a set of sample files, including the Shoebox settings files discussed in the pdf document.

An application of Shoebox to an African text corpus is discussed at

A discussion by Sebastian Drude on multilayer transcription using Shoebox, at

A set of dictionaries produced by SIL Suriname and made web-accessible from Shoebox:

An example of a dataset which makes the most of Shoebox by linking texts, lexical lists, speaker id and details and grammatical information is Peter Austin's Malyangapa Knowledgebase.

The broader issue of how to represent interlinear text as a basis for developing the next interlinearising tool is discussed here:, and by Cathy Bow, Baden Hughes and Steven Bird at

An overview of lexicographic websites is found at the Linguistlist.

Audio is not supported directly in Shoebox, but text files with audio timecodes can be the input to Shoebox interlinearising as shown in the workflow for Audiamus.

Audio is supported directly in Toolbox: a media field specifying medianame, start time and end time can be played by clicking in the field and pressing Alt-F4.


Pros:

  • large user-base means that data will have paths for migration from Shoebox format to future formats.
  • specifically designed for linguistic functions.
  • good documentation.
  • data stored in ASCII text files, accessible by other software.
  • all we currently have.
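Because the data is stored in plain text files, it can be read by other software with very little code. The following is a minimal sketch in Python, assuming the usual backslash-marker layout in which each field starts with a marker such as \lx and each record begins with the record marker. The function name parse_sfm, the sample entries and the choice of \lx as record marker are illustrative assumptions, not part of Shoebox itself.

```python
from typing import List, Optional, Tuple

# A record is an ordered list of (marker, value) pairs.
Record = List[Tuple[str, str]]


def parse_sfm(text: str, record_marker: str = "lx") -> List[Record]:
    """Split backslash-marked text into records (hypothetical helper).

    Lines before the first record marker (e.g. a file header) are skipped.
    Lines that do not start with a backslash are treated as continuations
    of the previous field.
    """
    records: List[Record] = []
    current: Optional[Record] = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("\\"):
            parts = line[1:].split(None, 1)
            marker = parts[0] if parts else ""
            value = parts[1].strip() if len(parts) > 1 else ""
            if marker == record_marker:
                # A new record starts; store the one collected so far.
                if current is not None:
                    records.append(current)
                current = []
            if current is None:
                continue  # still in the header, before any record
            current.append((marker, value))
        elif current:
            # Continuation line: append to the previous field's value.
            prev_marker, prev_value = current[-1]
            current[-1] = (prev_marker, (prev_value + " " + line).strip())
    if current is not None:
        records.append(current)
    return records


# Invented sample data, for illustration only.
SAMPLE = r"""
\_sh v3.0  400  MDF

\lx kata
\ps n
\ge word

\lx lari
\ps v
\ge run
\ge flee
"""

if __name__ == "__main__":
    for record in parse_sfm(SAMPLE):
        print(record)
```

Keeping each record as an ordered list of pairs, rather than a dictionary, preserves repeated markers (such as two \ge fields) and the original field order.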


Cons:

  • unsupported, soon to be if not already orphaned.
  • no true relational function. Relationships are look-ups not links.
  • difficult to learn; control of functions is buried in nooks and crannies.

    Current version: Shoebox 5 (5.01 and Toolbox for Windows),
    Platform: Macintosh, Windows
    Application size & Suggested minimum RAM: See:
    Documentation: Detailed documentation is available on the CD available from:, and some documentation is also available as pdf files at

    A French version of the documentation also exists, "la boîte à chaussures du linguiste", and used to be located here:

    Page last modified: 10th August 2005

