Objectives

The design has these objectives:

  • use a well-understood format (we chose “comma-separated values”, a.k.a. CSV)

  • keep files editable with a text editor and commonly available software (Excel, Google Sheets, Calc, etc.)

  • avoid duplication of data as far as possible to minimise possibilities for inconsistency

  • include the minimum amount of data needed by Cantabular

  • make the contents of files self-describing (you can tell from the header line what you are looking at)

  • use similar formats for variables and mappings

  • avoid having to load very large files to edit most data

  • make version control / diffs easier and more comprehensible by splitting into multiple files

  • use a full stop as the separator, because doubling it up to escape a literal full stop stays readable, unlike a doubled hyphen or underscore

  • be able to represent a geographic hierarchy via large mappings
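
The full-stop separator objective can be illustrated with a short sketch. This is not the actual Cantabular implementation: the function names split_name and join_name are hypothetical, and the example assumes only what the objective states, namely that a single full stop separates parts and a doubled full stop stands for a literal one. The source does not say where the separator is used, so the sample strings are invented for illustration.

```python
def split_name(name: str) -> list[str]:
    """Split on single full stops; treat '..' as one literal full stop.

    Hypothetical helper: scans left to right, so a doubled full stop is
    always consumed as an escape before a separator is considered.
    """
    parts: list[str] = []
    current: list[str] = []
    i = 0
    while i < len(name):
        if name[i] == ".":
            if i + 1 < len(name) and name[i + 1] == ".":
                # Escaped full stop: keep one literal '.' in the current part.
                current.append(".")
                i += 2
                continue
            # Single full stop: end the current part.
            parts.append("".join(current))
            current = []
        else:
            current.append(name[i])
        i += 1
    parts.append("".join(current))
    return parts


def join_name(parts: list[str]) -> str:
    """Inverse sketch: escape literal full stops, then join with '.'."""
    return ".".join(part.replace(".", "..") for part in parts)


example = join_name(["region", "st.albans"])
print(example)              # region.st..albans
print(split_name(example))  # ['region', 'st.albans']
```

Under this sketch an escaped value such as "st..albans" remains easy to read in a text editor or spreadsheet. Edge cases such as empty parts, or parts that begin with a full stop, are ambiguous here, and the source does not specify how they should be handled.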