README Guide and Template

Support for Research Data Management

Alexandra Cooper
Data Services Coordinator

Meghan Goodchild
Research Data Management Librarian

RDM.library@queensu.ca

README Guide

A readme file is a simple type of documentation for a dataset or data file that helps ensure the data can be correctly interpreted, by you or by others, at a later date. Best practices for readme files include:

  • Use a plain text file rather than a proprietary format such as Word
  • Name your readme consistently (e.g., README.txt, readme.txt)
  • Include readme files within the file structure of your project (e.g., at different folder levels or alongside related files) and/or alongside a single data file
  • Use a standardized structure consistently across your project
  • Follow standards and conventions from your discipline whenever possible

Recommended Content

Recommended minimum content for data re-use is in bold.

General information

  1. Provide a title for the dataset
  2. Name/institution/address/email information for
    • Principal investigator (or person responsible for collecting the data)
    • Associate or co-investigators
    • Contact person for questions
  3. Date of data collection (can be a single date, or a range)
    • Format – YYYY-MM-DD
  4. Information about geographic location of data collection
    • Latitude and longitude, or city/region, state, and country, as appropriate
  5. Keywords used to describe the data topic
  6. Language information
  7. Information about funding sources that supported the collection of the data
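
The YYYY-MM-DD convention in item 3 can be produced programmatically rather than typed by hand. The following is a minimal Python sketch, not part of the original guide: the function name `format_collection_date`, the " to " separator for ranges, and the sample dates are all illustrative assumptions.

```python
from datetime import date


def format_collection_date(start, end=None):
    """Return one date, or a start/end range, written as YYYY-MM-DD."""
    if end is None or end == start:
        return start.isoformat()
    if end < start:
        raise ValueError("end date is earlier than start date")
    # The " to " separator is an illustrative choice, not a requirement.
    return f"{start.isoformat()} to {end.isoformat()}"


# Made-up example dates, for illustration only.
print(format_collection_date(date(2021, 5, 3)))
print(format_collection_date(date(2021, 5, 3), date(2021, 6, 18)))
```

Writing dates this way keeps them unambiguous across regions and lets them sort correctly as plain text.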

Sharing and access information

  1. Licenses or restrictions placed on the data
  2. Recommended citation for the data

Data and file overview

  1. For each filename, a short description of what data it contains
    • Include format of the file if not obvious from the file name
  2. Folder structure and/or relationship between files, if important
  3. Date that the file was created
    • Format – YYYY-MM-DD
  4. Information about related data that were collected but are not included in the described dataset
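
A first draft of the file listing can be generated from the project folder and then annotated by hand. The sketch below is one possible approach in Python, under stated assumptions: `file_overview` is a hypothetical helper name, and it reports each file's last-modified date as a stand-in for the creation date, because creation time is not reliably available on every operating system.

```python
from datetime import datetime
from pathlib import Path


def file_overview(root):
    """Return one line per file: relative path, then last-modified date."""
    root = Path(root)
    lines = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Last-modified time is used here as an approximation only.
            modified = datetime.fromtimestamp(path.stat().st_mtime)
            relative = path.relative_to(root).as_posix()
            lines.append(f"{relative}\t{modified:%Y-%m-%d}")
    return lines


if __name__ == "__main__":
    # Lists the current folder; replace "." with your project folder.
    for line in file_overview("."):
        print(line)
```

The output is only a starting point: the short description of what each file contains still has to be written by someone who knows the data.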

Methodological information

  1. Description of methods for data collection or generation
    • Include links or references to publications or other documentation containing the experimental design or protocols used
  2. Description of methods used for data processing
    • Describe how the data were generated from the raw or collected data
  3. Any software or instrument-specific information needed to understand or interpret the data, including software and hardware version numbers
    • Include the full name and version of the software, and any packages or libraries needed to run scripts
  4. Standards and calibration information, if appropriate
  5. Describe any quality-assurance procedures performed on the data
  6. Definitions of codes or symbols used
  7. People involved with sample collection, processing, analysis and/or submission

Data-specific information
Repeat this section as needed for each dataset (or file, as appropriate). Recurring items may also be explained in a common initial section. 

  1. Number of variables and number of cases or rows
  2. Variable list
    • Provide details on each variable in the file. Include the variable name as it appears in the data file; a description of the variable, including its full name, an explanation of what it represents, and units of measure, if applicable; if value labels (codes) have been used, a list of the value labels with a description of what each label (code) represents; and any additional notes needed to understand the nature or content of the variable.
  3. Definitions for codes or symbols used to record missing data
  4. Specialized formats or other abbreviations used
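
For tabular data, the counts in item 1 and the variable names in item 2 can be read directly from the file, which avoids transcription errors. The following Python sketch assumes a CSV file with a header row. The helper name `summarize_csv`, the sample columns, and the missing-data codes (an empty cell and "NA") are assumptions made for illustration; substitute the codes your dataset actually uses.

```python
import csv
import io


def summarize_csv(handle, missing_codes=("", "NA")):
    """Count variables and rows, and count missing values per variable."""
    reader = csv.DictReader(handle)
    variables = list(reader.fieldnames or [])
    missing = {name: 0 for name in variables}
    n_rows = 0
    for row in reader:
        n_rows += 1
        for name in variables:
            value = row[name]
            # Short rows yield None for absent fields; count those as missing.
            if value is None or value in missing_codes:
                missing[name] += 1
    return {
        "n_variables": len(variables),
        "n_rows": n_rows,
        "variables": variables,
        "missing": missing,
    }


# Made-up sample data standing in for a real file opened with open(...).
sample = io.StringIO("site_id,temp_c,ph\nA1,12.5,7.1\nA2,NA,6.9\nA3,11.8,\n")
summary = summarize_csv(sample)
print(summary["n_variables"], "variables,", summary["n_rows"], "rows")
for name in summary["variables"]:
    print(f"{name}: {summary['missing'][name]} missing")
```

The descriptions, units, and value labels for each variable still need to be added manually; the script only supplies the names and counts.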
