Check Inter-rater Reliability to Improve Data Accuracy

Reliability coding verifies that the desired codes are observable and that the coders are accurately interpreting events. With reliability coding, two coders separately code the same data source. You can create a reliability column, which the second coder can use to record their observations. After both coders have coded the same events, they can compare their codes and determine their inter-rater reliability.

Note

While it may be tempting to have the reliability coder code the data source in a new spreadsheet, you should endeavor to keep all codes for a given data source in a single spreadsheet. This facilitates analysis and ensures that like data are kept together. To prevent the reliability coder from seeing the original coding pass, you can hide the original coded column.

Make a Reliability Column

In general, when creating reliability columns you create blank cells that correspond to the cells created during the first coding pass. This ensures that the two coders will have observed the same events in the data stream and allows easy comparison of the two coding passes.

The Datavyu Ruby API provides the makeReliability() method for creating reliability variables. makeReliability() has four parameters:

Parameter | Type | Description
relname | String or Ruby column from getColumn() | The name of the new reliability column you will create. The convention is to name it rel_columnName, but you can use any name.
column_to_copy | String | The name of the column that you want to create a reliability column from (i.e., the existing coded column).
multiple_to_keep | Integer (optional) | Number of cells to skip: a value of 2 includes every other cell in the new variable, 1 includes every cell, and 0 creates a blank column with no cells.
*codes_to_keep | Comma-separated strings (optional) | Codes you want to copy from the original column to the new column. These are codes that the reliability coder will not have to code.
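The effect of multiple_to_keep can be pictured with a few lines of plain Ruby. This is only a sketch: cells_to_keep is a hypothetical helper, not part of the Datavyu API, the onsets are made up, and starting the selection at the first cell is an assumption (makeReliability() may begin at a different cell).

```ruby
# Hypothetical helper, not part of the Datavyu API: pick the cells that a
# reliability column would keep for a given multiple_to_keep value.
def cells_to_keep(cells, multiple_to_keep)
  # 0 (or a negative value) means a blank column with no cells.
  return [] if multiple_to_keep <= 0

  # Keep one cell out of every multiple_to_keep cells, starting with the
  # first one (assumption; the real method may start at a different cell).
  cells.each_slice(multiple_to_keep).map(&:first)
end

trial_onsets = [1000, 5000, 9000, 13000, 17000] # made-up onsets in ms

puts "multiple_to_keep = 1 keeps #{cells_to_keep(trial_onsets, 1).inspect}"
puts "multiple_to_keep = 2 keeps #{cells_to_keep(trial_onsets, 2).inspect}"
puts "multiple_to_keep = 0 keeps #{cells_to_keep(trial_onsets, 0).inspect}"
```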

Tip

Listing onset in the codes_to_keep parameter copies the onsets of the original column to the new reliability column. This makes it easier for the second coder to navigate to the correct locations in the data source and to code the same events as the original coder.

When making a reliability column, you should also think about how you are going to compare the columns. checkReliability(), which you use to check reliability, requires that each pair of cells has a unique identifier that links them together. For example, a trial number coded into each cell would match corresponding cells, even if only a subset of cells were included in the reliability variable.
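To see why a shared identifier matters, the matching idea can be sketched in plain Ruby. The helper pair_by_code and the cell hashes below are hypothetical and only illustrate the concept; they do not show how checkReliability() is implemented.

```ruby
# Hypothetical helper, not part of the Datavyu API: pair each reliability
# cell with the primary cell that carries the same identifying code.
def pair_by_code(primary_cells, rel_cells, match_code)
  # Index the primary cells by their identifier for direct lookup.
  primary_by_id = primary_cells.each_with_object({}) do |cell, index|
    index[cell[match_code]] = cell
  end

  # Each pair is [primary_cell, reliability_cell]; the primary cell is nil
  # when no primary cell shares the identifier.
  rel_cells.map { |rel| [primary_by_id[rel[match_code]], rel] }
end

# Made-up cells: the reliability column holds only every other trial.
primary = [
  { trialnum: 1, onset: 1000 },
  { trialnum: 2, onset: 5000 },
  { trialnum: 3, onset: 9000 },
  { trialnum: 4, onset: 13000 }
]
reliability = [
  { trialnum: 1, onset: 1002 },
  { trialnum: 3, onset: 9004 }
]

pair_by_code(primary, reliability, :trialnum).each do |main, rel|
  puts "trial #{rel[:trialnum]}: primary onset #{main[:onset]}, reliability onset #{rel[:onset]}"
end
```

Because the pairing relies on the trial number rather than on cell position, it still works when the reliability column contains only a subset of the original cells.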

The following example uses the sample data to create a new reliability column called “trial_rel” from its “trial” column, skipping every other cell, and copying over the onset and trialnum codes so that the reliability coder doesn’t have to recode onset and trial numbers.

require 'Datavyu_API.rb'
begin
   makeReliability("trial_rel", "trial", 2, "onset", "trialnum")
end

Note

You do not have to write the variable back to the spreadsheet. makeReliability() automatically writes its results to the spreadsheet.

Check Reliability

Once the second coder has recorded their observations in the reliability column, you can use checkReliability() to compare the cells of the primary and reliability columns. checkReliability() returns the number of errors and the percent agreement.

checkReliability() has four required parameters, and one optional one:

Parameter | Type | Description
main_col | String or Ruby variable from getColumn() | The primary column that rel_col will be compared against.
rel_col | String or Ruby variable from getColumn() | The reliability column to compare to main_col.
match_arg | String | The argument used to match the reliability cells to the primary cells. Must be a unique identifier shared by each pair of cells.
time_tolerance | Integer | Amount of slack permitted, in milliseconds, between the two onsets and between the two offsets before a difference is considered an error. Set to 0 to tolerate no difference.
dump_file | String path or Ruby File object (optional) | The full string path of the file to dump the reliability output to. This can be used for multi-file dumps or just to keep a log. You can also pass a Ruby File object if a file has already been opened.

Note

match_arg is particularly important: for checkReliability() to know which cells to compare, it needs a code whose value is unique to each pair of corresponding primary and reliability cells. In many cases, the onset time of the cell can be used to match primary and reliability cells.
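The role of time_tolerance can be illustrated with a short plain-Ruby sketch. Everything here is an assumption made for illustration: times_agree? and percent_agreement are hypothetical helpers, the cells are made up, only timing is compared, and the percent-agreement formula (agreeing pairs divided by all pairs) is one plausible definition rather than the documented behavior of checkReliability().

```ruby
# Hypothetical helper: a pair agrees when both the onsets and the offsets
# differ by no more than time_tolerance milliseconds.
def times_agree?(primary_cell, rel_cell, time_tolerance)
  (primary_cell[:onset] - rel_cell[:onset]).abs <= time_tolerance &&
    (primary_cell[:offset] - rel_cell[:offset]).abs <= time_tolerance
end

# Hypothetical helper: share of pairs whose timing agrees, as a percentage.
def percent_agreement(pairs, time_tolerance)
  return 0.0 if pairs.empty?

  agreeing = pairs.count { |main, rel| times_agree?(main, rel, time_tolerance) }
  100.0 * agreeing / pairs.length
end

# Made-up pairs of [primary_cell, reliability_cell], already matched.
pairs = [
  [{ onset: 1000, offset: 2000 }, { onset: 1003, offset: 2000 }],
  [{ onset: 5000, offset: 6000 }, { onset: 5000, offset: 6010 }]
]

[0, 5, 10].each do |tolerance|
  puts "tolerance #{tolerance} ms: #{percent_agreement(pairs, tolerance)}% agreement"
end
```

With these made-up values, a tolerance of 5 ms accepts the 3 ms onset difference in the first pair but counts the 10 ms offset difference in the second pair as an error.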

  1. Create a variable that holds the output path using Ruby’s File.expand_path method, which converts a relative path, such as ~/Desktop/file.txt, to an absolute path that starts at the root directory and includes all subdirectories, such as /Users/alice/Desktop/file.txt.

    The following commands create a variable called dump_file that holds the absolute path to a file on the Desktop called relcheck.txt:

    require 'Datavyu_API.rb'
    begin
       dump_file = File.expand_path("~/Desktop/relcheck.txt")
    end

  2. Compare “trial” and “trial_rel” using checkReliability() with a 5 ms time tolerance, and output the results to dump_file:

require 'Datavyu_API.rb'
begin
   dump_file = File.expand_path("~/Desktop/relcheck.txt")

   # Compare the "trial" and "trial_rel" columns, using trialnumber as
   #  their matching code and dump the results to a file on the desktop.
   checkReliability("trial", "trial_rel", "trialnum", 5, dump_file)
end
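To see what File.expand_path does on its own, outside of a Datavyu script, here is a small plain-Ruby sketch. It passes an explicit base directory as the second argument so that the result does not depend on who runs it; dump_path is a hypothetical helper and /Users/alice is only an example location. In the steps above, "~" plays the role of the base directory and expands to the current user's home directory.

```ruby
# Hypothetical helper: build an absolute path for a dump file from a
# relative path and a base directory, using Ruby's File.expand_path.
def dump_path(relative_path, base_dir)
  File.expand_path(relative_path, base_dir)
end

# Example base directory (an assumption; use a directory that exists on
# your machine when writing real output).
base = "/Users/alice"

puts dump_path("Desktop/relcheck.txt", base)

# ".." segments are resolved as part of the expansion.
puts dump_path("Desktop/../relcheck.txt", base)
```

On a Unix-like system the first call yields /Users/alice/Desktop/relcheck.txt, the same form as the absolute path described in step 1.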

Video Example of Checking for Reliability

This video displays one way to check for inter-rater reliability for a single column in a spreadsheet.