Check Inter-rater Reliability to Improve Data Accuracy
Reliability coding verifies that the desired codes are observable and that the coders are interpreting events accurately. With reliability coding, two coders separately code the same data source. You can create a reliability column, which the second coder can use to record his or her observations. After both coders have coded the same data source, they can compare their codes and determine their inter-rater reliability.
Note
While it may be tempting to have the reliability coder code the data source in a new spreadsheet, you should endeavor to keep all codes for a given data source in a single spreadsheet. This facilitates analysis and ensures that like data are kept together. To prevent the reliability coder from seeing the original coding pass, you can hide the original coded column.
Make a Reliability Column
In general, when creating reliability columns you create blank cells that correspond to the cells created during the first coding pass. This ensures that the two coders will have observed the same events in the data stream and allows easy comparison of the two coding passes.
The Datavyu Ruby API provides the makeReliability() method for creating reliability variables. makeReliability() has four parameters:
| Parameter | Type | Description |
| --- | --- | --- |
| relname | String or Ruby column from getColumn() | The name of the new reliability column you will create. The convention is to name it rel_columnName, but you can name it whatever you want to. |
| column_to_copy | String | The name of the column that you want to create a reliability column from (i.e., the existing coded column). |
| multiple_to_keep | Integer (optional) | Which cells to include: a value of 2 includes every other cell in the new variable, 1 includes every cell, and 0 creates a blank column with no cells. |
| *codes_to_keep | Comma-separated strings (optional) | Codes you want to copy from the original column to the new column. These are codes that the reliability coder will not have to code. |
Tip
Copying the onset of the original column to the new reliability column, by listing it in the codes_to_keep parameter, makes it easier for the second coder to navigate to the correct locations in the data source and to code the same events as the original coder.
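The effect of the cell multiple and the copied codes can be sketched in plain Ruby, outside Datavyu. This is only an illustration: the Cell struct, the sample data, and the helper name build_reliability_cells are hypothetical, and the sketch assumes that counting starts at the first cell. It is not the API's implementation.

```ruby
# Hypothetical stand-in for coded cells; real Datavyu cells are API objects.
Cell = Struct.new(:onset, :offset, :codes)

# Sketch: keep every `multiple_to_keep`-th cell (assumed to start at the
# first cell), copy the listed codes, and blank everything else so the
# second coder can fill it in.
def build_reliability_cells(cells, multiple_to_keep, *codes_to_keep)
  return [] if multiple_to_keep.zero?

  kept = cells.each_slice(multiple_to_keep).map(&:first)
  kept.map do |cell|
    onset = codes_to_keep.include?("onset") ? cell.onset : 0
    copied = cell.codes.map do |name, value|
      [name, codes_to_keep.include?(name) ? value : ""]
    end.to_h
    Cell.new(onset, 0, copied)
  end
end

trial = [
  Cell.new(1000, 2000, { "trialnum" => "1", "result" => "hit" }),
  Cell.new(3000, 4000, { "trialnum" => "2", "result" => "miss" }),
  Cell.new(5000, 6000, { "trialnum" => "3", "result" => "hit" }),
  Cell.new(7000, 8000, { "trialnum" => "4", "result" => "miss" })
]

trial_rel = build_reliability_cells(trial, 2, "onset", "trialnum")
trial_rel.each { |cell| puts "onset=#{cell.onset} codes=#{cell.codes}" }
```

With a multiple of 2, the sketch keeps the first and third cells, carries over their onset and trialnum values, and leaves the result code empty for the reliability coder.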
When making a reliability column, you should also think about how you are going to compare the columns. checkReliability(), which you use to check reliability, requires that each pair of cells has a unique identifier that links them together. For example, a trial number coded into each cell would match corresponding cells, even if only a subset of cells were included in the reliability variable.
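Before comparing columns, it can help to confirm that the identifier really is unique and that every reliability cell has a partner. The following plain-Ruby sketch shows the idea on hypothetical data, with cells represented as hashes; duplicate_ids and unmatched_ids are illustrative helper names, not part of the Datavyu API.

```ruby
# Hypothetical cells, each represented as a hash of code name => value.
primary = [
  { "trialnum" => "1" }, { "trialnum" => "2" },
  { "trialnum" => "3" }, { "trialnum" => "4" }
]
reliability = [{ "trialnum" => "1" }, { "trialnum" => "3" }]

# Values of `code` that appear in more than one cell of the column.
def duplicate_ids(cells, code)
  cells.map { |cell| cell[code] }
       .group_by { |value| value }
       .select { |_, values| values.size > 1 }
       .keys
end

# Reliability identifiers that have no counterpart in the primary column.
def unmatched_ids(primary_cells, reliability_cells, code)
  reliability_cells.map { |cell| cell[code] } - primary_cells.map { |cell| cell[code] }
end

puts "duplicates: #{duplicate_ids(primary, 'trialnum').inspect}"
puts "unmatched:  #{unmatched_ids(primary, reliability, 'trialnum').inspect}"
```

Both lists being empty suggests the identifier can pair the cells, even though the reliability column contains only a subset of the trials.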
The following example uses the sample data to create a new reliability column called “trial_rel” from its “trial” column, skipping every other cell and copying over the onset and trialnum codes so that the reliability coder doesn’t have to recode onsets and trial numbers.
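require 'Datavyu_API.rb'
begin
  # Create "trial_rel" from "trial", keeping every other cell and
  # copying the onset and trialnum codes.
  makeReliability("trial_rel", "trial", 2, "onset", "trialnum")
end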
Note
You do not have to write the variable back to the spreadsheet. makeReliability() automatically writes its results to the spreadsheet.
Check Reliability
Once the second coder has recorded their observations in the reliability column, you can use checkReliability() to compare the cells of the primary and reliability columns. checkReliability() reports the number of errors and the percent agreement. checkReliability() has four required parameters and one optional one:
| Parameter | Type | Description |
| --- | --- | --- |
| main_col | String or Ruby variable from getColumn() | The primary column that rel_col will be compared against. |
| rel_col | String or Ruby variable from getColumn() | The reliability column to compare to main_col. |
| match_arg | String | The argument used to match the reliability cells to the primary cells. Must be a unique identifier shared by each pair of cells. |
| time_tolerance | Integer | Amount of slack permitted, in milliseconds, between the two onsets and offsets before a difference is considered an error. Set to 0 to tolerate no difference. |
| dump_file | String path or Ruby File object (optional) | The full string path of the file to dump the reliability output to. This can be used for multi-file dumps or just to keep a log. You can also give it a Ruby File object if a file is already started. |
Note
match_arg is particularly important: for checkReliability() to know which cells to compare, it needs some argument whose value is unique to each pair of corresponding primary and reliability cells. In many cases, the onset time of the cell can be used to match primary and reliability cells.
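To make the roles of match_arg and time_tolerance concrete, here is a minimal plain-Ruby sketch of a matched comparison. It is not the checkReliability() implementation: cells are hypothetical hashes, compare_columns is an illustrative name, and the scoring rule (each onset, offset, and code counts as one comparison) is an assumption made for the example.

```ruby
# Sketch only: cells are plain hashes here, not Datavyu cell objects.
def compare_columns(main_cells, rel_cells, match_arg, time_tolerance)
  main_by_id = main_cells.each_with_object({}) { |cell, h| h[cell[match_arg]] = cell }
  errors = 0
  comparisons = 0

  rel_cells.each do |rel|
    main = main_by_id[rel[match_arg]]
    next if main.nil? # no partner cell; simply skipped in this sketch

    rel.each do |name, value|
      next if name == match_arg

      comparisons += 1
      if %w[onset offset].include?(name)
        # Times disagree only when they differ by more than the tolerance.
        errors += 1 if (value - main[name]).abs > time_tolerance
      elsif value != main[name]
        errors += 1
      end
    end
  end

  agreement = comparisons.zero? ? 0.0 : 100.0 * (comparisons - errors) / comparisons
  [errors, agreement]
end

main = [
  { "trialnum" => "1", "onset" => 1000, "offset" => 2000, "result" => "hit" },
  { "trialnum" => "2", "onset" => 3000, "offset" => 4000, "result" => "miss" },
  { "trialnum" => "3", "onset" => 5000, "offset" => 6000, "result" => "hit" }
]
rel = [
  { "trialnum" => "1", "onset" => 1003, "offset" => 2000, "result" => "hit" },
  { "trialnum" => "3", "onset" => 5000, "offset" => 6040, "result" => "miss" }
]

errors, agreement = compare_columns(main, rel, "trialnum", 5)
puts "#{errors} errors, #{agreement.round(1)}% agreement"
```

In this data, the 3 ms onset difference in trial 1 falls within the 5 ms tolerance, while the 40 ms offset difference and the differing result code in trial 3 each count as an error.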
1. Create an object that holds the output path using Ruby’s File.expand_path method, which converts a relative or home-relative path, like ~/Desktop/file.txt, to an absolute path name, which contains the root directory and all sub-directories, like /Users/alice/Desktop/file.txt.

   The following commands create a variable called dump_file that holds the absolute path to a file on the Desktop called relcheck.txt:

   require 'Datavyu_API.rb'
   begin
     dump_file = File.expand_path("~/Desktop/relcheck.txt")
   end

2. Compare the “trial” column with its reliability column using checkReliability(), with a 5 ms time tolerance, and output the results to the dump_file:

   require 'Datavyu_API.rb'
   begin
     dump_file = File.expand_path("~/Desktop/relcheck.txt")
     # Compare the "trial" and "trial_rel" columns, using trialnum as
     # their matching code, and dump the results to a file on the desktop.
     checkReliability("trial", "trial_rel", "trialnum", 5, dump_file)
   end
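The path handling can be tried in plain Ruby without Datavyu. The sketch below shows what File.expand_path produces and how a File object opened in append mode can collect output from several runs. The log lines are placeholders, and a temporary directory stands in for the Desktop so that the example leaves no files behind.

```ruby
require 'tmpdir'

# "~" expands to the current user's home directory.
dump_path = File.expand_path("~/Desktop/relcheck.txt")
puts dump_path

# A File object opened in append mode keeps earlier output, which suits a
# log shared across several spreadsheets.
log_contents = Dir.mktmpdir do |dir|
  log_path = File.join(dir, "relcheck.txt")

  File.open(log_path, "a") do |log|
    # Inside Datavyu, `log` could be passed as the dump_file argument.
    log.puts "reliability check: file 1 (placeholder)"
  end
  File.open(log_path, "a") { |log| log.puts "reliability check: file 2 (placeholder)" }

  File.read(log_path)
end

puts log_contents
```

Passing a String path is the simpler choice for a single check. An already open File object is mainly useful when one log should cover several files.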
Video Example of Checking for Reliability
This video displays one way to check for inter-rater reliability for a single column in a spreadsheet.