Coding Theory and Practice

Coding Theory and Practice

Useful Definitions

You may want to reference Datavyu’s User Guide to gain deeper understanding of various key terms.

A code tags a section of video with an identifier. Codes for outcome measures typically reflect the expression or non-expression of particular behaviors or traits. Codes can also represent participant information, conditions, tasks, predictors, and independent variables. In Datavyu, codes are represented by a cell in the coding spreadsheet or by a variable within the cell. When coders “code” video files, they insert cells and type letters into prompts for each variable within the cell.

A coding manual describes and documents what coders should do (and what previous coders did do) while scoring the videos. It formalizes the coding decisions by defining what each code represents, and the criteria for coders’ decisions. This information is valuable for researchers who may analyze or revisit their data months or years after its collection, for setting conventions within and across labs, and for sharing and repurposing data. You likely want a separate coding manual for each study. You can use an existing manual as a template to set up a new coding manual.

A coding pass reflects a complete scoring of a video file for one variable or set of variables. In Datavyu, a pass is generally a column of cells with codes.

A spreadsheet organizes and stores your codes for a particular video file. In Datavyu, the spreadsheets are the Datavyu files. Each spreadsheet is automatically linked with its corresponding video file. In Datavyu, your codes are in cells and your cells are in columns. Regardless of coding software, you should expect to develop your coding manual and set up your template coding spreadsheet in tandem. This is an iterative process and you will likely need to make changes or want to add new coding passes and/or codes down the line.

A comment is a note by a coder. Comments can be completely informal and used effectively to get an idea of what behaviors of interest are on the videos. Comments can be more formalized (by adding the coder’s name and date of comment) and used in a more serious way to locate excerpts, highlight problems/discrepancies, or explain a coding decision.

The ordinal is the number of the cell in the sequence of cells in a column. Ordinals are an important way to keep track of a sequence of cells or the identity of a cell when time is not useful.

In Datavyu, onset and offset times are the two times that accompany each cell. Typically, the onset time marks the beginning of an event and the offset time marks the end of the event. Sometimes events are continuous (e.g., when baby looks left, the look to the right ends; when baby looks away, the look to the left ends). Sometimes events are isolated (e.g., after trial #1 ends, there are several seconds or minutes before trial #2 begins). Sometimes only the onset or the order of events is important; in that case, you can code using point cells, where there is only one time associated with each cell (onset and offset are the same number). Sometimes onsets and offsets are arbitrary (maybe you want to assess a behavior every two minutes or randomly sample 10 minutes of behavior from each hour); Datavyu scripts make time sampling easy. Datavyu has special keys for entering onsets/offsets for continuous events, isolated events, and point cells. Sometimes you will want to use onset or offset times as a way to link cells across columns. Note that the notion of onset and offset is only a convention. In Datavyu, it is possible for the “onset” time to be later than the “offset” time, if for example, you want one time to represent the start of event A and the other time to represent the start of event B and sometimes B precedes A. In these cases, Datavyu will display the cell with a red line (don’t worry, the red line is only a tag; it does not mean that you made a mistake unless your code does not allow offset times to precede onset times).

In Datavyu, a script is a routine written in the Ruby programming language that allows you to manipulate the data in your spreadsheet, add/delete codes from your spreadsheet, insert or delete cells in your spreadsheet, insert or delete columns in your spreadsheet, conduct analyses on your codes, import data into the spreadsheet, and export spreadsheet data in whatever format you desire. Scripts can operate on a single spreadsheet or on all of the spreadsheets in a file (e.g., 100s of spreadsheets simultaneously).

Coding Criteria and Types of Codes

Behavioral codes lie along a continuum. Implicit (sometimes called “subjective”) codes are at one end of the continuum and explicit (sometimes called “objective”) codes are at the other end of the continuum. The difference between the two types of codes is whether the behavioral criteria for codes are implicit or explicit. Implicit codes do not require the observer to see particular behaviors; explicit codes do require this. As illustrated in the following 2 x 2 table, the benefits of one are the failure of the other. Implicit criteria allow coders to determine the code based on their own judgments of what behavior is being expressed; coders can take individual differences between participants into account. Explicit criteria force coders to determine the code based on whether a particular behavior was expressed; individual differences between participants and particulars of the situation must be ignored. With an implicit code for “falling,” for example, coders use their own judgment to decide whether the infant lost balance in the particular instance. With an explicit code for “falling,” coders must use explicit criteria such as whether the infant’s hands or bottom touched the floor, whether the transition from upright occurred within a particular time frame, or whether an experimenter or parent grasped the infant’s body. With an implicit code for “negative affect,” coders use their own judgment to decide whether a child feels distress or anger. With an explicit code, coders must use explicit criteria such as whether the child’s brows were knit, lip was jutted, mouth was in a square shape, or crying/tears were expressed.

Type

Implicit Coding Criteria

Explicit Coding Criteria

Pros

Reflect individual differences in the manner of expressing the target behavior

Know how behavior was expressed in each instance and individual

Cons

Do not know how behavior was expressed in each instance and individual

Ignores individual differences in manner of expressing the target behavior

Implicit and explicit codes can be equally reliable (in terms of inter-rater reliability and consistency of participants’ responses) and equally valid (meaning that the codes reflect the behaviors you intend to measure). The benefit of an explicit code is that you will know exactly what coders scored (e.g. baby stopped for at least 0.5s at the edge of the obstacle with feet or hands touching the obstacle, etc.).

In some cases, implicit codes are your best bet and can assure you that an explicit code was sufficiently exhaustive. For example, in a study asking whether infants defer to mothers’ advice about walking down slopes, we worried that mothers’ delivery of encouragement or discouragement might have been influenced by the severity of the slope of an incline; perhaps mothers did not encourage as enthusiastically on steep slopes as they did on shallow ones or did not discourage as enthusiastically on shallow slopes as they did on steep ones. Thus, we asked blind coders to judge whether the slope was shallow, steep, or intermediate and whether mothers were providing an encouraging or discouraging message based solely on mothers’ behaviors. With this implicit code, coders judged type of message nearly perfectly, but judged degree of slope exactly at chance. We thus satisfied ourselves that the explicit codes were sufficiently exhaustive.

Implicit and explicit codes can be of any granularity. Datavyu can provide detailed frame-by-frame coding (in milliseconds) and global approximate coding (region of video, ordinal only) or both. It’s entirely up to you. Implicit and explicit codes can refer to durations (involving onset and offset times) and to categories (did behavior occur yes/no; which of several behaviors occurred; what is the ranking of the behavior).

Non-behavioral codes can also be scored in Datavyu. For example, you can type in or import information about participant demographics, the observational setting, various conditions and independent variables, and so on. Non-video data can be imported into Datavyu if you use our code and Ruby API. You can import your own data into Datavyu and use it to identify interesting sections of video or you can use the video to identify interesting sections of other synchronized data streams.