Hi, I have a reliability scripting question. In our case, we have one column in each of two coders' files that we'd like to compare for reliability, and it has 4 codes in it. However, the number or order of cells is unlikely to be a good thing to match on, since that relies on the coders having found the right moments in the video in the first place. That is, coders mark each time an object word is said by the parent to the child using the cell onset/offset, and then fill in 4 codes: what the object word is (e.g. banana), what kind of utterance it is (e.g. q for question), whether it's present in the view of the child (y/n), and who said it (e.g. MOT for mother).

I'm currently just using a simple compare_columns Ruby script, then doing temporal alignment and having the two coders generate a consensus version that they discuss. What would be great is to then calculate the degree to which each of the original coded columns agreed with the final consensus version for each code, with some sort of time buffer within which a cell would have to match, e.g. it's okay if the onsets and offsets overlap but are not the same, or are <1s apart, etc.

Is this straightforward to implement with the existing Ruby functions? Glancing at the previous scripts you provided, I wasn't quite sure how to implement this; some sort of mini sample script would be great! [or an explanation of why this is a fool's errand :)]

asked 13 Jan '15, 09:35

eb732


Have you already looked into the create_mutually_exclusive function from the API? It takes two columns and constructs a third column that is basically an exclusive-or of all the cells from the two input columns. It will give you a comprehensive list of all disagreements between two columns in a spreadsheet.

One of the simplest things to do would then be to add up the durations of all cells in the newly created "mutex" column to find the total amount of time the disagreements cover. If you want to give some leeway to small disagreements, you could go through each of the cells in the mutex column with a script and filter out the cells that meet your criteria for an acceptable mismatch.
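As a rough illustration of the summing and filtering idea, here is a minimal sketch in plain Ruby. It does not call the Datavyu API: `Cell` is a hypothetical stand-in for a spreadsheet cell, the three mutex cells are made-up data, and the 1-second `TOLERANCE_MS` is only an assumed leeway. In a real script the cells would come from the mutex column instead.

```ruby
# Hypothetical stand-in for a Datavyu cell (onset/offset in milliseconds).
# This is NOT the Datavyu API, just enough structure to show the idea.
Cell = Struct.new(:onset, :offset) do
  def duration
    offset - onset
  end
end

# Made-up contents of a "mutex" (disagreement) column.
mutex_cells = [
  Cell.new(1_000, 1_200),  # 200 ms: small boundary mismatch
  Cell.new(5_000, 7_500),  # 2500 ms: one coder missed the event
  Cell.new(9_000, 9_400)   # 400 ms: small boundary mismatch
]

# Assumed leeway: disagreements shorter than this are treated as acceptable.
TOLERANCE_MS = 1_000

# Total time covered by every disagreement cell.
total_ms = mutex_cells.map(&:duration).inject(0, :+)

# Drop the disagreements that fall within the leeway, then sum what is left.
serious_cells = mutex_cells.reject { |c| c.duration < TOLERANCE_MS }
serious_ms = serious_cells.map(&:duration).inject(0, :+)

puts "All disagreement time:         #{total_ms} ms"
puts "Disagreement beyond tolerance: #{serious_ms} ms"
```

Filtering on duration is only one possible criterion; the same `reject` block could test anything else about the cell.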

Do both of your coders code the same amount of data? Or does one coder code a subsection?

answered 25 Jan '15, 21:15

Shohan Hasan ♦♦

edited 25 Jan '15, 21:16

Hm, I played around a bit with create_mutually_exclusive, but I'm not sure it's doing what I have in mind. Maybe I don't understand your suggestion about going through the mutex column to filter. Both coders code the same amount of data, and the idea is to quantify the difference between each coder's column and the consensus column, for each code in each cell. However, some cells may be missing altogether from one column, so the order and number of cells will vary (and adding up durations won't work). Let me know if it's easier to write out an example; I'm hitting the char. limit

(02 Feb '15, 18:04) eb732

Don't worry, this is a common task, although we run a reliability script first to create a disagreements column that helps the coders discuss and come up with a final column. Checking each coder for reliability against the final column is also possible; you just have to run reliability using create_mutually_exclusive twice (once for each coder).
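To make the coder-versus-final comparison concrete, here is a minimal sketch in plain Ruby. Note that it does not use create_mutually_exclusive or any other Datavyu API call; instead it pairs cells directly by time with a buffer and then tallies agreement per code. `CodedCell`, `close_in_time?`, `agreement_by_code`, the 1-second `BUFFER_MS`, and all the sample cells and code values other than banana, q, y/n and MOT are hypothetical and only there for illustration.

```ruby
# Hypothetical stand-in for a Datavyu cell; NOT the Datavyu API.
# Times are in milliseconds; codes is a Hash of code name => value.
CodedCell = Struct.new(:onset, :offset, :codes)

BUFFER_MS = 1_000  # assumed buffer: cells may be up to 1 s apart

# True if the cells overlap, or if the gap between them is at most buffer_ms.
def close_in_time?(a, b, buffer_ms)
  a.onset <= b.offset + buffer_ms && b.onset <= a.offset + buffer_ms
end

# For every consensus cell, pick the not-yet-used coder cell that is close
# enough in time and nearest in onset, then tally agreement for each code.
# A consensus cell with no partner counts as a disagreement on every code.
def agreement_by_code(coder_cells, consensus_cells, buffer_ms)
  tallies = Hash.new { |h, k| h[k] = { agree: 0, total: 0 } }
  unused = coder_cells.dup
  consensus_cells.each do |gold|
    match = unused.select { |c| close_in_time?(c, gold, buffer_ms) }
                  .min_by { |c| (c.onset - gold.onset).abs }
    unused.delete_at(unused.index { |c| c.equal?(match) }) if match
    gold.codes.each do |name, value|
      tallies[name][:total] += 1
      tallies[name][:agree] += 1 if match && match.codes[name] == value
    end
  end
  tallies
end

# Made-up data: the coder missed the second event and disagreed on one code.
consensus = [
  CodedCell.new(1_000, 2_000, { object: "banana", utterance: "q", present: "y", speaker: "MOT" }),
  CodedCell.new(5_000, 6_000, { object: "ball",   utterance: "s", present: "n", speaker: "MOT" }),
  CodedCell.new(9_000, 9_800, { object: "cup",    utterance: "q", present: "y", speaker: "FAT" })
]
coder1 = [
  CodedCell.new(1_200, 2_100, { object: "banana", utterance: "q", present: "n", speaker: "MOT" }),
  CodedCell.new(9_100, 9_700, { object: "cup",    utterance: "q", present: "y", speaker: "FAT" })
]

agreement_by_code(coder1, consensus, BUFFER_MS).each do |name, t|
  puts format("%-10s %d/%d", name, t[:agree], t[:total])
end
```

Running the same function once per coder gives one set of per-code tallies for each of them. Cells that a coder marked but that are absent from the final column are not counted in this sketch; whether to penalize those is a design choice.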

The script may be fairly complicated if you are not familiar with the Datavyu API, so I can help you get started with one if you send de-identified sample Datavyu files to shohan.hasan@nyu.edu.

(05 Feb '15, 09:55) Shohan Hasan ♦♦

question asked: 13 Jan '15, 09:35

question was seen: 5,558 times

last updated: 05 Feb '15, 09:56