
Is there a website or something?

asked 16 Jan '12, 16:09

Jesse ♦

edited 17 Jan '12, 09:29

Clinton

Introduction

The core developers have worked hard at making OpenSHAPA flexible, so that it can be extended and tailored to the needs of your research. One of the ways OpenSHAPA can be customised is through scripts: small sets of instructions you can use to automate various tasks within OpenSHAPA. You can write scripts to perform repetitive tasks and calculations, as well as handlers for importing and exporting data in custom file formats.

Scripts within OpenSHAPA are written in Ruby, and you can learn the basics of the language by trying this online demo (typing 'help' will get you started with a quick 15-minute tutorial).

Once you are familiar with the basics of Ruby, it is time to take a look at some sample OpenSHAPA scripts. In the samples directory of OpenSHAPA you will find scripts showing how to perform various actions in OpenSHAPA, and they make great templates for your own scripts.

Using the OpenSHAPA Ruby API

OpenSHAPA provides an API to abstract away working with the database. Since working with the database directly can be quite tedious, this API handles that for you. The API is bundled with OpenSHAPA itself, and the following tutorial will teach you all you need to get going with scripting in OpenSHAPA:

Part 1: The basics and a print script

The OpenSHAPA Scripting API contains many helpful functions to make scripting an easier experience. Functions to read and write to/from the database are provided, as well as functions to help you manipulate your data. OpenSHAPA variables, when called from the database, are loaded into the Ruby environment in a very easy to manipulate format. The provided OpenSHAPA_API.rb file contains references on how to use all of the available functions.

To begin creating a script, create a new file with your favorite text editor, preferably one with Ruby syntax highlighting. This means that different parts of the code will be colored, which makes the code easier to read. Depending on your operating system, I recommend the following programs:

  • Windows: Notepad++ (free and open source)
  • Mac OS X: TextMate (non-free and closed source) or TextWrangler (free)
  • Linux: Geany (free and open source, available in the apt repository) or Kate (also free and open source, comes with KDE and available in apt)

Now that your file is open, we are ready to begin scripting! Scroll down to the bottom of the file so you see the "begin" and "end" commands. These mark the beginning and end of the script. Because of the way Ruby parses scripts, these must be at the bottom of the script, otherwise the functions below them will not be available.
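To illustrate why the begin/end block goes at the bottom, here is a minimal plain-Ruby sketch that runs outside OpenSHAPA. The helper name format_label is hypothetical and not part of the OpenSHAPA API; the only point is that it is defined above the block that calls it.

```ruby
# Hypothetical helper, not part of the OpenSHAPA API.
# It is defined first so that it already exists when the main block runs.
def format_label(name, number)
  "#{name} #{number}"
end

begin
  # Main body of the script: format_label has been defined by this point.
  label = format_label("trial", 1)
  puts label
end
```

If the def were moved below the begin/end block, Ruby would reach the call before the helper existed and raise a NoMethodError.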

So fire up OpenSHAPA and load the data file you'd like to print. Make sure it has some cells in it so that we have something to print!

Now go back to the text editor. The first thing we need to do is tell Ruby to load one of our variables from the OpenSHAPA database. This is done with the getVariable command. Say one of our variables is named "trial", and it encodes the onset and offset of a trial and a trial number with the argument "trialnum". To call this variable into Ruby, use the following command:

 require 'OpenSHAPA_API.rb'
 begin
     trial = getVariable("trial")
 end

trial is now the name of the Ruby representation of our OpenSHAPA variable. This name could be anything, but it is best to choose something descriptive so that you still know what it does several months down the line. This script isn't very exciting yet, because it isn't doing anything that we can see! So let's have it print the ordinal, onset, offset, and trial number for each cell in the trial variable.

To loop through cells, we use a for loop. So the code:

 require 'OpenSHAPA_API.rb'
 begin
     for i in 1..10
         print i, "\n"
     end
 end

can be read in the following way. 1..10 specifies the range of whole numbers from 1 to 10, inclusive. So you get a list that is 1 2 3 4 5 6 7 8 9 10. You can change the numbers on either end to match the numbers you need to loop over, should you ever have to use this. print tells Ruby to print text to the screen, and the "\n" tells the computer to end that line. So this is just printing whatever number i is at that time in the loop. Note how the loop needs an end at the bottom of it. This closes the loop. All of the code between the for... line and that end will be executed inside the loop.

This loop can be read in English as "For each i in the range 1 to 10, print i".
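The same loop is often written with Ruby's each method and a block. This is a plain-Ruby sketch that does not need OpenSHAPA; the numbers array is an illustrative name that only records what the loop visits.

```ruby
numbers = []

# Visit every whole number from 1 to 10, inclusive.
(1..10).each do |i|
  numbers << i   # remember each value so it can be inspected afterwards
  puts i         # puts prints the value followed by a newline
end
```

Both forms visit the same values; which one you use is mostly a matter of taste.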

We can use this same construct when looping through the cells in our variable. The cells of a variable are accessed through the .cells array. For example, with our trial variable above, we can type trial.cells to access the cell array. Using all of this information, we can write a loop to print all of the cells in trial.cells.

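 require 'OpenSHAPA_API.rb'
 begin
     # Load the variable first so that trial.cells is available
     trial = getVariable("trial")
     for t_cell in trial.cells
         # Separate the fields with tabs so the values do not run together
         print t_cell.ordinal, "\t", t_cell.onset, "\t", t_cell.offset, "\t", t_cell.trialnum, "\n"
     end
 end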

Now that isn't so bad, is it? To access the arguments of a cell, simply type "cellname.fieldname". All cells always have an ordinal, onset, and offset, and then whatever arguments are in them. These argument names are always lower case and stripped of all non-alphabetic characters. For example, if your argument name was "trial_NUM" instead of "trialnum", then you would still access the argument with "t_cell.trialnum". This is forced because of how Ruby treats method names.
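As a rough illustration of that naming rule, the plain-Ruby sketch below lower-cases a name and strips everything that is not a letter. The helper ruby_arg_name is hypothetical; the API's own conversion code is not shown here and may differ in detail.

```ruby
# Hypothetical helper: mimics the naming rule described above.
# It is not the conversion code used by the OpenSHAPA API.
def ruby_arg_name(openshapa_name)
  openshapa_name.downcase.gsub(/[^a-z]/, "")
end

puts ruby_arg_name("trial_NUM")   # prints trialnum
```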

This section of the tutorial is now over. Please play around with the Ruby language and your editor, and become more comfortable with the syntax. There are some great online Ruby tutorials. Having a basic understanding of the language will make scripting much easier and more fun!

Part 2: Modifying cells and writing back to the OpenSHAPA database

Now that we know how to read and print the data inside the OpenSHAPA database, let's learn about the tools that were built to write changes back to it. There are two main functions for manipulating data in an OpenSHAPA database. The first is used to modify arguments in cells. Each cell has a method called change_arg that takes two parameters: the argument name and the value to change it to. The second is the setVariable function, which will write the variable back to the database. Changes made to a variable are not saved back to the database until setVariable is called on it. Let's see these in action using the same trial variable from the last section:

 require 'OpenSHAPA_API.rb'
 begin
     trial = getVariable("trial")
     for trial_cell in trial.cells
         trial_cell.change_arg("trialnum", trial_cell.trialnum.to_i + 1)
     end
     setVariable(trial)
 end

Whoa! OK, let's break down what is going on in that for loop. It isn't really that bad. We have our trial_cell, and we're changing one of its arguments with its change_arg method. The first parameter is the name of the argument we are changing, which is "trialnum". The second is the value we are changing it to. In this case, we are taking the trialnum of that cell and adding one to it. But since, for ease of use, all of the arguments are stored in Ruby as strings, we have to convert the string to an integer with the ".to_i" method. You can do this to any string that is a number; for example, "3".to_i turns "3" into 3.

setVariable has two forms. In the one used above, we are simply writing the variable back to the database with the changes, overwriting the previous cells with the new ones. If the form setVariable("test_trial", trial) were used instead, a new variable called "test_trial" would be created, and the original trial variable would be left alone in the database.
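The string-to-integer conversion can be tried in plain Ruby without OpenSHAPA. In this small sketch the variable names are illustrative only, and trialnum stands in for an argument value read from a cell.

```ruby
trialnum = "3"                   # arguments arrive in Ruby as strings

# Convert to an integer before doing arithmetic.
next_trial = trialnum.to_i + 1
puts next_trial                  # prints 4

# Without the conversion, + joins the two strings instead of adding numbers.
joined = trialnum + "1"
puts joined                      # prints 31

# A string that does not start with a number converts to 0.
not_a_number = "abc".to_i
```

Because to_i quietly returns 0 for non-numeric text, it is worth checking for blank or mistyped arguments before doing arithmetic on them.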

This is kind of confusing, so take a second to play around with it. You can address individual cells in the trial.cells array with square brackets. For example,

 trial.cells[0]

refers to the first cell in the cell array. Arrays in Ruby have a method called "length", which tells you how many elements they have in them. So if I wanted to access the last element of an array, I could use:

 trial.cells[trial.cells.length - 1]

Notice how we had to subtract one from the index. This is simply because length will tell you how many elements there are in an array, say 3. But they are indexed from 0, so element 1 is at index 0, element 2 at index 1, and element 3 at index 2. So, because length returns 3, we have to subtract 1 to get 2, which is the index of the last element in the array. Confusing? Yes, but this is consistent across most programming languages, with a notable exception being Matlab. Try out these functions for a while and become comfortable with them. That concludes this section.
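Ruby also offers two shorter ways to reach the last element. The sketch below uses an ordinary array of strings as a stand-in for trial.cells, so it runs without OpenSHAPA; it assumes that the cells array behaves like a normal Ruby array.

```ruby
cells = ["first", "second", "third"]   # stand-in for trial.cells

last_by_length = cells[cells.length - 1]   # index arithmetic, as described above
last_by_negative_index = cells[-1]         # negative indexes count from the end
last_by_method = cells.last                # Array#last returns the final element

puts last_by_length
```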

Part 3: Creating new variables and cells from scripts

The Ruby API also has functions to create new cells in a variable, and new variables in a spreadsheet. To create a new, blank variable, use the createNewVariable function. The first argument of this function is the name of the variable, and then the rest of the arguments are the arguments to appear in the variable. For instance, to make the trial example that we've been working with, you would use the command:

 trial = createNewVariable("trial","trialnum")

You can make the argument list as long as you'd like though, for example:

 trial = createNewVariable("trial","trialnum","testdate","birthdate","turndir")

That command will create a new variable named trial with the arguments trialnum, testdate, birthdate and turndir. The createNewVariable function will return the reference to the new variable, much like the getVariable function does for already existing variables. However, when you create a new variable, it is only created in Ruby and not in the OpenSHAPA database unless you use setVariable to write it back. This is done so it is possible to create temporary variables. For example, if you wanted to create a variable on the fly in order to merge two others, but didn't want to actually write it back to the database, then you can do that with this function. In order to create a new cell in a variable for, say, the trial variable, simply use:

 new_cell = trial.make_new_cell()

This will return a new blank cell and add it to the array of cells in the trial variable. To make changes to the cell, simply edit the new_cell variable, as this is a reference to the new cell in the array. For example,

 new_cell.change_arg("onset", 1000)

will make the new cell's onset 1000ms, or 00:00:01:000. New cells are created with 0 onset and 0 offset and have blank arguments. If you want to place the new cells at certain points, you will need to change their onset and offset after creating them.
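Since onsets and offsets are given in milliseconds, it can help to convert them to the HH:MM:SS:mmm form when checking a script's output. The helper format_timestamp below is a hypothetical plain-Ruby sketch and is not part of the OpenSHAPA API.

```ruby
# Hypothetical helper: turns a millisecond count into HH:MM:SS:mmm.
def format_timestamp(ms)
  hours, rest = ms.divmod(3_600_000)      # 3,600,000 ms in an hour
  minutes, rest = rest.divmod(60_000)     # 60,000 ms in a minute
  seconds, millis = rest.divmod(1000)     # 1,000 ms in a second
  format("%02d:%02d:%02d:%03d", hours, minutes, seconds, millis)
end

puts format_timestamp(1000)   # prints 00:00:01:000
```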

Part 4: Reliability columns and strategies

The API contains several easy-to-use functions for making simple columns for inter-rater reliability. Their main purpose is simply to make life easier if you want to copy parts of a variable for coding. To use the makeRel function to make a copy of the trial column above, simply type:

 reltrial = makeRel("trial", "rel.trial", 2, "onset", "trialnum")

This usage will copy every other cell (the 2; every fourth would be 4, and so on) in the trial column into a column called "rel.trial", and will copy over the onset of the cell and the trialnum of the cell, but leave everything else blank (the offset and so on). This function will write the new column to the database for you, and also return a reference to the new column in case you want to edit it some more.
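To get a feel for what "every other cell" means, here is a plain-Ruby sketch of picking every nth element from an array. It assumes the selection starts with the first cell, which may not match what makeRel actually does, and the numbers simply stand in for cell ordinals.

```ruby
cells = [1, 2, 3, 4, 5, 6, 7]   # stand-in for the ordinals of the trial cells

# Split the array into groups of n and keep the first element of each group.
# Assumption: selection starts at the first cell.
every_other = cells.each_slice(2).map(&:first)
every_fourth = cells.each_slice(4).map(&:first)

puts every_other.inspect    # prints [1, 3, 5, 7]
puts every_fourth.inspect   # prints [1, 5]
```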

This method of generating columns works great for studies that have discrete, known events in a trial setup. However, many studies are coded continuously and do not conform to a discrete trial setup. For example, if you are coding the activities of a subject walking around the room, but the study has little to no structure, you might not want to take every other cell that the primary coder coded. Instead, you'd want to pick random blocks of time for the reliability coder to code. Luckily, there is a function in the API for this too!

Let's say that we have several columns. One is a long block of time, called "block". Within that block are continuously coded events, like a cell for every step a subject takes in a room. Let's call this "steps". Now, if we wanted reliability on the steps, telling the reliability coder where the steps are would defeat the purpose of the reliability check. But we can use the "block" variable to bound our times, having our reliability coder code sections of "block". Let's call our new column "rel.steps". Let's also have it take every 3rd (the 3 in the function call below) 15-second (the 15 in the call below) block.

 makeDurationBlockRel("rel.steps", "steps", "block", 15, 3)

There we go! Easy! Please play around with these functions and understand how they work. They will help make your life much easier for general reliability passes!
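The idea of taking every 3rd 15-second block can be sketched in plain Ruby. The helper every_nth_window is hypothetical and is not how makeDurationBlockRel is implemented; it assumes times are in milliseconds, that windows are laid end to end from the block's onset, and that selection starts with the first window.

```ruby
# Hypothetical helper: split [onset_ms, offset_ms) into windows of window_s
# seconds and keep every nth window, starting with the first (an assumption).
def every_nth_window(onset_ms, offset_ms, window_s, nth)
  window_ms = window_s * 1000
  windows = []
  start = onset_ms
  while start < offset_ms
    # Clip the final window so it never runs past the end of the block.
    windows << [start, [start + window_ms, offset_ms].min]
    start += window_ms
  end
  windows.each_slice(nth).map(&:first)
end

# A 90-second block, cut into 15-second windows, keeping every 3rd one.
picked = every_nth_window(0, 90_000, 15, 3)
puts picked.inspect   # prints [[0, 15000], [45000, 60000]]
```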

This answer is marked "community wiki".

answered 18 Jan '12, 16:09

Clinton

edited 31 May '12, 11:51

question asked: 16 Jan '12, 16:09

question was seen: 4,688 times

last updated: 31 May '12, 11:51
