Animal Behavior Reliability
  • Home
  • About
  • Foundations
    • Proposal
    • Measurements >
      • Definitions
    • Team makeup
    • Training >
      • Features of test subsets
      • Assessment
    • Metrics
  • Diving deeper
    • Iterative training processes >
      • Tasks and techniques
      • Categorical data
      • Continuous data
      • Rare outcomes
    • Timeline
    • Troubleshooting
    • Reporting
  • Checklist
  • Resources

Continuous data

Continuous data includes behaviors, scored from video or in person, that generate outcome variables on an interval or ratio scale.

Evaluating consistency in continuous data:

Example:
One example of continuous data that we often use in our research is the duration of time spent performing a behavior. In this case, we wanted to train a team to score self-grooming in dairy heifers (Downey and Tucker, 2023). We defined this behavior as "Touching hair with the tongue or mouth on heifer's own body; includes if mouth is not visible but directed toward body and the head moves in a vertical (up or down) motion." Trainees were given twenty-one 5-minute videos from a variety of heifers and instructed to score self-grooming continuously in each video. This matched the modality for data collection.
Values were then compared against an expert's scores for the same videos. Below, you can see the results, in seconds, from one trainee compared against the expert. Even a glance at this table suggests the trainee was very close to the expert, as the values are very similar. We can also evaluate these responses visually, by graphing them (black dots + dotted trendline) and assessing how close they fall to the ideal fit (red line).

Original data output from reliability training for self-grooming


Visual comparison of self-grooming scores between the trainee and expert
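As a quick numeric companion to this kind of graph, the deviation of each trainee score from the ideal-fit (y = x) line can be summarized in a few lines. The duration values below are hypothetical, not the study's data:

```python
# Hypothetical self-grooming durations (s) per video -- not the study's data
expert  = [12.0, 0.0, 34.5, 7.2, 21.0]
trainee = [11.5, 0.0, 35.0, 7.2, 19.8]

# Deviation of each trainee score from the ideal fit (y = x)
deviations = [t - e for e, t in zip(expert, trainee)]
mad = sum(abs(d) for d in deviations) / len(deviations)
print(f"mean absolute deviation from ideal fit: {mad:.2f} s")
# -> mean absolute deviation from ideal fit: 0.44 s
```

Small deviations relative to the typical duration of the behavior support what the graph shows visually.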

In this particular example, we also looked at figures generated from BORIS to confirm that the trainee identified grooming at the same moments as the expert; it is possible to have similar overall durations for a behavior yet score it at different times. The figures below show the results for Video 1 in the training. Multiple behaviors were scored continuously in each video, as the training described here for self-grooming was conducted similarly for a total of 10 behaviors of interest. Focusing just on the orange track, which is self-grooming, we can see that the trainee scored this behavior as occurring at the same times that the expert reported.

Observer 1 (Expert)


Observer 2 (Trainee)
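Timing agreement of the kind shown in the BORIS figures can also be checked numerically: given each observer's scored bouts as (start, stop) times, we can compute how much of the behavior both observers scored at the same moments, alongside each observer's total duration. A minimal sketch with hypothetical bout times (not the study's data):

```python
def total_duration(intervals):
    """Sum of bout lengths in seconds."""
    return sum(stop - start for start, stop in intervals)

def overlap_duration(a, b):
    """Total time both observers scored the behavior simultaneously."""
    total = 0.0
    for s1, e1 in a:
        for s2, e2 in b:
            total += max(0.0, min(e1, e2) - max(s1, s2))
    return total

# Hypothetical bouts (start, stop in s) for one video -- not the study's data
expert  = [(10.0, 25.0), (40.0, 52.0)]
trainee = [(11.0, 25.0), (41.0, 53.0)]

print(total_duration(expert))             # 27.0
print(total_duration(trainee))            # 26.0
print(overlap_duration(expert, trainee))  # 25.0
```

Totals that match while the overlap is much smaller would indicate the two observers scored the behavior at different times despite similar overall durations.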


Grooming scores were then evaluated with intraclass correlation coefficients (ICC). In this case, the trainee achieved an ICC of 1, which exceeded our cutoff of 0.9, and was allowed to proceed to independent video scoring.
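The exact ICC form used in the study is not specified here, so as a sketch only, the following implements one common variant, ICC(3,1) (two-way mixed effects, consistency), in plain Python. The duration pairs are hypothetical:

```python
def icc_3_1(scores):
    """ICC(3,1): two-way mixed effects, single rater, consistency.
    scores[i] = one target's (video's) ratings [rater_1, ..., rater_k]."""
    n, k = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ms_rows = ss_rows / (n - 1)                                  # between-videos
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))  # residual
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)

# Hypothetical [expert, trainee] durations (s) per video -- not the study's data
pairs = [[12.0, 11.5], [0.0, 0.0], [34.5, 35.0], [7.2, 7.2], [21.0, 19.8]]
print(icc_3_1(pairs))
```

In practice, a statistics package (e.g., an ICC routine in R or Python) reports all ICC forms with confidence intervals; the point of the sketch is only to show what the coefficient compares: variance between videos against residual disagreement between observers.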

For examples of what happens when metrics are not acceptable after training, see Troubleshooting.