Animal Behavior Reliability
  • Home
  • About
  • Foundations
    • Proposal
    • Measurements >
      • Definitions
    • Team makeup
    • Training >
      • Features of test subsets
      • Assessment
    • Metrics
  • Diving deeper
    • Iterative training processes >
      • Tasks and techniques
      • Categorical data
      • Continuous data
      • Rare outcomes
    • Timeline
    • Troubleshooting
    • Reporting
  • Checklist
  • Resources

Categorical data

We frequently collect data in discrete categories, for example by assigning a yes/no value or a score of 1, 2, or 3.

Evaluating consistency in categorical data:

Example:
One type of categorical data that we work with is hygiene scoring, a way of rating animal cleanliness on a scale. In this case, our scale ran from 1 to 3, with 1 being clean, 2 moderate, and 3 dirty. Our test subset for reliability training consisted of 30 photos of dairy cattle. We built variation into the photos by including cattle of different ages (younger animals are smaller, so the size threshold for what counts as "dirty" may differ), different hair colors (dirt is easier to see on a white cow than on a black cow), and different photo angles (during in-person data collection, cattle will not stand in one perfect position). We included 10 photos for each score so that all categories were equally represented in our statistical approach. We then calculated Cohen's kappa for each trainee's responses on all 30 photos against an expert's scores.
Trainees with kappa above 0.8 then moved on to live hygiene scoring training with an expert. We included this second step because data collection was ultimately performed live, so it was important that the training modality matched the true methodology. Once trainees reached high agreement here as well (Cohen's kappa > 0.8), they could move on to unsupervised data collection, i.e. unsupervised hygiene scoring of cattle.
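
For readers who want to check a kappa score by hand, the calculation described above can be sketched in a few lines of Python. This is a minimal illustration of unweighted Cohen's kappa, not the code behind our photo test. The function name cohens_kappa and the trainee responses are hypothetical; only the design (30 photos, 10 per score, threshold of 0.8) comes from the example above.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same, non-empty set of items")
    n = len(rater_a)

    # Observed agreement: proportion of items given the same score by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each category, multiply the two raters' marginal
    # proportions, then sum over categories.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        raise ValueError("Kappa is undefined when both raters use a single category")
    return (observed - expected) / (1 - expected)


# Hypothetical data: expert scores for 30 photos, 10 per hygiene score.
expert = [1] * 10 + [2] * 10 + [3] * 10

# Hypothetical trainee who disagrees with the expert on 3 of the 30 photos.
trainee = list(expert)
trainee[9] = 2   # a "clean" photo scored as moderate
trainee[19] = 3  # a "moderate" photo scored as dirty
trainee[20] = 2  # a "dirty" photo scored as moderate

kappa = cohens_kappa(expert, trainee)
print(f"kappa = {kappa:.2f}")
print("Move on to live training" if kappa > 0.8 else "Repeat photo training")
```

In this made-up case the trainee matches the expert on 27 of 30 photos, which gives a kappa of 0.85 and clears the 0.8 threshold. Because the expert's scores are balanced across the three categories, chance agreement is always one third here, regardless of how the trainee distributes their scores. If you already use scikit-learn, its cohen_kappa_score function in sklearn.metrics should give the same unweighted value.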

To view or take the photo test, click here. The test automatically calculates and returns a kappa score at the end, so you can quickly see your own reliability score. Before taking the test, you'll need to read the specific definitions for each of the 3 hygiene scores here.

To see examples of what happens when metrics are not acceptable after training, see troubleshooting.