Animal Behavior Reliability

Visualization and Metrics

Formally evaluating reliability allows others to assess our approach. We describe the metrics we often use in our scientific papers and the rationale behind each.

There are a few common metrics or strategies used to evaluate reliability. When deciding which metric to use, consider the type of data you have, the goals of your training, and the pros and cons of each suitable method. In some cases, multiple metrics may be needed to provide robust and trustworthy information about reliability.

Step 1: Visual observation

Visually checking data during reliability testing is an important step. We recommend starting here and returning to these visualizations as you calculate metrics. Mismatches between metrics and the visual story can help you identify problems.
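
As one illustration of a mismatch between a metric and the visual story, the sketch below uses made-up scores from two hypothetical observers, where observer B scores every animal 2 units higher than observer A. The correlation is perfect, yet a scatterplot against the line of equality would show every point offset from it. The data and the helper name `pearson_r` are assumptions for illustration, not part of our workflow.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores: observer B is consistently 2 units above observer A.
observer_a = [1, 3, 4, 6, 8]
observer_b = [3, 5, 6, 8, 10]

r = pearson_r(observer_a, observer_b)
bias = mean(b - a for a, b in zip(observer_a, observer_b))
print(f"r = {r:.2f}, mean difference = {bias:.1f}")
```

Here the correlation alone would suggest excellent reliability, while the constant offset of 2 units only shows up in the plot or in the mean difference.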

Step 2: Identify an appropriate, robust metric for your data type

If your data are categorical, here are some commonly used metrics:
  • Concordance
  • Correlation: Ranks
  • Percent agreement
  • Including observer in the model
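
As a minimal sketch of the simplest metric in this list, percent agreement can be computed as the share of observations on which two observers assigned the same category. The behavior codes, the data, and the function name `percent_agreement` below are hypothetical.

```python
def percent_agreement(scores_a, scores_b):
    """Percentage of observations on which two observers gave the same category."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty score lists of equal length")
    matches = sum(a == b for a, b in zip(scores_a, scores_b))
    return 100.0 * matches / len(scores_a)

# Hypothetical behavior codes from two observers scoring the same 8 video clips.
observer_a = ["lying", "standing", "lying", "walking", "lying", "standing", "lying", "lying"]
observer_b = ["lying", "standing", "lying", "standing", "lying", "standing", "lying", "walking"]

print(percent_agreement(observer_a, observer_b))  # 6 of 8 clips match
```

Percent agreement does not correct for agreement expected by chance, which is one reason it may need to be paired with another metric.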
If your data are continuous, here are some commonly used metrics:
  • Correlation: ICC
  • Regression
  • Bland-Altman plot
  • Mean difference
  • Coefficient of variation
  • Including observer in the model
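
To make two of these concrete, the sketch below computes the mean difference between two observers and the conventional Bland-Altman 95% limits of agreement (mean difference ± 1.96 standard deviations of the paired differences). The durations and the function name `bland_altman_limits` are hypothetical, and the sketch assumes each clip was scored once by each observer.

```python
from statistics import mean, stdev

def bland_altman_limits(scores_a, scores_b):
    """Return (mean difference, lower limit, upper limit) for paired scores."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical durations (seconds) scored by two observers for the same 6 clips.
observer_a = [12.0, 15.5, 9.0, 20.0, 14.0, 11.5]
observer_b = [11.0, 16.0, 9.5, 18.5, 14.5, 11.0]

bias, lower, upper = bland_altman_limits(observer_a, observer_b)
print(f"mean difference = {bias:.2f}, limits of agreement = [{lower:.2f}, {upper:.2f}]")
```

Whether limits of this width are acceptable depends on how large a difference matters for the behavior being measured.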
Note: there are more methods and approaches than those listed here. We endeavored to explain some of the most common metrics used for reliability testing, whether or not they are considered robust or best practice. We provide details, and warnings, about these approaches on each sub-page.