Visualization and metrics
Formally evaluating reliability allows others to assess our approach. We describe the metrics we often use in our scientific papers and the rationale behind each.
There are a few common metrics or strategies used to evaluate reliability. When deciding which metric to use, consider the type of data you have, the goals of your training, and the pros and cons of each suitable method. In some cases, multiple metrics may be needed to provide robust and trustworthy information about reliability.
Step 1: Visual observation
Visually checking data during reliability testing is an important step. We recommend starting here and returning to these visualizations as you calculate metrics. Mismatches between metrics and the visual story can help you identify problems.
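As a rough illustration of how a visual check can contradict a summary number, the sketch below prints a cross-tabulation of two observers' categorical scores next to their percent agreement. The scores, category names, and helper functions (`cross_tab`, `percent_agreement`, `format_table`) are all hypothetical and exist only for this example; a plot of your own data would serve the same purpose.

```python
from collections import Counter


def cross_tab(obs_a, obs_b):
    """Count how often each (observer A, observer B) pair of scores occurs."""
    if len(obs_a) != len(obs_b):
        raise ValueError("Both observers must score the same samples")
    return Counter(zip(obs_a, obs_b))


def percent_agreement(obs_a, obs_b):
    """Fraction of samples on which the two observers gave the same score."""
    matches = sum(a == b for a, b in zip(obs_a, obs_b))
    return matches / len(obs_a)


def format_table(table, categories):
    """Render the cross-tabulation as text: rows are A's scores, columns are B's."""
    lines = ["A \\ B".ljust(10) + "".join(c.ljust(10) for c in categories)]
    for a in categories:
        cells = "".join(str(table[(a, b)]).ljust(10) for b in categories)
        lines.append(a.ljust(10) + cells)
    return "\n".join(lines)


# Hypothetical scores: observer B never records the rare "play" category.
observer_a = ["rest"] * 18 + ["play"] * 2
observer_b = ["rest"] * 20

table = cross_tab(observer_a, observer_b)
print(format_table(table, ["rest", "play"]))
print(f"Percent agreement: {percent_agreement(observer_a, observer_b):.0%}")
```

In this made-up case the summary number looks high, yet the table shows that the two observers never agreed on a single "play" score. That is the kind of mismatch worth investigating before trusting the metric.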
Step 2: Identify an appropriate, robust metric for your data type
If your data are categorical, here are some commonly used metrics:
If your data are continuous, here are some commonly used metrics:
Note: there are more methods and approaches than those listed here. We endeavored to explain some of the most common metrics used for reliability testing, whether or not they are considered robust or best practice. We provide details, and warnings, about these approaches on each sub-page.
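To make the idea of matching a metric to a data type concrete, here is a minimal sketch of one chance-corrected agreement statistic for categorical data, Cohen's kappa, written from its standard definition. It is our choice for illustration only and is not presented as a recommendation; the function name `cohens_kappa` and the example scores are hypothetical.

```python
from collections import Counter


def cohens_kappa(obs_a, obs_b):
    """Cohen's kappa for two observers scoring the same samples.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each observer's
    category frequencies.
    """
    n = len(obs_a)
    if n == 0 or n != len(obs_b):
        raise ValueError("Both observers must score the same, non-empty set of samples")

    p_o = sum(a == b for a, b in zip(obs_a, obs_b)) / n

    count_a = Counter(obs_a)
    count_b = Counter(obs_b)
    categories = count_a.keys() | count_b.keys()
    p_e = sum(count_a[c] * count_b[c] for c in categories) / (n * n)

    if p_e == 1.0:
        # Both observers used a single identical category: kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical scores: 90% raw agreement, but one observer never uses "play".
scores_a = ["rest"] * 18 + ["play"] * 2
scores_b = ["rest"] * 20
kappa_skewed = cohens_kappa(scores_a, scores_b)

# A second hypothetical pair with more balanced categories.
kappa_balanced = cohens_kappa(["x", "x", "y", "y"], ["x", "x", "y", "x"])

print(f"kappa (skewed categories):   {kappa_skewed:.2f}")
print(f"kappa (balanced categories): {kappa_balanced:.2f}")
```

With the skewed scores, raw agreement is 90% but kappa comes out at zero, because that level of agreement is what chance alone would produce given how rarely "play" is used. This is one reason a single metric may not tell the whole story.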