Including observer in the model.
To account for potential variability in observer reliability, observer ID can be included in the hypothesis-testing statistical model as a random effect after all experimental data have been collected. This approach accounts for variation across observers without explicitly estimating the effect of each observer or assuming that observer has a systematic effect on your outcomes of interest.
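As a minimal sketch of this approach, observer ID can be supplied as the grouping variable for a random intercept in a linear mixed model fit with statsmodels in Python; the file name and the column names ("outcome", "treatment", "observer_id") are hypothetical placeholders, not part of any specific dataset:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical dataset: one row per observation, with columns
# "outcome", "treatment", and "observer_id" (all names are placeholders).
df = pd.read_csv("scores.csv")

# Treatment as a fixed effect; observer ID as a random intercept
# via the `groups` argument.
model = smf.mixedlm("outcome ~ treatment", data=df, groups=df["observer_id"])
result = model.fit()
print(result.summary())
```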
Another option would be to include observer as a fixed effect in your statistical model. This approach explicitly estimates the effect of each observer and may be useful, for example, if you were trying to assess the effect of experience (trainee vs. expert) as part of your experimental question(s).
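A sketch of the fixed-effect alternative, under the same hypothetical dataset and column names as above, might dummy-code observer ID in an ordinary least squares model:

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("scores.csv")  # same hypothetical dataset as above

# C() dummy-codes observer_id so each observer gets an explicit
# coefficient relative to a reference observer.
model = smf.ols("outcome ~ treatment + C(observer_id)", data=df)
result = model.fit()
print(result.summary())  # per-observer coefficients appear in the output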
To use either approach (fixed or random effect), your final dataset must meet all assumptions of your chosen model(s). For linear mixed models, this includes continuous data, normal residuals, homogeneous residual variance, and no autocorrelation or multicollinearity. Generalized linear mixed models relax some of these assumptions: they accommodate both continuous and categorical response data and allow a broader family of response distributions (residuals need not be strictly normal), though the remaining assumptions of linear mixed models still apply.
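Several of these assumptions can be screened with standard diagnostics; the sketch below assumes the fitted `result` and dataframe `df` from the examples above and is illustrative rather than exhaustive:

```python
import scipy.stats as stats
from statsmodels.stats.stattools import durbin_watson

# `result` and `df` refer to the fitted model and hypothetical
# dataset sketched above.
resid = result.resid

# Normality of residuals (Shapiro-Wilk test).
print(stats.shapiro(resid))

# Homogeneous residual variance across observers (Levene's test).
by_observer = [resid[df["observer_id"] == obs]
               for obs in df["observer_id"].unique()]
print(stats.levene(*by_observer))

# Autocorrelation of residuals (Durbin-Watson statistic; values
# near 2 suggest little autocorrelation).
print(durbin_watson(resid))
```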
This method requires caution, however. Including observer as a fixed effect, for example, costs degrees of freedom in the model. Rather than quantifying the effect of observers after the fact, we think a more robust strategy is to first train observers to be highly (and similarly) reliable, verified with an appropriate statistical test (e.g. concordance for categorical data, the intraclass correlation coefficient (ICC) for continuous data). Ensuring quality data collection up front is preferable to accepting high observer variation and, potentially, high variability in data quality.
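Both kinds of reliability check are straightforward to compute; the sketch below uses Cohen's kappa (via scikit-learn) as one chance-corrected agreement statistic for categorical ratings and pingouin's ICC for continuous ratings, with all item, rater, and score values invented for illustration:

```python
import pandas as pd
import pingouin as pg
from sklearn.metrics import cohen_kappa_score

# Categorical ratings: hypothetical paired scores from two observers
# rating the same six items.
obs_a = [0, 1, 1, 0, 2, 1]
obs_b = [0, 1, 0, 0, 2, 1]
print(cohen_kappa_score(obs_a, obs_b))  # chance-corrected agreement

# Continuous ratings: ICC via pingouin, in long format (one row per
# item-rater pair; all column names are placeholders).
ratings = pd.DataFrame({
    "item":  [1, 2, 3, 1, 2, 3],
    "rater": ["A", "A", "A", "B", "B", "B"],
    "score": [4.1, 3.8, 5.0, 4.3, 3.6, 5.1],
})
print(pg.intraclass_corr(data=ratings, targets="item",
                         raters="rater", ratings="score"))
```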
This method may also be used to account for drift over time. For example, if you achieved strong reliability before data collection began using a more robust metric, but found your team no longer met these cutoffs when you re-tested reliability after the experiment, including observer ID in your models could help account for some drift. If the data are in a format that allows them to be re-scored (e.g. video or photographs), however, re-analysis is likely preferable.