Understanding Statistics
Learn about statistical metrics, confidence intervals, and how to interpret evaluation data.
Key Metrics
The Statistics tab provides several metrics to help you understand your evaluation results beyond simple averages.
Mean and Standard Deviation
The mean is the average rating for a dimension. The standard deviation shows how spread out the ratings are. A low standard deviation means evaluators largely agreed; a high one indicates diverse opinions.
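To make the distinction concrete, here is a minimal sketch using Python's standard library. The two rating lists are hypothetical, and the sample standard deviation is an assumption; the Statistics tab may compute its values differently. Both lists have the same mean, but the spread differs sharply.

```python
from statistics import mean, stdev

# Hypothetical ratings for one dimension from two sets of evaluators.
consistent = [4, 4, 5, 4, 4, 5, 4, 4]  # evaluators largely agree
divided = [2, 5, 5, 5, 2, 5, 5, 5]     # evaluators disagree


def summarize(ratings):
    """Return (mean, sample standard deviation) for a list of ratings."""
    return mean(ratings), stdev(ratings)


mean_consistent, sd_consistent = summarize(consistent)
mean_divided, sd_divided = summarize(divided)

print(f"consistent: mean={mean_consistent:.2f}, sd={sd_consistent:.2f}")
print(f"divided:    mean={mean_divided:.2f}, sd={sd_divided:.2f}")
```

The identical means would hide the disagreement in the second list; the standard deviation is what exposes it.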
Confidence Intervals
A confidence interval gives a range of plausible values for the true average rating, based on the ratings collected so far. A narrow interval means the estimate is precise; a wide interval suggests you need more ratings before drawing a firm conclusion.
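The sketch below shows one common way to compute such an interval: a normal approximation around the sample mean. The 95% level, the `confidence_interval` helper, and the rating lists are all assumptions for illustration; this page does not state the level or method the Statistics tab uses, and for small samples a t-based interval would be somewhat wider.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_interval(ratings, level=0.95):
    """Return (low, high) for the mean using a normal approximation."""
    n = len(ratings)
    center = mean(ratings)
    # Standard error of the mean: sample standard deviation over sqrt(n).
    standard_error = stdev(ratings) / sqrt(n)
    # Two-sided critical value, about 1.96 for a 95% level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * standard_error
    return center - margin, center + margin


# Hypothetical ratings; the second list repeats the first four times
# to mimic collecting more ratings with the same distribution.
few = [3, 4, 5, 4, 2, 5, 4, 3]
many = few * 4

low_few, high_few = confidence_interval(few)
low_many, high_many = confidence_interval(many)

print(f"{len(few)} ratings:  [{low_few:.2f}, {high_few:.2f}]")
print(f"{len(many)} ratings: [{low_many:.2f}, {high_many:.2f}]")
```

Because the margin shrinks with the square root of the number of ratings, roughly four times as many ratings are needed to halve the interval width.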
Statistical Significance
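For SxS evaluations, statistical significance tests indicate whether the observed difference between Experience A and Experience B is likely to be real or could plausibly be due to chance. Results are marked as statistically significant when the p-value is below 0.05, meaning a difference at least this large would be expected less than 5% of the time if the two experiences were actually rated the same.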
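As an illustration of what a p-value measures, the following sketch runs a two-sided permutation test on the difference in mean ratings. This is a stand-in chosen because it needs only the standard library; this page does not say which test the Statistics tab runs. The `permutation_p_value` helper and the ratings are hypothetical.

```python
import random


def permutation_p_value(a, b, n_resamples=10_000, seed=0):
    """Estimate a two-sided p-value for the difference in means of a and b."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        # Shuffle the labels: if A and B are equivalent, any split is equally likely.
        rng.shuffle(pooled)
        left, right = pooled[:len(a)], pooled[len(a):]
        diff = abs(sum(left) / len(left) - sum(right) / len(right))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical ratings for the two experiences.
experience_a = [5, 5, 4, 5, 4, 5, 5, 4]
experience_b = [2, 3, 2, 1, 3, 2, 2, 3]

p_value = permutation_p_value(experience_a, experience_b)
print(f"p-value: {p_value:.4f}, significant: {p_value < 0.05}")
```

The p-value here is the share of random relabelings that produce a gap at least as large as the observed one, which is the "could be due to chance" question in numeric form.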
Effect Size
Effect size measures the magnitude of the difference between experiences. Even a statistically significant difference may be small in practice. Look at effect size alongside significance to make informed decisions.
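One widely used effect size for comparing two means is Cohen's d, the difference in means divided by the pooled standard deviation. The sketch below assumes that measure purely for illustration; this page does not name the effect size the Statistics tab reports. The `cohens_d` helper and the ratings are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Return Cohen's d for two samples, using the pooled standard deviation."""
    n_a, n_b = len(a), len(b)
    # Pool the sample variances, weighting each by its degrees of freedom.
    pooled_variance = (
        (n_a - 1) * variance(a) + (n_b - 1) * variance(b)
    ) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_variance)


# Hypothetical ratings: A is rated slightly higher than B on average.
experience_a = [4, 5, 4, 4, 5, 4, 3, 5]
experience_b = [4, 4, 4, 3, 5, 4, 3, 4]

d = cohens_d(experience_a, experience_b)
print(f"Cohen's d: {d:.2f}")
```

A common rule of thumb from Cohen treats values near 0.2 as small, 0.5 as medium, and 0.8 as large, though what counts as a practically important difference depends on the product decision at hand.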