Kappa Power Analysis

This repository provides a Monte Carlo simulation tool for estimating how many lines (items) need to be double-coded to achieve reliable estimates of Cohen’s κ.
It is based on the paper "A Less Overconservative Method for Reliability Estimation for Cohen’s Kappa" by He, M., Baker, R. S., Hutt, S., & Zhang, J. (2024), presented at the Fourth International Conference on Quantitative Ethnography.

Given an expected population κ (POPULATION_KAPPA) and a known total number of lines (POPULATION_SIZE), the code determines the minimum sample size such that, if the true population κ were only POPULATION_KAPPA, the probability of observing a sample κ above a specified threshold (SAMPLE_THRESHOLD) would fall below a chosen tolerance (TARGET_PROB).
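
One way to write this criterion as a formula is shown below. The notation is ours and is not taken from the paper: n* is the returned sample size, κ̂_n is the sample κ computed on n double-coded lines drawn from the POPULATION_SIZE lines, κ_0 stands for POPULATION_KAPPA, τ for SAMPLE_THRESHOLD, and α for TARGET_PROB.

$$
n^{*} \;=\; \min\left\{\, n \;:\; \Pr\!\left(\hat{\kappa}_{n} > \tau \,\middle|\, \kappa = \kappa_{0}\right) < \alpha \,\right\}
$$

The probability on the left is what the Monte Carlo simulation approximates, by repeatedly drawing samples of size n and counting how often the sample κ exceeds τ.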

In other words, this code identifies a sample size that makes it very unlikely to observe a high sample κ if the true agreement were meaningfully lower, providing evidence that the full dataset κ is not far below the target threshold.

The simulation also accounts for each coder’s observed base rates (PREV_R1, PREV_R2), allowing realistic modeling of prevalence effects on κ.
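
The procedure described above can be sketched as follows. This is a minimal illustration, not the code in KappaPowerAnalysis.py: it assumes a single binary code, builds a finite population whose 2x2 agreement table matches the requested κ and base rates (using p_e = p1·p2 + (1 − p1)(1 − p2) and p_o = κ(1 − p_e) + p_e), and samples lines without replacement. The function names, the step size, the number of simulations, and the parameter values in the usage section are all hypothetical.

```python
import random
from collections import Counter

CELLS = [(1, 1), (1, 0), (0, 1), (0, 0)]  # (rater 1 code, rater 2 code)


def cohen_kappa(n11, n10, n01, n00):
    """Cohen's kappa from the four cell counts of a 2x2 agreement table."""
    n = n11 + n10 + n01 + n00
    p_o = (n11 + n00) / n                  # observed agreement
    p1 = (n11 + n10) / n                   # rater 1 positive rate
    p2 = (n11 + n01) / n                   # rater 2 positive rate
    p_e = p1 * p2 + (1 - p1) * (1 - p2)    # agreement expected by chance
    if p_e == 1.0:
        return 0.0                         # kappa undefined; treated as 0 (assumption)
    return (p_o - p_e) / (1 - p_e)


def build_population(size, kappa, prev_r1, prev_r2):
    """List of (r1, r2) pairs whose kappa is approximately `kappa`."""
    p_e = prev_r1 * prev_r2 + (1 - prev_r1) * (1 - prev_r2)
    p_o = kappa * (1 - p_e) + p_e
    p11 = (p_o - 1 + prev_r1 + prev_r2) / 2
    probs = [p11, prev_r1 - p11, prev_r2 - p11, 1 - prev_r1 - prev_r2 + p11]
    if min(probs) < -1e-12:
        raise ValueError("kappa is not attainable with these base rates")
    # Rounding to whole lines makes the realized kappa approximate.
    counts = [round(p * size) for p in probs[:3]]
    counts.append(size - sum(counts))
    return [cell for cell, c in zip(CELLS, counts) for _ in range(c)]


def prob_exceeds(population, n, threshold, n_sims, rng):
    """Monte Carlo estimate of P(sample kappa > threshold) for sample size n."""
    hits = 0
    for _ in range(n_sims):
        c = Counter(rng.sample(population, n))  # sampling without replacement
        if cohen_kappa(*(c[cell] for cell in CELLS)) > threshold:
            hits += 1
    return hits / n_sims


def min_sample_size(population, threshold, target_prob,
                    n_sims=2000, step=10, seed=0):
    """Smallest tested n whose estimated exceedance probability is below target_prob."""
    rng = random.Random(seed)
    for n in range(step, len(population) + 1, step):
        if prob_exceeds(population, n, threshold, n_sims, rng) < target_prob:
            return n
    return None  # no tested sample size met the target


if __name__ == "__main__":
    # Illustrative values only; substitute your own settings.
    POPULATION_KAPPA, POPULATION_SIZE = 0.65, 2000
    SAMPLE_THRESHOLD, TARGET_PROB = 0.80, 0.05
    PREV_R1, PREV_R2 = 0.20, 0.25
    pop = build_population(POPULATION_SIZE, POPULATION_KAPPA, PREV_R1, PREV_R2)
    print(min_sample_size(pop, SAMPLE_THRESHOLD, TARGET_PROB))
```

Because the probability is estimated by simulation, the result can shift slightly with the seed and the number of simulations. A smaller step gives a finer answer at the cost of more computation.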
