Maximally selected chi-square statistics for at least ordinal scaled variables.
Collaborative Research Center 386, Discussion Paper 407
The association between a binary variable Y and a variable X with an at least ordinal measurement scale can be examined by selecting a cutpoint in the range of X and then performing an association test on the resulting 2x2 contingency table using the chi-square statistic. Under the null hypothesis of no association between X and Y, the distribution of the maximally selected chi-square statistic (i.e., the maximum of the chi-square statistic over all possible cutpoints) differs from the standard chi-square distribution. In recent decades, this topic has been studied extensively for continuous X variables, but not for non-continuous variables with an at least ordinal measurement scale (which include, e.g., classical ordinal or discretized continuous variables). In this paper, we suggest an exact method to determine the distribution of maximally selected chi-square statistics in this context. This novel approach can be seen as a method to measure the association between a binary variable and variables with an at least ordinal scale of different types (ordinal, discretized continuous, etc.). As an illustration, the method is applied to a new data set describing pregnancy and birth for 811 babies.
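To make the definition of the statistic concrete, the following minimal Python sketch computes the maximally selected chi-square statistic by scanning every cutpoint of X and keeping the largest 2x2 chi-square value. It is only an illustration: the function names (`chi2_2x2`, `max_selected_chi2`, `permutation_p_value`) are hypothetical, and the null distribution is approximated by Monte Carlo permutation of Y, which is a stand-in and not the exact method proposed in the paper.

```python
import random


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A zero margin means the table carries no information.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def max_selected_chi2(x, y):
    """Maximum chi-square over all cutpoints c, comparing 1{x <= c} with y.

    x: values on an at least ordinal scale (ties allowed).
    y: binary outcomes coded 0/1.
    """
    # The largest observed value is excluded: it would leave one group empty.
    cutpoints = sorted(set(x))[:-1]
    best = 0.0
    for cut in cutpoints:
        a = sum(1 for xi, yi in zip(x, y) if xi <= cut and yi == 1)
        b = sum(1 for xi, yi in zip(x, y) if xi <= cut and yi == 0)
        c = sum(1 for xi, yi in zip(x, y) if xi > cut and yi == 1)
        d = sum(1 for xi, yi in zip(x, y) if xi > cut and yi == 0)
        best = max(best, chi2_2x2(a, b, c, d))
    return best


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Monte Carlo p-value for the maximally selected statistic.

    Assumption: permuting y approximates the null distribution; this is
    not the exact procedure described in the paper.
    """
    rng = random.Random(seed)
    observed = max_selected_chi2(x, y)
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if max_selected_chi2(x, y_perm) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy data (invented for illustration): a three-level ordinal X.
    x = [1, 1, 2, 2, 3, 3]
    y = [0, 0, 0, 1, 1, 1]
    stat, p = permutation_p_value(x, y, n_perm=500)
    print(stat, p)
```

Because the maximum is taken over several candidate tables, comparing the resulting value with a chi-square distribution with one degree of freedom would overstate significance; the permutation step above is one simple way to account for the selection, under the stated assumption.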