A normal-scores alternative to the Wilcoxon test

Copyright © Richard B. Darlington. All rights reserved.

This section describes a normal-scores test for testing the null hypothesis that a set of scores is distributed symmetrically around a specified value C. If the null hypothesis is true, then C is both the mean and the median of the distribution.

Elsewhere I describe two methods for transforming scores to a normal distribution. Here we use the simpler of those two methods--the equal-area method. The use of the more complex median-scores method is illustrated below.

Consider a set of normal scores applying to just the right half of a normal distribution. For instance, if N = 9, we want the scores that divide the right half of a normal distribution into 10 equal areas. Thus we want the scores whose right-hand areas are .45, .40, .35, .30, .25, .20, .15, .10, and .05. A normal-curve table shows that these 9 scores are
.126 .253 .385 .524 .674 .842 1.036 1.282 1.645
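The values above can also be computed rather than read from a table. The following is a minimal Python sketch, assuming the equal-area score for rank i is the normal quantile whose left-tail area is .5 + i/(2(N + 1)), which matches the right-hand areas listed above. The function name is illustrative only.

```python
from statistics import NormalDist


def half_normal_equal_area_scores(n):
    """Return n scores that cut the right half of a standard normal
    distribution into n + 1 equal areas."""
    std_normal = NormalDist()
    # Rank i has right-tail area .5 - i / (2(n + 1)), so its left-tail
    # area is .5 + i / (2(n + 1)).
    return [std_normal.inv_cdf(0.5 + i / (2 * (n + 1))) for i in range(1, n + 1)]


# Scores for N = 9, rounded to three decimals as in a normal-curve table.
print([round(score, 3) for score in half_normal_equal_area_scores(9)])
```

For N = 9 the rounded results agree with the nine table values given above.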
In this test you rank all N scores by the absolute values of their deviations from the hypothesized center C, with the score closest to C getting a rank of 1. As usual, assign mean ranks to ties. Then use those ranks to assign normal scores. If any scores exactly equal C, assign them a normal score of 0. Then attach minus signs to the scores below C, leaving the scores above C as positive. Define M as the mean of these scores. Then use an ordinary z test to test the null hypothesis that the scores have a mean of zero; use 1/sqrt(N) for the standard error of the mean. Thus
z = M × sqrt(N).
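
The whole procedure can be sketched in a few lines of Python. This is one possible reading of the steps above, not a definitive implementation. Two details are assumptions: a tied (fractional) mean rank r is converted to a normal score with the same formula used for whole ranks, left-tail area .5 + r/(2(N + 1)); and scores exactly equal to C are ranked along with the others before being given a normal score of 0. The function names and the two-sided p value are additions for illustration.

```python
from math import sqrt
from statistics import NormalDist


def mean_ranks(values):
    """Rank values from 1 (smallest), giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def normal_scores_center_test(scores, center):
    """Return (M, z, two-sided p) for the hypothesis of symmetry around center."""
    n = len(scores)
    std_normal = NormalDist()
    deviations = [x - center for x in scores]
    ranks = mean_ranks([abs(d) for d in deviations])
    signed = []
    for d, r in zip(deviations, ranks):
        if d == 0:
            signed.append(0.0)  # scores equal to C get a normal score of 0
        else:
            # Assumption: fractional mean ranks use the same area formula.
            magnitude = std_normal.inv_cdf(0.5 + r / (2 * (n + 1)))
            signed.append(magnitude if d > 0 else -magnitude)
    m = sum(signed) / n
    z = m * sqrt(n)  # standard error of the mean taken as 1/sqrt(N)
    p_two_sided = 2 * (1 - std_normal.cdf(abs(z)))
    return m, z, p_two_sided


# Made-up data, purely to show the call; C = 10 is the hypothesized center.
m, z, p = normal_scores_center_test([12.1, 9.4, 13.0, 10.8, 11.5, 8.9, 14.2], 10)
print(m, z, p)
```

Ties are detected by exact equality of the absolute deviations, so data with rounding noise may need to be rounded first.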

The logic behind this test is that if the scores are in fact distributed symmetrically around C, then positive and negative signs will be assigned with roughly equal frequency to the normal scores, so their expected mean will be zero and their standard deviation will be approximately 1. But if, say, positive deviations from C are larger and more numerous than negative deviations, then M will tend to be positive. The logic is similar to that of the Wilcoxon signed-ranks test. The major advantage of the normal-scores test is that the various possible values of M have little or no tendency to be tied with each other--unlike the possible values of the rank sum S in the Wilcoxon test. That raises power, for reasons explained elsewhere.
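
The point about ties among the possible values of the test statistic can be checked by brute force for a small N. The sketch below, in Python, enumerates all 2^N ways of attaching signs when N = 9 and counts how many distinct sums result, once for the plain ranks and once for the equal-area normal scores. It assumes no tied absolute deviations and no scores equal to C. It uses the signed sum of the ranks, which carries the same information as a sum of the positive ranks only. The helper name is illustrative.

```python
from itertools import product
from statistics import NormalDist


def count_distinct_signed_sums(magnitudes, digits=9):
    """Count the distinct sums over every assignment of + or - signs."""
    sums = set()
    for signs in product((-1, 1), repeat=len(magnitudes)):
        total = sum(s * m for s, m in zip(signs, magnitudes))
        sums.add(round(total, digits))  # rounding absorbs floating-point noise
    return len(sums)


n = 9
ranks = list(range(1, n + 1))
std_normal = NormalDist()
normal_scores = [std_normal.inv_cdf(0.5 + r / (2 * (n + 1))) for r in ranks]

wilcoxon_values = count_distinct_signed_sums(ranks)
normal_values = count_distinct_signed_sums(normal_scores)
print(wilcoxon_values, normal_values, 2 ** n)
```

With ranks, the 512 sign patterns collapse onto only 46 distinct sums, so many patterns are tied. The normal scores should yield far more distinct values.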

Using median scores in the normal-scores test on a center C

The use of median scores in the normal-scores test is discussed elsewhere. As described there, each of the N ranks is transformed into an Area value based on the Beta distribution. In the present case you apply those areas to the distribution of the absolute values of normal scores, or equivalently to the right half of a standard normal distribution. That can be done through the relation

New left-tail area = .5 + Original left-tail area/2

For instance, with N = 9 as in the previous example, the 9 left-tail areas given by the median-scores method are
.074 .180 .286 .393 .500 .607 .714 .820 .926
The last equation transforms these values to
.537 .590 .643 .697 .750 .803 .857 .910 .963
The z-values corresponding to these areas are then the desired normal scores. They are:
.093 .227 .367 .514 .674 .854 1.066 1.342 1.786
In all other respects the method is the same as the equal-area method.
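
The median-score calculation can be sketched in Python as follows. The sketch assumes that the Area value for rank i is the median of a Beta(i, N - i + 1) distribution; that assumption is mine, chosen because it reproduces the nine areas listed above, and the full method is the one described elsewhere. To stay within the standard library, the Beta distribution function is evaluated through its equivalent binomial tail sum and the median is found by bisection. All function names are illustrative.

```python
from math import comb
from statistics import NormalDist


def beta_order_cdf(x, i, n):
    """CDF at x of Beta(i, n - i + 1), written as a binomial tail sum."""
    return sum(comb(n, k) * x**k * (1 - x) ** (n - k) for k in range(i, n + 1))


def beta_order_median(i, n, tol=1e-12):
    """Median of Beta(i, n - i + 1), found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if beta_order_cdf(mid, i, n) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def half_normal_median_scores(n):
    """Median-based normal scores for the right half of a standard normal."""
    std_normal = NormalDist()
    scores = []
    for i in range(1, n + 1):
        area = beta_order_median(i, n)  # original left-tail area
        new_area = 0.5 + area / 2       # new left-tail area = .5 + original/2
        scores.append(std_normal.inv_cdf(new_area))
    return scores


print([round(beta_order_median(i, 9), 3) for i in range(1, 10)])
print([round(score, 3) for score in half_normal_median_scores(9)])
```

Small differences in the third decimal place from the values above are possible, since the listed areas and scores are rounded.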
