From: John Darrington
Date: Tue, 7 Mar 2017 05:31:49 +0000 (+0100)
Subject: Fixed a bug in the Mann-Whitney test vs. missing=analysis.
X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=c875597832d56353461bafd46e268f0ba5fbb5da;p=pspp

Fixed a bug in the Mann-Whitney test vs. missing=analysis.

When missing values were deleted from the dataset, they were deleted
after the ranks for the U value had been inserted, and thus the wrong
rank sum would be calculated. This change deletes missing values
*before* the ranks are inserted.

The issue this fixes is described at
http://lists.gnu.org/archive/html/pspp-users/2017-03/msg00009.html
---

diff --git a/NEWS b/NEWS
index 8959f7ec5d..1b57a19d5b 100644
--- a/NEWS
+++ b/NEWS
@@ -1,11 +1,15 @@
 PSPP NEWS -- history of user-visible changes.
-Copyright (C) 1996-2000, 2008-2016 Free Software Foundation, Inc.
+Copyright (C) 1996-2000, 2008-2016, 2017 Free Software Foundation, Inc.
 See the end for copying conditions.
 
 Please send PSPP bug reports to bug-gnu-pspp@gnu.org.
 
 Changes from 0.10.2 to 0.10.4:
 
+ * A bug where the Mann-Whitney test would give misleading results
+   if run on multiple variables and MISSING=ANALYSIS was specified
+   has been fixed.
+
  * Gtk+3.14.5 or later must now be used when building.
 
  * The AUTORECODE command now accepts an optional / before INTO.
diff --git a/src/language/stats/mann-whitney.c b/src/language/stats/mann-whitney.c
index f752b463a2..cc82312ed3 100644
--- a/src/language/stats/mann-whitney.c
+++ b/src/language/stats/mann-whitney.c
@@ -107,7 +107,10 @@ mann_whitney_execute (const struct dataset *ds,
                                                  CONST_CAST (struct n_sample_test *, nst),
                                                  NULL);
-
+      reader = casereader_create_filter_missing (reader, &var, 1,
+                                                 exclude,
+                                                 NULL, NULL);
+
       reader = sort_execute_1var (reader, var);
 
       rr = casereader_create_append_rank (reader, var,
@@ -122,9 +125,6 @@ mann_whitney_execute (const struct dataset *ds,
           const size_t group_var_width = var_get_width (nst->indep_var);
           const double rank = case_data_idx (c, rank_idx)->f;
 
-          if ( var_is_value_missing (var, val, exclude))
-            continue;
-
           if ( value_equal (group, &nst->val1, group_var_width))
             {
               mw[i].rank_sum[0] += rank;
diff --git a/tests/automake.mk b/tests/automake.mk
index 5158fee56d..0fe8a5f69c 100644
--- a/tests/automake.mk
+++ b/tests/automake.mk
@@ -256,6 +256,7 @@ EXTRA_DIST += \
 	tests/data/v13.sav \
 	tests/data/v14.sav \
 	tests/data/test-encrypted.sps \
+	tests/language/mann-whitney.txt \
 	tests/language/data-io/Book1.gnm.unzipped \
 	tests/language/data-io/test.ods \
 	tests/language/data-io/newone.ods \
diff --git a/tests/language/mann-whitney.txt b/tests/language/mann-whitney.txt
new file mode 100644
index 0000000000..bcbf5056fb
--- /dev/null
+++ b/tests/language/mann-whitney.txt
@@ -0,0 +1,231 @@
+ 7 5 7 2 1.00
+ 5 3 6 2 1.00
+ 5 4 7 2 1.00
+ 5 5 7 3 1.00
+ 7 5 -1 4 1.00
+ 6 4 7 2 1.00
+ 7 5 7 2 1.00
+ 6 5 -1 3 1.00
+ 7 6 7 6 1.00
+ 7 4 7 1 .00
+ 7 5 7 3 .00
+ 6 5 7 3 1.00
+ 6 4 6 2 .00
+ 4 3 6 2 .00
+ 5 7 -1 4 1.00
+ 5 4 6 2 1.00
+ 6 4 7 2 1.00
+ 6 4 7 2 1.00
+ 7 5 6 1 .00
+ 7 5 6 2 1.00
+ 7 7 3 4 1.00
+ 3 5 7 4 .00
+ 5 2 2 1 1.00
+ 5 5 5 5 .00
+ 6 6 6 4 1.00
+ 3 5 4 2 1.00
+ 6 4 6 3 .00
+ 5 3 7 2 .00
+ 5 4 7 1 .00
+ 4 3 7 2 1.00
+ 3 2 7 1 1.00
+ 7 3 1 2 1.00
+ 5 7 7 3 1.00
+ 5 4 7 3 1.00
+ 6 5 7 2 .00
+ 6 5 7 4 .00
+ 7 5 7 4 1.00
+ 7 4 6 1 .00
+ 4 3 6 1 1.00
+ 3 4 5 2 .00
+ 5 3 5 1 .00
+ 3 4 5 1 1.00
+ 7 5 7 4 1.00
+ 7 4 4 2 .00
+ 7 6 7 6 1.00
+ 5 4 7 4 .00
+ 7 5 7 3 1.00
+ 7 6 7 5 1.00
+ 7 5 7 4 .00
+ 3 4 6 2 .00
+ 7 6 7 5 .00
+ 5 5 6 2 .00
+ 7 4 7 6 .00
+ 5 5 7 3 .00
+ 7 6 6 3 1.00
+ 6 5 6 2 .00
+ 7 6 1 3 .00
+ 3 4 5 2 .00
+ 6 6 7 3 .00
+ 7 7 7 7 .00
+ 7 2 7 1 1.00
+ 6 6 7 5 1.00
+ 7 6 7 2 .00
+ 4 2 7 1 .00
+ 3 5 5 1 1.00
+ 6 5 7 4 .00
+ 7 4 -1 3 1.00
+ 7 6 -1 3 1.00
+ 6 4 7 2 1.00
+ 5 5 6 3 .00
+ 3 2 4 1 1.00
+ 7 5 5 3 1.00
+ 6 5 7 4 .00
+ 6 6 7 3 1.00
+ 7 4 7 2 1.00
+ 4 2 7 4 1.00
+ 4 6 -1 4 .00
+ 7 4 7 2 1.00
+ 3 2 7 1 .00
+ 6 7 -1 3 .00
+ 5 3 6 5 1.00
+ 7 5 -1 3 .00
+ 5 6 6 1 1.00
+ 7 6 7 2 1.00
+ 4 5 7 3 .00
+ 3 5 5 2 1.00
+ 7 7 7 4 .00
+ 6 5 -1 4 .00
+ 6 5 6 1 .00
+ 3 3 7 1 1.00
+ 2 3 6 2 .00
+ 6 6 7 2 .00
+ 2 1 3 1 .00
+ 4 2 1 1 .00
+ 6 3 7 2 1.00
+ 7 5 7 3 .00
+ 4 3 5 2 .00
+ 6 4 7 2 .00
+ 5 4 6 3 .00
+ 1 2 7 1 .00
+ 7 6 7 4 1.00
+ 5 3 7 2 .00
+ 7 5 6 2 1.00
+ 6 5 6 3 1.00
+ 6 3 6 2 .00
+ 4 4 6 4 1.00
+ 2 1 -1 1 .00
+ 5 4 7 2 1.00
+ 5 6 7 3 1.00
+ 7 6 7 4 1.00
+ 2 5 7 3 .00
+ 6 6 6 3 1.00
+ 4 5 -1 2 1.00
+ 7 4 -1 2 .00
+ 6 7 6 1 1.00
+ 7 4 7 3 .00
+ 5 3 7 2 1.00
+ 5 4 6 3 .00
+ 5 4 6 2 1.00
+ 5 6 6 3 .00
+ 7 5 6 2 1.00
+ 3 4 -1 2 1.00
+ 4 6 -1 2 .00
+ 6 5 7 3 1.00
+ 2 4 -1 1 1.00
+ 7 7 -1 2 1.00
+ 4 6 -1 3 1.00
+ 7 3 7 4 1.00
+ 3 5 6 1 .00
+ 6 7 7 4 1.00
+ 6 6 6 2 1.00
+ 7 6 -1 4 1.00
+ 7 6 7 4 .00
+ 4 2 6 1 .00
+ 6 6 6 2 .00
+ 6 5 -1 2 1.00
+ 6 3 5 3 1.00
+ 6 5 6 3 1.00
+ 5 1 5 2 .00
+ 6 5 7 2 .00
+ 4 6 7 2 .00
+ 5 6 -1 5 1.00
+ 3 2 -1 1 .00
+ 6 6 -1 3 .00
+ 7 5 6 4 .00
+ 7 4 5 1 1.00
+ 4 1 1 3 1.00
+ 3 4 7 1 .00
+ 7 4 7 2 .00
+ 4 3 6 1 1.00
+ 6 7 -1 4 1.00
+ 7 5 7 4 .00
+ 7 4 6 3 .00
+ 6 4 7 2 .00
+ 5 5 7 2 1.00
+ 6 5 6 4 .00
+ 6 4 -1 2 .00
+ 5 4 7 2 .00
+ 3 3 4 1 .00
+ 6 5 7 4 .00
+ 4 4 -1 1 .00
+ 6 4 5 3 .00
+ 7 6 7 3 1.00
+ 6 5 7 2 1.00
+ 7 5 -1 3 .00
+ 6 6 -1 4 1.00
+ 6 5 7 3 .00
+ 6 5 7 3 .00
+ -1 5 7 5 .00
+ 7 4 3 3 1.00
+ 7 3 7 2 .00
+ 4 5 7 2 .00
+ 5 5 2 1 1.00
+ 5 4 6 3 .00
+ 7 5 6 3 .00
+ 7 6 7 2 .00
+ 7 5 7 4 1.00
+ 7 7 7 3 .00
+ 7 6 7 4 1.00
+ 4 4 5 3 1.00
+ 3 2 5 1 1.00
+ 1 3 1 5 1.00
+ 6 4 7 2 1.00
+ 4 5 -1 2 1.00
+ 6 5 -1 4 1.00
+ 5 3 7 1 .00
+ 5 4 7 2 1.00
+ 6 5 -1 3 1.00
+ 6 3 7 2 1.00
+ 3 2 6 1 .00
+ 6 3 7 2 .00
+ 1 3 1 6 .00
+ 6 4 7 2 .00
+ 7 4 6 2 1.00
+ 7 6 7 1 .00
+ 4 6 -1 3 1.00
+ 7 5 -1 3 1.00
+ 6 2 3 1 .00
+ 7 5 6 3 .00
+ -1 -1 -1 -1 .00
+ 6 5 6 3 .00
+ 5 4 7 2 1.00
+ 4 6 7 4 1.00
+ 4 2 4 2 1.00
+ 7 6 -1 3 1.00
+ 6 6 6 2 1.00
+ 4 3 5 1 1.00
+ 7 3 7 1 .00
+ 6 7 -1 3 1.00
+ 6 3 -1 2 .00
+ 6 5 5 2 1.00
+ 4 6 -1 3 .00
+ 2 4 7 2 .00
+ 6 6 7 6 .00
+ 6 6 -1 3 .00
+ 6 5 -1 2 .00
+ 4 4 6 2 1.00
+ 6 5 7 4 1.00
+ 6 5 -1 5 .00
+ 6 5 6 3 .00
+ 6 6 7 3 .00
+ 6 2 -1 1 1.00
+ 5 4 7 3 1.00
+ 7 7 -1 2 1.00
+ 7 7 7 5 1.00
+ 6 4 7 2 .00
+ 5 5 -1 3 .00
+ 7 5 7 3 1.00
+ 6 3 1 2 .00
+ 4 2 2 1 .00
+ 2 4 6 1 .00
diff --git a/tests/language/stats/npar.at b/tests/language/stats/npar.at
index 038b788df1..1fb9e66949 100644
--- a/tests/language/stats/npar.at
+++ b/tests/language/stats/npar.at
@@ -1034,6 +1034,52 @@ height,98.0000,218.0000,-.6020,.547
 
 AT_CLEANUP
 
+AT_SETUP([NPAR TESTS Mann-Whitney Multiple])
+dnl Check for a bug where the ranks were inappropriately allocated, when
+dnl multiple variables were tested and MISSING=ANALYSIS chosen.
+
+cp $abs_srcdir/language/mann-whitney.txt .
+
+AT_DATA([npar-mann-whitney.sps], [dnl
+SET FORMAT = F11.3
+
+DATA LIST NOTABLE FILE='mann-whitney.txt'
+     LIST /I002_01 I002_02 I002_03 I002_04 sum_HL *.
+
+VARIABLE LABELS
+  I002_01 'IOS: Familie'
+  I002_02 'IOS: Freunde'
+  I002_03 'IOS: Partner*in'
+  I002_04 'IOS: Bekannte'.
+
+MISSING VALUES I002_01 I002_02 I002_03 I002_04 (-9 -1).
+
+NPAR TESTS
+    /MISSING=ANALYSIS
+    /M-W=I002_01 I002_02 I002_03 I002_04 BY sum_HL (0 1).
+]) + +AT_CHECK([pspp -O format=csv npar-mann-whitney.sps], [0], [dnl +Table: Ranks +,N,,,Mean Rank,,Sum of Ranks, +,.000,1.000,Total,.000,1.000,.000,1.000 +IOS: Familie,114.000,115.000,229.000,110.018,119.939,12542.000,13793.000 +IOS: Freunde,115.000,115.000,230.000,108.339,122.661,12459.000,14106.000 +IOS: Partner*in,97.000,91.000,188.000,95.351,93.593,9249.000,8517.000 +IOS: Bekannte,115.000,115.000,230.000,111.065,119.935,12772.500,13792.500 + +Table: Test Statistics +,Mann-Whitney U,Wilcoxon W,Z,Asymp. Sig. (2-tailed) +IOS: Familie,5987.000,12542.000,-1.167,.243 +IOS: Freunde,5789.000,12459.000,-1.674,.094 +IOS: Partner*in,4331.000,8517.000,-.245,.807 +IOS: Bekannte,6102.500,12772.500,-1.046,.296 +]) + +AT_CLEANUP + + + AT_SETUP([NPAR TESTS Cochran]) AT_DATA([npar-cochran.sps], [dnl set format f11.3.