We had been using the Freedman-Diaconis rule to select the bin width.
According to the literature, it is better than Sturges' rule.
However, it cannot work when the interquartile range (IQR) is zero,
because the computed bin width is then zero. This can happen for
datasets with a small range, where most values coincide.
This change keeps the Freedman-Diaconis rule and falls back to
Sturges' rule when the IQR is zero.
- /* Freedman-Diaconis' choice of bin width. */
iqr = calculate_iqr (frq);
- bin_width = 2 * iqr / pow (valid_freq, 1.0 / 3.0);
+
+ if (iqr > 0)
+ /* Freedman-Diaconis' choice of bin width. */
+ bin_width = 2 * iqr / pow (valid_freq, 1.0 / 3.0);
+
+ else
+ /* Sturges Rule */
+ bin_width = (x_max - x_min) / (1 + log2 (valid_freq));
histogram = histogram_create (bin_width, x_min, x_max);
}
calc_stats (vf, stat_value);
- t = tab_create (3, ((frq->stats & FRQ_ST_MEDIAN) ? frq->n_stats - 1 : frq->n_stats) + frq->n_show_percentiles + 2);
+ t = tab_create (3, ((frq->stats & FRQ_ST_MEDIAN) ? frq->n_stats - 1 : frq->n_stats)
+ + frq->n_show_percentiles + 2);
tab_box (t, TAL_1, TAL_1, -1, -1 , 0 , 0 , 2, tab_nr(t) - 1) ;
VAR=x
/PERCENTILES = 0 25 50 75 100.
])
AT_CHECK([pspp -O format=csv frequencies.sps], [0],
[Table: X
Value Label,Value,Frequency,Percent,Valid Percent,Cum Percent
+
+AT_SETUP([FREQUENCIES dichotomous histogram])
+AT_DATA([frequencies.sps], [dnl
+data list notable list /d4 *.
+begin data.
+0
+0
+0
+1
+0
+0
+0
+0
+1
+0
+0
+0
+0
+0
+1
+2
+0
+end data.
+
+FREQUENCIES
+ /VARIABLES = d4
+ /FORMAT=AVALUE TABLE
+ /HISTOGRAM=NORMAL
+ .
+])
+
+AT_CHECK([pspp frequencies.sps], [0], [ignore])
+AT_CLEANUP