Until now, histograms have always had exactly 11 bins. There is no
"correct" number of bins, but a fixed number is less than ideal because
it ignores the sample size. Instead, use Sturges' formula to choose the
number of bins from the count of valid cases, with a minimum of 5 bins.
Reported by Erik Frebold <efrebold@interchange.ubc.ca>.
double x_max = -DBL_MAX;
struct histogram *hist;
- const double bins = 11;
+ int bins;
struct hsh_iterator hi;
struct hsh_table *fh = ft->data;
if ( frq->value.f > x_max ) x_max = frq->value.f ;
}
+ /* Sturges' formula. */
+ bins = ceil (log (ft->valid_cases) / log (2) + 1);
+ if (bins < 5)
+ bins = 5;
+
hist = histogram_create (bins, x_min, x_max);
for( i = 0 ; i < ft->n_valid ; ++i )