* T-TEST example PSPP code.

* Generate an example dataset of male and female humans
* with weight, height, beauty and iq data.
* Weight and height are drawn from normal distributions whose mean
* values differ between the groups. iq is drawn with the same mean
* value (100) for both groups. The mean of beauty differs only slightly.
* Every run of the program produces new data.
input program.

* Females have gender 0.
* Create 8 female cases.
loop #i = 1 to 8.
 compute weight = rv.normal(65, 10).
 compute height = rv.normal(170.7, 6.3).
 compute beauty = rv.normal(10, 4).
 compute iq = rv.normal(100, 15).
 compute gender = 0.
 end case.
end loop.

* Males have gender 1.
* Create 8 male cases.
loop #i = 1 to 8.
 compute weight = rv.normal(83, 13).
 compute height = rv.normal(183.8, 7.1).
 compute beauty = rv.normal(11, 4).
 compute iq = rv.normal(100, 15).
 compute gender = 1.
 end case.
end loop.

end file.
end input program.

* Label the gender values with descriptive names.
value labels
  /gender 0 female 1 male.

* Plot the data as boxplots, grouped by gender.
examine
  /variables=weight height beauty iq by gender
  /plot=boxplot.

* Draw a scatterplot to check whether weight and height
* might be correlated. Because both weight and height tend to be
* higher for males than for females, the combined male and
* female data is correlated: weight increases with height.
graph
  /scatterplot = height with weight.

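* A possible numeric check of the scatterplot impression (a sketch that
* is not part of the original example): CORRELATIONS prints the Pearson
* correlation of weight and height over all 16 cases, both genders pooled.
correlations
  /variables = weight height.
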
* Within the male and female groups the data is generated without any
* correlation between weight and height. This becomes visible when male
* and female data points are marked with different colours.
graph
  /scatterplot = height with weight by gender.

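* One way to check the within-group claim numerically (a sketch, assuming
* SPLIT FILE and CORRELATIONS behave as documented): compute the
* correlation of weight and height separately for each gender.
* With only 8 cases per group the sample correlation can still differ
* noticeably from 0 in a single run.
* SPLIT FILE expects the cases to be sorted by the split variable.
sort cases by gender.
split file by gender.
correlations
  /variables = weight height.
* Switch splitting off again so that later commands use all cases.
split file off.
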
* The T-Test checks whether male and female humans differ in mean
* weight, height, beauty and iq. The Significance for the weight and
* height variables tends towards 0, while the Significance for iq
* should not.
* The Significance of the T-Test is the probability of observing a
* difference between the two group means (male, female) at least as
* large as the one in the sample if both groups had the same mean value.
* As the iq values are generated from normal distributions with the same
* mean value, their Significance is usually not close to 0.
t-test groups=gender(0,1)
  /variables=weight height beauty iq.

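* A variation on the test above (a sketch. The 0.99 level is an arbitrary
* choice for illustration, and the CRITERIA subcommand is assumed to be
* available in your PSPP version): request a 99% instead of the default
* 95% confidence interval for the difference of the group means.
t-test groups=gender(0,1)
  /variables=weight height beauty iq
  /criteria=ci(0.99).
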
* Run the code several times to see that different data is generated
* each time. Every run is like a new sample from the population.

* Change the number of cases by changing the loop range
* to see the effect on the Significance!
* With more cases, the estimates of the mean values and standard
* deviations become more precise.
* The difference in beauty becomes visible only with larger sample sizes.

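* A sketch of the experiment suggested above, with 200 cases per group
* (an arbitrary number chosen for illustration). Only gender and beauty
* are generated, and running this part replaces the dataset created
* earlier. The logical expression (#i > 200) evaluates to 0 for the
* first 200 cases (female) and to 1 for the remaining ones (male).
input program.
loop #i = 1 to 400.
 compute gender = (#i > 200).
 compute beauty = rv.normal(10 + gender, 4).
 end case.
end loop.
end file.
end input program.

value labels
  /gender 0 female 1 male.

* With this many cases the Significance for beauty is often, though not
* in every run, small.
t-test groups=gender(0,1)
  /variables=beauty.
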