* T-TEST example pspp code
*
* Generate an example dataset for male and female humans
* with weight, height, beauty and iq data.
* Weight and height are drawn from normal distributions with
* different mean values. iq is drawn with the same mean value (100)
* for both groups. Beauty differs only slightly.
* Every run of the program produces new data.
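
input program.
* Females have gender 0.
* Create 8 female cases.
loop #i = 1 to 8.
  compute weight = rv.normal(65, 10).
  compute height = rv.normal(170.7, 6.3).
  compute beauty = rv.normal(10, 4).
  compute iq = rv.normal(100, 15).
  compute gender = 0.
  end case.
end loop.
* Males have gender 1.
* Create 8 male cases.
loop #i = 1 to 8.
  compute weight = rv.normal(83, 13).
  compute height = rv.normal(183.8, 7.1).
  compute beauty = rv.normal(11, 4).
  compute iq = rv.normal(100, 15).
  compute gender = 1.
  end case.
end loop.
end file.
end input program.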

* Add a label to the gender values to have descriptive names.
value labels
  /gender 0 female 1 male.

* Plot the data as boxplots.
examine
  /variables=weight height beauty iq by gender
  /plot = boxplot.

* Do a scatterplot to check whether weight and height
* might be correlated. As both the weight and the
* height of males are higher than those of females,
* the combined male and female data is correlated:
* weight increases with height.
graph
  /scatterplot = height with weight.
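* A possible way to put a number on this impression, assuming the
* CORRELATIONS command is available in your pspp version, is to print
* the Pearson correlation of the pooled male and female data.
correlations
  /variables = height weight.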

* Within the male and female groups there is no correlation between
* weight and height, because the two variables are generated
* independently. This becomes visible by marking male and female
* data points with different colours.
graph
  /scatterplot = height with weight by gender.
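* One way to check this numerically is to compute the correlation
* separately for each group. This is only a sketch and assumes that
* CORRELATIONS honours SPLIT FILE in your pspp version. With only
* 8 cases per group the coefficients scatter widely around 0 from
* run to run.
sort cases by gender.
split file by gender.
correlations
  /variables = height weight.
split file off.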

* The T-Test checks if male and female humans have
* different weight, height, beauty and iq. See that the significance for
* the weight and height variables tends to 0, while the significance
* for iq should usually not go to 0.
* Significance in the T-Test is the probability of observing a difference
* in means at least as large as the one in the sample if the two groups
* (male, female) had the same mean height (weight, beauty, iq).
* As the iq values of both groups are generated from normal distributions
* with the same mean value, the significance should not go down to 0.
t-test groups=gender(0,1)
  /variables=weight height beauty iq.

* Run the code several times to see that different data
* is generated. Every run is like a new sample from the population.

* Change the number of cases by changing the
* loop range to see the effect on significance!
* With more cases per group the estimates of the mean values
* and standard deviations become more precise.
* The difference in beauty becomes visible only with larger sample sizes.
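
* A minimal sketch of such a larger run, restricted to beauty.
* The 1000 cases per group and the scratch variable #i are
* illustrative assumptions, not part of the example above.
* A single loop creates both groups: the logical expression
* (#i > 1000) gives gender 0 for the first half and 1 for the second.
* Running this replaces the small dataset generated above.
input program.
loop #i = 1 to 2000.
  compute gender = (#i > 1000).
  compute beauty = rv.normal(10 + gender, 4).
  end case.
end loop.
end file.
end input program.
* With this many cases the significance for beauty should now be
* close to 0 in most runs.
t-test groups=gender(0,1)
  /variables=beauty.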