X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=blobdiff_plain;f=doc%2Ftutorial.texi;h=b76d8bf838ab1b50de389fa346cd39beead8a41d;hb=e2f99612bf4f4691623f16730eed3e55afdc54f0;hp=ca8a83998034e5c8fae301d0cf7c4cdfdb4eb038;hpb=620d94c8a41811d8dc8ba8a0f500896a9a894a18;p=pspp
diff --git a/doc/tutorial.texi b/doc/tutorial.texi
index ca8a839980..b76d8bf838 100644
--- a/doc/tutorial.texi
+++ b/doc/tutorial.texi
@@ -1,3 +1,12 @@
+@c PSPP - a program for statistical analysis.
+@c Copyright (C) 2017 Free Software Foundation, Inc.
+@c Permission is granted to copy, distribute and/or modify this document
+@c under the terms of the GNU Free Documentation License, Version 1.3
+@c or any later version published by the Free Software Foundation;
+@c with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
+@c A copy of the license is included in the section entitled "GNU
+@c Free Documentation License".
+@c
 @alias prompt = sansserif
 @include tut.texi
@@ -26,7 +35,7 @@ executed. Whichever method you choose, the syntax is identical. When using the interactive method, @pspp{} tells you that it's waiting for your
-data with a string like @prompt{@pspp{}>} or @prompt{data>}.
+data with a string like @prompt{PSPP>} or @prompt{data>}.
 In the examples of this chapter, whenever you see text like this, it indicates the prompt displayed by @pspp{}, @emph{not} something that you should type.
@@ -85,6 +94,7 @@ The following sections explain how to define a dataset.
 * Reading data from a pre-prepared PSPP file::
 * Saving data to a PSPP file.::
 * Reading data from other sources::
+* Exiting PSPP::
 @end menu
 @node Defining Variables
@@ -105,14 +115,14 @@ and reads data into them by manual input.
 @float Example, data-list
 @cartouche
 @example
-@prompt{@pspp{}>} data list list /forename (A12) height.
-@prompt{@pspp{}>} begin data.
+@prompt{PSPP>} data list list /forename (A12) height.
+@prompt{PSPP>} begin data.
 @prompt{data>} Ahmed 188
 @prompt{data>} Bertram 167
 @prompt{data>} Catherine 134.231
 @prompt{data>} David 109.1
 @prompt{data>} end data
-@prompt{@pspp{}>}
+@prompt{PSPP>}
 @end example
 @end cartouche
 @caption{Manual entry of data using the @cmd{DATA LIST} command.
@@ -144,11 +154,21 @@ and @samp{(A12)} says that the variable @var{forename} is a string variable and that its maximum length is 12 bytes. The second variable's name is specified by the text @samp{height}. Since no format is given, this variable has the default format.
+Normally the default format expects numeric data, which should be
+entered in the locale of the operating system.
+Thus, the example is correct for English locales and other
+locales which use a period (@samp{.}) as the decimal separator.
+However if you are using a system with a locale which uses the comma (@samp{,})
+as the decimal separator, then you should in the subsequent lines substitute
+@samp{.} with @samp{,}.
+Alternatively, you could explicitly tell @pspp{} that the @var{height}
+variable is to be read using a period as its decimal separator by appending the
+text @samp{DOT8.3} after the word @samp{height}.
 For more information on data formats, @pxref{Input and Output Formats}.
 @item
-Normally, @pspp{} displays the prompt @prompt{@pspp{}>} whenever it's
+Normally, @pspp{} displays the prompt @prompt{PSPP>} whenever it's
 expecting a command. However, when it's expecting data, the prompt changes to @prompt{data>} so that you know to enter data and not a command.
@@ -168,7 +188,7 @@ terminating commands. Once the data has been entered, you could type
 @example
-@prompt{@pspp{}>} list /format=numbered.
+@prompt{PSPP>} list /format=numbered.
 @end example
 @noindent
 to list the data.
@@ -215,7 +235,7 @@ Zachariah 113.02
 You can can tell the @cmd{DATA LIST} command to read the data directly from this file instead of by manual entry, with a command like:
 @example
-@prompt{@pspp{}>} data list file='mydata.dat' list /forename (A12) height.
+@prompt{PSPP>} data list file='mydata.dat' list /forename (A12) height.
 @end example
 @noindent
 Notice however, that it is still necessary to specify the names of the
@@ -240,7 +260,7 @@ have the suffix @file{.sav}, but that is not mandatory. The following syntax loads a file called @file{my-file.sav}.
 @example
-@prompt{@pspp{}>} get file='my-file.sav'.
+@prompt{PSPP>} get file='my-file.sav'.
 @end example
 @noindent
 You will encounter several instances of this in future examples.
@@ -258,7 +278,7 @@ the @cmd{SAVE} command. The following syntax will save the existing data and variables to a file called @file{my-new-file.sav}.
 @example
-@prompt{@pspp{}>} save outfile='my-new-file.sav'.
+@prompt{PSPP>} save outfile='my-new-file.sav'.
 @end example
 @noindent
 If @file{my-new-file.sav} already exists, then it will be overwritten.
@@ -276,6 +296,13 @@ separated text, from spreadsheets, databases or other sources. In these instances you should use the @cmd{GET DATA} command (@pxref{GET DATA}).
+@node Exiting PSPP
+@subsection Exiting PSPP
+
+Use the @cmd{FINISH} command to exit PSPP:
+@example
+@prompt{PSPP>} finish.
+@end example
 @node Data Screening and Transformation
 @section Data Screening and Transformation
@@ -317,8 +344,8 @@ data and identify the erroneous values.
 @float Example, descriptives
 @cartouche
 @example
-@prompt{@pspp{}>} get file='@value{example-dir}/physiology.sav'.
-@prompt{@pspp{}>} descriptives sex, weight, height.
+@prompt{PSPP>} get file='@value{example-dir}/physiology.sav'.
+@prompt{PSPP>} descriptives sex, weight, height.
 @end example
 Output:
@@ -363,7 +390,7 @@ represent data entry errors.
 @cartouche
 [@dots{} continue from @ref{descriptives}]
 @example
-@prompt{@pspp{}>} examine height, weight /statistics=extreme(3).
+@prompt{PSPP>} examine height, weight /statistics=extreme(3).
 @end example
 Output:
@@ -474,24 +501,24 @@ A sensible check to perform on survey data is the calculation of reliability.
 This gives the statistician some confidence that the questionnaires have been completed thoughtfully.
-If you examine the labels of variables @var{v1}, @var{v3} and @var{v5},
+If you examine the labels of variables @var{v1}, @var{v3} and @var{v4},
 you will notice that they ask very similar questions. One would therefore expect the values of these variables (after recoding) to closely follow one another, and we can test that with the @cmd{RELIABILITY} command (@pxref{RELIABILITY}). @ref{reliability} shows a @pspp{} session where the user (after recoding negatively scaled variables) requests reliability statistics for
-@var{v1}, @var{v3} and @var{v5}.
+@var{v1}, @var{v3} and @var{v4}.
 @float Example, reliability
 @cartouche
 @example
-@prompt{@pspp{}>} get file='@value{example-dir}/hotel.sav'.
-@prompt{@pspp{}>} display dictionary.
-@prompt{@pspp{}>} * recode negatively worded questions.
-@prompt{@pspp{}>} compute v3 = 6 - v3.
-@prompt{@pspp{}>} compute v5 = 6 - v5.
-@prompt{@pspp{}>} reliability v1, v3, v5.
+@prompt{PSPP>} get file='@value{example-dir}/hotel.sav'.
+@prompt{PSPP>} display dictionary.
+@prompt{PSPP>} * recode negatively worded questions.
+@prompt{PSPP>} compute v3 = 6 - v3.
+@prompt{PSPP>} compute v5 = 6 - v5.
+@prompt{PSPP>} reliability v1, v3, v4.
 @end example
 Output (dictionary information omitted for clarity):
@@ -509,19 +536,19 @@ Output (dictionary information omitted for clarity):
 #================#==========#
 #Cronbach's Alpha#N of Items#
 #================#==========#
-# .86# 3#
+# .81# 3#
 #================#==========#
 @end example
 @end cartouche
 @caption{Recoding negatively scaled variables, and testing for reliability with the @cmd{RELIABILITY} command. The Cronbach Alpha coefficient suggests a high degree of reliability among variables
-@var{v1}, @var{v2} and @var{v5}.}
+@var{v1}, @var{v3} and @var{v4}.}
 @end float
 As a rule of thumb, many statisticians consider a value of Cronbach's Alpha of 0.7 or higher to indicate reliable data.
-Here, the value is 0.86 so the data and the recoding that we performed
+Here, the value is 0.81 so the data and the recoding that we performed
 are vindicated.
@@ -569,11 +596,11 @@ an appropriate non-parametric test instead of a linear one.
 @float Example, normality
 @cartouche
 @example
-@prompt{@pspp{}>} get file='@value{example-dir}/repairs.sav'.
-@prompt{@pspp{}>} examine mtbf
+@prompt{PSPP>} get file='@value{example-dir}/repairs.sav'.
+@prompt{PSPP>} examine mtbf
 /statistics=descriptives.
-@prompt{@pspp{}>} compute mtbf_ln = ln (mtbf).
-@prompt{@pspp{}>} examine mtbf_ln
+@prompt{PSPP>} compute mtbf_ln = ln (mtbf).
+@prompt{PSPP>} examine mtbf_ln
 /statistics=descriptives.
 @end example
@@ -700,27 +727,28 @@ then an alternative calculation is necessary. For the @var{height} variable, the output shows the significance of the Levene test to be 0.33 which means there is a 33% probability that the
-Levene test produces this outcome when the variances are unequal.
-Such a probability is too high
-to assume that the variances are equal so the row
-for unequal variances should be used.
+Levene test produces this outcome when the variances are equal.
+Had the significance been less than 0.05, then it would have been unsafe to assume that
+the variances were equal.
+However, because the value is higher than 0.05 the homogeneity of variances assumption
+is safe and the ``Equal Variances'' row (the more powerful test) can be used.
 Examining this row, the two tailed significance for the @var{height} t-test is less than 0.05, so it is safe to reject the null hypothesis and conclude that the mean heights of males and females are unequal. For the @var{temperature} variable, the significance of the Levene test
-is 0.58 so again, it is unsafe to use the row for equal variances.
-The unequal variances row indicates that the two tailed significance for
-@var{temperature} is 0.19.
Since this is greater than 0.05 we must reject
+is 0.58 so again, it is safe to use the row for equal variances.
+The equal variances row indicates that the two tailed significance for
+@var{temperature} is 0.20. Since this is greater than 0.05 we must reject
 the null hypothesis and conclude that there is insufficient evidence to suggest that the body temperature of male and female persons are different.
 @float Example, t-test
 @cartouche
 @example
-@prompt{@pspp{}>} get file='@value{example-dir}/physiology.sav'.
-@prompt{@pspp{}>} recode height (179 = SYSMIS).
-@prompt{@pspp{}>} t-test group=sex(0,1) /variables = height temperature.
+@prompt{PSPP>} get file='@value{example-dir}/physiology.sav'.
+@prompt{PSPP>} recode height (179 = SYSMIS).
+@prompt{PSPP>} t-test group=sex(0,1) /variables = height temperature.
 @end example
 Output:
 @example
@@ -784,9 +812,9 @@ identifies the potential linear relationship. @xref{REGRESSION}.
 @float Example, regression
 @cartouche
 @example
-@prompt{@pspp{}>} get file='@value{example-dir}/repairs.sav'.
-@prompt{@pspp{}>} regression /variables = mtbf duty_cycle /dependent = mttr.
-@prompt{@pspp{}>} regression /variables = mtbf /dependent = mttr.
+@prompt{PSPP>} get file='@value{example-dir}/repairs.sav'.
+@prompt{PSPP>} regression /variables = mtbf duty_cycle /dependent = mttr.
+@prompt{PSPP>} regression /variables = mtbf /dependent = mttr.
 @end example
 Output:
 @example
@@ -851,7 +879,7 @@ suggesting that at the 0.06 level, the formula predictor of the time to repair.
-@c LocalWords: @pspp{} dir itemize noindent var cindex dfn cartouche samp xref
+@c LocalWords: PSPP dir itemize noindent var cindex dfn cartouche samp xref
 @c LocalWords: pxref ie sav Std Dev kilograms SYSMIS sansserif pre pspp emph
 @c LocalWords: Likert Cronbach's Cronbach mtbf npplot ln myfile cmd NPAR Sig
 @c LocalWords: vindex Levene Levene's df Diff clicksequence mydata dat ascii