@c PSPP - a program for statistical analysis.
@c Copyright (C) 2019 Free Software Foundation, Inc.
@c Permission is granted to copy, distribute and/or modify this document
@c under the terms of the GNU Free Documentation License, Version 1.3
@c or any later version published by the Free Software Foundation;
@c with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
@c A copy of the license is included in the section entitled "GNU
@c Free Documentation License".
@node Encrypted File Wrappers
@chapter Encrypted File Wrappers

SPSS 21 and later can package multiple kinds of files inside an
encrypted wrapper.  The wrapper has a common format, regardless of the
kind of file that it contains.

The SPSS encryption wrapper is poorly designed.  It is much cheaper
and faster to break the encryption of a file protected this way than
it would be with a well-designed alternative.  If you must use this
format, use a randomly generated password that is 10 bytes long,
because only the first 10 bytes of a password are used.

@menu
* Common Wrapper Format::
* Password Encoding::
@end menu
@node Common Wrapper Format
@section Common Wrapper Format

This section describes the general format of an SPSS encrypted file
wrapper.  The following sections describe the details for each kind of
encapsulated file.

An encrypted file wrapper begins with the following 36-byte header,
where @i{xxx} identifies the type of file encapsulated, as described
in the following sections:

@example
0000  1c 00 00 00 00 00 00 00  45 4e 43 52 59 50 54 45  |........ENCRYPTE|
0010  44 @i{xx} @i{xx} @i{xx} 15 00 00 00  00 00 00 00 00 00 00 00  |D@i{xxx}............|
0020  00 00 00 00                                       |....|
@end example
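
The header layout above can be checked mechanically.  The following
Python sketch is only an illustration: the function names
@code{make_wrapper_header} and @code{parse_wrapper_header} are
hypothetical, and the byte offsets are read directly off the hex dump
above.

```python
# Bytes 0..16 of the header: 1c 00 00 00 00 00 00 00 followed by "ENCRYPTED".
MAGIC_PREFIX = bytes.fromhex("1c00000000000000") + b"ENCRYPTED"
# Bytes 20..35 of the header: 0x15 followed by 15 null bytes.
MAGIC_SUFFIX = bytes.fromhex("15") + bytes(15)
HEADER_LEN = 36


def make_wrapper_header(kind: str) -> bytes:
    """Build the 36-byte header for a 3-character type identifier."""
    ident = kind.encode("ascii")
    if len(ident) != 3:
        raise ValueError("type identifier must be exactly 3 bytes")
    return MAGIC_PREFIX + ident + MAGIC_SUFFIX


def parse_wrapper_header(header: bytes) -> str:
    """Return the 3-character type identifier, or raise ValueError."""
    if len(header) < HEADER_LEN:
        raise ValueError("header too short")
    if header[:17] != MAGIC_PREFIX or header[20:36] != MAGIC_SUFFIX:
        raise ValueError("not an SPSS encrypted file wrapper")
    return header[17:20].decode("ascii")
```

A reader would typically call @code{parse_wrapper_header} on the first
36 bytes of a file and dispatch on the returned identifier.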
Following the fixed header is essentially the regular contents of the
encapsulated file in its usual format, with each 16-byte block
encrypted with AES-256 in ECB mode.  Each type of encapsulated file is
processed in a slightly different way before encryption, as described
in the following sections.  The AES-256 key is derived from a password
in the following way:

@enumerate
@item
Start from the literal password typed by the user.  Truncate it to at
most 10 bytes, then append as many null bytes as necessary until there
are exactly 32 bytes.  Call this @var{password}.

@item
Let @var{constant} be the following 73-byte constant:

@example
0000  00 00 00 01 35 27 13 cc  53 a7 78 89 87 53 22 11
0010  d6 5b 31 58 dc fe 2e 7e  94 da 2f 00 cc 15 71 80
0020  0a 6c 63 53 00 38 c3 38  ac 22 f3 63 62 0e ce 85
0030  3f b8 07 4c 4e 2b 77 c7  21 f5 1a 80 1d 67 fb e1
0040  e1 83 07 d8 0d 00 00 01  00
@end example

@item
Compute CMAC-AES-256(@var{password}, @var{constant}).  Call the
16-byte result @var{cmac}.

@item
The 32-byte AES-256 key is @var{cmac} || @var{cmac}, that is,
@var{cmac} repeated twice.
@end enumerate
As an example, consider the password @samp{pspp}.  @var{password} is:

@example
0000  70 73 70 70 00 00 00 00  00 00 00 00 00 00 00 00  |pspp............|
0010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
@end example

@noindent
@var{cmac} is:

@example
0000  3e da 09 8e 66 04 d4 fd  f9 63 0c 2c a8 6f b0 45
@end example

@noindent
The AES-256 key is:

@example
0000  3e da 09 8e 66 04 d4 fd  f9 63 0c 2c a8 6f b0 45
0010  3e da 09 8e 66 04 d4 fd  f9 63 0c 2c a8 6f b0 45
@end example
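
The key derivation steps above can be sketched in Python as follows.
This is a minimal illustration, not a reference implementation.  The
function names are hypothetical, the password is assumed to be
available as bytes already, and @code{cmac_aes256} assumes that the
third-party @code{cryptography} package is installed; any other
CMAC-AES-256 implementation can be passed in as @code{cmac_fn}
instead.

```python
# The 73-byte constant from the key derivation steps.
CONSTANT = bytes.fromhex(
    "00 00 00 01 35 27 13 cc 53 a7 78 89 87 53 22 11 "
    "d6 5b 31 58 dc fe 2e 7e 94 da 2f 00 cc 15 71 80 "
    "0a 6c 63 53 00 38 c3 38 ac 22 f3 63 62 0e ce 85 "
    "3f b8 07 4c 4e 2b 77 c7 21 f5 1a 80 1d 67 fb e1 "
    "e1 83 07 d8 0d 00 00 01 00"
)


def pad_password(password: bytes) -> bytes:
    """Truncate to at most 10 bytes, then null-pad to exactly 32 bytes."""
    return password[:10].ljust(32, b"\0")


def cmac_aes256(key: bytes, message: bytes) -> bytes:
    """CMAC with AES-256 (assumption: 'cryptography' package is available)."""
    from cryptography.hazmat.primitives import cmac
    from cryptography.hazmat.primitives.ciphers import algorithms

    c = cmac.CMAC(algorithms.AES(key))
    c.update(message)
    return c.finalize()


def derive_key(password: bytes, cmac_fn=cmac_aes256) -> bytes:
    """Return the 32-byte AES-256 key: the 16-byte CMAC repeated twice."""
    mac = cmac_fn(pad_password(password), CONSTANT)
    return mac + mac
```

Passing the CMAC function as a parameter keeps the padding and
key-doubling steps usable, and testable, without a particular
cryptography library.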
@menu
* Encrypted System Files::
* Encrypted Syntax Files::
@end menu

@node Encrypted System Files
@subsection Encrypted System Files

An encrypted system file uses @code{SAV} as the identifier in its
header.

Before encryption, as many null bytes as needed (possibly zero) are
appended to a system file to make its length a multiple of 16 bytes,
so that it fits exactly in a series of AES blocks.  (This implies that
encrypted system files must always be compressed, because otherwise a
system file with only a single variable might appear to have an extra
case.)
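
The padding rule for system files can be sketched in Python as
follows.  The function name @code{pad_system_file} is hypothetical,
and the sketch covers only the padding step, not the encryption.

```python
AES_BLOCK = 16


def pad_system_file(data: bytes) -> bytes:
    """Append null bytes (possibly none) up to a multiple of 16 bytes."""
    # -len(data) % 16 is the distance to the next multiple of 16,
    # and is 0 when the length is already a multiple of 16.
    return data + bytes(-len(data) % AES_BLOCK)
```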
@node Encrypted Syntax Files
@subsection Encrypted Syntax Files

An encrypted syntax file uses @code{SPS} as the identifier in its
header.

Before encryption, a line of the form
@code{* Encoding: @var{encoding}.} is added at the beginning of the
syntax file, where @var{encoding} is the encoding used for the rest of
the file, e.g.@: @code{windows-1252}.  Then as many bytes with value 04
as needed (possibly zero) are appended to make the length of the file
a multiple of 16 bytes.
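
The preparation of a syntax file can be sketched in Python as follows.
The function name @code{prepare_syntax_file} is hypothetical, and the
line terminator after the encoding line is an assumption: the
description above does not say which one SPSS writes, so it is left as
a parameter.

```python
def prepare_syntax_file(syntax: str, encoding: str = "windows-1252",
                        newline: str = "\n") -> bytes:
    """Prefix the encoding line, encode, and pad with 0x04 bytes."""
    # Assumption: the encoding line ends with `newline` (not specified above).
    text = f"* Encoding: {encoding}.{newline}{syntax}"
    data = text.encode(encoding)
    # Pad with bytes of value 04 (possibly none) to a multiple of 16 bytes.
    return data + b"\x04" * (-len(data) % 16)
```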
@node Password Encoding
@section Password Encoding

SPSS also supports what it calls ``encrypted passwords.''  These are
not encrypted.  They are encoded with a simple, fixed scheme.  An
encoded password is always a multiple of 2 characters long, and never
longer than 20 characters.  The characters in an encoded password are
always in the graphic ASCII range 33 through 126.  Each successive
pair of characters in the encoded password encodes a single byte in
the decoded password.
Use the following algorithm to decode a pair of characters:

@enumerate
@item
Let @var{a} be the ASCII code of the first character, and @var{b} be
the ASCII code of the second character.

@item
Let @var{ah} be the most significant 4 bits of @var{a}.  Find the line
in the table below that has @var{ah} on the left side.  The right side
of the line is a set of possible values for the most significant 4
bits of the decoded byte.

@display
@t{2 } @result{} @t{2367}
@t{3 } @result{} @t{0145}
@t{47} @result{} @t{89cd}
@t{56} @result{} @t{abef}
@end display

@item
Let @var{bh} be the most significant 4 bits of @var{b}.  Find the line
in the second table below that has @var{bh} on the left side.  The
right side of the line is a set of possible values for the most
significant 4 bits of the decoded byte.  Together with the results of
the previous step, only a single possibility is left.

@display
@t{2 } @result{} @t{139b}
@t{3 } @result{} @t{028a}
@t{47} @result{} @t{46ce}
@t{56} @result{} @t{57df}
@end display

@item
Let @var{al} be the least significant 4 bits of @var{a}.  Find the
line in the table below that has @var{al} on the left side.  The right
side of the line is a set of possible values for the least significant
4 bits of the decoded byte.

@display
@t{03cf} @result{} @t{0145}
@t{12de} @result{} @t{2367}
@t{478b} @result{} @t{89cd}
@t{569a} @result{} @t{abef}
@end display

@item
Let @var{bl} be the least significant 4 bits of @var{b}.  Find the
line in the table below that has @var{bl} on the left side.  The right
side of the line is a set of possible values for the least significant
4 bits of the decoded byte.  Together with the results of the previous
step, only a single possibility is left.

@display
@t{03cf} @result{} @t{028a}
@t{12de} @result{} @t{139b}
@t{478b} @result{} @t{46ce}
@t{569a} @result{} @t{57df}
@end display
@end enumerate
As an example, consider the encoded character pair @samp{-|}.
@var{a} is 0x2d and @var{b} is 0x7c, so @var{ah} is 2, @var{bh} is 7,
@var{al} is 0xd, and @var{bl} is 0xc.  @var{ah} means that the most
significant four bits of the decoded character are 2, 3, 6, or 7, and
@var{bh} means that they are 4, 6, 0xc, or 0xe.  The single
possibility in common is 6, so the most significant four bits are 6.
Similarly, @var{al} means that the least significant four bits are 2,
3, 6, or 7, and @var{bl} means they are 0, 2, 8, or 0xa, so the least
significant four bits are 2.  The decoded character is therefore 0x62,
the letter @samp{b}.
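
The decoding algorithm and the worked example above can be sketched in
Python as follows.  The four dictionaries transcribe the four tables
directly; all names are hypothetical and the sketch is an
illustration, not the code that SPSS or PSPP uses.

```python
# Each table maps a group of nibbles (as hex digits) to candidate nibbles.
HIGH_A = {"2": "2367", "3": "0145", "47": "89cd", "56": "abef"}
HIGH_B = {"2": "139b", "3": "028a", "47": "46ce", "56": "57df"}
LOW_A = {"03cf": "0145", "12de": "2367", "478b": "89cd", "569a": "abef"}
LOW_B = {"03cf": "028a", "12de": "139b", "478b": "46ce", "569a": "57df"}


def candidates(table: dict, nibble: int) -> set:
    """Return the set of possible decoded nibbles for one table lookup."""
    digit = format(nibble, "x")
    for keys, values in table.items():
        if digit in keys:
            return {int(v, 16) for v in values}
    raise ValueError(f"nibble {digit} is not in the table")


def decode_pair(pair: str) -> int:
    """Decode two encoded characters into one byte value."""
    a, b = ord(pair[0]), ord(pair[1])
    if not (33 <= a <= 126 and 33 <= b <= 126):
        raise ValueError("characters must be graphic ASCII (33 through 126)")
    # Intersecting the two candidate sets leaves exactly one nibble each.
    (high,) = candidates(HIGH_A, a >> 4) & candidates(HIGH_B, b >> 4)
    (low,) = candidates(LOW_A, a & 0xF) & candidates(LOW_B, b & 0xF)
    return (high << 4) | low


def decode_password(encoded: str) -> bytes:
    """Decode an encoded password of even length, at most 20 characters."""
    if len(encoded) % 2 != 0 or len(encoded) > 20:
        raise ValueError("encoded password has an invalid length")
    return bytes(decode_pair(encoded[i:i + 2])
                 for i in range(0, len(encoded), 2))
```

With these definitions, @code{decode_pair("-|")} yields 0x62, matching
the worked example.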