@node Encrypted File Wrappers
@chapter Encrypted File Wrappers

SPSS 21 and later can package multiple kinds of files inside an
encrypted wrapper. The wrapper has a common format, regardless of the
kind of file that it contains.

The SPSS encryption wrapper is poorly designed. It is much cheaper
and faster to break a file encrypted this way than it would be if a
well-designed alternative were used. If you must use this format, use a
randomly generated 10-byte password, because longer passwords are
truncated to 10 bytes.

@menu
* Common Wrapper Format::
* Password Encoding::
@end menu

@node Common Wrapper Format
@section Common Wrapper Format

This section describes the general format of an SPSS encrypted file
wrapper. The following sections describe the details for each kind of
encapsulated file.

An encrypted file wrapper begins with the following 36-byte header,
where @i{xxx} identifies the type of file encapsulated, as described
in the following sections:

@example
0000  1c 00 00 00 00 00 00 00  45 4e 43 52 59 50 54 45  |........ENCRYPTE|
0010  44 @i{xx} @i{xx} @i{xx} 15 00 00 00  00 00 00 00 00 00 00 00  |D@i{xxx}............|
0020  00 00 00 00                                       |....|
@end example
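
The header layout above can be checked mechanically. The following Python sketch is illustrative only: the function names @code{parse_wrapper_header} and @code{build_wrapper_header} are hypothetical, and it assumes that every byte outside the three-byte identifier is fixed exactly as shown in the dump.

```python
HEADER_LEN = 36
LEADING = bytes([0x1C]) + bytes(7)    # offsets 0x00-0x07
MAGIC = b"ENCRYPTED"                  # offsets 0x08-0x10
TRAILING = bytes([0x15]) + bytes(15)  # offsets 0x14-0x23


def build_wrapper_header(kind: str) -> bytes:
    """Assemble a 36-byte header for a 3-character identifier."""
    ident = kind.encode("ascii")
    if len(ident) != 3:
        raise ValueError("identifier must be exactly 3 ASCII characters")
    return LEADING + MAGIC + ident + TRAILING


def parse_wrapper_header(data: bytes) -> str:
    """Validate the fixed bytes and return the 3-character identifier."""
    if len(data) < HEADER_LEN:
        raise ValueError("too short to hold a 36-byte wrapper header")
    header = data[:HEADER_LEN]
    if header[:8] != LEADING:
        raise ValueError("unexpected leading bytes")
    if header[8:17] != MAGIC:
        raise ValueError("missing ENCRYPTED marker")
    if header[20:] != TRAILING:
        raise ValueError("unexpected trailing bytes")
    # Offsets 0x11-0x13 carry the identifier (the xxx in the dump).
    return header[17:20].decode("ascii")
```

The encrypted payload, if any, starts at offset 36, immediately after the bytes this sketch inspects.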

Following the fixed header is essentially the regular contents of the
encapsulated file in its usual format, with each 16-byte block
encrypted with AES-256 in ECB mode. Each type of encapsulated file is
processed in a slightly different way before encryption, as described
in the following sections. The AES-256 key is derived from a password
as follows:

@enumerate
@item
Start from the literal password typed by the user. Truncate it to at
most 10 bytes, then append as many null bytes as necessary until there
are exactly 32 bytes. Call this @var{password}.

@item
Let @var{constant} be the following 73-byte constant:

@example
0000  00 00 00 01 35 27 13 cc  53 a7 78 89 87 53 22 11
0010  d6 5b 31 58 dc fe 2e 7e  94 da 2f 00 cc 15 71 80
0020  0a 6c 63 53 00 38 c3 38  ac 22 f3 63 62 0e ce 85
0030  3f b8 07 4c 4e 2b 77 c7  21 f5 1a 80 1d 67 fb e1
0040  e1 83 07 d8 0d 00 00 01  00
@end example

@item
Compute CMAC-AES-256(@var{password}, @var{constant}). Call the
16-byte result @var{cmac}.

@item
The 32-byte AES-256 key is @var{cmac} || @var{cmac}, that is,
@var{cmac} repeated twice.
@end enumerate

Consider the password @samp{pspp}. @var{password} is:

@example
0000  70 73 70 70 00 00 00 00  00 00 00 00 00 00 00 00  |pspp............|
0010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
@end example

@noindent
@var{cmac} is:

@example
0000  3e da 09 8e 66 04 d4 fd  f9 63 0c 2c a8 6f b0 45
@end example

@noindent
The AES-256 key is:

@example
0000  3e da 09 8e 66 04 d4 fd  f9 63 0c 2c a8 6f b0 45
0010  3e da 09 8e 66 04 d4 fd  f9 63 0c 2c a8 6f b0 45
@end example
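
The first and last steps of the key derivation need nothing beyond byte manipulation, so they can be sketched in a few lines of Python. The helper names @code{pad_password} and @code{key_from_cmac} are hypothetical. The CMAC-AES-256 step is deliberately left out, because it needs an AES implementation that the Python standard library does not provide; the sketch takes the 16-byte @var{cmac} as an input instead.

```python
def pad_password(password: bytes) -> bytes:
    """Truncate to at most 10 bytes, then null-pad to exactly 32 bytes."""
    return password[:10].ljust(32, b"\x00")


def key_from_cmac(cmac: bytes) -> bytes:
    """Form the 32-byte AES-256 key by repeating the 16-byte CMAC twice."""
    if len(cmac) != 16:
        raise ValueError("cmac must be exactly 16 bytes")
    return cmac + cmac


# Values taken from the pspp example in the text.
padded = pad_password(b"pspp")
key = key_from_cmac(bytes.fromhex("3eda098e6604d4fdf9630c2ca86fb045"))
```

With the values from the example, @code{padded} is @samp{pspp} followed by 28 null bytes, and @code{key} is the 16-byte @var{cmac} written out twice.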

@menu
* Encrypted System Files::
* Encrypted Syntax Files::
@end menu

@node Encrypted System Files
@subsection Encrypted System Files

An encrypted system file uses @code{SAV} as the identifier in its
header.

Before encryption, a system file is padded with as many null bytes
as needed (possibly zero) to make it a multiple of 16 bytes in length,
so that it fits exactly in a series of AES blocks. (This implies that
encrypted system files must always be compressed, because otherwise a
system file with only a single variable might appear to have an extra
case.)

@node Encrypted Syntax Files
@subsection Encrypted Syntax Files

An encrypted syntax file uses @code{SPS} as the identifier in its
header.

Before encryption, a syntax file is prefixed with a line of the form
@code{* Encoding: @var{encoding}.}, where @var{encoding} is the
encoding used for the rest of the file, e.g.@: @code{windows-1252}.
The syntax file is then padded with as many bytes with value 04 as
needed (possibly zero) to make it a multiple of 16 bytes in length.
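
The pre-encryption processing for both kinds of file can be sketched as follows. This is a minimal Python illustration with hypothetical function names. The text does not say which line terminator follows the @code{* Encoding:} line, so the sketch takes it as a parameter and defaults to a bare newline purely as an assumption.

```python
BLOCK_SIZE = 16  # AES block size in bytes


def pad_to_block(data: bytes, fill: int) -> bytes:
    """Append `fill` bytes (possibly none) up to a multiple of 16 bytes."""
    remainder = len(data) % BLOCK_SIZE
    if remainder == 0:
        return data
    return data + bytes([fill]) * (BLOCK_SIZE - remainder)


def prepare_system_file(contents: bytes) -> bytes:
    """System files are padded with null bytes."""
    return pad_to_block(contents, 0x00)


def prepare_syntax_file(syntax: bytes, encoding: str,
                        newline: bytes = b"\n") -> bytes:
    """Prefix the encoding line, then pad with bytes of value 04.

    The default `newline` is an assumption, not something the format
    description specifies.
    """
    prefix = b"* Encoding: " + encoding.encode("ascii") + b"." + newline
    return pad_to_block(prefix + syntax, 0x04)
```

For instance, a 15-byte syntax file with encoding @code{windows-1252} gains a 26-byte prefix line under the newline assumption, for 41 bytes in total, and is then padded with seven 04 bytes to reach 48.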

@node Password Encoding
@section Password Encoding

SPSS also supports what it calls ``encrypted passwords.'' These are
not encrypted. They are encoded with a simple, fixed scheme. An
encoded password is always a multiple of 2 characters long, and never
longer than 20 characters. The characters in an encoded password are
always in the graphic ASCII range 33 through 126. Each successive
pair of characters in the password encodes a single byte in the
plaintext password.

Use the following algorithm to decode a pair of characters:

@enumerate
@item
Let @var{a} be the ASCII code of the first character, and @var{b} be
the ASCII code of the second character.

@item
Let @var{ah} be the most significant 4 bits of @var{a}. Find the line
in the table below that has @var{ah} on the left side. The right side
of the line is a set of possible values for the most significant 4
bits of the decoded byte.

@display
@t{2 } @result{} @t{2367}
@t{3 } @result{} @t{0145}
@t{47} @result{} @t{89cd}
@t{56} @result{} @t{abef}
@end display

@item
Let @var{bh} be the most significant 4 bits of @var{b}. Find the line
in the second table below that has @var{bh} on the left side. The
right side of the line is a set of possible values for the most
significant 4 bits of the decoded byte. Together with the results of
the previous step, only a single possibility is left.

@display
@t{2 } @result{} @t{139b}
@t{3 } @result{} @t{028a}
@t{47} @result{} @t{46ce}
@t{56} @result{} @t{57df}
@end display

@item
Let @var{al} be the least significant 4 bits of @var{a}. Find the
line in the table below that has @var{al} on the left side. The right
side of the line is a set of possible values for the least significant
4 bits of the decoded byte.

@display
@t{03cf} @result{} @t{0145}
@t{12de} @result{} @t{2367}
@t{478b} @result{} @t{89cd}
@t{569a} @result{} @t{abef}
@end display

@item
Let @var{bl} be the least significant 4 bits of @var{b}. Find the
line in the table below that has @var{bl} on the left side. The right
side of the line is a set of possible values for the least significant
4 bits of the decoded byte. Together with the results of the previous
step, only a single possibility is left.

@display
@t{03cf} @result{} @t{028a}
@t{12de} @result{} @t{139b}
@t{478b} @result{} @t{46ce}
@t{569a} @result{} @t{57df}
@end display
@end enumerate
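
The table lookups above translate directly into set intersections. The following Python sketch is one possible rendering, not a reference implementation: the table and function names are hypothetical, and each table is stored as a mapping from the hex digits on the left side of a line to the candidate digits on its right side.

```python
# Candidate tables, keyed by the hex digits that select each line.
HIGH_A = {"2": "2367", "3": "0145", "47": "89cd", "56": "abef"}
HIGH_B = {"2": "139b", "3": "028a", "47": "46ce", "56": "57df"}
LOW_A = {"03cf": "0145", "12de": "2367", "478b": "89cd", "569a": "abef"}
LOW_B = {"03cf": "028a", "12de": "139b", "478b": "46ce", "569a": "57df"}


def _candidates(table: dict, nibble: int) -> set:
    """Return the set of candidate hex digits for a 4-bit value."""
    digit = format(nibble, "x")
    for left, right in table.items():
        if digit in left:
            return set(right)
    raise ValueError(f"nibble {digit} does not appear in the table")


def _resolve(table_a: dict, nib_a: int, table_b: dict, nib_b: int) -> int:
    """Intersect both candidate sets; exactly one digit must remain."""
    common = _candidates(table_a, nib_a) & _candidates(table_b, nib_b)
    if len(common) != 1:
        raise ValueError("pair does not decode to a unique value")
    return int(common.pop(), 16)


def decode_pair(a: int, b: int) -> int:
    """Decode two ASCII codes into one plaintext byte."""
    high = _resolve(HIGH_A, a >> 4, HIGH_B, b >> 4)
    low = _resolve(LOW_A, a & 0xF, LOW_B, b & 0xF)
    return (high << 4) | low


def decode_password(encoded: str) -> bytes:
    """Decode an encoded password of even length, at most 20 characters."""
    if len(encoded) % 2 != 0 or len(encoded) > 20:
        raise ValueError("encoded password has an invalid length")
    codes = encoded.encode("ascii")
    return bytes(decode_pair(codes[i], codes[i + 1])
                 for i in range(0, len(codes), 2))
```

As a quick check, @code{decode_pair(0x2d, 0x7c)}, that is, the character pair @samp{-|}, returns 0x62.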

Consider the encoded character pair @samp{-|}. @var{a} is
0x2d and @var{b} is 0x7c, so @var{ah} is 2, @var{bh} is 7, @var{al} is
0xd, and @var{bl} is 0xc. @var{ah} means that the most significant
four bits of the decoded character are 2, 3, 6, or 7, and @var{bh}
means that they are 4, 6, 0xc, or 0xe. The single possibility in
common is 6, so the most significant four bits are 6. Similarly,
@var{al} means that the least significant four bits are 2, 3, 6, or 7,
and @var{bl} means they are 0, 2, 8, or 0xa, so the least significant
four bits are 2. The decoded character is therefore 0x62, the letter
@samp{b}.