--- /dev/null
+# Common Wrapper Format
+
+An encrypted file wrapper begins with the following 36-byte header,
+where `xxx` identifies the type of file encapsulated: `SAV` for a system
+file, `SPS` for a syntax file, `SPV` for a viewer file. PSPP code for
+identifying these files just checks for the `ENCRYPTED` keyword at
+offset 8, but the other bytes are also fixed in practice:
+
+```
+0000 1c 00 00 00 00 00 00 00 45 4e 43 52 59 50 54 45 |........ENCRYPTE|
+0010 44 xx xx xx 15 00 00 00 00 00 00 00 00 00 00 00 |Dxxx............|
+0020 00 00 00 00 |....|
+```
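
As a rough illustration of the detection rule described above, the following Python sketch checks for the `ENCRYPTED` keyword at offset 8 and reads the three-byte type tag that follows it. The function name `wrapped_type` and the choice to reject unknown tags are assumptions made for this sketch; they do not describe PSPP's actual code.

```python
HEADER_LEN = 36          # length of the fixed header shown above
MAGIC_OFFSET = 8         # offset of the ENCRYPTED keyword
MAGIC = b"ENCRYPTED"
TYPE_OFFSET = MAGIC_OFFSET + len(MAGIC)   # 17: the three `xxx` bytes
KNOWN_TYPES = {b"SAV": "system file", b"SPS": "syntax file", b"SPV": "viewer file"}


def wrapped_type(header: bytes):
    """Return "SAV", "SPS", or "SPV" if `header` looks like an encrypted
    wrapper header, otherwise None (hypothetical helper)."""
    if len(header) < HEADER_LEN:
        return None
    if header[MAGIC_OFFSET:TYPE_OFFSET] != MAGIC:
        return None
    tag = header[TYPE_OFFSET:TYPE_OFFSET + 3]
    # Rejecting unknown tags is a choice made for this sketch only.
    return tag.decode("ascii") if tag in KNOWN_TYPES else None


# Build the header from the hex dump above with xxx = SAV.
example = (bytes.fromhex("1c00000000000000") + b"ENCRYPTED" + b"SAV"
           + bytes.fromhex("15000000") + bytes(12))
print(wrapped_type(example))
```

The sketch deliberately ignores the other header bytes, since only the keyword is described as being checked.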
+
+Following the fixed header is essentially the regular contents of the
+encapsulated file in its usual format, with each 16-byte block
+encrypted with AES-256 in ECB mode.
+
+To make the plaintext a multiple of 16 bytes in length, the
+encryption process appends PKCS #7 padding, as specified in RFC 5652
+section 6.3. Padding appends 1 to 16 bytes to the plaintext, and
+each byte of padding holds the number of padding bytes added. If the
+plaintext is, for example, 2 bytes short of a multiple of 16, the
+padding is 2 bytes with value 0x02; if the plaintext is already a
+multiple of 16 bytes in length, the padding is 16 bytes with value 0x10.
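
The padding rule can be sketched in a few lines of Python. The helper names `pkcs7_pad` and `pkcs7_unpad` are illustrative only; the logic follows the description above with a 16-byte block.

```python
BLOCK = 16  # AES block size in bytes


def pkcs7_pad(plaintext: bytes) -> bytes:
    """Append 1 to 16 bytes, each holding the number of bytes appended."""
    n = BLOCK - len(plaintext) % BLOCK   # always in 1..16, never 0
    return plaintext + bytes([n]) * n


def pkcs7_unpad(padded: bytes) -> bytes:
    """Strip the padding, raising ValueError if it is not well formed."""
    if not padded or len(padded) % BLOCK != 0:
        raise ValueError("length is not a positive multiple of 16")
    n = padded[-1]
    if not 1 <= n <= BLOCK or padded[-n:] != bytes([n]) * n:
        raise ValueError("malformed PKCS #7 padding")
    return padded[:-n]
```

For a 14-byte plaintext, `pkcs7_pad` appends two `0x02` bytes; for a 16-byte plaintext it appends a full block of sixteen `0x10` bytes.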
+
+The AES-256 key is derived from a password in the following way:
+
+1. Start from the literal password typed by the user. Truncate it to
+ at most 10 bytes, then append as many null bytes as necessary until
+ there are exactly 32 bytes. Call this `password`.
+
+2. Let `constant` be the following 73-byte constant:
+
+ ```
+ 0000 00 00 00 01 35 27 13 cc 53 a7 78 89 87 53 22 11
+ 0010 d6 5b 31 58 dc fe 2e 7e 94 da 2f 00 cc 15 71 80
+ 0020 0a 6c 63 53 00 38 c3 38 ac 22 f3 63 62 0e ce 85
+ 0030 3f b8 07 4c 4e 2b 77 c7 21 f5 1a 80 1d 67 fb e1
+ 0040 e1 83 07 d8 0d 00 00 01 00
+ ```
+
+3. Compute `CMAC-AES-256(password, constant)`, using `password` as the
+ 32-byte key and `constant` as the message. Call the 16-byte
+ result `cmac`.
+
+4. The 32-byte AES-256 key is `cmac || cmac`, that is, `cmac` repeated
+ twice.
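
The four steps above can be sketched as follows. Python's standard library has no AES, so this sketch takes the CMAC primitive as a parameter: `cmac_aes256` is a hypothetical callable `(key, message) -> 16 bytes` that the caller must supply, for example one built on a third-party cryptography library. The names `pad_password` and `derive_key` are likewise assumptions of this sketch.

```python
from typing import Callable

# The 73-byte constant from step 2.
CONSTANT = bytes.fromhex(
    "00 00 00 01 35 27 13 cc 53 a7 78 89 87 53 22 11 "
    "d6 5b 31 58 dc fe 2e 7e 94 da 2f 00 cc 15 71 80 "
    "0a 6c 63 53 00 38 c3 38 ac 22 f3 63 62 0e ce 85 "
    "3f b8 07 4c 4e 2b 77 c7 21 f5 1a 80 1d 67 fb e1 "
    "e1 83 07 d8 0d 00 00 01 00"
)


def pad_password(password: bytes) -> bytes:
    """Step 1: truncate to at most 10 bytes, then null-pad to 32 bytes."""
    return password[:10].ljust(32, b"\0")


def derive_key(password: bytes,
               cmac_aes256: Callable[[bytes, bytes], bytes]) -> bytes:
    """Steps 1-4. `cmac_aes256(key, message)` is supplied by the caller;
    the argument order mirrors the notation in step 3."""
    cmac = cmac_aes256(pad_password(password), CONSTANT)
    if len(cmac) != 16:
        raise ValueError("CMAC-AES-256 must return 16 bytes")
    return cmac + cmac   # step 4: `cmac` repeated twice
```

Taking the password as `bytes` sidesteps the question of character encoding, which the description above does not specify.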
+
+## Example
+
+Consider the password `pspp`. `password` is:
+
+```
+0000 70 73 70 70 00 00 00 00 00 00 00 00 00 00 00 00 |pspp............|
+0010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
+```
+
+`cmac` is:
+
+```
+0000 3e da 09 8e 66 04 d4 fd f9 63 0c 2c a8 6f b0 45
+```
+
+The AES-256 key is:
+
+```
+0000 3e da 09 8e 66 04 d4 fd f9 63 0c 2c a8 6f b0 45
+0010 3e da 09 8e 66 04 d4 fd f9 63 0c 2c a8 6f b0 45
+```
+
+## Checking Passwords
+
+A program reading an encrypted file may wish to verify that the
+password it was given is the correct one. One way is to verify that
+the PKCS #7 padding at the end of the file is well formed. However,
+any plaintext that ends in byte 0x01 has well-formed PKCS #7 padding, so
+about 1 in 256 wrong keys will falsely pass this test. This might be
+acceptable for interactive use, but the false positive rate is too
+high for a brute-force search of the password space.
+
+A better test requires some knowledge of the file format being
+wrapped, to obtain a "magic number" for the beginning of the file.
+
+* The plaintext of system files begins with `$FL2@(#)` or `$FL3@(#)`.
+
+* Before encryption, a syntax file is prefixed with a line at the
+ beginning of the form `* Encoding: ENCODING.`, where ENCODING is the
+ encoding used for the rest of the file, e.g. `windows-1252`. Thus,
+ `* Encoding` may be used as a magic number for syntax files.
+
+* The plaintext of viewer files begins with `50 4b 03 04 14 00 08` (`50
+ 4b` is `PK`).
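
A minimal sketch of the magic-number test, assuming the caller has already decrypted the first 16-byte block with a candidate key. The table `MAGICS` and the function `plausible_plaintext` are names invented for this example; the byte strings come from the list above.

```python
# Expected plaintext prefixes, keyed by the type tag from the wrapper header.
MAGICS = {
    "SAV": (b"$FL2@(#)", b"$FL3@(#)"),
    "SPS": (b"* Encoding",),
    "SPV": (bytes.fromhex("50 4b 03 04 14 00 08"),),
}


def plausible_plaintext(file_type: str, first_block: bytes) -> bool:
    """Return True if the decrypted first block starts with a magic number
    expected for `file_type` ("SAV", "SPS", or "SPV")."""
    # bytes.startswith accepts a tuple and matches any of its members.
    return first_block.startswith(MAGICS[file_type])
```

Every magic number listed here fits within a single 16-byte block, so a candidate key can be screened after decrypting only the first block of ciphertext.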
+
--- /dev/null
+# Password Encoding
+
+SPSS also supports what it calls "encrypted passwords." These are not
+encrypted. They are encoded with a simple, fixed scheme. An encoded
+password is always a multiple of 2 characters long, and never longer
+than 20 characters. The characters in an encoded password are always
+in the graphic ASCII range 33 through 126. Each successive pair of
+characters in the encoded password encodes a single byte of the
+plaintext password.
+
+Use the following algorithm to decode a pair of characters:
+
+1. Let `a` be the ASCII code of the first character, and `b` be the
+ ASCII code of the second character.
+
+2. Let `ah` be the most significant 4 bits of `a`. Find the line in
+ the table below that has `ah` on the left side (each hex digit on
+ the left is one possible value of `ah`). The right side of the line
+ is the set of possible values for the most significant 4 bits of
+ the decoded byte.
+
+ ```
+ 2 ⇒ 2367
+ 3 ⇒ 0145
+ 47 ⇒ 89cd
+ 56 ⇒ abef
+ ```
+
+3. Let `bh` be the most significant 4 bits of `b`. Find the line in
+ the table below that has `bh` on the left side. The right
+ side of the line is a set of possible values for the most
+ significant 4 bits of the decoded byte. Together with the results
+ of the previous step, only a single possibility is left.
+
+ ```
+ 2 ⇒ 139b
+ 3 ⇒ 028a
+ 47 ⇒ 46ce
+ 56 ⇒ 57df
+ ```
+
+4. Let `al` be the least significant 4 bits of `a`. Find the line in
+ the table below that has `al` on the left side. The right side of
+ the line is a set of possible values for the least significant 4
+ bits of the decoded byte.
+
+ ```
+ 03cf ⇒ 0145
+ 12de ⇒ 2367
+ 478b ⇒ 89cd
+ 569a ⇒ abef
+ ```
+
+5. Let `bl` be the least significant 4 bits of `b`. Find the line in
+ the table below that has `bl` on the left side. The right side of
+ the line is a set of possible values for the least significant 4
+ bits of the decoded byte. Together with the results of the
+ previous step, only a single possibility is left.
+
+ ```
+ 03cf ⇒ 028a
+ 12de ⇒ 139b
+ 478b ⇒ 46ce
+ 569a ⇒ 57df
+ ```
+
+## Example
+
+Consider the encoded character pair `-|`. `a` is 0x2d and `b` is
+0x7c, so `ah` is 2, `bh` is 7, `al` is 0xd, and `bl` is 0xc. `ah`
+means that the most significant four bits of the decoded character are
+2, 3, 6, or 7, and `bh` means that they are 4, 6, 0xc, or 0xe. The
+single possibility in common is 6, so the most significant four bits
+are 6. Similarly, `al` means that the least significant four bits are
+2, 3, 6, or 7, and `bl` means they are 0, 2, 8, or 0xa, so the least
+significant four bits are 2. The decoded character is therefore 0x62,
+the letter `b`.
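
The decoding algorithm can be sketched in Python by transcribing the four tables and intersecting the candidate sets. All names here (`A_HIGH`, `candidates`, `decode_pair`, `decode_password`, and so on) are invented for this sketch, and the length checks simply enforce the limits stated at the top of this section.

```python
# Each entry pairs the nibble values of the encoded character (left side of
# a table line) with the candidate nibbles of the decoded byte (right side).
A_HIGH = [("2", "2367"), ("3", "0145"), ("47", "89cd"), ("56", "abef")]
B_HIGH = [("2", "139b"), ("3", "028a"), ("47", "46ce"), ("56", "57df")]
A_LOW = [("03cf", "0145"), ("12de", "2367"), ("478b", "89cd"), ("569a", "abef")]
B_LOW = [("03cf", "028a"), ("12de", "139b"), ("478b", "46ce"), ("569a", "57df")]


def candidates(table, nibble):
    """Look up `nibble` on the left side; return the right side as a set."""
    digit = format(nibble, "x")
    for left, right in table:
        if digit in left:
            return {int(d, 16) for d in right}
    raise ValueError(f"nibble {digit} does not occur in an encoded password")


def decode_nibble(a_table, b_table, a_nibble, b_nibble):
    """Intersect the two candidate sets; exactly one value should remain."""
    common = candidates(a_table, a_nibble) & candidates(b_table, b_nibble)
    (value,) = common
    return value


def decode_pair(a: int, b: int) -> int:
    """Decode two ASCII codes into one plaintext byte (steps 1 to 5)."""
    high = decode_nibble(A_HIGH, B_HIGH, a >> 4, b >> 4)
    low = decode_nibble(A_LOW, B_LOW, a & 0xF, b & 0xF)
    return (high << 4) | low


def decode_password(encoded: str) -> bytes:
    """Decode a whole encoded password, two characters per output byte."""
    if len(encoded) % 2 != 0 or len(encoded) > 20:
        raise ValueError("encoded password must be an even length of at most 20")
    codes = encoded.encode("ascii")
    return bytes(decode_pair(codes[i], codes[i + 1])
                 for i in range(0, len(codes), 2))
```

With these definitions, `decode_password("-|")` returns `b"b"`, matching the worked example above.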
+