From: Ben Pfaff Date: Fri, 9 May 2025 21:04:01 +0000 (-0700) Subject: work on manual X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=b10ceab2a9db03a3776fbbaeaf53e768317af6a4;p=pspp work on manual --- diff --git a/rust/doc/src/SUMMARY.md b/rust/doc/src/SUMMARY.md index ab2ede3ab6..fc4f7691ee 100644 --- a/rust/doc/src/SUMMARY.md +++ b/rust/doc/src/SUMMARY.md @@ -195,8 +195,6 @@ - [SPSS Viewer File Format](spv/index.md) - [Structure Members](spv/structure.md) - [Light Detail Members](spv/light-detail.md) -- [Encrypted File Wrappers](encrypted-wrapper/index.md) - - [Common Wrapper Format](encrypted-wrapper/common-wrapper-format.md) - - [Password Encoding](encrypted-wrapper/password-encoding.md) +- [Encrypted File Wrappers](encrypted-wrapper.md) - [SPSS Portable File Format](portable.md) - [SPSS/PC+ System File Format](pc+.md) \ No newline at end of file diff --git a/rust/doc/src/encrypted-wrapper.md b/rust/doc/src/encrypted-wrapper.md new file mode 100644 index 0000000000..1b8994a086 --- /dev/null +++ b/rust/doc/src/encrypted-wrapper.md @@ -0,0 +1,185 @@ +# Encrypted File Wrappers + +SPSS 21 and later can package multiple kinds of files inside an +encrypted wrapper. The wrapper has a common format, regardless of the +kind of the file that it contains. + +> ⚠️ Warning: The SPSS encryption wrapper is poorly designed. When the +password is unknown, it is much cheaper and faster to decrypt a file +encrypted this way than if a well designed alternative were used. If +you must use this format, use a 10-byte randomly generated password. + +## Common Wrapper Format + +An encrypted file wrapper begins with the following 36-byte header, +where `xxx` identifies the type of file encapsulated: `SAV` for a system +file, `SPS` for a syntax file, `SPV` for a viewer file. PSPP code for +identifying these files just checks for the `ENCRYPTED` keyword at +offset 8, but the other bytes are also fixed in practice: + +``` +0000 1c 00 00 00 00 00 00 00 45 4e 43 52 59 50 54 45 |........ENCRYPTE| +0010 44 xx xx xx 15 00 00 00 00 00 00 00 00 00 00 00 |Dxxx............| +0020 00 00 00 00 |....| +``` + +Following the fixed header is essentially the regular contents of the +encapsulated file in its usual format, with each 16-byte block +encrypted with AES-256 in ECB mode. + +To make the plaintext an even multiple of 16 bytes in length, the +encryption process appends PKCS #7 padding, as specified in RFC 5652 +section 6.3. Padding appends 1 to 16 bytes to the plaintext, in which +each byte of padding is the number of padding bytes added. If the +plaintext is, for example, 2 bytes short of a multiple of 16, the +padding is 2 bytes with value 02; if the plaintext is a multiple of 16 +bytes in length, the padding is 16 bytes with value 0x10. + +The AES-256 key is derived from a password in the following way: + +1. Start from the literal password typed by the user. Truncate it to + at most 10 bytes, then append as many null bytes as necessary until + there are exactly 32 bytes. Call this `password`. + +2. Let `constant` be the following 73-byte constant: + + ``` + 0000 00 00 00 01 35 27 13 cc 53 a7 78 89 87 53 22 11 + 0010 d6 5b 31 58 dc fe 2e 7e 94 da 2f 00 cc 15 71 80 + 0020 0a 6c 63 53 00 38 c3 38 ac 22 f3 63 62 0e ce 85 + 0030 3f b8 07 4c 4e 2b 77 c7 21 f5 1a 80 1d 67 fb e1 + 0040 e1 83 07 d8 0d 00 00 01 00 + ``` + +3. Compute `CMAC-AES-256(password, constant)`. Call the 16-byte + result `cmac`. + +4. The 32-byte AES-256 key is `cmac || cmac`, that is, `cmac` repeated + twice. + +### Example + +Consider the password `pspp`. `password` is: + +``` +0000 70 73 70 70 00 00 00 00 00 00 00 00 00 00 00 00 |pspp............| +0010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| +``` + +`cmac` is: + +``` +0000 3e da 09 8e 66 04 d4 fd f9 63 0c 2c a8 6f b0 45 +``` + +The AES-256 key is: + +``` +0000 3e da 09 8e 66 04 d4 fd f9 63 0c 2c a8 6f b0 45 +0010 3e da 09 8e 66 04 d4 fd f9 63 0c 2c a8 6f b0 45 +``` + +### Checking Passwords + +A program reading an encrypted file may wish to verify that the +password it was given is the correct one. One way is to verify that +the PKCS #7 padding at the end of the file is well formed. However, +any plaintext that ends in byte 01 is well formed PKCS #7, meaning +that about 1 in 256 keys will falsely pass this test. This might be +acceptable for interactive use, but the false positive rate is too +high for a brute-force search of the password space. + +A better test requires some knowledge of the file format being +wrapped, to obtain a "magic number" for the beginning of the file. + +* The plaintext of system files begins with `$FL2@(#)` or `$FL3@(#)`. + +* Before encryption, a syntax file is prefixed with a line at the + beginning of the form `* Encoding: ENCODING.`, where ENCODING is the + encoding used for the rest of the file, e.g. `windows-1252`. Thus, + `* Encoding` may be used as a magic number for system files. + +* The plaintext of viewer files begins with `50 4b 03 04 14 00 08` (`50 + 4b` is `PK`). + +## Password Encoding + +SPSS also supports what it calls "encrypted passwords." + +> ⚠️ Warning: SPSS "encrypted passwords" are not encrypted. They are +encoded with a simple, fixed scheme and can be decoded to the original +password using the rules described below. + +An encoded password is always a multiple of 2 characters long, and +never longer than 20 characters. The characters in an encoded +password are always in the graphic ASCII range 33 through 126. Each +successive pair of characters in the password encodes a single byte in +the plaintext password. + +Use the following algorithm to decode a pair of characters: + +1. Let `a` be the ASCII code of the first character, and `b` be the + ASCII code of the second character. + +2. Let `ah` be the most significant 4 bits of `a`. Find the line in + the table below that has `ah` on the left side. The right side of + the line is a set of possible values for the most significant 4 + bits of the decoded byte. + + ``` + 2 ⇒ 2367 + 3 ⇒ 0145 + 47 ⇒ 89cd + 56 ⇒ abef + ``` + +3. Let `bh` be the most significant 4 bits of `b`. Find the line in + the second table below that has `bh` on the left side. The right + side of the line is a set of possible values for the most + significant 4 bits of the decoded byte. Together with the results + of the previous step, only a single possibility is left. + + ``` + 2 ⇒ 139b + 3 ⇒ 028a + 47 ⇒ 46ce + 56 ⇒ 57df + ``` + +4. Let `al` be the least significant 4 bits of `a`. Find the line in + the table below that has `al` on the left side. The right side of + the line is a set of possible values for the least significant 4 + bits of the decoded byte. + + ``` + 03cf ⇒ 0145 + 12de ⇒ 2367 + 478b ⇒ 89cd + 569a ⇒ abef + ``` + +5. Let `bl` be the least significant 4 bits of `b`. Find the line in + the table below that has `bl` on the left side. The right side of + the line is a set of possible values for the least significant 4 + bits of the decoded byte. Together with the results of the + previous step, only a single possibility is left. + + ``` + 03cf ⇒ 028a + 12de ⇒ 139b + 478b ⇒ 46ce + 569a ⇒ 57df + ``` + +### Example + +Consider the encoded character pair `-|`. `a` is 0x2d and `b` is +0x7c, so `ah` is 2, `bh` is 7, `al` is 0xd, and `bl` is 0xc. `ah` +means that the most significant four bits of the decoded character is +2, 3, 6, or 7, and `bh` means that they are 4, 6, 0xc, or 0xe. The +single possibility in common is 6, so the most significant four bits +are 6. Similarly, `al` means that the least significant four bits are +2, 3, 6, or 7, and `bl` means they are 0, 2, 8, or 0xa, so the least +significant four bits are 2. The decoded character is therefore 0x62, +the letter `b`. + diff --git a/rust/doc/src/encrypted-wrapper/common-wrapper-format.md b/rust/doc/src/encrypted-wrapper/common-wrapper-format.md deleted file mode 100644 index d7d6365f7a..0000000000 --- a/rust/doc/src/encrypted-wrapper/common-wrapper-format.md +++ /dev/null @@ -1,93 +0,0 @@ -# Common Wrapper Format - -An encrypted file wrapper begins with the following 36-byte header, -where `xxx` identifies the type of file encapsulated: `SAV` for a system -file, `SPS` for a syntax file, `SPV` for a viewer file. PSPP code for -identifying these files just checks for the `ENCRYPTED` keyword at -offset 8, but the other bytes are also fixed in practice: - -``` -0000 1c 00 00 00 00 00 00 00 45 4e 43 52 59 50 54 45 |........ENCRYPTE| -0010 44 xx xx xx 15 00 00 00 00 00 00 00 00 00 00 00 |Dxxx............| -0020 00 00 00 00 |....| -``` - -Following the fixed header is essentially the regular contents of the -encapsulated file in its usual format, with each 16-byte block -encrypted with AES-256 in ECB mode. - -To make the plaintext an even multiple of 16 bytes in length, the -encryption process appends PKCS #7 padding, as specified in RFC 5652 -section 6.3. Padding appends 1 to 16 bytes to the plaintext, in which -each byte of padding is the number of padding bytes added. If the -plaintext is, for example, 2 bytes short of a multiple of 16, the -padding is 2 bytes with value 02; if the plaintext is a multiple of 16 -bytes in length, the padding is 16 bytes with value 0x10. - -The AES-256 key is derived from a password in the following way: - -1. Start from the literal password typed by the user. Truncate it to - at most 10 bytes, then append as many null bytes as necessary until - there are exactly 32 bytes. Call this `password`. - -2. Let `constant` be the following 73-byte constant: - - ``` - 0000 00 00 00 01 35 27 13 cc 53 a7 78 89 87 53 22 11 - 0010 d6 5b 31 58 dc fe 2e 7e 94 da 2f 00 cc 15 71 80 - 0020 0a 6c 63 53 00 38 c3 38 ac 22 f3 63 62 0e ce 85 - 0030 3f b8 07 4c 4e 2b 77 c7 21 f5 1a 80 1d 67 fb e1 - 0040 e1 83 07 d8 0d 00 00 01 00 - ``` - -3. Compute `CMAC-AES-256(password, constant)`. Call the 16-byte - result `cmac`. - -4. The 32-byte AES-256 key is `cmac || cmac`, that is, `cmac` repeated - twice. - -## Example - -Consider the password `pspp`. `password` is: - -``` -0000 70 73 70 70 00 00 00 00 00 00 00 00 00 00 00 00 |pspp............| -0010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| -``` - -`cmac` is: - -``` -0000 3e da 09 8e 66 04 d4 fd f9 63 0c 2c a8 6f b0 45 -``` - -The AES-256 key is: - -``` -0000 3e da 09 8e 66 04 d4 fd f9 63 0c 2c a8 6f b0 45 -0010 3e da 09 8e 66 04 d4 fd f9 63 0c 2c a8 6f b0 45 -``` - -## Checking Passwords - -A program reading an encrypted file may wish to verify that the -password it was given is the correct one. One way is to verify that -the PKCS #7 padding at the end of the file is well formed. However, -any plaintext that ends in byte 01 is well formed PKCS #7, meaning -that about 1 in 256 keys will falsely pass this test. This might be -acceptable for interactive use, but the false positive rate is too -high for a brute-force search of the password space. - -A better test requires some knowledge of the file format being -wrapped, to obtain a "magic number" for the beginning of the file. - -* The plaintext of system files begins with `$FL2@(#)` or `$FL3@(#)`. - -* Before encryption, a syntax file is prefixed with a line at the - beginning of the form `* Encoding: ENCODING.`, where ENCODING is the - encoding used for the rest of the file, e.g. `windows-1252`. Thus, - `* Encoding` may be used as a magic number for system files. - -* The plaintext of viewer files begins with `50 4b 03 04 14 00 08` (`50 - 4b` is `PK`). - diff --git a/rust/doc/src/encrypted-wrapper/index.md b/rust/doc/src/encrypted-wrapper/index.md deleted file mode 100644 index 7a0c5b8f6b..0000000000 --- a/rust/doc/src/encrypted-wrapper/index.md +++ /dev/null @@ -1,10 +0,0 @@ -# Encrypted File Wrappers - -SPSS 21 and later can package multiple kinds of files inside an -encrypted wrapper. The wrapper has a common format, regardless of the -kind of the file that it contains. - -> ⚠️ Warning: The SPSS encryption wrapper is poorly designed. When the -password is unknown, it is much cheaper and faster to decrypt a file -encrypted this way than if a well designed alternative were used. If -you must use this format, use a 10-byte randomly generated password. diff --git a/rust/doc/src/encrypted-wrapper/password-encoding.md b/rust/doc/src/encrypted-wrapper/password-encoding.md deleted file mode 100644 index a123d77c18..0000000000 --- a/rust/doc/src/encrypted-wrapper/password-encoding.md +++ /dev/null @@ -1,77 +0,0 @@ -# Password Encoding - -SPSS also supports what it calls "encrypted passwords." These are not -encrypted. They are encoded with a simple, fixed scheme. An encoded -password is always a multiple of 2 characters long, and never longer -than 20 characters. The characters in an encoded password are always -in the graphic ASCII range 33 through 126. Each successive pair of -characters in the password encodes a single byte in the plaintext -password. - -Use the following algorithm to decode a pair of characters: - -1. Let `a` be the ASCII code of the first character, and `b` be the - ASCII code of the second character. - -2. Let `ah` be the most significant 4 bits of `a`. Find the line in - the table below that has `ah` on the left side. The right side of - the line is a set of possible values for the most significant 4 - bits of the decoded byte. - - ``` - 2 ⇒ 2367 - 3 ⇒ 0145 - 47 ⇒ 89cd - 56 ⇒ abef - ``` - -3. Let `bh` be the most significant 4 bits of `b`. Find the line in - the second table below that has `bh` on the left side. The right - side of the line is a set of possible values for the most - significant 4 bits of the decoded byte. Together with the results - of the previous step, only a single possibility is left. - - ``` - 2 ⇒ 139b - 3 ⇒ 028a - 47 ⇒ 46ce - 56 ⇒ 57df - ``` - -4. Let `al` be the least significant 4 bits of `a`. Find the line in - the table below that has `al` on the left side. The right side of - the line is a set of possible values for the least significant 4 - bits of the decoded byte. - - ``` - 03cf ⇒ 0145 - 12de ⇒ 2367 - 478b ⇒ 89cd - 569a ⇒ abef - ``` - -5. Let `bl` be the least significant 4 bits of `b`. Find the line in - the table below that has `bl` on the left side. The right side of - the line is a set of possible values for the least significant 4 - bits of the decoded byte. Together with the results of the - previous step, only a single possibility is left. - - ``` - 03cf ⇒ 028a - 12de ⇒ 139b - 478b ⇒ 46ce - 569a ⇒ 57df - ``` - -## Example - -Consider the encoded character pair `-|`. `a` is 0x2d and `b` is -0x7c, so `ah` is 2, `bh` is 7, `al` is 0xd, and `bl` is 0xc. `ah` -means that the most significant four bits of the decoded character is -2, 3, 6, or 7, and `bh` means that they are 4, 6, 0xc, or 0xe. The -single possibility in common is 6, so the most significant four bits -are 6. Similarly, `al` means that the least significant four bits are -2, 3, 6, or 7, and `bl` means they are 0, 2, 8, or 0xa, so the least -significant four bits are 2. The decoded character is therefore 0x62, -the letter `b`. -