From: Ben Pfaff Date: Wed, 16 Sep 2015 06:00:31 +0000 (-0700) Subject: doc: Document encrypted syntax file format. X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=f5e337fe69d0ee75e26ffc4b2c571234e73fa6dd;p=pspp doc: Document encrypted syntax file format. Thanks to charlesjohnsont@outlook.com for providing an example. Bug #45974. --- diff --git a/doc/dev/encrypted-file-wrappers.texi b/doc/dev/encrypted-file-wrappers.texi new file mode 100644 index 0000000000..aa8b192b88 --- /dev/null +++ b/doc/dev/encrypted-file-wrappers.texi @@ -0,0 +1,209 @@ +@node Encrypted File Wrappers +@chapter Encrypted File Wrappers + +SPSS 21 and later can package multiple kinds of files inside an +encrypted wrapper. The wrapper has a common format, regardless of the +kind of the file that it contains. + +@quotation Warning +The SPSS encryption wrapper is poorly designed. It is much cheaper +and faster to decrypt a file encrypted this way than if a well +designed alternative were used. If you must use this format, use a +10-byte randomly generated password. +@end quotation + +@menu +* Common Wrapper Format:: +* Password Encoding:: +@end menu + +@node Common Wrapper Format +@section Common Wrapper Format + +This section describes the general format of an SPSS encrypted file +wrapper. The following sections describe the details for each kind of +encapsulated file. + +An encrypted file wrapper begins with the following 36-byte header, +where @i{xxx} identifies the type of file encapsulated, as described +in the following sections: + +@example +0000 1c 00 00 00 00 00 00 00 45 4e 43 52 59 50 54 45 |........ENCRYPTE| +0010 44 @i{xx} @i{xx} @i{xx} 15 00 00 00 00 00 00 00 00 00 00 00 |D@i{xxx}............| +0020 00 00 00 00 |....| +@end example + +Following the fixed header is essentially the regular contents of the +encapsulated file in its usual format, with each 16-byte block +encrypted with AES-256 in ECB mode. Each type of encapsulated file is +processed in a slightly different way before encryption, as described +in the following sections. The AES-256 key is derived from a password +in the following way: + +@enumerate +@item +Start from the literal password typed by the user. Truncate it to at +most 10 bytes, then append as many null bytes as necessary until there +are exactly 32 bytes. Call this @var{password}. + +@item +Let @var{constant} be the following 73-byte constant: + +@example +0000 00 00 00 01 35 27 13 cc 53 a7 78 89 87 53 22 11 +0010 d6 5b 31 58 dc fe 2e 7e 94 da 2f 00 cc 15 71 80 +0020 0a 6c 63 53 00 38 c3 38 ac 22 f3 63 62 0e ce 85 +0030 3f b8 07 4c 4e 2b 77 c7 21 f5 1a 80 1d 67 fb e1 +0040 e1 83 07 d8 0d 00 00 01 00 +@end example + +@item +Compute CMAC-AES-256(@var{password}, @var{constant}). Call the +16-byte result @var{cmac}. + +@item +The 32-byte AES-256 key is @var{cmac} || @var{cmac}, that is, +@var{cmac} repeated twice. +@end enumerate + +@subheading Example + +Consider the password @samp{pspp}. @var{password} is: + +@example +0000 70 73 70 70 00 00 00 00 00 00 00 00 00 00 00 00 |pspp............| +0010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| +@end example + +@noindent +@var{cmac} is: + +@example +0000 3e da 09 8e 66 04 d4 fd f9 63 0c 2c a8 6f b0 45 +@end example + +@noindent +The AES-256 key is: + +@example +0000 3e da 09 8e 66 04 d4 fd f9 63 0c 2c a8 6f b0 45 +0010 3e da 09 8e 66 04 d4 fd f9 63 0c 2c a8 6f b0 45 +@end example + +@menu +* Encrypted System Files:: +* Encrypted Syntax Files:: +@end menu + +@node Encrypted System Files +@subsection Encrypted System Files + +An encrypted system file uses @code{SAV} as the identifier in its +header. + +Before encryption, a system file is appended with as many null bytes +as needed (possibly zero) to make it a multiple of 16 bytes in length, +so that it fits exactly in a series of AES blocks. (This implies that +encrypted system files must always be compressed, because otherwise a +system file with only a single variable might appear to have an extra +case.) + +@node Encrypted Syntax Files +@subsection Encrypted Syntax Files + +An encrypted syntax file uses @code{SPS} as the identifier in its +header. + +Before encryption, a syntax file is prefixed with a line at the +beginning of the form @code{* Encoding: @var{encoding}.}, where +@var{encoding} is the encoding used for the rest of the file, +e.g. @code{windows-1252}. The syntax file is then appended with as +many bytes with value 04 as needed (possibly zero) to make it a +multiple of 16 bytes in length. + +@node Password Encoding +@section Password Encoding + +SPSS also supports what it calls ``encrypted passwords.'' These are +not encrypted. They are encoded with a simple, fixed scheme. An +encoded password is always a multiple of 2 characters long, and never +longer than 20 characters. The characters in an encoded password are +always in the graphic ASCII range 33 through 126. Each successive +pair of characters in the password encodes a single byte in the +plaintext password. + +Use the following algorithm to decode a pair of characters: + +@enumerate +@item +Let @var{a} be the ASCII code of the first character, and @var{b} be +the ASCII code of the second character. + +@item +Let @var{ah} be the most significant 4 bits of @var{a}. Find the line +in the table below that has @var{ah} on the left side. The right side +of the line is a set of possible values for the most significant 4 +bits of the decoded byte. + +@display +@t{2 } @result{} @t{2367} +@t{3 } @result{} @t{0145} +@t{47} @result{} @t{89cd} +@t{56} @result{} @t{abef} +@end display + +@item +Let @var{bh} be the most significant 4 bits of @var{b}. Find the line +in the second table below that has @var{bh} on the left side. The +right side of the line is a set of possible values for the most +significant 4 bits of the decoded byte. Together with the results of +the previous step, only a single possibility is left. + +@display +@t{2 } @result{} @t{139b} +@t{3 } @result{} @t{028a} +@t{47} @result{} @t{46ce} +@t{56} @result{} @t{57df} +@end display + +@item +Let @var{al} be the least significant 4 bits of @var{a}. Find the +line in the table below that has @var{al} on the left side. The right +side of the line is a set of possible values for the least significant +4 bits of the decoded byte. + +@display +@t{03cf} @result{} @t{0145} +@t{12de} @result{} @t{2367} +@t{478b} @result{} @t{89cd} +@t{569a} @result{} @t{abef} +@end display + +@item +Let @var{bl} be the least significant 4 bits of @var{b}. Find the +line in the table below that has @var{bl} on the left side. The right +side of the line is a set of possible values for the least significant +4 bits of the decoded byte. Together with the results of the previous +step, only a single possibility is left. + +@display +@t{03cf} @result{} @t{028a} +@t{12de} @result{} @t{139b} +@t{478b} @result{} @t{46ce} +@t{569a} @result{} @t{57df} +@end display +@end enumerate + +@subheading Example + +Consider the encoded character pair @samp{-|}. @var{a} is +0x2d and @var{b} is 0x7c, so @var{ah} is 2, @var{bh} is 7, @var{al} is +0xd, and @var{bl} is 0xc. @var{ah} means that the most significant +four bits of the decoded character is 2, 3, 6, or 7, and @var{bh} +means that they are 4, 6, 0xc, or 0xe. The single possibility in +common is 6, so the most significant four bits are 6. Similarly, +@var{al} means that the least significant four bits are 2, 3, 6, or 7, +and @var{bl} means they are 0, 2, 8, or 0xa, so the least significant +four bits are 2. The decoded character is therefore 0x62, the letter +@samp{b}. diff --git a/doc/dev/system-file-format.texi b/doc/dev/system-file-format.texi index 19a6c05747..ff4c603451 100644 --- a/doc/dev/system-file-format.texi +++ b/doc/dev/system-file-format.texi @@ -89,7 +89,6 @@ possible to artificially synthesize files that use different encodings * Other Informational Records:: * Dictionary Termination Record:: * Data Record:: -* Encrypted System Files:: @end menu @node System File Record Structure @@ -1693,165 +1692,3 @@ system file. @end table @setfilename ignored - -@node Encrypted System Files -@section Encrypted System Files - -SPSS 21 and later support an encrypted system file format. - -@quotation Warning -The SPSS encrypted file format is poorly designed. It is much cheaper -and faster to decrypt a file encrypted this way than if a well -designed alternative were used. If you must use this format, use a -10-byte randomly generated password. -@end quotation - -@subheading Encrypted File Format - -Encrypted system files begin with the following 36-byte fixed header: - -@example -0000 1c 00 00 00 00 00 00 00 45 4e 43 52 59 50 54 45 |........ENCRYPTE| -0010 44 53 41 56 15 00 00 00 00 00 00 00 00 00 00 00 |DSAV............| -0020 00 00 00 00 |....| -@end example - -Following the fixed header is a complete system file in the usual -format, except that each 16-byte block is encrypted with AES-256 in -ECB mode. The AES-256 key is derived from a password in the following -way: - -@enumerate -@item -Start from the literal password typed by the user. Truncate it to at -most 10 bytes, then append (between 1 and 22) null bytes until there -are exactly 32 bytes. Call this @var{password}. - -@item -Let @var{constant} be the following 73-byte constant: - -@example -0000 00 00 00 01 35 27 13 cc 53 a7 78 89 87 53 22 11 -0010 d6 5b 31 58 dc fe 2e 7e 94 da 2f 00 cc 15 71 80 -0020 0a 6c 63 53 00 38 c3 38 ac 22 f3 63 62 0e ce 85 -0030 3f b8 07 4c 4e 2b 77 c7 21 f5 1a 80 1d 67 fb e1 -0040 e1 83 07 d8 0d 00 00 01 00 -@end example - -@item -Compute CMAC-AES-256(@var{password}, @var{constant}). Call the -16-byte result @var{cmac}. - -@item -The 32-byte AES-256 key is @var{cmac} || @var{cmac}, that is, -@var{cmac} repeated twice. -@end enumerate - -@subsubheading Example - -Consider the password @samp{pspp}. @var{password} is: - -@example -0000 70 73 70 70 00 00 00 00 00 00 00 00 00 00 00 00 |pspp............| -0010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| -@end example - -@noindent -@var{cmac} is: - -@example -0000 3e da 09 8e 66 04 d4 fd f9 63 0c 2c a8 6f b0 45 -@end example - -@noindent -The AES-256 key is: - -@example -0000 3e da 09 8e 66 04 d4 fd f9 63 0c 2c a8 6f b0 45 -0010 3e da 09 8e 66 04 d4 fd f9 63 0c 2c a8 6f b0 45 -@end example - -@subheading Password Encoding - -SPSS also supports what it calls ``encrypted passwords.'' These are -not encrypted. They are encoded with a simple, fixed scheme. An -encoded password is always a multiple of 2 characters long, and never -longer than 20 characters. The characters in an encoded password are -always in the graphic ASCII range 33 through 126. Each successive -pair of characters in the password encodes a single byte in the -plaintext password. - -Use the following algorithm to decode a pair of characters: - -@enumerate -@item -Let @var{a} be the ASCII code of the first character, and @var{b} be -the ASCII code of the second character. - -@item -Let @var{ah} be the most significant 4 bits of @var{a}. Find the line -in the table below that has @var{ah} on the left side. The right side -of the line is a set of possible values for the most significant 4 -bits of the decoded byte. - -@display -@t{2 } @result{} @t{2367} -@t{3 } @result{} @t{0145} -@t{47} @result{} @t{89cd} -@t{56} @result{} @t{abef} -@end display - -@item -Let @var{bh} be the most significant 4 bits of @var{b}. Find the line -in the second table below that has @var{bh} on the left side. The -right side of the line is a set of possible values for the most -significant 4 bits of the decoded byte. Together with the results of -the previous step, only a single possibility is left. - -@display -@t{2 } @result{} @t{139b} -@t{3 } @result{} @t{028a} -@t{47} @result{} @t{46ce} -@t{56} @result{} @t{57df} -@end display - -@item -Let @var{al} be the least significant 4 bits of @var{a}. Find the -line in the table below that has @var{al} on the left side. The right -side of the line is a set of possible values for the least significant -4 bits of the decoded byte. - -@display -@t{03cf} @result{} @t{0145} -@t{12de} @result{} @t{2367} -@t{478b} @result{} @t{89cd} -@t{569a} @result{} @t{abef} -@end display - -@item -Let @var{bl} be the least significant 4 bits of @var{b}. Find the -line in the table below that has @var{bl} on the left side. The right -side of the line is a set of possible values for the least significant -4 bits of the decoded byte. Together with the results of the previous -step, only a single possibility is left. - -@display -@t{03cf} @result{} @t{028a} -@t{12de} @result{} @t{139b} -@t{478b} @result{} @t{46ce} -@t{569a} @result{} @t{57df} -@end display -@end enumerate - -@subsubheading Example - -Consider the encoded character pair @samp{-|}. @var{a} is -0x2d and @var{b} is 0x7c, so @var{ah} is 2, @var{bh} is 7, @var{al} is -0xd, and @var{bl} is 0xc. @var{ah} means that the most significant -four bits of the decoded character is 2, 3, 6, or 7, and @var{bh} -means that they are 4, 6, 0xc, or 0xe. The single possibility in -common is 6, so the most significant four bits are 6. Similarly, -@var{al} means that the least significant four bits are 2, 3, 6, or 7, -and @var{bl} means they are 0, 2, 8, or 0xa, so the least significant -four bits are 2. The decoded character is therefore 0x62, the letter -@samp{b}. diff --git a/doc/pspp-dev.texi b/doc/pspp-dev.texi index 80983a951f..07512fd88b 100644 --- a/doc/pspp-dev.texi +++ b/doc/pspp-dev.texi @@ -37,7 +37,7 @@ This manual is for GNU PSPP version @value{VERSION}, software for statistical analysis. -Copyright @copyright{} 1997, 1998, 2004, 2005, 2007, 2010, 2014 Free Software Foundation, Inc. +Copyright @copyright{} 1997, 1998, 2004, 2005, 2007, 2010, 2014, 2015 Free Software Foundation, Inc. @quotation Permission is granted to copy, distribute and/or modify this document @@ -84,6 +84,7 @@ Free Documentation License". * Portable File Format:: Format of PSPP portable files. * System File Format:: Format of PSPP system files. * SPSS/PC+ System File Format:: Format of SPSS/PC+ system files. +* Encrypted File Wrappers:: Common wrapper for encrypted SPSS files. * q2c Input Format:: Format of syntax accepted by q2c. * GNU Free Documentation License:: License for copying this manual. @@ -102,6 +103,7 @@ Free Documentation License". @include dev/portable-file-format.texi @include dev/system-file-format.texi @include dev/pc+-file-format.texi +@include dev/encrypted-file-wrappers.texi @include dev/q2c.texi @include fdl.texi