* Plain text output is no longer divided into pages, since it is now
rarely printed on paper.
- * pspp-convert: New "-a", "-l", "--password-list" options to search
- for an encrypted file's password.
+ * pspp-convert:
+
+ - New support to decrypt encrypted viewer (SPV) files. The
+ encrypted viewer file format is unacceptably insecure, so to
+ discourage its use PSPP and PSPPIRE do not directly read or write
+ this format.
+
+ - New "-a", "-l", "--password-list" options to search for an
+ encrypted file's password.
* Improvements to SAVE DATA COLLECTION support for MDD files.
kind of the file that it contains.
@quotation Warning
-The SPSS encryption wrapper is poorly designed. It is much cheaper
-and faster to decrypt a file encrypted this way than if a well
-designed alternative were used. If you must use this format, use a
-10-byte randomly generated password.
+The SPSS encryption wrapper is poorly designed. When the password is
+unknown, it is much cheaper and faster to decrypt a file encrypted
+this way than if a well designed alternative were used. If you must
+use this format, use a 10-byte randomly generated password.
@end quotation
@menu
@node Common Wrapper Format
@section Common Wrapper Format
-This section describes the general format of an SPSS encrypted file
-wrapper. The following sections describe the details for each kind of
-encapsulated file.
-
An encrypted file wrapper begins with the following 36-byte header,
-where @i{xxx} identifies the type of file encapsulated, as described
-in the following sections:
+where @i{xxx} identifies the type of file encapsulated: @code{SAV} for
+a system file, @code{SPS} for a syntax file, @code{SPV} for a viewer
+file. PSPP code for identifying these files just checks for the
+@code{ENCRYPTED} keyword at offset 8, but the other bytes are also
+fixed in practice:
@example
0000 1c 00 00 00 00 00 00 00 45 4e 43 52 59 50 54 45 |........ENCRYPTE|
Following the fixed header is essentially the regular contents of the
encapsulated file in its usual format, with each 16-byte block
-encrypted with AES-256 in ECB mode. Each type of encapsulated file is
-processed in a slightly different way before encryption, as described
-in the following sections. The AES-256 key is derived from a password
-in the following way:
+encrypted with AES-256 in ECB mode.
+
+To make the plaintext an even multiple of 16 bytes in length, the
+encryption process appends PKCS #7 padding, as specified in RFC 5652
+section 6.3. Padding appends 1 to 16 bytes to the plaintext, in which
+each byte of padding is the number of padding bytes added. If the
+plaintext is, for example, 2 bytes short of a multiple of 16, the
+padding is 2 bytes with value 02; if the plaintext is a multiple of 16
+bytes in length, the padding is 16 bytes with value 0x10.
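As an illustration of the padding rule described above, here is a minimal sketch of the encryption-side computation. The helpers `pkcs7_pad_len` and `pkcs7_pad` are hypothetical names chosen for this example and are not part of PSPP; the patch itself only needs the reading side, which it implements in `check_padding`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { BLOCK_SIZE = 16 };       /* AES block size in bytes. */

/* Hypothetical helper, not part of PSPP: returns the number of PKCS #7
   padding bytes (between 1 and 16, inclusive) for LEN bytes of plaintext. */
static size_t
pkcs7_pad_len (size_t len)
{
  return BLOCK_SIZE - len % BLOCK_SIZE;
}

/* Hypothetical helper, not part of PSPP: appends PKCS #7 padding to the LEN
   bytes of plaintext in BUF, which has room for CAPACITY bytes in total.
   Returns the padded length, always a multiple of 16. */
static size_t
pkcs7_pad (uint8_t *buf, size_t len, size_t capacity)
{
  size_t pad = pkcs7_pad_len (len);
  assert (len + pad <= capacity);

  /* Each padding byte holds the number of padding bytes added. */
  memset (buf + len, (int) pad, pad);
  return len + pad;
}
```

With these definitions, a 14-byte plaintext gains two bytes of value 02, and a 32-byte plaintext gains a full block of sixteen bytes of value 0x10, matching the two cases given in the text.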
+
+The AES-256 key is derived from a password in the following way:
@enumerate
@item
@end example
@menu
-* Encrypted System Files::
-* Encrypted Syntax Files::
+* Checking Passwords::
@end menu
-@node Encrypted System Files
-@subsection Encrypted System Files
-
-An encrypted system file uses @code{SAV} as the identifier in its
-header.
+@node Checking Passwords
+@subsection Checking Passwords
-Before encryption, a system file is appended with as many null bytes
-as needed (possibly zero) to make it a multiple of 16 bytes in length,
-so that it fits exactly in a series of AES blocks. (This implies that
-encrypted system files must always be compressed, because otherwise a
-system file with only a single variable might appear to have an extra
-case.)
+A program reading an encrypted file may wish to verify that the
+password it was given is the correct one. One way is to verify that
+the PKCS #7 padding at the end of the file is well formed. However,
+any plaintext that ends in byte 01 is well formed PKCS #7, meaning
+that about 1 in 256 keys will falsely pass this test. This might be
+acceptable for interactive use, but the false positive rate is too
+high for a brute-force search of the password space.
-@node Encrypted Syntax Files
-@subsection Encrypted Syntax Files
+A better test requires some knowledge of the file format being
+wrapped, to obtain a ``magic number'' for the beginning of the file.
-An encrypted syntax file uses @code{SPS} as the identifier in its
-header.
+@itemize @bullet
+@item
+The plaintext of system files begins with @code{$FL2@@(#)} or
+@code{$FL3@@(#)}.
+@item
Before encryption, a syntax file is prefixed with a line at the
beginning of the form @code{* Encoding: @var{encoding}.}, where
@var{encoding} is the encoding used for the rest of the file,
-e.g. @code{windows-1252}. The syntax file is then appended with as
-many bytes with value 04 as needed (possibly zero) to make it a
-multiple of 16 bytes in length.
+e.g.@: @code{windows-1252}. Thus, @code{* Encoding} may be used as a
+magic number for syntax files.
+
+@item
+The plaintext of viewer files begins with 50 4b 03 04 14 00 08 (50 4b
+is @code{PK}).
+@end itemize
@node Password Encoding
@section Password Encoding
@end table
@command{pspp-convert} can convert most input formats to most output
-formats. Encrypted system file and syntax files are exceptions: if
-the input file is in an encrypted format, then the output file must be
-the same format (decrypted). To decrypt such a file, specify the
-encrypted file as @var{input}. The output will be the equivalent
-plaintext file.
+formats. Encrypted SPSS file formats are exceptions: if the input
+file is in an encrypted format, then the output file will be the same
+format (decrypted). To decrypt such a file, specify the encrypted
+file as @var{input}. The output will be the equivalent plaintext
+file. Options for the output format are ignored in this case.
The password for encrypted files can be specified a few different
ways. If the password is known, use the @option{-p} option
#include "libpspp/cast.h"
#include "libpspp/cmac-aes256.h"
#include "libpspp/message.h"
+#include "libpspp/str.h"
#include "gl/minmax.h"
#include "gl/rijndael-alg-fst.h"
struct encrypted_file
{
+ const struct file_handle *fh;
FILE *file;
- enum { SYSTEM, SYNTAX } type;
int error;
- uint8_t ciphertext[16];
- uint8_t plaintext[16];
- unsigned int ofs, n;
+ uint8_t ciphertext[256];
+ uint8_t plaintext[256];
+ unsigned int ofs, n, readable;
uint32_t rk[4 * (RIJNDAEL_MAXNR + 1)];
int Nr;
};
static bool decode_password (const char *input, char output[11]);
-static bool fill_buffer (struct encrypted_file *);
+static void fill_buffer (struct encrypted_file *);
/* If FILENAME names an encrypted SPSS file, returns 1 and initializes *FP
for further use by the caller.
encrypted_file_open (struct encrypted_file **fp, const struct file_handle *fh)
{
struct encrypted_file *f;
- char header[36 + 16];
+ enum { HEADER_SIZE = 36 };
+ char data[HEADER_SIZE + sizeof f->ciphertext];
int retval;
int n;
f = xmalloc (sizeof *f);
f->error = 0;
+ f->fh = fh;
f->file = fn_open (fh, "rb");
if (f->file == NULL)
{
goto error;
}
- n = fread (header, 1, sizeof header, f->file);
- if (n != sizeof header)
+ n = fread (data, 1, sizeof data, f->file);
+ if (n < HEADER_SIZE + 2 * 16)
{
int error = feof (f->file) ? 0 : errno;
if (error)
goto error;
}
- if (!memcmp (header + 8, "ENCRYPTEDSAV", 12))
- f->type = SYSTEM;
- else if (!memcmp (header + 8, "ENCRYPTEDSPS", 12))
- f->type = SYNTAX;
- else
+ if (memcmp (data + 8, "ENCRYPTED", 9))
{
retval = 0;
goto error;
}
- memcpy (f->ciphertext, header + 36, 16);
- f->n = 16;
+ f->n = n - HEADER_SIZE;
+ memcpy (f->ciphertext, data + HEADER_SIZE, f->n);
f->ofs = 0;
+ f->readable = 0;
*fp = f;
return 1;
uint8_t *buf = buf_;
size_t ofs = 0;
- if (f->error)
- return 0;
-
while (ofs < n)
{
- unsigned int chunk = MIN (n - ofs, f->n - f->ofs);
+ unsigned int chunk = MIN (n - ofs, f->readable - f->ofs);
if (chunk > 0)
{
memcpy (buf + ofs, &f->plaintext[f->ofs], chunk);
}
else
{
- if (!fill_buffer (f))
- return ofs;
+ fill_buffer (f);
+ if (!f->readable)
+ break;
}
}
int
encrypted_file_close (struct encrypted_file *f)
{
- int error = f->error;
+ int error = f->error > 0 ? f->error : 0;
if (fclose (f->file) == EOF && !error)
error = errno;
free (f);
return error;
}
-
-/* Returns true if F is an encrypted system file,
- false if it is an encrypted syntax file. */
-bool
-encrypted_file_is_sav (const struct encrypted_file *f)
-{
- return f->type == SYSTEM;
-}
\f
#define b(x) (1 << (x))
return true;
}
+/* Check for magic number at beginning of plaintext decrypted from F. */
+static bool
+is_good_magic (const struct encrypted_file *f)
+{
+ char plaintext[16];
+ rijndaelDecrypt (f->rk, f->Nr, CHAR_CAST (const char *, f->ciphertext),
+ plaintext);
+
+ const struct substring magic[] = {
+ ss_cstr ("$FL2@(#)"),
+ ss_cstr ("$FL3@(#)"),
+ ss_cstr ("* Encoding"),
+ ss_buffer ("PK\3\4\x14\0\x8", 7)
+ };
+ for (size_t i = 0; i < sizeof magic / sizeof *magic; i++)
+ if (ss_equals (ss_buffer (plaintext, magic[i].length), magic[i]))
+ return true;
+ return false;
+}
+
/* Attempts to use plaintext password PASSWORD to unlock F. Returns true if
successful, otherwise false. */
bool
assert (sizeof key == 32);
f->Nr = rijndaelKeySetupDec (f->rk, CHAR_CAST (const char *, key), 256);
- /* Check for magic number at beginning of plaintext. */
- rijndaelDecrypt (f->rk, f->Nr,
- CHAR_CAST (const char *, f->ciphertext),
- CHAR_CAST (char *, f->plaintext));
+ if (!is_good_magic (f))
+ return false;
- const char *magic = f->type == SYSTEM ? "$FL?@(#)" : "* Encoding";
- for (int i = 0; magic[i]; i++)
- if (magic[i] != '?' && f->plaintext[i] != magic[i])
- return false;
+ fill_buffer (f);
return true;
}
-static bool
+/* Checks the 16 bytes of PLAINTEXT for PKCS#7 padding bytes. Returns the
+ number of padding bytes (between 1 and 16, inclusive), if well formed,
+ otherwise 0. */
+static int
+check_padding (const uint8_t *plaintext)
+{
+ uint8_t pad = plaintext[15];
+ if (pad < 1 || pad > 16)
+ return 0;
+
+ for (size_t i = 1; i < pad; i++)
+ if (plaintext[15 - i] != pad)
+ return 0;
+
+ return pad;
+}
+
+static void
fill_buffer (struct encrypted_file *f)
{
- f->n = fread (f->ciphertext, 1, sizeof f->ciphertext, f->file);
+ /* Move bytes between f->ciphertext[f->readable] and f->ciphertext[f->n] to
+ the beginning of f->ciphertext.
+
+ The first time this is called for a given file, it does nothing because
+ f->readable is initially 0. After that, in steady state f->readable is 16
+ less than f->n, so the final 16 bytes of ciphertext become the first 16
+ bytes. This is necessary because we don't know until we hit end-of-file
+ whether padding in the last 16 bytes will require us to discard up to 16
+ bytes of data. */
+ memmove (f->ciphertext, f->ciphertext + f->readable, f->n - f->readable);
+ f->n -= f->readable;
+ f->readable = 0;
f->ofs = 0;
- if (f->n == sizeof f->ciphertext)
+
+ if (f->error) /* or assert(!f->error)? */
+ return;
+
+ /* Read new ciphertext, extending f->n, until we've filled up f->ciphertext
+ or until we reach end-of-file or encounter an error.
+
+ Afterward, f->error indicates what happened. */
+ while (f->n < sizeof f->ciphertext)
{
- rijndaelDecrypt (f->rk, f->Nr,
- CHAR_CAST (const char *, f->ciphertext),
- CHAR_CAST (char *, f->plaintext));
- if (f->type == SYNTAX)
+ size_t retval = fread (f->ciphertext + f->n, 1,
+ sizeof f->ciphertext - f->n, f->file);
+ if (!retval)
{
- const char *eof = memchr (f->plaintext, '\04', sizeof f->plaintext);
- if (eof)
- f->n = CHAR_CAST (const uint8_t *, eof) - f->plaintext;
+ f->error = ferror (f->file) ? errno : EOF;
+ break;
}
- return true;
+ f->n += retval;
+ }
+
+ /* Calculate the number of readable bytes. If we're at the end of the file,
+ then we can read everything, otherwise we hold back the last 16 bytes
+ because they might be padding or not. */
+ if (!f->error)
+ {
+ assert (f->n == sizeof f->ciphertext);
+ f->readable = f->n - 16;
}
else
+ f->readable = f->n;
+
+ /* If we have an incomplete block then trim it off and complain. */
+ unsigned int overhang = f->readable % 16;
+ if (overhang)
{
- if (ferror (f->file))
- f->error = errno;
- return false;
+ assert (f->error);
+ msg (ME, _("%s: encrypted file corrupted (ends in incomplete %u-byte "
+ "ciphertext block)"),
+ fh_get_file_name (f->fh), overhang);
+ f->error = EIO;
+ f->readable -= overhang;
+ }
+
+ /* Decrypt all the blocks we have. */
+ for (size_t ofs = 0; ofs < f->readable; ofs += 16)
+ rijndaelDecrypt (f->rk, f->Nr,
+ CHAR_CAST (const char *, f->ciphertext + ofs),
+ CHAR_CAST (char *, f->plaintext + ofs));
+
+ /* If we're at end of file then check the padding and trim it off. */
+ if (f->error == EOF)
+ {
+ unsigned int pad = check_padding (&f->plaintext[f->n - 16]);
+ if (!pad)
+ {
+ msg (ME, _("%s: encrypted file corrupted (ends with bad padding)"),
+ fh_get_file_name (f->fh));
+ f->error = EIO;
+ return;
+ }
+
+ f->readable -= pad;
}
}
size_t encrypted_file_read (struct encrypted_file *, void *, size_t);
int encrypted_file_close (struct encrypted_file *);
-bool encrypted_file_is_sav (const struct encrypted_file *);
-
#endif /* encrypted-file.h */
tests/data/v13.sav \
tests/data/v14.sav \
tests/data/test-encrypted.sps \
+ tests/data/test-decrypted.spv \
+ tests/data/test-encrypted.spv \
tests/language/mann-whitney.txt \
tests/language/data-io/Book1.gnm.unzipped \
tests/language/data-io/test.ods \
])
AT_CLEANUP
+AT_SETUP([decrypt an encrypted viewer file])
+AT_KEYWORDS([viewer file decrypt pspp-convert spv])
+AT_CHECK([pspp-convert $srcdir/data/test-encrypted.spv test.spv -p Password1])
+AT_CHECK([cmp $srcdir/data/test-decrypted.spv test.spv])
+AT_CLEANUP
.
.PP
\fBpspp\-convert\fR can convert most input formats to most output
-formats. Encrypted system file and syntax files are exceptions: if
-the input file is in an encrypted format, then the output file must
-be the same format (decrypted).
+formats. Encrypted SPSS file formats are exceptions: if the input
+file is in an encrypted format, then the output file will be the same
+format (decrypted). Options for the output format are ignored in this
+case.
.
.SH "OPTIONS"
.SS "General Options"
output_fh = fh_create_file (NULL, output_filename, NULL, fh_default_properties ());
if (encrypted_file_open (&enc, input_fh) > 0)
{
- if (encrypted_file_is_sav (enc))
- {
- if (strcmp (output_format, "sav") && strcmp (output_format, "sys"))
- error (1, 0, _("can only convert encrypted data file to sav or "
- "sys format"));
- }
- else
- {
- if (strcmp (output_format, "sps"))
- error (1, 0, _("can only convert encrypted syntax file to sps "
- "format"));
- }
-
- if (!decrypt_file (enc, input_fh, output_fh, password,
+ if (decrypt_file (enc, input_fh, output_fh, password,
ds_cstr (&alphabet), length, password_list))
+ goto exit;
+ else
goto error;
-
- goto exit;
}