
Adding Native Encryption to ‘dump’, part 2: Cryptanalysis

This is the second of two articles on adding native encryption to the Unix ‘dump’ application. We conclude with a cryptanalysis of our options. N.B., this is an overview and not a formal paper.

Threat Analysis

DUMP files are most commonly used in two situations:

  • Backups which are used for disaster recovery and need only be kept for a week to a month. The main concern is unauthorized disclosure.
  • Archives which are used in legal proceedings and must be kept indefinitely. The main concerns are unauthorized disclosure, data integrity and non-repudiation.

The second case is more restrictive so it will be used in subsequent analysis.

A few requirements follow directly from the nature of our problem:

  • We cannot encrypt the full archive as a whole – besides the performance issues, archives are often multiple gigabytes in size and a single bit error could cause a massive loss of data.
  • We cannot run a third-party app such as GPG on individual files or tape segments due to the performance hits involved in launching an app thousands of times. We may be able to use their supporting libraries.
  • We must use standard algorithms.

Some typical threat scenarios are:

  • An employee leaves a backup tape in an unlocked car and it is stolen. (Note: “backup tape” includes laptops containing backup files.)
  • A rogue employee duplicates a backup tape and provides it to a third party.
  • A rogue employee replaces a valid backup tape with one provided by a third party and “restores” files from it.
  • A lawyer prepares to respond to a wrongful termination suit and retrieves the appropriate archive tapes. He does not have the necessary decryption keys.
  • Same lawyer – he retrieves the appropriate archive tapes but is unable to state with full confidence that the files have not been modified.
  • Same lawyer – he retrieves the appropriate archive tapes and believes they have been modified but has no way to prove it.
  • Same lawyer – but one of the archive tapes is missing.

(On rogue employees – everyone seeks to hire the best but all it takes is one person turned by the classics – cash, drugs or hookers.)

These scenarios show us the following requirements:

  • The archive must be encrypted with keys and algorithms appropriate for long-term (10+ year) storage. For archives written today, call it AES for symmetric encryption and 2048-bit keys for RSA encryption.
  • We should have a mechanism to rekey archives, if possible. (There may be legal reasons why archives can’t be rekeyed regardless of circumstances.)
  • The archive must provide digital signatures to detect modifications.

There are several additional practical requirements.

Key Management

The necessary key management for a solid encryption system is straightforward.

Session Key – a unique, random session key is created for each archive. The session key contains two (AES) keys: one encrypts the data, and the other encrypts a nonce to produce the initialization vector (IV) for the data encryption.

Nonce – a known value used to produce the initialization vector (IV). We have two good candidates – the inode value when performing per-file encryption and the tapea value when performing per-tape-segment encryption.
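
As a rough sketch of the two-key session key and the nonce-to-IV step, using only Python's standard library. The names SessionKey, new_session_key and derive_iv are hypothetical, and the 64-bit big-endian encoding of the nonce is an assumption. Because the standard library has no AES, HMAC-SHA256 truncated to 16 bytes stands in for the single AES block encryption described above; a real implementation would use the cipher itself.

```python
import hashlib
import hmac
import secrets
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionKey:
    """Hypothetical container for the two per-archive keys."""
    data_key: bytes  # encrypts the payload
    iv_key: bytes    # only ever used to turn a nonce into an IV


def new_session_key(key_bytes: int = 32) -> SessionKey:
    """Draw two independent random keys (32 bytes is AES-256 sized)."""
    return SessionKey(secrets.token_bytes(key_bytes),
                      secrets.token_bytes(key_bytes))


def derive_iv(iv_key: bytes, nonce: int, block_size: int = 16) -> bytes:
    """Map a known nonce (inode or tapea value) to a 16-byte IV.

    ASSUMPTION: truncated HMAC-SHA256 is a stand-in for "encrypt the
    nonce with the IV key"; it is not the AES step the text describes.
    """
    encoded = struct.pack(">Q", nonce)  # assumed 64-bit big-endian encoding
    return hmac.new(iv_key, encoded, hashlib.sha256).digest()[:block_size]
```

Since the IV is recomputed from the nonce and the IV key, it does not need to be stored on the tape, and someone without the session key cannot predict it.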

Key Encryption – the session key, encrypted by a public key, can be stored in either the standard TAPE segment header or in a new DUMP segment. In the former case the encrypted session key can be stored in every tape segment; otherwise it will need to be written once at the top of each volume. The session key should be encrypted by multiple public keys to facilitate recovery.

Digital Signatures I – the data must be digitally signed at the individual (per-file or per-tape-segment) level.

Digital Signatures II – the data must be digitally signed at the volume level.

Finally there’s the possibility of wanting no compression or authentication, just block-to-block substitution in the same fashion as disk-wide encryption. There are several solutions to this problem but it’s hard to foresee much demand for it.

Approach 1: Per-Tape Segment Encryption

This approach requires the least change to the existing format.

Encryption

  • Write the TAPE segment header. The header should contain extensions that indicate 1) that encryption has been used and 2) the encryption algorithm, e.g., AES/CBC/PKCS5Padding. (Anyone suggesting ECB will be taken out and shot.)
  • Create the initialization vector (IV) by encrypting the TAPE segment’s tapea value with the IV session key.
  • Write the payload of the tape segment to a buffer. Append an HMAC value for the payload for each private key. This provides non-repudiation when the value is a true public-key signature; a shared-secret HMAC alone only proves integrity.
  • Compress the results.
  • Encrypt the results using the data session key and the IV determined above. Use padding.
  • Write the encrypted data to the tape. This provides data secrecy.

The order of these steps is important to prevent information disclosure.
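
The ordering above can be sketched as a small pipeline. This is a minimal illustration, not an implementation: protect_segment, demo_sign and placeholder_encrypt are hypothetical names, the signer is an HMAC-SHA256 stand-in that gives integrity only, and the encryption step is an injected callable because Python's standard library has no AES. The placeholder used here does not encrypt anything.

```python
import hashlib
import hmac
import zlib
from typing import Callable


def protect_segment(
    payload: bytes,
    sign: Callable[[bytes], bytes],
    encrypt: Callable[[bytes], bytes],
) -> bytes:
    """Apply the steps in the listed order: authenticate, compress, encrypt."""
    signed = payload + sign(payload)    # append the integrity value
    compressed = zlib.compress(signed)  # compress while the data is still plaintext
    return encrypt(compressed)          # only ciphertext reaches the tape


def demo_sign(key: bytes) -> Callable[[bytes], bytes]:
    """Stand-in signer: HMAC-SHA256 (integrity only, NOT non-repudiation)."""
    return lambda data: hmac.new(key, data, hashlib.sha256).digest()


def placeholder_encrypt(data: bytes) -> bytes:
    """Placeholder for AES/CBC with the derived IV; performs NO encryption."""
    return data


if __name__ == "__main__":
    blob = protect_segment(b"A" * 4096, demo_sign(b"k" * 32), placeholder_encrypt)
    print(len(blob))  # far smaller than 4096 because the payload is repetitive
```

Compression has to come before encryption for a practical reason as well: well-encrypted data looks random and will not compress.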

Decryption

  • Read the TAPE segment header and check whether encryption has been used. Assuming it has been…
  • Create the initialization vector (IV) by encrypting the TAPE segment’s tapea value with the IV session key.
  • Read the encrypted data and attempt to decrypt it into a buffer. If padding has been used a bad encryption key will cause the decryption to fail. (Strictly speaking there’s a remote chance that decryption will still ‘succeed’ but that’s handled below.)
  • Decompress the buffer.
  • Attempt to verify the HMAC value(s) using one or more public keys.
  • If an HMAC matches then we have confidence that the contents of this tape segment are unmodified and we can write them to disk.
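
The restore side can be sketched as the same steps in reverse. Again this is only an illustration under assumptions: recover_segment and demo_verify are hypothetical names, a single 32-byte HMAC-SHA256 tag stands in for the appended value(s), and the decrypt callable is assumed to raise ValueError on bad padding.

```python
import hashlib
import hmac
import zlib
from typing import Callable, Optional, Sequence

TAG_LEN = 32  # length of the HMAC-SHA256 stand-in tag

Verifier = Callable[[bytes, bytes], bool]


def recover_segment(
    blob: bytes,
    decrypt: Callable[[bytes], bytes],
    verifiers: Sequence[Verifier],
) -> Optional[bytes]:
    """Decrypt, decompress, then verify; return the payload or None."""
    try:
        signed = zlib.decompress(decrypt(blob))
    except (ValueError, zlib.error):
        return None  # wrong key or corrupted segment
    if len(signed) < TAG_LEN:
        return None
    payload, tag = signed[:-TAG_LEN], signed[-TAG_LEN:]
    # One matching verifier is enough, mirroring "one or more public keys".
    if any(verify(payload, tag) for verify in verifiers):
        return payload
    return None


def demo_verify(key: bytes) -> Verifier:
    """Stand-in verifier matching an HMAC-SHA256 tag (integrity only)."""
    def verify(payload: bytes, tag: bytes) -> bool:
        expected = hmac.new(key, payload, hashlib.sha256).digest()
        return hmac.compare_digest(expected, tag)
    return verify
```

Nothing is returned to the caller until the tag has been checked, which covers the remote case where decryption with the wrong key appears to succeed.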

The primary benefit to this approach is that it requires the least change to the existing format. The primary drawback is that it requires the developer to be comfortable using cryptographic libraries correctly. That’s a serious hurdle – cryptographic libraries are notorious for nuances that can make the difference between a strong system and one that’s easily cracked.

Approach 2: Per-Tape Segment Encryption using OpenPGP Payloads

A slightly more complex approach is to replace the standard tape segment payload with an OpenPGP Message (RFC 4880) segment. Specifically we want to create a Sym. Encrypted Integrity Protected Data Packet that contains a Compressed Data Packet.
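
To give a feel for the framing involved, here is a sketch of the new-format packet header that RFC 4880 (section 4.2) puts in front of each packet body. The function name new_format_header is hypothetical; the tag numbers (18 for the Sym. Encrypted Integrity Protected Data Packet, 8 for the Compressed Data Packet) and the length encoding follow my reading of the RFC. A real implementation would normally leave this to an OpenPGP library.

```python
TAG_COMPRESSED_DATA = 8  # Compressed Data Packet
TAG_SEIPD = 18           # Sym. Encrypted Integrity Protected Data Packet


def new_format_header(tag: int, body_len: int) -> bytes:
    """Encode a new-format OpenPGP packet header for a body of known length."""
    if not 0 <= tag <= 63:
        raise ValueError("new-format tags are 6 bits wide")
    if body_len < 0:
        raise ValueError("body length must be non-negative")
    first = bytes([0xC0 | tag])  # top two bits set marks the new format
    if body_len < 192:
        return first + bytes([body_len])  # one-octet length
    if body_len <= 8383:
        n = body_len - 192                # two-octet length
        return first + bytes([(n >> 8) + 192, n & 0xFF])
    if body_len <= 0xFFFFFFFF:
        return first + b"\xff" + body_len.to_bytes(4, "big")  # five-octet length
    raise ValueError("body too large for a single definite-length packet")


if __name__ == "__main__":
    print(new_format_header(TAG_SEIPD, 1000).hex())  # d2c328
```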

If this approach is taken then the archive can also provide the keying material in OpenPGP packets.

The primary benefit to this approach is that open source libraries are widely available that implement the OpenPGP specification so developers are less likely to misuse the cryptographic libraries. The drawback is that developers must learn how to use a new library in a non-standard way.

Approach 3: Per-File Encryption using OpenPGP Payloads

A much more complex approach is to encrypt the contents of the file with one or more Sym. Encrypted Integrity Protected Data Packets. The compressed data will have to be suitably padded to a full data block.

The primary benefit to this approach is that a minimally modified ‘restore’ application can still work with these files. Extracted files will still be encrypted but with an easy change I believe they could be converted to standard PGP/GPG encrypted files. This provides substantially stronger security since it would permit relatively untrusted parties to extract encrypted files.

The primary drawback is that it requires a substantially more complex process and will probably lose information about ‘holes’ in the file.

N.B., if this approach is used the encrypted files should not include the standard PGP/GPG headers, key material, etc. since it is unnecessary and will only serve to bloat the archive.

Conclusion

The first approach follows the spirit of the existing format the best but it can be tricky to code correctly and will require a lot of work to develop the necessary key management tools.

The second approach is a compromise that introduces a new dependency but which allows the application to use standard PGP/GPG software for key management.

The final approach is arguably the most useful but would only be appropriate during a major refactoring. That should never be undertaken casually but the size of the dump/restore application is modest and adding support for a GUI (e.g., gnome) would require a refactoring anyway.

The best choice comes down to policy. If you see the DUMP format as dying as new filesystems are introduced then the best choice is probably #2 (due to the key management issues). If you see the format having a future and plan to introduce a GUI then the best choice is clearly #3. It is possible to do both (setting a header bit appropriately) but I believe that would introduce unnecessary confusion.

Sidenote: digital certificates and keystores are more likely to be used in a corporate environment. That’s a moot point though since it’s easy to get keypairs from either GPG/PGP keys or PKI keystores. The main thing is that there needs to be infrastructure to support key management; the mechanism used for it is somewhat irrelevant.

Addendum 9/21/2012

Another approach came to me shortly after publishing this page (naturally) – an OpenPGP packet could be used for each 1024-byte block instead of a full tape buffer. This will require minimal changes to the existing software but still allow individual encrypted files to be extracted. It also allows the INODE block to be written uncompressed. That results in a modest data leak but it will make it much easier to recover from a corrupted archive.
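
A quick back-of-the-envelope calculation shows what a per-block packet would cost. It assumes each packet must fit exactly in one 1024-byte block, that AES (16-byte blocks) is the cipher, and that new-format headers are used. The overhead figures come from my reading of RFC 4880 and the names are hypothetical, so treat the numbers as an estimate.

```python
BLOCK_SIZE = 1024    # block size from the text
HEADER_LEN = 1 + 2   # tag octet + two-octet length (bodies of 192..8383 bytes)
VERSION_LEN = 1      # version octet of the integrity protected packet
PREFIX_LEN = 16 + 2  # random prefix: one AES block plus two repeated octets
MDC_LEN = 2 + 20     # trailing MDC packet: 2-octet header + SHA-1 digest


def inner_capacity(block_size: int = BLOCK_SIZE) -> int:
    """Bytes left in one block for the inner (e.g. compressed data) packet."""
    return block_size - HEADER_LEN - VERSION_LEN - PREFIX_LEN - MDC_LEN


def blocks_needed(inner_bytes: int) -> int:
    """Blocks required to carry inner_bytes, rounding up."""
    capacity = inner_capacity()
    return -(-inner_bytes // capacity)  # ceiling division


if __name__ == "__main__":
    print(inner_capacity())          # 980
    print(blocks_needed(10 * 1024))  # 11
```

Under these assumptions the framing costs 44 bytes per block, a little over 4%, before counting the inner packet’s own header.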

