aboutsummaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorSven Vermeulen <sven.vermeulen@siphos.be>2013-12-11 21:51:46 +0100
committerSven Vermeulen <sven.vermeulen@siphos.be>2013-12-11 21:51:46 +0100
commitbaa48490bf79347f89906989d1c2b7db4c38d05f (patch)
tree55473b077102c002dc00499091c14ba6e05a5e4b
parentFix QUOTA check (better output) (diff)
downloadhardened-docs-baa48490bf79347f89906989d1c2b7db4c38d05f.tar.gz
hardened-docs-baa48490bf79347f89906989d1c2b7db4c38d05f.tar.bz2
hardened-docs-baa48490bf79347f89906989d1c2b7db4c38d05f.zip
Now on wiki
-rw-r--r--xml/integrity/concepts.xml685
1 files changed, 0 insertions, 685 deletions
diff --git a/xml/integrity/concepts.xml b/xml/integrity/concepts.xml
deleted file mode 100644
index c9f6313..0000000
--- a/xml/integrity/concepts.xml
+++ /dev/null
@@ -1,685 +0,0 @@
-<?xml version='1.0' encoding='UTF-8'?>
-<!DOCTYPE guide SYSTEM "/dtd/guide.dtd">
-<!-- $Header$ -->
-
-<guide lang="en">
-<title>Integrity - Introduction and Concepts</title>
-
-<author title="Author">
- <mail link="swift"/>
-</author>
-
-<abstract>
-Integrity validation is a wide field in which many technologies play a role.
-This guide aims to offer a high-level view on what integrity validation is all
-about and how the various technologies work together to achieve a (hopefully)
-more secure environment to work in.
-</abstract>
-
-<!-- The content of this document is licensed under the CC-BY-SA license -->
-<!-- See http://creativecommons.org/licenses/by-sa/3.0 -->
-<license version="3.0" />
-
-<version>2</version>
-<date>2012-08-14</date>
-
-<chapter>
-<title>It is about trust</title>
-<section>
-<title>Introduction</title>
-<body>
-
-<p>
-Integrity is about trusting components within your environment, and in our case
-the workstations, servers and machines you work on. You definitely want to be
-certain that the workstation you type your credentials on to log on to the
-infrastructure is not compromised in any way. This "trust" in your environment
-is a combination of various factors: physical security, system security patching
-process, secure configuration, access controls and more.
-</p>
-
-<p>
-Integrity plays a role in this security field: it tries to ensure that the
-systems have not been tampered with by malicious people or organizations. And
-this tamperproof-ness extends to a wide range of components that need to be
-validated. You probably want to be certain that the binaries that are ran (and
-libraries that are loaded) are those you built yourself (in case of Gentoo) or
-were provided to you by someone (or something) you trust. And that the Linux
-kernel you booted (and the modules that are loaded) are those you made, and not
-someone else.
-</p>
-
-<p>
-Most people trust themselves and look at integrity as if it needs to prove that
-things are still as you've built them. But to support this claim, the systems you
-use to ensure integrity need to be trusted too: you want to make sure that
-whatever system is in place to offer you the final yes/no on the integrity only
-uses trusted information (did it really validate the binary) and services (is it
-not running on a compromised system). To support these claims, many ideas,
-technologies, processes and algorithms have passed the review.
-</p>
-
-<p>
-In this document, we will talk about a few of those, and how they play in the
-Gentoo Hardened Integrity subprojects' vision and roadmap.
-</p>
-
-</body>
-</section>
-</chapter>
-
-<chapter>
-<title>Hash results</title>
-<section>
-<title>Algorithmically validating a file's content</title>
-<body>
-
-<p>
-Hashes are a primary method for validating if a file (or other resource) has
-not been changed since it was first inspected. A hash is the result of a
-mathematical calculation on the content of a file (most often a number or
-ordered set of numbers), and exhibits the following properties:
-</p>
-
-<ul>
- <li>
- The resulting number is represented in a <e>small (often fixed-size) length</e>.
- This is necessary to allow fast verification if two hash values are the same
- or not, but also to allow storing the value in a secure location (which is,
- more than often, much more restricted in space).
- </li>
- <li>
- The hash function always <e>returns the same hash</e> (output) when the file it
- inspects has not been changed (input). Otherwise it'll be impossible to
- ensure that the file content hasn't changed.
- </li>
- <li>
- The hash function is fast to run (the calculation of a hash result does not
- take up too much time or even resources). Without this property, it would
- take too long to generate and even validate hash results, leading to users
- being malcontent (and more likely to disable the validation alltogether).
- </li>
- <li>
- The hash result <e>cannot be used to reconstruct</e> the file. Although this is
- often seen as a result of the first property (small length), it is important
- because hash results are often also seen as a "public validation" of data
- that is otherwise private in nature. In other words, many processes relie on
- the inability of users (or hackers) to reverse-engineer information based on
- its hash result. A good example are passwords and password databases, which
- <e>should</e> store hashes of the passwords, not the passwords themselves.
- </li>
- <li>
- Given a hash result, it is near impossible to find another file with the
- same hash result (or to create such a file yourself). Since the hash result
- is limited in space, there are many inputs that will map onto the same
- hash result. The power of a good hash function is that it is not feasible to
- find them (or calculate them) except by brute force. When such a match is
- found, it is called a <e>collision</e>.
- </li>
-</ul>
-
-<p>
-Compared with checksums, hashes try to be more cryptographically secure (and as
-such more effort is made in the last property to make sure collisions are very
-hard to obtain). Some even try to generate hash results in a way that the
-duration to calculate hashes cannot be used to obtain information from the data
-(such as if it contains more 0s than 1s, etc.)
-</p>
-
-</body>
-</section>
-<section>
-<title>Hashes in integrity validation</title>
-<body>
-
-<p>
-Integrity validation services are often based on hash generation and validation.
-Tools such as <uri link="http://www.tripwire.org/">tripwire</uri> or <uri
-link="http://aide.sourceforge.net/">AIDE</uri> generate hashes of files and
-directories on your systems and then ask you to store them safely. When you want
-the integrity of your systems checked, you provide this information to the
-program (most likely in a read-only manner since you don't want this list to
-be modified while validating) which then recalculates the hashes of the files
-and compares them with the given list. Any changes in files are detected and can
-be reported to you (or the administrator).
-</p>
-
-<p>
-A popular hash functions is SHA-1 (which you can generate and validate using the
-<c>sha1sum</c> command) which gained momentum after MD5 (using <c>md5sum</c>)
-was found to be less secure (nowadays collisions in MD5 are easy to generate).
-SHA-2 also exists (but is less popular than SHA-1) and can be played with using
-the commands <c>sha224sum</c>, <c>sha256sum</c>, <c>sha384sum</c> and
-<c>sha512sum</c>.
-</p>
-
-<pre caption="Generating the SHA-1 sum of a file">
-~$ <i>sha1sum ~/Downloads/pastie-4301043.rb</i>
-6b9b4e0946044ec752992c2afffa7be103c2e748 /home/swift/Downloads/pastie-4301043.rb
-</pre>
-
-</body>
-</section>
-<section>
-<title>Hashes are a means, not a solution</title>
-<body>
-
-<p>
-Hashes, in the field of integrity validation, are a means to compare data and
-integrity in a relatively fast way. However, by itself hashes cannot be used to
-provide integrity assurance towards the administrator. Take the use of
-<c>sha1sum</c> by itself for instance.
-</p>
-
-<p>
-You are not guaranteed that the <c>sha1sum</c> application behaves correctly
-(and as such has or hasn't been tampered with). You can't use <c>sha1sum</c>
-against itself since malicious modifications of the command can easily just
-return (print out) the expected SHA-1 sum rather than the real one. A way to
-thwart this is to provide the binary together with the hash values on read-only
-media.
-</p>
-
-<p>
-But then you're still not certain that it is that application that is executed:
-a modified system might have you think it is executing that application, but
-instead is using a different application. To provide this level of trust, you
-need to get insurance from a higher-positioned, trusted service that the right
-application is being ran. Running with a trusted kernel helps here (but might
-not provide 100% closure on it) but you most likely need assistance from the
-hardware (we will talk about the Trusted Platform Module later).
-</p>
-
-<p>
-Likewise, you are not guaranteed that it is still your file with hash results
-that is being used to verify the integrity of a file. Another file (with
-modified content) may be bind-mounted on top of it. To support integrity
-validation with a trusted information source, some solutions use HMAC digests
-instead of plain hashes.
-</p>
-
-<p>
-Finally, checksums should not only be taken on file level, but also its
-attributes (which are often used to provide access controls or even toggle
-particular security measures on/off on a file, such as is the case with PaX
-markings), directories (holding information about directory updates such
-as file adds or removals) and privileges. These are things that a program like
-<c>sha1sum</c> doesn't offer (but tools like AIDE do).
-</p>
-
-</body>
-</section>
-</chapter>
-
-<chapter>
-<title>Hash-based Message Authentication Codes</title>
-<section>
-<title>Trusting the hash result</title>
-<body>
-
-<p>
-In order to trust a hash result, some solutions use HMAC digests instead. An
-HMAC digest combines a regular hash function (and its properties) with a
-a secret cryptographic key. As such, the function generates the hash of the
-content of a file together with the secret cryptographic key. This not only
-provides integrity validation of the file, but also a signature telling the
-verification tool that the hash was made by a trusted application (one that
-knows the cryptographic key) in the past and has not been tampered with.
-</p>
-
-<p>
-By using HMAC digests, malicious users will find it more difficult to modify
-code and then present a "fake" hash results file since the user cannot reproduce
-the secret cryptographic key that needs to be added to generate this new hash
-result. When you see terms like <e>HMAC-SHA1</e> it means that a SHA-1 hash
-result is used together with a cryptographic key.
-</p>
-
-</body>
-</section>
-<section>
-<title>Managing the keys</title>
-<body>
-
-<p>
-Using keys to "protect" the hash results introduces another level of complexity:
-how do you properly, securely store the keys and access them only when needed?
-You cannot just embed the key in the hash list (since a tampered system might
-read it out when you are verifying the system, generate its own results file and
-have you check against that instead). Likewise you can't just embed the key in
-the application itself, because a tampered system might just read out the
-application binary to find the key (and once compromised, you might need to
-rebuild the application completely with a new key).
-</p>
-
-<p>
-You might be tempted to just provide the key as a command-line argument, but
-then again you are not certain that a malicious user is idling on your system,
-waiting to capture this valuable information from the output of <c>ps</c>, etc.
-</p>
-
-<p>
-Again rises the need to trust a higher-level component. When you trust the
-kernel, you might be able to use the kernel key ring for this.
-</p>
-
-</body>
-</section>
-</chapter>
-
-<chapter>
-<title>Using private/public key cryptography</title>
-<section>
-<title>Validating integrity using public keys</title>
-<body>
-
-<p>
-One way to work around the vulnerability of having the malicious user getting
-hold of the secret key is to not rely on the key for the authentication of the
-hash result in the first place when verifying the integrity of the system. This
-can be accomplised if you, instead of using just an HMAC, you also encrypt HMAC
-digest with a private key.
-</p>
-
-<p>
-During validation of the hashes, you decrypt the HMAC with the public key (not
-the private key) and use this to generate the HMAC digests again to validate.
-</p>
-
-<p>
-In this approach, an attacker cannot forge a fake HMAC since forgery requires
-access to the private key, and the private key is never used on the system to
-validate signatures. And as long as no collisions occur, he also cannot reuse
-the encrypted HMAC values (which you could consider to be a replay attack).
-</p>
-
-</body>
-</section>
-<section>
-<title>Ensuring the key integrity</title>
-<body>
-
-<p>
-Of course, this still requires that the public key is not modifyable by a
-tampered system: a fake list of hash results can be made using a different
-private key, and the moment the tool wants to decrypt the encrypted values, the
-tampered system replaces the public key with its own public key, and the system
-is again vulnerable.
-</p>
-
-</body>
-</section>
-</chapter>
-
-<chapter>
-<title>Trust chain</title>
-<section>
-<title>Handing over trust</title>
-<body>
-
-<p>
-As you've noticed from the methods and services above, you always need to have
-something you trust and that you can build on. If you trust nothing, you can't
-validate anything since nothing can be trusted to return a valid response. And
-to trust something means you also want to have confidence that that system
-itself uses trusted resources.
-</p>
-
-<p>
-For many users, the hardware level is something they trust. After all, as long
-as no burglar has come in the house and tampered with the hardware itself, it is
-reasonable to expect that the hardware is still the same. In effect, the users
-trust that the physical protection of their house is sufficient for them.
-</p>
-
-<p>
-For companies, the physical protection of the working environment is not
-sufficient for ultimate trust. They want to make sure that the hardware is not
-tampered with (or different hardware is suddenly used), specifically when that
-company uses laptops instead of (less portable) workstations.
-</p>
-
-<p>
-The more you don't trust, the more things you need to take care of in order to
-be confident that the system is not tampered with. In the Gentoo Hardened
-Integrity subproject we will use the following "order" of resources:
-</p>
-
-<ul>
- <li>
- <e>System root-owned files and root-running processes</e>. In most cases
- and most households, properly configured and protected systems will trust
- root-owned files and processes. Any request for integrity validation of
- the system is usually applied against user-provided files (no-one tampered
- with the user account or specific user files) and not against the system
- itself.
- </li>
- <li>
- <e>Operating system kernel</e> (in our case the Linux kernel). Although some
- precautions need to be taken, a properly configured and protected kernel can
- provide a higher trust level. Integrity validation on kernel level can offer
- a higher trust in the systems' integrity, although you must be aware that
- most kernels still reside on the system itself.
- </li>
- <li>
- <e>Live environments</e>. A bootable (preferably) read-only medium can be
- used to boot up a validation environment that scans and verifies the
- integrity of the system-under-investigation. In this case, even tampered
- kernel boot images can be detected, and by taking proper precautions when
- running the validation (such as ensuring no network access is enabled from
- the boot up until the final compliance check has occurred) you can make
- yourself confident of the state of the entire system.
- </li>
- <li>
- <e>Hypervisor level</e>. Hypervisors are by many organizations seen as
- trusted resources (the isolation of a virtual environment is hard to break
- out of). Integrity validation on the hypervisor level can therefor provide
- confidence, especially when "chaining trusts": the hypervisor first
- validates the kernel to boot, and then boots this (now trusted) kernel which
- loads up the rest of the system.
- </li>
- <li>
- <e>Hardware level</e>. Whereas hypervisors are still "just software", you
- can lift up trust up to the hardware level and use the hardware-offered
- integrity features to provide you with confidence that the system you are
- about to boot has not been tampered with.
- </li>
-</ul>
-
-<p>
-In the Gentoo Hardened Integrity subproject, we aim to eventually support all
-these levels (and perhaps more) to provide you as a user the tools and methods
-you need to validate the integrity of your system, up to the point that you
-trust. The less you trust, the more complex a trust chain might become to
-validate (and manage), but we will not limit our research and support to a
-single technology (or chain of technologies).
-</p>
-
-<p>
-Chaining trust is an important aspect to keep things from becoming too complex
-and unmanageable. It also allows users to just "drop in" at the level of trust
-they feel is sufficient, rather than requiring technologies for higher levels.
-</p>
-
-<p>
-For instance:
-</p>
-
-<ul>
- <li>
- A hardware component that you trust (like a <e>Trusted Platform Module</e>
- or a specific BIOS-supported functionality) verifies the integrity of the
- boot regions on your disk. When ok, it passes control over to the
- bootloader.
- </li>
- <li>
- The bootloader now validates the integrity of its configuration and of the
- files (kernel and initramfs) it is told to boot up. If it checks out, it
- boots the kernel and hands over control to this kernel.
- </li>
- <li>
- The kernel, together with the initial ram file system, verifies the
- integrity of the system components (and for instance SELinux policy) before
- the initial ram system changes to the real system and boots up the
- (verified) init system.
- </li>
- <li>
- The (root-running) init system validates the integrity of the services it
- wants to start before handing over control of the system to the user.
- </li>
-</ul>
-
-<p>
-An even longer chain can be seen with hypervisors:
-</p>
-
-<ul>
- <li>
- Hardware validates boot loader
- </li>
- <li>
- Boot loader validates hypervisor kernel and system
- </li>
- <li>
- Hypervisor validates kernel(s) of the images (or the entire images)
- </li>
- <li>
- Hypervisor-managed virtual environment starts the image
- </li>
- <li>
- ...
- </li>
-</ul>
-
-</body>
-</section>
-<section>
-<title>Integrity on serviced platforms</title>
-<body>
-
-<p>
-Sometimes you cannot trust higher positioned components, but still want to be
-assured that your service is not tampered with. An example would be when you are
-hosting a system in a remote, non-accessible data center or when you manage an
-image hosted by a virtualized hosting provider (I don't want to say "cloud"
-here, but it fits).
-</p>
-
-<p>
-In these cases, you want a level of assurance that your own image has not been
-tampered with while being offline (you can imagine manipulating the guest image,
-injecting trojans or other backdoors, and then booting the image) or even while
-running the system. Instead of trusting the higher components, you try to deal
-with a level of distrust that you want to manage.
-</p>
-
-<p>
-Providing you with some confidence at this level too is our goal within the
-Gentoo Hardened Integrity subproject.
-</p>
-
-</body>
-</section>
-<section>
-<title>From measurement to protection</title>
-<body>
-
-<p>
-When dealing with integrity (and trust chains), the idea behind the top-down
-trust chain is that higher level components first measure the integrity of the
-next component, validate (and take appropriate action) and then hand over
-control to this component. This is what we call <e>protection</e> or
-<e>integrity enforcement</e> of resources.
-</p>
-
-<p>
-If the system cannot validate the integrity, or the system is too volatile to
-enforce this integrity from a higher level, it is necessary to provide a trusted
-method for other services to validate the integrity. In this case, the system
-<e>attests</e> the state of the underlying component(s) towards a third party
-service, which <e>appraises</e> this state against a known "good" value.
-</p>
-
-<p>
-In the case of our HMAC-based checks, there is no enforcement of integrity of
-the files, but the tool itself attests the state of the resources by generating
-new HMAC digests and validating (appraising) it against the list of HMAC digests
-it took before.
-</p>
-
-</body>
-</section>
-</chapter>
-
-<chapter>
-<title>An implementation: the Trusted Computing Group functionality</title>
-<section>
-<title>Trusted Platform Module</title>
-<body>
-
-<p>
-Years ago, a non-profit organization called the <uri
-link="http://www.trustedcomputinggroup.org">Trusted Computing Group</uri> was
-formed to work on and promote open standards for hardware-enabled trusted
-computing and security technologies, including hardware blocks and software
-interfaces across multiple platforms.
-</p>
-
-<p>
-One of its deliverables is the <e>Trusted Platform Module</e>, abbreviated to
-TPM, to help achieve these goals. But what are these goals exactly (especially
-in light of our integrity project)?
-</p>
-
-<ul>
- <li>
- Support hardware-assisted record (measuring) of what software is (or was)
- running on the system since it booted in a way that modifications to this
- record (or the presentation of a different, fake record) can be easily
- detected
- </li>
- <li>
- Support the secure reporting to a third party of this state (measurement) so
- that the third party can attest that the system is indeed in a sane state
- </li>
-</ul>
-
-<p>
-The idea of providing a hardware-assisted method is to prevent software-based
-attacks or malpractices that would circumvent security measures. By running some
-basic (but important) functions in a protected, tamper-resistant hardware module
-(the TPM) even rooted devices cannot work around some of the measures taken to
-"trust" a system.
-</p>
-
-<p>
-The TPM chip itself does not influence the execution of a system. It is, in
-fact, a simple request/reply service and needs to be called by software
-functions. However, it provides a few services that make it a good candidate to
-set up a trusted platform (next to its hardware-based protection measures to
-prevent tampering of the TPM hardware itself):
-</p>
-
-<ul>
- <li>
- Asymmetric crypto engine, supporting the generation of asymmetric keys (RSA
- with a keylength of 2048 bits) and standard operations with those keys
- </li>
- <li>
- A random noise generator
- </li>
- <li>
- A SHA-1 hashing engine
- </li>
- <li>
- Protected (and encrypted) memory for user data and key storage
- </li>
- <li>
- Specific registers (called PCRs) to which a system can "add" data to
- </li>
-</ul>
-
-</body>
-</section>
-<section>
-<title>Platform Configuration Registers, Reporting and Storage</title>
-<body>
-
-<p>
-PCR registers are made available to support securely recording the state of
-(specific parts of) the system. Unlike processor registers that software can
-reset as needed, PCR registers can only be "extended": the previous value in the
-register is taken together with the new provided value, hashed and
-stored again. This has the advantage that a value stores both the knowledge of
-the data presented to it as well as its order (providing values AAA and BBB
-gives a different end result than providing values BBB and AAA), and that the
-PCR can be extended an unlimited number of times.
-</p>
-
-<p>
-A system that wants to securely "record" each command executed can take the hash
-of each command (before it executes it), send that to the PCR, record the event
-and then execute the command. The system (kernel or program) is responsible for
-recording the values sent to the PCR, but at the end, the value inside
-the PCR has to be the same as the one calculated from the record. If it differs,
-then the list is incorrect and the "secure" state of the system cannot be proven.
-</p>
-
-<p>
-To support secure reporting of this value to a "third party" (be it a local
-software agent or a remote service) the TPM supports secure reporting of the PCR
-values: an RSA signature is made on the PCR value as well as on a random
-number (often called the "nonce") given by the third party (proving there is no
-man-in-the-middle or replay attack). Because the private key of this signature
-is securely stored on the TPM this signature cannot be forged.
-</p>
-
-<p>
-The TPM chip has (at least) 24 PCR registers available. These registers will
-contain the extended values for
-</p>
-
-<ul>
- <li>
- BIOS, ROM and memory block data (PCR 0-4)
- </li>
- <li>
- OS loaders (PCR 5-7)
- </li>
- <li>
- Operating System-provided data (PCR 8-15)
- </li>
- <li>
- Debugging data (PCR 16)
- </li>
- <li>
- Localities and Trusted Operating System data (PCR 17-22)
- </li>
- <li>
- Application-specific data (PCR 23)
- </li>
-</ul>
-
-<p>
-The idea of using PCRs is to first <e>measure</e> the data a component is about
-to execute (or transfer control to), then <e>extend</e> the appropriate PCR,
-then <e>log</e> this event in a measurement log and finally <e>transfer
-control</e> to the measured component. This provides a trust "chain".
-</p>
-
-</body>
-</section>
-<section>
-<title>Trusting the TPM</title>
-<body>
-
-<p>
-In order to trust the TPM, the TCG basis its model on asymmetric keys. Each TPM chip
-has a 2048-bit private RSA key securely stored in the chip. This key, called the
-<e>Endorsement Key</e>, is typically generated by the TPM manufacturer during
-the creation of the TPM chip, and is backed by an Endorsement Key certificate
-issued by the TPM manufacturer. This EK certificate guarantees that the EK is in
-fact an Endorsement Key for a given TPM (similar to how an SSL certificate is
-"signed" by a Root CA). The private key cannot leave the TPM chip.
-</p>
-
-<p>
-A second key, called the <e>Storage Root Key</e>, is generated by the TPM chip
-when someone takes "ownership" of the TPM. Although the key cannot leave the TPM
-chip, it can be removed (when someone else takes ownership). This key is used to
-encrypt data and other keys (user <e>Storage Keys</e> and <e>Signature
-Keys</e>).
-</p>
-
-<p>
-The other keys (storage and signature keys) can leave the TPM chip, but always
-in an encrypted state that only the TPM can decrypt. That way, the system can
-generate specific user storage keys securely and extract them, storing them on
-non-protected storage and reload them when needed in a secure manner).
-</p>
-
-</body>
-</section>
-</chapter>
-
-</guide>