author | Sven Vermeulen <sven.vermeulen@siphos.be> | 2013-12-11 21:51:46 +0100 |
---|---|---|
committer | Sven Vermeulen <sven.vermeulen@siphos.be> | 2013-12-11 21:51:46 +0100 |
commit | baa48490bf79347f89906989d1c2b7db4c38d05f (patch) | |
tree | 55473b077102c002dc00499091c14ba6e05a5e4b | |
parent | Fix QUOTA check (better output) (diff) | |
download | hardened-docs-baa48490bf79347f89906989d1c2b7db4c38d05f.tar.gz hardened-docs-baa48490bf79347f89906989d1c2b7db4c38d05f.tar.bz2 hardened-docs-baa48490bf79347f89906989d1c2b7db4c38d05f.zip |
Now on wiki
-rw-r--r-- | xml/integrity/concepts.xml | 685 |
1 file changed, 0 insertions, 685 deletions
diff --git a/xml/integrity/concepts.xml b/xml/integrity/concepts.xml
deleted file mode 100644
index c9f6313..0000000
--- a/xml/integrity/concepts.xml
+++ /dev/null
@@ -1,685 +0,0 @@
-<?xml version='1.0' encoding='UTF-8'?>
-<!DOCTYPE guide SYSTEM "/dtd/guide.dtd">
-<!-- $Header$ -->
-
-<guide lang="en">
-<title>Integrity - Introduction and Concepts</title>
-
-<author title="Author">
-  <mail link="swift"/>
-</author>
-
-<abstract>
-Integrity validation is a wide field in which many technologies play a role.
-This guide aims to offer a high-level view on what integrity validation is all
-about and how the various technologies work together to achieve a (hopefully)
-more secure environment to work in.
-</abstract>
-
-<!-- The content of this document is licensed under the CC-BY-SA license -->
-<!-- See http://creativecommons.org/licenses/by-sa/3.0 -->
-<license version="3.0" />
-
-<version>2</version>
-<date>2012-08-14</date>
-
-<chapter>
-<title>It is about trust</title>
-<section>
-<title>Introduction</title>
-<body>
-
-<p>
-Integrity is about trusting components within your environment, and in our case
-the workstations, servers and machines you work on. You definitely want to be
-certain that the workstation you type your credentials on to log on to the
-infrastructure is not compromised in any way. This "trust" in your environment
-is a combination of various factors: physical security, the system security
-patching process, secure configuration, access controls and more.
-</p>
-
-<p>
-Integrity plays a role in this security field: it tries to ensure that the
-systems have not been tampered with by malicious people or organizations. This
-tamper-resistance extends to a wide range of components that need to be
-validated. You probably want to be certain that the binaries that are run (and
-libraries that are loaded) are those you built yourself (in the case of
-Gentoo) or were provided to you by someone (or something) you trust.
-And that the Linux kernel you booted (and the modules that are loaded) are
-those you made, and not someone else's.
-</p>
-
-<p>
-Most people trust themselves and look at integrity as if it needs to prove that
-things are still as you've built them. But to support this claim, the systems
-you use to ensure integrity need to be trusted too: you want to make sure that
-whatever system is in place to offer you the final yes/no on the integrity only
-uses trusted information (did it really validate the binary?) and services (is
-it not running on a compromised system?). To support these claims, many ideas,
-technologies, processes and algorithms have been proposed and reviewed.
-</p>
-
-<p>
-In this document, we will talk about a few of those, and how they fit into the
-Gentoo Hardened Integrity subproject's vision and roadmap.
-</p>
-
-</body>
-</section>
-</chapter>
-
-<chapter>
-<title>Hash results</title>
-<section>
-<title>Algorithmically validating a file's content</title>
-<body>
-
-<p>
-Hashes are a primary method for validating that a file (or other resource) has
-not been changed since it was first inspected. A hash is the result of a
-mathematical calculation on the content of a file (most often a number or an
-ordered set of numbers), and exhibits the following properties:
-</p>
-
-<ul>
-  <li>
-    The resulting number is represented in a <e>small (often fixed-size)
-    length</e>. This is necessary to allow fast verification of whether two
-    hash values are the same, but also to allow storing the value in a secure
-    location (which is, more often than not, much more restricted in space).
-  </li>
-  <li>
-    The hash function always <e>returns the same hash</e> (output) when the
-    file it inspects (input) has not been changed. Otherwise it would be
-    impossible to ensure that the file content hasn't changed.
-  </li>
-  <li>
-    The hash function is fast to run (the calculation of a hash result does
-    not take up too much time or resources).
-    Without this property, it would take too long to generate and even
-    validate hash results, leading to users becoming discontented (and more
-    likely to disable the validation altogether).
-  </li>
-  <li>
-    The hash result <e>cannot be used to reconstruct</e> the file. Although
-    this is often seen as a consequence of the first property (small length),
-    it is important because hash results are often also seen as a "public
-    validation" of data that is otherwise private in nature. In other words,
-    many processes rely on the inability of users (or hackers) to
-    reverse-engineer information based on its hash result. A good example is
-    passwords and password databases, which <e>should</e> store hashes of the
-    passwords, not the passwords themselves.
-  </li>
-  <li>
-    Given a hash result, it is nearly impossible to find another file with the
-    same hash result (or to create such a file yourself). Since the hash
-    result is limited in size, there are many inputs that map onto the same
-    hash result. The power of a good hash function is that it is not feasible
-    to find (or calculate) them except by brute force. When such a match is
-    found, it is called a <e>collision</e>.
-  </li>
-</ul>
-
-<p>
-Compared with checksums, hashes try to be more cryptographically secure (and
-as such, more effort is put into the last property to make sure collisions are
-very hard to obtain). Some even try to generate hash results in a way that the
-time needed to calculate a hash cannot be used to derive information about the
-data (such as whether it contains more 0s than 1s, etc.).
-</p>
-
-</body>
-</section>
-<section>
-<title>Hashes in integrity validation</title>
-<body>
-
-<p>
-Integrity validation services are often based on hash generation and
-validation. Tools such as <uri link="http://www.tripwire.org/">tripwire</uri>
-or <uri link="http://aide.sourceforge.net/">AIDE</uri> generate hashes of
-files and directories on your systems and then ask you to store them safely.
-When you want the integrity of your systems checked, you provide this
-information to the program (most likely in a read-only manner, since you don't
-want this list to be modified while validating), which then recalculates the
-hashes of the files and compares them with the given list. Any changes in
-files are detected and can be reported to you (or the administrator).
-</p>
-
-<p>
-A popular hash function is SHA-1 (which you can generate and validate using
-the <c>sha1sum</c> command), which gained momentum after MD5 (using
-<c>md5sum</c>) was found to be less secure (nowadays collisions in MD5 are
-easy to generate). SHA-2 also exists (but is less popular than SHA-1) and can
-be played with using the commands <c>sha224sum</c>, <c>sha256sum</c>,
-<c>sha384sum</c> and <c>sha512sum</c>.
-</p>
-
-<pre caption="Generating the SHA-1 sum of a file">
-~$ <i>sha1sum ~/Downloads/pastie-4301043.rb</i>
-6b9b4e0946044ec752992c2afffa7be103c2e748  /home/swift/Downloads/pastie-4301043.rb
-</pre>
-
-</body>
-</section>
-<section>
-<title>Hashes are a means, not a solution</title>
-<body>
-
-<p>
-Hashes, in the field of integrity validation, are a means to compare data and
-integrity in a relatively fast way. However, by themselves hashes cannot be
-used to provide integrity assurance to the administrator. Take the use of
-<c>sha1sum</c> by itself, for instance.
-</p>
-
-<p>
-You are not guaranteed that the <c>sha1sum</c> application behaves correctly
-(and, as such, that it hasn't been tampered with). You can't use
-<c>sha1sum</c> against itself, since a maliciously modified command can easily
-just return (print out) the expected SHA-1 sum rather than the real one. A way
-to thwart this is to provide the binary together with the hash values on
-read-only media.
-</p>
-
-<p>
-But then you're still not certain that it is that application that is
-executed: a modified system might have you think it is executing that
-application, but instead is using a different application.
-To provide this level of trust, you need to get assurance from a
-higher-positioned, trusted service that the right application is being run.
-Running with a trusted kernel helps here (but might not provide 100%
-certainty), but you most likely need assistance from the hardware (we will
-talk about the Trusted Platform Module later).
-</p>
-
-<p>
-Likewise, you are not guaranteed that it is still your file with hash results
-that is being used to verify the integrity of a file. Another file (with
-modified content) may be bind-mounted on top of it. To support integrity
-validation with a trusted information source, some solutions use HMAC digests
-instead of plain hashes.
-</p>
-
-<p>
-Finally, checksums should not only be taken at the file level, but should
-also cover a file's attributes (which are often used to provide access
-controls or even toggle particular security measures on or off for a file, as
-is the case with PaX markings), directories (holding information about
-directory updates such as file additions or removals) and privileges. These
-are things that a program like <c>sha1sum</c> doesn't offer (but tools like
-AIDE do).
-</p>
-
-</body>
-</section>
-</chapter>
-
-<chapter>
-<title>Hash-based Message Authentication Codes</title>
-<section>
-<title>Trusting the hash result</title>
-<body>
-
-<p>
-In order to trust a hash result, some solutions use HMAC digests instead. An
-HMAC digest combines a regular hash function (and its properties) with a
-secret cryptographic key. As such, the function generates the hash of the
-content of a file together with the secret cryptographic key. This not only
-provides integrity validation of the file, but also a signature telling the
-verification tool that the hash was made by a trusted application (one that
-knows the cryptographic key) in the past and has not been tampered with.
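The HMAC construction the deleted guide describes here can be sketched with Python's standard `hmac` and `hashlib` modules. This is a minimal illustration, not part of the original guide; the key shown is purely hypothetical, and a real deployment would keep it in a secure store (such as a kernel key ring), never in the script itself:

```python
import hashlib
import hmac

# Hypothetical secret key for illustration only; real tools must load
# this from a secure location, never hard-code it.
SECRET_KEY = b"example-secret-key"

def hmac_sha1_of_file(path: str) -> str:
    """Return the HMAC-SHA1 digest of a file's contents as hex."""
    mac = hmac.new(SECRET_KEY, digestmod=hashlib.sha1)
    with open(path, "rb") as f:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: f.read(8192), b""):
            mac.update(chunk)
    return mac.hexdigest()

def verify(path: str, expected: str) -> bool:
    """Compare digests in constant time to avoid timing side channels."""
    return hmac.compare_digest(hmac_sha1_of_file(path), expected)
```

Without the key, an attacker who modifies the file cannot produce a matching digest, which is exactly the property the paragraph above relies on.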
-</p>
-
-<p>
-By using HMAC digests, malicious users will find it more difficult to modify
-code and then present a "fake" hash results file, since they cannot reproduce
-the secret cryptographic key that is needed to generate this new hash result.
-When you see terms like <e>HMAC-SHA1</e>, it means that a SHA-1 hash result is
-used together with a cryptographic key.
-</p>
-
-</body>
-</section>
-<section>
-<title>Managing the keys</title>
-<body>
-
-<p>
-Using keys to "protect" the hash results introduces another level of
-complexity: how do you properly and securely store the keys, and access them
-only when needed? You cannot just embed the key in the hash list (since a
-tampered system might read it out while you are verifying the system,
-generate its own results file and have you check against that instead).
-Likewise, you can't just embed the key in the application itself, because a
-tampered system might just read out the application binary to find the key
-(and once compromised, you might need to rebuild the application completely
-with a new key).
-</p>
-
-<p>
-You might be tempted to just provide the key as a command-line argument, but
-then again you cannot be certain that no malicious user is idling on your
-system, waiting to capture this valuable information from the output of
-<c>ps</c>, etc.
-</p>
-
-<p>
-Again the need arises to trust a higher-level component. When you trust the
-kernel, you might be able to use the kernel key ring for this.
-</p>
-
-</body>
-</section>
-</chapter>
-
-<chapter>
-<title>Using private/public key cryptography</title>
-<section>
-<title>Validating integrity using public keys</title>
-<body>
-
-<p>
-One way to work around the risk of a malicious user getting hold of the
-secret key is to not rely on that key for the authentication of the hash
-result in the first place when verifying the integrity of the system.
-This can be accomplished if, instead of using just an HMAC, you also encrypt
-the HMAC digest with a private key.
-</p>
-
-<p>
-During validation of the hashes, you decrypt the HMAC with the public key
-(not the private key) and generate the HMAC digests again to validate them.
-</p>
-
-<p>
-In this approach, an attacker cannot forge a fake HMAC, since forgery
-requires access to the private key, and the private key is never used on the
-system that validates the signatures. And as long as no collisions occur, the
-attacker also cannot reuse the encrypted HMAC values (which you could
-consider a replay attack).
-</p>
-
-</body>
-</section>
-<section>
-<title>Ensuring the key integrity</title>
-<body>
-
-<p>
-Of course, this still requires that the public key is not modifiable by a
-tampered system: a fake list of hash results can be made using a different
-private key, and the moment the tool wants to decrypt the encrypted values,
-the tampered system replaces the public key with its own public key, and the
-system is again vulnerable.
-</p>
-
-</body>
-</section>
-</chapter>
-
-<chapter>
-<title>Trust chain</title>
-<section>
-<title>Handing over trust</title>
-<body>
-
-<p>
-As you've noticed from the methods and services above, you always need to
-have something you trust and that you can build on. If you trust nothing, you
-can't validate anything, since nothing can be trusted to return a valid
-response. And to trust something means you also want to have confidence that
-that system itself uses trusted resources.
-</p>
-
-<p>
-For many users, the hardware level is something they trust. After all, as
-long as no burglar has come into the house and tampered with the hardware
-itself, it is reasonable to expect that the hardware is still the same. In
-effect, the users trust that the physical protection of their house is
-sufficient for them.
-</p>
-
-<p>
-For companies, the physical protection of the working environment is not
-sufficient for ultimate trust.
-They want to make sure that the hardware is not tampered with (or that
-different hardware is not suddenly used), especially when the company uses
-laptops instead of (less portable) workstations.
-</p>
-
-<p>
-The less you trust, the more things you need to take care of in order to be
-confident that the system has not been tampered with. In the Gentoo Hardened
-Integrity subproject we will use the following "order" of resources:
-</p>
-
-<ul>
-  <li>
-    <e>System root-owned files and root-running processes</e>. In most cases
-    and most households, properly configured and protected systems will trust
-    root-owned files and processes. Any request for integrity validation of
-    the system is usually applied against user-provided files (no-one
-    tampered with the user account or specific user files) and not against
-    the system itself.
-  </li>
-  <li>
-    <e>Operating system kernel</e> (in our case the Linux kernel). Although
-    some precautions need to be taken, a properly configured and protected
-    kernel can provide a higher trust level. Integrity validation at the
-    kernel level can offer higher trust in the system's integrity, although
-    you must be aware that most kernels still reside on the system itself.
-  </li>
-  <li>
-    <e>Live environments</e>. A bootable, (preferably) read-only medium can
-    be used to boot up a validation environment that scans and verifies the
-    integrity of the system under investigation. In this case, even tampered
-    kernel boot images can be detected, and by taking proper precautions when
-    running the validation (such as ensuring no network access is enabled
-    from boot until the final compliance check has occurred) you can make
-    yourself confident of the state of the entire system.
-  </li>
-  <li>
-    <e>Hypervisor level</e>. Hypervisors are seen by many organizations as
-    trusted resources (the isolation of a virtual environment is hard to
-    break out of).
-    Integrity validation at the hypervisor level can therefore provide
-    confidence, especially when "chaining trust": the hypervisor first
-    validates the kernel to boot, and then boots this (now trusted) kernel,
-    which loads up the rest of the system.
-  </li>
-  <li>
-    <e>Hardware level</e>. Whereas hypervisors are still "just software", you
-    can lift trust up to the hardware level and use the hardware-offered
-    integrity features to give you confidence that the system you are about
-    to boot has not been tampered with.
-  </li>
-</ul>
-
-<p>
-In the Gentoo Hardened Integrity subproject, we aim to eventually support all
-these levels (and perhaps more) to provide you, as a user, with the tools and
-methods you need to validate the integrity of your system, up to the point
-that you trust. The less you trust, the more complex a trust chain might
-become to validate (and manage), but we will not limit our research and
-support to a single technology (or chain of technologies).
-</p>
-
-<p>
-Chaining trust is an important aspect to keep things from becoming too
-complex and unmanageable. It also allows users to just "drop in" at the level
-of trust they feel is sufficient, rather than requiring technologies for
-higher levels.
-</p>
-
-<p>
-For instance:
-</p>
-
-<ul>
-  <li>
-    A hardware component that you trust (like a <e>Trusted Platform
-    Module</e> or specific BIOS-supported functionality) verifies the
-    integrity of the boot regions on your disk. If everything checks out, it
-    passes control over to the boot loader.
-  </li>
-  <li>
-    The boot loader now validates the integrity of its configuration and of
-    the files (kernel and initramfs) it is told to boot. If these check out,
-    it boots the kernel and hands over control to this kernel.
-  </li>
-  <li>
-    The kernel, together with the initial ram file system, verifies the
-    integrity of the system components (and, for instance, the SELinux
-    policy) before the initial ram file system switches to the real system
-    and boots up the (verified) init system.
-  </li>
-  <li>
-    The (root-running) init system validates the integrity of the services
-    it wants to start before handing over control of the system to the user.
-  </li>
-</ul>
-
-<p>
-An even longer chain can be seen with hypervisors:
-</p>
-
-<ul>
-  <li>
-    Hardware validates the boot loader
-  </li>
-  <li>
-    Boot loader validates the hypervisor kernel and system
-  </li>
-  <li>
-    Hypervisor validates the kernel(s) of the images (or the entire images)
-  </li>
-  <li>
-    Hypervisor-managed virtual environment starts the image
-  </li>
-  <li>
-    ...
-  </li>
-</ul>
-
-</body>
-</section>
-<section>
-<title>Integrity on serviced platforms</title>
-<body>
-
-<p>
-Sometimes you cannot trust higher-positioned components, but still want to be
-assured that your service is not tampered with. An example would be when you
-are hosting a system in a remote, non-accessible data center, or when you
-manage an image hosted by a virtualized hosting provider (I don't want to say
-"cloud" here, but it fits).
-</p>
-
-<p>
-In these cases, you want a level of assurance that your own image has not
-been tampered with while being offline (you can imagine someone manipulating
-the guest image, injecting trojans or other backdoors, and then booting the
-image) or even while running the system. Instead of trusting the higher
-components, you try to deal with a level of distrust that you want to manage.
-</p>
-
-<p>
-Providing you with some confidence at this level too is our goal within the
-Gentoo Hardened Integrity subproject.
-</p>
-
-</body>
-</section>
-<section>
-<title>From measurement to protection</title>
-<body>
-
-<p>
-When dealing with integrity (and trust chains), the idea behind the top-down
-trust chain is that higher-level components first measure the integrity of
-the next component, validate it (and take appropriate action), and then hand
-over control to this component. This is what we call <e>protection</e> or
-<e>integrity enforcement</e> of resources.
-</p>
-
-<p>
-If the system cannot validate the integrity, or the system is too volatile to
-enforce this integrity from a higher level, it is necessary to provide a
-trusted method for other services to validate the integrity. In this case,
-the system <e>attests</e> the state of the underlying component(s) towards a
-third-party service, which <e>appraises</e> this state against a known "good"
-value.
-</p>
-
-<p>
-In the case of our HMAC-based checks, there is no enforcement of the
-integrity of the files, but the tool itself attests the state of the
-resources by generating new HMAC digests and validating (appraising) them
-against the list of HMAC digests it took before.
-</p>
-
-</body>
-</section>
-</chapter>
-
-<chapter>
-<title>An implementation: the Trusted Computing Group functionality</title>
-<section>
-<title>Trusted Platform Module</title>
-<body>
-
-<p>
-Years ago, a non-profit organization called the <uri
-link="http://www.trustedcomputinggroup.org">Trusted Computing Group</uri> was
-formed to work on and promote open standards for hardware-enabled trusted
-computing and security technologies, including hardware blocks and software
-interfaces across multiple platforms.
-</p>
-
-<p>
-One of its deliverables is the <e>Trusted Platform Module</e>, abbreviated as
-TPM, which helps achieve these goals. But what are these goals exactly
-(especially in light of our integrity project)?
-</p>
-
-<ul>
-  <li>
-    Support hardware-assisted recording (measurement) of what software is (or
-    was) running on the system since it booted, in a way that modifications
-    to this record (or the presentation of a different, fake record) can be
-    easily detected
-  </li>
-  <li>
-    Support the secure reporting of this state (measurement) to a third
-    party, so that the third party can attest that the system is indeed in a
-    sane state
-  </li>
-</ul>
-
-<p>
-The idea of providing a hardware-assisted method is to prevent software-based
-attacks or malpractices that would circumvent security measures. By running
-some basic (but important) functions in a protected, tamper-resistant
-hardware module (the TPM), even rooted devices cannot work around some of the
-measures taken to "trust" a system.
-</p>
-
-<p>
-The TPM chip itself does not influence the execution of a system. It is, in
-fact, a simple request/reply service and needs to be called by software
-functions. However, it provides a few services that make it a good candidate
-for setting up a trusted platform (next to its hardware-based protection
-measures that prevent tampering with the TPM hardware itself):
-</p>
-
-<ul>
-  <li>
-    An asymmetric crypto engine, supporting the generation of asymmetric keys
-    (RSA with a key length of 2048 bits) and standard operations with those
-    keys
-  </li>
-  <li>
-    A random noise generator
-  </li>
-  <li>
-    A SHA-1 hashing engine
-  </li>
-  <li>
-    Protected (and encrypted) memory for user data and key storage
-  </li>
-  <li>
-    Specific registers (called PCRs) to which a system can "add" data
-  </li>
-</ul>
-
-</body>
-</section>
-<section>
-<title>Platform Configuration Registers, Reporting and Storage</title>
-<body>
-
-<p>
-PCR registers are made available to support securely recording the state of
-(specific parts of) the system.
-Unlike processor registers that software can reset as needed, PCR registers
-can only be "extended": the previous value in the register is taken together
-with the newly provided value, hashed, and stored again. This has the
-advantage that a value captures both the data presented to it and its order
-(providing values AAA and BBB gives a different end result than providing
-values BBB and AAA), and that the PCR can be extended an unlimited number of
-times.
-</p>
-
-<p>
-A system that wants to securely "record" each command executed can take the
-hash of each command (before it executes it), send that to the PCR, record
-the event and then execute the command. The system (kernel or program) is
-responsible for recording the values sent to the PCR, but at the end, the
-value inside the PCR has to be the same as the one calculated from the
-record. If it differs, then the list is incorrect and the "secure" state of
-the system cannot be proven.
-</p>
-
-<p>
-To support secure reporting of this value to a "third party" (be it a local
-software agent or a remote service), the TPM can sign the PCR values: an RSA
-signature is made on the PCR value as well as on a random number (often
-called the "nonce") given by the third party (proving there is no
-man-in-the-middle or replay attack). Because the private key for this
-signature is securely stored in the TPM, the signature cannot be forged.
-</p>
-
-<p>
-The TPM chip has (at least) 24 PCR registers available.
-These registers contain the extended values for:
-</p>
-
-<ul>
-  <li>
-    BIOS, ROM and memory block data (PCR 0-4)
-  </li>
-  <li>
-    OS loaders (PCR 5-7)
-  </li>
-  <li>
-    Operating system-provided data (PCR 8-15)
-  </li>
-  <li>
-    Debugging data (PCR 16)
-  </li>
-  <li>
-    Localities and Trusted Operating System data (PCR 17-22)
-  </li>
-  <li>
-    Application-specific data (PCR 23)
-  </li>
-</ul>
-
-<p>
-The idea of using PCRs is to first <e>measure</e> the data a component is
-about to execute (or transfer control to), then <e>extend</e> the appropriate
-PCR, then <e>log</e> this event in a measurement log, and finally
-<e>transfer control</e> to the measured component. This provides a trust
-"chain".
-</p>
-
-</body>
-</section>
-<section>
-<title>Trusting the TPM</title>
-<body>
-
-<p>
-In order to trust the TPM, the TCG bases its model on asymmetric keys. Each
-TPM chip has a 2048-bit private RSA key securely stored in the chip. This
-key, called the <e>Endorsement Key</e>, is typically generated by the TPM
-manufacturer during the creation of the TPM chip, and is backed by an
-Endorsement Key certificate issued by the TPM manufacturer. This EK
-certificate guarantees that the EK is in fact an Endorsement Key for a given
-TPM (similar to how an SSL certificate is "signed" by a root CA). The private
-key cannot leave the TPM chip.
-</p>
-
-<p>
-A second key, called the <e>Storage Root Key</e>, is generated by the TPM
-chip when someone takes "ownership" of the TPM. Although the key cannot leave
-the TPM chip, it can be removed (when someone else takes ownership). This key
-is used to encrypt data and other keys (user <e>Storage Keys</e> and
-<e>Signature Keys</e>).
-</p>
-
-<p>
-The other keys (storage and signature keys) can leave the TPM chip, but
-always in an encrypted state that only the TPM can decrypt.
-That way, the system can generate specific user storage keys securely and
-extract them, storing them on non-protected storage and reloading them when
-needed in a secure manner.
-</p>
-
-</body>
-</section>
-</chapter>
-
-</guide>
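The PCR "extend" behaviour the deleted guide describes (a register that can only be combined with new data, never reset, and whose final value depends on the order of events) can be simulated in a few lines of Python. This is an editorial sketch of the concept using SHA-1, as in TPM 1.2 PCRs; it does not talk to a real TPM:

```python
import hashlib

PCR_SIZE = 20  # SHA-1 digest length in bytes, matching a TPM 1.2 PCR

def pcr_extend(pcr: bytes, measurement: bytes) -> bytes:
    """Extend a simulated PCR: new = SHA-1(old_value || SHA-1(measurement))."""
    digest = hashlib.sha1(measurement).digest()
    return hashlib.sha1(pcr + digest).digest()

# A PCR starts out zeroed; extending with the same events in a
# different order yields a different final value, as the text explains.
pcr = bytes(PCR_SIZE)
forward = pcr_extend(pcr_extend(pcr, b"AAA"), b"BBB")
reverse = pcr_extend(pcr_extend(pcr, b"BBB"), b"AAA")
assert forward != reverse  # order of measurements matters
```

Replaying the recorded event log through this function and comparing the result with the value reported by the (simulated) PCR is exactly the appraisal step the guide describes.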