Security Concepts

travis+security@subspacefield.org

Abstract

This is an online book about computer, network, technical, physical, information and cryptographic security. It is a labor of love, incomplete until the day I am finished.

Contents

1  Metadata
    1.1  Copyright and Distribution Control
    1.2  Goals
    1.3  Audience
    1.4  About This Work
    1.5  How to Read the Online Version
    1.6  About Writing This
    1.7  Tools Used To Create This Book
2  Security Properties
    2.1  Information Security is a PAIN
    2.2  Parkerian Hexad
    2.3  Pentagon of Trust
    2.4  Security Equivalency
    2.5  Other Questions
3  Security Concepts
    3.1  Attack Surface
    3.2  The Classification Problem
        3.2.1  Classification Errors
        3.2.2  The Base-Rate Fallacy
        3.2.3  Test Efficiency
        3.2.4  Incompletely-Defined Sets
        3.2.5  The Guessing Hazard
    3.3  Security Layers
    3.4  Privilege Levels
    3.5  Attack Characteristics
    3.6  What is a Vulnerability?
    3.7  Accuracy Limitations in Making Decisions That Impact Security
4  Adversaries and Threats
    4.1  Common Psychological Errors
    4.2  Cost-Benefit
    4.3  Risk Tolerance
    4.4  Capabilities
    4.5  Sophistication Distribution
    4.6  Goals
5  Physical Security
    5.1  No Physical Security Means No Security
    5.2  Data Remanence
        5.2.1  Magnetic Storage Media (Disks)
        5.2.2  Semiconductor Storage (RAM)
6  Distributed Systems
    6.1  Cryptography is the Sine Qua Non of Secure Distributed Systems
    6.2  Hello, My Name is 192.168.1.1
    6.3  Source Tapping; The First Hop and Last Mile
    6.4  Security Zones
    6.5  Security Equivalent Things Go Together
    6.6  Outsider Threats vs. Insider Threats
    6.7  A Proposed Perimeter Defense
    6.8  Man In The Middle
        6.8.1  DNS Issues
        6.8.2  IP Routing
        6.8.3  Link-layer Issues
        6.8.4  Physical Layer
        6.8.5  Periodic Rechecking
        6.8.6  Out-of-Band Comparison
        6.8.7  Parallel Paths
        6.8.8  Formatting
7  Identification and Authentication
    7.1  Identity
    7.2  What Authority?
    7.3  Authentication Factors
    7.4  Authentication Issues: When, What
    7.5  The Identity Continuum
    7.6  Problems Remaining Anonymous
    7.7  Problems with Identifying People
    7.8  Remote Attestation
8  Access Control
    8.1  Privilege Escalation
    8.2  Physical Access Control
    8.3  Operating System Access Control: DAC, MAC, RBAC
9  Secure System Administration
    9.1  Change Management
    9.2  Self-Healing Systems
    9.3  Heterogeneous vs. Homogeneous Defenses
10  Logging
    10.1  Synchronized Time
    10.2  Syslog
11  Reports
    11.1  Change Reporting
    11.2  Artificial Ignorance
    11.3  Dead Man's Switch
12  Abuse Detection
    12.1  Misuse Detection vs. Anomaly Detection
    12.2  Honey Traps
    12.3  Tripwires and Booby Traps
    12.4  Anti-Malware
    12.5  Anti-Spam
    12.6  Detecting Automated Peers
        12.6.1  CAPTCHA
        12.6.2  Bot Traps
        12.6.3  Velocity Checks
        12.6.4  Typing Mistakes
    12.7  Host-Based Intrusion Detection
    12.8  Intrusion Detection Principles
    12.9  Intrusion Information Collection
    12.10  Intrusion Alerting
        12.10.1  Possible Intrusion Alerting Solutions
13  Abuse Response
    13.1  How to Respond to Abuse
    13.2  The Silent Treatment
    13.3  Random Response
    13.4  Faux Positives
    13.5  The Simulation Defense
        13.5.1  Fishbowls
    13.6  Hack-Back
        13.6.1  Reverse-Hack
        13.6.2  Mirror Defense
        13.6.3  Counterhack
    13.7  Identification Issues
    13.8  Proportional Response
14  Forensics
    14.1  Forensic Limitations
    14.2  Ephemeral Data
    14.3  Remnant Data
    14.4  Hidden Data
    14.5  Metadata
    14.6  Forensic Inference
15  Network Security
    15.1  The Current State of Things
    15.2  Traffic Identification: RPC, Dynamic Ports, User-Specified Ports and Encapsulation
        15.2.1  RPC
        15.2.2  Dynamic Port Numbers
        15.2.3  Encapsulation
        15.2.4  Possible Solutions
    15.3  Advanced Network Security Technologies
16  Web Security
    16.1  Direct Browser Attacks
    16.2  Indirect Browser Attacks
    16.3  SSL Certificates Made Redundant
17  Application Security
    17.1  Security is a Subset of Correctness
    17.2  Malware vs. Data-Directed Attacks
    17.3  Reverse Engineering
    17.4  Application Exploitation
    17.5  Application Exploitation Defenses
        17.5.1  Stack-Smashing Protection
        17.5.2  Address-Space Layout Randomization (ASLR)
        17.5.3  Write XOR Execute
    17.6  Software Complexity
        17.6.1  Complexity of Network Protocols
        17.6.2  Polymorphism and Complexity
    17.7  Failure Modes
    17.8  Fault Tolerance
    17.9  Implications of Incorrectness
18  Trust
    18.1  Trust and Trustworthiness
    18.2  Code Provenance: Signed Programs and Trusted Authors
19  Cryptology
    19.1  Limits of Cryptography
        19.1.1  The Last Foot of the Communication
        19.1.2  Limitations Regarding Endpoint Security
        19.1.3  In Practice
    19.2  How Strong Should My Cryptography Be?
        19.2.1  Key Lengths
    19.3  Cryptographic Algorithms
        19.3.1  Combiners
        19.3.2  Speed of Algorithms and the Hybrid Encryption Scheme
        19.3.3  HMAC: The Symmetric Digital Signature
    19.4  Cryptographic Algorithm Enhancements
        19.4.1  Hashing Stored Authentication Data
        19.4.2  Offline Dictionary Attacks and Iterated Hashes
        19.4.3  Salts vs. Offline Dictionary Attacks and Rainbow Tables
        19.4.4  Offline Dictionary Attacks with Partial Confidentiality
    19.5  Cryptographic Combinations
        19.5.1  The Sign then Encrypt Problem
        19.5.2  Key Derivation Functions
    19.6  Cryptographic Protocols
        19.6.1  DoS and Anti-Clogging Tokens
        19.6.2  The Problem with Authenticating within an Encrypted Channel
        19.6.3  How to Protect the Integrity of a Session
        19.6.4  Freshness and Replay Attacks
        19.6.5  Authentication
        19.6.6  Key Exchange and Hybrid Encryption Schemes
    19.7  Encrypted Storage
        19.7.1  Key Escrow for Encrypted Storage
        19.7.2  Evolution of Cryptographic Storage Technologies
        19.7.3  Filesystem Crypto Layers
        19.7.4  File Systems with Optional Encryption
        19.7.5  Block Device Crypto
        19.7.6  The Cryptographically-Strong Pseudo-random Quick Fill
        19.7.7  Backups
    19.8  Key Management
        19.8.1  Key Exchange and the Bootstrapping Problem
        19.8.2  One Key, One Purpose
        19.8.3  Time Compartmentalization
        19.8.4  Key Indirection
        19.8.5  Secret Sharing
20  Randomness and Unpredictability
    20.1  What is an Ideal Random Number Generator?
    20.2  Definitions of Unpredictability
    20.3  Definitions of Randomness
    20.4  Why Entropy and Unpredictability Are Not The Same
    20.5  Unpredictability is the Sine Qua Non of Cryptography
    20.6  Predictability is Provable, Unpredictability is Not
    20.7  Randomly-Generated Samples Are No Different Than Any Other Sample
    20.8  Testing For Predictability
    20.9  Ways to Fail
    20.10  Humans Are Too Predictable
    20.11  Sources of Unpredictability
    20.12  The Laws of Unpredictability
21  Lateral Thinking
    21.1  Traffic Analysis
    21.2  Side Channels
        21.2.1  Physical Information-Gathering Attacks and Defenses
        21.2.2  Signal Injection Attacks and Defenses
        21.2.3  System-Local Side-Channel Attacks
        21.2.4  Timing Side-Channels
22  Information and Intelligence
    22.1  Controlling Information Flow
    22.2  Labeling and Regulations
    22.3  Knowledge is Power
    22.4  Secrecy is Power
    22.5  Never Confirm Guesses
    22.6  What You Don't Know Can Hurt You
    22.7  How Secrecy is Lost
    22.8  Costs of Disclosure
    22.9  Dissemination
    22.10  Information, Misinformation, Disinformation
23  Conflict and Combat
    23.1  Indicators and Warnings
    23.2  Attacker's Advantage in Network Warfare
    23.3  Defender's Advantage in Network Warfare
    23.4  OODA Loops
24  Security Principles
    24.1  The Principle of Least Privilege
    24.2  The Principle of Agility
    24.3  The Principle of Minimal Assumptions
    24.4  The Principle of Fail-Secure Design
    24.5  The Principle of Unique Identifiers
    24.6  The Principles of Simplicity
    24.7  The Principle of Defense in Depth
    24.8  The Principle of Uniform Fronts
    24.9  The Principle of Split Control
    24.10  The Principle of Minimal Changes
    24.11  The Principle of Centralized Management
    24.12  The Principle of Least Surprise
    24.13  The Principle of Removing Excuses
    24.14  The Principle of Retaining Control
    24.15  Availability Principles
25  Common Arguments
    25.1  Disclosure: Full, Partial, or None?
        25.1.1  Arguments for Full Disclosure
        25.1.2  Arguments Against Full Disclosure - Vendor
        25.1.3  Arguments Against Full Disclosure - Vendor's Employees
        25.1.4  Arguments Against Full Disclosure - End User
    25.2  Theorists vs. Pragmatists, Absolute vs. Effective Security
    25.3  Quantification and Metrics vs. Intuition
    25.4  Security Through Obscurity
    25.5  Security of Open Source vs. Closed Source
    25.6  Prevention vs. Detection
    25.7  Prevention vs. Monitoring
    25.8  Early vs. Late Adopters
26  Editorials, Predictions, Polemics, and Personal Opinions
    26.1  Security is for Polymaths
    26.2  Computers are Transcending our Limitations
    26.3  Reusable Authentication Data Considered Harmful
    26.4  Password Length Limits Considered Harmful
    26.5  Everything Will Be Encrypted Soon
    26.6  Error Propagation Characteristics Usually Don't Matter
    26.7  Keep it Legal, Stupid
    26.8  Should My Employees Attend "Hacker" Conferences?
    26.9  I'm a Young Hacker, Should I Sell Out and Do Security for a Corporation?
    26.10  Anonymity is not a Crime
        26.10.1  Example: Sears Makes Customer Purchase Information Available Online, Provides Spyware to Customers
    26.11  Monitoring Your Employees
    26.12  Trust People in Spite of Counterexamples
    26.13  Do What I Mean vs. Do What I Say
27  Resources
    27.1  My Other Stuff
    27.2  Conferences
    27.3  Books
        27.3.1  Publishers
        27.3.2  Titles
    27.4  Periodicals
    27.5  Blogs
    27.6  Mailing Lists
28  Credits

1  Metadata

1.1  Copyright and Distribution Control

Kindly link a person to it instead of redistributing it, so that people may always receive the latest version. However, even an outdated copy is better than none. The PDF version is preferred and more likely to render properly (especially graphics and special mathematical characters), but the HTML version is simply too convenient to not have it available. The latest version is always here:
PDF
http://www.subspacefield.org/security/security_concepts.pdf
HTML
http://www.subspacefield.org/security/security_concepts.html
This is a copyrighted work, with some rights reserved. This work is licensed under the Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License (http://creativecommons.org/licenses/by-nc-nd/3.0/us/). This means you may redistribute it, provided that you attribute me properly (without suggesting I endorse your work) and do not use it for commercial purposes. For attribution, please include a prominent link back to this original work and some text describing the changes. I am comfortable with certain derivative works, such as translation into other languages, but not sure about others, so I have not yet explicitly granted permission for all derivative uses. If you have any questions, please email me and I'll be happy to discuss it with you.

1.2  Goals

I wrote this paper to try and examine the typical problems in computer security and related areas, and attempt to extract from them principles for defending systems. To this end I attempt to synthesize various fields of knowledge, including computer security, network security, cryptology, and intelligence. I also attempt to extract the principles and implicit assumptions behind cryptography and the protection of classified information, as obtained through reverse-engineering (that is, informed speculation based on existing regulations and stuff I read in books), where they are relevant to technological security.

1.3  Audience

When I picture a perfect reader, I always picture a monster of courage and curiosity, also something supple, cunning, cautious, a born adventurer and discoverer.
- Friedrich Nietzsche
This is not intended to be an introductory text, although a beginner could gain something from it. The reason behind this is that beginners think in terms of tactics, rather than strategy, and of details rather than generalities. There are many fine books on computer and network security tactics (and many more not-so-fine books), and tactics change quickly, and being unpaid for this work, I am a lazy author. The reason why even a beginner may gain from it is that I have attempted to extract abstract concepts and strategies which are not necessarily tied to computer security. And I have attempted to illustrate the points with interesting and entertaining examples and would love to have more, so if you can think of an example for one of my points, please send it to me!
I'm writing this for you, noble reader, so your comments are very welcome; you will be helping me make this better for every future reader. If you send a contribution or comment, you'll save me a lot of work if you tell me whether you wish to be mentioned in the credits (see 28) or not; I want to respect the privacy of anonymous contributors. If you're concerned that would be presumptuous, don't be; I consider it considerate of you to save me an email exchange. Security bloggers will find plenty of fodder by looking for new URLs added to this page, and I encourage you to do it, since I simply don't have time to comment on everything I link to. If you link to this paper from your blog entry, all the better.

1.4  About This Work

I have started this book with some terminology as a way to frame the discussion. Then I get into the details of the technology. Since this is adequately explained in other works, these sections are somewhat lean and may merely be a list of links. Then I get into my primary contribution, which is the fundamental principles of security which I have extracted from the technological details. Afterwards, I summarize some common arguments that one sees among security people, and I finish up with some of my personal observations and opinions.

1.5  How to Read the Online Version

Since this document is constantly being revised, I suggest that you start with the table of contents and click on the subject headings so that you can see which ones you have read already. If I add a section, it will show up as unread. By the time it has expired from your browser's history, it is probably time to re-read it anyway, since the contents have probably been updated.
See the end of this page for the date it was generated (which is also the last update time). I currently update this about once every two weeks.

1.6  About Writing This

Part of the challenge with writing about this topic is that we are always learning and it never seems to settle down, nor does one ever seem to get a sense of completion. I consider it more permanent and organized than a blog, more up-to-date than a book, and more comprehensive and self-contained than most web pages. I know it's uneven; in some areas it's just a heading with a paragraph, or a few links, in other places it can be as smoothly written as a book. I thought about breaking it up into multiple documents, so I could release each with much more fanfare, but that's just not the way I write, and it makes it difficult to do as much cross-linking as I'd like.
This is to my knowledge the first attempt to publish a computer security book on the web before printing it, so I have no idea if it will even be possible to print it commercially. That's okay; I'm not writing for money. I'd like for the Internet to be the public library of the 21st century, and this is my first significant donation to the collection. I am reminded of the advice of a staffer in the computer science department, who said, "do what you love, and the money will take care of itself".
That having been said, if you want to contribute towards the effort, you can help me defray the costs of maintaining a server and such by visiting our donation page (http://www.subspacefield.org/donate.html). If you would like to donate but cannot, you may wait until such a time as you can afford to, and then give something away (i.e. pay it forward).

1.7  Tools Used To Create This Book

I use LyX (http://www.lyx.org/), but I'm still a bit of a novice. I have a love/hate relationship with it and the underlying typesetting language LaTeX (http://en.wikipedia.org/wiki/LaTeX).

2  Security Properties

What do we mean by secure? When I say secure, I mean that an adversary can't make the system do something that its owner (or designer, or administrator, or even user) did not intend. Often this involves a violation of a general security property. Some security properties include:
confidentiality
refers to whether the information in question is disclosed or remains private.
integrity
refers to whether the systems (or data) remain uncorrupted. The opposite of this is malleability, where it is possible to change data without detection, and believe it or not, sometimes this is a desirable security property.
availability
is whether the system is available when you need it or not.
consistency
is whether the system behaves the same each time you use it.
auditability
is whether the system keeps good records of what has happened so it can be investigated later. Direct-record electronic voting machines (with no paper trail) are unauditable.
control
is whether the system obeys only the authorized users or not.
authentication
is whether the system can properly identify users. Sometimes, it is desirable that the system cannot do so, in which case it is anonymous or pseudonymous.
non-repudiation
is a relatively obscure term meaning that if you take an action, you won't be able to deny it later. Sometimes, you want the opposite, in which case you want repudiability ("plausible deniability").
Please forgive the slight difference in the way they are named; while English is partly to blame, these properties are not entirely parallel. For example, confidentiality refers to information (or inferences drawn on such) just as program refers to an executable stored on the disk, whereas control implies an active system just as process refers to a running program (as they say, "a process is a program in motion"). Also, you can compromise my data confidentiality with a completely passive attack such as reading my backup tapes, whereas controlling my system is inherently detectable since it involves interacting with it in some way.

2.1  Information Security is a PAIN

You can remember the security properties of information as PAIN: Privacy, Authenticity, Integrity, Non-Repudiation.

2.2  Parkerian Hexad

There is something similar known as the "Parkerian Hexad", defined by Donn B. Parker, which is six fundamental, atomic, non-overlapping attributes of information that are protected by information security measures:
  1. confidentiality
  2. possession
  3. integrity
  4. authenticity
  5. availability
  6. utility

2.3  Pentagon of Trust

  1. Admissibility (is the remote node trustworthy?)
  2. Authentication (who are you?)
  3. Authorization (what are you allowed to do?)
  4. Availability (is the data accessible?)
  5. Authenticity (is the data intact?)

2.4  Security Equivalency

I consider two objects to be security equivalent if they are identical with respect to the security properties under discussion; for precision, I may refer to confidentiality-equivalent pieces of information if the sets of parties to which they may be disclosed (without violating security) are exactly the same (and conversely, so are the sets of parties to which they may not be disclosed). In this case, I'm discussing objects which, if treated improperly, could lead to a compromise of the security goal of confidentiality. Or I could say that two cryptosystems are confidentiality-equivalent, in which case the objects help achieve the security goal. To be perverse, these last two examples could be combined; if the information in the first example was actually the keys for the cryptosystem in the second example, then disclosure of the first could impact the confidentiality of the keys and thus the confidentiality of anything handled by the cryptosystems. Alternately, I could refer to access-control equivalence between two firewall implementations; in this case, I am discussing objects which implement a security mechanism which helps us achieve the security goal, such as confidentiality of something.

2.5  Other Questions

  1. Secure to whom? A web site may be secure (to its owners) against unauthorized control, but may employ no encryption when collecting information from customers.
  2. Secure from whom? A site may be secure against outsiders, but not insiders.

3  Security Concepts

There is no security on this earth, there is only opportunity.
- General Douglas MacArthur (1880-1964)
These are important concepts which appear to apply across multiple security domains.

3.1  Attack Surface

Gnothi Seauton ("Know Thyself")
- ancient Greek aphorism (http://en.wikipedia.org/wiki/Know_thyself)
When discussing security, it's often useful to analyze the part which may interact with a particular adversary (or set of adversaries). For example, let's assume you are only worried about remote adversaries. If your system or network is only connected to the outside world via the Internet, then the attack surface is the parts of your system that interact with things on the Internet, or the parts of your system which accept input from the Internet. A firewall, then, limits the attack surface to a smaller portion of your systems by filtering some of your network traffic. Often, the firewall blocks all incoming connections.
Sometimes the attack surface is pervasive. For example, if you have a network-enabled embedded device like a web cam on your network that has a vulnerability in its networking stack, then anything which can send it packets may be able to exploit it. Since you probably can't fix the software in it, you must then use a firewall to attempt to limit what can trigger the bug. Similarly, there was a bug in sendmail that could be exploited by sending a carefully-crafted email through a vulnerable server. The interesting bit here is that it might be an internal server that wasn't exposed to the Internet; the exploit was data-directed and so could be passed through your infrastructure until it hit a vulnerable implementation. That's why I consistently use one implementation (not sendmail) throughout my network now.
If plugging a USB drive into your system causes it to automatically run things, as a standard Microsoft Windows XP install does, then any plugged-in device is part of the attack surface (http://it.slashdot.org/article.pl?sid=08/01/13/1533243). But even if it does not, then by plugging a USB device in you could potentially overflow a buffer in the code which handles the USB or in the driver for the particular device which is loaded (http://www.eweek.com/article2/0,1895,1840141,00.asp, http://www.schneier.com/blog/archives/2006/06/hacking_compute.html); thus, the USB-handling code and all drivers are part of the attack surface if the adversary can control what is plugged into the system. Moreover, a recent vulnerability (http://it.slashdot.org/it/08/01/14/1319256.shtml) illustrates that when you have something which inspects network traffic, such as UPnP devices or port knocking daemons, then its code forms part of the attack surface.
Sometimes you will hear people talk about the "anonymous attack surface"; this is the attack surface available to everyone (on the Internet). Since this number of people is so large, and you usually can't identify them or punish them, you want to be really sure that the anonymous attack surface is limited and doesn't have any so-called "pre-auth" vulnerabilities, because those can be exploited prior to identification and authentication.

3.2  The Classification Problem

Many times in security you wish to distinguish between classes of data. This occurs in firewalls, where you want to allow certain traffic but not all, and in intrusion detection where you want to allow benign traffic but not allow malicious traffic, and in operating system security, we wish to allow the user to run their programs but not malware. In doing so, we run into a number of limitations in various domains that deserve mention together.

3.2.1  Classification Errors

False positives vs. false negatives, also called Type I and Type II errors. The equal error rate (EER) is the operating point at which the false positive rate and the false negative rate are equal; it is commonly used to compare biometric systems. Sometimes in medicine they will do a cheap test with a high error rate biased in one direction (often false positives), and then a more expensive test with a lower error rate, usually biased in the other direction.
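To make the trade-off between the two error types concrete, here is a minimal sketch that sweeps a decision threshold over some match scores and finds the point where the false positive and false negative rates are closest, which approximates the equal error rate. The scores, the function names, and the convention that a score at or above the threshold means "accept" are all assumptions made up for illustration; a real biometric evaluation would use far more samples.

```python
def error_rates(genuine, impostor, threshold):
    """Return (false_positive_rate, false_negative_rate) at a threshold.

    Assumption: a score >= threshold means "accept".
    """
    false_pos = sum(1 for s in impostor if s >= threshold)
    false_neg = sum(1 for s in genuine if s < threshold)
    return false_pos / len(impostor), false_neg / len(genuine)


def approximate_eer(genuine, impostor):
    """Pick the observed score at which the two error rates are closest."""
    def gap(threshold):
        fpr, fnr = error_rates(genuine, impostor, threshold)
        return abs(fpr - fnr)

    candidates = sorted(set(genuine) | set(impostor))
    best = min(candidates, key=gap)
    fpr, fnr = error_rates(genuine, impostor, best)
    return best, fpr, fnr


# Made-up scores, for illustration only.
GENUINE = [0.9, 0.8, 0.75, 0.6, 0.4]   # legitimate users
IMPOSTOR = [0.1, 0.2, 0.3, 0.5, 0.7]   # everyone else

threshold, fpr, fnr = approximate_eer(GENUINE, IMPOSTOR)
print(threshold, fpr, fnr)
```

Raising the threshold trades false positives for false negatives, and lowering it does the reverse; the EER is just one convenient point on that curve, not necessarily the one you want to operate at.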

3.2.2  The Base-Rate Fallacy

In The Base Rate Fallacy and its Implications for Intrusion Detection (http://www.raid-symposium.org/raid99/PAPERS/Axelsson.pdf), the author essentially points out that there's a lot of benign traffic for every attack, and so even a small chance of a false positive will quickly overwhelm any true positives. Put another way, if one out of every 10,001 connections is malicious, and the test has a 1% false positive error rate, then for every 1 real malicious connection there are 10,000 benign connections, and hence 100 false positives.
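The arithmetic can be checked with a few lines of code. This is only a sketch: the function name is made up, and it assumes (generously) that the detector catches every real attack, which the text does not claim.

```python
def alarm_breakdown(benign, malicious, false_positive_rate, detection_rate=1.0):
    """Return (false_alarms, true_alarms, fraction_of_alarms_that_are_real).

    detection_rate=1.0 is an optimistic assumption: every attack is caught.
    """
    false_alarms = benign * false_positive_rate
    true_alarms = malicious * detection_rate
    real_fraction = true_alarms / (true_alarms + false_alarms)
    return false_alarms, true_alarms, real_fraction


# Numbers from the text: 1 malicious connection per 10,000 benign ones,
# and a 1% false positive rate.
false_alarms, true_alarms, real_fraction = alarm_breakdown(10_000, 1, 0.01)
print(f"{false_alarms:.0f} false alarms for every {true_alarms:.0f} real one")
print(f"fraction of alarms that are real: {real_fraction:.4f}")
```

Even with a perfect detection rate, only 1 alarm in 101 corresponds to a real attack, so whoever reads the alerts will spend nearly all of their time on false alarms.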

3.2.3  Test Efficiency

In other cases, you are perfectly capable of performing an accurate test, but not on all the traffic. You may want to apply a cheap test with some errors on one side before applying a second, more expensive test on the side with errors to weed them out. This is done in BSD Unix with packet capturing via tcpdump, which uploads a coarse filter into the kernel, and then applies a more expensive but finer-grained test in userland which only operates on the packets which pass the first test.
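The general pattern of a cheap coarse test followed by an expensive fine test looks something like the sketch below. It is not how tcpdump or the kernel filter is actually implemented; the packet dictionaries, the port check, and the regular expression are all hypothetical stand-ins chosen to show that the expensive test only runs on what survives the cheap one.

```python
import re

# Hypothetical fine-grained test: requests for anything under /admin.
ADMIN_REQUEST = re.compile(rb"^(GET|POST) /admin\b")


def cheap_test(packet):
    """Coarse and fast: keep only traffic to the web server port."""
    return packet["dst_port"] == 80


def expensive_test(packet):
    """Finer-grained and slower: inspect the payload itself."""
    return ADMIN_REQUEST.search(packet["payload"]) is not None


def two_stage(packets):
    """Apply the expensive test only to packets that pass the cheap one."""
    expensive_calls = 0
    matches = []
    for packet in packets:
        if not cheap_test(packet):
            continue
        expensive_calls += 1
        if expensive_test(packet):
            matches.append(packet)
    return matches, expensive_calls


# Made-up traffic, for illustration only.
PACKETS = [
    {"dst_port": 22, "payload": b"SSH-2.0-OpenSSH"},
    {"dst_port": 80, "payload": b"GET /index.html HTTP/1.1"},
    {"dst_port": 80, "payload": b"GET /admin/login HTTP/1.1"},
    {"dst_port": 53, "payload": b"\x00\x01"},
]

matches, expensive_calls = two_stage(PACKETS)
print(len(matches), "match;", expensive_calls, "of", len(PACKETS), "inspected")
```

The cheap test may only err on one side: it can pass packets that the second stage will reject, but it must never drop a packet the second stage would have accepted.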

3.2.4  Incompletely-Defined Sets

As far as the laws of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality.
- Albert Einstein
Stop for a moment and think about the difficulty of trying to list all the undesirable things that your computer shouldn't do. If you find yourself finished, then ask yourself: did you include that it shouldn't attack other computers? Did you include that it shouldn't transfer $1000 to a mafia-run web site when you really intended to transfer $100 to your mother? Did you include that it shouldn't send spam to your address book? The list goes on and on.
If we had a complete list of everything that was bad, we'd block it and never have to worry about it again. However, often we either don't know what belongs on the list, or the set is infinite. Similarly, we may not be able to obtain a complete list of everything that is good; imagine trying to specify in advance all the network packets that should be allowed into your enterprise!
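Here is a small sketch of what happens when both lists are incomplete. The service names and the two sets are invented for illustration; the point is only that each policy fails on whatever its list forgot to mention, in opposite directions.

```python
# Hypothetical, deliberately incomplete lists.
KNOWN_BAD = {"telnet", "tftp"}
KNOWN_GOOD = {"https", "ssh"}


def default_allow(service):
    """Block only what is on the bad list; everything else gets through."""
    return service not in KNOWN_BAD


def default_deny(service):
    """Allow only what is on the good list; everything else is blocked."""
    return service in KNOWN_GOOD


# Two things neither list anticipated (made-up names).
novel_attack = "new-backdoor"
novel_legit = "new-video-chat"

for service in (novel_attack, novel_legit):
    print(service,
          "| default-allow lets it in:", default_allow(service),
          "| default-deny lets it in:", default_deny(service))
```

Default-allow admits the attack nobody listed, while default-deny blocks the legitimate application nobody listed. Neither list being complete, you are choosing which kind of error you would rather live with.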

3.2.5  The Guessing Hazard

So often we can't enumerate all the things we would want to do, nor all the things that we would not want to do. Because of this, intrusion detection systems (see 12) often simply guess; they try to detect attacks unknown to them by looking for features that are likely to be present in malware but not in normal traffic. At the current moment, you can find out if your traffic is passing through an intrusion prevention system (IPS) by trying to send a long string of A's in a session. This isn't malicious by itself, but A is a common letter with which people pad exploits (see 17.4). It's a great example of a false positive, or collateral damage, generated through guilt-by-association; there's nothing inherently suspicious about a string of A's, it's just that exploit writers use them a lot, and IPS vendors decided that made them suspicious. I'm not a big fan of these heuristics because I feel that they break functionality that doesn't threaten the system, and that they could be used as evidence of malfeasance against someone by someone who doesn't really understand the technology. I'm already irritated by the false positives or excessive warnings about security tools from anti-virus software; it seems to alert on "potentially-unwanted programs" an absurd amount of the time. Most novices don't understand that the anti-virus software reads the disk even though I'm not running the programs, and that you have nothing to fear if you don't run the programs. I fear that one day my Internet Service Provider will start filtering them out of my email or network streams, but fortunately they just don't care that much.
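A toy version of such a guess is sketched below. I don't know how any particular IPS implements its rule; the 32-byte run length, the function name, and the sample payloads are all assumptions chosen to show how a feature that merely correlates with exploits misfires in both directions.

```python
import re

# Hypothetical heuristic: a run of 32 or more 'A' bytes "looks like" padding.
PADDING_RUN = re.compile(rb"A{32,}")


def looks_like_exploit(payload):
    """Guess, based on a feature that exploits often (not always) have."""
    return PADDING_RUN.search(payload) is not None


# Made-up payloads, for illustration only.
padded_with_a = b"A" * 64 + b"\xef\xbe\xad\xde"      # classic padding
harmless_scream = b"Subject: " + b"A" * 40 + b"RGH!"  # benign, but flagged
padded_with_b = b"B" * 64 + b"\xef\xbe\xad\xde"      # same trick, other letter

for name, payload in [("padded with A", padded_with_a),
                      ("harmless scream", harmless_scream),
                      ("padded with B", padded_with_b)]:
    print(name, "->", looks_like_exploit(payload))
```

The harmless message is blocked (a false positive), and an adversary who knows the rule simply pads with a different byte (a false negative). The guess punishes the innocent more reliably than it stops the guilty.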

3.3  Security Layers

I like to think of security as a hierarchy. At the base, you have physical security. On top of that is OS security, and on top of that is application security, and on top of that, network security. You may have an unbeatable firewall, but if your OS doesn't require a password and your adversary has physical access, you lose. So each layer of the pyramid cannot be more secure (in an absolute sense) than the layer below it. Ideally, each layer should be available to fewer adversaries than the layer above it, so that one has a sort of balance or risk equivalency.
  1. network security
  2. application/database security
  3. OS security
  4. physical security
In operating system security, we distinguish between users of the system, and perhaps the roles they are fulfilling, and only concern ourselves with activities within that computer. It is assumed that the adversary has some access, but less than full privileges on the system. In network security, we concern ourselves with nodes in the networks (usually individual computers), and do not distinguish between users of each system. In some sense, we are now assigning rights to computers and not people. This is often justified since it is usually easier to leverage one user's access to gain another's within the same system than to gain access to another system (but this is not a truism).

3.4  Privilege Levels

Here's a taxonomy of some commonly-useful privilege levels.
  1. Anonymous, remote systems
  2. Authenticated remote systems
  3. Local unprivileged user (UID > 0)
  4. Administrator (UID 0)
  5. Kernel (privileged mode, ring 0)
  6. Hardware (TPM, ring -1, hypervisors, trojaned hardware)
Actual systems may vary, levels may not be strictly hierarchical, etc. Basically the higher the level you get, the harder you are to detect. The gateways between the levels are access control devices, analogous to firewalls.

3.5  Attack Characteristics

Not all attacks are created equal. They may sometimes be grouped together in various ways, though, and so that leads us to ask whether there are any dimensions, or characteristics, by which we may classify known attacks.
access required
to execute the attack varies; some attacks require a system account, while others can be exploited by anyone on the Internet.
detectability
usually means that the attack involves a non-standard interaction with us, and therefore involves something which we could (in theory) look for and recognize. Passive attacks, typically eavesdropping, are very difficult or impossible to detect.
recoverability
refers to whether we may, after detecting or suspecting an attack, restore the state of the system to a secure one. Usually once an adversary has complete control of a system, we cannot return it to a secure state without some unusual actions, because they may have tampered with any tools we may be using to inspect or fix the system.
preventability
refers to whether there exists a defense which allows us to prevent it, or whether we must be content with detecting it. We can sometimes prevent attacks we cannot detect; for example, we can prevent someone from reading our wireless transmissions by encrypting them properly, but we can't usually detect whether or not any third party is receiving them.
scalability
means the same attack will probably work against many systems, and does not require human effort to develop or customize for each system.
offline exploitability
means that the attack may be conducted once but exploited several times, as when you steal a cryptographic key.
sophistication
refers to the property of requiring a great deal of skill, versus an unsophisticated attack like guessing a password to a known system account.
Much of this list is thanks to the Everest voting machine report (http://www.sos.state.oh.us/sos/info/EVEREST/14-AcademicFinalEVERESTReport.pdf).
Putting a key in a smart card or TPM or HSM prevents it from being copied and reused later, offline, but it doesn't prevent it from being abused by the adversary while he has control of its inputs. For example, a trojan can submit bogus documents to a smart card to have them signed, and the user has no way of knowing. Similarly, sometimes techniques like putting passphrases on SSH keys can prevent them from being stolen right away, requiring a second visit (or at least an exfiltration at a later date). However, each interaction with the system by the adversary risks detection, so he wants to do so once only, instead of multiple times.
For example, your adversary could pilfer your SSL cert, and then use it to create a phishing site elsewhere. This is a single loss of confidentiality, then an authentication attack (forgery) not against you, but against your customers (third parties). Or he could pilfer your GPG key, then use it to forge messages from you (a similar detectable attack) or read your email (passive attack, undetectable). Or he might break in, wanting to copy your SSH key, find that it's encrypted with a passphrase, install a key logger, and come back later to retrieve the passphrase (two active attacks). Alternately, the key logger could send the data out automatically (exfiltration).

3.6  What is a Vulnerability?

Now that you know what a security property is, what constitutes (or should constitute) a vulnerability? On the arguable end of the scale we have "loss of availability", or susceptibility to denial of service (DoS). On the inarguable end of the scale, we have "loss of control", which usually means arbitrary code execution, which in turn often means that the adversary can do whatever he wants with the system, and therefore can violate any other security property.

3.7  Accuracy Limitations in Making Decisions That Impact Security

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" In one case a member of the Upper, and in the other a member of the Lower, House put this question. I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
- Charles Babbage
This is sometimes called the GIGO rule (Garbage In, Garbage Out). Stated this way, this seems self-evident. However, you should realize that this applies to systems as well as programs. For example, if your system depends on DNS to locate a host, then the correctness of your system's operation depends on DNS. Whether or not this is exploitable (beyond a simple denial of service) depends a great deal on the details of the procedures. This is a parallel to the question of whether it is possible to exploit a program via an unsanitized input.
You can never be more accurate than the data you used for your input. Try to be neither precisely inaccurate, nor imprecisely accurate. Learn to use footnotes.

4  Adversaries and Threats

If you know the enemy and know yourself, you need not fear the result of a hundred battles.
If you know yourself but not the enemy, for every victory gained you will also suffer a defeat.
If you know neither the enemy nor yourself, you will succumb in every battle.
- Sun Tzu, The Art of War (http://en.wikipedia.org/wiki/The_Art_of_War)
After deciding what you need to protect (your assets), you need to know about the threats you wish to protect them against, or the adversaries (sometimes called threat agents) which may threaten them. Generally intelligence units have threat shops, where they monitor and keep track of the people who may threaten their operations. This is natural, since it is easier to get an idea of who will try and do something than how some unspecified person may try to do it, and can help by hardening systems in enemy territory more than those in safer areas, leading to more efficient use of resources. In technology, people tend to focus on how rather than who, which seems to work better when anyone can potentially attack any system (like with publicly-facing systems on the Internet) and when protection mechanisms have low or no incremental cost (like with free and open-source software). Modeling these is called threat modeling (http://en.wikipedia.org/wiki/Threat_model).
In attacker-centric threat modeling, the implicit assumptions are that you have a limited budget and the number of threats is so large that you cannot defend against all of them. So you now need to decide where to allocate your resources. Part of this involves trying to figure out who your adversaries are and what their capabilities and intentions are, and thus how much to worry about particular domains of knowledge or technology. You don't have to know their name, location and social security number; it can be as simple as "some high school student on the Internet somewhere who doesn't like us", "a disgruntled employee" (as opposed to a gruntled employee), or "some sexually frustrated script-kiddie on IRC who doesn't like the fact that he is a jerk who enjoys abusing people and therefore his only friends are other dysfunctional jerks like him". People in charge of doing attacker-centric threat modeling must understand their adversaries and be willing to take chances by allocating resources against an adversary which hasn't actually attacked them yet, or else they will always be defending against yesterday's adversary, and get caught flat-footed by a new one.

4.1  Common Psychological Errors

The excellent but poorly titled1 book Searching for Happiness tells us that we make two common kinds of errors when reasoning about other humans:
  1. Overly different; if you looked at grapes all day, you'd know a hundred different kinds, and naturally think them very different. But they all squish when you step on them, they are all fruits and frankly, not terribly different at all. So too we are conditioned to see people as different because the things that matter most to us, like finding an appropriate mate or trusting people, cannot be discerned with questions like "do you like breathing?". An interesting experiment showed that a description of how they felt by people who had gone through a process is more accurate in predicting how a person will feel after the process than a description of the process itself. Put another way, people assume that the experience of others is too dependent on the minor differences between humans that we mentally exaggerate.
  2. Overly similar; people assume that others are motivated by the same things they are motivated by; we project onto them a reflection of our self. If a financier or accountant has ever climbed mount Everest, I am not aware of it. Surely it is a cost center, yes?

4.2  Cost-Benefit

Often, the lower layers of the security hierarchy cost more to build out than the higher levels. Physical security requires guards, locks, iron bars, shatterproof windows, shielding, and various other things which, being physical, cost real money. On the other hand, network security may only need a free software firewall. However, what an adversary could cost you during a physical attack (e.g. a burglar looting your home) may be greater than an adversary could cost you by defacing your web site.

4.3  Risk Tolerance

We may assume that the distribution of risk tolerance among adversaries is monotonically decreasing; that is, the number of adversaries who are willing to try a low-risk attack is greater than the number of adversaries who are willing to attempt a high-risk attack to get the same result. Beware of risk evaluation though; while a hacker may be taking a great risk to gain access to your home, local law enforcement with a valid warrant is not going to be risking as much.
So, if you are concerned about a whole spectrum of adversaries, known and unknown, you may wish to have greater network security than physical security, simply because there are going to be more remote attacks.

4.4  Capabilities

You only have to worry about things to the extent they may lie within the capabilities of your adversaries. It is rare that adversaries use outside help when it comes to critical intelligence; it could, for all they know, be disinformation, or the outsider could be an agent-provocateur.

4.5  Sophistication Distribution

If they were capable, honest, and hard-working, they wouldn't need to steal.
Along similar lines, one can assume a monotonically decreasing number of adversaries with a certain level of sophistication. My rule of thumb is that for every person who knows how to perform a technique, there are x people who know about it, where x is a small number, perhaps 3 to 10. The same rule applies to people with the ability to write an exploit versus those able to download and use it (the so-called script kiddies). Once an exploit is coded into a worm, the chance of a compromised host having been compromised by the worm (instead of a human who targets it specifically) approaches 100%. Discuss Bayesian inference.
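The worm claim above can be made concrete with Bayes' rule. The following is a minimal sketch, not a measurement: the function name and every probability in it are invented for illustration, and it assumes a compromise has exactly one cause (worm or targeted human).

```python
def posterior_worm(prior_worm, success_worm, prior_human, success_human):
    """Return P(worm | host compromised) via Bayes' rule.

    prior_*   : hypothetical probability that this kind of attacker tries the host
    success_* : hypothetical probability that such an attempt succeeds
    Assumes the two causes are mutually exclusive and exhaustive.
    """
    joint_worm = prior_worm * success_worm
    joint_human = prior_human * success_human
    total = joint_worm + joint_human
    if total == 0:
        raise ValueError("no compromise is possible under these assumptions")
    return joint_worm / total


# Invented numbers: a worm probes half of all vulnerable hosts and nearly
# always succeeds; a human targets this specific host one time in a thousand.
p = posterior_worm(prior_worm=0.5, success_worm=0.9,
                   prior_human=0.001, success_human=0.5)
print(f"P(worm | compromised) = {p:.4f}")
```

Because the worm attempts so many more hosts than any human does, the posterior is dominated by the worm term even if the human is the more skilled attacker.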

4.6  Goals

We've all met or know about people who would like nothing more than to break things, just for the heck of it; schoolyard bullies who feel hurt and want to hurt others, or their overgrown sadist kin. Vandals who merely want to write their name on your storefront. A street thug who will steal a cell phone just to throw it through a window. I'm sure the sort of person reading this isn't like that, but unfortunately some people are. What exactly are your adversary's goals? Are they to maximize ROI (Return On Investment) for themselves, or are they out to maximize pain (tax your resources) for you? Are they monetarily or ideologically motivated? What do they consider investment? What do they consider a reward? Put another way, you can't just assign a dollar value on assets, you must consider their value to the adversary.

5  Physical Security

Physical security is often the limit on the strength of access control devices; I recall a story of a cat burglar who used a chainsaw to cut through victims' walls, bypassing any access control devices. I remember reading someone saying that a deep space probe is the ultimate in physical security.

5.1  No Physical Security Means No Security

A couple of limitations come up without physical security for a system. For confidentiality, all of the sensitive data needs to be encrypted. But even if you encrypt the data, an adversary with physical access could trojan the OS and capture the data (this is a control attack now, not just a confidentiality breach; go this far and you've protected against overt seizure, theft, improper disposal and such). So you'll need to protect the confidentiality and integrity of the OS. If you protect the OS, he trojans the kernel. If you protect the kernel, he trojans the boot loader. If you protect the boot loader (say by putting it on a removable medium), he trojans the BIOS. If you protect the BIOS, he trojans the CPU. So you put a tamper-evident label on it, with your signature on it, and check it every time. But he can install a keyboard logger. So suppose you make a sealed box with everything in it, and connectors on the front. Now he gets measurements and photos of your machine, spends a fortune replicating it, replaces your system with an outwardly identical one of his design (the trojan box), which communicates (say, via encrypted spread-spectrum radio) to your real box. When you type plaintext, it goes through his system, gets logged, and relayed to your system as keystrokes. Since you talk plaintext, neither of you is the wiser.
The physical layer is a common place to facilitate a side-channel attack (see 21.2).

5.2  Data Remanence

Data remanence is the residual physical representation of your information on media after you believe that you have removed it (definition thanks to Wikipedia, http://en.wikipedia.org/wiki/Data_remanence). This is a disputed region of technology, with a great deal of speculation, self-styled experts, but very little hard science.
As of 2006, the most definitive study seems to be the NIST Computer Security Division paper Guidelines for Media Sanitization (http://csrc.nist.gov/publications/nistpubs/800-88/NISTSP800-88_rev1.pdf). NIST is known to work with the NSA on some topics, and this may be one of them. It introduces some useful terminology:
disposing
is the act of discarding media with no other considerations
clearing
is a level of media sanitization that resists anything you could do at the keyboard or remotely, and usually involves overwriting the data at least once
purging
is a process that protects against a laboratory attack (signal processing equipment and specially trained personnel)
destroying
is the ultimate form of sanitization, and means that the medium can no longer be used as originally intended

5.2.1  Magnetic Storage Media (Disks)

The seminal paper on this is Peter Gutmann's Secure Deletion of Data from Magnetic and Solid-State Memory (http://www.cs.auckland.ac.nz/~pgut001/pubs/secure_del.html). In early versions of his paper, he speculated that one could extract data due to hysteresis effects even after a single overwrite, but on subsequent revisions he stated that there was no evidence a single overwrite was insufficient. Simson Garfinkel wrote about it recently in his blog (https://www.techreview.com/blog/garfinkel/17567/).
The NIST paper has some interesting tidbits in it. Obviously, disposal cannot protect confidentiality of unencrypted media. Clearing is probably sufficient security for 99% of all data; I highly recommend Darik's Boot and Nuke (http://dban.sourceforge.net/), which is a bootable floppy or CD based on Linux. However, it cannot work if the storage device stops working properly, and it does not overwrite sectors or tracks marked bad and transparently relocated by the drive firmware. With all ATA drives over 15GB, there is a "secure erase" ATA command which can be accessed from hdparm within Linux, and Gordon Hughes has some interesting documents and a Microsoft-based utility (http://cmrr.ucsd.edu/people/Hughes/SecureErase.shtml). There's a useful blog entry about it (http://storagemojo.com/2007/05/02/secure-erase-data-security-you-already-own/). In the case of very damaged disks, you may have to resort to physical destruction. However, with disk densities being what they are, even 1/125" of a disk platter may hold a full sector, and someone with absurd amounts of money could theoretically extract small quantities of data. Fortunately, nobody cares this much about your data.
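To illustrate what "overwriting the data at least once" means at the smallest scale, here is a rough sketch that overwrites a single file in place with zeros. The function name is made up for this example. This is not equivalent to clearing a whole device: a journaling or copy-on-write filesystem, flash wear leveling, or sectors remapped by the drive firmware may all leave copies that this code never touches.

```python
import os


def overwrite_file(path, chunk_size=1024 * 1024):
    """Overwrite a file's current contents with zeros, once, in place.

    Returns the number of bytes overwritten. Hypothetical helper: it only
    reaches the blocks the filesystem currently maps to this file.
    """
    size = os.path.getsize(path)
    zeros = b"\x00" * chunk_size
    with open(path, "r+b") as f:
        remaining = size
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(zeros[:n])
            remaining -= n
        # Push the data out of Python's buffers and the OS cache to the device.
        f.flush()
        os.fsync(f.fileno())
    return size
```

A whole-device tool such as the one recommended above works below the filesystem, which is why it is preferable to a per-file approach like this one.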
Now, you may wonder what you can do about very damaged disks, or what to do if the media isn't online (for example, you buried it in an underground bunker), or if you have to get rid of the data fast. I would suggest that encrypted storage (see 19.7) would almost always be a good idea. If you use it, you merely have to protect the confidentiality of the key, and if you can properly sanitize the media, all the better. Recently Simson Garfinkel re-discovered a technique for getting the data off broken drives; freezing them. Another technique that I have used is to replace the logic board with one from a working drive.

5.2.2  Semiconductor Storage (RAM)

Peter Gutmann's Data Remanence in Semiconductor Devices (http://www.cypherpunks.to/~peter/usenix01.pdf) shows that if a particular value is held in RAM for extended periods of time, various processes such as electromigration make permanent changes to the semiconductor's structure. In some cases, it is possible for the value to be "burned in" to the cell, such that it cannot hold another value.
Recently a Princeton team (http://citp.princeton.edu/memory/) found that the values held in DRAM decay in predictable ways after power is removed, such that one can merely reboot the system and recover keys for most encrypted storage systems (http://citp.princeton.edu/pub/coldboot.pdf). This generated much talk in the industry. This prompted an interesting overview of attacks against encrypted storage systems (http://www.news.com/8301-13578_3-9876060-38.html).

6  Distributed Systems

The objects involved in network security are called nodes. One can talk about networks composed of humans (social networks), but that's not the kind of network we're talking about here. Often in network security the adversary is assumed to control the network; this is a bit of a holdover from the days when the network was radio, or when the node was an embassy in a country controlled by the adversary. In modern practice, this doesn't seem to usually be the case, but it'd be hard to know for sure. In network security we almost always assume the adversary controls at least one of the nodes on the network.
In network security, we can lure an adversary to a system, tempt them with something inviting; such a system is called a honeypot, and a network of such systems is sometimes called a honeynet. A honeypot may or may not be instrumented for careful monitoring; sometimes systems so instrumented are called fishbowls, to emphasize the transparent nature of activity within them. Often one doesn't want to allow a honeypot to be used as a launch point for attacks, so outbound network traffic is sanitized or scrubbed; if traffic to other hosts is blocked completely, some people call it a jail, but that is also the name of an operating system security technology used by FreeBSD, so I consider it confusing.
To reduce a distributed system problem to a physical security (see 5) problem, you can use an air gap, or sneakernet between one system and another. However, the data you transport between them may be capable of exploiting the offline system. One could keep a machine offline except during certain windows; this could be as simple as a cron job which turns on or off the network interface via ifconfig. However, an offline system may be difficult to administer, or keep up-to-date with security patches.

6.1  Cryptography is the Sine Qua Non of Secure Distributed Systems

All cryptography lets you do is create trust relationships across untrustworthy media; the problem is still trust between endpoints and transitive trust.
- Marcus Ranum
Put simply, you can't have a secure distributed system (with the normal assumptions of untrusted nodes and network links potentially controlled by the adversary) without using cryptography somewhere ("sine qua non" is Latin for "without which it could not be"). If the adversary can read communications, then to protect the confidentiality of the network traffic, it must be encrypted. If the adversary can modify network communication, then it must have its integrity protected and be authenticated (that is, to have the source identified). Even physical layer communication security technologies, like the KLJN cipher, quantum cryptography, and spread-spectrum communication, use cryptography in one way or another.
I would go farther and say that performing network security decisions on anything other than cryptographic keys is never going to be as strong as if it depended on cryptography. Very few Internet adversaries currently have the capability to arbitrarily route data around. Most cannot jump between VLANs on a tagged port. Some don't even have the capability to sniff on their LAN. But none of the mechanisms preventing this are stronger than strong cryptography, and often they are much weaker, possibly only security through obscurity. Let me put it to you this way; to support a general argument otherwise, think about how much assurance a firewall has that a packet claiming to be from a given IP address is actually from the system the firewall maintainer believes it to be. Often these things are complex, and way beyond his control. However, it would be totally reasonable to filter on IP address first, and only then allow a cryptographic check; this makes it resistant to resource consumption attacks from anyone who cannot spoof a legitimate IP address (see 3.2.1).
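The ordering suggested at the end of the paragraph above (a cheap IP address filter first, the cryptographic check second) might be sketched as follows. The network range, the shared key, and the function name are all placeholders invented for this example; a real system would more likely use per-peer keys or public-key signatures.

```python
import hashlib
import hmac
import ipaddress

# Hypothetical values for illustration only.
ALLOWED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]
SHARED_KEY = b"example-key-not-for-production"


def accept(source_ip, message, tag):
    """Accept a message only if it passes both checks, cheapest first."""
    # Step 1: cheap filter. This is weak as authentication (addresses can be
    # spoofed), but it sheds load from anyone who cannot spoof an allowed one.
    addr = ipaddress.ip_address(source_ip)
    if not any(addr in net for net in ALLOWED_NETWORKS):
        return False
    # Step 2: the real decision rests on the cryptographic key.
    expected = hmac.new(SHARED_KEY, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)
```

The address filter never grants access by itself here; it only decides who gets to spend the server's cycles on the HMAC verification.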

6.2  Hello, My Name is 192.168.1.1

Humans are incapable of securely storing high-quality cryptographic keys, and they have unacceptable speed and accuracy when performing cryptographic operations. (They are also large, expensive to maintain, difficult to manage, and they pollute the environment. It is astonishing that these devices continue to be manufactured and deployed. But they are sufficiently pervasive that we must design our protocols around their limitations).
- Network Security / PRIVATE Communication in a PUBLIC World by Charlie Kaufman, Radia Perlman, & Mike Speciner (Prentice Hall 2002; p.237)
Because humans communicate slowly, in plaintext, and don't plug into a network, we consider the nodes within the network to be computing devices. The system a person interacts with has equivalency with them; break into the system administrator's console, and you have access to anything he or she accesses. In some cases, you may have access to anything he or she can access. You may think that your LDAP or Kerberos server is the most important, but isn't the node of the guy who administers it just as critical? This is especially true if OS security is weak and any user can control the system, or if the administrator is not trusted, but it is also convenient because packets do not have user names, just source IPs. When some remote system connects to a server, unless both are under the control of the same entity, the server has no reason to trust the remote system's claim about who is using it, nor does it have any reason to treat one user on the remote system differently than any other.

6.3  Source Tapping; The First Hop and Last Mile

One can learn a lot more about a target by observing the first link from them than from some more remote place. That is, the best vantage point is one closest to the target. For this reason, the first hop is far more critical than any other. An exception may involve a target that is more network-mobile than the eavesdropper. The more common exception is tunneling/encryption (to include tor and VPN technologies); these relocate the first hop somewhere else which is not physically proximate to the target's meat space coordinates, which may make it more difficult to locate.
Things to consider here involve the difficulty of interception, which is a secondary concern (it is never all that difficult). For example, it is probably less confidential from the ISP to use an ISP's caching proxy than to access the service directly, since most proxy software makes it trivial to log the connection and content; however, one should not assume that one is safe by not using the proxy (especially now that many do transparent proxying). However, it is less anonymous from the remote site to access the remote site directly; using the ISP's proxy affords some anonymity (unless the remote site colludes with the ISP).

6.4  Security Zones

The firewall was originally defined as a device between different networks that had different security characteristics; it was named after the barrier between a car interior and the engine, which is designed to prevent an engine fire from spreading to the cabin. Demilitarized zones (DMZs) were originally defined as an area outside the firewall but inside a border router, then as a separate leg of the firewall, and now in a variety of ways. An untrusted network may be the Internet, or a wifi network, or a network with public access. What these definitions all have in common is that they define a security zone (this term thanks to the authors of Extreme Exploits), or the barrier between security zones. I believe this concept, that of a security zone where all the nodes inside have roughly equivalent access to or from other security zones, is the most important and fundamental way of thinking of network security. Do not confuse this with the idea that all the systems in the zone have the same relevance to the network's security, or that the systems have the same impact if compromised (for example, your site's DNS servers may be in the same zone as desktops); that is a complication and more of a matter of operating system security than network security.

6.5  Security Equivalent Things Go Together

One issue that always seems to come up is availability versus other goals. For example, suppose you install a new biometric voice recognition system. Then you have a cold and can't get in. Did you prioritize correctly? Which is more important? Similar issues come up in almost every place with regard to security. For example, your system may authenticate users against a global server, or it may have a local database for authentication. The former means that one can revoke a user's credentials globally immediately, but also means that if the global server is down, nobody can authenticate. Attempts to get the best of both worlds ("authenticate locally if global server is unreachable") often reduce to availability (the adversary just DoSes the link between the system and the global server to force local authentication).
My philosophy on this is simple; put like things together. That is, I think authentication information for a system should be on the system. That way, the system is essentially a self-contained unit. By spreading the data out, one multiplies potential attack targets, and reduces availability. If someone can hack the local system, then being able to alter a local authentication database is relatively insignificant.

6.6  Outsider Threats vs. Insider Threats

The perimeter is not here nor there, but it is inside you, and among you.
Most organizations consider the unauthenticated and unauthorized person on the Internet to be the largest threat, and despite hype to the contrary, I believe this is correct. Most people are trustworthy for the sorts of things we trust them for, and if they weren't, society would probably collapse. The difference is that on the Internet, the pool of potential adversaries is much larger, and while a person can only hold one job, they can easily hack into many different organizations. The veterans (and critics) of Usenet and IRC are well aware of this, where the unbalanced tend to be most vocal and most annoying. Some of them seem to have no goal other than to irritate others. In the real world, people learn to avoid these sorts, and employers choose not to hire them, but on the Internet, it's a bit more difficult to filter out the chaff, so to speak. Also, if we detect a misbehaving insider, we can usually identify and therefore punish them; by contrast, it is difficult to take a simple IP address and end up with a successful lawsuit or criminal case, particularly if the IP is in another country. Essentially, perimeter defenses protect against most adversaries, whereas distributed defenses on each host protect against all adversaries (that is, remote systems; local users are the domain of OS security).
The idea of pointing outward versus pointing inward is a well-known one in alarm systems. Your typical door and window sensors are perimeter defenses, and the typical motion detector or pressure mat an internal defense. As with alarm systems, the internally-focused defenses are prone to triggering on authorized activity, whereas the perimeter defenses are less so.
However, I am beginning to think that perimeter defenses are insufficient. As we become more networked, we will have more borders with more systems. End-to-end protocol encryption and VPNs prevent any sort of application-layer data inspection by NIDS devices located at choke points and gateways. High-speed networks, particularly fiber to the desktop, challenge our ability to centralize, inspect, and filter traffic, and require expensive, high-performance equipment. Tunneling and firewall-penetrating technologies like Skype create tunnels (some may say covert channels) through the firewall. Put simply, "the perimeter is everywhere", and the forward-looking should consider how to distribute our security over our assets. For example, everything that is done by a NIDS can be done on the endpoint, and it doesn't suffer from many of the typical problems that a separate device does (including evasion techniques and interpretation ambiguities). Also, this means each internal node pays for its own security; if I am downloading 1Gbps, I am also inspecting it, whereas an idle system isn't spending any cycles inspecting traffic. With the proper design, no packets get lost, dropped, or ignored, nor is it necessary to limit bandwidth because of limited inspection capacity at the perimeter. And we can use commodity hardware (the hardware we already have) to do the work.
Another important issue to consider is series versus parallel defenses (see 24.8). Suppose the gateway, firewall, and VPN endpoint for your organization's main office uses the pf firewall (IMHO, the best open-source firewall out there). Now, suppose a remote office wants to connect in from Linux, so they use iptables. Now, should there be an exploitable weakness in iptables, then they might be able to penetrate the remote office, making them inside the perimeter. Courtesy of the VPN tunnel, they are now inside the perimeter of the main office as well, and your perimeter security is worthless. Given the trend towards a more complex and convoluted perimeter, I think this suggests moving away from perimeter defenses and towards distributed defenses; we can start by creating concentric perimeters, or firewalls between internal networks, and move towards (the ideal but probably unreachable goal of) a packet filter on every machine, implementing least privilege on every system.
A hardware security module (HSM) basically makes everyone but the vendor an outsider; insurance companies love these because they defend against insider threats as well as outsiders.
Dave G. of Matasano has published an interesting piece on the insider threat (http://www.matasano.com/log/984/the-insidious-insider-threat/).

6.7  A Proposed Perimeter Defense

I believe the following design would be a useful design for perimeter defenses for most organizations and individuals. First, there would be an outer layer of reactive prevention, followed by an inner layer of prevention and detection that acts as a fail-safe mechanism. If the outer preventative defense should fail for some reason (hardware, software, configuration) then incoming connections will be stopped by the inner layer and the detection will notify us that something is wrong. The idea of a dual layer of firewalling is already becoming popular with financial institutions and military networks, but really derives itself from the lessons learned trying to guarantee high availability and specifically the goal of eliminating single points of failure. This system also doesn't require monitoring traffic blocked by the outer layer, which virtually eliminates the resources it takes to monitor traffic that gets blocked anyway. However, if the outer layer were not reactive, then we would effectively be discarding any useful intelligence that is gained by detecting probes (that is, a failed connection or attack is still valuable in determining intent). With a reactive firewall as the outer layer, when an adversary probes our defenses looking for holes or weak spots, we take appropriate action, usually shunning that network address, and this makes enumeration a much more difficult process. With a little imagination, we can construct more deceptive defensive measures, like returning random responses, or redirection to a honey-net (which is essentially just a consistent set of bogus responses, plus monitoring). Since enumeration is strictly an information-gathering activity, the obvious countermeasure is deception. The range of deceptive responses runs from none (that is, complete silence, or lack of information) through random responses (misinformation) to consistent, strategic deception (disinformation). 
Stronger responses are out of proportion to the provocation (network scans are legal in most countries), and often illegal in any circumstances.
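As a rough illustration of the reactive outer layer, here is a minimal sketch in Python. It is not a real firewall; the class name, the threshold of three probes, and the example addresses are all assumptions made for the example. It only shows the bookkeeping: a connection to a closed port counts as a probe, and a source that probes too many distinct ports is shunned.

```python
from collections import defaultdict


class ReactiveShunner:
    """Track probes per source address and shun sources that probe too much."""

    def __init__(self, allowed_ports, max_probes=3):
        self.allowed_ports = set(allowed_ports)
        self.max_probes = max_probes      # hypothetical policy value
        self.probes = defaultdict(set)    # source -> closed ports it has touched
        self.shunned = set()

    def handle(self, source, port):
        """Return 'accept' or 'drop' for one connection attempt."""
        if source in self.shunned:
            return "drop"
        if port in self.allowed_ports:
            return "accept"
        # A connection to a closed port is treated as a probe and remembered.
        self.probes[source].add(port)
        if len(self.probes[source]) >= self.max_probes:
            self.shunned.add(source)
        return "drop"


fw = ReactiveShunner(allowed_ports={25, 443})
for port in (21, 23, 8080):               # three probes from one source
    fw.handle("203.0.113.7", port)
print(fw.handle("203.0.113.7", 443))      # the prober is now dropped everywhere
print(fw.handle("198.51.100.2", 443))     # an unrelated source is unaffected
```

A deceptive variant would replace the final "drop" with a random or honey-net response, at the cost of more state to keep consistent.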

6.8  Man In The Middle

How do we detect MITM or impersonation in web, PGP/GPG, SSH contexts? The typical process for creating a connection involves a DNS resolution at the application layer (unless you use IP addresses), then sending packets to the IP address (at the network layer), which have to be routed; at the link layer, ARP typically is used to find the next hop at each stage.

6.8.1  DNS Issues

Poisoning, spoofing (transaction ID issues), or maybe you are querying a DNS server the adversary controls (e.g. your ISP's).

6.8.2  IP Routing

Announcing bogus routes, or topological considerations

6.8.3  Link-layer Issues

ARP poisoning (dsniff)

6.8.4  Physical Layer

Tapping the wire (or listening to wireless)

6.8.5  Periodic Rechecking

It's difficult to stay perpetually in the middle. When you aren't, typically, the cryptographic fingerprints will no longer match and the MITM will be detected. It's handy to occasionally compare them using different channels, so that if the ones you originally relied upon were proxied, the tampering will be detected. SSH does this rechecking automatically; the approach is called the baby duck model (i.e. it bonds to the first thing it sees, and complains if it changes identities). However, this detects the problem only retroactively.
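The baby duck model can be sketched in a few lines of Python. This is only an illustration of the idea, not how SSH stores its known hosts; the class name, the return strings, and the host and key values are assumptions made for the example.

```python
import hashlib


class PinStore:
    """Remember the first key fingerprint seen for each host."""

    def __init__(self):
        self.pins = {}  # host name -> fingerprint first seen

    def check(self, host, public_key_bytes):
        fingerprint = hashlib.sha256(public_key_bytes).hexdigest()
        pinned = self.pins.get(host)
        if pinned is None:
            # First contact: bond to whatever we see (this is the weak moment).
            self.pins[host] = fingerprint
            return "first-use"
        return "match" if pinned == fingerprint else "MISMATCH"


store = PinStore()
print(store.check("host.example", b"key-A"))  # first-use
print(store.check("host.example", b"key-A"))  # match
print(store.check("host.example", b"key-B"))  # MISMATCH
```

Note that the first call trusts whatever it is given, which is why the detection is only retroactive.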

6.8.6  Out-of-Band Comparison

One can compare digests/fingerprints/hashes over a different, low-bandwidth communication medium (e.g. the phone, postal mail).

6.8.7  Parallel Paths

OOB comparison is really an example of creating two disjoint paths between two entities and making sure that they give the same results. This can occur in multiple contexts. For example, it can be used for the bootstrapping problem; how can I trust the first connection? By creating two paths I can compare the identities of the peer both places. I once used this to check the integrity of my PGP downloads by downloading them from home and from another location, and comparing the results.
TODO: show a diagram of what I mean here.
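The download comparison can be sketched as follows, assuming the copies fetched over each path are already in memory as bytes. The function names are hypothetical, and SHA-256 is my choice of digest for the example.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex digest of one downloaded copy."""
    return hashlib.sha256(data).hexdigest()


def paths_agree(copies) -> bool:
    """True if every copy, each fetched over a different path, has the same digest."""
    digests = {sha256_hex(copy) for copy in copies}
    return len(digests) == 1


from_home = b"installer bytes"
from_elsewhere = b"installer bytes"
print(paths_agree([from_home, from_elsewhere]))        # True
print(paths_agree([from_home, b"tampered installer"]))  # False
```

Agreement only shows that both paths delivered the same thing; an adversary who sits on both paths, or at the source, is not detected.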

6.8.8  Formatting

Imagine that the adversary is conducting a MITM against, say, an SSH session, so instead of A<->B it is A<->O<->B. Your countermeasure as A may be to check the IP addresses of the peer at B, so that the adversary would have to spoof IPs in both directions (this is often printed automatically at login). Another technique is to check the host key fingerprint as part of your login sequence, sending the fingerprint through the tunneled connection. The adversary may modify the data at the application layer automatically, to change the fingerprint on the way through. But what if you transformed (e.g. encrypted) the fingerprint using a command-line tool, and represented it as printable characters, and printed them through the tunnel, and inverted the transformation at the local end? Then he'd have a very difficult time writing a program to detect this, especially if you kept the exact mechanism a secret. You could run the program automatically through ssh, so it isn't stored on the remote system.
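A sketch of such a transformation, under loud assumptions: the secret, the function names, and the sample fingerprint are invented for the example, and XOR with a fixed keystream is obfuscation rather than sound encryption. It only shows the shape of the idea, which is to transform on the remote end, send printable characters through the tunnel, and invert locally.

```python
import base64
import hashlib


def _keystream(secret: bytes, length: int) -> bytes:
    # SHAKE-256 can emit as many bytes as we ask for.
    return hashlib.shake_256(secret).digest(length)


def wrap(fingerprint: str, secret: bytes) -> str:
    """Transform a fingerprint and encode it as printable characters."""
    raw = fingerprint.encode("ascii")
    mixed = bytes(a ^ b for a, b in zip(raw, _keystream(secret, len(raw))))
    return base64.b64encode(mixed).decode("ascii")


def unwrap(printable: str, secret: bytes) -> str:
    """Invert wrap() at the local end."""
    mixed = base64.b64decode(printable)
    raw = bytes(a ^ b for a, b in zip(mixed, _keystream(secret, len(mixed))))
    return raw.decode("ascii")


secret = b"known only to me"                 # hypothetical shared secret
remote_fp = "SHA256:example-fingerprint"     # hypothetical value printed remotely
sent = wrap(remote_fp, secret)
print(unwrap(sent, secret) == remote_fp)     # True
```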

7  Identification and Authentication

Identification is necessary before making any sort of access control decisions. Often it can reduce abuse, because an identified individual knows that if they do something there can be consequences or sanctions. For example, if an employee abuses the corporate network, they may find themselves on the receiving end of the sysadmin's luser attitude readjustment tool (LART). I tend to think of authentication as a process you perform on objects (like paintings, antiques, and digitally signed documents), and identification as a process that subjects (people) perform, but in network security you're really looking at data created by a person for the purpose of identifying them, so I use them interchangeably.

7.1  Identity

Sometimes I suspect I'm not who I think I am.
- Ghost in the Shell
An identity, for our purposes, is an abstract concept; it does not map to a person, it maps to a persona. Some people call this a digital ID, but since this paper doesn't talk about non-digital identities, I'm dropping the qualifier. Identities are different from credentials, which are something you use to prove identity. For example, your login password is a credential. In relational database design, it is considered a good practice for the primary key (http://en.wikipedia.org/wiki/Primary_key) of a table to be an integer, perhaps a row number, that is not used for anything else. That is because the primary key is used as an identifier for the row. An identifier is shorthand, a handle; like a pointer, it allows us to modify the object itself, so that the modification occurs in all places simultaneously. Most competent DBAs realize that people change names, phone numbers, locations, and so on; they may even change social security numbers. They also realize that people may share any of these things (even social security numbers are not necessarily unique, especially if they lie about it). So to be able to identify a person across any of these changes, you need to use a row number. The exact same principle applies with security systems.
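The row-number principle can be shown with a small sketch using Python's sqlite3 module. The table and column names, and the sample person, are assumptions made for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# The integer id is the identifier; it carries no other meaning.
db.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
db.execute("CREATE TABLE note (person_id INTEGER REFERENCES person(id), body TEXT)")

cur = db.execute(
    "INSERT INTO person (name, phone) VALUES (?, ?)", ("Alice Smith", "555-0100")
)
alice_id = cur.lastrowid
db.execute("INSERT INTO note VALUES (?, ?)", (alice_id, "badge issued"))

# The name and phone number change; the identifier does not.
db.execute(
    "UPDATE person SET name = ?, phone = ? WHERE id = ?",
    ("Alice Jones", "555-0199", alice_id),
)

row = db.execute(
    "SELECT person.name, note.body FROM note "
    "JOIN person ON person.id = note.person_id"
).fetchone()
print(row)  # ('Alice Jones', 'badge issued')
```

The note still finds its person after the rename because it refers to the row number, not to any attribute that can change.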
In Unix, a person is given a username (identity) and a password (credential). This is good, because the password may be changed without losing the idea of the identity of the person. However, there are subtle gotchas. In actuality, the username is mapped to a user ID (UID), which is the real way that Unix keeps track of identity. It isn't necessarily a one-to-one mapping. Also, a poor system administrator may reassign an unused user ID without going through the file system and looking for files owned by the old user, in which case their ownership is silently reassigned.
PGP and GPG made the mistake of using a cryptographic key as an identifier. If one has to revoke that key, one basically loses anything (such as signatures) which applied to that key, and the trust that other people have indicated towards that key. And if you have multiple keys, friends of yours who have all of them cannot treat them all as equivalent, since GPG can't be told that they are associated with the same identity, because the keys are the identity. Instead, they must manage statements about you (such as how much they trust you to act as an introducer) on each key independently.
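Before reusing a UID, one could look for files still owned by a UID that has no account. The following is a minimal sketch for Unix systems; the function names are hypothetical, and the account check is passed in as a parameter so it can be swapped out.

```python
import os
import pwd


def uid_has_account(uid):
    """True if the UID maps to an entry in the password database."""
    try:
        pwd.getpwuid(uid)
        return True
    except KeyError:
        return False


def find_orphaned_files(root, has_account=uid_has_account):
    """Return (path, uid) pairs under root whose owner UID has no account."""
    orphans = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            uid = os.lstat(path).st_uid   # lstat: do not follow symlinks
            if not has_account(uid):
                orphans.append((path, uid))
    return orphans
```

Running something like find_orphaned_files("/home") before assigning an old UID to a new user would list the files that would otherwise silently change owners.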

7.2  What Authority?

Does it follow that I reject all authority? Far from me such a thought. In the matter of boots, I refer to the authority of the bootmaker; concerning houses, canals, or railroads, I consult that of the architect or the engineer.
- Mikhail Bakunin, What is Authority? 1882 (http://www.panarchy.org/bakunin/authority.1871.html)
When we are attempting to identify someone, we are relying upon some authority, usually the state government. When you register a domain name with a registrar, they record your personal information in the WHOIS database; this is the system of record (http://en.wikipedia.org/wiki/System_of_record). No matter how careful we are, we can never have a higher level of assurance than this authority has. If the government gave that person a false identity, or the person bribed a DMV clerk to do so, we can do absolutely nothing about it. This is an important implication of the limitations of accuracy (see 3.7).

7.3  Authentication Factors

There are many ways you can prove your identity to a system. They may include:
something you are: like biometric signatures such as the pattern of capillaries on your retina, your fingerprints, etc.
something you have: like a token, physical key, or thumb drive
something you know: like a passphrase or password
somewhere you are: if you put a GPS device in a computer, or did direction-finding on transmissions, or simply require a person to be physically present somewhere to operate the system
somewhere you can be reached: like a mailing address, network address, email address, or phone number
At the risk of self-promotion, I want to point out that, to my knowledge, the last factor has not been explicitly stated in computer security literature, although it is demonstrated every time a web site emails you your password, or every time a financial company mails something to your home.

7.4  Authentication Issues: When, What

Do we authenticate each transaction or command (sudo), or a session (SSH), or only certain commands (passwd)? What is being authenticated, the remote system, the agent, or the user?

7.5  The Identity Continuum

Identification can range from fully anonymous to pseudonymous, to full identification. Ensuring identity can be expensive, and is never perfect. Think about what you are trying to accomplish. This applies to cookies from web sites, email addresses, "real names", and so on.

7.6  Problems Remaining Anonymous

In cyberspace everyone will be anonymous for 15 minutes.
- Graham Greenleaf
What can we learn from anonymizer, mixmaster, tor, and so on? Often one can de-anonymize. Some people have de-anonymized search queries this way, and census data, and many more data sets that are supposed to be anonymous.

7.7  Problems with Identifying People

7.8  Remote Attestation

Remote attestation is a concept in network security that involves knowing that the remote system is a particular program or piece of hardware. When I connect securely over the network to a machine I believe I have full privileges on, how do I know I'm actually talking to the machine, and not a similar system controlled by the adversary? This is usually attempted by hiding an encryption key in some tamper-proof part of the system, but is vulnerable to all kinds of disclosure and side-channel attacks, especially if the owner of the remote system is the adversary.
The most successful example seems to be the satellite television industry, where they embed cryptographic and software secrets in an inexpensive smart card with restricted availability, and change them frequently enough that the resources required to reverse engineer each new card exceed the cost of the data it is protecting. In the satellite TV industry, there's something they call ECMs (electronic counter-measures), which are program updates of the form "look at memory location 0xFC, and if it's not 0xFA, then HCF" (Halt and Catch Fire). The obvious crack is to simply remove that part of the code, but then you will trigger another check that looks at the code for the first check, and so on.
The sorts of non-cryptographic self-checks they request the card to do, such as computing a checksum (such as a CRC) over some memory locations, are similar to the sorts of protections against reverse engineering, where the program computes a checksum to detect modifications to itself.

8  Access Control

8.1  Privilege Escalation

Ideally, all services would be impossible to abuse. Since this is difficult or impossible, we often restrict access to them, to limit the potential pool of adversaries. Of course, if some users can do some things and others can't, this creates the opportunity for the adversary to perform an unauthorized action, but that's often unavoidable. For example, you probably want to be able to do things to your computer, like reformat it and install a new operating system, that you wouldn't want others to do. You will want your employees to do things an anonymous Internet user cannot (see 3.4). Thus, many adversaries want to escalate their privileges to that of some more powerful user, possibly you. Generally, privilege escalation attacks refer to techniques that require some level of access above that of an anonymous remote system, but grant an even higher level of access, bypassing access controls.

8.2  Physical Access Control

These include locks. I like Medeco, but none are perfect. It's easy to find guides to lock picking.

8.3  Operating System Access Control: DAC, MAC, RBAC

Discretionary Access Control (DAC) is up to the end-user. They can choose to let other people write to their files, if they wish, and the defaults tend to be global. This is how file permissions on classic Unix and Windows work. A more secure system often involves Mandatory Access Control (MAC), where the security administrator sets up the permissions globally. Some MAC types are Type Enforcement and Domain Type Enforcement. Implementations include SELinux and systrace. Often they are combined, where the access request has to pass both tests, meaning that the effective permission set is the intersection of the MAC and DAC permissions. Another way of looking at it is that MAC sets the maximum permissions that DAC can give. Role-Based Access Control (RBAC) could be considered a form of MAC. In RBAC, there are roles to which permissions are assigned, and one switches roles to change permission sets. For example, you might have a security administrator role, but you don't need that to read email or surf the web, so you only switch to it when doing security administrator stuff. This prevents you from accidentally running malware with full permissions. Unix emulates this with pseudo-users and sudo.
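When a request has to pass both tests, the effective permissions are a set intersection. A minimal sketch, with permission names invented for the example:

```python
def effective_permissions(dac, mac):
    """A request must pass both checks, so only the common permissions survive."""
    return set(dac) & set(mac)


dac = {"read", "write", "execute"}   # what the file's owner chose to grant
mac = {"read", "execute"}            # the ceiling set by the security administrator
print(sorted(effective_permissions(dac, mac)))  # ['execute', 'read']
```

Here the owner granted write access, but the mandatory policy does not allow it, so the write is denied.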

9  Secure System Administration

9.1  Change Management

Change management is the combination of pro-actively declaring and approving intended changes, and retroactively monitoring the system for changes, comparing them to the approved changes, and alerting on and escalating any unapproved changes. Change management is based on the theory that unapproved changes are potentially bad, and is therefore related to anomaly detection (see 12.1). It is normally applied to files and databases.
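The retroactive half can be sketched as a comparison of file digests against a baseline. This is a toy version with hypothetical names; the baseline and current state are dictionaries mapping a path to a digest, and the approved set holds the paths with a declared change.

```python
import hashlib


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def unapproved_changes(baseline, current, approved):
    """Return paths that were added, removed, or modified without approval."""
    flagged = []
    for path in sorted(set(baseline) | set(current)):
        if baseline.get(path) != current.get(path) and path not in approved:
            flagged.append(path)
    return flagged


baseline = {"/etc/passwd": digest(b"v1"), "/etc/motd": digest(b"hello")}
current = {
    "/etc/passwd": digest(b"v2"),    # modified
    "/etc/motd": digest(b"hello"),   # unchanged
    "/tmp/x": digest(b"new"),        # added
}
print(unapproved_changes(baseline, current, approved={"/tmp/x"}))  # ['/etc/passwd']
```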

9.2  Self-Healing Systems

There is a system administration tool called cfengine (http://www.cfengine.org/) which implements a concept called "self-healing systems", whereby any changes made on a given machine are automatically reverted to the (ostensibly correct and secure) state periodically. Any change to these parameters made on a given system but not in the central configuration file is considered to be an accident or an attack, and so if you really want to make a change it has to be done on the centrally-managed and ostensibly monitored configuration file. You can also implement similar concepts by using a tool like rsync to manage the contents of part of the file system.

9.3  Heterogeneous vs. Homogeneous Defenses

Often homogeneous solutions are easier to administer. Having different systems requires more resources, in training yourself, learning to use them properly, keeping up with vulnerabilities, and increases the risk of misconfiguration (assuming you aren't as good at N systems as you would be at one). But there are cases where heterogeneity is easier, or where homogeneity is impossible. Maybe a particular OS you're installing comes with sendmail as the default, and changing it leads to headaches (or the one you want just isn't available on it, because it is a proprietary platform). Embedded devices often have a fixed TCP/IP stack that can't be changed, so if you are to guard against such things, you must either run only one kind of software on all Internet-enabled systems, denying yourself the convenience of all the new network-enabled devices, or you must break Internet-level connectivity with a firewall and admit impotency to defend against internal threats (and anyone who can bypass the perimeter).

10  Logging

10.1  Synchronized Time

It is absolutely vital that your systems have consistent timestamps. Consistency is more important than accuracy, because you are primarily going to be comparing logs between your systems. There are a number of problems comparing timestamps with other systems, including time zones and the fact that their clocks may be skewed. However, ideally, you'd want both, so that you could check whether the other systems are accurate, and so you can make it easier for others to compare their logs with yours. Thus, the Network Time Protocol (NTP) is vital. My suggestion is to have one system at every physical location that acts as the NTP server for the location, so that if the network connections go down, the site remains consistent. They should all feed into one server for your administrative domain, and that should connect with numerous time servers. This also minimizes network traffic, and having a nearby server is almost always better for reducing jitter.

10.2  Syslog

See the SAGE booklet on "Building a Logging Infrastructure".

11  Reports

11.1  Change Reporting

I spend a lot of time reading the same things over and over in security reports. I'd like to be able to filter things that I decided were okay last time without tweaking every single security reporting script. What I want is something that will let me see the changes from day to day. Ideally, I'd be able to review the complete data, but normally I read the reports every day and only want to know what has changed from one day to the next.
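What I am describing amounts to a set difference between yesterday's report and today's. A minimal sketch, assuming each report is a list of lines and that line order does not matter; the function name and sample lines are hypothetical.

```python
def report_changes(yesterday, today):
    """Return the lines that appeared or disappeared since the last report."""
    old, new = set(yesterday), set(today)
    return {"added": sorted(new - old), "removed": sorted(old - new)}


yesterday = ["port 22 open on host-a", "port 80 open on host-b"]
today = ["port 22 open on host-a", "port 8080 open on host-b"]
print(report_changes(yesterday, today))
```

The full reports are still kept, so the complete data remains available for review when needed.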

11.2  Artificial Ignorance

Being able to specify things that I want to ignore in reports is perhaps what Marcus Ranum termed "artificial ignorance" back around 1994 (described here: http://www.ranum.com/security/computer_security/papers/ai/index.html). Instead of specifying what I want to see, which is akin to misuse detection, I want to see anything I haven't already said was okay, which is anomaly detection. Put another way, what you don't know can hurt you (see 22.6), which is why "default deny" is usually a safer access control strategy (see 24.1).
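In code, this is a filter that drops every line matching a known-okay pattern and shows whatever is left. The sketch below is my own illustration, not the script from the paper; the patterns and log lines are made up.

```python
import re


def unexplained(lines, ignore_patterns):
    """Return the lines that match none of the patterns already marked as okay."""
    compiled = [re.compile(pattern) for pattern in ignore_patterns]
    return [
        line for line in lines
        if not any(rx.search(line) for rx in compiled)
    ]


ignore = [r"cron\[\d+\]: session (opened|closed)", r"ntpd\[\d+\]: synchronized"]
log = [
    "cron[812]: session opened for user root",
    "ntpd[301]: synchronized to 192.0.2.1",
    "sshd[990]: Failed password for root from 203.0.113.7",
]
print(unexplained(log, ignore))  # only the sshd line remains
```

Anything new and unanticipated shows up by default, which is the point.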

11.3  Dead Man's Switch

In some movies, a character has a switch which goes off if they die. This is known as a dead man's switch, and the idea can be applied to software (http://en.wikipedia.org/wiki/Dead_man's_switch#Software_uses). I want to see if some subsystem has not reported in. If an adversary overtly disables our system, we are aware that it has been disabled, and we can assume that something security-relevant occurred during that time. But if through some oversight on our side, we allow a system to stop monitoring something, we do not know if anything has occurred during that time. Therefore, we must be vigilant that our systems are always monitoring, to avoid that sort of ambiguity, and we want to know if they are not reporting because of a misconfiguration or failure. For that we need a periodic heartbeat or system test, and a dead man's switch.
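A heartbeat check can be sketched as follows. The subsystem names, timestamps (in seconds), and the allowed silence are assumptions made for the example.

```python
def silent_subsystems(last_seen, now, max_silence):
    """Return the subsystems whose last heartbeat is older than max_silence."""
    return sorted(
        name for name, timestamp in last_seen.items()
        if now - timestamp > max_silence
    )


last_seen = {"ids-sensor": 1000, "log-shipper": 400, "backup-check": 990}
print(silent_subsystems(last_seen, now=1010, max_silence=300))  # ['log-shipper']
```

The checker itself must run somewhere other than the systems it watches, or it fails silently along with them.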

12  Abuse Detection

Doveriai, no proveriai ("trust, but verify")
- Russian Proverb (http://en.wikipedia.org/wiki/Trust,_but_Verify)
It is becoming apparent that there's more to computers than shell access nowadays. One wants to allow benign email, and stop unsolicited bulk email. For wikis and blogs, one wants to allow collaboration, but doesn't want "comment spam". Some still want to read topical USENET messages, and not read spam (I feel that's a lost cause now). If you're an ISP, you want to allow customers to do some things but don't want them spamming or hacking. If you have a public wifi hot-spot, you'd like people to use it but not abuse it. So I generalized IDS, anti-virus, and anti-spam as abuse detection.

12.1  Misuse Detection vs. Anomaly Detection

Most intrusion detection systems categorize behavior, making it an instance of the classification problem (see 3.2). Generally, there are two kinds of intrusion detection systems, commonly called misuse detection and anomaly detection. Misuse detection involves products with signature databases which indicate bad behavior. By analogy, this is like a cop who is told to look for guys in white-and-black striped jumpsuits with burlap sacks with dollar signs printed on them. This is how physical alarm sensors work; they detect the separation of two objects, or the breaking of a piece of glass, or some specific thing. The second is called anomaly detection, which is like a cop who is told to look for "anything out of the ordinary". The first has more false negatives and fewer false positives than the second. The first (theoretically) only finds security-relevant events, whereas the second (theoretically) notes any major changes. This can play out in operating system security (as anti-virus and other anti-malware products) or in network security (as NIDS/IPS). The first is great for vendors; they get to sell you a subscription to the signature database. The second is virtually non-existent and probably rather limited in practice (you have to decide what to measure/quantify in the first place).
In misuse detection, you need to have a good idea of what the adversary is after, or how they may operate. If you get this guess wrong, your signature may be completely ineffective; it may minimize false positives at the risk of false negatives, particularly if the adversary is actually a script that isn't smart enough to take the bait. In this sense, misuse detection is a kind of enumerating badness, which means anything not specifically listed is allowed, and therefore violates the principle of least privilege (see 24.1).

12.2  Honey Traps

Tart words make no friends; a spoonful of honey will catch more flies than a gallon of vinegar.
- Benjamin Franklin
Noted security expert Marcus Ranum gave a talk on burglar alarms once at Usenix Security, and had a lesson that applies to computer security. He said that when a customer of theirs had an alarm sensor that was disguised as a jewelry container or a gun cabinet, it was almost always sure to trick the burglar, and trigger the alarm. Criminals, by and large, are opportunistic, and when something valuable is offered to them, they rarely look a gift horse in the mouth. I also recall a sting operation where a law enforcement agency had a list of criminals they wanted to locate but who never seemed to be home. They sent winning sweepstakes tickets to wanted criminals who dutifully showed up to claim their "prize". So a honey trap may well be the cheapest and most effective misuse detection mechanism you can employ.
One of the ways to detect spam is to have an email address which should never receive any email; if any email is received, then it is from a spammer. These are called spamtraps. Unix systems may have user accounts which may have guessable passwords and no actual owners, so they should never have any legitimate logins. I've also heard of banks which have trap accounts; these tend to be large accounts which should never have a legitimate transaction; they exist on paper only. Any transaction on such an account is, by definition, fraudulent and a sign of a compromised system. One could even go farther and define a profile of transactions, possibly pseudo-random, any deviation from which is considered very important to investigate. The advantages of these types of traps are the extremely low false-positive rate and their value as a deterrent to potential adversaries who fear being caught and punished.
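The check itself is trivial, which is part of the appeal. A minimal sketch, where the event format and the trap names are hypothetical:

```python
def trap_hits(events, traps):
    """Return every event that touches a trap; each one deserves investigation."""
    return [event for event in events if event["target"] in traps]


traps = {"acct-999", "nobody@trap.example"}
events = [
    {"actor": "teller-12", "target": "acct-104"},
    {"actor": "teller-07", "target": "acct-999"},
]
print(trap_hits(events, traps))  # only the transaction on acct-999
```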

12.3  Tripwires and Booby Traps

Other misuse detection methods involve detecting some common activity after the intrusion, such as fetching additional tools (outbound TFTP connections to servers in Eastern Europe are not usually authorized) or connecting back to the adversary's system to bypass ingress rules on the firewall (e.g. shoveling application output to a remote X server). Marcus Ranum once recompiled "ls" to shut down the system if it was run as root, and he learned to habitually use "echo *" instead. One may wish to check that it has a controlling tty as well, so that root-owned scripts do not set it off. In fact, having a root-owned shell with no controlling tty may be an event worth logging.

12.4  Anti-Malware

This includes anti-virus, anti-trojan, anti-spyware, etc.

12.5  Anti-Spam

There's content filtering (including Bayesian filtering, and signature-based algorithms), delays of various kinds (graylisting), resource-consumption responses (teergrubing), blacklisting, micro-payment schemes, SPF, DKIM, and so on.

12.6  Detecting Automated Peers

People who abuse things for money want to do a lot of it, so frequently you'll want to try to detect them. You could be doing this for any of a number of reasons:
  1. To prevent people from harvesting email addresses for spamming
  2. To prevent bots from defacing your wiki with links to unrelated sites
  3. To prevent password-guessing

12.6.1  CAPTCHA

A CAPTCHA is a Completely Automated Public Turing test to tell Computers and Humans Apart (http://en.wikipedia.org/wiki/Captcha). Basically they are problems whose answers are known and which are difficult for computers to answer directly.

12.6.2  Bot Traps

If you want to stop people from spidering your web site, you may use something called a "bot trap". This is similar to a CAPTCHA in that it tries to lure bots into identifying themselves by exploiting a behavior difference from humans.

12.6.3  Velocity Checks

This is an application of anomaly detection to differentiate computers and humans, or to differentiate between use and abuse. You simply look at how many transactions they are doing. You can take a baseline of what you think a human can do, and trigger any time an entity exceeds this. Or, you can profile each entity and trigger if they exceed their normal statistical profile, possibly applying machine learning algorithms to adjust expectations over time.
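The fixed-baseline variant can be sketched with a sliding time window. The class name, the limit of three events, and the 60-second window are assumptions for the example, not recommended values.

```python
from collections import defaultdict, deque


class VelocityCheck:
    """Flag an entity that performs more than max_events within a time window."""

    def __init__(self, max_events, window_seconds):
        self.max_events = max_events
        self.window = window_seconds
        self.events = defaultdict(deque)   # entity -> recent event timestamps

    def record(self, entity, timestamp):
        """Record one transaction; return True if the entity is over the limit."""
        recent = self.events[entity]
        recent.append(timestamp)
        # Forget events that have fallen out of the window.
        while recent and timestamp - recent[0] >= self.window:
            recent.popleft()
        return len(recent) > self.max_events


check = VelocityCheck(max_events=3, window_seconds=60)
flags = [check.record("client-1", t) for t in (0, 1, 2, 3)]
print(flags)                          # [False, False, False, True]
print(check.record("client-1", 100))  # False: the earlier burst has aged out
```

The per-entity profile variant would replace the fixed max_events with a threshold learned from each entity's history.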

12.6.4  Typing Mistakes

The kojoney honey pot (http://kojoney.sourceforge.net/) emulates an SSH server in order to gather intelligence against adversaries. Regarding how it separates bots from humans, it says:
We, the humans, are clumsy. The script seeks for SUPR and BACKSPACE characters in the executed commands.
The script also checks if the intruder tried to change the window size or tried to forward X11 requests.
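As a sketch of the idea only (this is not kojoney's code), one could scan the raw keystrokes of a session for correction keys. The byte values below are assumptions about how a terminal encodes Backspace and Delete (SUPR); they vary between terminals.

```python
BACKSPACE_CHARS = {"\x08", "\x7f"}   # assumed encodings of the Backspace key
DELETE_SEQUENCE = "\x1b[3~"          # assumed escape sequence for the Delete key


def shows_corrections(keystrokes: str) -> bool:
    """True if the raw input contains signs of a human fixing a typo."""
    if DELETE_SEQUENCE in keystrokes:
        return True
    return any(ch in BACKSPACE_CHARS for ch in keystrokes)


print(shows_corrections("uname -a\n"))               # False
print(shows_corrections("wgte\x7f\x7fet file\n"))    # True
```

A bot can of course be written to fake mistakes, so this is evidence rather than proof.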

12.7  Host-Based Intrusion Detection

Game over man! Game over!