Security Concepts

travis+security@subspacefield.org

Abstract

This is an online book about computer, network, technical, physical, information and cryptographic security. It is a labor of love, incomplete until the day I am finished.

Contents

1  Metadata
    1.1  Copyright and Distribution Control
    1.2  Goals
    1.3  Audience
    1.4  About This Work
    1.5  How to Read the Online Version
    1.6  About Writing This
    1.7  Tools Used To Create This Book
2  Security Properties
    2.1  Information Security is a PAIN
    2.2  Parkerian Hexad
    2.3  Pentagon of Trust
    2.4  Security Equivalency
    2.5  Other Questions
3  Security Concepts
    3.1  The Classification Problem
        3.1.1  Classification Errors
        3.1.2  The Base-Rate Fallacy
        3.1.3  Test Efficiency
        3.1.4  Incompletely-Defined Sets
        3.1.5  The Guessing Hazard
        3.1.6  Machine Learning
    3.2  Security Layers
    3.3  Privilege Levels
    3.4  What is a Vulnerability?
    3.5  Accuracy Limitations
4  Economics of Security
    4.1  How Expensive are Security Failures?
5  Adversary Modeling
    5.1  Common Psychological Errors
    5.2  Cost-Benefit
    5.3  Risk Tolerance
    5.4  Capabilities
    5.5  Sophistication Distribution
    5.6  Goals
6  Threat Modeling
    6.1  Attack Surface
    6.2  Attack Trees
    6.3  The Weakest Link
7  Physical Security
    7.1  No Physical Security Means No Security
    7.2  Data Remanence
        7.2.1  Magnetic Storage Media (Disks)
        7.2.2  Semiconductor Storage (RAM)
8  Distributed Systems
    8.1  Network Security Overview
    8.2  Network Access Control
    8.3  Network Reconnaissance
    8.4  Network Intrusion Detection and Prevention
    8.5  Cryptography is the Sine Qua Non of Secure Distributed Systems
    8.6  Hello, My Name is 192.168.1.1
    8.7  Source Tapping; The First Hop and Last Mile
    8.8  Security Equivalent Things Go Together
    8.9  A Proposed Perimeter Defense
    8.10  Man In The Middle
        8.10.1  DNS Issues
        8.10.2  IP Routing
        8.10.3  Link-layer Issues
        8.10.4  Physical Layer
        8.10.5  Periodic Rechecking
        8.10.6  Out-of-Band Comparison
        8.10.7  Parallel Paths
        8.10.8  Formatting
    8.11  Network Surveillance
9  Identification and Authentication
    9.1  Identity
    9.2  What Authority?
    9.3  Authentication Factors
    9.4  Authentication Issues: When, What
    9.5  The Identity Continuum
    9.6  Problems Remaining Anonymous
    9.7  Problems with Identifying People
    9.8  Remote Attestation
    9.9  Advanced Authentication Tools
10  Authorization - Access Control
    10.1  Privilege Escalation
    10.2  Physical Access Control
    10.3  Operating System Access Control
    10.4  Application Authorization Decisions
        10.4.1  Apache Access Control
    10.5  IPTables, IPChains, Netfilter
    10.6  PF
    10.7  Keynote
11  Secure System Administration
    11.1  Monitoring
    11.2  Change Management
    11.3  Self-Healing Systems
    11.4  Heterogeneous vs. Homogeneous Defenses
12  Logging
    12.1  Synchronized Time
    12.2  Syslog
13  Reports
    13.1  Change Reporting
    13.2  Artificial Ignorance
    13.3  Dead Man's Switch
14  Abuse Detection
    14.1  Misuse Detection vs. Anomaly Detection
    14.2  Computer Immune Systems
    14.3  Behavior-Based Detection
    14.4  Honey Traps
    14.5  Tripwires and Booby Traps
    14.6  Anti-Malware
    14.7  Anti-Spam
        14.7.1  Content filtering
        14.7.2  Delays
        14.7.3  Blocking Known Offenders
        14.7.4  Sending Email
        14.7.5  Macro-Level Techniques
        14.7.6  Individual-Level Techniques
        14.7.7  Micropayment Systems
        14.7.8  Insolubility
    14.8  Detecting Automated Peers
        14.8.1  CAPTCHA
        14.8.2  Bot Traps
        14.8.3  Velocity Checks
        14.8.4  Typing Mistakes
    14.9  Host-Based Intrusion Detection
    14.10  Intrusion Detection Principles
    14.11  Intrusion Information Collection
    14.12  Intrusion Alerting
        14.12.1  Possible Intrusion Alerting Solutions
15  Abuse Response
    15.1  How to Respond to Abuse
        15.1.1  The Silent Treatment
        15.1.2  Honest Rejection
        15.1.3  Random Response
        15.1.4  Faux Positives
        15.1.5  The Simulation Defense
        15.1.6  Fishbowls
        15.1.7  Hack-Back
    15.2  Identification Issues
    15.3  Resource Consumption Defenses
    15.4  Proportional Response
16  Forensics
    16.1  Forensic Limitations
    16.2  Ephemeral Data
    16.3  Remnant Data
    16.4  Hidden Data
    16.5  Metadata
    16.6  Locating Encryption Keys and Encrypted Data
    16.7  Forensic Inference
17  Intrusion Response
18  Network Security
    18.1  The Current State of Things
    18.2  Traffic Identification
        18.2.1  RPC
        18.2.2  Dynamic Port Numbers
        18.2.3  Encapsulation
        18.2.4  Possible Solutions
    18.3  Brute-Force Defenses
    18.4  Federated Defense
    18.5  VLANs Are Not Security Technologies
    18.6  Advanced Network Security Technologies
19  Email Security
    19.1  Unsolicited Bulk Email
        19.1.1  Filtering
        19.1.2  Graylisting
    19.2  Phishing
20  Web Security
    20.1  Direct Browser Attacks
    20.2  Indirect Browser Attacks
    20.3  Crawler Attacks
    20.4  SSL Certificates Made Redundant
21  Application Security
    21.1  Security is a Subset of Correctness
    21.2  Malware vs. Data-Directed Attacks
    21.3  Reverse Engineering
        21.3.1  Tutorials
        21.3.2  Analyses
        21.3.3  Tools
        21.3.4  Anti-Anti-Reverse Engineering
    21.4  Application Exploitation
    21.5  Application Exploitation Defenses
        21.5.1  Stack-Smashing Protection
        21.5.2  Address-Space Layout Randomization (ASLR)
        21.5.3  Write XOR Execute
    21.6  Software Complexity
        21.6.1  Complexity of Network Protocols
        21.6.2  Polymorphism and Complexity
    21.7  Failure Modes
    21.8  Fault Tolerance
    21.9  Implications of Incorrectness
22  Human Factors and Usability
    22.1  The Psychology of Security
    22.2  Security Should Be Obvious
    22.3  Security Should Be Easy to Use
    22.4  No Hidden Data
23  Attack Patterns
    23.1  Attack Taxonomy
    23.2  Attack Properties
    23.3  Attack Cycle
24  Trust
    24.1  Trust and Trustworthiness
    24.2  Who or What Are You Trusting?
    24.3  Code Provenance
    24.4  The Incompetence Defense
25  Cryptology
    25.1  Limits of Cryptography
        25.1.1  The Last Foot of the Communication
        25.1.2  Limitations Regarding Endpoint Security
        25.1.3  Keys Must Be Exchanged
        25.1.4  In Practice
        25.1.5  The Complexity Trap
    25.2  Things To Know Before Doing Crypto
        25.2.1  Dramatis Personae
        25.2.2  Jargon
        25.2.3  How Strong Should My Cryptography Be?
        25.2.4  Key Lengths
        25.2.5  Eight Bit Clean Handling
        25.2.6  Encoding Binary Data
        25.2.7  Avoiding Ambiguity
        25.2.8  End-to-End vs. Link Level
    25.3  Cryptographic Algorithms
        25.3.1  Ciphers
        25.3.2  Cryptographic Hashes
        25.3.3  Message Authentication Codes and HMAC
        25.3.4  Signing
    25.4  Cryptographic Algorithm Enhancements
        25.4.1  Speed of Algorithms and the Hybrid Encryption Scheme
        25.4.2  Hashing Stored Authentication Data
        25.4.3  Offline Dictionary Attacks and Iterated Hashes
        25.4.4  Salts vs. Offline Dictionary Attacks and Rainbow Tables
        25.4.5  Offline Dictionary Attacks with Partial Confidentiality
        25.4.6  Never Use User Passphrases As Keys
        25.4.7  Run Algorithm Inputs through OWF
    25.5  Cryptographic Combinations
        25.5.1  Combiners
        25.5.2  The Sign then Encrypt Problem
        25.5.3  Key Derivation Functions
        25.5.4  Serialization, Records and Encoding
        25.5.5  Polymorphic Data and Ambiguity
    25.6  Cryptographic Protocols
        25.6.1  DoS and Anti-Clogging Tokens
        25.6.2  The Problem with Authenticating within an Encrypted Channel
        25.6.3  How to Protect the Integrity of a Session
        25.6.4  Freshness and Replay Attacks
        25.6.5  Preventing Feedback
        25.6.6  Identification
        25.6.7  Authentication
        25.6.8  Eschew Multiple Encoding Schemes Unless Necessary
        25.6.9  Key Exchange and Hybrid Encryption Schemes
    25.7  Encrypted Storage
        25.7.1  Key Escrow for Encrypted Storage
        25.7.2  Evolution of Cryptographic Storage Technologies
        25.7.3  Filesystem Crypto Layers
        25.7.4  File Systems with Optional Encryption
        25.7.5  Block Device Crypto
        25.7.6  The Cryptographically-Strong Pseudo-random Quick Fill
        25.7.7  Backups
    25.8  Key Management
        25.8.1  Key Exchange and the Bootstrapping Problem
        25.8.2  Key Management and Scalability
        25.8.3  On-Air Keying (OAK)
        25.8.4  One Key, One Purpose
        25.8.5  Time Compartmentalization
        25.8.6  Key Indirection
        25.8.7  Secret Sharing
    25.9  Cryptanalysis
        25.9.1  Cryptographic Attack Patterns
        25.9.2  A Priori Knowledge
26  Randomness and Unpredictability
    26.1  An Ideal Random Number Generator
    26.2  Definitions of Unpredictability
    26.3  Definitions of Randomness
    26.4  Why Entropy and Unpredictability Are Not the Same
    26.5  Unpredictability is the Sine Qua Non of Cryptography
    26.6  Unpredictability is Not Provable
    26.7  Randomly Generated Samples
    26.8  Testing For Predictability
    26.9  Ways to Fail
    26.10  Humans Are Too Predictable
    26.11  Sources of Unpredictability
    26.12  The Laws of Unpredictability
        26.12.1  The First Law of Unpredictability
        26.12.2  The Second Law of Unpredictability
        26.12.3  Mixing Unpredictability
        26.12.4  Getting it Wrong
27  Lateral Thinking
    27.1  Traffic Analysis
    27.2  Side Channels
        27.2.1  Physical Information-Gathering Attacks and Defenses
        27.2.2  Signal Injection Attacks and Defenses
        27.2.3  System-Local Side-Channel Attacks
        27.2.4  Timing Side-Channels
28  Information and Intelligence
    28.1  Intelligence Jargon
    28.2  Controlling Information Flow
    28.3  Labeling and Regulations
    28.4  Knowledge is Power
    28.5  Secrecy is Power
    28.6  Never Confirm Guesses
    28.7  What You Don't Know Can Hurt You
    28.8  How Secrecy is Lost
    28.9  Costs of Disclosure
    28.10  Dissemination
    28.11  Information, Misinformation, Disinformation
29  Conflict and Combat
    29.1  Indicators and Warnings
    29.2  Attacker's Advantage in Network Warfare
    29.3  Defender's Advantage in Network Warfare
    29.4  OODA Loops
30  Security Principles
    30.1  The Principle of Least Privilege
    30.2  The Principle of Agility
    30.3  The Principle of Minimal Assumptions
    30.4  The Principle of Fail-Secure Design
    30.5  The Principle of Unique Identifiers
    30.6  The Principles of Simplicity
    30.7  The Principle of Defense in Depth
    30.8  The Principle of Uniform Fronts
    30.9  The Principle of Split Control
    30.10  The Principle of Minimal Changes
    30.11  The Principle of Centralized Management
    30.12  The Principle of Least Surprise
    30.13  The Principle of Removing Excuses
    30.14  The Principle of Retaining Control
    30.15  Availability Principles
31  Common Arguments
    31.1  Disclosure: Full, Partial, or None?
        31.1.1  Arguments For Full Disclosure
        31.1.2  Arguments Against Full Disclosure - Vendor
        31.1.3  Arguments Against Full Disclosure - Vendor's Employees
        31.1.4  Arguments Against Full Disclosure - End User
    31.2  Absolute vs. Effective Security
    31.3  Quantification and Metrics vs. Intuition
    31.4  Security Through Obscurity
    31.5  Security of Open Source vs. Closed Source
    31.6  Insider Threat vs. Outsider Threat
        31.6.1  In Favor of Perimeter Defenses
        31.6.2  What Perimeter?
        31.6.3  Performance Issues
    31.7  Prevention vs. Detection
        31.7.1  Prevention over Detection
        31.7.2  Detection over Prevention
        31.7.3  Impact on Intelligence Collection
    31.8  Audit vs. Monitoring
    31.9  Early vs. Late Adopters
    31.10  Sending HTML Email
32  Editorials, Predictions, Polemics, and Personal Opinions
    32.1  Security is for Polymaths
    32.2  Linear Order Please!
    32.3  Computers are Transcending our Limitations
    32.4  Reusable Authentication Data Considered Harmful
    32.5  Password Length Limits Considered Harmful
    32.6  Everything Will Be Encrypted Soon
    32.7  Error Propagation Characteristics Usually Don't Matter
    32.8  Keep it Legal, Stupid
    32.9  Should My Employees Attend "Hacker" Conferences?
    32.10  Should You Sell Out?
    32.11  Anonymity is not a Crime
        32.11.1  Example: Sears Makes Customer Purchase Information Available Online, Provides Spyware to Customers
    32.12  Monitoring Your Employees
    32.13  Trust People in Spite of Counterexamples
    32.14  Do What I Mean vs. Do What I Say
    32.15  You Are Part of the Problem if You...
    32.16  What Do I Do to Not Get Hacked?
33  Resources
    33.1  My Other Stuff
    33.2  Conferences
    33.3  Books
        33.3.1  Publishers
        33.3.2  Titles
    33.4  Periodicals
    33.5  Blogs
    33.6  Mailing Lists
34  Credits

1  Metadata

1.1  Copyright and Distribution Control

Kindly link a person to it instead of redistributing it, so that people may always receive the latest version. However, even an outdated copy is better than none. The PDF version is preferred and more likely to render properly (especially graphics and special mathematical characters), but the HTML version is simply too convenient to not have it available. The latest version is always here:
http://www.subspacefield.org/security/security_concepts.html
 
This is a copyrighted work, with some rights reserved. This work is licensed under the Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License (http://creativecommons.org/licenses/by-nc-nd/3.0/us/). This means you may redistribute it for non-commercial purposes, and that you must attribute me properly (without suggesting I endorse your work). For attribution, please include a prominent link back to this original work and some text describing the changes. I am comfortable with certain derivative works, such as translation into other languages, but not sure about others, so I have not yet explicitly granted permission for all derivative uses. If you have any questions, please email me and I'll be happy to discuss it with you.

1.2  Goals

I wrote this paper to try and examine the typical problems in computer security and related areas, and attempt to extract from them principles for defending systems. To this end I attempt to synthesize various fields of knowledge, including computer security, network security, cryptology, and intelligence. I also attempt to extract the principles and implicit assumptions behind cryptography and the protection of classified information, as obtained through reverse-engineering (that is, informed speculation based on existing regulations and stuff I read in books), where they are relevant to technological security.

1.3  Audience

When I picture a perfect reader, I always picture a monster of courage and curiosity, also something supple, cunning, cautious, a born adventurer and discoverer.
- Friedrich Nietzsche
This is not intended to be an introductory text, although a beginner could gain something from it. The reason behind this is that beginners think in terms of tactics, rather than strategy, and of details rather than generalities. There are many fine books on computer and network security tactics (and many more not-so-fine books), and tactics change quickly, and being unpaid for this work, I am a lazy author. The reason why even a beginner may gain from it is that I have attempted to extract abstract concepts and strategies which are not necessarily tied to computer security. And I have attempted to illustrate the points with interesting and entertaining examples and would love to have more, so if you can think of an example for one of my points, please send it to me!
I'm writing this for you, noble reader, so your comments are very welcome; you will be helping me make this better for every future reader. If you send a contribution or comment, you'll save me a lot of work if you tell me whether you wish to be mentioned in the credits (see section 34) or not; I want to respect the privacy of anonymous contributors. If you're concerned that would be presumptuous, don't be; I consider it considerate of you to save me an email exchange. Security bloggers will find plenty of fodder by looking for new URLs added to this page, and I encourage you to do it, since I simply don't have time to comment on everything I link to. If you link to this paper from your blog entry, all the better.

1.4  About This Work

I have started this book with some terminology as a way to frame the discussion. Then I get into the details of the technology. Since this is adequately explained in other works, these sections are somewhat lean and may merely be a list of links. Then I get into my primary contribution, which is the fundamental principles of security which I have extracted from the technological details. Afterwards, I summarize some common arguments that one sees among security people, and I finish up with some of my personal observations and opinions.

1.5  How to Read the Online Version

Since this document is constantly being revised, I suggest that you start with the table of contents and click on the subject headings so that you can see which ones you have read already. If I add a section, it will show up as unread. By the time it has expired from your browser's history, it is probably time to re-read it anyway, since the contents have probably been updated.
See the end of this page for the date it was generated (which is also the last update time). I currently update this about once every two weeks.

1.6  About Writing This

Part of the challenge with writing about this topic is that we are always learning and it never seems to settle down, nor does one ever seem to get a sense of completion. I consider it more permanent and organized than a blog, more up-to-date than a book, and more comprehensive and self-contained than most web pages. I know it's uneven; in some areas it's just a heading with a paragraph, or a few links, in other places it can be as smoothly written as a book. I thought about breaking it up into multiple documents, so I could release each with much more fanfare, but that's just not the way I write, and it makes it difficult to do as much cross-linking as I'd like.
This is to my knowledge the first attempt to publish a computer security book on the web before printing it, so I have no idea if it will even be possible to print it commercially. That's okay; I'm not writing for money. I'd like for the Internet to be the public library of the 21st century, and this is my first significant donation to the collection. I am reminded of the advice of a staffer in the computer science department, who said, "do what you love, and the money will take care of itself".
That having been said, if you want to contribute towards the effort, you can help me defray the costs of maintaining a server and such by visiting our donation page (http://www.subspacefield.org/donate.html). If you would like to donate but cannot, you may wait until such a time as you can afford to, and then give something away (i.e. pay it forward).

1.7  Tools Used To Create This Book

I use LyX (http://www.lyx.org/), but I'm still a bit of a novice. I have a love/hate relationship with it and the underlying typesetting language LaTeX (http://en.wikipedia.org/wiki/LaTeX).

2  Security Properties

What do we mean by secure? When I say secure, I mean that an adversary can't make the system do something that its owner (or designer, or administrator, or even user) did not intend. Often this involves a violation of a general security property. Some security properties include:
confidentiality
refers to whether the information in question is disclosed or remains private.
integrity
refers to whether the systems (or data) remain uncorrupted. The opposite of this is malleability, where it is possible to change data without detection, and believe it or not, sometimes this is a desirable security property.
availability
is whether the system is available when you need it or not.
consistency
is whether the system behaves the same each time you use it.
auditability
is whether the system keeps good records of what has happened so it can be investigated later. Direct-record electronic voting machines (with no paper trail) are unauditable.
control
is whether the system obeys only the authorized users or not.
authentication
is whether the system can properly identify users. Sometimes, it is desirable that the system cannot do so, in which case it is anonymous or pseudonymous.
non-repudiation
is a relatively obscure term meaning that if you take an action, you won't be able to deny it later. Sometimes, you want the opposite, in which case you want repudiability ("plausible deniability").
Please forgive the slight difference in the way they are named; while English is partly to blame, these properties are not entirely parallel. For example, confidentiality refers to information (or inferences drawn on such) just as program refers to an executable stored on the disk, whereas control implies an active system just as process refers to a running program (as they say, "a process is a program in motion"). Also, you can compromise my data confidentiality with a completely passive attack such as reading my backup tapes, whereas controlling my system is inherently detectable since it involves interacting with it in some way.

2.1  Information Security is a PAIN

You can remember the security properties of information as PAIN: Privacy, Authenticity, Integrity, Non-repudiation.

2.2  Parkerian Hexad

There is something similar known as the "Parkerian Hexad", defined by Donn B. Parker, which is a set of six fundamental, atomic, non-overlapping attributes of information that are protected by information security measures:
  1. confidentiality
  2. possession
  3. integrity
  4. authenticity
  5. availability
  6. utility

2.3  Pentagon of Trust

  1. Admissibility (is the remote node trustworthy?)
  2. Authentication (who are you?)
  3. Authorization (what are you allowed to do?)
  4. Availability (is the data accessible?)
  5. Authenticity (is the data intact?)

2.4  Security Equivalency

I consider two objects to be security equivalent if they are identical with respect to the security properties under discussion; for precision, I may refer to confidentiality-equivalent pieces of information if the sets of parties to which they may be disclosed (without violating security) are exactly the same (and conversely, so are the sets of parties to which they may not be disclosed). In this case, I'm discussing objects which, if treated improperly, could lead to a compromise of the security goal of confidentiality. Or I could say that two cryptosystems are confidentiality-equivalent, in which case the objects help achieve the security goal. To be perverse, these last two examples could be combined; if the information in the first example was actually the keys for the cryptosystem in the second example, then disclosure of the first could impact the confidentiality of the keys and thus the confidentiality of anything handled by the cryptosystems. Alternately, I could refer to access-control equivalence between two firewall implementations; in this case, I am discussing objects which implement a security mechanism which helps us achieve the security goal, such as confidentiality of something.

2.5  Other Questions

  1. Secure to whom? A web site may be secure (to its owners) against unauthorized control, but may employ no encryption when collecting information from customers.
  2. Secure from whom? A site may be secure against outsiders, but not insiders.

3  Security Concepts

There is no security on this earth, there is only opportunity.
- General Douglas MacArthur (1880-1964)
These are important concepts which appear to apply across multiple security domains.

3.1  The Classification Problem

Many times in security you wish to distinguish between classes of data. This occurs in firewalls, where you want to allow certain traffic but not all, and in intrusion detection where you want to allow benign traffic but not allow malicious traffic, and in operating system security, we wish to allow the user to run their programs but not malware. In doing so, we run into a number of limitations in various domains that deserve mention together.

3.1.1  Classification Errors

False Positives vs. False Negatives, also called Type I and Type II errors. Discuss equal error rate (EER) and its use in biometrics. Sometimes in medicine they will do a cheap test with a high error rate biased in one direction (often false positives), and a more expensive test with a lower error rate, usually biased in the other direction.
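As a rough illustration of the two error types and of the equal error rate, here is a minimal sketch. It assumes a detector that assigns each sample a score and flags anything at or above a threshold; the function names and the sample scores are hypothetical, not taken from any real biometric system.

```python
def error_rates(scores_benign, scores_malicious, threshold):
    """Flag anything scoring >= threshold; return (false positive rate, false negative rate)."""
    # Type I error: a benign sample is flagged.
    false_pos = sum(1 for s in scores_benign if s >= threshold)
    # Type II error: a malicious sample is missed.
    false_neg = sum(1 for s in scores_malicious if s < threshold)
    return false_pos / len(scores_benign), false_neg / len(scores_malicious)


def approximate_eer(scores_benign, scores_malicious):
    """Return (threshold, FPR, FNR) at the observed score where the two rates are closest."""
    candidates = sorted(set(scores_benign) | set(scores_malicious))
    return min(
        ((t, *error_rates(scores_benign, scores_malicious, t)) for t in candidates),
        key=lambda r: abs(r[1] - r[2]),
    )


# Made-up scores purely for illustration.
benign = [0.1, 0.2, 0.3, 0.6]
malicious = [0.4, 0.7, 0.8, 0.9]
threshold, fpr, fnr = approximate_eer(benign, malicious)
```

Raising the threshold trades false positives for false negatives; the equal error rate is just the point where the two curves cross, which makes it a convenient single number for comparing detectors.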

3.1.2  The Base-Rate Fallacy

In The Base Rate Fallacy and its Implications for Intrusion Detection (http://www.raid-symposium.org/raid99/PAPERS/Axelsson.pdf), the author essentially points out that there's a lot of benign traffic for every attack, and so even a small chance of a false positive will quickly overwhelm any true positives. Put another way, if one out of every 10,001 connections is malicious, and the test has a 1% false positive error rate, then for every 1 real malicious connection there are 10,000 benign connections, and hence 100 false positives.
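The arithmetic can be checked with a short sketch. The function name and the detection_rate parameter are hypothetical; the example assumes the test catches every real attack (detection_rate=1.0), which the text above does not state, so treat it as a best case for the detector.

```python
def expected_alerts(benign, malicious, false_positive_rate, detection_rate=1.0):
    """Return (false positives, true positives, fraction of alerts that are real)."""
    false_positives = benign * false_positive_rate
    true_positives = malicious * detection_rate  # assumption: perfect detection by default
    precision = true_positives / (true_positives + false_positives)
    return false_positives, true_positives, precision


# One malicious connection among 10,001, with a 1% false positive rate.
false_positives, true_positives, precision = expected_alerts(10_000, 1, 0.01)
```

Even with perfect detection, only about one alert in a hundred corresponds to a real attack, which is the point of the base-rate argument.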

3.1.3  Test Efficiency

In other cases, you are perfectly capable of performing an accurate test, but not on all the traffic. You may want to apply a cheap test with some errors on one side before applying a second, more expensive test on the side with errors to weed them out. This is done in BSD Unix with packet capturing via tcpdump, which uploads a coarse filter into the kernel, and then applies a more expensive but finer-grained test in userland which only operates on the packets which pass the first test.
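A minimal sketch of the cheap-then-expensive idea follows. It is not how tcpdump or the kernel packet filter is implemented; the packet dictionaries, field names, and both tests are hypothetical stand-ins chosen only to show the shape of the technique.

```python
def cheap_filter(packet):
    """Coarse, fast test: passes everything we want plus some packets we don't."""
    return packet["dst_port"] == 80


def expensive_test(packet):
    """Finer-grained, slower test, run only on packets that passed the cheap one."""
    return packet["payload"].startswith(b"GET /admin")


def two_stage(packets):
    """Return (matching packets, number of times the expensive test was run)."""
    expensive_calls = 0
    matches = []
    for packet in packets:
        if not cheap_filter(packet):
            continue  # rejected cheaply; the expensive test never sees it
        expensive_calls += 1
        if expensive_test(packet):
            matches.append(packet)
    return matches, expensive_calls


# Made-up traffic for illustration.
packets = [
    {"dst_port": 22, "payload": b"SSH-2.0"},
    {"dst_port": 80, "payload": b"GET /index.html"},
    {"dst_port": 80, "payload": b"GET /admin HTTP/1.1"},
    {"dst_port": 53, "payload": b"\x00\x01"},
]
matches, expensive_calls = two_stage(packets)
```

The design only works if the cheap test errs on one side: it may let uninteresting packets through, but it must not reject anything the expensive test would have accepted.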

3.1.4  Incompletely-Defined Sets

As far as the laws of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality.
- Albert Einstein
Stop for a moment and think about the difficulty of trying to list all the undesirable things that your computer shouldn't do. If you find yourself finished, then ask yourself: did you include that it shouldn't attack other computers? Did you include that it shouldn't transfer $1000 to a mafia-run web site when you really intended to transfer $100 to your mother? Did you include that it shouldn't send spam to your address book? The list goes on and on.
Thus, if we had a complete list of everything that was bad, we'd block it and never have to worry about it again. However, often we either don't know, or the set is infinite. Similarly, we may not be able to obtain a complete list of everything that is good; imagine trying to specify in advance all the network packets that should be allowed into your enterprise!

3.1.5  The Guessing Hazard

So often we can't enumerate all the things we would want to do, nor all the things that we would not want to do. Because of this, intrusion detection systems (see ) often simply guess; they try to detect attacks unknown to them by looking for features that are likely to be present in malware but not in normal traffic. At the current moment, you can find out if your traffic is passing through an IPS by trying to send a long string of A's in a session. A string of A's isn't malicious by itself, but A is a common letter with which people pad exploits (see ). In this case, it's a great example of a false positive, or collateral damage, generated through guilt-by-association; there's nothing inherently suspicious about a string of A's, it's just that exploit writers use them a lot, and IPS vendors decided that made them suspicious. I'm not a big fan of these heuristics because I feel that they break functionality that doesn't threaten the system, and that they could be used as evidence of malfeasance against someone by someone who doesn't really understand the technology. I'm already irritated by the false positives or excessive warnings about security tools from anti-virus software; it seems to alert on "potentially-unwanted programs" an absurd amount of the time. Most novices don't understand that the anti-virus software reads the disk even though I'm not running the programs, and that you have nothing to fear if you don't run the programs. I fear that one day my Internet Service Provider will start filtering them out of my email or network streams, but fortunately they just don't care that much.
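To make the guilt-by-association problem concrete, here is a sketch of the kind of heuristic described above. The signature and its 64-byte length are hypothetical; no particular IPS product is known to use this exact rule.

```python
import re

# Hypothetical signature: 64 or more consecutive 'A' bytes anywhere in the payload.
PADDING_SIGNATURE = re.compile(rb"A{64,}")


def looks_like_exploit(payload: bytes) -> bool:
    """Guess that a payload is an exploit because it contains a long run of A's."""
    return PADDING_SIGNATURE.search(payload) is not None


harmless = b"A" * 200                     # benign, yet it matches the signature
evasive = b"B" * 200 + b"\x90\x90\x90"    # padded with a different letter, so it does not
```

The heuristic errs in both directions: the harmless string is flagged, while padding with any other letter slips past it.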

3.1.6  Machine Learning

Someone could profitably apply machine learning to decide which signatures are best for classification.

3.2  Security Layers

I like to think of security as a hierarchy. At the base, you have physical security. On top of that is OS security, and on top of that is application security, and on top of that, network security. You may have an unbeatable firewall, but if your OS doesn't require a password and your adversary has physical access, you lose. So each layer of the pyramid cannot be more secure (in an absolute sense) than the layer below it. Ideally, each layer should be available to fewer adversaries than the layer above it, so that one has a sort of balance or risk equivalency.
  1. network security
  2. application/database security
  3. OS security
  4. physical security
In operating system security, we distinguish between users of the system, and perhaps the roles they are fulfilling, and only concern ourselves with activities within that computer. It is assumed that the adversary has some access, but less than full privileges on the system. In network security, we concern ourselves with nodes in the networks (usually individual computers), and do not distinguish between users of each system. In some sense, we are now assigning rights to computers and not people. This is often justified since it is usually easier to leverage one user's access to gain another's within the same system than to gain access to another system (but this is not a truism).
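The claim that no layer can be more secure than the layer below it can be modeled as a running minimum over the pyramid. This is only a toy model: the numeric strength scores are invented for illustration, and real security does not reduce to a single number per layer.

```python
# Hypothetical strength scores (higher is stronger), listed from the base up.
layers = [
    ("physical security", 2),
    ("OS security", 5),
    ("application/database security", 4),
    ("network security", 9),
]


def effective_strength(layers):
    """Cap each layer's score by the weakest layer at or below it."""
    result = []
    weakest_so_far = float("inf")
    for name, score in layers:
        weakest_so_far = min(weakest_so_far, score)
        result.append((name, weakest_so_far))
    return result


capped = effective_strength(layers)
```

With these made-up numbers the strong network layer is capped by the weak physical layer beneath it, which is the unbeatable-firewall example in miniature.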

3.3  Privilege Levels

Here's a taxonomy of some commonly-useful privilege levels.
  1. Anonymous, remote systems
  2. Authenticated remote systems
  3. Local unprivileged user (UID > 0)
  4. Administrator (UID 0)
  5. Kernel (privileged mode, ring 0)
  6. Hardware (TPM, ring -1, hypervisors, trojaned hardware)
Actual systems may vary, levels may not be strictly hierarchical, etc. Basically the higher the level you get, the harder you are to detect. The gateways between the levels are access control devices, analogous to firewalls.

3.4  What is a Vulnerability?

Now that you know what a security property is, what constitutes (or should constitute) a vulnerability? On the arguable end of the scale we have "loss of availability", or susceptibility to denial of service (DoS). On the inarguable end of the scale, we have "loss of control", which usually means arbitrary code execution, which often means that the adversary can do whatever he wants with the system, and therefore can violate any other security property.
In an ideal world, every piece of software would state its assumptions about its environment, and then state the security properties it attempts to guarantee; this would be a security policy. Any violation of these explicitly-stated security properties would then be a vulnerability, and any other security properties would simply be "outside the design goals". However, I only know of one piece of commonly-available software which does this, and that's OpenSSL (http://oss-institute.org/FIPS_733/SecurityPolicy-1.1.1_733.pdf).

3.5  Accuracy Limitations in Making Decisions That Impact Security

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" In one case a member of the Upper, and in the other a member of the Lower, House put this question. I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
- Charles Babbage
This is sometimes called the GIGO rule (Garbage In, Garbage Out). Stated this way, this seems self-evident. However, you should realize that this applies to systems as well as programs. For example, if your system depends on DNS to locate a host, then the correctness of your system's operation depends on DNS. Whether or not this is exploitable (beyond a simple denial of service) depends a great deal on the details of the procedures. This is a parallel to the question of whether it is possible to exploit a program via an unsanitized input.
You can never be more accurate than the data you used for your input. Try to be neither precisely inaccurate, nor imprecisely accurate. Learn to use footnotes.

4  Economics of Security

4.1  How Expensive are Security Failures?

I'm looking for good examples of companies put out of business by system crackers. Also large penalties will do. Please send them to me if you can dig up a reference.

5  Adversary Modeling

If you know the enemy and know yourself, you need not fear the result of a hundred battles.
If you know yourself but not the enemy, for every victory gained you will also suffer a defeat.
If you know neither the enemy nor yourself, you will succumb in every battle.
- Sun Tzu, The Art of War (http://en.wikipedia.org/wiki/The_Art_of_War)
After deciding what you need to protect (your assets), you need to know about the threats you wish to protect it against, or the adversaries (sometimes called threat agents) which may threaten it. Generally intelligence units have threat shops, where they monitor and keep track of the people who may threaten their operations. This is natural, since it is easier to get an idea of who will try and do something than how some unspecified person may try to do it, and can help by hardening systems in enemy territory more than those in safer areas, leading to more efficient use of resources. I shall call this adversary modeling.
In adversary modeling, the implicit assumptions are that you have a limited budget and the number of threats is so large that you cannot defend against all of them. So you now need to decide where to allocate your resources. Part of this involves trying to figure out who your adversaries are and what their capabilities and intentions are, and thus how much to worry about particular domains of knowledge or technology. You don't have to know their name, location and social security number; it can be as simple as "some high school student on the Internet somewhere who doesn't like us", "a disgruntled employee" (as opposed to a gruntled employee), or "some sexually frustrated script-kiddie on IRC who doesn't like the fact that he is a jerk who enjoys abusing people and therefore his only friends are other dysfunctional jerks like him". People in charge of doing attacker-centric threat modeling must understand their adversaries and be willing to take chances by allocating resources against an adversary which hasn't actually attacked them yet, or else they will always be defending against yesterday's adversary, and get caught flat-footed by a new one.

5.1  Common Psychological Errors

The excellent but poorly titled1 book Searching for Happiness tells us that we make two common kinds of errors when reasoning about other humans:
  1. Overly different; if you looked at grapes all day, you'd know a hundred different kinds, and naturally think them very different. But they all squish when you step on them, they are all fruits and frankly, not terribly different at all. So too we are conditioned to see people as different because the things that matter most to us, like finding an appropriate mate or trusting people, cannot be discerned with questions like "do you like breathing?". An interesting experiment showed that a description of how they felt by people who had gone through a process is more accurate in predicting how a person will feel after the process than a description of the process itself. Put another way, people assume that the experience of others is too dependent on the minor differences between humans that we mentally exaggerate.
  2. Overly similar; people assume that others are motivated by the same things they are motivated by; we project onto them a reflection of ourselves. If a financier or accountant has ever climbed Mount Everest, I am not aware of it. Surely it is a cost center, yes?

5.2  Cost-Benefit

Often, the lower layers of the security hierarchy cost more to build out than the higher levels. Physical security requires guards, locks, iron bars, shatterproof windows, shielding, and various other things which, being physical, cost real money. On the other hand, network security may only need a free software firewall. However, what an adversary could cost you during a physical attack (e.g. a burglar looting your home) may be greater than what an adversary could cost you by defacing your web site.

5.3  Risk Tolerance

We may assume that the distribution of risk tolerance among adversaries is monotonically decreasing; that is, the number of adversaries who are willing to try a low-risk attack is greater than the number of adversaries who are willing to attempt a high-risk attack to get the same result. Beware of risk evaluation though; while a hacker may be taking a great risk to gain access to your home, local law enforcement with a valid warrant is not going to be risking as much.
So, if you are concerned about a whole spectrum of adversaries, known and unknown, you may wish to have greater network security than physical security, simply because there are going to be more remote attacks.

5.4  Capabilities

Men of sense often learn from their enemies. It is from their foes, not their friends, that cities learn the lesson of building high walls and ships of war. - Aristophanes
You only have to worry about things to the extent they may lie within the capabilities of your adversaries. It is rare that adversaries use outside help when it comes to critical intelligence; it could, for all they know, be disinformation, or the outsider could be an agent-provocateur.

5.5  Sophistication Distribution

If they were capable, honest, and hard-working, they wouldn't need to steal.
Along similar lines, one can assume a monotonically decreasing number of adversaries with a certain level of sophistication. My rule of thumb is that for every person who knows how to perform a technique, there are x people who know about it, where x is a small number, perhaps 3 to 10. The same rule applies to people with the ability to write an exploit versus those able to download and use it (the so-called script kiddies). Once an exploit is coded into a worm, the chance of a compromised host having been compromised by the worm (instead of a human who targets it specifically) approaches 100%. Discuss Bayesian inference.
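The worm claim above can be read as a small Bayes' rule calculation. What follows is only a sketch under stated assumptions: the attempt counts and per-attempt success probabilities are invented for illustration, and the function name posterior_worm is hypothetical. It compares the expected number of successful compromises from each source.

```python
def posterior_worm(worm_attempts, human_attempts, p_success_worm, p_success_human):
    """Estimate P(the worm did it | the host was compromised).

    Compares the expected number of successful compromises from each
    source; this is a rate-based approximation, not a full model.
    """
    worm = worm_attempts * p_success_worm
    human = human_attempts * p_success_human
    return worm / (worm + human)

# Hypothetical numbers, chosen only for illustration: the worm probes the
# host 10,000 times for every targeted human attempt, and the human is
# assumed to be ten times more likely to succeed per attempt.
p = posterior_worm(worm_attempts=10_000, human_attempts=1,
                   p_success_worm=0.01, p_success_human=0.1)
print(f"P(worm | compromised) = {p:.4f}")
```

With these invented numbers the posterior is about 0.999: even a much more skilled human is swamped by the sheer volume of automated attempts.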

5.6  Goals

We've all met or know about people who would like nothing more than to break things, just for the heck of it; schoolyard bullies who feel hurt and want to hurt others, or their overgrown sadist kin. Vandals who merely want to write their name on your storefront. A street thug who will steal a cell phone just to throw it through a window. I'm sure the sort of person reading this isn't like that, but unfortunately some people are. What exactly are your adversary's goals? Are they to maximize ROI (Return On Investment) for themselves, or are they out to maximize pain (tax your resources) for you? Are they monetarily or ideologically motivated? What do they consider investment? What do they consider a reward? Put another way, you can't just assign a dollar value on assets, you must consider their value to the adversary.

6  Threat Modeling

In technology, people tend to focus on how rather than who, which seems to work better when anyone can potentially attack any system (like with publicly-facing systems on the Internet) and when protection mechanisms have low or no incremental cost (like with free and open-source software). I shall call this threat modeling (http://en.wikipedia.org/wiki/Threat_model).

6.1  Attack Surface

Gnothi Seauton ("Know Thyself")
- ancient Greek aphorism (http://en.wikipedia.org/wiki/Know_thyself)
When discussing security, it's often useful to analyze the part which may interact with a particular adversary (or set of adversaries). For example, let's assume you are only worried about remote adversaries. If your system or network is only connected to the outside world via the Internet, then the attack surface is the parts of your system that interact with things on the Internet, or the parts of your system which accept input from the Internet. A firewall, then, limits the attack surface to a smaller portion of your systems by filtering some of your network traffic. Often, the firewall blocks all incoming connections.
Sometimes the attack surface is pervasive. For example, if you have a network-enabled embedded device like a web cam on your network that has a vulnerability in its networking stack, then anything which can send it packets may be able to exploit it. Since you probably can't fix the software in it, you must then use a firewall to attempt to limit what can trigger the bug. Similarly, there was a bug in sendmail that could be exploited by sending a carefully-crafted email through a vulnerable server. The interesting bit here is that it might be an internal server that wasn't exposed to the Internet; the exploit was data-directed and so could be passed through your infrastructure until it hit a vulnerable implementation. That's why I consistently use one implementation (not sendmail) throughout my network now.
If plugging a USB drive into your system causes it to automatically run things, as a standard Microsoft Windows XP install does, then any plugged-in device is part of the attack surface (http://it.slashdot.org/article.pl?sid=08/01/13/1533243). But even if it does not, then by plugging a USB device in you could potentially overflow a buffer in the code which handles the USB or the driver for the particular device which is loaded (http://www.eweek.com/article2/0,1895,1840141,00.asp, http://www.schneier.com/blog/archives/2006/06/hacking_compute.html); thus, the USB networking code and all drivers are part of the attack surface if you can control what is plugged into the system. Moreover, a recent vulnerability (http://it.slashdot.org/it/08/01/14/1319256.shtml) illustrates that when you have something which inspects network traffic, such as UPnP devices or port knocking daemons, then their code forms part of the attack surface.
Sometimes you will hear people talk about the "anonymous attack surface"; this is the attack surface available to everyone (on the Internet). Since this number of people is so large, and you usually can't identify them or punish them, you want to be really sure that the anonymous attack surface is limited and doesn't have any so-called "pre-auth" vulnerabilities, because those can be exploited prior to identification and authentication.

6.2  Attack Trees

The next logical step is to move from defining the attack surface to modeling attacks and quantifying risk levels.
Microsoft is actually getting into this kind of stuff now and if I can find something useful and novel from them I'll refer to it here.
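Until this section is filled in, here is a minimal sketch of the attack-tree idea as it is commonly described: the adversary's goal is the root, "or" nodes are satisfied by any one child, "and" nodes need every child, and leaves carry an estimated cost to the attacker. The Node class, the cheapest_attack function, the example goals and all of the costs are hypothetical, made up only to show the arithmetic.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    """One goal in an attack tree."""
    name: str
    kind: str = "leaf"      # "leaf", "or" (any child suffices), "and" (all children needed)
    cost: float = 0.0       # estimated attacker cost; only meaningful for leaves
    children: List["Node"] = field(default_factory=list)

def cheapest_attack(node: Node) -> float:
    """Minimum attacker cost to achieve this node's goal."""
    if node.kind == "leaf":
        return node.cost
    child_costs = [cheapest_attack(c) for c in node.children]
    if node.kind == "or":
        return min(child_costs)
    if node.kind == "and":
        return sum(child_costs)
    raise ValueError(f"unknown node kind: {node.kind}")

# Hypothetical tree with invented costs.
tree = Node("read the mail spool", "or", children=[
    Node("exploit pre-auth bug in mail server", cost=50.0),
    Node("log in as administrator", "and", children=[
        Node("phish administrator's password", cost=10.0),
        Node("reach the admin interface from inside", cost=25.0),
    ]),
])
print(cheapest_attack(tree))
```

The cheapest path through the tree is one rough way to quantify risk; the same traversal could carry probability or required skill instead of cost.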

6.3  The Weakest Link

Amdahl's law, also known as Amdahl's argument, is named after computer architect Gene Amdahl, and is used to find the maximum expected improvement to an overall system when only part of the system is improved.
- Wikipedia (http://en.wikipedia.org/wiki/Amdahl%27s_law)
Let us think of our security posture for whatever we're protecting as being composed of a number of systems (or groups of systems possibly offering defense-in-depth). The strength of these systems to attack may vary. You may wish to pour all your resources into one, but the security will likely be broken at the weakest point, either by chance or by an intelligent adversary.
This is an analogy to Amdahl's law, stated above, in that we can only increase our overall security posture by maintaining a delicate balance between the different defenses to attack vectors. Most of the time, your resources are best spent on the weakest area, which for some institutions (financial, military) is usually personnel.
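A toy model makes the point concrete. This is a sketch under strong assumptions: each defense is given a single made-up "strength" number, the overall posture is taken to be the minimum of them, and one unit of budget buys one unit of strength anywhere (which real economics rarely allows). The names overall_strength and spend_on_weakest are hypothetical.

```python
def overall_strength(defenses):
    """The adversary picks the easiest path, so the weakest defense sets the posture."""
    return min(defenses.values())

def spend_on_weakest(defenses, budget, step=1):
    """Greedy allocation: repeatedly add one step of strength to the weakest defense."""
    defenses = dict(defenses)  # do not modify the caller's mapping
    for _ in range(budget // step):
        weakest = min(defenses, key=defenses.get)
        defenses[weakest] += step
    return defenses

# Invented strengths, for illustration only.
start = {"network": 7, "physical": 5, "personnel": 2}
all_in_one = dict(start, network=start["network"] + 6)   # pour everything into one defense
balanced = spend_on_weakest(start, budget=6)              # spend on the weakest each time
print(overall_strength(all_in_one), overall_strength(balanced))
```

Pouring the whole budget into the already-strong network defense leaves the overall strength unchanged, while spending on the weakest area raises it.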
The reasons you might not balance all security systems:
Economics
matter here; it may be much cheaper and more reliable to buy a firewall than put your employees through security training.
Exposure
matters; an Internet attack is much more likely than a physical attack.
Capability
matters in that you may have a strong technical base but a very naive view of people. This is the "you do what you're good at" approach.
Costs
matter; you have only so many resources to put into things, but not enough to balance them. Physical things like walls of thick concrete often cost too much to consider.

7  Physical Security

Physical security is often the limit on the strength of access control devices; I recall a story of a cat burglar who used a chainsaw to cut through his victims' walls, bypassing any access control devices. I remember reading someone saying that a deep space probe is the ultimate in physical security.

7.1  No Physical Security Means No Security

A couple of limitations come up without physical security for a system. For confidentiality, all of the sensitive data needs to be encrypted. But even if you encrypt the data, an adversary with physical access could trojan the OS and capture the data (this is a control attack now, not just a confidentiality breach; go this far and you've protected against overt seizure, theft, improper disposal and such). So you'll need to protect the confidentiality and integrity of the OS. If you protect the OS, he trojans the kernel. If you protect the kernel, he trojans the boot loader. If you protect the boot loader (say by putting it on a removable medium), he trojans the BIOS. If you protect the BIOS, he trojans the CPU. So you put a tamper-evident label on it, with your signature on it, and check it every time. But he can install a keyboard logger. So suppose you make a sealed box with everything in it, and connectors on the front. Now he gets measurements and photos of your machine, spends a fortune replicating it, replaces your system with an outwardly identical one of his design (the trojan box), which communicates (say, via encrypted spread-spectrum radio) to your real box. When you type plaintext, it goes through his system, gets logged, and relayed to your system as keystrokes. Since you talk plaintext, neither of you is the wiser.
The physical layer is a common place to facilitate a side-channel attack (see ).

7.2  Data Remanence

Data remanence is the residual physical representation of your information on media after you believe that you have removed it (definition thanks to Wikipedia, http://en.wikipedia.org/wiki/Data_remanence). This is a disputed region of technology, with a great deal of speculation and many self-styled experts, but very little hard science.
As of 2006, the most definitive study seems to be the NIST Computer Security Division paper Guidelines for Media Sanitization (http://csrc.nist.gov/publications/nistpubs/800-88/NISTSP800-88_rev1.pdf). NIST is known to work with the NSA on some topics, and this may be one of them. It introduces some useful terminology:
disposing
is the act of discarding media with no other considerations
clearing
is a level of media sanitization that resists anything you could do at the keyboard or remotely, and usually involves overwriting the data at least once
purging
is a process that protects against a laboratory attack (signal processing equipment and specially trained personnel)
destroying
is the ultimate form of sanitization, and means that the medium can no longer be used as originally intended

7.2.1  Magnetic Storage Media (Disks)

The seminal paper on this is Peter Gutmann's Secure Deletion of Data from Magnetic and Solid-State Memory (http://www.cs.auckland.ac.nz/~pgut001/pubs/secure_del.html). In early versions of his paper, he speculated that one could extract data due to hysteresis effects even after a single overwrite, but on subsequent revisions he stated that there was no evidence a single overwrite was insufficient. Simson Garfinkel wrote about it recently in his blog (https://www.techreview.com/blog/garfinkel/17567/).
The NIST paper has some interesting tidbits in it. Obviously, disposal cannot protect confidentiality of unencrypted media. Clearing is probably sufficient security for 99% of all data; I highly recommend Darik's Boot and Nuke (http://dban.sourceforge.net/), which is a bootable floppy or CD based on Linux. However, it cannot work if the storage device stops working properly, and it does not overwrite sectors or tracks marked bad and transparently relocated by the drive firmware. With all ATA drives over 15GB, there is a "secure delete" ATA command which can be accessed from hdparm within Linux, and Gordon Hughes has some interesting documents and a Microsoft-based utility (http://cmrr.ucsd.edu/people/Hughes/SecureErase.shtml). There's a useful blog entry about it (http://storagemojo.com/2007/05/02/secure-erase-data-security-you-already-own/). In the case of very damaged disks, you may have to resort to physical destruction. However, with disk densities being what they are, even 1/125" of a disk platter may hold a full sector, and someone with absurd amounts of money could theoretically extract small quantities of data. Fortunately, nobody cares this much about your data.
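To make "overwriting the data at least once" concrete, here is a minimal sketch that overwrites a single file in place with random bytes. It is an illustration, not a sanitization tool: the function name overwrite_file is hypothetical, and on journaling or copy-on-write file systems, on flash media with wear leveling, or where the drive has relocated sectors, old copies of the data can survive an in-place overwrite. Whole-device tools such as the ones mentioned above work below the file system for that reason.

```python
import os

def overwrite_file(path, passes=1, chunk_size=1024 * 1024):
    """Overwrite a file's contents in place with random bytes, then sync to the device."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:        # open for update without truncating
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(os.urandom(n))
                remaining -= n
            f.flush()
            os.fsync(f.fileno())        # ask the OS to push the writes to the device
```

A caller would run overwrite_file on the file and only then unlink it; the default of a single pass follows the observation above that there is no evidence a single overwrite is insufficient.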
Now, you may wonder what you can do about very damaged disks, or what to do if the media isn't online (for example, you buried it in an underground bunker), or if you have to get rid of the data fast. I would suggest that encrypted storage (see ) would almost always be a good idea. If you use it, you merely have to protect the confidentiality of the key, and if you can properly sanitize the media, all the better. Recently Simson Garfinkel re-discovered a technique for getting the data off broken drives; freezing them. Another technique that I have used is to replace the logic board with one from a working drive.

7.2.2  Semiconductor Storage (RAM)

Peter Gutmann's Data Remanence in Semiconductor Devices (http://www.cypherpunks.to/~peter/usenix01.pdf) shows that if a particular value is held in RAM for extended periods of time, various processes such as electromigration make permanent changes to the semiconductor's structure. In some cases, it is possible for the value to be "burned in" to the cell, such that it cannot hold another value.
Cold Boot Attack  
Recently a Princeton team (http://citp.princeton.edu/memory/) found that the values held in DRAM decay in predictable ways after power is removed, such that one can merely reboot the system and recover keys for most encrypted storage systems (http://citp.princeton.edu/pub/coldboot.pdf). By cooling the chip first, this data remains longer. This generated much talk in the industry. This prompted an interesting overview of attacks against encrypted storage systems (http://www.news.com/8301-13578_3-9876060-38.html).

8  Distributed Systems

8.1  Network Security Overview

The things involved in network security are called nodes. One can talk about networks composed of humans (social networks), but that's not the kind of network we're talking about here; I always mean a computer unless I say otherwise. Often in network security the adversary is assumed to control the network in whole or part; this is a bit of a holdover from the days when the network was radio, or when the node was an embassy in a country controlled by the adversary. In modern practice, this doesn't seem to usually be the case, but it'd be hard to know for sure. In the application of network security to the Internet, we almost always assume the adversary controls at least one of the nodes on the network.
In network security, we can lure an adversary to a system, tempt them with something inviting; such a system is called a honeypot, and a network of such systems is sometimes called a honeynet. A honeypot may or may not be instrumented for careful monitoring; sometimes systems so instrumented are called fishbowls, to emphasize the transparent nature of activity within them. Often one doesn't want to allow a honeypot to be used as a launch point for attacks, so outbound network traffic is sanitized or scrubbed; if traffic to other hosts is blocked completely, some people call it a jail, but that is also the name of an operating system security technology used by FreeBSD, so I consider it confusing.
To reduce a distributed system problem to a physical security (see 7) problem, you can use an air gap, or sneakernet between one system and another. However, the data you transport between them may be capable of exploiting the offline system. One could keep a machine offline except during certain windows; this could be as simple as a cron job which turns on or off the network interface via ifconfig. However, an offline system may be difficult to administer, or keep up-to-date with security patches.

8.2  Network Access Control: Packet Filters, Firewalls, Security Zones

Most network applications use TCP, a connection-oriented protocol, and they use a client/server model. The client initiates a handshake with the server, and then they have a conversation. Sometimes people use the terms client and server to mean the application programs, and other times they mean the node itself. Other names for server applications include services and daemons. Obviously if you can't speak with the server at all, or (less obviously) if you can't properly complete a handshake, you will find it difficult to attack the server application. This is what a packet filter does; it allows or prevents communication between a pair of sockets. A packet filter does not generally do more than a simple all-or-nothing filtering.
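The all-or-nothing decision a packet filter makes can be sketched as a first-match walk over a rule list. This is a simplified model, not how any particular firewall is implemented: the Rule tuple, the decide function and the addresses (taken from the documentation ranges) are all hypothetical, and real filters also consider protocol, interface and connection state.

```python
import ipaddress
from typing import NamedTuple, Optional

class Rule(NamedTuple):
    action: str                 # "allow" or "deny"
    src: str                    # source network in CIDR notation
    dst: str                    # destination network in CIDR notation
    dst_port: Optional[int]     # None matches any port

def decide(rules, src_ip, dst_ip, dst_port, default="deny"):
    """Return the action of the first rule matching the packet, else the default."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    for rule in rules:
        if (src in ipaddress.ip_network(rule.src)
                and dst in ipaddress.ip_network(rule.dst)
                and rule.dst_port in (None, dst_port)):
            return rule.action
    return default

# Hypothetical policy: anyone may reach the mail server on port 25,
# and one partner network may reach SSH on the server network.
RULES = [
    Rule("allow", "0.0.0.0/0", "192.0.2.10/32", 25),
    Rule("allow", "198.51.100.0/24", "192.0.2.0/24", 22),
]
print(decide(RULES, "203.0.113.5", "192.0.2.10", 25))
print(decide(RULES, "203.0.113.5", "192.0.2.10", 22))
```

The default of "deny" means that anything not explicitly allowed never completes a handshake with the server application.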
The firewall was originally defined as a device between different networks that had different security characteristics; it was named after the barrier between an automobile interior and the engine, which is designed to prevent an engine fire from spreading to the passenger cabin. As our understanding of network security improved, people started to define various parts of their network. The canonical types of networks are:
What these definitions all have in common is that they end up defining security zones (this term thanks to the authors of Extreme Exploits). All the nodes inside a security zone have roughly equivalent access to or from other security zones. I believe this is the most important and fundamental way of thinking of network security. Do not confuse this with the idea that all the systems in the zone have the same relevance to the network's security, or that the systems have the same impact if compromised; that is a complication and more of a matter of operating system security than network security. In other words, two systems (a desktop and your DNS server) may not be security equivalent, but they may be in the same security zone.

8.3  Network Reconnaissance: Ping Sweeps, Port Scanning

Typically an adversary needs to know what he can attack before he can attack it. This is called reconnaissance, and involves gathering information about the target and identifying ways in which he can attack the target. In network security, the adversary may want to know what systems are available for attack, and a technique such as a ping sweep of your network block may facilitate this. Then, he may choose to enumerate (get a list of) all the services available via a technique such as a port scan. A port scan may be a horizontal scan (one port, many IP addresses) or vertical scan (one IP address, multiple ports), or some combination thereof. You can sometimes determine what service (and possibly what implementation) it is by banner grabbing or fingerprinting the service.
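From the defender's side, the horizontal/vertical distinction can be sketched as a count over a log of connection attempts. This is an illustration only: the classify_scans function, the event format and the threshold of five are assumptions, not the behavior of any real intrusion detection system.

```python
from collections import defaultdict

def classify_scans(events, threshold=5):
    """Label each source as a horizontal and/or vertical scanner.

    events: iterable of (src_ip, dst_ip, dst_port) connection attempts.
    horizontal: one port tried against at least `threshold` distinct addresses.
    vertical: at least `threshold` distinct ports tried against one address.
    """
    hosts_per_port = defaultdict(set)   # (src, port) -> destination addresses
    ports_per_host = defaultdict(set)   # (src, dst)  -> destination ports
    for src, dst, port in events:
        hosts_per_port[(src, port)].add(dst)
        ports_per_host[(src, dst)].add(port)

    labels = defaultdict(set)
    for (src, _port), hosts in hosts_per_port.items():
        if len(hosts) >= threshold:
            labels[src].add("horizontal")
    for (src, _dst), ports in ports_per_host.items():
        if len(ports) >= threshold:
            labels[src].add("vertical")
    return dict(labels)

# Made-up log: one source sweeps port 22 across ten hosts,
# another tries ten ports on a single host.
events = [("203.0.113.9", f"192.0.2.{i}", 22) for i in range(1, 11)]
events += [("198.51.100.4", "192.0.2.10", p) for p in range(20, 30)]
print(classify_scans(events))
```

A source that does both would carry both labels, which corresponds to the "some combination thereof" case.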
In an ideal world, knowing that you can talk to a service does not matter. Thus, a port scan should only reveal what you already assumed your adversary already knew. However, it is considered very rude, even antisocial, like walking down the street and trying to open the front door of every house or business that you pass; people will assume you are trying to trespass, and possibly illicitly copy their data.
Typical tools used for network reconnaissance include:

8.4  Network Intrusion Detection and Prevention

Most security-conscious organizations are capable of detecting most scans using [network] intrusion detection systems (IDS) or intrusion prevention systems (IPS); see .

8.5  Cryptography is the Sine Qua Non of Secure Distributed Systems

All cryptography lets you do is create trust relationships across untrustworthy media; the problem is still trust between endpoints and transitive trust.
- Marcus Ranum
Put simply, you can't have a secure distributed system (with the normal assumptions of untrusted nodes and network links potentially controlled by the adversary) without using cryptography somewhere ("sine qua non" is Latin for "without which it could not be"). If the adversary can read communications, then to protect the confidentiality of the network traffic, it must be encrypted. If the adversary can modify network communication, then it must have its integrity protected and be authenticated (that is, to have the source identified). Even physical layer communication security technologies, like the KLJN cipher, quantum cryptography, and spread-spectrum communication, use cryptography in one way or another.
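As a small illustration of protecting integrity and authenticating the source, here is a sketch using a keyed message authentication code (HMAC-SHA256) from Python's standard library. It assumes both ends already share a secret key, which is its own problem; the protect and verify function names and the message are hypothetical. It provides no confidentiality and no protection against replay.

```python
import hashlib
import hmac
import os

TAG_LEN = 32  # length in bytes of an HMAC-SHA256 tag

def protect(key: bytes, message: bytes) -> bytes:
    """Append an HMAC-SHA256 tag so the receiver can check integrity and origin."""
    tag = hmac.new(key, message, hashlib.sha256).digest()
    return message + tag

def verify(key: bytes, blob: bytes) -> bytes:
    """Return the message if the tag is valid, otherwise raise ValueError."""
    message, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(key, message, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):   # constant-time comparison
        raise ValueError("message was modified or did not come from a key holder")
    return message

key = os.urandom(32)                      # shared secret, assumed already exchanged
blob = protect(key, b"transfer 10 units to alice")
print(verify(key, blob))
```

An adversary who can modify traffic but does not hold the key cannot produce a valid tag for an altered message, so verify raises instead of returning it.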
I would go farther and say that performing network security decisions on anything other than cryptographic keys is never going to be as strong as if it depended on cryptography. Very few Internet adversaries currently have the capability to arbitrarily route data around. Most cannot jump between VLANs on a tagged port. Some don't even have the capability to sniff on their LAN. But none of the mechanisms preventing this are stronger than strong cryptography, and often they are much weaker, possibly only security through obscurity. Let me put it to you this way; to support a general argument otherwise, think about how much assurance a firewall has that a packet claiming to be from a given IP address is actually from the system the firewall maintainer believes it to be. Often these things are complex, and way beyond his control. However, it would be totally reasonable to filter on IP address first, and only then allow a cryptographic check; this makes it resistant to resource consumption attacks from anyone who cannot spoof a legitimate IP address (see 3.1.1).

8.6  Hello, My Name is 192.168.1.1

Humans are incapable of securely storing high-quality cryptographic keys, and they have unacceptable speed and accuracy when performing cryptographic operations. (They are also large, expensive to maintain, difficult to manage, and they pollute the environment. It is astonishing that these devices continue to be manufactured and deployed. But they are sufficiently pervasive that we must design our protocols around their limitations).
- Network Security / PRIVATE Communication in a PUBLIC World by Charlie Kaufman, Radia Perlman, & Mike Speciner (Prentice Hall 2002; p.237)
Because humans communicate slowly, in plaintext, and don't plug into a network, we consider the nodes within the network to be computing devices. The system a person interacts with has equivalency with them; break into the system administrator's console, and you have access to anything he or she accesses. In some cases, you may have access to anything he or she can access. You may think that your LDAP or Kerberos server is the most important, but isn't the node of the guy who administers it just as critical? This is especially true if OS security is weak and any user can control the system, or if the administrator is not trusted, but it is also convenient because packets do not have user names, just source IPs. When some remote system connects to a server, unless both are under the control of the same entity, the server has no reason to trust the remote system's claim about who is using it, nor does it have any reason to treat one user on the remote system differently from any other.

8.7  Source Tapping; The First Hop and Last Mile

One can learn a lot more about a target by observing the first link from them than from some more remote place. That is, the best vantage point is one closest to the target. For this reason, the first hop is far more critical than any other. An exception may involve a target that is more network-mobile than the eavesdropper. The more common exception is tunneling/encryption (to include Tor and VPN technologies); these relocate the first hop somewhere else which is not physically proximate to the target's meat space coordinates, which may make it more difficult to locate.
Things to consider here involve the difficulty of interception, which is a secondary concern (it is never all that difficult). For example, it is probably less confidential from the ISP to use an ISP's caching proxy than to access the service directly, since most proxy software makes it trivial to log the connection and content; however, one should not assume that one is safe by not using the proxy (especially now that many do transparent proxying). However, it is less anonymous from the remote site to access the remote site directly; using the ISP's proxy affords some anonymity (unless the remote site colludes with the ISP).

8.8  Security Equivalent Things Go Together

One issue that always seems to come up is availability versus other goals. For example, suppose you install a new biometric voice recognition system. Then you have a cold and can't get in. Did you prioritize correctly? Which is more important? Similar issues come up in almost every place with regard to security. For example, your system may authenticate users against a global server, or it may have a local database for authentication. The former means that one can revoke a user's credentials globally immediately, but also means that if the global server is down, nobody can authenticate. Attempts to get the best of both worlds ("authenticate locally if global server is unreachable") often reduce to availability (the adversary just DoSes the link between the system and the global server to force local authentication).
My philosophy on this is simple; put like things together. That is, I think authentication information for a system should be on the system. That way, the system is essentially a self-contained unit. By spreading the data out, one multiplies potential attack targets, and reduces availability. If someone can hack the local system, then being able to alter a local authentication database is relatively insignificant.

8.9  A Proposed Perimeter Defense

I believe the following design would be a useful design for perimeter defenses for most organizations and individuals. First, there would be an outer layer of reactive prevention, followed by an inner layer of prevention and detection that acts as a fail-safe mechanism. If the outer preventative defense should fail for some reason (hardware, software, configuration) then incoming connections will be stopped by the inner layer and the detection will notify us that something is wrong. The idea of a dual layer of firewalling is already becoming popular with financial institutions and military networks, but really derives itself from the lessons learned trying to guarantee high availability and specifically the goal of eliminating single points of failure. This system also doesn't require monitoring traffic blocked by the outer layer, which virtually eliminates the resources it takes to monitor traffic that gets blocked anyway. However, if the outer layer were not reactive, then we would effectively be discarding any useful intelligence that is gained by detecting probes (that is, a failed connection or attack is still valuable in determining intent). With a reactive firewall as the outer layer, when an adversary probes our defenses looking for holes or weak spots, we take appropriate action, usually shunning that network address, and this makes enumeration a much more difficult process. With a little imagination, we can construct more deceptive defensive measures, like returning random responses, or redirection to a honey-net (which is essentially just a consistent set of bogus responses, plus monitoring). Since enumeration is strictly an information-gathering activity, the obvious countermeasure is deception. The range of deceptive responses runs from none (that is, complete silence, or lack of information) through random responses (misinformation) to consistent, strategic deception (disinformation). 
Stronger responses are out of proportion to the provocation (network scans are legal in most countries), and often illegal in any circumstances.
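
The two-layer design above can be sketched in a few lines. This is only an illustration under stated assumptions: ReactiveFilter, InnerFilter, perimeter and the port numbers are hypothetical names and values chosen for the example, and a real deployment would use firewall rules rather than Python objects.

```python
from dataclasses import dataclass, field


@dataclass
class ReactiveFilter:
    """Outer layer (hypothetical): passes open ports, shuns sources that probe others."""
    open_ports: frozenset
    shunned: set = field(default_factory=set)

    def handle(self, src: str, dst_port: int) -> str:
        if src in self.shunned:
            return "drop"              # silence: the prober learns nothing more
        if dst_port not in self.open_ports:
            self.shunned.add(src)      # a failed probe still reveals intent
            return "drop"
        return "pass"


@dataclass
class InnerFilter:
    """Inner layer (hypothetical): fail-safe prevention plus detection."""
    open_ports: frozenset
    alerts: list = field(default_factory=list)

    def handle(self, src: str, dst_port: int) -> str:
        if dst_port not in self.open_ports:
            # Traffic like this should have been stopped outside, so seeing
            # it here suggests that the outer layer has failed.
            self.alerts.append((src, dst_port))
            return "drop"
        return "pass"


# Assumed policy for the example: only ports 22 and 443 are open.
outer = ReactiveFilter(open_ports=frozenset({22, 443}))
inner = InnerFilter(open_ports=frozenset({22, 443}))


def perimeter(src: str, dst_port: int) -> str:
    """Run a connection attempt through the outer layer, then the inner one."""
    if outer.handle(src, dst_port) == "drop":
        return "drop"
    return inner.handle(src, dst_port)
```

Note that the inner layer only raises an alert when something reaches it that the outer layer should have blocked, so in normal operation there is nothing to monitor.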

8.10  Man In The Middle

How do we detect MITM or impersonation in web, PGP/GPG, SSH contexts? The typical process for creating a connection involves a DNS resolution at the application layer (unless you use IP addresses), then sending packets to the IP address (at the network layer), which have to be routed; at the link layer, ARP typically is used to find the next hop at each stage.

8.10.1  DNS Issues

Poisoning, spoofing (transaction ID issues), or maybe you are querying a DNS server the adversary controls (e.g. your ISP's)

8.10.2  IP Routing

Announcing bogus routes, or topological considerations

8.10.3  Link-layer Issues

ARP spoofing or poisoning

8.10.4  Physical Layer

Tapping the wire (or listening to wireless)

8.10.5  Periodic Rechecking

It's difficult to stay perpetually in the middle. When you aren't, typically, the cryptographic fingerprints will no longer match and the MITM will be detected. It's handy to occasionally compare them using different channels, so that if the ones you originally relied upon were proxied, the tampering will be detected. SSH rechecks the host key automatically on every connection; this is called the baby duck model (i.e. it bonds to the first thing it sees, and complains if it changes identities). However, this detects the problem only retroactively.
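
As a minimal sketch of the baby duck model, assuming fingerprints are plain strings: FirstSeenStore is a hypothetical name for this example, and it is not a description of how SSH stores its known hosts.

```python
class FirstSeenStore:
    """Remembers the first fingerprint seen for each host (hypothetical helper)."""

    def __init__(self):
        self._known = {}

    def check(self, host: str, fingerprint: str) -> str:
        pinned = self._known.get(host)
        if pinned is None:
            # Bond to the first thing seen. If that first connection was
            # already intercepted, the wrong fingerprint gets pinned.
            self._known[host] = fingerprint
            return "new"
        return "match" if pinned == fingerprint else "changed"
```

A result of "changed" only tells us that one of the two fingerprints is wrong, not which one; that is why the check is retroactive.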

8.10.6  Out-of-Band Comparison

One can compare digests/fingerprints/hashes over a different, low-bandwidth communication medium (e.g. the phone, postal mail).

8.10.7  Parallel Paths

OOB comparison is really an example of creating two disjoint paths between two entities and making sure that they give the same results. This can occur in multiple contexts. For example, it can be used for the bootstrapping problem; how can I trust the first connection? By creating two paths I can compare the identities of the peer in both places. I once used this to check the integrity of my PGP download by fetching it from home and from another location, and comparing the results.
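
The download comparison can be sketched as follows. The choice of SHA-256 and the function names digest and paths_agree are assumptions for the example; any collision-resistant hash would serve.

```python
import hashlib


def digest(data: bytes) -> str:
    """Hex SHA-256 digest of one downloaded copy."""
    return hashlib.sha256(data).hexdigest()


def paths_agree(copies) -> bool:
    """True if every copy, each fetched over a different path, has the same digest."""
    return len({digest(copy) for copy in copies}) == 1
```

Agreement only helps to the extent that the paths are really disjoint; two downloads that cross the same compromised link will agree with each other and still be wrong.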

8.10.8  Formatting

Imagine that the adversary is conducting a MITM against, say, an SSH session, so instead of A<->B it is A<->O<->B. Your countermeasure as A may be to check the IP addresses of the peer at B, so that the adversary would have to spoof IPs in both directions (this is often printed automatically at login). Another technique is to check the host key fingerprint as part of your login sequence, sending the fingerprint through the tunneled connection. The adversary may modify the data at the application layer automatically, to change the fingerprint on the way through. But what if you transformed (e.g. encrypted) the fingerprint using a command-line tool, and represented it as printable characters, and printed them through the tunnel, and inverted the transformation at the local end? Then he'd have a very difficult time writing a program to detect this, especially if you kept the exact mechanism a secret. You could run the program automatically through ssh, so it isn't stored on the remote system.
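
A toy sketch of such a transformation follows, assuming a secret known only at the local end. The names obscure and reveal are hypothetical, and XOR with a pad derived from the secret is just one stand-in for "encrypting" the fingerprint. It is meant to defeat automated rewriting of a recognizable string, not to serve as real cryptography (the same pad is reused on every call).

```python
import base64
import hashlib


def _pad(secret: bytes, n: int) -> bytes:
    """Derive n pad bytes from the secret (toy construction, an assumption)."""
    return hashlib.shake_256(secret).digest(n)


def obscure(fingerprint: str, secret: bytes) -> str:
    """Run on the remote side: emit the fingerprint as printable characters."""
    raw = fingerprint.encode()
    mixed = bytes(a ^ b for a, b in zip(raw, _pad(secret, len(raw))))
    return base64.b64encode(mixed).decode("ascii")


def reveal(text: str, secret: bytes) -> str:
    """Run on the local side: invert the transformation."""
    mixed = base64.b64decode(text)
    raw = bytes(a ^ b for a, b in zip(mixed, _pad(secret, len(mixed))))
    return raw.decode()
```

Because the obscuring program and its secret arrive over the same tunnel when run through ssh, an adversary who inspects the session by hand can still defeat this; it only raises the cost of doing so automatically.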

8.11  Network Surveillance

9  Identification and Authentication

Identification is necessary before making any sort of access control decisions. Often it can reduce abuse, because an identified individual knows that if they do something there can be consequences or sanctions. For example, if an employee abuses the corporate network, they may find themselves on the receiving end of the sysadmin's luser attitude readjustment tool (LART). I tend to think of authentication as a process you perform on objects (like paintings, antiques, and digitally signed documents), and identification as a process that subjects (people) perform, but in network security you're really looking at data created by a person for the purpose of identifying them, so I use them interchangeably.

9.1  Identity

Sometimes I suspect I'm not who I think I am.
- Ghost in the Shell
An identity, for our purposes, is an abstract concept; it does not map to a person, it maps to a persona. Some people call this a digital ID, but since this paper doesn't talk about non-digital identities, I'm dropping the qualifier. Identities are different from credentials, which are something you use to prove identity. For example, your login password is a credential. In relational database design, it is considered a good practice for the primary key (http://en.wikipedia.org/wiki/Primary_key) of a table to be an integer, perhaps a row number, that is not used for anything else. That is because the primary key is used as an identifier for the row. An identifier is shorthand, a handle; like a pointer, it allows us to modify the object itself, so that the modification occurs in all places simultaneously. Most competent DBAs realize that people change names, phone numbers, locations, and so on; they may even change social security numbers. They also realize that people may share any of these things (even social security numbers are not necessarily unique, especially if they lie about it). So to be able to identify a person across any of these changes, you need to use a row number. The exact same principle applies with security systems.
In Unix, a person is given a username (identity) and a password (credential). This is good, because the password may be changed without losing the idea of the identity of the person. However, there are subtle gotchas. In actuality, the username is mapped to a user ID (UID), which is the real way that Unix keeps track of identity. It isn't necessarily a one-to-one mapping. Also, a poor system administrator may reassign an unused user ID without going through the file system and looking for files owned by the old user, in which case their ownership is silently reassigned.
PGP and GPG made the mistake of using a cryptographic key as an identifier. If one has to revoke that key, one basically loses anything (such as signatures) which applied to that key, and the trust that other people have indicated towards that key. And if you have multiple keys, friends of yours who have all of them cannot treat them all as equivalent, since GPG can't be told that they are associated to the same identity, because the keys are the identity. Instead, they must manage statements about you (such as how much they trust you to act as an introducer) on each key independently.
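
The row-number principle can be sketched as below. Everything here is hypothetical: the dictionaries stand in for database tables, and the key names are placeholders. The point is only that statements such as trust attach to an opaque identifier, so names and keys can change underneath it.

```python
import itertools

_next_id = itertools.count(1)   # opaque identifiers, used for nothing else
identities = {}                 # identifier -> mutable attributes
trust = {}                      # statements attach to the identifier, not to a key


def create_identity(name, keys):
    """Allocate a new identifier and record its current attributes."""
    ident = next(_next_id)
    identities[ident] = {"name": name, "keys": set(keys)}
    return ident


def revoke_key(ident, key):
    """Drop one credential; the identity and statements about it survive."""
    identities[ident]["keys"].discard(key)


alice = create_identity("Alice", {"key-A", "key-B"})
trust[alice] = "trusted introducer"
revoke_key(alice, "key-A")
identities[alice]["name"] = "Alice Smith"   # a name change does not touch trust
```

Had the key itself been the identifier, revoking key-A would have discarded the trust statement along with it.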

9.2  What Authority?

Does it follow that I reject all authority? Far from me such a thought. In the matter of boots, I refer to the authority of the bootmaker; concerning houses, canals, or railroads, I consult that of the architect or the engineer.
- Mikhail Bakunin, What is Authority? 1882 (http://www.panarchy.org/bakunin/authority.1871.html)
When we are attempting to identify someone, we are relying upon some authority, usually the state government. When you register a domain name with a registrar, they record your personal information in the WHOIS database; this is the system of record (http://en.wikipedia.org/wiki/System_of_record). No matter how careful we are, we can never have a higher level of assurance than this authority has. If the government gave that person a false identity, or the person bribed a DMV clerk to do so, we can do absolutely nothing about it. This is an important implication of the limitations of accuracy (see 3.5).

9.3  Authentication Factors

There are many ways you can prove your identity to a system. They may include:
something you are
like biometric signatures such as the pattern of capillaries on your retina, your fingerprints, etc.
something you have
like a token, physical key, or thumb drive
something you know
like a passphrase or password
somewhere you are
if you put a GPS device in a computer, or did direction-finding on transmissions, or simply require a person to be physically present somewhere to operate the system
somewhere you can be reached
like a mailing address, network address, email address, or phone number
At the risk of self-promotion, I want to point out that, to my knowledge, the last factor has not been explicitly stated in computer security literature, although it is demonstrated every time a web site emails you your password, or every time a financial company mails something to your home.

9.4  Authentication Issues: When, What

Do we authenticate each transaction or command (sudo), or a session (SSH), or only certain commands (passwd)? What is being authenticated, the remote system, the agent, the user, or the data?

9.5  The Identity Continuum

Identification can range from fully anonymous to pseudonymous, to full identification. Ensuring identity can be expensive, and is never perfect. Think about what you are trying to accomplish. This applies to cookies from web sites, email addresses, "real names", and so on.

9.6  Problems Remaining Anonymous

In cyberspace everyone will be anonymous for 15 minutes.
- Graham Greenleaf
What can we learn from Anonymizer, Mixmaster, Tor, and so on? Often one can de-anonymize data that is supposed to be anonymous. Some people have de-anonymized search queries, census data, and many other such data sets.

9.7  Problems with Identifying People

9.8  Remote Attestation

Knowing that the remote system is a particular program or piece of hardware is a concept in network security called remote attestation. When I connect securely over the network to a machine I believe I have full privileges on, how do I know I'm actually talking to the machine, and not a similar system controlled by the adversary? This is usually attempted by hiding an encryption key in some tamper-proof part of the system, but is vulnerable to all kinds of disclosure and side-channel attacks, especially if the owner of the remote system is the adversary.
The most successful example seems to be the satellite television industry, where they embed cryptographic and software secrets in an inexpensive smart card with restricted availability, and change them frequently enough that the resources required to reverse engineer each new card exceed the cost of the data it is protecting. In the satellite TV industry, there's something they call ECMs (electronic counter-measures), which are program updates of the form "look at memory location 0xFC, and if it's not 0xFA, then HCF" (Halt and Catch Fire). The obvious crack is to simply remove that part of the code, but then you will trigger another check that looks at the code for the first check, and so on.
The sorts of non-cryptographic self-checks they request the card to do, such as computing a checksum (such as a CRC) over some memory locations, are similar to the sorts of protections against reverse engineering, where the program computes a checksum to detect modifications to itself.
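
A checksum self-check of this kind might look like the sketch below. The location 0xFC and value 0xFA come from the example above; the 256-byte memory image, the checked range 0xF0 to 0x100, and the function names are assumptions made for illustration.

```python
import zlib


def checksum(memory: bytes, start: int, end: int) -> int:
    """CRC-32 over the half-open range [start, end) of the memory image."""
    return zlib.crc32(memory[start:end])


# Hypothetical 256-byte card memory with the expected value in place.
memory = bytearray(256)
memory[0xFC] = 0xFA

# The reference value is computed while the image is known to be good.
expected = checksum(bytes(memory), 0xF0, 0x100)


def self_check(image) -> bool:
    """True if the checked range still matches the reference checksum."""
    return checksum(bytes(image), 0xF0, 0x100) == expected
```

A CRC is not keyed, so anyone who can modify the image can also recompute the checksum; the check only helps while its location and the expected value stay hidden or are themselves checked elsewhere.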

9.9  Advanced Authentication Tools

10  Authorization - Access Control

10.1  Privilege Escalation

Ideally, all services would be impossible to abuse. Since this is difficult or impossible, we often restrict access to them, to limit the potential pool of adversaries. Of course, if some users can do some things and others can't, this creates the opportunity for the adversary to perform an unauthorized action, but that's often unavoidable. For example, you probably want to be able to do things to your computer, like reformat it and install a new operating system, that you wouldn't want others to do. You will want your employees to do things an anonymous Internet user cannot (see 3.3). Thus, many adversaries want to escalate their privileges to that of some more powerful user, possibly you. Generally, privilege escalation attacks refer to techniques that require some level of access above that of an anonymous remote system, but grant an even higher level of access, bypassing access controls.
They can come in horizontal (user becomes another user) or vertical (normal user becomes root or Administrator) escalations.

10.2  Physical Access Control

These include locks. I like Medeco, but none are perfect. It's easy to find guides to lock picking:

10.3  Operating System Access Control: DAC, MAC, RBAC

Discretionary Access Control (DAC) is up to the end-user. They can choose to let other people write to their files, if they wish, and the defaults tend to be global. This is how file permissions on classic Unix and Windows work. A more secure system often involves Mandatory Access Control (MAC), where the security administrator sets up the permissions globally. Some MAC types are Type Enforcement and Domain Type Enforcement. Implementations include SELinux and systrace. Often the two are combined, so that an access request has to pass both tests, meaning that the effective permission set is the intersection of the MAC and DAC permissions. Another way of looking at it is that MAC sets the maximum permissions that DAC can give. Role-Based Access Control (RBAC) could be considered a form of MAC. In RBAC, there are roles to which permissions are assigned, and one switches roles to change permission sets. For example, you might have a security administrator role, but you don't need that to read email or surf the web, so you only switch to it when doing security administrator stuff. This prevents you from accidentally running malware with full permissions. Unix emulates this with pseudo-users and sudo.
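
The combination of the two tests can be sketched with sets. The permission names and the function effective are hypothetical; real systems represent permissions quite differently.

```python
def effective(mac: set, dac: set) -> set:
    """A request must pass both tests, so the result is the intersection."""
    return mac & dac


# Assumed example policies.
mac_policy = {"read", "write"}      # ceiling set by the security administrator
dac_grant = {"read", "execute"}     # what the file's owner chose to allow
```

Here effective(mac_policy, dac_grant) contains only "read": the owner's grant of "execute" is capped by MAC, and MAC's allowance of "write" is unused because the owner never granted it.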

10.4  Application Authorization Decisions

There are many applications which have tried to allow some users to perform some functions, but not others. Let's forget what we're trying to authorize, and focus on information about the requester.
For example, network-based authorization may depend on (in descending order of value):
An operating system authorization usually depends on:
There are other factors involved in authorization decisions but these are just examples. Instead of tying things to one system, let's keep it simple and pretend we're allowing or denying natural numbers, rather than usernames or things of that nature. Let's also define some access control matching primitives such as:
In a well-designed system these primitive functions would be rather complete and not the few we have here. Further, there should be some easy way to compose these tests to achieve the desired access control:
Systems which do not do this kind of authorization are necessarily incomplete, and cannot express all desired combinations of sets.
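
As an illustration of what composable primitives over natural numbers could look like, here is a minimal sketch. The primitives (is_prime, greater_than) and the combinators (all_of, any_of, negate) are hypothetical names chosen for the example, not the ones any particular system provides.

```python
def is_prime(n: int) -> bool:
    """Matching primitive: true for prime numbers."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


def greater_than(k: int):
    """Matching primitive factory: true for numbers above k."""
    return lambda n: n > k


def all_of(*tests):
    """Combinator: intersection of the sets matched by the tests."""
    return lambda n: all(t(n) for t in tests)


def any_of(*tests):
    """Combinator: union of the sets matched by the tests."""
    return lambda n: any(t(n) for t in tests)


def negate(test):
    """Combinator: complement of the set matched by the test."""
    return lambda n: not test(n)


# "Everyone except the primes greater than two."
large_primes = all_of(is_prime, greater_than(2))
allowed = negate(large_primes)
```

With intersection, union and complement available, any combination of the primitive sets can be expressed, which is the completeness property argued for above.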

10.4.1  Apache Access Control

Apache has three access control directives:
Allow
specifies who can use the resource
Deny
specifies who can not use the resource
Order
specifies the ordering of evaluation of those directives as either 'deny, allow', 'allow, deny', or mutual-failure.
This is unfortunately quite confusing, and it's hard to know where to start. By having an allow list and a deny list, we have four sets of objects defined:
  1. Those that are neither allowed nor denied
  2. Those that are allowed
  3. Those that are denied
  4. Those that are both allowed and denied
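The truth table for this is as follows (D means default, O means open, X means denied). The columns are the four sets above; the rows are the orderings 'deny, allow' (DA), 'allow, deny' (AD), and mutual-failure (MF):
      1  2  3  4
DA    D  O  X  O
AD    D  O  X  X
MF    D  O  X  X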
Do you see what I mean? AD and MF are essentially the same, unless I misread this section in the O'Reilly book.
Now, suppose we wish to allow in everyone except the naughty prime numbers. We would write:
So far so good, right? Now let's say that we want to deny the large primes but allow the number 2 in. Unless our combinators for access-control primitives were powerful enough to express "primes greater than two", we might be stuck already. Apache has no way to combine primitives, so is unable to offer such access control. But given that it's the web, we can't rail on it too harshly.
What we really want is a list of directives that expresses the set we wish very easily. For example, imagine that we didn't have an order directive, but we could simply specify what deny and allow rules we have in such a way that earlier rules take precedence (the so-called "short circuit" evaluation).
However, we're unable to do that in Apache. Put simply, one can't easily treat subsets of sets created by access control matching in a different manner than the set they reside in. We couldn't allow in "2" while denying primes, unless the access control matching functions were more sophisticated.
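
The imagined first-match scheme can be sketched as follows. This is not Apache syntax or behavior; the rule list, the decide function and the default action are all hypothetical.

```python
def is_prime(n: int) -> bool:
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


# Rules are checked in order and the first match wins ("short circuit").
rules = [
    ("allow", lambda n: n == 2),    # the exception comes first
    ("deny", is_prime),             # then the naughty primes
    ("allow", lambda n: True),      # everyone else is let in
]


def decide(n: int, rules, default: str = "deny") -> str:
    """Return the action of the first rule that matches n."""
    for action, matches in rules:
        if matches(n):
            return action
    return default
```

Because order alone settles conflicts, the subset {2} can be treated differently from the set of primes it belongs to, without needing a "primes greater than two" primitive.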

10.4.2  Squid

Squid has one of the more powerful access control systems in use.
Primitives