Security Concepts
travis+security@subspacefield.org
Abstract
This is an online book about computer, network, technical, physical,
information and cryptographic security. It is a labor of love, incomplete
until the day I am finished.
Contents
1 Metadata
1.1 Copyright and Distribution Control
1.2 Goals
1.3 Audience
1.4 About This Work
1.5 How to Read the Online Version
1.6 About Writing This
1.7 Tools Used To Create This Book
2 Security Properties
2.1 Information Security is a PAIN
2.2 Parkerian Hexad
2.3 Pentagon of Trust
2.4 Security Equivalency
2.5 Other Questions
3 Security Concepts
3.1 The Classification Problem
3.1.1 Classification Errors
3.1.2 The Base-Rate Fallacy
3.1.3 Test Efficiency
3.1.4 Incompletely-Defined Sets
3.1.5 The Guessing Hazard
3.1.6 Machine Learning
3.2 Security Layers
3.3 Privilege Levels
3.4 What is a Vulnerability?
3.5 Accuracy Limitations
4 Economics of Security
4.1 How Expensive are Security Failures?
5 Adversary Modeling
5.1 Common Psychological Errors
5.2 Cost-Benefit
5.3 Risk Tolerance
5.4 Capabilities
5.5 Sophistication Distribution
5.6 Goals
6 Threat Modeling
6.1 Attack Surface
6.2 Attack Trees
6.3 The Weakest Link
7 Physical Security
7.1 No Physical Security Means No Security
7.2 Data Remanence
7.2.1 Magnetic Storage Media (Disks)
7.2.2 Semiconductor Storage (RAM)
8 Distributed Systems
8.1 Network Security Overview
8.2 Network Access Control
8.3 Network Reconnaissance
8.4 Network Intrusion Detection and Prevention
8.5 Cryptography is the Sine Qua Non of Secure Distributed Systems
8.6 Hello, My Name is 192.168.1.1
8.7 Source Tapping; The First Hop and Last Mile
8.8 Security Equivalent Things Go Together
8.9 A Proposed Perimeter Defense
8.10 Man In The Middle
8.10.1 DNS Issues
8.10.2 IP Routing
8.10.3 Link-layer Issues
8.10.4 Physical Layer
8.10.5 Periodic Rechecking
8.10.6 Out-of-Band Comparison
8.10.7 Parallel Paths
8.10.8 Formatting
8.11 Network Surveillance
9 Identification and Authentication
9.1 Identity
9.2 What Authority?
9.3 Authentication Factors
9.4 Authentication Issues: When, What
9.5 The Identity Continuum
9.6 Problems Remaining Anonymous
9.7 Problems with Identifying People
9.8 Remote Attestation
9.9 Advanced Authentication Tools
10 Authorization - Access Control
10.1 Privilege Escalation
10.2 Physical Access Control
10.3 Operating System Access Control
10.4 Application Authorization Decisions
10.4.1 Apache Access Control
10.5 IPTables, IPChains, Netfilter
10.6 PF
10.7 Keynote
11 Secure System Administration
11.1 Monitoring
11.2 Change Management
11.3 Self-Healing Systems
11.4 Heterogeneous vs. Homogeneous Defenses
12 Logging
12.1 Synchronized Time
12.2 Syslog
13 Reports
13.1 Change Reporting
13.2 Artificial Ignorance
13.3 Dead Man's Switch
14 Abuse Detection
14.1 Misuse Detection vs. Anomaly Detection
14.2 Computer Immune Systems
14.3 Behavior-Based Detection
14.4 Honey Traps
14.5 Tripwires and Booby Traps
14.6 Anti-Malware
14.7 Anti-Spam
14.7.1 Content filtering
14.7.2 Delays
14.7.3 Blocking Known Offenders
14.7.4 Sending Email
14.7.5 Macro-Level Techniques
14.7.6 Individual-Level Techniques
14.7.7 Micropayment Systems
14.7.8 Insolubility
14.8 Detecting Automated Peers
14.8.1 CAPTCHA
14.8.2 Bot Traps
14.8.3 Velocity Checks
14.8.4 Typing Mistakes
14.9 Host-Based Intrusion Detection
14.10 Intrusion Detection Principles
14.11 Intrusion Information Collection
14.12 Intrusion Alerting
14.12.1 Possible Intrusion Alerting Solutions
15 Abuse Response
15.1 How to Respond to Abuse
15.1.1 The Silent Treatment
15.1.2 Honest Rejection
15.1.3 Random Response
15.1.4 Faux Positives
15.1.5 The Simulation Defense
15.1.6 Fishbowls
15.1.7 Hack-Back
15.2 Identification Issues
15.3 Resource Consumption Defenses
15.4 Proportional Response
16 Forensics
16.1 Forensic Limitations
16.2 Ephemeral Data
16.3 Remnant Data
16.4 Hidden Data
16.5 Metadata
16.6 Locating Encryption Keys and Encrypted Data
16.7 Forensic Inference
17 Intrusion Response
18 Network Security
18.1 The Current State of Things
18.2 Traffic Identification
18.2.1 RPC
18.2.2 Dynamic Port Numbers
18.2.3 Encapsulation
18.2.4 Possible Solutions
18.3 Brute-Force Defenses
18.4 Federated Defense
18.5 VLANs Are Not Security Technologies
18.6 Advanced Network Security Technologies
19 Email Security
19.1 Unsolicited Bulk Email
19.1.1 Filtering
19.1.2 Graylisting
19.2 Phishing
20 Web Security
20.1 Direct Browser Attacks
20.2 Indirect Browser Attacks
20.3 Crawler Attacks
20.4 SSL Certificates Made Redundant
21 Application Security
21.1 Security is a Subset of Correctness
21.2 Malware vs. Data-Directed Attacks
21.3 Reverse Engineering
21.3.1 Tutorials
21.3.2 Analyses
21.3.3 Tools
21.3.4 Anti-Anti-Reverse Engineering
21.4 Application Exploitation
21.5 Application Exploitation Defenses
21.5.1 Stack-Smashing Protection
21.5.2 Address-Space Layout Randomization (ASLR)
21.5.3 Write XOR Execute
21.6 Software Complexity
21.6.1 Complexity of Network Protocols
21.6.2 Polymorphism and Complexity
21.7 Failure Modes
21.8 Fault Tolerance
21.9 Implications of Incorrectness
22 Human Factors and Usability
22.1 The Psychology of Security
22.2 Security Should Be Obvious
22.3 Security Should Be Easy to Use
22.4 No Hidden Data
23 Attack Patterns
23.1 Attack Taxonomy
23.2 Attack Properties
23.3 Attack Cycle
24 Trust
24.1 Trust and Trustworthiness
24.2 Who or What Are You Trusting?
24.3 Code Provenance
24.4 The Incompetence Defense
25 Cryptology
25.1 Limits of Cryptography
25.1.1 The Last Foot of the Communication
25.1.2 Limitations Regarding Endpoint Security
25.1.3 Keys Must Be Exchanged
25.1.4 In Practice
25.1.5 The Complexity Trap
25.2 Things To Know Before Doing Crypto
25.2.1 Dramatis Personae
25.2.2 Jargon
25.2.3 How Strong Should My Cryptography Be?
25.2.4 Key Lengths
25.2.5 Eight Bit Clean Handling
25.2.6 Encoding Binary Data
25.2.7 Avoiding Ambiguity
25.2.8 End-to-End vs. Link Level
25.3 Cryptographic Algorithms
25.3.1 Ciphers
25.3.2 Cryptographic Hashes
25.3.3 Message Authentication Codes and HMAC
25.3.4 Signing
25.4 Cryptographic Algorithm Enhancements
25.4.1 Speed of Algorithms and the Hybrid Encryption Scheme
25.4.2 Hashing Stored Authentication Data
25.4.3 Offline Dictionary Attacks and Iterated Hashes
25.4.4 Salts vs. Offline Dictionary Attacks and Rainbow Tables
25.4.5 Offline Dictionary Attacks with Partial Confidentiality
25.4.6 Never Use User Passphrases As Keys
25.4.7 Run Algorithm Inputs through OWF
25.5 Cryptographic Combinations
25.5.1 Combiners
25.5.2 The Sign then Encrypt Problem
25.5.3 Key Derivation Functions
25.5.4 Serialization, Records and Encoding
25.5.5 Polymorphic Data and Ambiguity
25.6 Cryptographic Protocols
25.6.1 DoS and Anti-Clogging Tokens
25.6.2 The Problem with Authenticating within an Encrypted Channel
25.6.3 How to Protect the Integrity of a Session
25.6.4 Freshness and Replay Attacks
25.6.5 Preventing Feedback
25.6.6 Identification
25.6.7 Authentication
25.6.8 Eschew Multiple Encoding Schemes Unless Necessary
25.6.9 Key Exchange and Hybrid Encryption Schemes
25.7 Encrypted Storage
25.7.1 Key Escrow for Encrypted Storage
25.7.2 Evolution of Cryptographic Storage Technologies
25.7.3 Filesystem Crypto Layers
25.7.4 File Systems with Optional Encryption
25.7.5 Block Device Crypto
25.7.6 The Cryptographically-Strong Pseudo-random Quick Fill
25.7.7 Backups
25.8 Key Management
25.8.1 Key Exchange and the Bootstrapping Problem
25.8.2 Key Management and Scalability
25.8.3 On-Air Keying (OAK)
25.8.4 One Key, One Purpose
25.8.5 Time Compartmentalization
25.8.6 Key Indirection
25.8.7 Secret Sharing
25.9 Cryptanalysis
25.9.1 Cryptographic Attack Patterns
25.9.2 A Priori Knowledge
26 Randomness and Unpredictability
26.1 An Ideal Random Number Generator
26.2 Definitions of Unpredictability
26.3 Definitions of Randomness
26.4 Why Entropy and Unpredictability Are Not the Same
26.5 Unpredictability is the Sine Qua Non of Cryptography
26.6 Unpredictability is Not Provable
26.7 Randomly Generated Samples
26.8 Testing For Predictability
26.9 Ways to Fail
26.10 Humans Are Too Predictable
26.11 Sources of Unpredictability
26.12 The Laws of Unpredictability
26.12.1 The First Law of Unpredictability
26.12.2 The Second Law of Unpredictability
26.12.3 Mixing Unpredictability
26.12.4 Getting it Wrong
27 Lateral Thinking
27.1 Traffic Analysis
27.2 Side Channels
27.2.1 Physical Information-Gathering Attacks and Defenses
27.2.2 Signal Injection Attacks and Defenses
27.2.3 System-Local Side-Channel Attacks
27.2.4 Timing Side-Channels
28 Information and Intelligence
28.1 Intelligence Jargon
28.2 Controlling Information Flow
28.3 Labeling and Regulations
28.4 Knowledge is Power
28.5 Secrecy is Power
28.6 Never Confirm Guesses
28.7 What You Don't Know Can Hurt You
28.8 How Secrecy is Lost
28.9 Costs of Disclosure
28.10 Dissemination
28.11 Information, Misinformation, Disinformation
29 Conflict and Combat
29.1 Indicators and Warnings
29.2 Attacker's Advantage in Network Warfare
29.3 Defender's Advantage in Network Warfare
29.4 OODA Loops
30 Security Principles
30.1 The Principle of Least Privilege
30.2 The Principle of Agility
30.3 The Principle of Minimal Assumptions
30.4 The Principle of Fail-Secure Design
30.5 The Principle of Unique Identifiers
30.6 The Principles of Simplicity
30.7 The Principle of Defense in Depth
30.8 The Principle of Uniform Fronts
30.9 The Principle of Split Control
30.10 The Principle of Minimal Changes
30.11 The Principle of Centralized Management
30.12 The Principle of Least Surprise
30.13 The Principle of Removing Excuses
30.14 The Principle of Retaining Control
30.15 Availability Principles
31 Common Arguments
31.1 Disclosure: Full, Partial, or None?
31.1.1 Arguments For Full Disclosure
31.1.2 Arguments Against Full Disclosure - Vendor
31.1.3 Arguments Against Full Disclosure - Vendor's Employees
31.1.4 Arguments Against Full Disclosure - End User
31.2 Absolute vs. Effective Security
31.3 Quantification and Metrics vs. Intuition
31.4 Security Through Obscurity
31.5 Security of Open Source vs. Closed Source
31.6 Insider Threat vs. Outsider Threat
31.6.1 In Favor of Perimeter Defenses
31.6.2 What Perimeter?
31.6.3 Performance Issues
31.7 Prevention vs. Detection
31.7.1 Prevention over Detection
31.7.2 Detection over Prevention
31.7.3 Impact on Intelligence Collection
31.8 Audit vs. Monitoring
31.9 Early vs. Late Adopters
31.10 Sending HTML Email
32 Editorials, Predictions, Polemics, and Personal Opinions
32.1 Security is for Polymaths
32.2 Linear Order Please!
32.3 Computers are Transcending our Limitations
32.4 Reusable Authentication Data Considered Harmful
32.5 Password Length Limits Considered Harmful
32.6 Everything Will Be Encrypted Soon
32.7 Error Propagation Characteristics Usually Don't Matter
32.8 Keep it Legal, Stupid
32.9 Should My Employees Attend "Hacker" Conferences?
32.10 Should You Sell Out?
32.11 Anonymity is not a Crime
32.11.1 Example: Sears Makes Customer Purchase Information Available Online, Provides Spyware to Customers
32.12 Monitoring Your Employees
32.13 Trust People in Spite of Counterexamples
32.14 Do What I Mean vs. Do What I Say
32.15 You Are Part of the Problem if You...
32.16 What Do I Do to Not Get Hacked?
33 Resources
33.1 My Other Stuff
33.2 Conferences
33.3 Books
33.3.1 Publishers
33.3.2 Titles
33.4 Periodicals
33.5 Blogs
33.6 Mailing Lists
34 Credits
1 Metadata
1.1 Copyright and Distribution Control
Kindly link a person to it instead of redistributing it, so that people
may always receive the latest version. However, even an outdated copy
is better than none. The PDF version is preferred and more likely
to render properly (especially graphics and special mathematical characters),
but the HTML version is simply too convenient to not have it available.
The latest version is always here:
- http://www.subspacefield.org/security/security_concepts.html
This is a copyrighted work, with some rights reserved. This work is
licensed under the Creative Commons Attribution-Noncommercial-No
Derivative Works 3.0 United States License (http://creativecommons.org/licenses/by-nc-nd/3.0/us/).
This means you may redistribute it for non-commercial purposes, and
that you must attribute me properly (without suggesting I endorse
your work). For attribution, please include a prominent link back
to this original work and some text describing the changes. I am comfortable
with certain derivative works, such as translation into other languages,
but am not sure about others, so I have not yet explicitly granted permission
for all derivative uses. If you have any questions, please email me
and I'll be happy to discuss it with you.
1.2 Goals
I wrote this paper to try and examine the typical problems in computer
security and related areas, and attempt to extract from them principles
for defending systems. To this end I attempt to synthesize various
fields of knowledge, including computer security, network security,
cryptology, and intelligence. I also attempt to extract the principles
and implicit assumptions behind cryptography and the protection of
classified information, as obtained through reverse-engineering (that
is, informed speculation based on existing regulations and stuff I
read in books), where they are relevant to technological security.
1.3 Audience
When I picture a perfect reader, I always picture a monster of courage
and curiosity, also something supple, cunning, cautious, a born adventurer
and discoverer.
- Friedrich Nietzsche
This is not intended to be an introductory text, although a beginner
could gain something from it. The reason behind this is that beginners
think in terms of tactics, rather than strategy, and of details rather
than generalities. There are many fine books on computer and network
security tactics (and many more not-so-fine books), and tactics change
quickly, and being unpaid for this work, I am a lazy author. The reason
why even a beginner may gain from it is that I have attempted to extract
abstract concepts and strategies which are not necessarily tied to
computer security. And I have attempted to illustrate the points with
interesting and entertaining examples and would love to have more,
so if you can think of an example for one of my points, please send
it to me!
I'm writing this for you, noble reader, so your comments are very
welcome; you will be helping me make this better for every future
reader. If you send a contribution or comment, you'll save me a lot
of work if you tell me whether you wish to be mentioned in the credits
(see section 34) or not; I want to respect the privacy of
anonymous contributors. If you're concerned that would be presumptuous,
don't be; I consider it considerate of you to save me an email exchange.
Security bloggers will find plenty of fodder by looking for new URLs
added to this page, and I encourage you to do it, since I simply don't
have time to comment on everything I link to. If you link to this
paper from your blog entry, all the better.
1.4 About This Work
I have started this book with some terminology as a way to frame the
discussion. Then I get into the details of the technology. Since this
is adequately explained in other works, these sections are somewhat
lean and may merely be a list of links. Then I get into my primary
contribution, which is the fundamental principles of security which
I have extracted from the technological details. Afterwards, I summarize
some common arguments that one sees among security people, and I finish
up with some of my personal observations and opinions.
1.5 How to Read the Online Version
Since this document is constantly being revised, I suggest that you
start with the table of contents and click on the subject headings
so that you can see which ones you have read already. If I add a section,
it will show up as unread. By the time it has expired from your browser's
history, it is probably time to re-read it anyway, since the contents
have probably been updated.
See the end of this page for the date it was generated (which is also
the last update time). I currently update this about once every two
weeks.
1.6 About Writing This
Part of the challenge with writing about this topic is that we are
always learning and it never seems to settle down, nor does one ever
seem to get a sense of completion. I consider it more permanent and
organized than a blog, more up-to-date than a book, and more comprehensive
and self-contained than most web pages. I know it's uneven; in some
areas it's just a heading with a paragraph, or a few links, in other
places it can be as smoothly written as a book. I thought about breaking
it up into multiple documents, so I could release each with much more
fanfare, but that's just not the way I write, and it makes it difficult
to do as much cross-linking as I'd like.
This is to my knowledge the first attempt to publish a computer security
book on the web before printing it, so I have no idea if it will even
be possible to print it commercially. That's okay; I'm not writing
for money. I'd like for the Internet to be the public library of the
21st century, and this is my first significant donation
to the collection. I am reminded of the advice of a staffer in the
computer science department, who said, "do what you love, and
the money will take care of itself".
That having been said, if you wanted to contribute towards the effort, you can help
me defray the costs of maintaining a server and such by visiting our
donation page (http://www.subspacefield.org/donate.html). If
you would like to donate but cannot, you may wait until such a time
as you can afford to, and then give something away (i.e. pay it forward).
1.7 Tools Used To Create This Book
I use LyX (http://www.lyx.org/), but I'm still a bit of a novice.
I have a love/hate relationship with it and the underlying typesetting
language LaTeX (http://en.wikipedia.org/wiki/LaTeX).
2 Security Properties
What do we mean by secure? When I say secure, I mean that an
adversary can't make the system do something that its owner (or designer,
or administrator, or even user) did not intend. Often this involves
a violation of a general security property. Some security properties
include:
- confidentiality refers to whether the information in question
is disclosed or remains private.
- integrity refers to whether the systems (or data) remain uncorrupted.
The opposite of this is malleability, where it is possible
to change data without detection, and believe it or not, sometimes
this is a desirable security property.
- availability is whether the system is available when you need
it or not.
- consistency is whether the system behaves the same each time
you use it.
- auditability is whether the system keeps good records of what
has happened so it can be investigated later. Direct-recording electronic
voting machines (with no paper trail) are unauditable.
- control is whether the system obeys only the authorized users
or not.
- authentication is whether the system can properly identify users.
Sometimes, it is desirable that the system cannot do so, in which
case it is anonymous or pseudonymous.
- non-repudiation is a relatively obscure term meaning that if
you take an action, you won't be able to deny it later. Sometimes,
you want the opposite, in which case you want repudiability
("plausible deniability").
Please forgive the slight difference in the way they are named; while
English is partly to blame, these properties are not entirely parallel.
For example, confidentiality refers to information (or inferences
drawn on such) just as program refers to an executable stored on the
disk, whereas control implies an active system just as process refers
to a running program (as they say, "a process is a program in
motion"). Also, you can compromise my data confidentiality with
a completely passive attack such as reading my backup tapes, whereas
controlling my system is inherently detectable since it involves interacting
with it in some way.
2.1 Information Security is a PAIN
You can remember the security properties of information as PAIN: Privacy,
Authenticity, Integrity, Non-repudiation.
2.2 Parkerian Hexad
There is something similar known as the "Parkerian Hexad", defined
by Donn B. Parker, which consists of six fundamental, atomic, non-overlapping
attributes of information that are protected by information security
measures:
- confidentiality
- possession
- integrity
- authenticity
- availability
- utility
2.3 Pentagon of Trust
- Admissibility (is the remote node trustworthy?)
- Authentication (who are you?)
- Authorization (what are you allowed to do?)
- Availability (is the data accessible?)
- Authenticity (is the data intact?)
2.4 Security Equivalency
I consider two objects to be security equivalent if they are
identical with respect to the security properties under discussion;
for precision, I may refer to confidentiality-equivalent pieces
of information if the sets of parties to which they may be disclosed
(without violating security) are exactly the same (and conversely,
so are the sets of parties to which they may not be disclosed). In
this case, I'm discussing objects which, if treated improperly, could
lead to a compromise of the security goal of confidentiality.
Or I could say that two cryptosystems are confidentiality-equivalent,
in which case the objects help achieve the security goal. To be perverse,
these last two examples could be combined; if the information in the
first example was actually the keys for the cryptosystem in the second
example, then disclosure of the first could impact the confidentiality
of the keys and thus the confidentiality of anything handled by the
cryptosystems. Alternately, I could refer to access-control equivalence
between two firewall implementations; in this case, I am discussing
objects which implement a security mechanism which helps us
achieve the security goal, such as confidentiality of something.
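As a rough illustration of confidentiality-equivalence, the sketch below groups objects whose sets of permitted readers are exactly equal. The file names, the reader names and the function equivalence_classes are all hypothetical, chosen only to make the definition concrete.

```python
from collections import defaultdict


def equivalence_classes(allowed_readers):
    """Group object names whose sets of permitted readers are exactly equal."""
    classes = defaultdict(list)
    for name, readers in allowed_readers.items():
        # frozenset ignores ordering, so {"alice", "bob"} == {"bob", "alice"}
        classes[frozenset(readers)].append(name)
    return sorted(sorted(names) for names in classes.values())


# Hypothetical objects and the parties each may be disclosed to.
allowed = {
    "payroll.xls": {"alice", "bob"},
    "bonus-plan.doc": {"bob", "alice"},
    "lunch-menu.txt": {"alice", "bob", "carol"},
}

groups = equivalence_classes(allowed)
print(groups)
```

Here the first two files land in the same class because they may be disclosed to exactly the same parties; the third does not, since carol may also read it.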
2.5 Other Questions
- Secure to whom? A web site may be secure (to its owners) against unauthorized
control, but may employ no encryption when collecting information
from customers.
- Secure from whom? A site may be secure against outsiders, but not
insiders.
3 Security Concepts
There is no security on this earth, there is only opportunity.
- General Douglas MacArthur (1880-1964)
These are important concepts which appear to apply across multiple
security domains.
3.1 The Classification Problem
Many times in security you wish to distinguish between classes of
data. This occurs in firewalls, where you want to allow certain traffic
but not all, and in intrusion detection where you want to allow benign
traffic but not malicious traffic, and in operating system security,
where you want to allow the user to run their programs but not malware. In
doing so, we run into a number of limitations in various domains that
deserve mention together.
3.1.1 Classification Errors
False Positives vs. False Negatives, also called Type I and Type II
errors. Discuss equal error rate (EER) and its use in biometrics.
Sometimes in medicine they will do a cheap test with a high error
rate biased in one direction (often false positives), and a more expensive
test with a lower error rate, usually biased in the other direction.
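To make the equal error rate concrete, here is a minimal sketch that sweeps a decision threshold over some match scores and picks the point where the false accept rate and false reject rate are closest. The scores are made up, and the helper names error_rates and approximate_eer are assumptions for illustration; a real biometric evaluation would use far more samples and interpolate between thresholds.

```python
def error_rates(genuine, impostor, threshold):
    """Accept a sample when its score >= threshold.

    Returns (false_accept_rate, false_reject_rate)."""
    false_accepts = sum(score >= threshold for score in impostor)
    false_rejects = sum(score < threshold for score in genuine)
    return false_accepts / len(impostor), false_rejects / len(genuine)


def approximate_eer(genuine, impostor):
    """Try every observed score as a threshold; keep the one where the
    two error rates are closest. Returns (threshold, far, frr)."""
    def gap(threshold):
        far, frr = error_rates(genuine, impostor, threshold)
        return abs(far - frr)

    best = min(sorted(set(genuine) | set(impostor)), key=gap)
    return (best, *error_rates(genuine, impostor, best))


# Made-up match scores, purely for illustration.
genuine_scores = [0.9, 0.8, 0.7, 0.6, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.5, 0.65]

threshold, far, frr = approximate_eer(genuine_scores, impostor_scores)
print(threshold, far, frr)
```

With these made-up scores the two rates meet at a threshold of 0.6, where both are 20%. Moving the threshold in either direction trades one kind of error for the other.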
3.1.2 The Base-Rate Fallacy
In The Base Rate Fallacy and its Implications for Intrusion
Detection (http://www.raid-symposium.org/raid99/PAPERS/Axelsson.pdf),
the author essentially points out that there's a lot of benign traffic
for every attack, and so even a small chance of a false positive will
quickly overwhelm any true positives. Put another way, if one out
of every 10,001 connections is malicious, and the test has a 1% false
positive error rate, then for every 1 real malicious connection there
are 10,000 benign connections, and hence 100 false positives.
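The arithmetic can be checked with a few lines of code. This is only a sketch: the function name alarm_breakdown is hypothetical, and it assumes the test catches every real attack (a detection rate of 1.0), which is the most generous case for the detector.

```python
def alarm_breakdown(benign, malicious, false_positive_rate, detection_rate=1.0):
    """Return (true_positives, false_positives, precision) for a batch of connections.

    detection_rate=1.0 is an assumption: the detector never misses an attack."""
    true_positives = malicious * detection_rate
    false_positives = benign * false_positive_rate
    precision = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, precision


tp, fp, precision = alarm_breakdown(benign=10_000, malicious=1,
                                    false_positive_rate=0.01)
print(f"true positives:  {tp:.0f}")
print(f"false positives: {fp:.0f}")
print(f"fraction of alarms that are real: {precision:.4f}")
```

Even with perfect detection, only 1 alarm in 101 (just under 1%) points at a real attack; the rest are noise that someone has to wade through.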
3.1.3 Test Efficiency
In other cases, you are perfectly capable of performing an accurate
test, but not on all the traffic. You may want to apply a cheap test
with some errors on one side before applying a second, more expensive
test on the side with errors to weed them out. This is done in BSD
Unix with packet capturing via tcpdump, which uploads a coarse filter
into the kernel, and then applies a more expensive but finer-grained
test in userland which only operates on the packets which pass the
first test.
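The two-stage idea can be sketched as follows. This is not how tcpdump or the kernel filter is actually implemented; the packet dictionaries, the port check and the payload check are hypothetical stand-ins for a cheap coarse test and an expensive fine one.

```python
def cheap_test(packet):
    """Coarse filter: looks only at the destination port.
    It lets through benign packets too, but never drops one we care about."""
    return packet["dst_port"] == 80


def expensive_test(packet):
    """Finer test: inspects the payload; assumed too costly to run on everything."""
    return b"/etc/passwd" in packet["payload"]


def two_stage(packets):
    """Run the expensive test only on packets that pass the cheap one."""
    expensive_calls = 0
    flagged = []
    for packet in packets:
        if not cheap_test(packet):
            continue
        expensive_calls += 1
        if expensive_test(packet):
            flagged.append(packet)
    return flagged, expensive_calls


# Hypothetical traffic sample.
packets = [
    {"dst_port": 22, "payload": b"ssh handshake"},
    {"dst_port": 80, "payload": b"GET /index.html"},
    {"dst_port": 80, "payload": b"GET /../../etc/passwd"},
    {"dst_port": 53, "payload": b"dns query"},
]

flagged, calls = two_stage(packets)
print(len(flagged), "flagged;", calls, "expensive tests run on", len(packets), "packets")
```

The design only works if the cheap test errs on one side: it may pass packets the expensive test will reject, but it must not discard anything the expensive test would have flagged.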
3.1.4 Incompletely-Defined Sets
As far as the laws of mathematics refer to reality, they are not certain;
and as far as they are certain, they do not refer to reality.
- Albert Einstein
Stop for a moment and think about the difficulty of trying to list
all the undesirable things that your computer shouldn't do. If you
find yourself finished, then ask yourself: did you include that it
shouldn't attack other computers? Did you include that it shouldn't
transfer $1000 to a mafia-run web site when you really intended to
transfer $100 to your mother? Did you include that it shouldn't send
spam to your address book? The list goes on and on.
Thus, if we had a complete list of everything that was bad, we'd block
it and never have to worry about it again. However, often we either
don't know, or the set is infinite. Similarly, we may not be able
to obtain a complete list of everything that is good; imagine trying
to specify in advance all the network packets that should be allowed
into your enterprise!
3.1.5 The Guessing Hazard
So often we can't enumerate all the things we would want to do, nor
all the things that we would not want to do. Because of this, intrusion
detection systems (see ) often simply guess;
they try to detect attacks unknown to them by looking for features
that are likely to be present in malware but not in normal traffic.
At the current moment, you can find out if your traffic is passing
through an IPS by trying to send a long string of A's in a session.
This isn't malicious by itself, but is a common letter with which
people pad exploits (see ). In
this case, it's a great example of a false positive, or collateral
damage, generated through guilt-by-association; there's nothing
inherently suspicious about a string of A's, it's just that
exploit writers use them a lot, and IPS vendors decided that made
them suspicious. I'm not a big fan of these because I feel that it
breaks functionality that doesn't threaten the system, and that it
could be used as evidence of malfeasance against someone by someone
who doesn't really understand the technology. I'm already irritated
by the false-positives or excessive warnings about security tools
from anti-virus software; it seems to alert to "potentially-unwanted
programs" an absurd amount of the time; most novices don't understand
that the anti-virus software reads the disk even though I'm not running
the programs, and that you have nothing to fear if you don't run the
programs. I fear that one day my Internet Service Provider will start
filtering them out of my email or network streams, but fortunately
they just don't care that much.
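A guilt-by-association heuristic of this kind might look like the sketch below. The pattern and the threshold of 32 bytes are hypothetical and do not reproduce any vendor's actual signature; the point is only that the rule cannot tell exploit padding from harmless padding.

```python
import re

# Hypothetical heuristic: flag any run of 32 or more consecutive 'A' bytes.
PADDING_RUN = re.compile(rb"A{32,}")


def looks_like_exploit(payload: bytes) -> bool:
    """Guess that a payload is an exploit purely from a long run of A's."""
    return PADDING_RUN.search(payload) is not None


exploit_like = b"A" * 200 + b"\x90\x90\xcc"   # padding followed by some opcodes
benign_padding = b"field=" + b"A" * 64        # harmless test data
ordinary = b"GET /index.html HTTP/1.0"

print(looks_like_exploit(exploit_like))    # flagged
print(looks_like_exploit(benign_padding))  # also flagged: a false positive
print(looks_like_exploit(ordinary))        # not flagged
```

An exploit writer defeats the rule by padding with any other byte, while the harmless sender of 64 A's is blocked, which is the worst of both worlds.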
3.1.6 Machine Learning
Someone could profitably apply machine learning to decide which signatures
are best for classification.
3.2 Security Layers
I like to think of security as a hierarchy. At the base, you have
physical security. On top of that is OS security, and on top of that
is application security, and on top of that, network security. You
may have an unbeatable firewall, but if your OS doesn't require a
password and your adversary has physical access, you lose. So each
layer of the pyramid cannot be more secure (in an absolute sense)
than the layer below it. Ideally, each layer should be available to
fewer adversaries than the layer above it, so that one has a sort
of balance or risk equivalency.
- network security
- application/database security
- OS security
- physical security
In operating system security, we distinguish between users
of the system, and perhaps the roles they are fulfilling, and only
concern ourselves with activities within that computer. It is assumed
that the adversary has some access, but less than full privileges
on the system. In network security, we concern ourselves with
nodes in the networks (usually individual computers), and do not distinguish
between users of each system. In some sense, we are now assigning
rights to computers and not people. This is often justified since
it is usually easier to leverage one user's access to gain another's
within the same system than to gain access to another system (but
this is not a truism).
3.3 Privilege Levels
Here's a taxonomy of some commonly-useful privilege levels.
- Anonymous, remote systems
- Authenticated remote systems
- Local unprivileged user (UID > 0)
- Administrator (UID 0)
- Kernel (privileged mode, ring 0)
- Hardware (TPM, ring -1, hypervisors, trojaned hardware)
Actual systems may vary, levels may not be strictly hierarchical,
etc. Basically the higher the level you get, the harder you are to
detect. The gateways between the levels are access control devices,
analogous with firewalls.
3.4 What is a Vulnerability?
Now that you know what a security property is, what constitutes (or
should constitute) a vulnerability? On the arguable end of the scale
we have "loss of availability", or susceptibility to denial
of service (DoS). On the inarguable end of the scale, we have "loss
of control", which usually means arbitrary code execution, which often
means that the adversary can do whatever he wants with the system,
and therefore can violate any other security property.
In an ideal world, every piece of software would state its assumptions
about its environment, and then state the security properties it attempts
to guarantee; this would be a security policy. Any violation
of these explicitly-stated security properties would then be a vulnerability,
and any other security properties would simply be "outside the
design goals". However, I only know of one piece of commonly-available
software which does this, and that's OpenSSL (http://oss-institute.org/FIPS_733/SecurityPolicy-1.1.1_733.pdf).
3.5 Accuracy Limitations in Making Decisions That Impact Security
On two occasions I have been asked, "Pray, Mr. Babbage, if you
put into the machine wrong figures, will the right answers come out?"
In one case a member of the Upper, and in the other a member of the
Lower, House put this question. I am not able rightly to apprehend
the kind of confusion of ideas that could provoke such a question.
- Charles Babbage
This is sometimes called the GIGO rule (Garbage In, Garbage Out).
Stated this way, this seems self-evident. However, you should realize
that this applies to systems as well as programs. For example, if
your system depends on DNS to locate a host, then the correctness
of your system's operation depends on DNS. Whether or not this is
exploitable (beyond a simple denial of service) depends a great deal
on the details of the procedures. This is a parallel to the question
of whether it is possible to exploit a program via an unsanitized
input.
You can never be more accurate than the data you used for your input.
Try to be neither precisely inaccurate, nor imprecisely accurate.
Learn to use footnotes.
4 Economics of Security
4.1 How Expensive are Security Failures?
I'm looking for good examples of companies put out of business by
system crackers. Also large penalties will do. Please send them to
me if you can dig up a reference.
5 Adversary Modeling
If you know the enemy and know yourself, you need not fear the result
of a hundred battles.
If you know yourself but not the enemy, for every victory gained you
will also suffer a defeat.
If you know neither the enemy nor yourself, you will succumb in every
battle.
- Sun Tzu, The Art of War (http://en.wikipedia.org/wiki/The_Art_of_War)
After deciding what you need to protect (your assets), you
need to know about the threats you wish to protect them against,
or the adversaries (sometimes called threat agents)
which may threaten it. Generally intelligence units have threat
shops, where they monitor and keep track of the people who may threaten
their operations. This is natural, since it is easier to get an idea
of who will try and do something than how some unspecified person
may try to do it, and can help by hardening systems in enemy territory
more than those in safer areas, leading to more efficient use of resources.
I shall call this adversary modeling.
In adversary modeling, the implicit assumptions are that you have
a limited budget and the number of threats is so large that you cannot
defend against all of them. So you now need to decide where to allocate
your resources. Part of this involves trying to figure out who your
adversaries are and what their capabilities and intentions are, and
thus how much to worry about particular domains of knowledge or technology.
You don't have to know their name, location and social security number;
it can be as simple as "some high school student on the Internet
somewhere who doesn't like us", "a disgruntled employee" (as
opposed to a gruntled employee), or "some sexually frustrated
script-kiddie on IRC who doesn't like the fact that he is a jerk who
enjoys abusing people and therefore his only friends are other dysfunctional
jerks like him". People in charge of doing attacker-centric threat
modeling must understand their adversaries and be willing to
take chances by allocating resources against an adversary which hasn't
actually attacked them yet, or else they will always be defending
against yesterday's adversary, and get caught flat-footed by a new
one.
5.1 Common Psychological Errors
The excellent but poorly titled book Searching for Happiness tells us that we make two common
kinds of errors when reasoning about other humans:
- Overly different; if you looked at grapes all day, you'd know a hundred
different kinds, and naturally think them very different. But they
all squish when you step on them, they are all fruits and frankly,
not terribly different at all. So too we are conditioned to see people
as different because the things that matter most to us, like finding
an appropriate mate or trusting people, cannot be discerned with questions
like "do you like breathing?". An interesting experiment showed
that a description of how they felt by people who had gone through
a process is more accurate in predicting how a person will feel after
the process than a description of the process itself. Put another
way, people assume that the experience of others is too dependent
on the minor differences between humans that we mentally exaggerate.
- Overly similar; people assume that others are motivated by the same
things they are motivated by; we project onto them a reflection of
ourselves. If a financier or accountant has ever climbed Mount Everest,
I am not aware of it. Surely it is a cost center, yes?
5.2 Cost-Benefit
Often, the lower layers of the security hierarchy cost more to build
out than the higher levels. Physical security requires guards, locks,
iron bars, shatterproof windows, shielding, and various other things
which, being physical, cost real money. On the other hand, network
security may only need a free software firewall. However, what an
adversary could cost you during a physical attack (e.g. a burglar
looting your home) may be greater than an adversary could cost you
by defacing your web site.
5.3 Risk Tolerance
We may assume that the distribution of risk tolerance among adversaries
is monotonically decreasing; that is, the number of adversaries who
are willing to try a low-risk attack is greater than the number of
adversaries who are willing to attempt a high-risk attack to get the
same result. Beware of risk evaluation though; while a hacker may
be taking a great risk to gain access to your home, local law enforcement
with a valid warrant is not going to be risking as much.
So, if you are concerned about a whole spectrum of adversaries, known
and unknown, you may wish to have greater network security than physical
security, simply because there are going to be more remote attacks.
5.4 Capabilities
Men of sense often learn from their enemies. It is from their foes,
not their friends, that cities learn the lesson of building high walls
and ships of war. - Aristophanes
You only have to worry about things to the extent they may lie within
the capabilities of your adversaries. It is rare that adversaries
use outside help when it comes to critical intelligence; it could,
for all they know, be disinformation, or the outsider could be an
agent-provocateur.
5.5 Sophistication Distribution
If they were capable, honest, and hard-working, they wouldn't need
to steal.
Along similar lines, one can assume a monotonically decreasing number
of adversaries with a certain level of sophistication. My rule of
thumb is that for every person who knows how to perform a technique,
there are x people who know about it, where x
is a small number, perhaps 3 to 10. The same rule applies to people
with the ability to write an exploit versus those able to download
and use it (the so-called script kiddies). Once an exploit
is coded into a worm, the chance of a compromised host having been
compromised by the worm (instead of a human who targets it specifically)
approaches 100%. Discuss Bayesian inference.
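The worm claim above is an instance of Bayes' rule. Here is a minimal sketch of the arithmetic; every number in it is invented for illustration (the attempt counts and per-attempt success probabilities are assumptions, not measurements), and the function name is hypothetical.

```python
def posterior_worm(worm_attempts, human_attempts, p_success_worm, p_success_human):
    """P(worm | host was compromised), computed with Bayes' rule.

    All inputs are hypothetical: how many attack attempts a host sees
    from each source, and the chance that a single attempt succeeds.
    """
    total = worm_attempts + human_attempts
    prior_worm = worm_attempts / total
    prior_human = human_attempts / total
    # Joint probability of "this source made the attempt and it succeeded".
    joint_worm = prior_worm * p_success_worm
    joint_human = prior_human * p_success_human
    return joint_worm / (joint_worm + joint_human)


# Illustrative only: 10,000 automated probes for every targeted attempt,
# with the human ten times more likely to succeed per attempt.
p = posterior_worm(10_000, 1, 0.01, 0.10)
print(f"P(worm | compromised) = {p:.4f}")
```

Even when the targeted human is assumed to be far more skilled per attempt, the sheer volume of automated attempts dominates the posterior.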
5.6 Goals
We've all met or know about people who would like nothing more than
to break things, just for the heck of it; schoolyard bullies who feel
hurt and want to hurt others, or their overgrown sadist kin. Vandals
who merely want to write their name on your storefront. A street thug
who will steal a cell phone just to throw it through a window. I'm
sure the sort of person reading this isn't like that, but unfortunately
some people are. What exactly are your adversary's goals? Are they
to maximize ROI (Return On Investment) for themselves, or are they
out to maximize pain (tax your resources) for you? Are they monetarily
or ideologically motivated? What do they consider investment? What
do they consider a reward? Put another way, you can't just assign
a dollar value on assets, you must consider their value to the adversary.
6 Threat Modeling
In technology, people tend to focus on how rather than who, which
seems to work better when anyone can potentially attack any system
(like with publicly-facing systems on the Internet) and when protection
mechanisms have low or no incremental cost (like with free and open-source
software). I shall call this threat modeling (http://en.wikipedia.org/wiki/Threat_model).
6.1 Attack Surface
Gnothi Seauton ("Know Thyself")
- ancient Greek aphorism (http://en.wikipedia.org/wiki/Know_thyself)
When discussing security, it's often useful to analyze the part which
may interact with a particular adversary (or set of adversaries).
For example, let's assume you are only worried about remote adversaries.
If your system or network is only connected to the outside world via the
Internet, then the attack surface is the parts of your system that
interact with things on the Internet, or the parts of your system
which accept input from the Internet. A firewall, then, limits the
attack surface to a smaller portion of your systems by filtering some
of your network traffic. Often, the firewall blocks all incoming connections.
Sometimes the attack surface is pervasive. For example, if you have
a network-enabled embedded device like a web cam on your network that
has a vulnerability in its networking stack, then anything which can
send it packets may be able to exploit it. Since you probably can't
fix the software in it, you must then use a firewall to attempt to
limit what can trigger the bug. Similarly, there was a bug in sendmail
that could be exploited by sending a carefully-crafted email through
a vulnerable server. The interesting bit here is that it might be
an internal server that wasn't exposed to the Internet; the exploit
was data-directed and so could be passed through your infrastructure
until it hit a vulnerable implementation. That's why I consistently
use one implementation (not sendmail) throughout my network now.
If plugging a USB drive into your system causes it to automatically
run things, as a standard Microsoft Windows XP install does, then any
plugged-in device is part of the attack surface (http://it.slashdot.org/article.pl?sid=08/01/13/1533243).
But even if it does not, then by plugging a USB device in you could
potentially overflow the code which handles the USB or the driver
for the particular device which is loaded (http://www.eweek.com/article2/0,1895,1840141,00.asp,
http://www.schneier.com/blog/archives/2006/06/hacking_compute.html);
thus, the USB networking code and all drivers are part of the attack
surface if you can control what is plugged into the system. Moreover,
a recent vulnerability (http://it.slashdot.org/it/08/01/14/1319256.shtml)
illustrates that when you have something which inspects network traffic,
such as UPnP devices or port knocking daemons, then their code forms
part of the attack surface.
Sometimes you will hear people talk about the "anonymous attack
surface"; this is the attack surface available to everyone (on the
Internet). Since this number of people is so large, and you usually
can't identify them or punish them, you want to be really sure that
the anonymous attack surface is limited and doesn't have any so-called
"pre-auth" vulnerabilities, because those can be exploited prior
to identification and authentication.
6.2 Attack Trees
The next logical step is to move from defining the attack surface
to modeling attacks and quantifying risk levels.
Microsoft is actually getting into this kind of stuff now and if I
can find something useful and novel from them I'll refer to it here.
6.3 The Weakest Link
Amdahl's law, also known as Amdahl's argument, is named after computer
architect Gene Amdahl, and is used to find the maximum expected improvement
to an overall system when only part of the system is improved.
- Wikipedia (http://en.wikipedia.org/wiki/Amdahl%27s_law)
Let us think of our security posture for whatever we're protecting
as being composed of a number of systems (or groups of systems possibly
offering defense-in-depth). The strength of these systems to attack
may vary. You may wish to pour all your resources into one, but the
security will likely be broken at the weakest point, either by chance
or by an intelligent adversary.
This is an analogy to Amdahl's law, stated above, in that we can only
increase our overall security posture by maintaining a delicate balance
between the different defenses to attack vectors. Most of the time,
your resources are best spent on the weakest area, which for some
institutions (financial, military) is usually personnel.
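The weakest-link argument can be made concrete with a toy model. The sketch below assumes, purely for illustration, that overall posture equals the strength of the weakest defense; the strength scale, the area names, and both function names are hypothetical.

```python
def overall_strength(defenses):
    """Model the posture as the strength of the weakest defense (an assumption)."""
    return min(defenses.values())


def spend_on_weakest(defenses, budget, step=1):
    """Greedily add `step` units of strength to whichever defense is weakest."""
    defenses = dict(defenses)  # work on a copy, leave the input untouched
    for _ in range(budget):
        weakest = min(defenses, key=defenses.get)
        defenses[weakest] += step
    return defenses


# Hypothetical strengths on an arbitrary 0-10 scale.
posture = {"network": 7, "physical": 5, "personnel": 2}

# Pouring four units into the strongest area leaves the weakest unchanged.
lopsided = dict(posture, network=posture["network"] + 4)

# Spending the same four units on the weakest area raises the floor.
balanced = spend_on_weakest(posture, budget=4)

print(overall_strength(lopsided), overall_strength(balanced))
```

Under this model, the same budget buys nothing when spent on the strongest area and a real improvement when spent on the weakest.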
The reasons you might not balance all security systems:
- Economics matter here; it may be much cheaper and more reliable to
buy a firewall than to put your employees through security training.
- Exposure matters; an Internet attack is much more likely than
a physical attack.
- Capability matters in that you may have a strong technical base
but a very naive view of people. This is the "you do what you're
good at" approach.
- Costs matter; you have only so many resources to put into things,
but not enough to balance them. Physical things like walls of thick
concrete often cost too much to consider.
7 Physical Security
When people think of physical security, their thinking often stops at
the strength of access control devices; I recall a story of a cat
burglar who used a chainsaw to cut through victims' walls, bypassing
any access control devices. I remember reading someone saying that
a deep space probe is the ultimate in physical security.
7.1 No Physical Security Means No Security
A couple of limitations come up without physical security for a system.
For confidentiality, all of the sensitive data needs to be encrypted.
But even if you encrypt the data, an adversary with physical access
could trojan the OS and capture the data (this is a control attack
now, not just confidentiality breach; go this far and you've protected
against overt seizure, theft, improper disposal and such). So you'll
need to protect the confidentiality and integrity of the OS. If you
protect the OS, he trojans the kernel. If you protect the kernel, he trojans the boot
loader. If you protect the boot loader (say by putting on a removable
medium), he trojans the BIOS. If you protect the BIOS, he trojans
the CPU. So you put a tamper-evident label on it, with your signature
on it, and check it every time. But he can install a keyboard logger.
So suppose you make a sealed box with everything in it, and connectors
on the front. Now he gets measurements and photos of your machine,
spends a fortune replicating it, replaces your system with an outwardly
identical one of his design (the trojan box), which communicates (say,
via encrypted spread-spectrum radio) to your real box. When you type
plaintext, it goes through his system, gets logged, and relayed to
your system as keystrokes. Since you talk plaintext, neither of you
is the wiser.
The physical layer is a common place to facilitate a side-channel
attack (see ).
7.2 Data Remanence
Data remanence is the residual physical representation of your
information on media after you believe that you have removed it (definition
thanks to Wikipedia, http://en.wikipedia.org/wiki/Data_remanence).
This is a disputed region of technology, with a great deal of speculation,
self-styled experts, but very little hard science.
As of 2006, the most definitive study seems to be the NIST Computer
Security Division paper Guidelines for Media Sanitization (http://csrc.nist.gov/publications/nistpubs/800-88/NISTSP800-88_rev1.pdf).
NIST is known to work with the NSA on some topics, and this may be
one of them. It introduces some useful terminology:
- disposing is the act of discarding media with no other considerations
- clearing is a level of media sanitization that resists anything
you could do at the keyboard or remotely, and usually involves overwriting
the data at least once
- purging is a process that protects against a laboratory attack
(signal processing equipment and specially trained personnel)
- destroying is the ultimate form of sanitization, and means that
the medium can no longer be used as originally intended
7.2.1 Magnetic Storage Media (Disks)
The seminal paper on this is Peter Gutmann's Secure Deletion
of Data from Magnetic and Solid-State Memory (http://www.cs.auckland.ac.nz/~pgut001/pubs/secure_del.html).
In early versions of his paper, he speculated that one could extract
data due to hysteresis effects even after a single overwrite, but
on subsequent revisions he stated that there was no evidence a single
overwrite was insufficient. Simson Garfinkel wrote about it recently
in his blog (https://www.techreview.com/blog/garfinkel/17567/).
The NIST paper has some interesting tidbits in it. Obviously, disposal
cannot protect confidentiality of unencrypted media. Clearing is probably
sufficient security for 99% of all data; I highly recommend Darik's
Boot and Nuke (http://dban.sourceforge.net/), which is a bootable
floppy or CD based on Linux. However, it cannot work if the storage
device stops working properly, and it does not overwrite sectors or
tracks marked bad and transparently relocated by the drive firmware.
With all ATA drives over 15GB, there is a "secure erase" ATA
command which can be accessed from hdparm within Linux, and Gordon
Hughes has some interesting documents and a Microsoft-based utility
(http://cmrr.ucsd.edu/people/Hughes/SecureErase.shtml). There's
a useful blog entry about it (http://storagemojo.com/2007/05/02/secure-erase-data-security-you-already-own/).
In the case of very damaged disks, you may have to resort to physical
destruction. However, with disk densities being what they are, even
1/125" of a disk platter may hold a full sector, and someone with
absurd amounts of money could theoretically extract small quantities
of data. Fortunately, nobody cares this much about your data.
Now, you may wonder what you can do about very damaged disks, or what
to do if the media isn't online (for example, you buried it in an
underground bunker), or if you have to get rid of the data fast. I
would suggest that encrypted storage (see )
would almost always be a good idea. If you use it, you merely have
to protect the confidentiality of the key, and if you can properly
sanitize the media, all the better. Recently Simson Garfinkel re-discovered
a technique for getting the data off broken drives; freezing them.
Another technique that I have used is to replace the logic board with
one from a working drive.
7.2.2 Semiconductor Storage (RAM)
Peter Gutmann's Data Remanence in Semiconductor Devices (http://www.cypherpunks.to/~peter/usenix01.pdf)
shows that if a particular value is held in RAM for extended periods
of time, various processes such as electromigration make permanent
changes to the semiconductor's structure. In some cases, it is possible
for the value to be "burned in" to the cell, such that it cannot
hold another value.
Cold Boot Attack
Recently a Princeton team (http://citp.princeton.edu/memory/)
found that the values held in DRAM decay in predictable ways after
power is removed, such that one can merely reboot the system and recover
keys for most encrypted storage systems (http://citp.princeton.edu/pub/coldboot.pdf).
If the chip is cooled first, the data persists longer. This generated
much talk in the industry, and prompted an interesting overview of
attacks against encrypted storage systems (http://www.news.com/8301-13578_3-9876060-38.html).
8 Distributed Systems
8.1 Network Security Overview
The things involved in network security are called nodes. One
can talk about networks composed of humans (social networks), but
that's not the kind of network we're talking about here; I always
mean a computer unless I say otherwise. Often in network security
the adversary is assumed to control the network in whole or part;
this is a bit of a holdover from the days when the network was radio,
or when the node was an embassy in a country controlled by the adversary.
In modern practice, this doesn't seem to usually be the case, but
it'd be hard to know for sure. In the application of network security
to the Internet, we almost always assume the adversary controls at
least one of the nodes on the network.
In network security, we can lure an adversary to a system,
tempt them with something inviting; such a system is called a honeypot,
and a network of such systems is sometimes called a honeynet.
A honeypot may or may not be instrumented for careful monitoring;
sometimes systems so instrumented are called fishbowls, to
emphasize the transparent nature of activity within them. Often one
doesn't want to allow a honeypot to be used as a launch point for
attacks, so outbound network traffic is sanitized or scrubbed;
if traffic to other hosts is blocked completely, some people call
it a jail, but that is also the name of an operating system
security technology used by FreeBSD, so I consider it confusing.
To reduce a distributed system problem to a physical security (see
7) problem, you can use an air gap,
or sneakernet between one system and another. However, the
data you transport between them may be capable of exploiting the offline
system. One could keep a machine offline except during certain windows;
this could be as simple as a cron job which turns on or off the network
interface via ifconfig. However, an offline system may be difficult
to administer, or keep up-to-date with security patches.
8.2 Network Access Control: Packet Filters, Firewalls, Security Zones
Most network applications use TCP, a connection-oriented protocol,
and they use a client/server model. The client initiates a
handshake with the server, and then they have a conversation.
Sometimes people use the terms client and server to mean the application
programs, and other times they mean the node itself. Other names for
server applications include services and daemons. Obviously
if you can't speak with the server at all, or (less obviously) if
you can't properly complete a handshake, you will find it difficult
to attack the server application. This is what a packet filter
does; it allows or prevents communication between a pair of sockets.
A packet filter does not generally do more than a simple all-or-nothing
filtering.
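To make the all-or-nothing nature of a packet filter concrete, here is a minimal sketch of first-match rule evaluation. It is not how any particular firewall is implemented; the Rule fields, the decide function, and the addresses (taken from the documentation ranges) are all assumptions for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    src: str     # source network in CIDR form
    dst: str     # destination network in CIDR form
    dport: int   # destination TCP port
    allow: bool  # True = pass the packet, False = drop it


def decide(rules, src_ip, dst_ip, dport, default=False):
    """Return the verdict of the first matching rule, else the default (drop)."""
    for r in rules:
        if (ip_address(src_ip) in ip_network(r.src)
                and ip_address(dst_ip) in ip_network(r.dst)
                and dport == r.dport):
            return r.allow
    return default


RULES = [
    # Anyone may reach the (hypothetical) web server on 443.
    Rule("0.0.0.0/0", "192.0.2.10/32", 443, True),
    # Only the (hypothetical) admin network may reach SSH.
    Rule("198.51.100.0/24", "192.0.2.10/32", 22, True),
]

print(decide(RULES, "203.0.113.5", "192.0.2.10", 22))
```

Note that the verdict depends only on addresses and ports; the filter never looks at what the conversation says.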
The firewall was originally defined as a device between different
networks that had different security characteristics; it was named
after the barrier between an automobile interior and the engine, which
is designed to prevent an engine fire from spreading to the passenger
cabin. As our understanding of network security improved, people started
to define various parts of their network. The canonical types of networks
are:
- Trusted networks were internal to your corporation.
- An untrusted network may be the Internet, or a wifi network,
or any network with open, public access.
- Demilitarized zones (DMZs) were originally defined as an area
for placing machines that must talk to nodes on both trusted and untrusted
networks. At first they were placed outside the firewall but inside
a border router, then as a separate leg of the firewall, and now
are defined and protected in a variety of ways.
What these definitions all have in common is that they end up defining
security zones (this term thanks to the authors of Extreme
Exploits). All the nodes inside a security zone have roughly equivalent
access to or from other security zones. I believe this is the most
important and fundamental way of thinking of network security. Do
not confuse this with the idea that all the systems in the zone have
the same relevance to the network's security, or that the systems
have the same impact if compromised; that is a complication and more
of a matter of operating system security than network security. In
other words, two systems (a desktop and your DNS server) may not be
security equivalent, but they may be in the same security zone.
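One way to picture security zones is as a lookup from node to zone, plus a policy stated only in terms of zones. The sketch below is a toy model; the node names, the zone names, and the allowed pairs are all assumptions chosen for illustration.

```python
# Hypothetical assignment of nodes to zones.
ZONE_OF = {
    "desktop": "trusted",
    "dns": "trusted",
    "www": "dmz",
    "stranger": "untrusted",
}

# Hypothetical policy: which (source zone, destination zone) pairs may connect.
ALLOWED = {
    ("trusted", "dmz"),
    ("trusted", "untrusted"),
    ("untrusted", "dmz"),
}


def may_connect(src_node, dst_node):
    """Access is decided by zone membership alone, never by the individual node."""
    src, dst = ZONE_OF[src_node], ZONE_OF[dst_node]
    return src == dst or (src, dst) in ALLOWED


# The desktop and the DNS server get identical answers because they share
# a zone, even though compromising each would have a very different impact.
print(may_connect("desktop", "www"), may_connect("dns", "www"))
```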
8.3 Network Reconnaissance: Ping Sweeps, Port Scanning
Typically an adversary needs to know what he can attack before he
can attack it. This is called reconnaissance, and involves
gathering information about the target and identifying ways in which
he can attack the target. In network security, the adversary may want
to know what systems are available for attack, and a technique such
as a ping sweep of your network block may facilitate this.
Then, he may choose to enumerate (get a list of) all the services
available via a technique such as a port scan. A port scan
may be a horizontal scan (one port, many IP addresses) or vertical
scan (one IP address, multiple ports), or some combination thereof.
You can sometimes determine what service (and possibly what implementation)
it is by banner grabbing or fingerprinting the service.
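The difference between horizontal and vertical scans is just a difference in which (address, port) pairs get tried. The sketch below only enumerates those pairs and sends no packets; the function names and the example addresses (from a documentation range) are assumptions.

```python
from ipaddress import ip_network


def horizontal_targets(cidr, port):
    """One port, many addresses: every usable host in the block."""
    return [(str(host), port) for host in ip_network(cidr).hosts()]


def vertical_targets(address, ports):
    """One address, many ports."""
    return [(address, port) for port in ports]


h = horizontal_targets("192.0.2.0/30", 22)
v = vertical_targets("192.0.2.1", range(20, 26))
print(len(h), len(v))
```

A combined scan is simply the cross product of an address list and a port list.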
In an ideal world, knowing that you can talk to a service does not
matter. Thus, a port scan should only reveal what you already assumed
your adversary already knew. However, it is considered very rude,
even antisocial, like walking down the street and trying to open the
front door of every house or business that you pass; people will
assume you are trying to trespass, and possibly illicitly copy their
data.
Typical tools used for network reconnaissance include:
8.4 Network Intrusion Detection and Prevention
Most security-conscious organizations are capable of detecting most
scans using [network] intrusion detection systems (IDS) or intrusion
prevention systems (IPS); see .
8.5 Cryptography is the Sine Qua Non of Secure Distributed Systems
All cryptography lets you do is create trust relationships across
untrustworthy media; the problem is still trust between endpoints
and transitive trust.
- Marcus Ranum
Put simply, you can't have a secure distributed system (with the normal
assumptions of untrusted nodes and network links potentially controlled
by the adversary) without using cryptography somewhere ("sine
qua non" is Latin for "without which it could not be"). If
the adversary can read communications, then to protect the confidentiality
of the network traffic, it must be encrypted. If the adversary can
modify network communication, then it must have its integrity protected
and be authenticated (that is, to have the source identified). Even
physical layer communication security technologies, like the KLJN
cipher, quantum cryptography, and spread-spectrum communication, use
cryptography in one way or another.
I would go further and say that basing network security decisions
on anything other than cryptographic keys is never going to be as
strong as basing them on cryptography. Very few Internet adversaries
currently have the capability to arbitrarily route data around. Most
cannot jump between VLANs on a tagged port. Some don't even have the
capability to sniff on their LAN. But none of the mechanisms preventing
this are stronger than strong cryptography, and often they are much
weaker, possibly only security through obscurity. Let me put it to
you this way; to support a general argument otherwise, think about
how much assurance a firewall has that a packet claiming to be from
a given IP address is actually from the system the firewall maintainer
believes it to be. Often these things are complex, and way beyond
his control. However, it would be totally reasonable to filter on
IP address first, and only then allow a cryptographic check; this
makes it resistant to resource consumption attacks from anyone who
cannot spoof a legitimate IP address (see 3.1.1).
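The "filter on IP address first, then do the cryptographic check" ordering can be sketched as follows. This is a minimal illustration using Python's standard hmac module; the allowed network, the shared key, and the accept function are hypothetical, and a real protocol would also need replay protection and key management.

```python
import hashlib
import hmac
from ipaddress import ip_address, ip_network

ALLOWED_NETS = [ip_network("198.51.100.0/24")]   # hypothetical peer network
SHARED_KEY = b"example-key-not-for-production"   # hypothetical shared secret


def accept(src_ip, message, tag):
    """Cheap address filter first, then the check that actually carries the trust."""
    # Step 1: inexpensive pre-filter. It sheds load from anyone who cannot
    # spoof an allowed address, but proves nothing about identity.
    if not any(ip_address(src_ip) in net for net in ALLOWED_NETS):
        return False
    # Step 2: the HMAC verification is the real security decision.
    expected = hmac.new(SHARED_KEY, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


good_tag = hmac.new(SHARED_KEY, b"hello", hashlib.sha256).digest()
print(accept("198.51.100.9", b"hello", good_tag))
```

The expensive cryptographic work is only reached by traffic that already passed the cheap test, which is the point of the ordering.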
8.6 Hello, My Name is 192.168.1.1
Humans are incapable of securely storing high-quality cryptographic
keys, and they have unacceptable speed and accuracy when performing
cryptographic operations. (They are also large, expensive to maintain,
difficult to manage, and they pollute the environment. It is astonishing
that these devices continue to be manufactured and deployed. But they
are sufficiently pervasive that we must design our protocols around
their limitations).
- Network Security / PRIVATE Communication in a PUBLIC World by Charlie
Kaufman, Radia Perlman, & Mike Speciner (Prentice Hall 2002; p.237)
Because humans communicate slowly, in plaintext, and don't plug
into a network, we consider the nodes within the network to be computing
devices. The system a person interacts with has equivalency with them;
break into the system administrator's console, and you have access
to anything he or she accesses. In some cases, you may have access
to anything he or she can access. You may think that your LDAP
or Kerberos server is the most important, but isn't the node of the
guy who administers it just as critical? This is especially true if
OS security is weak and any user can control the system, or if the
administrator is not trusted, but it is also convenient because packets
do not have user names, just source IPs. When some remote system connects
to a server, unless both are under the control of the same entity,
the server has no reason to trust the remote system's claim about
who is using it, nor does it have any reason to treat one user on
the remote system different than any other.
8.7 Source Tapping; The First Hop and Last Mile
One can learn a lot more about a target by observing the first link
from them than from some more remote place. That is, the best vantage
point is one closest to the target. For this reason, the first hop
is far more critical than any other. An exception may involve a target
that is more network-mobile than the eavesdropper. The more common
exception is tunneling/encryption (including Tor and VPN technologies);
these relocate the first hop somewhere else which is not physically
proximate to the target's meat space coordinates, which may make it
more difficult to locate.
Things to consider here involve the difficulty of interception, which
is a secondary concern (it is never all that difficult). For example,
it is probably less confidential from the ISP to use an ISP's caching
proxy than to access the service directly, since most proxy software
makes it trivial to log the connection and content; however, one should
not assume that one is safe by not using the proxy (especially now
that many do transparent proxying). However, it is less anonymous
from the remote site to access the remote site directly; using the
ISP's proxy affords some anonymity (unless the remote site colludes
with the ISP).
8.8 Security Equivalent Things Go Together
One issue that always seems to come up is availability versus other
goals. For example, suppose you install a new biometric voice recognition
system. Then you have a cold and can't get in. Did you prioritize
correctly? Which is more important? Similar issues come up in almost
every place with regard to security. For example, your system may
authenticate users against a global server, or it may have a local
database for authentication. The former means that one can revoke
a user's credentials globally immediately, but also means that if
the global server is down, nobody can authenticate. Attempts to get
the best of both worlds ("authenticate locally if global server
is unreachable") often reduce to availability (the adversary just DoSes
the link between system and global server to force local authentication).
My philosophy on this is simple; put like things together. That is,
I think authentication information for a system should be on the system.
That way, the system is essentially a self-contained unit. By spreading
the data out, one multiplies potential attack targets, and reduces
availability. If someone can hack the local system, then being able
to alter a local authentication database is relatively insignificant.
8.9 A Proposed Perimeter Defense
I believe the following would be a useful design for perimeter
defenses for most organizations and individuals. First, there would
be an outer layer of reactive prevention, followed by an inner layer
of prevention and detection that acts as a fail-safe mechanism. If
the outer preventative defense should fail for some reason (hardware,
software, configuration) then incoming connections will be stopped
by the inner layer and the detection will notify us that something
is wrong. The idea of a dual layer of firewalling is already becoming
popular with financial institutions and military networks, but really
derives itself from the lessons learned trying to guarantee high availability
and specifically the goal of eliminating single points of failure.
This system also doesn't require monitoring traffic blocked by the
outer layer, which virtually eliminates the resources it takes to
monitor traffic that gets blocked anyway. However, if the outer layer
were not reactive, then we would effectively be discarding any useful
intelligence that is gained by detecting probes (that is, a failed
connection or attack is still valuable in determining intent).
With a reactive firewall as the outer layer, when an adversary probes
our defenses looking for holes or weak spots, we take appropriate
action, usually shunning that network address, and this makes enumeration
a much more difficult process. With a little imagination, we can construct
more deceptive defensive measures, like returning random responses,
or redirection to a honey-net (which is essentially just a consistent
set of bogus responses, plus monitoring). Since enumeration is strictly
an information-gathering activity, the obvious countermeasure is deception.
The range of deceptive responses runs from none (that is, complete
silence, or lack of information) through random responses (misinformation)
to consistent, strategic deception (disinformation). Stronger responses
are out of proportion to the provocation (network scans are legal
in most countries), and often illegal in any circumstances.
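The shunning behavior of a reactive outer layer can be sketched as a small state machine. This is an illustration only, not a description of any real firewall product; the class name, thresholds, and addresses are assumptions, and the response modeled here is the mildest one on the spectrum (silence).

```python
import time


class ReactiveFilter:
    """Shun a source address after it probes too many closed ports."""

    def __init__(self, open_ports, max_probes=3, shun_seconds=3600):
        self.open_ports = set(open_ports)
        self.max_probes = max_probes
        self.shun_seconds = shun_seconds
        self.probes = {}    # source address -> set of closed ports it tried
        self.shunned = {}   # source address -> time at which the shun expires

    def handle(self, src, port, now=None):
        """Return True if the connection attempt should be allowed."""
        now = time.time() if now is None else now
        if self.shunned.get(src, 0) > now:
            return False    # shunned: answer with silence, even on open ports
        if port in self.open_ports:
            return True
        tried = self.probes.setdefault(src, set())
        tried.add(port)
        if len(tried) >= self.max_probes:
            self.shunned[src] = now + self.shun_seconds
        return False


fw = ReactiveFilter(open_ports={443})
for probe_port in (21, 22, 23):
    fw.handle("203.0.113.9", probe_port, now=1)
print(fw.handle("203.0.113.9", 443, now=2))
```

Once shunned, the prober can no longer tell open ports from closed ones, which is what makes enumeration harder.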
8.10 Man In The Middle
How do we detect MITM or impersonation in web, PGP/GPG, SSH contexts?
The typical process for creating a connection involves a DNS resolution
at the application layer (unless you use IP addresses), then sending
packets to the IP address (at the network layer), which have to be
routed; at the link layer, ARP typically is used to find the next
hop at each stage.
8.10.1 DNS Issues
Poisoning, spoofing (transaction ID issues), or maybe you are querying
a DNS server the adversary controls (e.g. your ISP's)
8.10.2 IP Routing
Announcing bogus routes, or topological considerations
8.10.3 Link-layer Issues
ARP spoofing or poisoning
8.10.4 Physical Layer
Tapping the wire (or listening to wireless)
8.10.5 Periodic Rechecking
It's difficult to stay perpetually in the middle. When you aren't,
typically, the cryptographic fingerprints will no longer match and
the MITM will be detected. It's handy to occasionally compare them
using different channels, so that if the ones you originally relied
upon were proxied, the tampering will be detected. SSH does this automatically,
in what is called the baby duck model (that is, it bonds to the first
thing it sees, and complains if it changes identities). However, this
detects the problem only retroactively.
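The baby duck model can be sketched in a few lines of Python. This is an illustration of the idea only, not how SSH stores its known hosts; the KnownHosts class, its method names, and the choice of SHA-256 for the fingerprint are assumptions made for the example.

```python
import hashlib


class KnownHosts:
    """Bond to the first key seen for a host; complain if it ever changes."""

    def __init__(self):
        self._pins = {}   # host name -> fingerprint recorded on first contact

    @staticmethod
    def fingerprint(public_key_bytes):
        return hashlib.sha256(public_key_bytes).hexdigest()

    def check(self, host, public_key_bytes):
        """Return 'new', 'match', or 'CHANGED' for this connection."""
        seen = self.fingerprint(public_key_bytes)
        pinned = self._pins.get(host)
        if pinned is None:
            self._pins[host] = seen       # first contact: nothing to compare against
            return "new"
        return "match" if pinned == seen else "CHANGED"
```

Note that the first call for a host always returns "new", which is exactly the weakness: if the adversary was in the middle at that moment, the wrong key is the one that gets remembered.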
8.10.6 Out-of-Band Comparison
One can compare digests/fingerprints/hashes over a different, low-bandwidth
communication medium (e.g. the phone, postal mail).
8.10.7 Parallel Paths
OOB comparison is really an example of creating two disjoint paths
between two entities and making sure that they give the same results.
This can occur in multiple contexts. For example, it can be used for
the bootstrapping problem; how can I trust the first connection? By
creating two paths I can compare the identities of the peer both places.
I once used this to check the integrity of my PGP download by downloading
it from home and from another location, and comparing the results.
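Comparing two copies fetched over different paths comes down to comparing digests. A minimal Python sketch, assuming the two copies are already in memory as bytes; the function names digest and paths_agree are made up for the example, and SHA-256 is my choice rather than anything the comparison requires.

```python
import hashlib
import hmac


def digest(data: bytes) -> str:
    """Short, printable summary of a download, suitable for reading aloud."""
    return hashlib.sha256(data).hexdigest()


def paths_agree(copy_a: bytes, copy_b: bytes) -> bool:
    """True if the copies obtained over two paths are identical."""
    return hmac.compare_digest(digest(copy_a), digest(copy_b))
```

Agreement only tells you that both paths delivered the same bytes. If the paths are not really disjoint (for example, the adversary sits next to the server), both copies can be tampered with identically.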
Imagine that the adversary is conducting a MITM against, say, an SSH
session, so instead of A<->B it is A<->O<->B. Your countermeasure
as A may be to check the IP addresses of the peer at B, so that the
adversary would have to spoof IPs in both directions (this is often
printed automatically at login). Another technique is to check the
host key fingerprint as part of your login sequence, sending the fingerprint
through the tunneled connection. The adversary may modify the data
at the application layer automatically, to change the fingerprint
on the way through. But what if you transformed (e.g. encrypted) the
fingerprint using a command-line tool, and represented it as printable
characters, and printed them through the tunnel, and inverted the
transformation at the local end? Then he'd have a very difficult time
writing a program to detect this, especially if you kept the exact
mechanism a secret. You could run the program automatically through
ssh, so it isn't stored on the remote system.
8.11 Network Surveillance
9 Identification and Authentication
Identification is necessary before making any sort of access control
decisions. Often it can reduce abuse, because an identified individual
knows that if they do something there can be consequences or sanctions.
For example, if an employee abuses the corporate network, they may
find themselves on the receiving end of the sysadmin's luser attitude
readjustment tool (LART). I tend to think of authentication as a process
you perform on objects (like paintings, antiques, and digitally signed
documents), and identification as a process that subjects (people)
perform, but in network security you're really looking at data created
by a person for the purpose of identifying them, so I use them interchangeably.
9.1 Identity
Sometimes I suspect I'm not who I think I am.
- Ghost in the Shell
An identity, for our purposes, is an abstract concept; it does
not map to a person, it maps to a persona. Some people call
this a digital ID, but since this paper doesn't talk about
non-digital identities, I'm dropping the qualifier. Identities are
different from credentials, which are something you use to
prove identity. For example, your login password is a credential.
In relational database design, it is considered a good practice for
the primary key (http://en.wikipedia.org/wiki/Primary_key)
of a table to be an integer, perhaps a row number, that is not used
for anything else. That is because the primary key is used as an identifier
for the row. An identifier is shorthand, a handle; like a pointer,
it allows us to modify the object itself, so that the modification
occurs in all places simultaneously. Most competent DBAs realize that
people change names, phone numbers, locations, and so on; they may
even change social security numbers. They also realize that people
may share any of these things (even social security numbers are not
necessarily unique, especially if they lie about it). So to be able
to identify a person across any of these changes, you need to use
a row number. The exact same principle applies with security systems.
In Unix, a person is given a username (identity) and a password (credential).
This is good, because the password may be changed without losing the
idea of the identity of the person. However, there are subtle gotchas.
In actuality, the username is mapped to a user ID (UID), which is
the real way that Unix keeps track of identity. It isn't necessarily
a one-to-one mapping. Also, a poor system administrator may reassign
an unused user ID without going through the file system and looking
for files owned by the old user, in which case their ownership is
silently reassigned.
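The UID reuse gotcha is easy to see in a toy model. The Python sketch below uses two plain dictionaries standing in for the password file and the file system; the user names, UID values, and path are invented, and owner_names is a hypothetical helper.

```python
passwd = {"alice": 1001, "bob": 1002}         # username -> UID
files = {"/home/alice/notes.txt": 1001}       # path -> owning UID


def owner_names(path):
    """All usernames that currently map to the file's owning UID."""
    uid = files[path]
    return sorted(name for name, u in passwd.items() if u == uid)


# alice's account is removed, but nobody looks for her files.
del passwd["alice"]
orphaned = owner_names("/home/alice/notes.txt")   # no name maps to 1001 now

# Later the unused UID is handed to a new user.
passwd["carol"] = 1001
```

After the last line, owner_names("/home/alice/notes.txt") names carol: the file never changed, but its ownership did. The helper returns a list because nothing stops two usernames from sharing one UID.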
PGP and GPG made the mistake of using a cryptographic key as an identifier.
If one has to revoke that key, one basically loses anything (such
as signatures) which applied to that key, and the trust that other
people have indicated towards that key. And if you have multiple keys,
friends of yours who have all of them cannot treat them all as equivalent,
since GPG can't be told that they are associated to the same identity,
because the keys are the identity. Instead, they must manage
statements about you (such as how much they trust you to act as an
introducer) on each key independently.
9.2 What Authority?
Does it follow that I reject all authority? Far from me such a thought.
In the matter of boots, I refer to the authority of the bootmaker;
concerning houses, canals, or railroads, I consult that of the architect
or the engineer.
- Mikhail Bakunin, What is Authority? 1882 (http://www.panarchy.org/bakunin/authority.1871.html)
When we are attempting to identify someone, we are relying upon some
authority, usually the state government. When you register a domain
name with a registrar, they record your personal information in the
WHOIS database; this is the system of record (http://en.wikipedia.org/wiki/System_of_record).
No matter how careful we are, we can never have a higher level of
assurance than this authority has. If the government gave that person
a false identity, or the person bribed a DMV clerk to do so, we can
do absolutely nothing about it. This is an important implication of
the limitations of accuracy (see 3.5).
9.3 Authentication Factors
There are many ways you can prove your identity to a system. They
may include:
- something you are
- like biometric signatures such as the pattern
of capillaries on your retina, your fingerprints, etc.
- something you have
- like a token, physical key, or thumb drive
- something you know
- like a passphrase or password
- somewhere you are
- if you put a GPS device in a computer, or
did direction-finding on transmissions, or simply require a person
to be physically present somewhere to operate the system
- somewhere you can be reached
- like a mailing address, network
address, email address, or phone number
At the risk of self-promotion, I want to point out that, to my knowledge,
the last factor has not been explicitly stated in computer security
literature, although it is demonstrated every time a web site emails
you your password, or every time a financial company mails something
to your home.
9.4 Authentication Issues: When, What
Do we authenticate each transaction or command (sudo), or a session
(SSH), or only certain commands (passwd)? What is being authenticated,
the remote system, the agent, the user, or the data?
9.5 The Identity Continuum
Identification can range from fully anonymous to pseudonymous, to
full identification. Ensuring identity can be expensive, and is never
perfect. Think about what you are trying to accomplish. This applies to
cookies from web sites, email addresses, "real names", and so
on.
9.6 Problems Remaining Anonymous
In cyberspace everyone will be anonymous for 15 minutes.
- Graham Greenleaf
What can we learn from anonymizer, mixmaster, tor, and so on? Often
one can de-anonymize. Some people have de-anonymized search queries
this way, and census data, and many more data sets that are supposed
to be anonymous.
9.7 Problems with Identifying People
- Randomly-Chosen Identity
- Fictitious Identity
- Stolen Identity
9.8 Remote Attestation
Knowing that the remote system is a particular program or piece of
hardware is a concept in network security called remote
attestation. When I connect securely over the network to a machine
I believe I have full privileges on, how do I know I'm actually talking
to the machine, and not a similar system controlled by the adversary?
This is usually attempted by hiding an encryption key in some tamper-proof
part of the system, but is vulnerable to all kinds of disclosure and
side-channel attacks, especially if the owner of the remote system
is the adversary.
The most successful example seems to be the satellite television industry,
where they embed cryptographic and software secrets in an inexpensive
smart card with restricted availability, and change them frequently
enough that the resources required to reverse engineer each new card
exceed the cost of the data it is protecting. In the satellite TV
industry, there's something they call ECMs (electronic counter-measures),
which are program updates of the form "look at memory location
0xFC, and if it's not 0xFA, then HCF" (Halt and Catch Fire). The
obvious crack is to simply remove that part of the code, but then
you will trigger another check that looks at the code for the first
check, and so on.
The sorts of non-cryptographic self-checks they request the card to
do, such as computing a checksum (such as a CRC) over some memory
locations, are similar to the sorts of protections against reverse
engineering, where the program computes a checksum to detect modifications
to itself.
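A checksum self-check of this kind can be sketched in Python with the standard zlib CRC-32. The memory image, the region boundaries, and the make_check helper are all hypothetical; a real card would run something equivalent in its own firmware.

```python
import zlib


def make_check(memory: bytes, start: int, end: int):
    """Record the CRC-32 of memory[start:end] and return a checker for it."""
    expected = zlib.crc32(memory[start:end])

    def check(current: bytes) -> bool:
        # Recompute over the same region and compare with the recorded value.
        return zlib.crc32(current[start:end]) == expected

    return check


image = bytes(range(64))              # stand-in for the card's memory
check = make_check(image, 16, 32)

patched = bytearray(image)
patched[20] ^= 0xFF                   # the "crack": modify a byte in the region
```

Here check(image) passes and check(bytes(patched)) fails. A CRC is not cryptographic, so it only catches modifications made by someone who did not also fix up the checksum, which is why such checks are layered to watch one another.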
9.9 Advanced Authentication Tools
10 Authorization - Access Control
10.1 Privilege Escalation
Ideally, all services would be impossible to abuse. Since this is
difficult or impossible, we often restrict access to them, to limit
the potential pool of adversaries. Of course, if some users can do
some things and others can't, this creates the opportunity for the
adversary to perform an unauthorized action, but that's often unavoidable.
For example, you probably want to be able to do things to your computer,
like reformat it and install a new operating system, that you wouldn't
want others to do. You will want your employees to do things an anonymous
Internet user cannot (see 3.3). Thus, many
adversaries want to escalate their privileges to that of some more
powerful user, possibly you. Generally, privilege escalation
attacks refer to techniques that require some level of access above
that of an anonymous remote system, but grant an even higher level
of access, bypassing access controls.
They can come in horizontal (user becomes another user) or
vertical (normal user becomes root or Administrator) escalations.
10.2 Physical Access Control
These include locks. I like Medeco, but none are perfect. It's easy
to find guides to lock picking:
10.3 Operating System Access Control: DAC, MAC, RBAC
Discretionary Access Control (DAC) is up to the end-user. They
can choose to let other people write to their files, if they wish,
and the defaults tend to be global. This is how file permissions on
classic Unix and Windows work. A more secure system often involves
Mandatory Access Control (MAC), where the security administrator
sets up the permissions globally. Some MAC types are Type Enforcement
and Domain Type Enforcement. Implementations include SELinux and systrace.
Often they are combined, where the access request has to pass both
tests, meaning that the effective permission set is the intersection
of the MAC and DAC permissions. Another way of looking at
it is that MAC sets the maximum permissions that DAC can give. Role-Based
Access Control (RBAC) could be considered a form of MAC. In RBAC,
there are roles to whom permissions are assigned, and one switches
roles to change permission sets. For example, you might have a security
administrator role, but you don't need that to read email or surf
the web, so you only switch to it when doing security administrator
stuff. This prevents you from accidentally running malware with full
permissions. Unix emulates this with pseudo-users and sudo.
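When MAC and DAC are combined, the effective permissions are a set intersection. A tiny Python sketch, with permission names and the effective function invented for illustration:

```python
def effective(mac_allows: set, dac_allows: set) -> set:
    """A request must pass both tests, so only the overlap survives."""
    return mac_allows & dac_allows


mac_policy = {"read", "write"}             # ceiling set by the security administrator
dac_choice = {"read", "write", "execute"}  # what the file's owner chose to grant
granted = effective(mac_policy, dac_choice)
```

In this example the owner granted execute, but the MAC policy never allowed it, so it does not appear in the result. DAC can only narrow what MAC permits, never widen it.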
10.4 Application Authorization Decisions
There are many applications which have tried to allow some users to
perform some functions, but not others. Let's forget what we're trying
to authorize, and focus on information about the requester.
For example, network-based authorization may depend on (in descending
order of value):
- cryptographic key
- MAC address
- IP address
- port number
An operating system authorization usually depends on:
- Being root or Administrator (uid=0 in Unix)
- The identity of the user, this being the effective UID (or EUID in
Unix)
- The group(s) in which that user participates
- Tags, labels, and other things related to advanced topics (see 10.3)
There are other factors involved in authorization decisions but these
are just examples. Instead of tying things to one system, let's keep
it simple and pretend we're allowing or denying natural numbers, rather
than usernames or things of that nature. Let's also define some access
control matching primitives such as:
- odd
- even
- prime
- less than x
- greater than y
In a well-designed system these primitive functions would be rather
complete and not the few we have here. Further, there should be some
easy way to compose these tests to achieve the desired access control:
Systems which do not do this kind of authorization are necessarily
incomplete, and cannot express all desired combinations of sets.
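As a sketch of what composable matching primitives might look like, here are the tests above written as Python predicates over natural numbers. The three combinators (all_of, any_of, negate) are my assumption about what a reasonable composition mechanism would offer; the names are made up for the example.

```python
def is_odd(n):
    return n % 2 == 1


def is_even(n):
    return n % 2 == 0


def is_prime(n):
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


def less_than(x):
    return lambda n: n < x


def greater_than(y):
    return lambda n: n > y


# Hypothetical combinators: each takes tests and returns a new test.
def all_of(*tests):
    return lambda n: all(t(n) for t in tests)


def any_of(*tests):
    return lambda n: any(t(n) for t in tests)


def negate(test):
    return lambda n: not test(n)


# "Everyone except primes greater than two."
allowed = negate(all_of(is_prime, greater_than(2)))
```

With these, allowed(2) and allowed(4) are true while allowed(3) is false. Because every combinator returns another test, the results can be nested to describe any set that the primitives can distinguish.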
10.4.1 Apache Access Control
Apache has three access control directives:
- Allow: specifies who can use the resource
- Deny: specifies who cannot use the resource
- Order: specifies the ordering of evaluation of those directives
as either 'deny, allow', 'allow, deny', or mutual-failure.
  - deny, allow means that the deny directives are evaluated
  first, and is the default. This basically is an example of enumerating
  badness. This may make sense for a public
  webserver where anyone on the Internet should be able to browse, but
  blacklisting is not an effective way to run a secure operation.
  - allow, deny is the more secure option, only
  allowing those who pass the allow operation to continue, but it still
  processes the deny section and anyone who was allowed in and then
  later denied is still rejected.
  - mutual-failure means hosts that appear on the allow list
  but do not appear on the deny list are granted access. This seems to
  be redundant with "allow, deny".
This is unfortunately quite confusing, and it's hard to know where
to start. By having an allow list and a deny list, we have four sets
of objects defined:
- Those that are neither allowed nor denied
- Those that are allowed
- Those that are denied
- Those that are both allowed and denied
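The truth table for this is as follows, with the columns numbered in the
order of the four sets above (DA is deny, allow; AD is allow, deny; MF is
mutual-failure; D means default, O means open, X means denied):
| Order | 1 | 2 | 3 | 4 |
|-------|---|---|---|---|
| DA    | D | O | X | O |
| AD    | D | O | X | X |
| MF    | D | O | X | X |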
Do you see what I mean? AD and MF are essentially the same, unless
I misread this section in the O'Reilly book.
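The table can be reproduced with a few lines of Python. This is only a sketch of the Order semantics as described above, with the default column filled in the way the Apache 2.2 documentation describes it (open for 'deny, allow', denied for the other two). The function name apache_order and its boolean arguments are hypothetical; real Apache matches hosts and addresses rather than taking booleans.

```python
def apache_order(order, in_allow, in_deny):
    """Return True if access is granted under the given Order setting.

    in_allow / in_deny say whether the client matched any Allow / Deny
    directive.
    """
    if order == "deny,allow":
        # Denied only if a Deny matched and no Allow rescued it.
        return in_allow or not in_deny
    if order in ("allow,deny", "mutual-failure"):
        # Must match an Allow and must not match any Deny.
        return in_allow and not in_deny
    raise ValueError("unknown order: " + order)
```

The only input on which 'deny, allow' and 'allow, deny' disagree, besides the default, is a client that is both allowed and denied, and the last two branches being merged shows why mutual-failure looks redundant.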
Now, suppose we wish to allow in everyone except the naughty prime
numbers. We would write:
- deny primes
- allow all
- order deny, allow
So far so good, right? Now let's say that we want to deny the large
primes but allow the number 2 in. Unless our combinators for access-control
primitives were powerful enough to express "primes greater than
two", we might be stuck already. Apache has no way to combine primitives,
so is unable to offer such access control. But given that it's the
web, we can't rail on it too harshly.
What we really want is a list of directives that express the set we
wish very easily. For example, imagine that we didn't have an order
directive, but we could simply specify what deny and allow rules we
have in such a way that the earlier rule takes precedence (the so-called "short
circuit" evaluation):
- allow 2
- deny primes
- allow all
However, we're unable to do that in Apache. Put simply, one can't
easily treat subsets of sets created by access control matching in
a different manner than the set they reside in. We couldn't allow
in "2" while denying primes, unless the access control matching
functions were more sophisticated.
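The short-circuit rule list wished for above is simple to sketch in Python. This is not something Apache offers; first_match, the rule tuples, and the default of denying when nothing matches are all assumptions made for the illustration.

```python
def is_prime(n):
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


def first_match(rules, n, default=False):
    """Walk the rules in order; the first test that matches decides."""
    for action, test in rules:
        if test(n):
            return action == "allow"
    return default        # nothing matched: fall back to the default (deny)


rules = [
    ("allow", lambda n: n == 2),    # allow 2
    ("deny", is_prime),             # deny primes
    ("allow", lambda n: True),      # allow all
]
```

Here first_match(rules, 2) is true because the first rule fires before the prime test is ever consulted, while 3 and 7 are denied and 9 falls through to the final allow. Ordering alone lets a subset be treated differently from the set that contains it.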
Squid has one of the more powerful access control systems in use.
Primitives
- HTTP response header matches
- HTTP username (a la HTTP basic authentication)
- external
- IP address and netmask (source or destination)
<