Notes and essays about the topics I am currently doing academic research on. For the past decade this has been computer and network security, often having something to do with the (ab)use of the Internet for censorship and surveillance.

On Replacements for Passwords

Your post advocates a

□ software □ hardware □ cognitive □ two-factor □ other ___________

universal replacement for passwords. Your idea will not work. Here is why it won’t work:

□ It’s too easy to trick users into revealing their credentials
□ It’s too hard to change a credential if it’s stolen
□ It initiates an arms race which will inevitably be won by the attackers
□ Users will not put up with it
□ Server administrators will not put up with it
□ Web browser developers will not put up with it
□ National governments will not put up with it
□ Apple would have to sacrifice their extremely profitable hardware monopoly
□ It cannot coexist with passwords even during a transition period
□ It requires immediate total cooperation from everybody at once

Specifically, your plan fails to account for these human factors:

□ More than one person might use the same computer
□ One person might use more than one computer
□ One person might use more than one type of Web browser
□ People use software that isn’t a Web browser at all
□ Users rapidly learn to ignore security alerts of this type
□ This secret is even easier to guess by brute force than the typical password
□ This secret is even less memorable than the typical password
□ It’s too hard to type something that complicated on a phone keyboard
□ Not everyone can see the difference between red and green
□ Not everyone can make fine motor movements with that level of precision
□ Not everyone has thumbs

and technical obstacles:

□ Clock skew
□ Unreliable servers
□ Network latency
□ Wireless eavesdropping and jamming
□ Zooko’s Triangle
□ Computers do not necessarily have any USB ports
□ SMTP messages are often recoded or discarded in transit
□ SMS messages are trivially forgeable by anyone with a PBX
□ This protocol was shown to be insecure by ________________, ____ years ago
□ This protocol must be implemented perfectly or it is insecure

and the following philosophical objections may also apply:

□ It relies on a psychologically unnatural notion of trustworthiness
□ People want to present different facets of their identity in different contexts
□ Not everyone trusts your government
□ Not everyone trusts their own government
□ Who’s going to run this brand new global, always-online directory authority?
□ I should be able to authenticate a local communication without Internet access
□ I should be able to communicate without having met someone in person first
□ Anonymity is vital to robust public debate

To sum up,

□ It’s a decent idea, but I don’t think it will work. Keep trying!
□ This is a terrible idea and you should feel terrible.
□ You are the Russian Mafia and I claim my five pounds.

hat tip to the original

Dear Everyone Running for ACM or IEEE Management

It’s professional-organization management election time again. This is my response to everyone who’s about to send me an invitation to vote for them:

When it comes to ACM and IEEE elections, I am a single-issue voter, and the issue is open access to research. I will vote for you if and only if you make a public statement committing to aggressive pursuit of the following goals within your organization, in decreasing order of priority:

  1. As immediately as practical, begin providing to the general public zero-cost, no-registration, no-strings-attached online access to new publications in your organization’s venues.

  2. Commit to a timetable (which should also be as quickly as practical, but could be somewhat slower than for the above) for opening up your organization’s older publications to zero-cost, no-registration, no-strings-attached online access.

  3. Abandon the practice of requiring authors to assign copyright to your organization; instead, require only a license substantively similar to that requested by USENIX (exclusive publication rights for no longer than 12 months with exception for posting an electronic copy on your own website, nonexclusive right to continue disseminating afterward).

  4. On a definite timetable, revert copyright to all authors who published under the old copyright policy, retaining only the rights requested under the new policy.

Thank you for your consideration.

CCS 2012 Conference Report

The ACM held its annual Conference on Computer and Communications Security two weeks ago today in Raleigh, North Carolina. CCS is larger than Oakland and has two presentation tracks; I attended less than half of the talks, and my brain was still completely full afterward. Instead of doing one exhaustive post per day like I did with Oakland, I’m just going to highlight a handful of interesting papers over the course of the entire conference, plus the pre-conference Workshop on Privacy in the Electronic Society.


CCS’12: StegoTorus

I just presented the major focus of my time and effort for the past year-and-a-bit, StegoTorus, at this year’s ACM Conference on Computer and Communications Security. You can see my slides and the code (also at Github). I was going to explain in more detail but all of the brain went into actually giving the talk. My apologies.

This is an ongoing project and we are looking for help; please do get in touch if you’re interested.

teaser: some very alpha software

Readers of this blog may find and of interest.

The Conference Formerly Known as Oakland, day 3

This day had a lot of interesting papers, but some of the presentations were disappointing: they spent their time on uninteresting aspects of their work, or handwaved over critical details.

That said, most of the work on passwords was compelling, and if you read to the end there’s a cranky rant about the panel discussion.


The Conference Formerly Known as Oakland, day 2

I skipped the 8:30AM session today; it was mostly not interesting to me and I badly needed the extra hour of sleep. I’m sorry to miss On the Feasibility of Internet-Scale Author Identification, but I will read the paper. I also skipped the business meeting, so summaries start with the 10:30 session and end with the short talks.


The Conference Formerly Known as Oakland, day 1

I’m attending the IEEE Symposium on Security and Privacy, 2012, and I’m going to try taking notes and posting them here again. The last time I tried this (at CCS 2010), most of the notes never got posted, but I paid a whole lot more attention to the talks than I do when I’m not taking notes. This time, I’m going to try to clean up the notes and post them the next morning at the latest.

S&P was at the Claremont Hotel in Oakland, California for thirty-odd years, and they didn’t really want to leave, but there wasn’t room for all the people who wanted to attend. Last year they turned nearly 200 people away. This year, it’s in San Francisco at a hotel on Union Square—amusingly, the exact same hotel that USENIX Security was at, last August—with much higher capacity, and while I still have to get up at dawn to get there on time, at least I don’t have to drive.

I have not had time to read any of the papers, so this is all based on the talks, only. However, where possible I have linked each section heading to the paper or to a related website.

Mozilla folks: I would like to draw your attention particularly to the talks entitled Dissecting Android Malware, The Psychology of Security for the Home Computer User, and User-Driven Access Control.


The ethics of preventing third-party net filtering

I haven’t posted anything research-related in a while because I’ve been on a project that I’m not supposed to talk about till it’s done, and it’s not done yet. I can say, though, that it’s about ways to get around country-scale filtration of the Internet. I’m writing it up now, starting with the threat model, as you do:

Alice Arishat wishes to publish things for Brutus to read. Cato does not approve of what Arishat has to say, and seeks to prevent her from publishing anything.

Most online discussion of censorship starts from the premise that Cato is automatically in the wrong here. That’s one of the cypherpunk premises that underpin most discussion of theoretical Internet security. I want to play devil’s advocate today, though, and explore circumstances where we might choose to support Cato. In the offline world, we trade off free speech against all sorts of other values every day:


How To Choose Passwords

When I talk to people who aren’t security researchers about history sniffing, they want to know whether they should worry about it, and I say no: the only thing you can do to protect yourself is use the latest version of your favorite browser, which you should do anyway; besides, the interactive attacks will probably never appear in the wild. But if I only ever talk about computer security topics that are only relevant to researchers, I’m not helping people as much as I could, and I’m scaring them about things they can’t control. So this post is about something you should worry about, because it’s under your direct control; lots of people do it poorly and that does make them less safe online; and it’s easy to do well. That thing is choosing passwords.

You have probably heard that you shouldn’t reuse the same password on many different websites, and that your passwords should be long, contain numbers and punctuation, and avoid dictionary words. But you probably haven’t heard anyone explain why, and you probably have noticed that these two pieces of advice are hard to follow at the same time, because long gibberish passwords are hard to remember even if you only have one of them. I’m going to tell you why you should do these things, and how to do them without too much grief.

Don’t use the same password on many different websites

No matter how good your password is, the bad guys might discover what it is. For instance, if you log into an unencrypted website over an unencrypted wireless network, anyone else on the same wireless network can listen in on the radio traffic and discover your password. (It’s just like eavesdropping on a private conversation.) Or you might accidentally type your password into a website that looks like the real thing but is actually a fake created to trick you.

Suppose the bad guys have discovered your password for a Web forum. By itself that’s probably not a big deal: at worst, you might have to apologize to some people for letting some schmuck insult them while pretending to be you. But the bad guys know that people often use the same password on many different websites, so they’re going to try to log into your email with that password, and your bank, and so on. If they succeed (that is, if you did use the same password) they might be able to ruin your life, or at least steal some of your money. But if you always use different passwords on different websites, the bad guys have to discover the password you use for your bank (and nothing else) in order to steal your money.

How do you manage to remember lots of different passwords, especially when (as I’m about to explain) they all need to be long and complicated? The best way is to let the computer—specifically, your browser’s password manager—do it for you. This may seem unsafe, but it’s actually much safer than using the same password for everything. The password manager cannot be fooled by phishing sites, and it has no trouble remembering lots of long complicated passwords. Yes, all the passwords are in a file on your computer. But the only way the bad guys can get at that is by physically stealing your computer, or installing spyware on it remotely. If you keep your computer up to date with security patches, you don’t have to worry about spyware much. If your computer is in danger of being physically stolen (e.g. it’s a laptop) you should use the master password mode of your browser’s password manager, so that the file on your computer is encrypted. Whether or not you have to worry about theft, you should enable Sync, or equivalent feature, even if you have no other computer to sync with; that way, if your computer breaks, there’s still a backup of all your passwords out there in the cloud (safely encrypted).

Use long, complicated passwords

The other way the bad guys discover passwords is by breaking into servers that store entire databases of them. If these databases have been designed correctly, that doesn’t tell them anything by itself, because the passwords are hashed. Hashing deserves a little explanation: suppose my password on some site is 12345 (the kind of thing that an idiot would have on his luggage). The server doesn’t store 12345 in its database, it stores 827ccb0eea8a706c4c34a16891f84e7b, which is the result of running 12345 through a cryptographic hash, in this case MD5. It’s easy to convert a password into its hash, but it’s prohibitively hard to do the reverse. MD5 is old and no longer considered a good choice for passwords (or anything, for that matter), but there is still no known algorithm to take an arbitrary MD5 hash and reveal an input that produces that hash, other than guess-and-check.
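If you want to see the hashing step for yourself, here is a minimal sketch using Python's standard hashlib module. The helper name md5_hex is mine, not anything a real site uses, and MD5 appears only because it is the example above; as noted, it is not a good choice for storing passwords.

```python
import hashlib


def md5_hex(password: str) -> str:
    """Return the MD5 hash of a password as a hexadecimal string.

    Illustrative helper only: MD5 is used because it is the example
    in the text, not because it is suitable for real password storage.
    """
    return hashlib.md5(password.encode("utf-8")).hexdigest()


# Going from password to hash is a single cheap call...
print(md5_hex("12345"))
# ...but there is no corresponding function that goes from hash to password.
```

Running this prints the same 32-character string quoted above, which is what the server would keep in place of 12345.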

So the bad guys can’t just read the passwords from a database once they have it. But they can guess passwords, run the guesses through MD5 (or whatever was used), and compare the results to the database entries. (They can guess passwords even if they haven’t stolen a database, by feeding the guesses to the site’s login form—but that’s much slower and the site admins are likely to notice.) 12345 isn’t a good password because it’s easy to guess—but so is any five-digit number: a cheap laptop can calculate the MD5 of all 100,000 five-digit (or smaller) numbers in less than a second. There are something like 250,000 words in English—that’s maybe five seconds’ worth of work for the same laptop—so any word in the dictionary is bad, too. You can buy a 40-million-entry word list for $30 that has not only all the words in 20 different languages, but mangled versions of them (e.g. f0od)—that might take an hour or two to process.
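Guess-and-check is simple enough to sketch in a few lines of Python. This is only an illustration of the idea, assuming a single unsalted MD5 hash as in the example above; the function names and the "stolen" value are made up for the demonstration, and real cracking tools are far more elaborate.

```python
import hashlib
from typing import Iterable, Iterator, Optional


def md5_hex(guess: str) -> str:
    """Hash one guess the same way the (hypothetical) server did."""
    return hashlib.md5(guess.encode("utf-8")).hexdigest()


def crack(target_hash: str, guesses: Iterable[str]) -> Optional[str]:
    """Return the first guess whose hash matches, or None if none does."""
    for guess in guesses:
        if md5_hex(guess) == target_hash:
            return guess
    return None


def numbers_up_to_five_digits() -> Iterator[str]:
    """Yield the 100,000 numbers from 0 to 99999 as strings."""
    for n in range(100_000):
        yield str(n)


# Pretend this hash came out of a stolen database.
stolen = "827ccb0eea8a706c4c34a16891f84e7b"
print(crack(stolen, numbers_up_to_five_digits()))
```

Swapping numbers_up_to_five_digits() for a loop over a word list gives the dictionary attack described above; nothing else in the code changes.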

The longer and more complicated your password is, the harder it is to guess; but that makes it harder to remember as well. Adding punctuation and numbers doesn’t help as much as one would like. There are 95 characters that you can type on a US keyboard, so there are 95^8, or about 6.6 quadrillion (short scale), possible eight-character passwords, if you use all those characters. That many possibilities is out of the reach of a cheap laptop, but it’s a few weeks’ effort for a small cluster of beefy computers; a determined bad guy could do this for maybe $25,000.
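The arithmetic for eight-character passwords is easy to check. In the sketch below, the 95-character alphabet comes from the paragraph above, but the guessing rate is purely an assumption of mine for illustration; real attackers may be much faster or slower.

```python
# Number of characters you can type on a US keyboard, as stated above.
ALPHABET_SIZE = 95


def keyspace(alphabet_size: int, length: int) -> int:
    """Count every possible password of exactly `length` characters."""
    return alphabet_size ** length


def days_to_exhaust(candidates: int, guesses_per_second: float) -> float:
    """Days needed to try every candidate at a given guessing rate."""
    return candidates / guesses_per_second / 86_400


total = keyspace(ALPHABET_SIZE, 8)
print(f"{total:,} possible eight-character passwords")

# HYPOTHETICAL rate, chosen only to show how the estimate is made.
ASSUMED_GUESSES_PER_SECOND = 1e9
days = days_to_exhaust(total, ASSUMED_GUESSES_PER_SECOND)
print(f"roughly {days:.0f} days at {ASSUMED_GUESSES_PER_SECOND:.0e} guesses per second")
```

The point is not the exact number of days but that the total is finite and within reach of someone willing to spend money on hardware.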

The good news is, you can have passwords that can’t be guessed this way but are still easy to remember. The trick is to use phrases rather than words. One random English word is 250,000 possibilities. Two random English words are 62.5 billion possibilities (250,000 squared). That’s still not enough. But ten random English words are 250,000^10 ≈ 10^54 possibilities, which is big enough that a modern supercomputer tasked with the problem would still be guessing when the Sun burns out five billion years from now.
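The same kind of calculation shows how quickly phrases outgrow single words. This sketch assumes the rough figure of 250,000 English words used above and that each word is chosen independently at random; the function name is just for illustration.

```python
import math

# Rough count of English words, as used in the text.
VOCABULARY = 250_000


def phrase_count(words: int, vocabulary: int = VOCABULARY) -> int:
    """Number of distinct phrases made of `words` independently chosen words."""
    return vocabulary ** words


for n in (1, 2, 10):
    count = phrase_count(n)
    print(
        f"{n:2d} word(s): about 10^{math.log10(count):.1f} possibilities "
        f"(~{math.log2(count):.0f} bits)"
    )
```

Two words come out far smaller than the eight-character total of 95 to the eighth power, which is why two are not enough, while ten words are larger by dozens of orders of magnitude.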

You can’t take just any phrase, though. The bad guys could easily try every phrase in the Oxford Dictionary of Quotations, because there are only 20,000 of them. I haven’t worked out the math, but I think guessing every sentence in the complete works of Shakespeare is doable. But nobody has a database of every sentence in every work of literature that was written with the Latin alphabet. A phrase taken from somewhere in the middle of an obscure but lengthy book is a good choice. Or you could follow this procedure:

  1. Go to Wikipedia and click on random article. (You can use any site with a random article feature for this step, if you’d rather.)
  2. Copy the URL of the page you get, and paste it into the Eater of Meaning. Leave the drop-down on Eat word endings.
  3. Choose ten consecutive words from the result. They don’t have to all come from the same sentence.

Don’t worry about finding a sentence that you can remember yourself, because you’re going to have the password manager do it (unless you’re trying to pick the master password).

Some sites have limits on the length of their passwords. This is bad, and you should complain; but until they fix it, just use the first letter of each word in your ten-word phrase, with some numbers and punctuation if they insist on numbers and punctuation. That kind of password is theoretically crackable, as I said earlier, but it’s likely to be better than lots of other passwords in the database. So if the bad guys get the database, they will crack so many other people’s passwords before they get to yours that they don’t feel they have to bother cracking yours. (It’s kind of like the joke about how fast you need to run away from a lion.)

If there’s no limit on the length of the password, but the site still insists on numbers and/or punctuation, put them in between the words; that’s easier to type.
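Both fallbacks are mechanical enough to sketch in Python. The function names and the sample phrase here are made up for illustration, and the phrase is deliberately too short to be a real password.

```python
from itertools import cycle
from typing import Sequence


def initials(phrase: str) -> str:
    """Shorten a phrase to the first letter of each word."""
    return "".join(word[0] for word in phrase.split())


def with_separators(phrase: str, separators: Sequence[str]) -> str:
    """Join the words of a phrase, putting separators between them.

    The separators (numbers, punctuation) are reused in order if the
    phrase has more gaps than there are separators.
    """
    words = phrase.split()
    if not words:
        return ""
    pieces = [words[0]]
    for word, sep in zip(words[1:], cycle(separators)):
        pieces.append(sep)
        pieces.append(word)
    return "".join(pieces)


# Hypothetical three-word phrase, for demonstration only.
sample = "ten sleepy owls"
print(initials(sample))
print(with_separators(sample, ["4", "!"]))
```

Use initials only when the site forces a short password, since it throws away most of the phrase's length.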