I Will File Bugs For You

This post was prompted by Aaron Klotz’s Diffusion of Responsibility and Sumana Harihareswara’s Inessential Weirdnesses in Open Source.

One of the most common ways to start interacting with a free software project, as opposed to just using the software produced by that project, is when you trip over a bug or a missing feature and now you need to go tell the developers about it. Unfortunately, that process is often incredibly off-putting. If there’s a bug tracking system, it is probably optimized for people who spend all day every day working with it, and may appear to demand all kinds of information you have no idea how to supply. If there isn’t, you’re probably looking at signing up for some sort of mailing list (mailing list! how retro!). Either way, it may not be easy to find, and there’s a nonzero chance that some neckbeard with a bad attitude is going to yell at you. It shouldn’t be so, but it is.

So, I make this offer to you, the general public, as I have been doing for close friends for many years: if you don’t want to deal with that shit, I will file bugs for you. I’ve been on the Internet since not quite the elder days, and I’ve been hacking free software almost as long; I know how to find these people and I know how to talk to them. We’ll have a conversation and we’ll figure out exactly what’s wrong and then I’ll take it from there. I’m best at compilers and Web browsers, but I’ll give anything a shot.

THE FINE PRINT: If you want to take me up on this, please do so only via email; my address is on the Contact page. Please allow up to one week for an initial response, as this service is provided in my copious free time.

Offer valid only for free software (also known as open source) (as opposed to software that you are not allowed to modify or redistribute, e.g. Microsoft Word). Offer also only valid for problems which I can personally reproduce; it’s not going to go well for anyone involved if I have to play telephone with you and the developers. Offer specifically not valid for operating system kernels or device drivers of any kind, both because those people are even less pleasant to work with than the usual run of neckbeards, and because that class of bugs tends to be hardware-dependent and therefore difficult for me to personally reproduce on account of I don’t have the exact same computer as you.

The management cannot guarantee this service will cause bugs to actually get fixed in any kind of timely fashion, or, in fact, ever.

2014 Hugo Awards ballot

I’m not attending the Worldcon, but I most certainly am voting in the Hugos this year, and moreover I am publishing my ballot with one-paragraph reviews of everything I voted on. If you care about this sort of thing you probably already know why. If you don’t, the short version is: Some of the works nominated this year allegedly only made the shortlist because of bloc voting by Larry Correia’s fans, he having published a slate of recommendations.

There’s nothing intrinsically wrong with publishing a slate of recommendations—don’t we all tell our friends to read the stuff we love? In this case, though, the slate came with a bunch of political bloviation attached, and one of the recommended works was written by Vox Day, who is such a horrible person that even your common or garden variety Internet asshole backs slowly away from him, but nonetheless he has a posse of devoted fanboys and sock puppets. A frank exchange of views ensued; be glad you missed it, and I hope the reviews are useful to you anyway. If you want more detail, Far Beyond Reality has a link roundup.

I value characterization, sociological plausibility, and big ideas, in that order. I often appreciate ambitious and/or experimental stylistic choices. I don’t mind an absence of plot or conflict; if everyone involved is having a good time exploring the vast enigmatic construction, nothing bad happens, and it’s all about the mystery, that’s just fine by me. However, if I find that I don’t care what happens to these people, no amount of plot or concept will compensate. In the context of the Hugos, I am also giving a lot of weight to novelty. There is a lot of stuff in this year’s ballot that has been done already, and the prior art was much better. For similar reasons, volume N of a series has to be really good to make up for being volume N.

Continued…

PETS rump session talk

I spoke briefly at PETS 2014 about which websites are censored in which countries, and what we can learn just from the lists.

another small dispatch from the coalface

For all countries for which Herdict contains enough reports to be credible (concretely, such that the error bars below cover less than 10% of the range), the estimated probability that a webpage will be inaccessible. Vertically sorted by the left edge of the error bar. Further right is worse. I suspect major systemic errors in this data set, but it’s the only data set in town.

a small dispatch from the coalface

category                          count        %
total                         5 838 383  100.000
ok                            2 212 565   37.897
ok (redirected)               1 999 341   34.245
network or protocol error       798 231   13.672
timeout                         412 759    7.070
hostname not found              166 623    2.854
page not found (404/410)        110 241    1.888
forbidden (403)                  75 054    1.286
service unavailable (503)        18 648    0.319
server error (500)               15 150    0.259
bad request (400)                14 397    0.247
authentication required (401)     9 199    0.158
redirection loop                  2 972    0.051
proxy error (502/504/52x)         1 845    0.032
other HTTP response               1 010    0.017
crawler failure                     329    0.006
syntactically invalid URL            19    0.000

Sorry about the non-tabular figures.

Redesigning Income Tax

Here is an opinionated proposal, having no chance whatsoever of adoption, for how taxes ought to be levied on income. This post was originally scheduled for Income Tax Day here in the good old U.S.A., but I was having trouble with the algebra and then I was on vacation. So instead you get it for Canadian tax day. Everything is calculated in US dollars and compared to existing US systems, but I don’t see any great difficulty translating it to other countries.

Premises for this redesign:

  • The existing tax code is way too damn complicated.
  • It is the rules that need to be simplified, not the mathematics. In particular, the marginal tax rate does not need to be a step function just to simplify the arithmetic involved. You look that part up in a table anyway. (A purely illustrative sketch of this point follows the list.)
  • All individuals’ take-home pay should increase monotonically with their gross income, regardless of any other factors.
  • High levels of income inequality constitute a negative externality; income tax should therefore be Pigovian.
  • Tax rates should not vary based on any categorization of income (e.g. interest and capital gains should be taxed at the same rate as wages). This principle by itself removes a great deal of the complexity, and a great deal of the perverse incentives in the current system as well.
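
To illustrate the no-step-function premise only (this is a made-up curve for the sake of the example, not the proposal described in the full post), here is a short Python sketch in which the marginal rate is a smooth function of income and the total tax owed is simply the integral of that rate from zero up to gross income. The 40% top rate, the $60,000 midpoint, and the integration step are arbitrary placeholders.

    # Illustrative only: a smooth marginal rate curve instead of a step function.
    # The numbers below are placeholders, not the rates proposed in the full post.
    import math

    TOP_RATE = 0.40         # hypothetical top marginal rate
    MIDPOINT = 60_000.0     # income (USD) at which the rate is half the top rate
    STEEPNESS = 30_000.0    # how quickly the rate ramps up around the midpoint

    def marginal_rate(income: float) -> float:
        """Smooth (logistic) marginal rate at a given income level."""
        return TOP_RATE / (1.0 + math.exp(-(income - MIDPOINT) / STEEPNESS))

    def total_tax(gross: float, step: float = 100.0) -> float:
        """Total tax = numerical integral of the marginal rate from 0 to gross."""
        tax, x = 0.0, 0.0
        while x < gross:
            width = min(step, gross - x)
            tax += marginal_rate(x + width / 2) * width
            x += width
        return tax

    for gross in (20_000, 50_000, 100_000, 500_000):
        tax = total_tax(gross)
        print(f"{gross:>7}  tax {tax:>9.0f}  take-home {gross - tax:>9.0f}")

Because the marginal rate never reaches 100%, take-home pay rises monotonically with gross income, which is the third premise above; and because the rate is computed or looked up by a program or a table anyway, nothing about the smooth curve makes the taxpayer’s arithmetic harder.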

Read on for concrete details of the proposal.

Continued…

Secure channels are like immunization

For a while now, when people ask me how they can improve their websites’ security, I tell them: Start by turning on HTTPS for everything. Run a separate server on port 80 that issues nothing but permanent redirects to the https:// version of the same URL. There’s lots more you can do, but that’s the easy first step. There are a number of common objections to this plan; today I want to talk about the “it should be the user’s choice” objection, expressed for instance in “Google to Gmail customers: You WILL use HTTPS” by Robert L. Mitchell. It goes something like this:

Why should I (the operator of the website) assume I know better than each of my users what their security posture should be? Maybe this is a throwaway account, of no great importance to them. Maybe they are on a slow link that is intrinsically hard to eavesdrop upon, so the extra network round-trips involved in setting up a secure channel make the site annoyingly slow for no benefit.

This objection ignores the public health benefits of secure channels. I’d like to make an analogy to immunization, here. If you get vaccinated against the measles (for instance), that’s good for you because you are much less likely to get the disease yourself. But it is also good for everyone who lives near you, because now you can’t infect them either. If enough people in a region are immune, then nobody will get the disease, even if they aren’t immune; this is called herd immunity. Secure channels have similar benefits to the general public—unconditionally securing a website improves security for everyone on the ’net, whether or not they use that website! Here’s why.

Most of the criminals who crack websites don’t care which accounts they gain access to. This surprises people; if you ask users, they often say things like “well, nobody would bother breaking into my email / bank account / personal computer, because I’m not a celebrity and I don’t have any money!” But the attackers don’t care about that. They break into email accounts so they can send spam; any @gmail.com address is as good as any other. They break into bank accounts so they can commit credit card fraud; any given person’s card is probably only good for US$1000 or so, but multiply that by thousands of cards and you’re talking about real money. They break into PCs so they can run botnets; they don’t care about data stored on the computer, they want the CPU and the network connection. For more on this point, see the paper Folk Models of Home Computer Security by Rick Wash. This is the most important reason why security needs to be unconditional. Accounts may be throwaway to their users, but they are all the same to the attackers.

Often, criminals who crack websites don’t care which websites they gain access to, either. The logic is similar: the legitimate contents of the website are irrelevant. All the attacker wants is to reuse a legitimate site as part of a spamming scheme or to copy the user list, guess the weaker passwords, and try those username+password combinations on more important websites. This is why everyone who has a website, even if it’s tiny and attracts hardly any traffic, needs to worry about its security. This is also why making websites secure improves security for everyone, even if they never intentionally visit that website.

Now, how does HTTPS help with all this? Several of the easiest ways to break into websites involve snooping on unsecured network traffic to steal user credentials. This is possible even with the common-but-insufficient tactic of sending only the login form over HTTPS, because every insecure HTTP request after login includes a piece of data called a session cookie that can be stolen and used to impersonate the user for most purposes without having to know the user’s password. (It’s often not possible to change the user’s password without also knowing the old password, but that’s about it. If an attacker just wants to send spam, and doesn’t care about maintaining control of the account, a session cookie is good enough.) It’s also possible even if all logged-in users are served only HTTPS, but you get an unsecured page until you log in, because then an attacker can modify the unsecured page and make it steal credentials. Only applying channel security to the entire site for everyone, whoever they are, logged in or not, makes this class of attacks go away.
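
To make concrete how little an attacker needs once a session cookie has crossed the wire in cleartext, here is a toy Python illustration; the hostname, path, and cookie value are all invented for the example.

    # Illustration only: replaying a sniffed session cookie impersonates the
    # user with no knowledge of their password.  Hostname, path, and cookie
    # value are made up.
    import http.client

    sniffed_cookie = "session=deadbeef-cafe-0123"   # captured from cleartext HTTP

    conn = http.client.HTTPConnection("mail.example.com")
    conn.request("GET", "/inbox", headers={"Cookie": sniffed_cookie})
    print(conn.getresponse().status)   # the server treats this as the victim's session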

Unconditional use of HTTPS also enables further security improvements. For instance, a site that is exclusively HTTPS can use the Strict-Transport-Security mechanism to put browsers on notice that they should never communicate with it over an insecure channel: this is important because there are turnkey SSL stripping tools that lurk in between a legitimate site and a targeted user and make it look like the site wasn’t HTTPS in the first place. There are subtle differences in the browser’s presentation that a clever human might notice—or you could direct the computer to pay attention, and then it will notice. But this only works, again, if the site is always HTTPS for everyone. Similarly, an always-secured site can mark all of its cookies secure and httponly, which cuts off more ways for attackers to steal user credentials. And if a site runs complicated code on the server, exposing that code to the public Internet in two different ways (HTTP and HTTPS) enlarges the server’s attack surface. If the only thing on port 80 is a boilerplate “try again with HTTPS” permanent redirect, this is not an issue. (Bonus points for invalidating session cookies and passwords that just went over the wire in cleartext.)
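
For concreteness, here is a minimal sketch of what the port-80 side of this arrangement does; real deployments usually express it as a line or two of web-server configuration rather than a standalone program, and the header values in the comments are illustrative rather than prescriptive.

    # Minimal sketch: everything on port 80 gets a permanent redirect to the
    # https:// version of the same URL.  The real site lives only behind TLS.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class RedirectToHTTPS(BaseHTTPRequestHandler):
        def do_GET(self):
            host = self.headers.get("Host", "example.com").split(":")[0]
            self.send_response(301)
            self.send_header("Location", "https://" + host + self.path)
            self.end_headers()

        do_HEAD = do_POST = do_GET   # redirect every method the same way

    # Headers the HTTPS side would add (values illustrative):
    #   Strict-Transport-Security: max-age=31536000; includeSubDomains
    #   Set-Cookie: session=...; Secure; HttpOnly

    if __name__ == "__main__":
        HTTPServer(("", 80), RedirectToHTTPS).serve_forever()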

Finally, I’ll mention that if a site’s users can turn security off, then there’s a per-user toggle switch in the site’s memory banks somewhere, and the site operators can flip that switch off if they want. Or if they have been, shall we say, leaned on. It’s a lot easier for the site operators to stand up to being leaned on if they can say “that’s not a thing our code can do.”

Should new web features be HTTPS only?

I doubt anyone who reads this will disagree with the proposition that the Web needs to move toward all traffic being encrypted always. Yet there is constant back pressure in the standards groups, people trying to propose network-level innovations that provide only some of the three fundamental guarantees of a secure channel—maybe you can have integrity but not confidentiality or authenticity, for instance. I can personally see a case for an authentic channel that provides integrity and authenticity but not confidentiality, but I don’t think it’s useful enough to back off on the principle that everything should be encrypted always.

So here’s a way browser vendors could signal that we will not stand for erosion of secure channels: starting with a particular, documented, and well-announced version, all new content features are only usable for fully HTTPS pages. Everything that worked prior to that point continues to work, of course. I am informed that there is at least some support for this within the Chrome team. It might be hard to sell Microsoft on it. What does the fox think?

HTTP application layer integrity/authenticity guarantees

Note: These are half-baked ideas I’ve been turning over in my head, and should not be taken all that seriously.

Best available practice for mutually authenticated Web services (that is, both the client and the server know who the other party is) goes like this: TLS provides channel confidentiality and integrity to both parties; an X.509 certificate (countersigned by some sort of CA) offers evidence that the server is who the client expects it to be; all resources are served from https:// URLs, thus the channel’s integrity guarantee can be taken to apply to the content; the client identifies itself to the server with either a username and password, or a third-party identity voucher (OAuth, OpenID, etc), which is exchanged for a session cookie. Nobody can impersonate the server without either subverting a CA or stealing the server’s private key, but all of the client’s proffered credentials are bearer tokens: anyone who can read them can impersonate the client to the server, probably for an extended period. TLS’s channel confidentiality ensures that no one in the middle can read the tokens, but there are an awful lot of ways they can leak at the endpoints. Security-conscious sites nowadays have been adding one-time passwords and/or computer-identifying secondary cookies, but the combination of session cookie and secondary cookie is still a bearer token (possibly the attacker also has to spoof the client’s IP address).

Here are some design requirements for a better scheme:

  • Identify clients to servers using something that is not a bearer token: that is, even if client and server are communicating on an open (not confidential) channel, no eavesdropper gains sufficient information to impersonate client to server.
  • Provide application-layer message authentication in both directions: that is, both receivers can verify that each HTTP query and response is what the sender sent, without relying on TLS’s channel integrity assurance.
  • The application layer MACs should be cryptographically bound to the TLS server certificate (server→client) and the long-term client identity (when available) (client→server).
  • Neither party should be able to forge MACs in the name of their peer (i.e. server does not gain ability to impersonate client to a third party, and vice versa).
  • The client should not implicitly identify itself to the server when the user thinks they’re logged out.
  • Must afford at least as much design flexibility to site authors as the status quo.
  • Must gracefully degrade to the status quo when only one party supports the new system.
  • Must minimize number of additional expensive cryptographic operations on the server.
  • Must minimize server-held state.
  • Must not make server administrators deal with X.509 more than they already do.
  • Compromise of any key material that has to be held in online storage must not be a catastrophe.
  • If we can build a foundation for getting away from the CA quagmire in here somewhere, that would be nice.
  • If we can free sites from having to maintain databases of hashed passwords, that would be really nice.

The cryptographic primitives we need for this look something like:

  • A dirt-cheap asymmetric (verifier cannot forge signatures) message authentication code.
  • A mechanism for mutual agreement to session keys for the above MAC.
  • A reasonably efficient zero-knowledge proof of identity which can be bootstrapped from existing credentials (e.g. username+password pairs).
  • A way to bind one party’s contribution to the session keys to other credentials, such as the TLS shared secret, long-term client identity, and server certificate.

And here are some preliminary notes on how the protocol might work:

  • New HTTP query and response headers, sent only over TLS, declare client and server willingness to participate in the new scheme, and carry the first steps of the session key agreement protocol.
  • More new HTTP query and response headers sign each query and response once keys are negotiated.
  • The server always binds its half of the key agreement to its TLS identity (possibly via some intermediate key).
  • Upon explicit login action, the session key is renegotiated with the client identity tied in as well, and the server is provided with a zero-knowledge proof of the client’s long-term identity. This probably works via some combination of HTTP headers and new HTML form elements (<input type="password" method="zkp"> perhaps?).
  • Login provides the client with a ticket which can be used for an extended period as backup for new session key negotiations (thus providing a mechanism for automatic login for new sessions). The ticket must be useless without actual knowledge of the client’s long-term identity. The server-side state associated with this ticket must not be confidential (i.e. learning it is useless to an attacker) and ideally should be no more than a list of serial numbers for currently-valid tickets for that user.
  • Logout destroys the ticket by removing its serial number from the list.
  • If the client side of the zero-knowledge proof can be carried out in JavaScript as a fallback, the server need not store passwords at all, only ZKP verifier information; in that circumstance it would issue bearer session cookies instead of a ticket + renegotiated session authentication keys. (This is strictly an improvement over the status quo, so the usual objections to crypto in JS do not apply.) Servers that want to maintain compatibility with old clients that don’t support JavaScript can go on storing hashed passwords server-side.

I know all of this is possible except maybe the dirt-cheap asymmetric MAC, but I don’t know what cryptographers would pick for the primitives. I’m also not sure what to do to make it interoperable with OpenID etc.
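
For whatever it’s worth, here is a toy sketch of just the per-message signing step, using Ed25519 signatures (via the third-party Python cryptography package) as a stand-in for the hypothetical dirt-cheap asymmetric MAC. Ed25519 is cheap but perhaps not dirt-cheap at scale, the canonicalization and header layout are invented for illustration, and none of the key-agreement or identity-binding machinery sketched above is shown.

    # Toy sketch of per-message signing only.  Ed25519 stands in for the
    # "dirt-cheap asymmetric MAC"; the canonical form is invented and is not
    # part of any existing standard.
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    def canonical_form(method: bytes, path: bytes, headers: dict, body: bytes) -> bytes:
        """Deterministic serialization of the request parts being signed."""
        lines = [method + b" " + path]
        lines += [k.lower().encode() + b": " + v.encode()
                  for k, v in sorted(headers.items())]
        return b"\n".join(lines) + b"\n\n" + body

    # Client side: a per-session signing key, which the real scheme would bind
    # to the TLS channel and (after login) the long-term client identity.
    session_key = Ed25519PrivateKey.generate()
    msg = canonical_form(b"GET", b"/inbox", {"Host": "example.com"}, b"")
    signature = session_key.sign(msg)   # sent in a hypothetical request header

    # Server side: verifies against the client's per-session public key.
    try:
        session_key.public_key().verify(signature, msg)
    except InvalidSignature:
        raise SystemExit("query was forged or tampered with in transit")

Because verification needs only the public half of the key, the server cannot forge messages in the client’s name, which is one of the design requirements above; the open question remains whether something meaningfully cheaper than a full signature scheme can provide the same property.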

Crashing should be fixed now

This site should no longer be causing certain versions of Firefox (particularly on Mac) to crash. If it still crashes for you, please flush your browser cache and try again. If it still crashes, please let me know about it.

As an unfortunate side effect of the changes required, there is no longer an owl at the bottom of each page. I’d appreciate advice on how to put it back. The trouble is persuading it to be at the bottom of the rightmost sidebar, but only if there is enough space below the actual content—formerly this was dealt with by replicating the background color on the <body> into the content elements for the sidebar, but now it’s all background images and there are visible seams if I do it that way. Note that body::after is already in use for something else, html::after can’t AFAIK be given the desired horizontal alignment, and (again AFAIK) media queries cannot measure the height of the page, only the window; so that excludes any number of more obvious techniques.

(If you mention Flexbox I will make the sad face at you.)