Bootstrapping trust in compilers

The other week, an acquaintance of mine was kvetching on Twitter about how the Rust compiler is written in Rust, and so to get started with the language you have to download a binary, and there’s no way to validate it—you could use the binary plus the matching compiler source to recreate the binary, but that doesn’t prove anything, and also if the compiler were really out to get you, you would be screwed the moment you ran the binary.

This is not a new problem, nor is it a Rust-specific problem. I recall having essentially the same issue back in 2000, give or take, with GNAT, the Ada front-end for GCC. It is written in Ada, and (at the time, anyway) not just any Ada compiler would do: you had to have a roughly contemporaneous version of … GNAT. It was especially infuriating compared to the rest of GCC, which (again, at the time) bent over backward to be buildable with any C compiler you could get your hands on, even a traditional one that didn’t support all of the 1989 language standard. But even that is problematic for someone who would rather not trust any machine code they didn’t verify themselves.

One way around the headache is diverse recompilation (David A. Wheeler calls it “diverse double-compiling”), in which you compile the same compiler with two different compilers, then recompile it with itself-as-produced-by-each, and compare the results. But this requires you to have two different compilers in the first place. As of this writing there is only one Rust compiler. There aren’t that many complete implementations of C++ out there, either, and you need one of those to build LLVM (which Rust depends on). I think you could devise a compiler virus that could propagate itself via both LLVM and GCC, for instance.
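
Spelled out, the procedure is short. Here is a sketch, with cc_a, cc_b, and compiler.src as hypothetical stand-ins for the two pre-existing compilers and the source of the compiler being checked; real builds involve much longer command lines.

```python
# Diverse recompilation, minimal sketch. "cc_a", "cc_b", and
# "compiler.src" are hypothetical stand-ins, not real tools.
import subprocess

def build(compiler, source, output):
    # Compile `source` with `compiler`, producing the binary `output`.
    subprocess.run([compiler, source, "-o", output], check=True)

# Stage 1: build the compiler-under-test with each existing compiler.
build("cc_a", "compiler.src", "stage1_a")
build("cc_b", "compiler.src", "stage1_b")

# Stage 2: rebuild it with each stage-1 result. Both stage-2 binaries
# were produced by compilers built from the same source, so, given a
# deterministic build, they should be identical bit for bit; if they
# are not, something in one of the two chains is lying.
build("./stage1_a", "compiler.src", "stage2_a")
build("./stage1_b", "compiler.src", "stage2_b")

with open("stage2_a", "rb") as a, open("stage2_b", "rb") as b:
    assert a.read() == b.read(), "mismatch: bug, nondeterminism, or virus"
```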

What’s needed, I think, is an independent root of correctness. A software environment built from scratch to be verifiable, maybe even provably correct, and geared specifically to host independent implementations of compilers for popular languages. They need not be terribly good at optimizing, because the only thing you’d ever use them for is to be one side of a diversely-recompiled bootstrap sequence. It has to be a complete and isolated environment, though, because it wouldn’t be impossible to propagate a compiler virus through the operating system kernel, which can see every block of I/O, after all.

And it seems to me that this environment naturally divides into four pieces. First, a tiny virtual machine. I’m thinking a FORTH interpreter, which is small enough that one programmer can code it by hand in assembly language, and having done that, another programmer can audit it by hand. You need multiple implementations of this, so you can check them against each other to guard against malicious lower layers—it could run on the bare metal, maybe, but the bare metal has an awful lot of clever embedded in it these days. But hopefully this is the only thing you need to implement more than once.
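
For a sense of scale, here is a toy model of the FORTH execution engine in Python; the real thing would be a few pages of assembly, and this five-word dialect is invented purely for illustration.

```python
# Toy FORTH-style interpreter: a data stack, a dictionary of words,
# and an outer loop that reads one whitespace-separated token at a time.
def forth(program):
    stack = []
    words = {
        "+":   lambda: stack.append(stack.pop() + stack.pop()),
        "*":   lambda: stack.append(stack.pop() * stack.pop()),
        "dup": lambda: stack.append(stack[-1]),
        ".":   lambda: print(stack.pop()),
    }
    for token in program.split():
        if token in words:
            words[token]()            # execute a defined word
        else:
            stack.append(int(token))  # anything else is a number

forth("3 4 + dup * .")  # prints 49: (3 + 4) squared
```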

Second, you use the FORTH interpreter as the substratum for a more powerful language. If there’s a language in which each program is its own proof of correctness, that would be the obvious choice, but my mental allergy to arrow languages has put me off following that branch of PL research. Lisp is generally a good language to write compilers in, so a small dialect of that would be another obvious choice. (Maybe leave out the call/cc.)

Third, you write compilers in the more powerful language, with both the FORTH interpreter and more conventional execution environments as code-generation targets. These compilers can then be used to compile other stuff to run in the environment, and conversely, you can build arbitrary code within the environment and export it to your more conventional OS.

The fourth and final piece is a way of getting data in and out of the environment. I imagine it as strictly batch-oriented, not interactive at all, simply because that cuts out a huge chunk of complexity from the FORTH interpreter; similarly it does not have any business talking to the network, nor having any notion of time, maybe not even concurrency—most compile jobs are embarrassingly parallel, but again, huge chunk of complexity. What feels not-crazy to me is some sort of trivial file system: ar archive level of trivial, all files write-once, imposed on a linear array of disk blocks.
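
Here is roughly the level of triviality I have in mind, sketched in Python; the 512-byte block size and the in-memory representation are invented, but the essential property is that a file, once written, can never be changed.

```python
# Write-once batch filesystem, minimal sketch: a flat directory
# imposed on a linear array of fixed-size disk blocks.
BLOCK = 512

class BatchVolume:
    def __init__(self):
        self.blocks = []      # the linear array of disk blocks
        self.directory = {}   # file name -> (first block, byte length)

    def write_file(self, name, data):
        if name in self.directory:
            raise ValueError("write-once violation: " + name)
        first = len(self.blocks)
        for i in range(0, len(data), BLOCK):
            self.blocks.append(data[i:i + BLOCK].ljust(BLOCK, b"\0"))
        self.directory[name] = (first, len(data))

    def read_file(self, name):
        first, length = self.directory[name]
        nblocks = (length + BLOCK - 1) // BLOCK
        return b"".join(self.blocks[first:first + nblocks])[:length]
```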

It is probably also necessary to reinvent Make, or at least some sort of batch job control language.
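
A pleasant consequence of write-once files is that the reinvented Make needs no timestamps at all: a job is runnable exactly when all of its inputs exist. A sketch, with the job representation invented for illustration:

```python
def run_batch(jobs, preexisting):
    # jobs: list of (outputs, inputs, thunk) triples, where outputs
    # and inputs are tuples of file names and thunk performs the job.
    # preexisting: names of files already present on the volume.
    available = set(preexisting)
    pending = list(jobs)
    while pending:
        ready = [j for j in pending if set(j[1]) <= available]
        if not ready:
            raise RuntimeError("dependency cycle or missing input")
        for outputs, inputs, thunk in ready:
            thunk()                    # run the job (e.g. a compile)
            available.update(outputs)  # its outputs now exist, forever
        pending = [j for j in pending if j not in ready]
```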

Operating system selection for $PROJECT, mid-2015

Presented without context, for amusement purposes only, a page from my notes:

                               FreeBSD                                  NetBSD              Linux
Per-process default route      Poorly documented, possibly incomplete   Probably not        Poorly documented, buggy
Can compile PhantomJS          Probably                                 Probably            Yes
Jails                          Yes                                      No                  Not really
Xen paravirtual guest          Incomplete                               Yes                 Yes
System call tracing            truss                                    None?               strace
pipe2                          Yes                                      Yes                 Yes
closefrom                      Yes                                      Yes                 No
sysctl                         Yes                                      Yes                 No
getauxval                      No                                       No                  Yes
signalfd                       No                                       No                  Yes
execvpe                        No                                       Yes                 Yes
Reference documentation        Acceptable (YMMV¹)                       Acceptable (YMMV)   Major gaps
Tutorial documentation         Terrible                                 Terrible            Terrible
Package management             Broken as designed                       Broken as designed  Good
System maintenance automation  I can’t find any                         I can’t find any    Acceptable
QA reputation                  Excellent                                Good                Good
Security reputation            Good                                     Good                Debatable
Development community          Unknown to me                            Unknown to me       Full of assholes

¹ It makes sense to me, but then, I taught myself Unix system programming and administration by reading the SunOS 4 manpages.

Google Voice Search and the Appearance of Trustworthiness

Last week there were several bug reports [1] [2] [3] about how Chrome (the web browser), even in its fully-open-source Chromium incarnation, downloads a closed-source, binary extension from Google’s servers and installs it, without telling you it has done this, and moreover this extension appears to listen to your computer’s microphone all the time, again without telling you about it. This got picked up by the trade press [4] [5] [6] and we rapidly had a full-on Internet panic going.

If you dig into the bug reports and/or the open source part of the code involved, which I have done, it turns out that what Chrome is doing is not nearly as bad as it looks. It does download a closed-source binary extension from Google, install it, and hide it from you in the list of installed extensions (technically there are two hidden extensions involved, only one of which is closed-source, but that’s only a detail of how it’s all put together). However, it does not activate this extension unless you turn on the voice search checkbox in the settings panel, and this checkbox has always (as far as I can tell) been off by default. The extension is labeled, accurately, as having the ability to listen to your computer’s microphone all the time, but of course it does not get to do this until it is activated.

As best anyone can tell without access to the source, what the closed-source extension actually does when it’s activated is monitor your microphone for the code phrase OK Google. When it detects this phrase it transmits the next few words spoken to Google’s servers, which convert them to text and conduct a search for the phrase. This is exactly how one would expect a voice search feature to behave. In particular, a voice-activated feature intrinsically has to listen to sound all the time; otherwise, how could it know that you have spoken the magic words? And it makes sense to do the magic word detection with code running on the local computer, strictly as a matter of efficiency. There is even a non-bogus business reason why the detector is closed source: speech recognition is still in the land where tiny improvements lead to measurable competitive advantage.
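
In outline, the behavior being described is just this; mic, detector, and server are hypothetical stand-ins, since the real code is not available to read.

```python
# Sketch of the voice-search control flow described above.
# All names are invented; this is not Chrome's actual code.
def voice_search_loop(mic, detector, server):
    while True:
        frame = mic.next_frame()          # audio is examined locally...
        if detector.matches(frame):       # ...by the on-device hotword check
            query = mic.record_phrase()   # capture the next few words
            text = server.speech_to_text(query)  # the only network traffic
            server.search(text)
```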

So: this feature is not actually a massive privacy violation. However, Google could and should have put more care into making this not appear to be a massive privacy violation. They wouldn’t have had mud thrown at them by the trade press about it, and the general public wouldn’t have had to worry about it. Everyone wins. I will now dissect exactly what was done wrong and how it could have been done better.

It was a diagnostic report, intended for use by developers of the feature, that gave people the impression the extension was listening to the microphone all the time. Below is a screen shot of this diagnostic report. You can see it on your own copy of Chrome by typing chrome://voicesearch into the URL bar; details will probably differ a little (especially if you’re not using a Mac).

Screen shot of the Google Voice Search diagnostic report, taken on Chrome 43 running on Mac OS X. The most important lines of text are ‘Microphone: Yes’, ‘Audio Capture Allowed: Yes’, ‘Hotword Search Enabled: No’, and ‘Extension State: ENABLED’.

Google’s first mistake was not having anyone check this over for what it sounds like it means to someone who isn’t familiar with the code. It is very well known that when faced with a display like this, people who aren’t familiar with the code will pick out whatever bits they think they understand and ignore everything else, even if that means they completely misunderstand it. [7] In this case, people see Microphone: Yes and Audio Capture Allowed: Yes and maybe also Extension State: ENABLED and assume that this means the extension is actively listening right now. (What the developers know it means is this computer has a microphone, the extension could listen to it if it had been activated, and it’s connected itself to the checkbox in the preferences so it can be activated. And it’s hard for them to realize that anyone could think it would mean something else.)

They didn’t have anyone check it because they thought, well, who’s going to look at this who isn’t a developer? Thing is, it only takes one person to look at it, decide it looks hinky, mention it online, and now you have a media circus on your hands. Obscurity is no excuse for not doing a UX review.

Now, mistake number two becomes evident when you consider what this screen ought to say in order not to scare people who haven’t turned the feature on (and who may be hearing of it for the first time): something like

Voice Search is inactive.

(A couple of sentences about what Voice Search is and why you might want it.) To activate Voice Search, go to the preferences screen and check the box.

It would also be okay to have a duplicate checkbox right there on this screen, and to have all the same debugging information show up after you check the box. But wait—how do developers diagnose problems with downloading the extension, which happens before the box has been checked? And that’s mistake number two. The extension should not be downloaded until the box is checked. I am not aware of any technical reason why that couldn’t have been the way it worked in the first place, and it would go a long way to reassure people that this closed-source extension can’t listen to them unless they want it to. Note that even if the extension were open source it might still be a live question whether it does anything hinky. There’s an excellent chance that it’s a generic machine recognition algorithm that’s been trained to detect OK Google, which training appears in the code as a big lump of meaningless numbers—and there’s no way to know whether those numbers train it to detect anything besides OK Google. Maybe if you start talking about bombs the computer just quietly starts recording…

Mistake number three, finally, is something they got half-right. This is not a core browser feature. Indeed, it’s hard for me to imagine any situation where I would want this feature on a desktop computer. Hands-free operation of a mobile device, sure, but if my hands are already on a keyboard, that’s faster and less bothersome for other people in the room. So, Google implemented this frill as a browser extension—but then they didn’t expose that in the user interface. It should be an extension, and it should be visible as such. Then it needn’t take up space in the core preferences screen, even. If people want it they can get it from the Chrome extension repository like any other extension. And that would give Google valuable data on how many people actually use this feature and whether it’s worth continuing to develop.

Announcing readings.owlfolio.org

I’d like to announce my new project, readings.owlfolio.org, where I will be reading and reviewing papers from the academic literature, mostly (but not exclusively) about information security. I made a false start at this near the end of 2013 (it is the same site that’s been linked under readings in the top bar since then), but now I have a posting queue and a rhythm going. Expect three to five reviews a week. It’s not going to be syndicated to Planet Mozilla, but I may mention it here when I post something I think is of particular interest to that audience.

Longtime readers of this blog will notice that it has been redesigned to match readings. That process is not 100% complete, but it’s close enough that I feel comfortable inviting people to kick the tires. Feedback is welcome, particularly regarding readability and organization; unfortunately, you’re going to have to email it to me, because the new CMS has no comment system. (The old comments have been preserved.) I’d also welcome recommendations of comment systems which are self-hosted, open-source, database-free, and don’t involve me manually copying comments out of my email. There will probably be a technical postmortem on the new CMS eventually.

(I know about the pages that are still using the old style sheet.)

Programming languages and transportation

Last week there was a gag listicle making the rounds entitled If programming languages were vehicles and I lol’d along with the rest of them, but then I kept thinking about it and a bunch of the entries just seemed to miss the mark. And I’m in a silly sort of mood, so here is MY version: slightly different language selection, just as opinionated, 100% more Truth™.

Continued…

The literary merit of right-wing SF

The results are in for the 2014 Hugo Awards. I’m pleased with the results in the fiction categories—a little sad that The Waiting Stars didn’t win its category, but it is the sort of thing that would not be to everyone’s taste.

Now that it’s all over, people are chewing over the politics of this year’s shortlist, particularly the infamous sad puppy slate, over on John Scalzi’s blog. This was going to be a comment in that thread, but I don’t seem to be able to post comments there, so y’all get the expanded version here instead. I’m responding particularly to this sentiment, which I believe accurately characterizes the motivation behind Larry Correia’s original posting of his slate, and the motivations of those who might have voted for it:

I too am someone who likes, and dislikes, works from both groups of authors. However, only one group ever gets awards. The issue is not that you cannot like both groups, but that good works from the PC crowd get rewarded and while those from authors that have been labeled unacceptable are shunned, and that this happens so regularly, and with such predictability that it is obviously not just quality being rewarded.

BrowncoatJeff

I cannot speak to the track record, not having followed genre awards closely in the past. But as to this year’s Hugo shortlist, it is my considered opinion that all the works I voted below No Award (except The Wheel of Time, whose position on my ballot expresses an objection to the eligibility rules) suffer from concrete, objective flaws on the level of basic storytelling craft, severe enough that they did not deserve a nomination. This happens to include Correia’s own novels, and all the other works of fiction from his slate that made the shortlist. Below the fold, I shall elaborate.

Continued…

I Will File Bugs For You

This post prompted by Aaron Klotz’s Diffusion of Responsibility and Sumana Harihareswara’s Inessential Weirdnesses in Open Source.

One of the most common ways to start interacting with a free software project, as opposed to just using the software produced by that project, is when you trip over a bug or a missing feature and now you need to go tell the developers about it. Unfortunately, that process is often incredibly off-putting. If there’s a bug tracking system, it is probably optimized for people who spend all day every day working with it, and may appear to demand all kinds of information you have no idea how to supply. If there isn’t one, you’re probably looking at signing up for some sort of mailing list (mailing list! how retro!). Either way, it may not be easy to find, and there’s a nonzero chance that some neckbeard with a bad attitude is going to yell at you. It shouldn’t be so, but it is.

So, I make this offer to you, the general public, as I have been doing for close friends for many years: if you don’t want to deal with that shit, I will file bugs for you. I’ve been on the Internet since not quite the elder days, and I’ve been hacking free software almost as long; I know how to find these people and I know how to talk to them. We’ll have a conversation and we’ll figure out exactly what’s wrong and then I’ll take it from there. I’m best at compilers and Web browsers, but I’ll give anything a shot.

THE FINE PRINT: If you want to take me up on this, please do so only via email; my address is on the Contact page. Please allow up to one week for an initial response, as this service is provided in my copious free time.

Offer valid only for free software (also known as open source) (as opposed to software that you are not allowed to modify or redistribute, e.g. Microsoft Word). Offer also only valid for problems which I can personally reproduce; it’s not going to go well for anyone involved if I have to play telephone with you and the developers. Offer specifically not valid for operating system kernels or device drivers of any kind, both because those people are even less pleasant to work with than the usual run of neckbeards, and because that class of bugs tends to be hardware-dependent and therefore difficult for me to personally reproduce on account of I don’t have the exact same computer as you.

The management cannot guarantee this service will cause bugs to actually get fixed in any kind of timely fashion, or, in fact, ever.

2014 Hugo Awards ballot

I’m not attending the Worldcon, but I most certainly am voting the Hugos this year, and moreover I am publishing my ballot with one-paragraph reviews of everything I voted on. If you care about this sort of thing you probably already know why. If you don’t, the short version is: Some of the works nominated this year allegedly only made the shortlist because of bloc voting by Larry Correia’s fans, he having published a slate of recommendations.

There’s nothing intrinsically wrong with publishing a slate of recommendations—don’t we all tell our friends to read the stuff we love? In this case, though, the slate came with a bunch of political bloviation attached, and one of the recommended works was written by Vox Day, who is such a horrible person that even your common or garden variety Internet asshole backs slowly away from him, but nonetheless he has a posse of devoted fanboys and sock puppets. A frank exchange of views ensued; be glad you missed it, and I hope the reviews are useful to you anyway. If you want more detail, Far Beyond Reality has a link roundup.

I value characterization, sociological plausibility, and big ideas, in that order. I often appreciate ambitious and/or experimental stylistic choices. I don’t mind an absence of plot or conflict; if everyone involved is having a good time exploring the vast enigmatic construction, nothing bad happens, and it’s all about the mystery, that’s just fine by me. However, if I find that I don’t care what happens to these people, no amount of plot or concept will compensate. In the context of the Hugos, I am also giving a lot of weight to novelty. There is a lot of stuff in this year’s ballot that has been done already, and the prior art was much better. For similar reasons, volume N of a series has to be really good to make up for being volume N.

Continued…

PETS rump session talk

I spoke briefly at PETS 2014 about which websites are censored in which countries, and what we can learn just from the lists.

another small dispatch from the coalface

For all countries for which Herdict contains enough reports to be credible (concretely, such that the error bars below cover less than 10% of the range), the estimated probability that a webpage will be inaccessible. Vertically sorted by the left edge of the error bar. Further right is worse. I suspect major systemic errors in this data set, but it’s the only data set in town.
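
For concreteness, the credibility filter could be computed like this, assuming the error bars are ordinary binomial confidence intervals on inaccessible reports over total reports per country; I do not know Herdict’s exact methodology, so take the details as an assumption.

```python
# Hypothetical reconstruction of the "error bar covers less than 10%
# of the range" filter; Herdict's actual methodology may differ.
from math import sqrt

def wilson_interval(k, n, z=1.96):
    # 95% Wilson score interval for a proportion of k successes in n trials.
    p = k / n
    center = (p + z * z / (2 * n)) / (1 + z * z / n)
    half = (z / (1 + z * z / n)) * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return center - half, center + half

def credible(inaccessible, total):
    # Keep a country only if its error bar spans less than 10% of [0, 1];
    # countries are then sorted by the interval's left edge.
    lo, hi = wilson_interval(inaccessible, total)
    return hi - lo < 0.10
```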