As ever, some things do not fit neatly in one bin or another.

Redesigning Income Tax

Here is an opinionated proposal, having no chance whatsoever of adoption, for how taxes ought to be levied on income. This post was originally scheduled for Income Tax Day here in the good old U.S.A., but I was having trouble with the algebra and then I was on vacation. So instead you get it for Canadian tax day. Everything is calculated in US dollars and compared to existing US systems, but I don’t see any great difficulty translating it to other countries.

Premises for this redesign:

  • The existing tax code is way too damn complicated.
  • It is the rules that need to be simplified, not the mathematics. In particular, the marginal tax rate does not need to be a step function just to simplify the arithmetic involved. You look that part up in a table anyway.
  • All individuals’ take-home pay should increase monotonically with their gross income, regardless of any other factors.
  • High levels of income inequality constitute a negative externality; income tax should therefore be Pigovian.
  • Tax rates should not vary based on any categorization of income (e.g. interest and capital gains should be taxed at the same rate as wages). This principle by itself removes a great deal of the complexity, and a great deal of the perverse incentives in the current system as well.
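
To make the second and third premises concrete, here is a minimal sketch in Python. The rate curve and the `top_rate` and `scale` parameters are invented purely for illustration and are not the schedule this proposal uses. The only point is that a smooth marginal rate that stays below 100% gives take-home pay that always rises with gross income.

```python
import math


def marginal_rate(income, top_rate=0.6, scale=50_000.0):
    """Hypothetical smooth marginal rate: 0 at zero income, approaching top_rate."""
    return top_rate * income / (income + scale)


def tax_owed(income, top_rate=0.6, scale=50_000.0):
    """Total tax: the closed-form integral of marginal_rate from 0 to income."""
    return top_rate * (income - scale * math.log1p(income / scale))


def take_home(income, top_rate=0.6, scale=50_000.0):
    """Gross income minus tax. Its slope is 1 - marginal_rate, which is always positive."""
    return income - tax_owed(income, top_rate, scale)
```

Because the marginal rate never reaches 1, every extra dollar of gross income adds something to take-home pay, with no steps or cliffs anywhere in the curve.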

Read on for concrete details of the proposal.


Caffeinated owls

Semi-anthropomorphic sketches of six owls, each with a different facial expression and labeled with the name of a different coffee-related beverage: decaf (asleep), half-caf (awake, but not happy about it), regular (a little more awake and still not happy about it), Irish coffee (cheerfully buzzed), espresso (unable to blink), double espresso (oh dear, it's gone all the way to knurd).

Art by Dave Mottram. Found on G+.

A Contest

In honor of the Feast of All Fools, and because if anyone has noticed it, they haven’t told me, I hereby announce that there is a joke in the references of my most recently published paper. Whoever first correctly identifies it will win the right to suggest a joke to be added to my next paper, which is currently in preparation. Post your guesses in the comments; so as not to spoil it for anyone, comments will not be visible until after the contest ends.

One guess per person. Must provide a working email address (or I won’t be able to contact you if you win). Do not suggest a joke now; the winner will be notified of the topic of the upcoming paper, so they can think of something appropriate. Management reserves the right to reject joke suggestions, in which case the next person in line will get a crack at it.

What Is Wrong With You Monkeys?!

Attention conservation notice: Angry rant about sexism and sexism-motivated abuse in the computer industry.

I was going to write a crunchy, cerebral, if perhaps controversial, post today about how I don’t think Bitcoin is going to change the world, but then I got up and read my usual newsfeeds and discovered that, once again, the Internet’s collection of gibbering follow monkeys have decided to hurl abuse up to and including death threats at someone. Someone who, I am not surprised to find, is female and not white. So now you don’t get crunchy, or cerebral. You get an angry rant, because I have had enough of this shit.


Ideas that don’t make money

The sad Internet news of this week is that the multiplayer online game/community Glitch will have to shut down next month. The announcement makes it sound like mostly a financial problem (not enough revenue to keep going), with a side order of getting caught between technology curves. They built the desktop client on Flash, which is on its way out now, but the technologies that will replace it are not completely ready yet; meanwhile, Flash is mostly not available at all on mobile devices but they didn’t have the engineering manpower to build a whole new client for each such platform.

This is a personal disappointment for me, since I liked the game, but it’s also not the first time I’ve seen an Internet community built around a compelling idea fall apart because the money wasn’t there. Something very similar happened to Metaplace and Faunasphere. It’s not just games; the WELL, paragon of elder days, had to be bought out by its users, and this was only possible because it goes back to elder days and has users who are very, very rich. TV Tropes, timesink extraordinaire and valuable resource for high school English students, is ad-supported so it keeps getting jerked around by Google.

You get the idea: the ecology around the Web is only capable of supporting ideas that bring in the money. It doesn’t really matter how good the idea is on its own terms, or how desirable it is to its audience if that audience isn’t big enough to provide enough money. Kickstarter and the like help with that last bit, but they don’t work for things that need lots of money or a continuous stream of money. Glitch staff quoted a figure of six million U.S. dollars a year to keep the game running, which is comparatively small for a business—thirty-ish people at $100,000/yr, plus however much the servers and the connectivity cost, plus overhead. But one million dollars is extraordinary for a Kickstarter project.

The requirement for a continuous stream of money to keep the servers running also hurts things on the Net that were successful but are now declining. I can still play Super Mario World any time I want; even after the original hardware stops working altogether, there will be emulators. But I can’t go back to Star Wars Galaxies, and I’m not sure if I should believe the website that’s telling me I can still play Uru Live. Again this isn’t just about games; we all remember what happened to Geocities.

Free software helps, but not enough, because it’s not enough to be in possession of all the code and data that you need for a client-server MMO. Some specific person or group has to actually run the server, and now we’re back to that continuous stream of money requirement—most of which will be going to people, not to computrons or tubes. You might not need developers, but you definitely need sysadmins. I was a sysadmin in college, for a tiny little computer lab that almost never had crises at four in the morning, and it was still a shitload of work. For an MMO you also need in-game and out-of-game moderators, which is even more difficult and thankless a gig than sysadminning, and while people do sometimes volunteer to do it for free, often those are exactly the people who should not be doing that job (yeah, I’m looking at you, Reddit).

Is there a solution? I don’t have one. I think it’s more a problem of capitalism than a problem of software architecture.

git backend, hg cli

LWN has an article with a nice chunky comment thread talking about the history of DVCSes and how git has basically taken over the category. Mozilla, of course, still mostly uses Mercurial, but there are a lot of people who prefer git now, and there are bridges and stuff.

I have a weird perspective on all of this. I hacked on Monotone back in the day, so I have the basic DVCS concept cold, and Mercurial is only a little different; it never surprises me. Git, however… I read the documentation, and I think I understand what’s going on, and then I do something that according to (my understanding of) the documentation should do what I want, and instead it mangles my local repo and I get to spend an hour or two repairing it. Or, in one memorable case, it mangled the remote, shared repo—thankfully that was easily fixed once I figured out what it had done, but I still don’t know why it did that instead of what I expected it to. (A matter of which branch’s HEAD pointer got updated with the result of a merge.) I’ve been actively hacking on projects whose primary VCS is Git for over a year now and this consistently happens to me about once every 20 to 40 hours of coding time.

So I don’t trust Git and I don’t like using it. I do, however, appreciate its speed, which as far as I can tell is down to back-end stuff—storage format, network protocol, and so on. So here’s what I want: I want someone to write an exact clone of the Mercurial CLI that uses git’s back end. I have no time, but I would totally contribute money to the development of this. It has to be an exact clone in terms of command line behavior, though. If that means throwing away front-end features of Git, I am 100% fine with that. I would happily lose the index/working copy distinction, for instance. I could also live with losing support for arbitrary Mercurial extensions; I would miss MQ in principle but I suspect there’s an alternate development model for Mozilla that doesn’t need it. Everyone else seems to manage.

Anyone else interested in something like that?

unearthed arcana (music division)

Some time ago—I don’t remember how long precisely—I started working on a mixtape. I got as far as writing down a bunch of songs in categories, and then I lost interest, and the list has been cluttering up my desk ever since. The category tags no longer make a great deal of sense and I’m not even sure who sings some of these songs anymore, but if I put it into the computer then I can get rid of the paper cluttering up my desk, and maybe the magic of the internets will do something with it.


Classical Mechanics Interlude: Acceleration to stop in a constant distance

Over on twitter, @MegaManSE asked

does anyone know the equation to find the acceleration to stop a moving object in a constant distance given some random starting velocity?

I didn’t, at the time, know … but I do know how to work it out from first principles, and it makes a decent little classical mechanics exercise, and also an excuse to figure out how to get MathJax hooked up on this blog, which might be useful in the future. So here’s how it’s done.

The first step in solving one of these problems is to rewrite the question as formally as possible:

At time \(t=0\) an object is at position \(x=0\) and moving with velocity \(\nu=v\). Find the constant acceleration \(a\) such that at some future time \(t=T\), when the object is at position \(x=d\), its velocity will be zero.

Now how do we do that? It’s time for just a little bit of integral calculus. Velocity is the rate at which a moving object’s position changes, as a function of time, and acceleration is the rate at which a moving object’s velocity changes, also as a function of time. The calculus was invented to answer the question, if I know what one of these is, what are the other two? It has a somewhat-deserved reputation for being confusing, but mostly that’s because it’s hard to explain how you come up with its rules. If you know the rules, they’re pretty easy to apply. The acceleration in this problem is constant, \(a\), and we know at time \(0\) the velocity is \(v\) and the position is \(0\). Therefore, the velocity at time \(t\) is

\[ \nu(t) = v + \int_0^t a\; \text{d}t = v + at \]

and the position is

\[ x(t) = 0 + \int_0^t v + at\; \text{d}t = 0 + vt + \frac{at^2}{2} \]

These are both functions of time, but we want to solve for acceleration as a function of distance and starting velocity. But that’s just a matter of algebra. We want \(\nu(T) = 0\), so we plug that into the first of these equations and solve for \(T\):

\[ 0 = v + aT \quad\rightarrow\quad T = \frac{-v}{a} \]

And we want \(x(T) = d\), so we plug both that and the formula for \(T\) into the second equation:

\[ d = v\frac{-v}{a} + \frac{a}{2}\left(\frac{-v}{a}\right)^2 \]

Now all we have to do is solve for \(a\):

\[ d = \frac{-v^2}{a} + \frac{v^2}{2a} \]

\[ d = \frac{-2v^2 + v^2}{2a} \]

\[ 2ad = -v^2 \]

\[ a = \frac{-v^2}{2d} \]

Wait, the acceleration comes out to be negative?! Yes. That’s how you know the object is slowing down rather than speeding up. (If the object weren’t moving in a straight line, its position, velocity, and acceleration would all have to be treated as 2- or 3-dimensional vectors, but the calculations would wind up being very nearly the same, only with more boldface. Also, if the velocity were negative, it would mean the object was moving backward. This is, in fact, the difference between velocity and speed: speed is the magnitude of velocity, without the direction, so it can never be negative.)
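
If you would rather check the result numerically than trust the algebra, here is a small Python sketch. The function names are mine, chosen for illustration; the formulas are the ones derived above, and the code assumes motion along a straight line with the stopping distance measured in the direction of travel.

```python
def stopping_acceleration(v, d):
    """Constant acceleration that brings an object moving at velocity v to rest after distance d."""
    if d == 0:
        raise ValueError("stopping distance must be nonzero")
    return -v**2 / (2 * d)


def state_at(t, v, a):
    """Position and velocity at time t, starting from x = 0 with velocity v."""
    position = v * t + a * t**2 / 2
    velocity = v + a * t
    return position, velocity


v, d = 10.0, 25.0
a = stopping_acceleration(v, d)   # negative, because the object is slowing down
T = -v / a                        # time at which the velocity reaches zero
final_position, final_velocity = state_at(T, v, a)
```

Plugging the stopping time back into the position and velocity formulas should land the object at distance `d` with zero velocity, which is exactly the condition the derivation started from.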

Dead uncle Allotheria

Summaries of the rest of CCS’10 are still coming eventually, but what I want to talk about today is the Difference Engine No. 2, which I have now seen demonstrated twice: Mozilla had its annual meeting at the Computer History Museum a couple years back, and on Saturday last I was there again for SRI’s holiday dinner.

As you know, (Bob,) the Difference Engine was originally designed in the 1820s by Charles Babbage; a tabletop-sized demonstration version was built, but the full-size version was abandoned in 1833 after Babbage fell out with the mechanist he had hired. Babbage went on to design the Analytical Engine, which if completed would have been a fully operational stored-program computer, and later to redesign the Difference Engine with only eight thousand (instead of 25,000) parts. Having blown £17,500 of Parliament’s money (more than a million pounds at current rates) on the first failed project, he was not able to secure funding for either of these, and they remained drawings only. The London Science Museum built two copies of Difference Engine No. 2—that’s the 8000-part version—in the 1990s, with period-appropriate materials and manufacturing tolerances, to prove that it could have been done; one of these is on display at the Computer History Museum.

The Difference Engine is not a computer in the modern sense, or even a desk calculator. It does one thing only: it computes tables of values of mathematical functions. You approximate the function you want as a polynomial of up to 7th degree (you probably need several polynomials, for different ranges of the argument) and you can then have the Engine calculate the value of the polynomial for many different inputs, using finite differences. It stamps the numbers into a bed of plaster of Paris, which can then be used to cast a letterpress plate and run off many copies of a book. There were already such books, but they were computed by hand, and notorious for their errors. The Difference Engine’s failure meant that they kept on being computed by hand; however, Georg and Edvard Scheutz managed to build a miniature version of the Engine in 1857 and use it to print a table of logarithms; also, the first practical mechanical desk calculator went into production in 1851, and was no doubt employed in making tables more accurate.
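
For the curious, the method of finite differences can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not a model of how the Engine's columns and carries are sequenced mechanically; the function names and the coefficient ordering (lowest degree first) are my own choices.

```python
def evaluate(coeffs, x):
    """Evaluate a polynomial (coefficients lowest degree first) by Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def initial_differences(coeffs, start, step):
    """Return [f(start), first difference, second difference, ...] at the starting point."""
    degree = len(coeffs) - 1
    row = [evaluate(coeffs, start + i * step) for i in range(degree + 1)]
    diffs = []
    while row:
        diffs.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    return diffs


def tabulate(coeffs, start, step, count):
    """Produce count successive values of the polynomial using only additions."""
    diffs = initial_differences(coeffs, start, step)
    table = []
    for _ in range(count):
        table.append(diffs[0])
        # Add each higher-order difference into the one below it.
        for i in range(len(diffs) - 1):
            diffs[i] += diffs[i + 1]
    return table


# x^2 + x + 41 at x = 0, 1, 2, 3, 4
print(tabulate([41, 1, 1], 0, 1, 5))
```

Once the starting differences are set up, every further table entry costs nothing but a handful of additions, which is what makes the job suitable for a machine built out of gears.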

Nowadays, of course, we could program an electronic computer to typeset such a book, but why would we bother, when the very same computer can just spit out exactly the values we’re interested in, of any function we want? In other words, the spiritual descendants of the Analytical Engine have not only obsoleted the Difference Engine—which I’m sure Babbage himself would have expected—but have made the job it was made to do a thing of the past. This hardly ever happens, and nobody (except possibly Vannevar Bush or Murray Leinster) saw it coming—witness the Heinlein juvenile of forgotten title in which the spaceship had an electronic computer to navigate it, but to make it work, the human navigation team had to type in numbers from a book of tables.

At the demonstration, several people in the audience did not seem to believe, even after it was repeated several times, that no Difference Engines were ever built in Victorian times. “This one is a replica?” they kept asking. Nope. Production model, serial number two, of the only production run there ever was. It does seem odd to me that, before now, it was so fervently believed that Babbage’s engines could not have been built with Victorian manufacturing technology, when they manifestly did build similar devices, such as the Scheutz brothers’ difference engine and various desk calculators.

The other telling question was on the order of “if you set the Engine up wrong, what happens?” Being so special purpose, of course it can’t have a bug of the sort we are all used to. Its gears can get out of alignment, but it’s got a mechanism to detect that and jam rather than produce an incorrect sum. (I don’t know how the operator would recover from such a jam, however.) And no matter how you set it up, it will compute some polynomial; but that might not be the polynomial you wanted. Babbage himself was asked this question, in a less-coherent form (look for “On two occasions”).

I’ll leave you all with a vision of what might have been.

The Twit Cleaner

(notes on behavioral categorization of Twitter accounts)

I don’t follow a lot of people on Twitter, but I still sometimes have trouble deciding whether the accounts I’m following are worth it. Folks with much longer follow lists presumably have even harder going.

Enter The Twit Cleaner, a (sadly, as of late 2013, defunct) service that scans your follow list and automatically categorizes the behavior of everyone on it. They have some straightforward heuristics for deciding whether someone is worth following, mostly documented in their FAQ:

Q. How are the (potential) bad guys broken down?

A. The possible categories are:
Dodgy - spam phrases, @ spamming, duplicate links etc
Absent - No updates in a month, or fewer than 10 tweets.
Repetitive - High numbers of duplicate tweets or links
Flooding - So high volume you can’t see anyone else
Non-Responsive - No interaction & those that follow back < 10%
Little New Content - Retweeting lots or just posting quotes

This is generally a good scheme, but its focus on conversational use of Twitter means that it misidentifies a few types of legitimate account as unsavory. I think a few special case categories would go a long way to making the service’s advice more useful.

Announcement channels

These are the Twitter equivalent of a news ticker: they broadcast announcements related to something, but they don’t converse with people (as a general rule). The Cleaner dings them as “dodgy behavior: tweeting the same links all the time” and/or “not interactional: hardly follow anyone.” Examples include @NBCOlympics, @CDCemergency, @asym, @Astro_Soichi, and (ironically) @TwitCleaner itself (the problem here appears to be public “@somebody, your report is ready at” directed tweets when direct messages fail).

These can probably be machine-identified as extreme outliers in follower-to-followed ratio. @asym and @Astro_Soichi don’t follow anyone; @NBCOlympics and @CDCemergency follow less than 0.1% of their follower numbers. @TwitCleaner likes to follow users of the service, though; maybe they should just whitelist themselves? Also, if Twitter-verified users are not already whitelisted (I wasn’t able to tell from my own report), perhaps they should be.
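
A rough sketch of that outlier test, in Python. The 0.1% cutoff is taken from the accounts mentioned above; the `min_followers` floor and the function name are assumptions of mine, added so that brand-new accounts with no followers at all do not qualify, and the counts in the usage lines are made up.

```python
def looks_like_announcement_channel(followers, following,
                                    max_ratio=0.001, min_followers=1000):
    """Guess whether an account is a broadcast-only channel from its follow counts.

    max_ratio: largest following/followers ratio still treated as an outlier.
    min_followers: hypothetical floor so tiny accounts are never flagged.
    """
    if followers < min_followers:
        return False
    return following <= max_ratio * followers


# Made-up counts, for illustration only.
print(looks_like_announcement_channel(500_000, 0))     # follows nobody
print(looks_like_announcement_channel(500_000, 400))   # follows under 0.1%
print(looks_like_announcement_channel(2_000, 1_500))   # ordinary conversational account
```

An account flagged this way would be exempted from the “not interactional” penalty rather than whitelisted outright, since spammers can also have lopsided follow counts.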


Lurkers

Lurkers are the opposite of announcement channels: they just read Twitter, they never post anything. Lurking is a time-honored tradition on the Internet and people shouldn’t be penalized for it. I have several lurkers on my follow list just on the off chance that they might start posting in the future.

Accounts that have never posted at all should be distinguished from accounts that post rarely. (The latter are often spammers. Lately Twitter itself has gotten a lot better about finding and banning spammers, but they still turn up now and then.)

Fictional character accounts

There are any number of fictional characters who regularly use Twitter; that is, their authors write and post tweets under their names, usually to provide a bonus story line, or to implement the fourth wall mail slot. Examples include @Othar of Girl Genius and the entire cast of Questionable Content (caution: mildly NSFW; @pintsize0101 consistently links to egregiously NSFW images of the “where’s my brain bleach” variety). Fictional characters may absent themselves for long periods because the bonus story line is on hold (Othar recently didn’t post anything for four months but is now back) and might not follow anyone but other characters from the same fictional world (the QC cast does this); both things get them unfairly dinged by the Cleaner.

It probably isn’t possible to identify fictional accounts in a mechanical way. However, you could pick out cliques in the follow graph, sets of accounts that are followed by many but that follow no one but each other, as deserving human attention. If Twitter implemented some sort of account-labeling scheme that would let the people behind the curtain mark accounts as fictional characters, that would be awesome.
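
Here is a minimal sketch of the clique idea in Python, assuming the follow graph is available as a dictionary mapping each account to the set of accounts it follows. The function name, both thresholds, and the sample account names are hypothetical; a real follow graph would need something more scalable.

```python
from collections import defaultdict


def closed_follow_groups(follows, min_outside_followers=2, max_group_size=10):
    """Find small groups that follow only each other but are followed by outsiders."""
    followers = defaultdict(set)
    for account, followed in follows.items():
        for target in followed:
            followers[target].add(account)

    groups = set()
    for account in follows:
        # Collect everything reachable by following links out from this account.
        group = {account}
        frontier = [account]
        while frontier and len(group) <= max_group_size:
            current = frontier.pop()
            for target in follows.get(current, ()):
                if target not in group:
                    group.add(target)
                    frontier.append(target)
        if not 2 <= len(group) <= max_group_size:
            continue
        # Every member must be followed by enough accounts outside the group.
        if all(len(followers[m] - group) >= min_outside_followers for m in group):
            groups.add(frozenset(group))
    return groups


sample = {
    "char_a": {"char_b"},
    "char_b": {"char_a"},
    "fan1": {"char_a", "char_b"},
    "fan2": {"char_a", "char_b"},
    "fan3": {"fan1"},
}
print(closed_follow_groups(sample))
```

This only surfaces candidates for a human to look at. It cannot tell a fictional cast from, say, a company's set of official accounts that follow one another.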