Monday, February 27, 2012

The Ongoing Saga of Making Ubuntu 11.10 Usable

Alright, I installed Ubuntu 11.10 (aka "lost all common sense" edition) on a new computer, so I have the memory of trying to make it bearable still in my head. The absolute necessary first thing to do is:

And set your login session to "Gnome fallback". This gives you a passable facsimile of Gnome 2 (aka "worked and was not useless" edition). I am a little... offended is to strong, but miffed that "fallback" means "what to use if you want to get work done and are not, in fact, a hipster," but whatever.

You have to Alt+right click (M-<mouse-3> for the enlightened Emacs users) in order to right click on the panels now. Now I know what you're thinking, "what did they set to right click which necessitated bumping right click to Alt+right click?" The answer is "nothing". That's right, if you right click on the panel in Gnome 3, nothing happens. Which naturally might lead you to wonder "so why take something which worked and was simple and make it more complicated while substituting nothing in return?" Because "liquid intuitive natural human smooth desktop metaphor", that's why. Probably something to do with netbook/tablet/$CURRENT_FAD.

Anywhoo, the next step is installing Compiz, which gives you more (read: any) customization and, most importantly, wobbly windows. It used to be a simple click in System → Preferences → Appearance → Effects and away you wobbly went! Now, you have to dig into compiz-config-settings-manager a bit. First, install compiz and the settings manager:

I won't go into all the options here, but if you open ccsm (short for "compiz config settings manager") and look around, you can fix the lack of wobbly windows as well as focus-follows-mouse. I really can't use a computer without those two.

To get rid of the global menu bar (in what drug-fueled rage did this seem like a good idea), just run:

Who doesn't love the idea of scroll bars which try their darnedest to evade your mouse pointer? I know that it becomes even more of a huge pain metaphorical adventure with focus-follows-mouse. If you move the mouse off of the scroll... thing, it changes your window focus. Oh, and did I mention that "off of" can mean "above or below, if the scroll thing thinks you shouldn't have moved your pointer that direction?" Well, it does. If you anger the scroll-thing gods, they will become cross and punish you for your wickedness by making the scroll thing forever evade your grasp. It's kind of like Sisyphus, but you are forever chasing a scroll thing you can never quite catch.

Anyway, to make it die the horrible flaming death it so rightly deserves, run the following (you may need to log out and back in for it to take effect):

Now, to make the window decorators okay again, like they were when everything was fine, run this (you'll probably have to log out and back in again):

And no, I couldn't find an easier way to do this. The advantage the default decorators have over the crappy (aka new) ones is that you can actually tell the difference between the active window and background ones. In both "Ambiance" and "Radiance" the (only) theme options, the difference between the current and background windows is that the window title — not the window bar, but just the title text — goes from gray to grayish. Thanks Ubuntu.

This also has the advantage of restoring the window borders, so that if, for example, you are trying to resize a window, you don't have to hit the one-pixel-wide target area in order to resize. Apparently, according to the geniuses in charge of designing the new interface, being able to resize windows easily is not NEW INTERFACE METAPHOR-y enough.

At this point, you have restored your computer to a reasonably-functional state. You may be wondering, as I did, why the developers of a distribution would take something which was just fine, did what you needed, and was instantly familiar to everyone who has used a computer before, and replace it with the user-interface version of Interior Semiotics. I have two hypotheses:

  1. A couple guys glanced at an iPad, saw that it couldn't do as many useful things as a real computer, and thought "hey, we could make our OS not be as useful as a real computer too!" So they set out to build the worst piece of useless filth they could imagine. One thing that sprung to mind immediately was
    I know! Let's make it so that to launch any application, the user has to move the mouse over to the left of the screen, but not the top-left, that's for the close-maximize-minimize buttons.
    But that was thinking too small. It took another act of auto-lobotomy brilliance to take it to the next level.
    Oh, how about, when the user has opened an application, any indication of other applications on the system or how to launch them disappears! Then, in order to launch other applications, the user needs to go back over to the "Dash" — which is totally not a stupid name, concept, or idea — and try to select something!
    The designers were proud of their progress, but frustrated that novice users might still figure out how to launch applications not in the small list of things included by default on the DASH HOME DOCK METAPHOR PANEL. Then came the final piece of brilliance:
    Aha! How about instead of having an "Applications" menu which contains a list of applications organized by their function — Graphics, Internet, Sound & Video, etc. — we only have four applications, and if the user wants to launch something else, they can start typing its name! No one will ever be able to get work done once we include that! We'll finally be competitive with tablets in the "least amount of useful functionality" race!
    Finally, they were satisfied. Nobody who had ever used a computer before would be able to figure out this turd, and the designers could live their life-long dream of turning something useful into a CES tech demo. Surely, the highest aspiration of any piece of software is to look kind of novel for a couple minutes when filmed over the shoulder of a PR representative at a loud and crowded tech convention. In other words, you want this:
    Yeah. That's what people want.
  2. Drugs. LOTS of drugs.

Now, armed with these tools, you can venture forth into the world and use your computer like a computer, rather than an imitation magazine. If you want a magazine you can read while you lean back on your postmodern sofa, go buy a copy of Clash magazine (it's pretty underground, you've probably never heard of it). If you want to lean back and watch TV, then watch TV. If you want to lean back and listen to music, get an iPod, or better yet, an Android phone with a vast multitude of listening options.

But if you have any interest in doing actual work, you get a computer and pray to Turing that the Metaphor Brigade hasn't left the coffee shop and ruined it yet. I suppose if your work entails filming a tech demo of you performing the three tasks you can do with a hypothetical tablet that nobody would pay (or even receive) money to acquire, then keeping up with the latest fads might be important. For those of us who want to open up the computer, start composing a document, and have characters appear on the screen as buttons on the keyboard are pressed, we have been immensely harmed now that "creative" people discovered they could ruin whole operating systems, not just individual websites.

The one kernel of solace is that, because there are users of these formerly-useful operating systems who posses the skills to fix these monstrosities, solutions will continue to exist. And hopefully, once the project leads sober up, all of the trash of the last year can safely be discarded.

Anything is possible, I suppose.

Wednesday, February 15, 2012

Set ordering in Python: why I didn't sleep last night

In yesterday's class of the Python-infused Linear Algebra course I'm taking, a fellow student noted the strange ordering of items in a set when printed. It seemed that the printed order of the set items always swapped the first two items, but left the rest unchanged. I don't know what sets he tried, but I was interested in the general behavior, and so began my quest through the Python source code...

BIG IMPORTANT WARNING!!!

These are some interesting properties of the order of iterations over Python's set objects. They are NOT to be relied upon for the correctness of your program! There are a bunch of hardcoded, opaque magic numbers which determine the order, so it could easily change across micro version, platforms, or compilations with different options. Treat sets as having undefined ordering!

All of this is looking at the tag "v3.1.1" in the cpython repository on the Python repositories site. "3.1.1" is the version of python3 on the computers here, so I figured that would be the one to check, though I assume the set code hasn't changed that much. The root of that version is here.

Sets are defined in the source files "Include/setobject.h" and "Objects/setobject.c" for the headers and implementation, respectively. The relevant part of sets is that their elements are stored in a hash table using open addressing for collision resolution. Python (version 3.1.1, anyway) uses the following algorithm for resolving collisions, in Python (with cheats):

entry is actually a pointer (the whole function is in C); I'm fudging the specifics here a lot. mask is table.array_size() - 1 where table.array_size() is the size of the underlying C array. The essence is the updating of index; it is intended to distribute keys all over the array, making collisions less likely. There are a bunch of trade-offs, and they are described in "dictnotes.txt" and "dictobject.c". One interesting fact is that basic integers hash to themselves (and are ANDed with mask for indexing into the array).

The use of open addressing means it can get all objects simply by walking the table. Entries in the set are defined as:

The set object itself comprises an array of setentries and some additional fields, like the number of in-use entries and the size of the array.

Calling list(o) converts a "sequence object" (there is an abstracted interface at the C level) by calling PySequence_List, declared in "Include/abstract.h" and implemented in "Objects/abstract.c". PySequence_List does some basic checks and calls _PyList_Extend, in "Objects/listobject.c", which creates a new list and calls listextend(self, iter(o)) for the real work. It then simply walks through the iterator, appending the items returned to itself, doubling its length if it runs out of space. So the ordering is determined by the iterator for sets.

The work is done in the setiter_iternext() function, which walks through the backing array until it finds a non-NULL entry and returns that. So the order of objects from the result of list(s) is determined by the order of elements in the table. That is determined by PySet_Add which calls set_add_key which uses PyObject_Hash to get the hash of the object to add.

PyObject_Hash looks into the object's vtable (Python doesn't call it that, it's stored in the object's ob_type field) to find its tp_hash function and calls that. tp_hash returns a long. Each object has its own hash function; this is what is used for a couple of common types:

  • double: uses the integer part of the float (with some extra stuff)
  • pointer: rotates the pointer right by 4 bits (since the lower 3-4 bits are likely to be zero)
  • int: the absolute value of the int modulo the platform's maximum long value. This actually can get interesting, since Python longs are arbitrary precision. The hash is computed through some additions, rotations, and fun stuff like that.
  • tuple: the hashes of its objects, XORed together and multiplied by a sequence of constants to mix up the value, and something having to do with primes.
  • string: XOR of bytes (not "characters"! Know your Unicode!).

So the order items from list(s) is "mostly" stable for sets containing the same objects. The exception is when objects with the same hash are inserted in different orders; because of open addressing, the order of insertion determines their place in the array. The following case should hold true:

So it is possible for sets with the same contents to have different orderings, but it requires a collision of hash(a) % mask. Remember, (for ints) hash(i) == i when i is not an arbitrary-precision numeric. You can create a collision easily by exploiting knowledge of the initial size of the internal array. Rather cleverly, the set object includes an immediate array of PySet_MINSIZE elements (defaulting to 8) within the structure itself. This means that if your set doesn't grow beyond 5 elements (the hard-coded load factor is 2/3), there is no extra memory allocated for the table. Given those two facts, all you need is (a, b) such that a & 7 == b & 7. How about (0, 8)?

You can then do:

I'm not sure why the order of the elements is reversed; this only happens for certain combinations:

I started investigating deeper into how Python fills its hash tables, but I decided to stop after a dozen hours, reading through Python's parser code, and trying to figure out pixel offsets to pass to PyCairo during visualization. My progress thus far is here. I'm trying to get a visualization of exactly how the entries are being added, but porting Brandon Rhodes' dictionary code to work for sets is non-trivial. Also, getting the visualization to display in a Gtk window rather than writing to a file took some fiddling around with PyGtk and PyCairo.

tl;dr: the order of objects in a set when converted to a list (such as during display on the repl) is dependent on the hash of the object and the collision resolution strategy. The pattern found by the student in which the first two elements of the set were reversed was a coincidence, as the indexing function attempts to distribute keys uniformly over the table.

I spent WAY too much time on this. Interesting, though.