Wednesday, September 12, 2007

Universal Serial Numbers

No, this is not a post about cracks or keygens for some software. It is an idea I have had kicking around in my head for a bit that needed to get out. The motivation for this idea is best captured in story form.

At the dawn of time, Man had a scant few possessions. The idea of assigning numbers to these possessions was absurd for a couple of reasons:

  1. He wouldn't have gotten very far past 1: Rock. 2: Wife. 3: Shank of meat.
  2. Numbers weren't invented yet.

As time went on, Man began to add things to his list of possessions: sticks, bigger stones, and eventually land! For most of human history, there were few enough things that you could just kind of remember all of them.

But what happens when you have thousands of items for sale in the same store? Should the clerk simply memorize the price of each and every different kind of spiced ham and make a tally of how many of each kind he has sold? If that's how things worked, the apocalypse would have occurred about 30 seconds after the opening of the first Wal-Mart.

What happened is that some guy, who I can only assume was named Joseph Bar, invented the barcode so that the tedium of matching an item to its product description could be handed off to computers. Since then, a number of different barcode schemes have evolved.

The invention of barcodes has also led to such things as barcodepedia, which lets you look up any barcode of a common format.

But what of other products? Network interface devices have MAC addresses, cars have VINs, people have passport numbers, books have ISBNs and so on. Wouldn't it be nice if all of these different identifiers could coexist peacefully in the same database?

That's what I am proposing. A unique identifier for each and every thing on planet Earth.

In the words of Keanu Reeves: "Whoa."

This idea would need a lot of fleshing out, and it gets complicated quickly. Let's take the example of books for the sake of argument. Each individual book has a universal serial number. It does not need any more identifying information than that: no ISBN, no barcode, no Library of Congress catalog card number. If you had a graphical scheme for encoding it, like a barcode, that would be the end of the story as far as what's printed on it.

Things get messy when you try to do something useful with this. So you have your lovely SHA-1 (why not?) string and you want to find some stuff out about it. Well, it would be nice to know what the product is. So if you have a copy of The Bourne Ultimatum, and all that's on the book itself is its unique hash code, how do you figure out that it is actually Robert Ludlum's smash hit? Well, let's invent a relationship between USNs (Universal Serial Numbers). Your individual copy could be a "member of the product line $FOO." So we add that to our database, publish a website with all of that and run off fat and happy.
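
To make that first step concrete, here is a minimal sketch in Python. It assumes a USN is simply the SHA-1 hex digest of some random bytes (I haven't decided how one would really be minted), and the names mint_usn, member_of and product_line_of are made up for illustration.

```python
import hashlib
import os
from typing import Dict, Optional


def mint_usn() -> str:
    """Return a fresh 40-character hex USN.

    Assumption: a USN is the SHA-1 digest of 32 random bytes.
    """
    return hashlib.sha1(os.urandom(32)).hexdigest()


# Hypothetical in-memory "database": item USN -> product line USN.
member_of: Dict[str, str] = {}


def product_line_of(usn: str) -> Optional[str]:
    """Look up which product line an individual item belongs to, if known."""
    return member_of.get(usn)


product_line = mint_usn()  # stands in for the product line $FOO
my_copy = mint_usn()       # one physical copy of the book
member_of[my_copy] = product_line

print(my_copy, "is a member of the product line", product_line_of(my_copy))
```

The product line gets a USN of its own, so "member of" is just a link from one USN to another.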

But wait! We now know that the USN 173a18cf0b9835a0a0c67808ca20bb82a4c20dc7 is an instance of the book The Bourne Ultimatum and we could certainly store some info about the authors and publishing company. But we wanted everything. So the book might be "authored by $BAR," "published by company $BAZ," "edited by $FOB," and so on. Now we have tons of kinds of relationships which all have to be recorded.
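
One way to keep all those kinds of relationships in one place is to store them as (subject, relation, object) triples. This is only a sketch: the RelationStore class, the relation names, and the short "usn-foo" style placeholders (standing in for real 40-character USNs) are all assumptions of mine.

```python
from collections import defaultdict

COPY = "173a18cf0b9835a0a0c67808ca20bb82a4c20dc7"
# Placeholders, not real hashes: they stand in for $FOO, $BAR, $BAZ and $FOB.
TITLE, AUTHOR, PUBLISHER, EDITOR = "usn-foo", "usn-bar", "usn-baz", "usn-fob"


class RelationStore:
    """Hypothetical store of (subject, relation, object) triples keyed by subject."""

    def __init__(self):
        self._triples = defaultdict(list)

    def add(self, subject, relation, obj):
        self._triples[subject].append((relation, obj))

    def objects(self, subject, relation):
        """Every object linked to subject by the given relation."""
        return [o for r, o in self._triples.get(subject, []) if r == relation]

    def describe(self, usn):
        """Facts recorded for usn, followed by those of its product line(s)."""
        facts = list(self._triples.get(usn, []))
        for line in self.objects(usn, "member_of"):
            facts.extend(self._triples.get(line, []))
        return facts


store = RelationStore()
store.add(COPY, "member_of", TITLE)
store.add(TITLE, "authored_by", AUTHOR)
store.add(TITLE, "published_by", PUBLISHER)
store.add(TITLE, "edited_by", EDITOR)

for relation, obj in store.describe(COPY):
    print(relation, obj)
```

Facts about the author or publisher are recorded once on the product line, and each individual copy picks them up through its "member_of" link.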

What if we wanted to take it further? Say I have my copy of the aforementioned book but it gets lost or (gasp!) stolen? If I record somewhere that the book 173a18cf0b9835a0a0c67808ca20bb82a4c20dc7 belongs to me, I could help someone return it or prove that it was mine in the first place.

Here we hit the problem of privacy. Others may be different, but I certainly would not like everyone on the Internet to know about all of my possessions. One possible solution is to have the database work in a distributed manner, similar to DNS. That way, if you looked up 173a18cf0b9835a0a0c67808ca20bb82a4c20dc7 in the global database, it would point you to my server, which would have its own rules for deciding how much to reveal. Because the keyspace is so large (a SHA-1-sized key has 2^160, or roughly 1.5 x 10^48, possible values), there is a low probability of someone guessing valid keys and figuring out what someone else has. There is far more to consider here, but again, this is a very rough draft.
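
The DNS-like lookup could work roughly as sketched below. Everything here is an assumption for illustration: the GlobalRegistry and OwnerServer classes, the split into public and private fields, and the idea that a server reveals private fields only to requesters it trusts. A real system would need actual networking and authentication.

```python
class OwnerServer:
    """Hypothetical server run by an owner; it decides how much to reveal."""

    def __init__(self, name):
        self.name = name
        self._records = {}     # usn -> (public fields, private fields)
        self._trusted = set()  # requesters allowed to see private fields

    def register(self, usn, public, private):
        self._records[usn] = (dict(public), dict(private))

    def trust(self, requester):
        self._trusted.add(requester)

    def lookup(self, usn, requester):
        record = self._records.get(usn)
        if record is None:
            return None
        public, private = record
        answer = dict(public)
        if requester in self._trusted:
            answer.update(private)
        return answer


class GlobalRegistry:
    """Hypothetical global database: it only knows which server answers for a USN."""

    def __init__(self):
        self._delegations = {}

    def delegate(self, usn, server):
        self._delegations[usn] = server

    def resolve(self, usn, requester):
        server = self._delegations.get(usn)
        return None if server is None else server.lookup(usn, requester)


BOOK = "173a18cf0b9835a0a0c67808ca20bb82a4c20dc7"
registry = GlobalRegistry()
mine = OwnerServer("my-server")
mine.register(BOOK, public={"status": "owned"}, private={"owner": "me"})
mine.trust("police")
registry.delegate(BOOK, mine)

print(registry.resolve(BOOK, "stranger"))  # public fields only
print(registry.resolve(BOOK, "police"))    # public and private fields
```

The global database never stores who owns what; it only delegates, so each owner's server stays in charge of its own disclosure rules.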

So there it is. A rough idea which, given an enormous amount of work, could define and organize the relationships between all identifiable objects known to man.

Including his rock, his wife, and his shank of meat.
