The Huz Experience

Interweb Stuff

S.T.A.L.K.E.R

Tuesday 8th August 2006

So yesterday there was a classic Internet furore as AOL publicly released a pile of search data from users of their search engine. Anonymised, mind you, but that’s not quite the right word. When I think of something being ‘anonymised’, I think of dodgy electronic voices. Interviewees in impenetrable silhouette. Details changed to protect the innocent. The whole shebang.

Adventure gamers on AOL? Bunch of freaks.

What AOL mean here is that they didn’t attach the name and address of each user directly. Very good of them - but they still gave each user a unique ID number.

And tagged each of their search queries with a date and time.

What this means is that anyone can quite easily track the search history of user #3, and - if they happen to have insider access to any of the sites user #3 found through the AOL search, say a government site or one owned by any of the big media companies - crosscheck AOL’s data with their own web server logs to lift the veil of anonymity. Oh dear.
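For the curious, the crosscheck is roughly as unsophisticated as it sounds. Here's a minimal sketch in Python, with the loud caveat that everything in it is made up for illustration: the rows, the IP addresses, the `link_hits` function and the ten-second window are all my own assumptions, not AOL's actual file layout or anyone's real logs. The idea is simply to pair a search that led to your site with a server hit that arrived at about the same moment.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

# Hypothetical search rows: (user_id, query, searched_at, clicked_site).
SEARCH_ROWS = [
    (3, "free puppies", "2006-03-01 10:15:02", "http://www.example.org"),
    (3, "puppy food", "2006-05-20 18:40:11", "http://www.example.com"),
    (7, "cheap flights", "2006-03-01 10:15:30", "http://www.example.org"),
]

# Hypothetical hits from my own server log: (ip_address, hit_at).
SERVER_HITS = [
    ("203.0.113.9", "2006-03-01 10:15:05"),
    ("198.51.100.4", "2006-03-02 09:00:00"),
]


def link_hits(search_rows, server_hits, my_site, window_seconds=10):
    """Pair searches that clicked through to my_site with server hits
    landing within window_seconds of the search timestamp."""
    window = timedelta(seconds=window_seconds)
    matches = []
    for user_id, query, searched_at, clicked_site in search_rows:
        if clicked_site != my_site:
            continue  # this search led somewhere I have no logs for
        t_search = datetime.strptime(searched_at, FMT)
        for ip_address, hit_at in server_hits:
            t_hit = datetime.strptime(hit_at, FMT)
            if abs(t_hit - t_search) <= window:
                matches.append((user_id, ip_address, query))
    return matches


if __name__ == "__main__":
    for match in link_hits(SEARCH_ROWS, SERVER_HITS, "http://www.example.org"):
        print(match)
```

In real life you'd have to worry about time zones, clock drift between the two machines and busy sites where several visitors arrive in the same few seconds, so a match like this is a strong hint rather than proof. Once one hit lines up, though, the ID number hands you everything else that user searched for.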

You don’t even have to be that big a site to score a victim. So congratulations, Case Study User X - you stumbled upon a site for which I have the logs.

Huz? Stalker more like. Thanks Ryan.

The guy - and, judging by his search history, I’m assuming it’s a bloke - is from Texas. He came to the site but didn’t stay. The bastard.

Probably because it doesn’t contain many pictures of actresses. He likes searching for info on actresses, you see. And paparazzi pictures of them. And sometimes, he can’t help but wonder where they live.

Nothing too bad there, I suppose. Harmless enough. At the beginning of his search odyssey with AOL, he was looking for ‘free puppies’. By the end, his search had moved on to the subject of puppy food. Is this a touching tale of dreams fulfilled?

Probably. Unfortunately, not many dreams are likely to be fulfilled by people having their search history posted for the world to see. My stalking victim was relatively innocuous, searching mainly on mundane topics that any one of thousands of people could have. The fact that he was looking for careers with a specific company, evidently owned a particular model of printer and had a poor credit rating - but wanted a loan - might help to pin him down more precisely, but not without a lot of educated guesswork.

It’s highly likely that for dozens, if not hundreds, of people represented in the 2.1GB of raw search data, the effects of someone who knew them trawling through the data would be much more damaging. Even a quick perusal of the first 65,000 records revealed some woman - evidently a woman - with an unhealthy interest in post-natal depression and ‘infanticide’, and a few less savoury examples.

The sad thing is that without the inclusion of unique user IDs - and the associated loss of privacy for those concerned - the data becomes much less interesting for research purposes. It simply becomes a collection of words, without context. As it stands, the AOL data is intriguing; it represents something akin to a stream of consciousness as ordinary people interact with the Internet, revealing more about themselves to the ether than they might reveal to their friends.

It’s a bit of a scary thought how much of ourselves is likely to exist in Google’s vaults, really.

Remember kids: Big Brother might not be watching you right now, but he’s probably saving everything you do until later. And if you’re not careful, he’ll release it all in an ill-advised and incredibly naive philanthropic gesture.

