Network Security Methods of Analysis Under Attack

Andrew "Weev" Auernheimer has been convicted of federal hacking charges (really cracking, but apparently prosecutors aren't smart enough to understand the difference). This guy has a checkered past, but as a software engineer, I have to agree with The Washington Post that the conviction sets a bad precedent.

Given my stance that functional (that is, motivational) right and wrong are relative to desires, you might not agree with my position on this matter. I can only try to convince you by appealing to what desires you already have.

My position is based on the ends of:

1. Maximal truth fostering
2. Maximal human goal actualization
3. Minimal need for the institution of a state (which ties into #1 and #2)

I don't believe in intrinsic value, and don't believe that truth (the concordance of beliefs to their reality) has intrinsic value. However, most human desires are ones which require or greatly benefit from true beliefs. I assert that most people have strong and plentiful reasons to seek the promotion of truth. While there are outliers such as lying to Nazis about hiding Jews in your attic or denying an inconvenient truth to pull through some difficulty, society provides more benefits to its members when that society fosters truth.

While it'd be ideal if no one would exploit security vulnerabilities, that is not likely to be the case so long as humans are wired for personal gain at the expense of the group. Every time people have thought they could get group selection working along human norms, they have been proven disastrously wrong. Humans aren't going to stop being self-interested unless they're rewired neurologically or they are denied opportunities by living in some kind of totalitarian panopticon.

Given that I don't expect humans to not be self-interested, I certainly don't trust humans with coercive rule over others - especially when that rule is sanctioned as an institution. That means that, for now, there will be some jerks among us. The cheapest way to deny them the opportunity, while staying consonant with minimal institutional coercive rule of humans over humans, is to make better locks. Unfortunately, engineers aren't fully informed of all possible exploits and some cut corners.

"Hackers" provide a valuable service in exposing weaknesses and disseminating that information to the public so that future weaknesses of the same type can be corrected or avoided. If a site has security holes, I do not wish to trust it with my information until those security holes are patched. Being informed of site weaknesses allows me to be a better rationalist.

That said, there is a generally-agreed-upon protocol for testing and exposing weaknesses. The pattern is:

1. Hypothesize exploit
2. Test
3. Reveal exploit to the company or individual and allow them time to properly fix
4. After the exploit is fixed, or after a set timeout, reveal to the public

The "after a set timeout" part of #4 is critical because it discourages companies and individuals from being lazy once they know of the problem. It provides a service to the people who have data stored with those companies and it lets other companies know that they might be called out. It's arguable that such an ultimatum (blackmail?) benefits society.

However, it appears that Mr. Auernheimer went straight to Gawker with the vulnerability. Even if he didn't give Gawker the specifics or if Gawker didn't publish the specifics, giving out the knowledge that a weakness exists with a certain site could drive malicious crackers to focus efforts on that site. This is where I completely disagree with the actions Mr. Auernheimer took.

That doesn't mean I'm siding with the prosecution. As for the government's arguments, I flat out disagree with some while others are less clear cut.

Impersonation: The User Agent String

As the article points out, if pretending you're using a different device is a crime, then anyone who has ever used the internet is guilty, even if unknowingly; almost all user agent strings contain the word Mozilla, and IE, KHTML (Chrome and Safari) and Opera are not derived from the Mozilla codebase.

People do this subtle kind of lying all the time. A New Clarksville pastor pretended to be homeless to see if his congregation was taking his words to heart. A friend of mine used to work at The Olive Garden and all the women used fake names on their nametags (probably to avoid getting phone number requests from males). Hell, actors pretend they're different people! I put those in the same category as pretending to be using a different browser - it's a matter of expectations a reasonable person would have.

Web servers and their software designers have no "reasonable" expectation that a user agent string is accurate. I say that because people are clearly spoofing user agent strings - a lot. Therefore you are foolish if you don't operate within that conception of reality. You're not using reason - you're being unreasonable. If you are hinging security on user agent strings, then your software is designed poorly and should be redesigned if your goal is security.

Before IE rightfully lost its dominance to other superior browsers, there were still some sites which claimed to require IE before allowing you to access certain features even though the other browsers supported those features. This wasn't to protect information, but was to prevent bad experiences for the users of browsers the developers didn't bother developing and testing for. Multiple times in my past, I've told my browser to pretend it was IE just so I could use a site. When people stopped relying so much on the user agent string, they improved programming standards and started using feature detection. The result was that sites became usable on a much wider range of browsers, and browser competition benefited the public.
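For a sense of how little effort this kind of "impersonation" takes, here is a minimal Python sketch that builds (but never sends) a request claiming to come from IE. The URL and the exact user agent string are made up for illustration; nothing here is taken from the case itself.

```python
from urllib.request import Request

# Hypothetical user agent string, chosen only for illustration.
SPOOFED_UA = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"


def build_request(url: str, user_agent: str = SPOOFED_UA) -> Request:
    """Build a request object that claims to come from another browser.

    The request is only constructed here, never sent.
    """
    return Request(url, headers={"User-Agent": user_agent})


req = build_request("http://example.com/")
# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

The server sees whatever the client chooses to put in that header, which is why treating it as an identity check seems unreasonable to me.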

Scraping in General, URL-Walking in Specific

Here's a simple and low-cost idea: don't give out information if you don't want others to have it. Scraping collects data which is projected through various HTML templates back towards something resembling its source form. I completely disagree with a right to control THE SHAPE of information (even if I agreed with a right to control the dissemination of information one let out of their control).

Guessable URLs are a weakness which customers who value privacy should demand are repaired. When someone finds such an exploit, it's the company and their lazy software engineers who should rightly be blamed, not the messenger. I've had to change my debit card twice, once due to a Hannaford card processing exploit and once due to a TJX card processing exploit. In both these cases, yes, the cracker was an asshole who (most likely) lied to obtain information. But I'm not going to say it was MY information, and don't get me started on how open authorization for credit cards is a horrible idea rivaled only by social security numbers. I was annoyed at Hannaford and TJX for allowing my card information to be "stolen" though I'm not sure how reasonable that emotion is (for instance, it may have been extremely costly to have protected against whatever specific exploit was used). While that doesn't mean I don't have reasons to bring punishment against the cracker, I also have reasons to bring blame and punishment against the negligent.

When I was young and dumb and in college, I work-studied at The UNH Alumni Center. For some stupid reason, the admissions office wouldn't give us electronic records and a long-running task when there was nothing to do was to key in a few pages of the huge stack of paper records they gave us into some application specific to The Alumni Center. When it came my turn to do some work on it, I invoked the F(this) function because manually retyping information is for morons, not to mention it's error prone.

I simply went home and found that the admissions office DID have a site where the electronic records could be obtained (not password protected) and that they (stupidly) used incrementing numeric ids for their records. It was then a simple matter to walk the URLs until there were no more and effectively download ALL UNH admissions information. Since there was a 100% overlap with what needed to be keyed in, I didn't need further filtering.
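I no longer have the original script, so the following is only a rough Python sketch of the idea of walking incrementing ids until the records run out. The `fetch` callable stands in for an HTTP GET against some endpoint, and `fake_site` is invented test data; neither reflects the real admissions site.

```python
from typing import Callable, Iterator, Optional, Tuple


def walk_ids(
    fetch: Callable[[int], Optional[str]], start: int = 1
) -> Iterator[Tuple[int, str]]:
    """Yield (id, body) for consecutive ids, stopping at the first missing one."""
    record_id = start
    while True:
        body = fetch(record_id)
        if body is None:  # the server has nothing at this id: we are done
            return
        yield record_id, body
        record_id += 1


# Stand-in for the remote site: a dict lookup instead of a network call.
fake_site = {1: "alice", 2: "bob", 3: "carol"}
records = list(walk_ids(fake_site.get))
print(records)
```

Stopping at the first gap is a simplification; a real site with deleted records would need a more forgiving stop condition.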

To avoid corrupting the system, I wrote my "data inputter" to have a dry-run mode and logging, and tested it before it went live. It took 3 hours to run, and I found one error. I tested it again, 3 hours later, it worked. I set it to live and ran it again. Each time I was running it, I went off to do other things. The last time, I went to dinner at Applebee's.
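The dry-run-plus-logging pattern can be sketched in a few lines of Python. This is an illustration under assumptions, not the code I ran back then: the record shape, the `save` callable, and the "must have a name" validation rule are all hypothetical.

```python
import logging
from typing import Callable, Dict, Iterable

log = logging.getLogger("inputter")


def input_records(
    records: Iterable[Dict[str, str]],
    save: Callable[[Dict[str, str]], None],
    dry_run: bool = True,
) -> int:
    """Feed records to `save`, or only log what would happen when dry_run is set.

    Returns the number of records actually written.
    """
    written = 0
    for record in records:
        if not record.get("name"):  # hypothetical validation rule
            log.warning("skipping record without a name: %r", record)
            continue
        if dry_run:
            log.info("dry run, would save: %r", record)
        else:
            save(record)
            written += 1
    return written
```

Defaulting `dry_run` to `True` means a careless invocation touches nothing; going live has to be an explicit choice.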

The next day, I charged The Alumni Center for 2 days of work which I guess is kind of stealing, but I still don't feel that bad about it. I told them they didn't need to key in anything anymore because the scraper and inputter had done it.

Scraping is useful.

If scraping is illegal, then Google needs to shut down because it spiders the web. Granted, you can set a robots.txt file to tell spiders what you expect them to do on your site. There's nothing that FORCES a spider to listen to it and some might not either for malicious reasons or because they were lazily designed. A reasonable and pro-liberty developer knows this and doesn't rely on the law to fix problems which are more easily fixed in code.
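To make the "nothing FORCES a spider" point concrete: honoring robots.txt is a check the spider chooses to run on its own side. A minimal sketch using Python's standard `urllib.robotparser`, with a made-up robots.txt and a made-up bot name:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, parsed locally instead of fetched.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def polite_can_fetch(url: str, agent: str = "ExampleBot") -> bool:
    """Return True if a well-behaved spider would allow itself to fetch url."""
    return parser.can_fetch(agent, url)


print(polite_can_fetch("http://example.com/index.html"))
print(polite_can_fetch("http://example.com/private/records"))
```

A spider that never calls `polite_can_fetch` gets the page anyway, which is exactly why the file is a request and not a lock.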

Before people coded to actual web standards, Google caused a bunch of sites to lose information just by walking them. Many sites exposed data-modifying endpoints as GET methods (rather than POST/PUT/DELETE), so when the spider followed those links it ended up deleting the data (here's one example of that).
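The fix in code is cheap. Here is a toy Python handler, with an invented `/delete/` path scheme and an in-memory dict as the "database", that refuses to modify data on the methods a spider normally uses:

```python
from typing import Dict

# Methods that are supposed to be free of side effects.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}


def handle(method: str, path: str, store: Dict[str, str]) -> int:
    """Toy request handler returning an HTTP status code."""
    prefix = "/delete/"  # hypothetical URL scheme
    if path.startswith(prefix):
        if method.upper() in SAFE_METHODS:
            return 405  # Method Not Allowed: a crawler's GET deletes nothing
        store.pop(path[len(prefix):], None)
        return 204  # No Content: deleted
    return 200


posts = {"42": "hello"}
print(handle("GET", "/delete/42", posts), posts)   # what a spider would send
print(handle("POST", "/delete/42", posts), posts)  # what the owner's form would send
```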

URL-walking is akin to asking someone a bunch of similar questions. They can choose to ignore you, but after a while it could be considered harassment. If they answer you when they didn't want to, that's their own dumb fault. However, there is a small amount of resources which are wasted in even listening for questions. This is why DDOSing (human-equivalent: harassment) is a legitimate crime in my opinion. However, I demand a reasonable threshold.
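A server can enforce its own threshold rather than waiting for a court to define one. The sketch below is one possible sliding-window limiter in Python; the class name, the limits, and passing the clock in as `now` are my own choices for illustration.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class RateLimiter:
    """Allow at most max_requests per client within a sliding time window."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window = window_seconds
        self.hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client: str, now: float) -> bool:
        """Record a request at time `now` and say whether to answer it."""
        recent = self.hits[client]
        # Forget requests that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False  # over the threshold: ignore the question
        recent.append(now)
        return True


limiter = RateLimiter(max_requests=2, window_seconds=10.0)
print([limiter.allow("walker", t) for t in (0.0, 1.0, 2.0, 10.0)])
```

Where exactly to set `max_requests` is the "reasonable threshold" question, and the code cannot answer that part.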

Lying is Bad, But Do You Really Lie to Computers?

I'm not a fan of lying. I assert that it violates the NAP (which is very problematic anyways). People have many and strong reasons to discourage lying, one of which is it weakens the beneficial tool of society and indirectly undermines the ability to actualize goals. Even in the case with the Nazis, if that society had valued truth more strongly then the Nazis wouldn't have even gotten to power or wouldn't have been able to demonize Jews as they did.

Lying via computer is a fuzzier concept. There are things which seem like lying or breaking and entering which may merely be being falsely invited and entering.

Property and Rights in a Computer-to-Computer World

My friend told me a story about someone who was going to a party, but got the house wrong. They knocked on the door and the person told them to come in and when they did, the inviter was all "who the hell are you?" The person entering got in legal trouble for that. If that story is true, I consider it an injustice. Maybe the person should have been more careful with getting the address right (though it appears that SWAT teams aren't held to the same standard for no-knock raids), but maybe the person who said "come in" should have been more careful. Or maybe it was an honest mistake that happens so rarely that it's simply not worth addressing - sometimes the best justice comes from doing nothing.

Open wifi hotspots fall into this category. The owner is hanging a "come on in" sign out and other people's computers accept that invitation without THEIR owners' knowledge or circumstantial consent. If that's a crime, then I give up on this legal system.

Scraping being illegal ends Google and pretty much any derivative work OF ANYTHING even if not resold. User agent spoofing being illegal ends dressing as a rich person when you're actually poor (yeah, let's bring back Roman sumptuary laws! - though I guess that falls into a similar category of "impersonating a police officer").

URL probing is more problematic, but I still blame poorly-designed software on the server for allowing constant probing. Predictable URLs exacerbate the problem and can easily be eliminated in most cases by using GUIDs as your identifier of choice.
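Swapping incrementing ids for GUIDs is close to a one-liner. A minimal Python sketch using the standard `uuid` module; the function name and the URL shape are hypothetical:

```python
import uuid


def new_record_id() -> str:
    """Return a random (version 4) UUID to use in place of an incrementing id."""
    return str(uuid.uuid4())


record_id = new_record_id()
# Hypothetical URL shape; knowing this id tells you nothing about its neighbors.
print(f"/records/{record_id}")
```

This only makes ids hard to guess. It does not replace an actual access check on records that are supposed to be private.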

Password-guessing seems much more fraudulent to me because you're lying to the server about who you ARE rather than just asking for a certain set of data (as in the case of URL probing). I think a strong case can be made for that remaining illegal, though people should stop using weak-ass passwords. Especially when there are ways to increase entropy without sacrificing convenience - here and here.
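The entropy claim is simple arithmetic: a secret built from `length` independent, uniformly random picks out of `choices` options carries `length * log2(choices)` bits. A short Python sketch, where the 2048-word list and the comparison password are my own example numbers:

```python
import math


def entropy_bits(choices: int, length: int) -> float:
    """Bits of entropy for `length` uniform random picks from `choices` options."""
    return length * math.log2(choices)


# Four words picked at random from a hypothetical 2048-word list.
print(entropy_bits(2048, 4))
# Eight random lowercase letters, for comparison.
print(round(entropy_bits(26, 8), 1))
```

The formula only holds when the picks really are random; a phrase a human thought up, or one copied from a famous example, has far less entropy than the math suggests.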

Speaking of "Correct Horse Battery Staple," things like Bitcoin pose problems for the concept of property - at least my conception of how to define property and what role it should serve. Online Bitcoin tool brainwallet.org defaults to an address computed from the phrase "correct horse battery staple." Some people have sent coins to that address which others who are watching that address have snatched up. This led to people crying "theft."

However, the way Bitcoin works is that people don't "own" addresses, they solve puzzles which let them set the conditions of a puzzle that the next person must solve to claim those coins. From one point of view, no one owns bitcoins, they just possess private knowledge of the solutions to given puzzles. Anyone solving a puzzle can claim the coins per the rules of the system... and using the system implies consent in the way it works. Also, that's what you get when you try to eliminate trust from interpersonal relationships, you have to rely on rules enforced by something other than humans and non-humans probably aren't going to discriminate in the same way humans would. In other words, the Bitcoin network's allowing a claim to coins is proof that the claim is valid - theft is literally impossible with Bitcoin because theft isn't a concept computers can understand and protect against. Anything else requires trust or must refer to something which happens outside the Bitcoin network (for instance, gaining access to a puzzle solution fraudulently rather than just guessing).
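To show why a brainwallet "puzzle" is solvable by anyone who can guess the phrase: as far as I know, brainwallet-style tools derive the private key by hashing the passphrase with SHA-256. The Python sketch below shows only that step, under that assumption; turning the key into an address needs elliptic-curve math that is left out.

```python
import hashlib


def brainwallet_private_key(passphrase: str) -> bytes:
    """Derive a 32-byte key from a passphrase with a single SHA-256 hash.

    Assumption: this mirrors how brainwallet-style tools work; it is not
    taken from brainwallet.org's source.
    """
    return hashlib.sha256(passphrase.encode("utf-8")).digest()


key = brainwallet_private_key("correct horse battery staple")
print(key.hex())
```

The derivation is deterministic and needs no secret other than the phrase, so everyone who types the same words gets the same puzzle solution.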

This isn't to say that people can't have strong and plentiful reasons for action to oppose actions taken by "thieves" or that they aren't free to define theft however they like. But it does make them look like assholes to others if they don't appear consistent, and perceived hypocrisy is a bad way to coalition-build.

All desires and norms have a cost to make real in the world. I firmly believe that property should be a tool of man, not his master. If property is costly for others to respect, then it'll stop being respected. Given that rights don't functionally exist outside of respect, only affordable rights will end up functionally existing. Information is hard to control and access to information creates a bunch of thorny problems which are hard to justify the same way as other access.

A strong case can be made that IP rights barely exist anymore except to a legal system which is struggling to adjust to the new reality of people's attitudes. Getting people to stop downloading things is very expensive and works at cross purposes to a lot of other things people desire such as not having the government control and monitor the internet. It's human nature to share information. Making purchasing information easier goes a long way. I no longer "pirate" music because buying it is easy and a fair value exchange. But I can't get Game of Thrones from HBO without buying an HBO cable subscription. Guess what? The Pirate Bay for them, now they get no money and I don't feel bad in the least.

But I digress... I'm getting too caught up on the IP here.

Conclusion

People see what this guy did as wrong and they want to punish him for that. That's fine. That's really all morality is underneath despite all the rhetoric. Far be it from me to worship "consistency Jesus." If people want to say on Monday that enslaving Africans is wrong and on Tuesday say it's right, that's their prerogative. However, such inconsistencies won't get you a large coalition and, without a coalition to give others reasons to respect your prescriptions and proscriptions, those rules are as unable to do their job as my car is unable to drive me home when it's out of gasoline.

There are important second- and third-order effects from establishing case law regarding common network security analysis methods from this trial. Intent may matter (it indicates if someone has "bad" desires), but attacking the methods of analysis throws out what I consider the good with the bad.