Confused why you're here? My name used to be Ben Balbo. I'm now Ben Dechrau (/bɛn dex-raɪ/).

Archive for the 'Security' Category

Will Cloud Computing Violate your Privacy and Security?

According to yesterday morning’s ABC Radio National show, cloud computing will pose a danger to your on-line privacy and security with people able to read your email, see what web sites you’ve visited and reconcile your on-line activities, banking details and buying habits. We’re also going to hear a lot about cloud computing in the coming months because Google have just released their latest product, Chrome.

That’s what I understood from the show. I’m not entirely sure how Chrome fits into the equation, but I’ll get to that later.

So apparently cloud computing is a system that allows applications to run “in the cloud”* where all data is accessible by Google. The presenters did single out Google but added that other cloud computing providers could also access any data in their part of the cloud.

Experts were also quoted as being concerned about the security of data in cloud computing environments: not only does the user need to trust the application developer and maintainer, but also any other third party that the application hosting relies upon. Currently people only need to worry about the software producers, as all data is stored on their local computer.

I think there’s a massive amount of confusion here, or perhaps I’m the one that’s confused.

Let’s examine my view of what cloud computing is: computing power that resides “in the cloud” and isn’t dependent on one piece of hardware. I’ll flesh that out a little.

[Image: sample network diagram]

* A note on “in the cloud”: in network diagrams, “clouds are used to represent networks external to the one pictured for the purposes of depicting connections between internal and external devices, without indicating the specifics of the outside network” [wikipedia]. Generally this refers to the Internet.

In the beginning there were servers. Real, physical boxes that ran an operating system. They would be web servers, database servers, email servers, and so on. Some servers would provide more than one function, offering web, database and email hosting, for example. People had the choice between having their own dedicated (physical) server or hosting in a shared environment where multiple clients’ web sites were hosted on one physical box. The latter option was much cheaper but also provided less flexibility in terms of server configuration for the end client.

Then there were virtual private servers. Imagine a physical server that contains multiple virtual servers. Each virtual server has its own operating system, its own disk space and can run its own programs. This provided the functionality of a dedicated server at a fraction of the cost.

Now imagine having a virtual private server but you don’t know where it is. You don’t have a concept of it residing on a physical server – it’s simply out there “in the cloud” somewhere.

That is, in my view, cloud computing. Removing the “isn’t dependent on one piece of hardware” part of my definition would make any server fit the description of cloud computing.

So why are all these people concerned about cloud computing being such a threat to privacy? Cloud computing will allow web-based applications to scale more readily to demand, so perhaps more web-based applications will be hosted in a cloud computing environment. Perhaps it’s also because Google’s online applications (Docs, Calendar, Reader, etc.) are perceived to run in a cloud computing environment and that Google are the custodians of your data. Together with their Adsense technology, it’s assumed that Google know everything about you.

The dangers are, of course, already there. I use Google Calendar for all my appointments, so they know whom I know, where I’ve met them and when all my friends’ birthdays are. My news reader of choice is Google Reader. I use Twitter to share my current actions, feelings, learnings and rants. Technorati and Google Blogs index my blog. I used to use Saasu for all my business accounting and billing. Running these applications in a cloud computing environment is not going to make these data any more reconcilable than they already are.

One example given of the privacy concerns was that people will now be able to read your email and see which web sites you’ve visited. Well, I can (but don’t) read all my clients’ emails – they’re stored on my server. My ISP can see every web page I’ve requested (and most of the time its contents) and probably passes that information to Hitwise. Google Analytics knows a fair amount of where I’ve been and what I like.

Caveat lector: I have not managed to determine what Google’s policies are on data stored on Google’s App Engine. If you know, please add a comment to this post.

In my view this is all hype about nothing. We’re no less secure than we were before. The goal posts have not moved, we’ve just been given a different playing field in which to kick our balls around.

And as for Google Chrome being part of this whole cloud computing thing, it’s a browser! It’s as much a part of cloud computing as Firefox, Opera and Internet Explorer are. Sure, it runs JavaScript faster, is apparently less likely to crash completely and might be a superior browser when using online applications. It’s also been said that Chrome could be the Google operating system that was being talked about many moons ago: the operating system that provides access to the applications that reside in the cloud. But it’s still just a browser.

Given my near-paranoid tendencies when it comes to security and privacy, should I be worried?

Never trust your users!

Time and again I see people do stupid stuff on the web. I’m talking about the developers. There’s this big fat rule in the world of web development: never trust your users to do the right thing.

This could mean asking the user if they’re sure that they want to delete an appointment from their calendar, checking that a provided email address is valid or prompting them to save changes before moving to another page.

These examples are quite trivial though – hopefully nothing super bad will happen if these checks aren’t performed. When it comes to money, however, you really want to make sure you’re double checking everything.

I had a rather interesting encounter with a stupid system that processes tons of financial transactions every day (I assume). It’s an online payment system for a number of Australian services: you can pay your car registration fees, building permit fees, council rates and parking infringement fines, to name a few.

Here’s the first screen:

And here’s the payment confirmation page:

It seems that changing the contents of the price field in the first page alters the final payment amount!

Why the developers thought this was a good idea is beyond me. When dealing with money, or any information for that matter, you should always check the values match what is expected. In this situation, I expected one of two results:

  1. The payment page recalculated the payment amount and charged that amount, rather than the amount sent from the browser, or
  2. The payment page tells the user that the payment amount does not match the bill amount and prompts the user to start the payment process again.
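A minimal Python sketch of the two outcomes above. Everything in it is hypothetical: the `BILLS` lookup, the `REF-1001` reference and the `allow_custom` flag are illustrative names, not part of the payment system described. The only point is that the server's own record of the bill, not the form field, decides what gets charged.

```python
from decimal import Decimal

# Hypothetical bill store; a real system would look this up in its database.
BILLS = {"REF-1001": Decimal("250.00")}


class AmountMismatch(Exception):
    """Raised when the submitted amount differs from the amount owed."""


def amount_to_charge(reference: str, submitted: str, allow_custom: bool = False) -> Decimal:
    """Return the amount to charge, trusting the server's record over the form."""
    owed = BILLS[reference]
    sent = Decimal(submitted)
    if sent == owed:
        return owed
    if allow_custom and sent > 0:
        # The biller has explicitly opted in to partial or over-payments.
        return sent
    # Otherwise refuse, so the user can be told to start the payment again.
    raise AmountMismatch(f"submitted {sent}, but {owed} is owed on {reference}")
```

With these assumptions, `amount_to_charge("REF-1001", "1.00")` raises `AmountMismatch` rather than quietly charging one dollar, whatever the browser did to the field.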

Update: ZenPsycho just suggested the system might intentionally allow users to pay more or less than the required amount. This is a valid point, and perhaps some of the billing system’s clients might like to offer this. I forgot to mention though that the form element for the amount included “readonly” and “disabled” attributes, so if the client chooses not to allow the user to change the values, the system really should enforce the payment amount. At the very least it should warn me that I’m about to pay less than the current amount and ask me to confirm.

How strong is your password?

The Password Strength Checker uses a number of metrics to determine how strong a given password is, including the number of characters in total, uppercase and lower case letters, numbers and symbols. It also deducts points in the event you have numbers only, repeated characters, consecutive same-case letters, sequential letters or numbers.

Having played around with it a bit, it’s great for telling if a given password is strong, but don’t worry too much if it tells you it’s weak.

Take, for example, the password Ad%U,1q3b. This string was chosen because it causes the report to give exceptional ratings for all positively scoring criteria and a pass for all deductions, resulting in a password of “Very Strong” complexity with a 100% score.

Now take the password Ad%U,1q3bbbb. It receives a “Very Weak” complexity with a 0% score.

I’m not a statistician, but I’m pretty sure the longer password has a lower probability of being found. Am I wrong? That said, it’s still a great tool, and perhaps I need to upgrade my rudimentary in-line password strength checker!
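To put a rough number on that hunch, here is a small Python sketch. It assumes a pure brute-force attacker who tries every string over the 94 printable ASCII characters (excluding the space). That model is a simplification, since real attackers also use dictionaries and patterns, and it is not how the Password Strength Checker scores anything.

```python
import math

# Assumption: brute force over the 94 printable ASCII characters (no space).
ALPHABET_SIZE = 94


def brute_force_bits(password: str) -> float:
    """Size of the brute-force search space for this length, in bits."""
    return len(password) * math.log2(ALPHABET_SIZE)


shorter = "Ad%U,1q3b"
longer = "Ad%U,1q3bbbb"

print(f"{shorter}: {brute_force_bits(shorter):.1f} bits")
print(f"{longer}: {brute_force_bits(longer):.1f} bits")

# Each extra character multiplies the number of candidates by ALPHABET_SIZE.
extra = len(longer) - len(shorter)
print(f"candidates multiplied by {ALPHABET_SIZE ** extra}")
```

Under this model the three extra characters multiply the search space by 94³ = 830,584, even though they are repeats. A smarter attacker who guesses "append repeated letters" early would shrink that advantage, which is presumably what the deduction is trying to capture.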

Powerful Cross Site Scripting Scanning Tool

[Image: scanEE]

Web developers today are increasingly aware of the number of ways that attackers can abuse their site. Not only do we have to worry about someone stealing data directly through our site or from our database, but cross site scripting (XSS) attacks also provide a mechanism for someone to run arbitrary code on another web site.

During his OSDC 2007 keynote, Rasmus Lerdorf mentioned scanmus, a cross site scripting scanning tool he’d written. It looks at a page’s source code and identifies potential entry points. Where it finds a form, it will submit data in a way designed to detect a number of XSS vulnerabilities, and report those to the user. Unfortunately, while he plans to make this available to the community, this won’t happen just yet.
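As a rough illustration of the “identify potential entry points” step only, the Python sketch below lists the forms on a page and their named fields. It is not scanmus, whose source isn’t shown here; the `FormFinder` class and `find_entry_points` function are made-up names, and the sketch does no submitting or vulnerability detection.

```python
from html.parser import HTMLParser


class FormFinder(HTMLParser):
    """Collect each form's action, method and named input fields."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None  # the form we are currently inside, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action") or "",
                "method": (attrs.get("method") or "get").lower(),
                "fields": [],
            }
            self.forms.append(self._current)
        elif tag in ("input", "textarea", "select") and self._current is not None:
            # Only named fields are submitted, so only those are entry points.
            name = attrs.get("name")
            if name:
                self._current["fields"].append(name)

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def find_entry_points(html: str):
    finder = FormFinder()
    finder.feed(html)
    return finder.forms


page = '<form action="/search" method="POST"><input name="q"><input type="submit"></form>'
print(find_entry_points(page))
```

A scanner would then submit test strings to each listed field and look for them coming back unescaped in the response; that part is left out here.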

Ben Cornwell and I got to chatting during the break and when I suggested we write our own, he didn’t hesitate. I don’t think he quite realised at the time that there wouldn’t be any PHP work involved though.

You see, there’s this tradition at conferences (at least the ones that I’ve attended), that when a discussion or talk at the conference gives you an idea for a product, script or technology, you start on it right away and present it at a lightning talk during the same conference. So we couldn’t just have some lame PHP script parse the resultant HTML and spew it to the browser. That would be too easy. That would be just what they’d be expecting us to do! And you know you can’t take over the world by being predictable.

So we wrote it in HTML and JavaScript. Even the logo! It’s one HTML file.

Now this will work perfectly if the HTML file is placed in the document root of the site you want to test. If you want to test remote web sites though, as we did during the lightning talk, you’ll have an issue with cross-domain XMLHttpRequests. So for the demo we had a simple proxy helper that would load the remote site. The JavaScript class could then load the remote site’s contents through a local call.

So without further ado, you might all be wondering where you can download this awesome tool. Well, it’s still extremely pre-alpha. It itself has XSS vulnerabilities! It needs to be worked on. But you can still grab the HTML and PHP files if you like.

I’ve already had a fair amount of interest from people who want to help, so if you’d like commit privileges, please let me know. You can check out the trunk in the meantime.

Challenge/Response Email Verification

Challenge/Response email verification (CREV) is a mechanism for reducing the amount of spam you get. It works like this:

  • Alice sends Bob an email,
  • Bob uses a CREV system, and this is the first email he’s received from Alice,
  • The CREV system holds the email and sends an email to Alice asking her to reply or follow a link to verify she is a real person before the email can be released and sent to Bob,
  • Alice replies or follows the link,
  • The CREV system adds Alice to a white-list (so she won’t be asked to verify herself again) and releases the email for delivery to Bob,
  • Bob receives the email.
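The flow above can be sketched as a toy Python model. The `CrevMailbox` class and its method names are invented for illustration, and no real email is sent; `receive` just reports what a CREV system would do at that point.

```python
class CrevMailbox:
    """Toy model of a challenge/response mailbox for a single recipient."""

    def __init__(self):
        self.whitelist = set()  # senders who have already verified themselves
        self.held = {}          # sender -> messages waiting for verification
        self.inbox = []

    def receive(self, sender: str, message: str) -> str:
        if sender in self.whitelist:
            self.inbox.append(message)
            return "delivered"
        # First contact: hold the message and challenge the sender.
        self.held.setdefault(sender, []).append(message)
        return "challenge sent"

    def verify(self, sender: str) -> None:
        """Called when the sender replies to the challenge or follows the link."""
        self.whitelist.add(sender)
        self.inbox.extend(self.held.pop(sender, []))


bob = CrevMailbox()
print(bob.receive("alice@example.com", "Hi Bob"))    # held, challenge goes out
bob.verify("alice@example.com")                      # Alice responds
print(bob.receive("alice@example.com", "Me again"))  # now white-listed
print(bob.inbox)
```

Note that nothing reaches `inbox` until `verify` is called, which is the step that depends on Alice noticing and answering the challenge.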

Compare this with normal email systems:

  • Alice sends Bob an email,
  • Bob receives the email.

Looking at it this way, you might think “Cool! So much less spam! And Alice only has one extra step to allow her email to get through”.

Well, consider my way:

  • Alice sends Bob an email,
  • Bob’s mail server notices this is the first email he’s received from Alice,
  • Bob’s mail server tells Alice’s mail server it’s currently busy, and could the email be sent again in 5 minutes (this is referred to as grey-listing),
  • Alice’s mail server holds onto the email and resends 5 minutes later,
  • Bob’s mail server accepts her email on the second attempt – any subsequent emails from Alice to Bob will be immediately accepted in future,
  • Bob receives the email.

This method doesn’t require any extra work on Alice’s behalf, and when implemented in conjunction with other anti-spam mechanisms (such as checking sending mail servers against black lists, which I didn’t include in my flow because this can also be used with CREV systems) cuts down spam enormously. For example, I got 1 spam email yesterday.

You might argue that CREV systems would cut that down to zero spam, but this is not the case. CREV will only allow emails to a user from a given email address. If you receive a spam that appears to come from a white-listed address, it will still get through. This is more likely than you might expect, as many spam and virus-laden emails are sent through spyware applications that email users in the infected person’s address book, which means they come from someone you know. Neither grey-listing nor CREV systems will stop this type of spam.

So what’s wrong with CREV systems? In my opinion, it’s a poor implementation because the challenge/response requirement means the sender must take action to ensure the email gets through. Imagine a scenario: you’re at the airport, your flight is about to leave. You have to email a document to a client that you haven’t emailed before, and they require it by close of business that day. You hit send, you shut down your laptop and board for a 16-hour flight, only to get to the end and find the challenge/response email. Your email has not been delivered, and the client will not get the document they required until the next business day.

With my implementation, you hit send, you shut down your laptop and board the plane. Your email reaches their server, and they pretend to be busy. Your email is resent automatically by your mail server 5 minutes later. The client gets the document 5 minutes after you sent it. All’s well.

Update

I thought I should explain more about why grey-listing works. In the example above, Alice’s mail server correctly retries sending the email to Bob after the 5 minute period. If Sam the spammer sends Bob (or Alice) an email, his mail server will likely ignore the request to resend in 5 minutes. All Sam wants to do is pump out as many emails as possible before his mail server is black-listed. As the email is never resent, it never gets delivered.
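Here is a minimal Python sketch of that behaviour. It is a toy, not the configuration of any real mail server: the `Greylist` class is a made-up name, it keys on the (sender, recipient) pair only, and the “try again later” result stands in for the temporary-failure reply a real server would give over SMTP.

```python
class Greylist:
    """Toy greylist keyed on (sender, recipient)."""

    def __init__(self, delay_seconds: int = 300):
        self.delay = delay_seconds
        self.first_seen = {}  # (sender, recipient) -> time of first attempt
        self.known = set()    # pairs that have already passed greylisting

    def check(self, sender: str, recipient: str, now: float) -> str:
        pair = (sender, recipient)
        if pair in self.known:
            return "accept"
        # Remember when this pair first turned up (no-op on later attempts).
        first = self.first_seen.setdefault(pair, now)
        if now - first >= self.delay:
            self.known.add(pair)
            return "accept"
        return "try again later"


greylist = Greylist()
print(greylist.check("alice", "bob", now=0))    # first attempt is deferred
print(greylist.check("alice", "bob", now=300))  # the retry gets through
print(greylist.check("alice", "bob", now=301))  # later mail is not delayed
print(greylist.check("sam", "bob", now=0))      # deferred; Sam never retries
```

Sam’s message is only ever deferred, never rejected outright. It fails to arrive simply because his software doesn’t bother to come back.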

And purely for interest’s sake, here are some other checks my mail servers perform after the grey-listing process before allowing email through:

  • Check the remote mail server communicates using the correct protocol,
  • Check the remote mail server is not black-listed,
  • Check the email address of the sender is valid (this checks that the sender’s mail server will accept email to this address, not just that the address is correctly formed),
  • Check the sender’s computer or gateway is not black-listed,
  • Check the email isn’t identified as spam by a Bayesian spam filter.

If these checks pass, the email gets delivered.