The problem as I see it
Where to start? Let me start by telling you that most of the books you read are wrong. The code samples you copy off the internet to do a specific task are wrong (the wrong way to handle a GET request), and the function you copied from a work colleague, who in turn copied it from a forum, is wrong (the wrong way to handle redirects). Start to question everything. Maybe this post is wrong; that is the kind of mindset you need in order to protect your sites from XSS. You as a developer need to start thinking more about your code. If an article you are reading contains stuff like echo $_GET or Response.Write without filtering, then it’s time to close that article.
Are frameworks the answer? In my honest opinion, no. Yes, a framework might prevent XSS in the short term, but in the long term the framework code will be proven to contain mistakes as it evolves, and when it is exploited the result will be more severe than if you had written the code yourself. Why more severe? A framework hole can be easily automated since many sites share the same codebase. If you wrote your own filtering code, then an attacker would be able to exploit the individual site but would find it hard to automate an attack across a range of sites using different filtering methods.
This is one of the main reasons the internet works today: not because everything is secure, but because everything is different.
One of the arguments I hear is that a developer can’t be trusted to create a perfect filtering system for a site, and that using a framework ensures the developer follows best practice. I disagree. Developers are intelligent; they write code and understand code. If you can build a system, you can protect it, because you’re in the best position to.
How to handle input
When you handle user input, just think to yourself “a number is a vector”. Imagine a site that renders an image server side and allows you to choose the width and height of the graphic. If you don’t think a number is a vector, then you might not put any restrictions on the width and height of the generated graphic, but what happens when an attacker requests a 100000×100000 graphic? If your code doesn’t handle the maximum and minimum inputs, then an attacker can DoS your server with multiple requests. The lesson is not to be lazy about each input you handle; you need to make sure each value is validated correctly.
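As a rough illustration of treating a number as a vector, here is a minimal sketch of a range check for the image dimensions. The helper name validateDimension and the limits of 1 and 2000 pixels are assumptions made up for this example; pick limits that suit your own renderer.

```php
<?php
// Hypothetical helper: returns the dimension as an int, or null if the input is unacceptable.
// The default limits (1 and 2000) are assumptions for illustration only.
function validateDimension($raw, int $min = 1, int $max = 2000): ?int
{
    // Type and whitelist: accept only a plain string of 1 to 5 digits
    // (no arrays, signs, decimals or whitespace).
    if (!is_string($raw) || !preg_match('/^[0-9]{1,5}$/D', $raw)) {
        return null;
    }
    $value = (int) $raw;
    // Restrict: reject anything outside the allowed range.
    if ($value < $min || $value > $max) {
        return null;
    }
    return $value;
}

var_dump(validateDimension('640'));    // int(640)
var_dump(validateDimension('100000')); // NULL (too many digits, and far too large)
var_dump(validateDimension(['640']));  // NULL (an array, not a string)
```

In a real page you would pass in something like $_GET['width'] ?? null and refuse to render when the result is null.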
The process should be as follows.
1. Validate type – Ensure the value you are getting is what you were expecting.
2. Whitelist – Remove any characters that should not be in the value by specifying the only characters that are allowed.
3. Validate Length – Always validate the length of the input, even when the value isn’t being placed in the database. The less an attacker has to work with, the better.
4. Restrict – Refine what’s allowed within the range of characters you allow. For example, is the minimum value 5?
5. Escape – Depending on context (where your variable is on the page) escape correctly.
You can make things easier for yourself by placing these methods into a function or a class, but don’t overcomplicate. Keep each method as simple as possible, and be very careful and descriptive with your function names to avoid confusion.
HTML context
Let’s look at an example of the method above with a code sample in PHP.
<?php
$x = (string) $_GET['x']; // ensure we get a string, not an array
$x = preg_replace("/[^\w]/", "", $x); // remove any characters that are not a-z, A-Z, 0-9 or _
$x = substr($x, 0, 10); // restrict to a maximum of 10 characters
if (!preg_match("/^a/i", $x)) { // this value must begin with a or A
    $x = '';
}
echo '<b>' . htmlentities($x, ENT_QUOTES) . '</b>'; // escape everything according to the context of $x
?>
You might be wondering why I used (string) in the code above. Let’s try it without it.
Using the following: test.php?x[]=123
Results in: “Warning: substr() expects parameter 1 to be string, array given”
Because PHP allows arrays to be passed in a GET request, an attacker can trigger a warning about an unexpected type when your code tries to restrict the value. Using a type cast ensures you get the expected type.
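One thing to keep in mind is that casting an array with (string) gives you the literal string "Array", and PHP raises an “Array to string conversion” notice or warning depending on the version. If you would rather reject the request outright, a stricter check is possible. The following is only a sketch; getStringParam is a hypothetical helper name, and parse_str is used here simply to simulate what PHP would put in $_GET.

```php
<?php
// Hypothetical helper: return a query parameter only if it really is a string.
function getStringParam(array $source, string $name): ?string
{
    if (!isset($source[$name]) || !is_string($source[$name])) {
        return null; // missing, or an array such as x[]=123
    }
    return $source[$name];
}

// Simulated query strings; parse_str() builds the same structure PHP puts in $_GET.
parse_str('x[]=123', $arrayQuery);
parse_str('x=abc', $stringQuery);

var_dump(getStringParam($arrayQuery, 'x'));  // NULL
var_dump(getStringParam($stringQuery, 'x')); // string(3) "abc"
```

The remaining steps (whitelist, length, restrict, escape) still apply to the string that comes back.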
Great, so we now understand how to restrict and escape a value. Let’s look at another context.
Script context
When not in XHTML/XML mode, a script tag does not decode HTML entities. If you have a value within a variable inside a script tag, the question is: what do you escape?
Example:
<script>x='value here';</script>
Inside a JavaScript variable like this you have to watch out for the following: ' and </script>. Using these vectors it’s possible to XSS the value.
The two examples are listed below.
Vector 1: ',alert(1)//
Vector 2: </script><img src=1 onerror=alert(1)>
The second example requires no quotes, and a lot of developers assume it won’t be executed because it’s still inside a JavaScript variable. This is clearly wrong: it executes because the HTML parser ends the script block at the first </script> it finds, without regard to whether that text sits inside a JavaScript string.
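To see why escaping quotes alone is not enough here, consider this deliberately unsafe sketch. The function naiveScriptEmbed is a hypothetical example written for this illustration, not something to use; it relies only on PHP’s addslashes(), which backslash-escapes quotes, backslashes and NUL bytes.

```php
<?php
// Hypothetical, deliberately UNSAFE helper used only to illustrate the problem.
function naiveScriptEmbed(string $value): string
{
    // addslashes() escapes ' " \ and NUL, and nothing else.
    return "<script>x='" . addslashes($value) . "';</script>";
}

// A quote-based payload is neutralised because the quote gains a backslash.
echo naiveScriptEmbed("',alert(1)//"), "\n";

// The second payload contains nothing addslashes() touches, so it passes through
// unchanged and its </script> closes the script block early.
echo naiveScriptEmbed("</script><img src=1 onerror=alert(1)>"), "\n";
```

The second line of output contains two closing script tags, and the browser stops the script at the first one.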
OWASP Newsletter [ November 2011 ] 5
To escape a value inside a script context you should JavaScript escape the value.
The best way of doing this is using Unicode escapes. A Unicode escape in JavaScript looks like the following:
<script> alert('\u0061'); // "a" as a Unicode escape </script>
You can experiment with Unicode escapes using my Hackvertor tool. Please understand how they work, as they will be very important to you when understanding how to protect many contexts.
It’s very important you follow the same procedure as before (Validate type, Whitelist, Validate Length, Restrict, Escape) for the specific variable you’re working on, but this time we will convert our value into Unicode escapes. A simple function to do that is as follows:
<?php
function jsEscape($input) {
    if (strlen($input) == 0) {
        return '';
    }
    $output = '';
    // strip NUL bytes and anything outside the ASCII range
    $input = preg_replace("/[^\\x01-\\x7F]/", "", $input);
    $chars = str_split($input);
    for ($i = 0; $i < count($chars); $i++) {
        $char = $chars[$i];
        if (preg_match("/^\t$/", $char)) {
            $output .= '\\t'; // use the shorter \t instead of a Unicode escape; the backslash is escaped for PHP
            continue; // move on to the next char
        }
        $output .= sprintf("\\u%04x", ord($char)); // every other char becomes \uXXXX
    }
    return $output;
}
?>
I’ve purposely designed this function with a few little optimisations missing. For example, instead of using Unicode escapes you could use hex escapes, since we restrict the range of allowed characters; alphanumeric characters are converted even though they could be left as their literal characters; and new lines are encoded in full when you could use the shorter equivalent. Why would you want to do this? To reduce the characters sent down the wire. The function above already includes one such line for tabs, which emits the shorter \t escape instead of \u0009:
<?php
if (preg_match("/^\t$/", $char)) {
    $output .= '\\t';
    continue;
}
?>
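For comparison, here is a sketch of what the function might look like with the optimisations mentioned above applied. The name jsEscapeCompact is hypothetical, and this is just one possible way to do it: alphanumeric characters stay literal, tabs and new lines use their short escapes, and everything else becomes a hex escape.

```php
<?php
// Hypothetical compact variant of jsEscape(): same ASCII-only input, shorter output.
function jsEscapeCompact(string $input): string
{
    // Strip NUL bytes and anything outside the ASCII range, as jsEscape() does.
    $input = preg_replace('/[^\x01-\x7F]/', '', $input);
    $output = '';
    $length = strlen($input);
    for ($i = 0; $i < $length; $i++) {
        $char = $input[$i];
        if (preg_match('/^[a-zA-Z0-9]$/D', $char)) {
            $output .= $char;                          // letters and digits stay literal
        } elseif ($char === "\t") {
            $output .= '\\t';                          // short escape for a tab
        } elseif ($char === "\n") {
            $output .= '\\n';                          // short escape for a new line
        } else {
            $output .= sprintf('\\x%02x', ord($char)); // hex escape, e.g. \x27 for '
        }
    }
    return $output;
}

echo jsEscapeCompact("a'b</script>\t"), "\n"; // prints a\x27b\x3c\x2fscript\x3e\t
```

The output is shorter than the \uXXXX form, yet the quote and the angle brackets still never reach the page as literal characters.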
by : Gareth Heyes