4 03 2013
Double encoding: URI and HTML encoding
URLs contain characters with special meaning, such as &. If you need one of them as part of a value in your GET URI, you have to encode it. For example:
http://localhost?key=this & that&key2=value2
It’s obvious that this URL is invalid: this & that contains both spaces and the special character &. In fact, you may have even noticed your browser treating it as invalid.
To get around this you can URI encode the values in your URLs, which replaces each special character with a percent-escape written in plain ASCII (for example, & becomes %26).
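As a rough illustration of the URI encoding step, here is a minimal Python sketch using only the standard library. Python is my choice for the example, not something the original post uses, and the variable names are made up. It parses the unencoded query string from above, then encodes the same values and parses them again.

```python
from urllib.parse import parse_qs, urlencode

# Unencoded: the raw "&" inside the value is read as a parameter separator,
# so the value of "key" is cut short.
raw_query = "key=this & that&key2=value2"
parsed_raw = parse_qs(raw_query)

# Encoded: urlencode() escapes each value ("&" becomes "%26", space becomes "+").
encoded_query = urlencode({"key": "this & that", "key2": "value2"})
parsed_encoded = parse_qs(encoded_query)

print(parsed_raw)      # the value of "key" stops at the raw "&"
print(encoded_query)   # key=this+%26+that&key2=value2
print(parsed_encoded)  # "key" round-trips as "this & that"
```

Note that urlencode writes spaces as + (form-style encoding), which matches the + used later in this post. Other encoders may emit %20 instead.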
But there is also HTML encoding, which means escaping special HTML characters when you put text into HTML. For example:
<p>6 > 7</p>
This isn’t safe, since > is a special character in HTML. To get around this, you need to HTML encode your text. HTML encoding replaces each special character with a character entity, such as &gt; for >. HTML encoding the above example would make your text:
<p>6 &gt; 7</p>
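The same step can be sketched in Python with the standard library's html module. This is only an illustration; the original post does not say which language or library it used.

```python
from html import escape, unescape

text = "6 > 7"

# escape() replaces &, <, > (and, by default, quote characters) with entities.
encoded = escape(text)
print(f"<p>{encoded}</p>")  # <p>6 &gt; 7</p>

# unescape() reverses the encoding, which is roughly what a browser does
# when it renders the paragraph.
decoded = unescape(encoded)
print(decoded)  # 6 > 7
```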
But sometimes you want to dynamically insert some HTML that also contains a URL into a page. Here you need to double encode: URI encode the values in the URL and HTML encode the markup. The ordering matters. Let’s take this HTML text as the source:
<a href="me.aspx?Filename=Anton's Document" />
And let’s encode it first with HTML encoding, then with URI encoding:
html encoded: <a href="me.aspx?Filename=Anton&apos;s Document" />
uri encoded: <a href="me.aspx?Filename=Anton%26apos;s+Document" />
But what if we do it the other way around?
uri encoded: <a href="me.aspx?Filename=Anton's+Document" />
html encoded: <a href="me.aspx?Filename=Anton&apos;s+Document" />
See the difference? Look at where the apostrophe would be:
invalid: %26apos;
valid: &apos;
The first example will give you an invalid URL, but the second example is the URL you want. URL and HTML encodings aren’t interchangeable: each is used for a specific scenario, and sometimes they need to be used together, in the right order (URI encode first, then HTML encode).
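To see why the order matters, here is a small Python sketch of both orderings. The function names (right_order, wrong_order, received_by_server) are hypothetical helpers for this illustration, and received_by_server is only a rough stand-in for what a browser and a server do when the link is followed. Two assumptions to flag: the apostrophe is passed as a safe character so the URI step leaves it alone, as in the example above (by default Python would write it as %27), and Python's html.escape writes the apostrophe as &#x27; rather than &apos;.

```python
from html import escape, unescape
from urllib.parse import quote_plus, unquote_plus

filename = "Anton's Document"

def right_order(value):
    # URI encode the value first, then HTML encode the whole attribute value.
    url = "me.aspx?Filename=" + quote_plus(value, safe="'")
    return escape(url)

def wrong_order(value):
    # HTML encode first, then URI encode: the "&" of the entity becomes "%26".
    return "me.aspx?Filename=" + quote_plus(escape(value))

def received_by_server(href_attribute):
    # The browser decodes the HTML attribute, then the server decodes the query value.
    url = unescape(href_attribute)
    return unquote_plus(url.split("=", 1)[1])

good = right_order(filename)
bad = wrong_order(filename)

print(good)                      # me.aspx?Filename=Anton&#x27;s+Document
print(bad)                       # me.aspx?Filename=Anton%26%23x27%3Bs+Document
print(received_by_server(good))  # Anton's Document
print(received_by_server(bad))   # Anton&#x27;s Document
```

With the wrong order, the HTML entity survives as literal text inside the file name, so the server ends up looking for a document that does not exist.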