Wednesday, December 02, 2015

[Browsers] Inconsistent HTML Entities

I fancy that this is old hat to most of you, but I had thought HTML entities cut and dried: an ampersand, a word and a semicolon (reminds me a little of dBase III, actually). It seems, however, that certain browsers are still unsure as to what constitutes an HTML entity, and where it should end.

One of the languages I use for server-side management uses the ampersand + word + semicolon format to mark where variables may be interpolated into the output stream. So it was with some surprise that I heard from a colleague that &currentValue; was being rendered as ¤tValue;.
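To see the mechanism outside a browser, here is a minimal sketch in Python. It assumes that the standard library's html.unescape, which is documented as following the HTML5 character reference rules, is a fair stand-in for what a browser does with text content; it is not the templating language mentioned above, and the variable names are only illustrative.

```python
import html

# Placeholder in the ampersand + word + semicolon style described in the post.
placeholder = "&currentValue;"

# "&curren" is one of the older names that is recognised even without its
# semicolon, so it is decoded and the rest is left behind as plain text.
decoded = html.unescape(placeholder)
print(decoded)  # ¤tValue;

# Escaping the ampersand first lets the placeholder survive the decoding step.
protected = html.escape(placeholder)
print(protected, "->", html.unescape(protected))
```

If the placeholder really has to reach the page as literal text, writing the ampersand as &amp; is the usual way around the problem.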

Being the inquisitive sort, I went to dev.w3.org and downloaded all the HTML entity names. After a bit of fiddling about in jQuery, I had the list. After a bit more fiddling, this time in JScript, I ended up with an HTML file showing the entity name, the entity as rendered, and then what happens when you leave off the trailing semicolon and append some other text (in this case 'es'). This is where things get ... well ... unusual. Some HTML entities follow the standard. Others, like &curren (¤) and &pound (£), don't. You enter &poundage, expecting the browser to give &poundage, only to have it give you £age. Weirdness abounds. Take &centerdotes, which actually renders as ¢erdotes!
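The test described above can be approximated without a browser. The following is a rough sketch, not the jQuery and JScript I actually used: it assumes Python's html.entities.html5 table as the list of names and html.unescape as the decoder, and the function name semicolon_test is made up for the example.

```python
import html
from html.entities import html5

def semicolon_test(suffix="es"):
    """Yield (name, with_semicolon, without_semicolon) for each named entity."""
    for key in sorted(html5):
        if not key.endswith(";"):
            continue  # semicolon-less spellings are covered via their ";" form
        name = key[:-1]
        with_semicolon = html.unescape("&" + key)
        without_semicolon = html.unescape("&" + name + suffix)
        yield name, with_semicolon, without_semicolon

# Names whose semicolon-less form did not come back as the literal text.
surprises = {
    name: without
    for name, _, without in semicolon_test()
    if without != "&" + name + "es"
}

print(len(surprises), "names decode without a semicolon")
for name in ("pound", "curren", "centerdot", "hearts"):
    print(name, "->", surprises.get(name, "&" + name + "es (left alone)"))
```

In this sketch &centerdotes comes back as ¢erdotes because &cent is one of the names accepted without a semicolon, while &heartses is left untouched. A given browser may or may not agree, which is the point of the exercise.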

So as to give you an idea of how it looks in your browser, I've put the file up in a jsfiddle. Let me know how you get on and whether your browser is as compliant as it makes out.

Enjoy!


© Copyright Bruce M. Axtens, 2015