tomasdev

web development handcrafted

JS and PHP Unicode characters map

17 May, 2011. Written by Tom Roggero

From Wikipedia:

Unicode is a computing industry standard for the consistent encoding, representation and handling of text expressed in most of the world's writing systems

So I was wondering how Spanish characters (such as á, é, í, ó, ú, ñ) could be set as the value of an input through JavaScript (the same idea applies to the text of other elements, such as the paragraph below). Given the following HTML, I tried this (using jQuery):

 
<input type="text" id="name" />
<p id="paragraph">Some random text that will be replaced</p>
 
 
// document ready
$(function(){
    // set input value with special characters
    $("#name").val("Tomás");
    // set paragraph "Hello, how are you?" in Spanish
    $("#paragraph").text("Hola, cómo estás?");
});
 

Which, obviously, didn't work. It could have been something related to the encoding of the file (such as ANSI versus UTF-8 problems)... Anyway, I found a nice global solution: use Unicode escape sequences!

 
// document ready
$(function(){
    // set input value with special characters
    $("#name").val("Tom\u00E1s");
    // set paragraph "Hello, how are you?" in Spanish
    $("#paragraph").text("Hola, c\u00F3mo est\u00E1s?");
});
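
Looking up each escape by hand gets tedious, so here is a minimal sketch of a helper that produces them. The name toUnicodeEscapes is hypothetical (it is not part of jQuery or of the language), and the sketch assumes the input string itself was read correctly, so it is best run from a file or console that is known to be UTF-8.

```javascript
// Hypothetical helper: replaces every non-ASCII UTF-16 code unit
// with its \uXXXX escape sequence and leaves ASCII untouched.
function toUnicodeEscapes(text) {
    return text.replace(/[^\x00-\x7F]/g, function (ch) {
        var hex = ch.charCodeAt(0).toString(16).toUpperCase();
        // pad to four hex digits, e.g. "E1" becomes "00E1"
        return "\\u" + ("0000" + hex).slice(-4);
    });
}

console.log(toUnicodeEscapes("Tomás"));             // Tom\u00E1s
console.log(toUnicodeEscapes("Hola, cómo estás?")); // Hola, c\u00F3mo est\u00E1s?
```

The output can be pasted straight into a string literal in a script, whatever encoding that script ends up being saved in.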
 

Note that \u escapes behave the same way in single-quoted and double-quoted JavaScript strings. You can use them with alert() or anywhere else in JS where you need non-English characters (Latin languages such as French, Portuguese and Spanish share some of them).
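
How quote style interacts with \u escapes in JavaScript is easy to verify directly. The sketch below only compares string literals; the variable names are illustrative.

```javascript
// Both literals use the same escape; only the quote style differs.
var doubleQuoted = "Tom\u00E1s";
var singleQuoted = 'Tom\u00E1s';

console.log(doubleQuoted === singleQuoted); // true
console.log(singleQuoted.length);           // 5

// A doubled backslash is what keeps the escape literal, in either quote style.
var literal = 'Tom\\u00E1s';
console.log(literal.length);                // 10
```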

If that is not enough for you, character assassin, I will give you more. After a while, I wondered whether there was something similar for PHP. Why? My problem was sending special characters in an email subject or sender name. For the email body you can set the content type to HTML and then use htmlentities() for the encoding. That magic function can be applied to the subject and headers as well, though mail clients generally do not decode HTML entities there; the standard way to carry non-ASCII text in headers is a MIME encoded-word, which PHP can produce with mb_encode_mimeheader(). If htmlentities() still gives you trouble, pass the character encoding explicitly as the third parameter, as it might fix your issue:

 
$subject = "Hola España";
$subject = htmlentities($subject, ENT_QUOTES, "UTF-8");
 

You can check a list of some Unicode character values on Wikipedia, or see a full, awesome, complete map of all Unicode characters between 000000 and 10FFFF (remember, those values are hexadecimal).
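
Since that range goes well past FFFF, here is a small sketch of how the hexadecimal values from such a map turn into JavaScript strings. It assumes an ES2015 (or later) engine, because \u{...}, codePointAt() and String.fromCodePoint() are not available in older ones; the variable names are illustrative.

```javascript
// Characters up to U+FFFF fit in a single \uXXXX escape.
var enye = "\u00F1"; // ñ
console.log(enye.codePointAt(0).toString(16)); // f1

// Above U+FFFF, ES2015 offers the \u{...} form...
var grin = "\u{1F600}";

// ...which yields the same string as the surrogate pair written out by hand.
console.log(grin === "\uD83D\uDE00");                 // true
console.log(grin.length);                             // 2 (UTF-16 code units)
console.log(String.fromCodePoint(0x1F600) === grin);  // true
```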

2 Comments /

Mathias Bynens
30/11/2011

It is important to mention that \u escaping will not work with single quote strings.

Huh?
