While working on VCFPorter (a project that will be out soon), I’ve run into the problem of trying to parse (and fully understand) how JavaScript handles UTF-8. I thought I had a pretty good grasp on things, but having to actually translate UTF-8 quoted-printable strings (part of the vCard spec) sent me scurrying to Google to look up how exactly to encode/decode UTF-8 strings in JavaScript. That’s always a good indicator that I don’t know enough about something.
While feeling a little frustrated that I didn’t understand JavaScript (and binary encodings) as well as I should have, I ran into the following great resources (some for the second/third time) that explain encoding, and more specifically UTF-8:
http://www.joelonsoftware.com/articles/Unicode.html (Joel writes fantastic articles, this is just one of many)
https://en.wikipedia.org/wiki/UTF-8 (detailed UTF-8 byte-by-byte description)
http://ecmanaut.blogspot.co.uk/2006/07/encoding-decoding-utf8-in-javascript.html (small code snippet showing how to translate to and from UTF-8 strings)
http://www.convertstring.com/EncodeDecode/QuotedPrintableDecode (for testing to find out what your string should look like)
Primarily the last one exposed a really awesome trick (API designers hate him!) for decoding a UTF-8 encoded binary string:
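As a minimal sketch of that trick: the function names `encodeUtf8` and `decodeUtf8` are just labels picked for illustration, and it leans on the legacy `escape`/`unescape` globals, which are deprecated but still available in browsers and Node. A "binary string" here means a string in which each character holds one byte (char codes 0 to 255).

```javascript
// Encode a normal JavaScript string into a "binary string" in which each
// character carries one UTF-8 byte (char codes 0-255).
// encodeURIComponent produces %XX escapes of the UTF-8 bytes, and
// unescape turns each %XX into a single character with that code.
function encodeUtf8(s) {
  return unescape(encodeURIComponent(s));
}

// Decode such a binary string back into a normal JavaScript string.
// escape turns each byte-character into %XX, and decodeURIComponent
// interprets the resulting %XX sequence as UTF-8.
function decodeUtf8(s) {
  return decodeURIComponent(escape(s));
}

const binary = encodeUtf8("藤森");
console.log(binary.length);       // 6 (two characters, three UTF-8 bytes each)
console.log(decodeUtf8(binary));  // 藤森
```

Note that `decodeUtf8` throws a `URIError` if the input is not valid UTF-8, which is worth catching when the data comes from an untrusted file.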
These two functions are what did the trick for me (primarily the decoding). I was happy to find the answer, but I think it took too long. Also, there’s quite a difference going from:
=E8=97=A4=E6=A3=AE
to
藤森
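To make that jump concrete, here is a rough illustration, not the VCFPorter code. It assumes the input is a single quoted-printable run with no soft line breaks (`=` at the end of a line), uses the standard `TextDecoder` API, and the function name `decodeQuotedPrintableUtf8` is hypothetical. Each `=XX` is one byte, and three bytes make up each of the two characters:

```javascript
// Sketch: turn a quoted-printable string into bytes, then decode the
// bytes as UTF-8. Soft line breaks are NOT handled here (assumption).
function decodeQuotedPrintableUtf8(qp) {
  const bytes = [];
  for (let i = 0; i < qp.length; i++) {
    const hex = qp.slice(i + 1, i + 3);
    if (qp[i] === "=" && /^[0-9A-Fa-f]{2}$/.test(hex)) {
      // "=E8" becomes the single byte 0xE8
      bytes.push(parseInt(hex, 16));
      i += 2;
    } else {
      // Anything else is a literal ASCII character
      bytes.push(qp.charCodeAt(i) & 0xff);
    }
  }
  return new TextDecoder("utf-8").decode(new Uint8Array(bytes));
}

console.log(decodeQuotedPrintableUtf8("=E8=97=A4=E6=A3=AE")); // 藤森
```

Here `E8 97 A4` is the UTF-8 encoding of 藤 (U+85E4) and `E6 A3 AE` is the encoding of 森 (U+68EE).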
Without further ado, here’s the code I had to write to accomplish this: