What's the significance of UTF-8 charset in HTML?

I don’t understand it yet and would appreciate if someone explained it to me. I see it on editors, but don’t really understand why.

UTF-8 allows for a much broader set of possible characters than ASCII, which covers only 128 characters and was the de facto character set for the web in its early days in the 1990s and 2000s.

It lets international characters be recognized and displayed properly by using up to four bytes for each character. Strictly speaking, UTF-8 is an encoding of the Unicode character set, which is why you will sometimes see it loosely called “Unicode”.
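To make the “up to four bytes” point concrete, here is a small Python sketch that counts how many bytes UTF-8 uses for a few sample characters. The helper name `utf8_length` and the sample characters (including the emoji) are just my own picks for illustration.

```python
# Count how many bytes UTF-8 needs for a single character.
# utf8_length is a hypothetical helper name used only for this illustration.
def utf8_length(ch: str) -> int:
    return len(ch.encode("utf-8"))

# Sample characters chosen for illustration: Latin, Cyrillic, Gurmukhi, emoji.
samples = ["A", "Ё", "ਅ", "😀"]

for ch in samples:
    print(f"U+{ord(ch):04X} takes {utf8_length(ch)} byte(s) in UTF-8")
```

Plain ASCII letters still take a single byte, so existing English text does not get any bigger when it is saved as UTF-8.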

It allows for things like the following, which cannot be represented in ASCII:
Ա Բ Գ Դ Ե ԶՀ Ձ Ղ Ճ Մ Յ Ն Շ
Ё Ђ Ѓ Є Ѕ І Ї Ј Љ Њ Ћ Ќ Ў Џ А Б В Г
ਂ ਅ ਆ ਇ ਈ ਉ ਊ ਏ ਐ ਓ ਔ ਕ ਖ ਗ ਘ ਙ ਚ ਛ ਜ ਝ
ༀ ༁ ༂ ༃ ༄ ༅ ༆ ༇ ༈ ༉ ༊ ་ ༌ ། ༎ ༏ ༐ ༑
豈 更 車 賈 滑 串 句 龜 龜 契 金
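If you want to check this for yourself, the rough Python sketch below tries to encode some text as ASCII and reports whether it fits. The function name `fits_in_ascii` is made up for this example, and the sample strings are only illustrations.

```python
# Report whether a piece of text can be stored as plain ASCII.
# fits_in_ascii is a hypothetical helper name used only for this illustration.
def fits_in_ascii(text: str) -> bool:
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        # Raised when a character falls outside ASCII's 128 code points.
        return False

print(fits_in_ascii("Hello"))   # plain English letters
print(fits_in_ascii("Ё Ђ Ѓ"))   # Cyrillic letters like the ones listed above
```

The same text encodes without complaint if you swap `"ascii"` for `"utf-8"`.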

You can read more about the specifics in the Unicode FAQ:

http://www.unicode.org/faq/utf_bom.html#UTF8


Even if you don’t need to display foreign languages on your site, one practical application is that it allows the display of “curly” single and double quotation marks (‘ ’ and “ ”) and long dashes like —

as opposed to using the plain ' and " and the hyphen -

If you have a publication website (newspaper, magazine, blog), you’d want the UTF-8 encoding.
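As a rough illustration of why declaring the charset matters, the Python sketch below simulates what can happen when a page saved as UTF-8 gets read with a different charset. I am assuming Windows-1252 as the wrong guess here, and `misread_as_cp1252` is a made-up helper name; real browsers may guess differently.

```python
# Simulate text that was saved as UTF-8 but then read as Windows-1252.
# misread_as_cp1252 is a hypothetical helper name used only for this illustration.
def misread_as_cp1252(text: str) -> str:
    return text.encode("utf-8").decode("cp1252")

garbled = misread_as_cp1252("“")  # a left curly double quote
print(garbled)  # prints: â€œ
```

That three-character garbage in place of one curly quote is the kind of thing you see on pages where the declared charset and the actual file encoding disagree.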