If you are confused about what Unicode is, what Unicode fonts are, and where single-byte and double-byte fonts fit in, then here’s a simple explanation of Unicode and Unicode-encoded fonts. We touch upon single-byte and double-byte fonts in this post, and also link to another post that covers them.
What is Unicode?
First, let us explore what Unicode is. Unicode is a standard that provides a unique number for every character, known as the Unicode character code.
For example, look at the Symbol dialog box in Microsoft PowerPoint. You can see that the lowercase ‘a’ is represented by Unicode character code 0061, as shown highlighted in red in Figure 1, below.
Figure 1: Symbol dialog box in PowerPoint
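You can check the same mapping outside PowerPoint. The short Python sketch below is only an illustration and assumes you have Python 3 available. It uses the built-in ord function to look up a character's code and prints it as four hexadecimal digits, the same form the Symbol dialog box uses. The helper name code_of is made up for this example.

```python
def code_of(character: str) -> str:
    """Return the Unicode character code as at least four uppercase hex digits."""
    # ord() gives the code as an integer; "04X" formats it as zero-padded hex.
    return format(ord(character), "04X")


print(code_of("a"))  # prints 0061
```

Any other character can be passed in the same way to see its character code.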
Before the Unicode standard, any font foundry could use its own proprietary standard, and even the same foundry could use more than one. Predictably, this caused a great deal of confusion.
How do I type in Unicode text?
Now, you can insert any character, even one that you don’t see on your keyboard, if you know its Unicode character code. For example, if you know that the Unicode character code for the dollar sign is 0024, you can just type 0024 in the Character Code box in the Symbol dialog box (see Figure 1), and click the Insert button. The dollar sign is inserted at your active insertion point in PowerPoint.
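The reverse lookup, from a character code to the character itself, can be sketched in Python as well. This is an illustration only, not something PowerPoint does internally. The helper name char_from_code is hypothetical, and it assumes the code is given as hexadecimal digits, as in the Character Code box.

```python
def char_from_code(hex_code: str) -> str:
    """Return the character for a Unicode character code written in hexadecimal."""
    # int(..., 16) reads the hex digits; chr() turns the number into a character.
    return chr(int(hex_code, 16))


print(char_from_code("0024"))  # prints $
```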
You may also hear about another standard called ASCII. Do note that the ASCII standard is a subset of the Unicode standard. ASCII defines only 128 characters, and Unicode keeps all of them at the same character codes and then builds much further.
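A quick way to see that ASCII sits inside Unicode is to decode every ASCII value and compare it with the Unicode character code for the same character. The Python sketch below does that. The names ascii_codes_match and is_ascii are chosen just for this example.

```python
# Decode every ASCII byte value (0 through 127) and compare each result
# with the Unicode character code that Python reports for it.
ascii_text = bytes(range(128)).decode("ascii")
ascii_codes_match = all(ord(ch) == code for code, ch in enumerate(ascii_text))


def is_ascii(character: str) -> bool:
    """Return True when the character's code falls inside the ASCII range."""
    return ord(character) < 128


print(ascii_codes_match)                 # prints True
print(is_ascii("$"), is_ascii("€"))      # prints True False
```

The dollar sign is part of ASCII, while the euro sign exists only in the larger Unicode set.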
What is a Unicode Font?
Although you may imagine that any font that uses Unicode character codes would be a Unicode font, that’s not exactly true. That’s because the Unicode standard is intended to cover every character in every known language script. That’s a very ambitious aim, and no single font contains all the characters prescribed in the Unicode standard. Did you know that the Unicode 8 standard contains 120,737 characters?
But most fonts do not need so many characters. They work happily within the constraints of the ASCII standard. Depending on whether they use single-byte encoding or double-byte encoding, they are just known as single-byte and double-byte fonts.
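To get a feel for what single-byte and double-byte mean, you can count how many bytes a character takes in a given encoding. The Python sketch below is an illustration under assumptions: the post does not name specific encodings, so cp1252 (a Western single-byte encoding) and shift_jis (a Japanese encoding that uses two bytes for kana and kanji) are picked here as examples, and the helper name byte_length is hypothetical.

```python
def byte_length(character: str, encoding: str) -> int:
    """Return how many bytes the character occupies in the given encoding."""
    return len(character.encode(encoding))


print(byte_length("a", "cp1252"))      # prints 1
print(byte_length("あ", "shift_jis"))  # prints 2
```

One byte can distinguish only 256 values, which is why scripts with thousands of characters needed two bytes per character.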
How do I use a Unicode font in Word or PowerPoint?
You use Unicode fonts in the same way as you use any other font. You may want to download and install a specific Unicode font, but it’s possible that some Unicode fonts are already installed on your system. So are any Unicode-encoded fonts installed as part of Microsoft Windows? Yes indeed. Arial, Times New Roman, and some other fonts are Unicode-encoded fonts because they go beyond the limits of ASCII-encoded fonts.
To recap, Unicode fonts are named after the Unicode standard that stipulates a Unicode character code for each character.
Unicode’s latest standards attempt to encompass every known character or glyph in every language and provide each of them with a unique Unicode character code.
Characters are sometimes called glyphs, but there is a small difference between these terms. The important part is that you cannot expect every glyph to be present in a font that follows the Unicode standard. The plain Arial font that ships with Windows contains 3,988 glyphs, while Arial Unicode MS has a larger number, 50,377 glyphs. Google’s Noto, on the other hand, possesses 65,535 glyphs. All three are Unicode fonts, but from the number of glyphs each contains, you can see that not all Unicode fonts are created equal.
Are Glyphs and Characters Different?
For all practical purposes, a glyph and a character could be the same, but can’t you already hear purists denouncing this simple explanation? So, let’s get into some detail.
Have you seen fonts with ligatures that combine two characters such as “fi” or “fl”, as shown in Figure 2, below? In such cases, “fi” is a single glyph composed of two characters, “f” and “i”. That’s why a Unicode font typically has more glyphs than characters.
Figure 2: With and without ligatures
Yes, this is not a complete explanation of differences between glyphs and characters, but it does help you understand that they are not the same.
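The “fi” example can be explored in Python too. Unicode happens to include a compatibility character for this ligature, LATIN SMALL LIGATURE FI at code FB01, and the standard unicodedata module can expand it back into its two characters. Treat this as a rough illustration only: fonts usually supply ligature glyphs through their own internal substitution, not through this character code.

```python
import unicodedata

ligature = "\ufb01"  # the single ligature character "fi"
# NFKC normalization replaces compatibility characters with their plain equivalents.
expanded = unicodedata.normalize("NFKC", ligature)

print(unicodedata.name(ligature))     # prints LATIN SMALL LIGATURE FI
print(len(ligature), len(expanded))   # prints 1 2
```

One shape on the page, two characters underneath.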
So do all fonts use Unicode character codes? Not at all. In fact, very few fonts do so. All remaining fonts can be further divided into single-byte and double-byte fonts. To repeat, the character codes they use are essentially ASCII, but since Unicode was built on top of the ASCII standard, you may imagine that these fonts use Unicode character codes. But that’s not the real story!
How do I download a Unicode font?
Thanks to Steve Rindsberg, who helped me make this post simpler and more logical.