What’s a Unicode Font?

Created: Wednesday, November 14, 2018, posted by Geetesh Bajaj at 9:30 am

If you are confused about what Unicode is, what Unicode fonts are, and where single-byte and double-byte fonts fit in, then here’s a simple explanation of Unicode and Unicode-encoded fonts. We will touch on single-byte and double-byte fonts in this post, and also link to another post about them.

What is Unicode?

First, let us explore what Unicode is. Unicode is a standard that provides a unique number for every character, known as the Unicode character code. For example, look at the Symbol dialog box in Microsoft PowerPoint. You can see that the lowercase ‘a’ character is represented by Unicode character code 0061, as shown highlighted in red in Figure 1, below.

Figure 1: Symbol dialog box in PowerPoint
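
You can check the same character code outside PowerPoint as well. Here is a minimal Python sketch; the helper name char_code is made up for this example and is not part of any library.

```python
def char_code(ch: str) -> str:
    """Return the Unicode character code of one character as hex digits."""
    # ord() gives the character's number; 04X pads it to at least four hex digits.
    return f"{ord(ch):04X}"


print(char_code("a"))  # 0061
```

The four-digit, zero-padded format simply mirrors how the Symbol dialog box displays the code.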

Before the Unicode standard, any font foundry could use its own proprietary standards, and even the same foundry could use more than one standard. Predictably, this caused a lot of confusion.

Now, you can insert any character, even one that you don’t see on your keyboard, if you know its Unicode character code. For example, if you know that the Unicode character code for the dollar sign is 0024, you can just type 0024 in the Character Code box in the Symbol dialog box (see Figure 1), and click the Insert button. The dollar sign is inserted at your active insertion point in PowerPoint.
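
The same lookup works in the other direction in code. This is only a sketch, and char_from_code is a hypothetical helper name: it treats the text you would type in the Character Code box as a hexadecimal number and returns the matching character.

```python
def char_from_code(code: str) -> str:
    """Turn a hexadecimal character code such as "0024" into its character."""
    # int(code, 16) reads the text as a base-16 number; chr() maps it to a character.
    return chr(int(code, 16))


print(char_from_code("0024"))  # $
```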

You may also hear about another standard called ASCII. Do note that the ASCII standard is a subset of the Unicode standard. ASCII allows far fewer characters than Unicode (only 128), so Unicode simply absorbs all ASCII characters and builds much further.
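
To see the subset relationship for yourself, here is a small Python sketch. Both function names are invented for this example. It checks that each of the 128 ASCII codes maps to the character with the same Unicode character code.

```python
def ascii_matches_unicode() -> bool:
    """Check that every ASCII code equals the Unicode code of the same character."""
    # chr(code) is the character at that Unicode code; encoding it as ASCII
    # should give back a single byte with the very same number.
    return all(chr(code).encode("ascii") == bytes([code]) for code in range(128))


def in_ascii(ch: str) -> bool:
    """Return True if the character falls inside the 128-character ASCII range."""
    return ord(ch) < 128


print(ascii_matches_unicode())  # True
print(in_ascii("a"))            # True
print(in_ascii("€"))            # False
```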

What is a Unicode Font?

Although you may imagine that any font that uses Unicode character codes would be a Unicode font, that’s not exactly true. That’s because the Unicode standard is intended to cover every character in every known script. That’s a very ambitious aim, and no single font possesses all the characters prescribed in the Unicode standard. Did you know that the Unicode 8.0 standard contains 120,737 characters?

But most fonts do not need so many characters. They work happily within the constraints of the ASCII standard. Depending on whether they use single-byte encoding or double-byte encoding, they are just known as single-byte and double-byte fonts.
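
To get a feel for what single-byte and double-byte mean, here is a rough Python sketch. It uses two legacy text encodings, cp1252 and shift_jis, purely as stand-ins chosen for this illustration; this post does not name specific encodings, and byte_length is a made-up helper name.

```python
def byte_length(ch: str, encoding: str) -> int:
    """Return how many bytes one character occupies in the given encoding."""
    return len(ch.encode(encoding))


# Assumption: cp1252 stands in for a single-byte encoding here.
print(byte_length("a", "cp1252"))      # 1
# Assumption: shift_jis stands in for a double-byte encoding here.
print(byte_length("あ", "shift_jis"))  # 2
```

This only shows how many bytes a character takes in an encoding. It does not inspect any font file.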

So are any Unicode-encoded fonts installed as part of Microsoft Windows? Yes, indeed. Arial, Times New Roman, and some other fonts are Unicode-encoded fonts because they go beyond the limits of ASCII-encoded fonts.

To summarize, Unicode fonts are named after the Unicode standard, which stipulates a Unicode character code for each character.

Unicode’s latest standards attempt to encompass every known character or glyph in every language and provide each of them with a unique Unicode character code.

Characters are sometimes called glyphs, but there is a small difference between these terms. The important part is that you cannot expect every glyph to be present in a font that follows Unicode standards. The plain Arial font that ships with Windows contains 3,988 glyphs, while Arial Unicode MS has a much larger 50,377 glyphs. Google’s Noto, on the other hand, possesses 65,535 glyphs. All three are Unicode fonts, but from the number of glyphs that each contains, you can understand that not all Unicode fonts are created equal.

Are Glyphs and Characters Different?

For all practical purposes, a glyph and a character could be the same, but can’t you already hear purists denouncing this simple explanation? So, let’s get into some detail.

Have you seen fonts that have ligatures that combine two characters, such as “fi” or “fl”, as shown in Figure 2, below? In such cases, “fi” is a single glyph composed of two characters, “f” and “i”. That’s the reason a Unicode font typically has more glyphs than characters.

Figure 2: With and without ligatures

Yes, this is not a complete explanation of differences between glyphs and characters, but it does help you understand that they are not the same.
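
A short Python sketch can show the character side of this difference. Text containing “fi” always holds two characters, no matter which glyph a font chooses to draw. As a side note that goes beyond this post, Unicode also carries a legacy ligature character, U+FB01, that breaks back down into the same two characters.

```python
import unicodedata

text = "fi"
print(len(text))  # 2: two characters, however a font decides to draw them

ligature = "\ufb01"  # a single legacy character that looks like the "fi" ligature
print(unicodedata.name(ligature))               # LATIN SMALL LIGATURE FI
print(unicodedata.normalize("NFKD", ligature))  # fi
```

Whether a font actually shows a ligature glyph for “fi” depends on the font and the application, and this code cannot tell you that.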

So do all fonts use Unicode character codes? Not at all. In fact, very few fonts do so. All remaining fonts can be further divided into single-byte and double-byte fonts. To repeat, the character codes they use are essentially ASCII, and since Unicode is built on top of the ASCII standard, you may imagine that these fonts use Unicode character codes. But that’s not the real story!


Thanks to Steve Rindsberg, who helped me make this post simpler and more logical.


Microsoft and the Office logo are trademarks or registered trademarks of Microsoft Corporation in the United States and/or other countries.


© 2000-2018, Geetesh Bajaj - All rights reserved.
