Open type Fonts: A discussion
The conceptual basis for the Open type font

  Fonts are used when we display text in a computer application. The glyphs in a font correspond to the shapes of the letters and special symbols used in the writing system for a language. We generally associate a character code with a glyph, so that a text string specified by a series of character codes is displayed simply by placing the associated glyphs horizontally, one after the other.


  The text string should contain the necessary linguistic content for it to be processed consistent with the requirements of the application. The glyph string on the other hand, has meaning only for the display and the linguistic content is expected to be inferred by the person viewing the display.

  In most western writing systems, a letter of the alphabet is individually mapped to a shape, and so a one-to-one mapping exists between the characters in the text string and the glyphs in the display. Hence, given a text string, the glyph string is obtained by a simple table lookup where the table is kept as part of the font. Each character in the text string is identified with a name and the table merely maps a name to a glyph value. We have seen elsewhere that this is the principle of font encoding. Thus a True type font or a bit-mapped font will have a table inside, mapping each character name to an integer (usually an eight bit value). Displaying text involves graphically positioning the glyphs one after another in a horizontal sequence.
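  The one-to-one case can be sketched as a plain table lookup. The character names and glyph values below are made up for illustration and are not taken from any real font; a real encoding table would be read from the font file.

```python
# Hypothetical encoding table: character name -> glyph value (0..255).
ENCODING = {"A": 65, "B": 66, "space": 32}
NOTDEF = 0  # glyph value used when a character has no entry in the table


def glyphs_for(names):
    """Map each character name to exactly one glyph value."""
    return [ENCODING.get(name, NOTDEF) for name in names]


# One glyph per character, in the same order as the text string.
print(glyphs_for(["A", "B", "space", "A"]))
```

  The output list always has the same length as the input, which is precisely the property that syllabic writing systems do not have.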

  The situation is different with syllabic writing systems where the displayed text is actually built up by applying the pattern for simple syllables but may associate special shapes with specific syllables. The text to be displayed could indeed be specified in terms of the consonants and vowel in a syllable which are the basic linguistic units in the language. But the desired shape for the syllable cannot be effected by simply placing the shapes for the consonants and the vowel in sequence.

  A font for a language/script which follows a syllabic writing system will have shapes to build up the required display for any syllable, but the number of shapes (glyphs) will be much larger than the set of vowels and consonants, since the writing system has additional shapes for vowels which occur as part of syllables in the middle of a word. Also, unique shapes for certain syllables will be required. Seen below are some typical syllables and the shapes used to build them.
  The illustration above clearly shows that the one code - one glyph relationship does not hold when the character codes differ from the glyph indices. In fact, many applications supporting the display of Indian language text merely used the glyph codes as the representation (i.e., use the glyph codes themselves as character codes) since they could use conventional font rendering methods to generate the display. The complexity of the application increases considerably when multiple codes have to be mapped into multiple glyphs.

  We note that linguistic processing is not impossible when font glyphs are used for representing the text but the processing would be dependent on the font. In the past there has been no attempt at standardizing a font for any Indian script.

  The usefulness of restricting the input string to contain only the codes for the consonants and vowels is seen when one thinks of linguistic processing. This is in fact the basis on which ISCII and Unicode work. However, with this simple assignment, the onus is on the application to render the text using any appropriate method. Typically a font may be used or one could convert the text into a TeX document and get an output which is typographically superior, or just use an XY plotter to draw curves and thus generate the shapes.

  If a font is used, the application is expected to have specific knowledge of the glyphs in the font so that appropriate glyphs can be selected to form the display. The application is also expected to know the rules of the writing system so as to arrive at the right glyphs for each syllable. This is a difficult task since it is not easy for an application to actually know what glyphs a given font offers. Even assuming that this can be done, the application would be tied to the availability of the specific font.

  On the other hand, if we pay attention to the rules of the writing system alone but have the provision to find out if a font has the support in terms of specific glyphs, perhaps some degree of standardization is possible. A conventional font cannot offer such a facility since only rendering information is stored inside the font file. Hence the concept of a new font format which can tell us what sort of glyphs it provides, in the context of a given writing system using a specific script.

  Unicode support for applications was conceived with the possibility of providing standardized support for rendering a string of consonants and a vowel. In other words, the issue under consideration is "whether one could incorporate the rules of the writing system into a program" and provide an interface to the application to invoke a specific rule to render a given syllable. Though the rules themselves are well known, the process is complex since alternate representations are permitted in practice. Yet, in principle it is possible to think of a model for generating the shape for the syllable somewhat along the lines indicated below.

A pure vowel or a consonant with an implied "ah" in it would be rendered in its basic form.

A consonant vowel combination would be rendered by adding a matra (the dependent vowel sign) except for cases where unique shapes are specified. (Tamil and Malayalam have special shapes for "uh" and "ouh".) A list of exceptions could be maintained.

A syllable with two or more consonants will be rendered typically using half forms in most Devanagari based scripts and one below the other in the Southern scripts. Special forms would apply for all cases where the shape of the syllable is well known. This set is typically of the order of a hundred and fifty. A full list of specific cases will however be maintained.

Special ligatures for "ra" (and "ya" in some of the Southern scripts) would be used depending on the position occupied by the consonant in the syllable, i.e., whether it occurs in the beginning, middle or the end of the string of consonants.

By more or less listing all the rules observed in standard practice, one could conceivably code them into a module. Such a module would nevertheless be required to work with a font which has all the necessary glyphs. Also the process of identifying the glyph indices will involve an exhaustive application of each writing rule to a syllable to see if glyphs conforming to the requirements can be chosen for the display. When alternate forms of display are permitted, this becomes an essential requirement. Also, it helps the application default to a very simple form for display, if complex ligatures are not present. The writing system always allows any syllable to be represented using the generic shape for all but the last consonant in the syllable.
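A few of the rules above can be sketched as a small module, purely as an illustration. The glyph names (Latin transliterations with ".half" and "matra." markers) and the representation of a font as a set of glyph names are assumptions made for this sketch; it does not model any real font or script exhaustively, and it ignores the visual reordering of matras and the special forms of "ra".

```python
def shape_syllable(consonants, vowel, font):
    """Return glyph names for one syllable, preferring special forms.

    consonants: list of consonant names, e.g. ["k", "ss"]
    vowel: vowel name, or None for the implied vowel
    font: set of special glyph names the (hypothetical) font provides
    """
    if not consonants:
        # A pure vowel is rendered in its basic form.
        return [vowel]
    glyphs = []
    cluster = "_".join(consonants)
    if len(consonants) > 1 and cluster in font:
        # A unique conjunct shape exists for the whole cluster.
        glyphs.append(cluster)
    else:
        # Fallback: generic (half) form for all but the last consonant.
        glyphs.extend(c + ".half" for c in consonants[:-1])
        glyphs.append(consonants[-1])
    if vowel is not None:
        special = glyphs[-1] + "_" + vowel
        if special in font:
            # Exception list: this consonant + vowel has its own shape.
            glyphs[-1] = special
        else:
            glyphs.append("matra." + vowel)
    return glyphs


# With the conjunct glyph present, one shape covers both consonants.
print(shape_syllable(["k", "ss"], "i", {"k_ss"}))
# Without it, the engine defaults to the simple half-form display.
print(shape_syllable(["k", "ss"], "i", set()))
```

The fallback branch reflects the observation that any syllable can be represented using the generic shape for all but the last consonant, so the display degrades gracefully when complex ligatures are absent.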

The Open type font is a concept where it would be possible to find out whether a glyph satisfying a requirement in a syllable is indeed available in the font. As opposed to a conventional font which only stores rendering information for each glyph, the open type font can also provide information which relates a group of glyphs to a specific requirement. The multiple code to multiple glyphs mapping is essentially what is being attempted with this new font format.

In every rendered syllable, there is some feature in the display that either identifies the presence of a specific consonant or a unique form for the syllable. This "some feature" may be ascertained with some effort.


If each of the glyphs in the font is related to one or more codes, then in principle one could incorporate into the font a table that specifies the codes-to-glyphs mappings. Unlike the earlier fonts (True type or bit mapped), where a glyph is related to only one character, this new font, called the Open type font, will incorporate features where a glyph can be specified in terms of other glyphs through a process of substitution or relative positioning.


A library of services can let an application query the font for specific features it incorporates, by way of

Alternate forms or representations for a character
Substitutions for a given glyph string
Positioning information for specific ligatures

  With such services one could in principle implement the rules of the writing system. The application will obtain the required glyphs from an Open type font which supports the required features in an exhaustive manner for all practically encountered syllables.
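  The querying idea can be sketched with a toy in-memory model of a font's substitution tables. This is not the OTLS interface, and real Open type substitution lookups are considerably more elaborate. The function names and glyph names are hypothetical; the keys "akhn" and "half" are borrowed from registered OpenType feature tags only to make the example readable.

```python
# Hypothetical model: feature tag -> {input glyph sequence: output glyph}.
FONT = {
    "akhn": {("ka", "virama", "ssa"): "k_ssa"},
    "half": {("ka", "virama"): "ka.half"},
}


def supports(font, feature):
    """Tell the caller whether the font offers a given feature."""
    return feature in font


def substitute(font, feature, glyphs):
    """Apply one feature's substitutions, longest match first, left to right."""
    table = font.get(feature, {})
    longest = max((len(key) for key in table), default=0)
    out, i = [], 0
    while i < len(glyphs):
        for n in range(min(longest, len(glyphs) - i), 0, -1):
            key = tuple(glyphs[i:i + n])
            if key in table:
                out.append(table[key])
                i += n
                break
        else:
            # No substitution starts here; keep the glyph unchanged.
            out.append(glyphs[i])
            i += 1
    return out


print(supports(FONT, "akhn"), supports(FONT, "blwf"))
print(substitute(FONT, "akhn", ["ka", "virama", "ssa"]))
print(substitute(FONT, "half", ["ka", "virama", "ta"]))
```

  A shaping engine built on such a query can first ask whether a feature exists and only then attempt the substitution, falling back to a simpler form when it does not.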

  It must be emphasized here that an open type font is not required at all for rendering text where a character maps into exactly one glyph. For writing systems which render syllables, the shaping engine which implements the rules could certainly benefit from the availability of an Open type font since it can select appropriate glyphs by querying the services provided by OTLS (Open type library services). With conventional fonts, this querying is ruled out.

  The specifications for an Open type font are quite complex since several tables have to be incorporated into the font file. These tables invariably reflect the idiosyncrasies of the writing system. The documentation provided by Microsoft and Adobe should be adequate for a designer to develop an Open type font. Yet, this is a complex process for most of the scripts merely on account of the large number of glyph substitution and glyph positioning entries required in practice. For example, the Mangal Open type font lacks the one-below-the-other form for many important syllables. So designing an Open type font is not easy, unlike a regular True type font which may have more or less the same base glyphs. Tools such as VOLT (Microsoft's Visual OpenType Layout Tool), distributed through special interest groups, give some hints on converting existing True type fonts to Open type.

  A client application supporting Unicode for Indian languages will typically use the Open type Library services provided by Microsoft. This is not without its accompanying complexity though it appears that there is greater flexibility in rendering text since alternate forms could be used. The application must necessarily code into itself the rules of the writing system and use the OTLS to select glyphs matching the requirements.

  Much of this complexity can be reduced by introducing a shaping engine which implements the rules and thus isolates the application from the actual rendering. This approach permits a degree of standardization in rendering text, but the shaping engine's default behaviour may not offer the flexibility which conventional practice demands. Microsoft has also provided such a shaping engine. It is known as Uniscribe.

The real problem of dealing with Open type fonts.

  When fonts are designed, the basic requirement will be to incorporate enough glyphs to cover all the shapes for the different syllables. One cannot think of a very large number of glyphs since the font will become unwieldy. Moreover, mapping the codes to the glyphs (substitutions) will require very large tables. On the other hand, a smaller number of glyphs might not adequately display all the syllables as per convention. Glyph design is hence a compromise between a minimal set of shapes considered adequate and a set of shapes that will meet all the basic rules of the writing system. Thus the font designer is expected to know how all the required syllables should be rendered given the constraint on the number of glyphs. In an Open type font it is not enough merely to provide the required glyphs; more importantly, the designer must identify how the composite glyphs are formed (how a set of glyphs maps into another).


To summarize

The shaping engine incorporates the rules of the writing system for each script and helps select appropriate glyphs from the Open type font. Experience tells us that the rules of the writing system are not rigid and conventions can vary. If the application must cater to different conventions, a default behaviour may not be appropriate and a parameter based selection of the display shape will be required. This parameter may have to be specified in the context of the syllable under consideration. This is what really complicates the design of the application.

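While a desired shape for a syllable may be easily forced by inserting a zero width modifier (such as the zero width joiner) into the text, the complexity of linguistic processing automatically increases, since the modifier becomes part of the stored string.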
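A small illustration of this added complexity, using the Unicode zero width joiner (U+200D) and zero width non-joiner (U+200C): two strings that a reader would consider the same syllable no longer compare equal. The helper name strip_joiners is our own, and removing the joiners before comparison is only one possible policy, shown here as an assumption rather than a recommendation.

```python
ZWJ = "\u200d"   # ZERO WIDTH JOINER
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER


def strip_joiners(text):
    """Remove zero width joiners so that only linguistic content remains."""
    return text.replace(ZWJ, "").replace(ZWNJ, "")


plain = "\u0915\u094d\u0937"         # KA + VIRAMA + SSA
forced = "\u0915\u094d\u200d\u0937"  # same syllable, ZWJ after the virama

print(plain == forced)                 # the raw strings differ
print(len(plain), len(forced))         # the forced form has an extra code
print(strip_joiners(forced) == plain)  # equal once the modifier is removed
```

Every operation that compares, searches or sorts text has to apply some such step, which is the extra burden referred to above.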

The Open type font may not be the right way to go if applications are required to effect efficient text processing and also support an interactive user interface. It is conceivable that the tables we mentioned earlier, which are included in the Open Type font, may actually be brought out of the Open type font and given a standard representation. This way, the shaping engine can work with additional flexibility of dynamically choosing the glyphs from different fonts and thus meeting different requirements. This idea is certainly implementable since table look up is a fairly simple process.
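As a rough sketch of this idea, the tables could be split into a font-independent part that maps code sequences to abstract shape names, and a per-font part that maps shape names to glyph indices. All names, shape labels and indices below are invented for illustration; no standard representation of this kind is implied to exist.

```python
# Font-independent table: code sequence -> abstract shape name (hypothetical).
SHAPES = {
    ("ka", "virama", "ssa"): "conjunct.k_ssa",
    ("ka",): "base.ka",
}

# Per-font tables: abstract shape name -> glyph index in that font (made up).
FONTS = {
    "FontA": {"base.ka": 12},
    "FontB": {"base.ka": 40, "conjunct.k_ssa": 97},
}


def pick_glyph(codes, preferred):
    """Return (font name, glyph index) from the first font that has the shape."""
    shape = SHAPES.get(tuple(codes))
    if shape is None:
        return None
    for name in preferred:
        index = FONTS.get(name, {}).get(shape)
        if index is not None:
            return (name, index)
    return None


# FontA lacks the conjunct, so the engine takes it from FontB instead.
print(pick_glyph(["ka", "virama", "ssa"], ["FontA", "FontB"]))
print(pick_glyph(["ka"], ["FontA", "FontB"]))
```

Because each step is a dictionary lookup, the shaping engine remains simple while gaining the freedom to draw glyphs from more than one font.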
