of the writing systems followed in India will be of help to those interested
in Indian languages. Also, such understanding will be of considerable help
in designing fonts for Indian scripts as well as in developing applications
which deal with the writing systems.
Basically, the languages
of India employ a syllabic writing method where a unique shape is used
for each syllable. Such a shape is generally related to the basic shapes
of the consonants in the syllable but over the centuries, conventions have
come to be followed. These conventions differ across the scripts. It is
the opinion of scholars that all the scripts in use have been derived from
Brahmi, the script used during Emperor Ashoka's time (3rd century BCE). However,
there are sufficient reasons to study the scripts individually since the
variations will help us understand the problems involved in using computers
to deal with the scripts.
The basic quantum for a syllable
is a combination of one or more consonants and a vowel. A vowel or consonant
by itself is also treated as a syllable. A syllable may be logically viewed
as taking one of the forms V, C, CV, CCV, CCCV and so on.
Here V represents a vowel
and C a consonant. When a consonant by itself constitutes a syllable, it
is assumed to carry 'a' as its vowel. This representation
corresponds to the phonetic description of the syllable in terms of the
more basic sounds of the language.
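The logical view of a syllable described above can be sketched in a few lines of Python. This is only an illustration: the function names and the romanized input are assumptions made here, and the handling of trailing consonants (emitted as a final syllable on their own) is a simplification.

```python
def group_syllables(sounds, vowels):
    """Group a flat list of sounds into syllables of the form C...CV.

    Consonants accumulate until a vowel closes the syllable. Consonants
    left over at the end are emitted as a final syllable on their own.
    """
    syllables, pending = [], []
    for sound in sounds:
        pending.append(sound)
        if sound in vowels:
            syllables.append(pending)
            pending = []
    if pending:
        syllables.append(pending)
    return syllables


def pattern(syllable, vowels):
    """Describe a syllable with the C/V notation used in the text."""
    return "".join("V" if sound in vowels else "C" for sound in syllable)


# Toy romanized input; real text would need a proper phonetic analysis.
VOWELS = set("aeiou")
for syl in group_syllables(list("indra"), VOWELS):
    print("".join(syl), pattern(syl, VOWELS))
```

With the toy input "indra", the grouping yields a lone vowel (V) followed by a three-consonant syllable (CCCV).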
When text is written, the
phonetic representation above is not used directly. Instead, a syllable is
written following
a set of rules specific to the writing system employed. These rules specify
how a displayed syllable should be composed from more basic shapes. The
basic shapes mentioned here are not just the shapes for the vowels and
consonants, as one might think, but include many individual shapes that
correspond to combinations of vowel and consonant sounds.
A basic principle followed
in a syllabic writing system is that one employs 'medial vowel shapes'
for vowels which end the syllable. A medial vowel shape can never be shown
by itself but must accompany the shape for a consonant or a sequence of
consonants. However, the 'medial vowel' representation rule may itself
be violated and special shapes used to represent specific combinations
of a consonant and a vowel.
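As an illustration of the medial vowel principle, the sketch below uses Unicode Devanagari, where each vowel has an independent letter and (except for 'a') a dependent vowel sign. The choice of Unicode and of Devanagari is an assumption made for this example, and the names MEDIAL and compose_cv are hypothetical. Special consonant-vowel shapes, where they exist, are chosen by the font and are not visible at this level.

```python
# Independent vowel letter -> dependent (medial) vowel sign, Devanagari.
# The inherent vowel 'a' has no sign: the bare consonant already carries it.
MEDIAL = {
    "\u0905": "",        # a  (inherent, nothing is added)
    "\u0906": "\u093E",  # aa
    "\u0907": "\u093F",  # i  (drawn to the left of the consonant)
    "\u0908": "\u0940",  # ii
    "\u0909": "\u0941",  # u
    "\u090A": "\u0942",  # uu
    "\u090F": "\u0947",  # e
    "\u0910": "\u0948",  # ai
    "\u0913": "\u094B",  # o
    "\u0914": "\u094C",  # au
}


def compose_cv(consonant, vowel):
    """Return the stored form of a CV syllable: consonant, then vowel sign."""
    return consonant + MEDIAL[vowel]


ka = "\u0915"
print(compose_cv(ka, "\u0905"))  # ka: the consonant letter alone
print(compose_cv(ka, "\u0907"))  # ki: the sign is stored after the consonant
```

Note that the sign for 'i' is stored after the consonant even though it is drawn to its left, so the stored order follows the sounds rather than the shapes.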
To understand the intricacies
of the different writing systems, we provide a graphical illustration of
the manner in which a syllable may be composed. Since there are many different
ways of composing the shapes across the different scripts, it will be helpful
to formulate representations for the elements that go into forming the
shape for the syllable. In the illustration below, the elements in squares
refer to shapes that are considered basic shapes in the writing system.
These basic shapes constitute syllables themselves but the set will not
be adequate to specify the full complement of syllables used in the language.
It must be remembered that the writing systems do provide rules for writing
arbitrary syllables but such rules will help mostly for syllables which
are not native to the language. The elements in circles refer to
shapes that get added to the more basic shapes (squares). The description
provided against each shape (basic or added) gives a clue to the
rules for writing syllables.
Syllables with two consonants
The illustration below shows
the different formats for a syllable involving two consonants and a vowel.
Ligature-based forms are more frequent for this case, though the basic rule
of half forms or one-below-the-other forms applies in general. The medial
vowel form is usually attached to the basic consonant or, in some cases, to
a consonant-consonant ligature. Medial vowels added to the left are placed
before the half-form shapes. In Telugu and Kannada, ligatures are used
for some consonants which appear below. In general, one may not be
able to discern the ordering of the sounds associated with the syllable
unless one knows the conventions. In most scripts, CC ligatures are quite
common and in some cases, the ligatures may give no clues to the consonants
they represent. It will therefore be necessary to memorize the ligatures,
typically in about 15 to 30 cases. It also happens that the shapes of the
ligatures differ to such an extent as to require additional medial vowel
shapes for proper alignment.
Syllables with three consonants
Syllables with three
or more consonants (CCC form) are not unusual,
though they may account for only about 10% of the syllables in the language.
Half forms are used to advantage in Devanagari. The one-below-the-other
form gets extended to three consonants in the case of Grantha, where
such stacks are not uncommon. When three
consonants are involved, ligatures are common when the last consonant is
an 'r' or 'y'. In Devanagari, syllables with 'da' often have ligature forms.
When consonants are stacked this way, the spacing between lines increases
and so fewer lines fit on a page.
Examples of Syllables from different scripts
The illustrations here
are examples of syllables from different scripts which conform to the rules
indicated above. It must be noted that these are written according to the
standard conventions followed. Telugu, Kannada and Grantha require a number
of ligatured consonants together with properly positioned medial vowels
and this generally makes it difficult to create 8-bit fonts for these scripts.
The specific syllables used in the illustrations are not identified here.
The viewer may refer to the corresponding pages describing each script.
Consonants to write syllables
Shape generation for
a syllable is well defined and standardized only for simple syllables made
up of CV combinations. When new syllables have to be handled (typically
from foreign words), one may face specific difficulties in writing
them, especially if the syllable has a number of consonants in it. In Indian
languages, which are based on the principle of root words, one normally
does not see long syllables except in rare instances. The writing systems
generally allow any syllable to be written in decomposed form by simply
concatenating
the generic consonants and ending the syllable with a CV combination.
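The decomposed form can be illustrated with Unicode Devanagari, where a generic consonant (one without its inherent vowel) is encoded as the consonant letter followed by the virama. The sketch below is an assumption-laden illustration, not a description of any particular system: the function name encode_cluster and its parameters are hypothetical, and the actual appearance depends on the font and rendering engine.

```python
VIRAMA = "\u094D"  # Devanagari sign virama: suppresses the inherent 'a'
ZWNJ = "\u200C"    # zero width non-joiner


def encode_cluster(consonants, vowel_sign="", decomposed=False):
    """Encode a consonant cluster plus a final vowel sign as one syllable.

    Every consonant except the last is followed by a virama. With
    decomposed=True a ZWNJ follows each virama, which asks the renderer
    to keep the consonants separate instead of forming a conjunct.
    """
    if not consonants:
        raise ValueError("need at least one consonant")
    joiner = VIRAMA + (ZWNJ if decomposed else "")
    return joiner.join(consonants) + vowel_sign


ka, ssa = "\u0915", "\u0937"
print(encode_cluster([ka, ssa]))                   # usually shown as a ligature
print(encode_cluster([ka, ssa], decomposed=True))  # ka with visible virama, then ssa
```

Both strings stand for the same syllable, but they are different code point sequences, so a plain string comparison treats them as different.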
The flexibility (or freedom) to show syllables in alternate forms can lead to headaches when string processing is attempted. When alternate forms for a syllable are permitted, the internal representation should reflect the specific desired form (as is the case with Unicode). Algorithms to identify syllables get complicated on account of this.
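A minimal sketch of syllable identification, assuming Unicode Devanagari text, is given below. The character ranges cover only the core letters and signs (no nukta, Vedic marks or other extensions), and the names SYLLABLE and split_syllables are hypothetical. The optional joiner after each virama is what lets alternate written forms of a syllable be recognised as a single unit.

```python
import re

CONS = "\u0915-\u0939"         # consonants ka .. ha
INDEP_VOWEL = "\u0904-\u0914"  # independent vowel letters
VOWEL_SIGN = "\u093E-\u094C"   # dependent (medial) vowel signs
VIRAMA = "\u094D"
MODIFIER = "\u0901-\u0903"     # candrabindu, anusvara, visarga
JOINERS = "\u200C\u200D"       # ZWNJ, ZWJ (select alternate written forms)

SYLLABLE = re.compile(
    # leading consonants, each followed by a virama and an optional joiner
    f"(?:[{CONS}]{VIRAMA}[{JOINERS}]?)*"
    # final consonant with an optional vowel sign or virama
    f"[{CONS}](?:[{VOWEL_SIGN}]|{VIRAMA})?"
    f"[{MODIFIER}]?"
    # or an independent vowel standing as a syllable by itself
    f"|[{INDEP_VOWEL}][{MODIFIER}]?"
)


def split_syllables(text):
    """Return the syllables found in text, skipping unrecognised characters."""
    return SYLLABLE.findall(text)


word = "\u0915\u094D\u0937\u0924\u094D\u0930\u093F\u092F"
print(split_syllables(word))  # three syllables: CCV, CCV, C
```

Even this simplified pattern needs a separate branch for each way a syllable can begin or end, which gives some idea of why syllable identification becomes complicated once alternate forms are allowed.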