Aspect of Indian Languages
The languages of India have a common phonetic base. One does not use the term "alphabet"
to refer to the set of letters with which the script is written. Instead,
the letters are called "aksharas". Very simply, an akshara refers to a sound.
Sounds heard in spoken words are built up from the basic set of sounds
represented by the vowels and consonants of the language.
In all Indian languages,
an akshara is pronounced the same way regardless of its position within
a word, unlike in English, where the pronunciation varies widely, depending
not only on the word but also on the location of the letter within the
word. Also, in Indian
languages, the vowels number between thirteen and eighteen, while the consonants
vary from eighteen in Tamil to as many as thirty-eight in Telugu and Malayalam.
All the aksharas are therefore built from about fifty basic letters.
It is indeed possible to use just the vowels and the consonants for writing any
of the languages. This is probably how children are taught a script to
begin with. In practice, however, the scripts abound in what are called
"Samyuktakshars", which are the equivalent of syllables and represent sounds
built up from combinations of consonants and a vowel. The writing system
for a language often permits more than one representation (shape) for the
samyuktakshar. Samyuktakshars are often referred to as conjunct characters.
Clearly, when one sees an akshara in print, its sound is fixed. However,
there may be more than one representation for a given conjunct and this
depends on the writing practices followed in a region.
All the ideas
expressed here may well be grasped by studying the Devanagari script in
which Sanskrit and a couple of other Indian Languages are written.
We have an extensive discussion on this in our section on Learning
Sanskrit. The reader is encouraged to look at the material presented
in that section.
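As a rough illustration of how an akshara is built up from consonants and a vowel, the sketch below splits Devanagari text into aksharas. It assumes the text is stored in Unicode, which is not something this article prescribes, and the names split_aksharas and is_consonant are hypothetical. It handles only the common case of consonants joined by the virama (halant) followed by vowel signs or marks such as the anusvara; nukta forms and special joiners are ignored.

```python
import unicodedata

VIRAMA = "\u094D"  # Devanagari sign virama (halant), joins consonants


def is_consonant(ch):
    # Assumption: only the basic consonants KA..HA (U+0915..U+0939) are considered.
    return "\u0915" <= ch <= "\u0939"


def split_aksharas(text):
    """Split Devanagari text into aksharas (orthographic syllables)."""
    aksharas = []
    current = ""
    for ch in text:
        if not current:
            current = ch
        elif current.endswith(VIRAMA) and is_consonant(ch):
            # A consonant after a virama continues the same conjunct.
            current += ch
        elif unicodedata.category(ch) in ("Mn", "Mc"):
            # Vowel signs, virama, anusvara and visarga attach to the akshara.
            current += ch
        else:
            # Anything else (a new consonant, vowel, space, ...) starts a new unit.
            aksharas.append(current)
            current = ch
    if current:
        aksharas.append(current)
    return aksharas


print(split_aksharas("संस्कृत"))  # ['सं', 'स्कृ', 'त']
```

The word has seven code points but only three aksharas, which is why a program that treats each stored character as the unit of text does not match the linguistic unit described here.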
It turns out
that when dealing with Indian languages on a computer, one needs a representation
for the aksharas in general and not merely the vowels and consonants. The
akshara is the basic unit or quantum from a linguistic point of view and
computer programs processing text in Indian languages should be able to
efficiently deal with this quantum, built up from two or more basic sounds.
This poses a real challenge, as there are more than 13000 individual aksharas
that have to be reckoned with, and many more which might come into use, if the [...]
The common phonetic base across all the Indian languages is helpful in situations
where language independent information such as statistical data, addresses,
schedules of meetings etc., have to be disseminated in different languages
simultaneously. People who can speak a language but do not know its
script may still be able to read information in that language by
merely reading it in a script familiar to them.
Books written in English which deal with text in Indian languages (such
as commentaries on ancient scriptures) used Roman transliteration to help
read the text. In many instances, diacritical marks were added to the Roman
letters to establish a closeness to the aksharas of the language, which
would be difficult to achieve with just the twenty-six letters.
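The idea of Roman letters with diacritical marks can be sketched as a small lookup-based transliterator for Devanagari. The tables below cover only a handful of letters and follow the widely used IAST-style conventions (ā, ī, ū, ṛ, ṃ, ś, ṣ); the article does not name a particular scheme at this point, so that choice is an assumption, as are the Unicode storage and the function name to_roman.

```python
VIRAMA = "\u094D"  # suppresses the inherent vowel "a"

# A deliberately small subset of consonants, for illustration only.
CONSONANTS = {
    "\u0915": "k",   # क
    "\u0916": "kh",  # ख
    "\u0917": "g",   # ग
    "\u0924": "t",   # त
    "\u0928": "n",   # न
    "\u092E": "m",   # म
    "\u0930": "r",   # र
    "\u0932": "l",   # ल
    "\u0936": "ś",   # श
    "\u0937": "ṣ",   # ष
    "\u0938": "s",   # स
}
# Independent vowels (used when no consonant precedes).
VOWELS = {"\u0905": "a", "\u0906": "ā", "\u0907": "i",
          "\u0908": "ī", "\u0909": "u", "\u090A": "ū"}
# Medial vowel signs written on a consonant.
VOWEL_SIGNS = {"\u093E": "ā", "\u093F": "i", "\u0940": "ī",
               "\u0941": "u", "\u0942": "ū", "\u0943": "ṛ"}
# Anusvara and visarga.
OTHER_SIGNS = {"\u0902": "ṃ", "\u0903": "ḥ"}


def to_roman(text):
    """Transliterate a Devanagari string to Roman letters with diacritics."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            if nxt in VOWEL_SIGNS:
                out.append(VOWEL_SIGNS[nxt])  # explicit medial vowel
                i += 1
            elif nxt == VIRAMA:
                i += 1                        # bare consonant, no vowel
            else:
                out.append("a")               # inherent vowel
        elif ch in VOWELS:
            out.append(VOWELS[ch])
        elif ch in OTHER_SIGNS:
            out.append(OTHER_SIGNS[ch])
        else:
            out.append(ch)                    # pass through anything unknown
        i += 1
    return "".join(out)


print(to_roman("संस्कृत"))  # saṃskṛta
```

The diacritics do the work the twenty-six plain letters cannot: "ṃ", "ṛ", "ś" and "ṣ" each stand for a distinct akshara or sign that would otherwise collide with "m", "r" and "s".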
Transliteration between Indian languages is very desirable to help people learn one language
through another. The common phonetic base makes this easy.
Yet, transliteration between the languages will have to be handled with
care, for there are quite a few aksharas which are specific to some languages
but not seen or used in others. For instance, Tamil does not have
the aspirated consonants of Telugu or Sanskrit, and reading Sanskrit through
Tamil, which is very desirable, is often rendered difficult. Situations
such as these are usually handled by introducing new symbols in the script
of a language to represent, via transliteration, characters found in other
languages.
The discussion here stresses the need to establish a single coding scheme to cover all
the different aksharas across all the languages of India in order to allow
correct transliteration. In this connection, the use of Roman letters with
diacritic marks does result in a script useful for reading text prepared
in any of the Indian languages. The National Library at Calcutta
has recommended a nice scheme
for Roman transliteration.
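Returning to transliteration between Indian scripts, one way to see both the benefit of a common code layout and the problem of missing aksharas is the sketch below. It relies on Unicode, which is an assumption and not the coding scheme argued for in this article: the Unicode blocks for the major Indian scripts follow a parallel layout inherited from ISCII, so many letters can be mapped by shifting the code point from one block to another. The function name transliterate and the "?" placeholder are hypothetical, and the parallel is only approximate, so this is not a complete transliterator.

```python
import unicodedata

# First code point of each script's Unicode block (each block spans 0x80 positions).
BLOCK_START = {
    "devanagari": 0x0900,
    "tamil": 0x0B80,
    "telugu": 0x0C00,
}


def transliterate(text, src, dst):
    """Map letters from one script block to the same position in another."""
    out = []
    for ch in text:
        offset = ord(ch) - BLOCK_START[src]
        if 0 <= offset < 0x80:
            candidate = chr(BLOCK_START[dst] + offset)
            # An unnamed (unassigned) code point means the target script
            # has no letter at this position; mark it with "?".
            out.append(candidate if unicodedata.name(candidate, "") else "?")
        else:
            out.append(ch)  # characters outside the source block pass through
    return "".join(out)


print(transliterate("कमल", "devanagari", "telugu"))  # కమల
print(transliterate("ख", "devanagari", "tamil"))     # ?
```

The second call shows the difficulty described above: the aspirated "kha" has no counterpart in the Tamil block, so a new symbol or convention is needed to carry it across.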
In a phonetic script,
written shapes directly correspond to syllables and hence represent sounds.
This is also the
concept behind the akshara. The akshara represents a vowel, a consonant
or a conjunct consonant. The written shape of a syllable generally conforms
to the rules followed in the writing system. The writing system rules vary
according to the script but basically the idea is to use 'medial vowel
forms' in writing syllables.
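The use of medial vowel forms can be sketched for Devanagari as follows. The consonants of a syllable are joined with the virama, and the vowel is then written in its medial form (matra) instead of its independent form; the vowel "a" is inherent in the consonant and needs no sign. Unicode storage and the names MEDIAL and build_syllable are assumptions made for this illustration, and only a few vowels are listed.

```python
VIRAMA = "\u094D"  # joins consonants into a conjunct

# Independent vowel -> medial form (matra). "a" is inherent, so it maps to nothing.
MEDIAL = {
    "\u0905": "",        # अ  (a)
    "\u0906": "\u093E",  # आ  (aa)
    "\u0907": "\u093F",  # इ  (i)
    "\u0908": "\u0940",  # ई  (ii)
    "\u0909": "\u0941",  # उ  (u)
    "\u090A": "\u0942",  # ऊ  (uu)
    "\u090F": "\u0947",  # ए  (e)
    "\u0913": "\u094B",  # ओ  (o)
}


def build_syllable(consonants, vowel):
    """Compose the written form of a syllable from consonants and one vowel."""
    cluster = VIRAMA.join(consonants)  # e.g. s + virama + t + virama + r
    return cluster + MEDIAL[vowel]


print(build_syllable(["क"], "इ"))            # कि
print(build_syllable(["स", "त", "र"], "ई"))  # स्त्री
```

Note that the stored order is always consonants first and vowel sign last, even where the script draws the sign to the left of the consonant, as with the short "i".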
The writing system
rules do permit multiple written shapes for a given syllable. This is required
in practice when the type used in printing text does not cater to all the
ligatures observed in writing.
The basic set of aksharas
is more or less common to all the Indian languages. It is true that there
are slight variations in the actual set of vowels and consonants for the
different languages. The Southern languages include a short form for the
vowels "e" and "o" which is generally not seen in the Northern languages.
Differences with Tamil
Tamil uses a minimal
set of vowels and consonants. The Tamil script has traditionally not distinguished
the middle consonants in each "varga" and thus the soft as well as the
aspirated forms of the five consonants "ka, ca, ta, tha, pa" are written
without specific distinction. This can confuse the first-time reader.
In this respect, Tamil cannot be viewed as a strict phonetic script.
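The ambiguity can be made concrete with a small table. The sketch below assumes the traditional Tamil letters only (no Grantha additions such as the separate letter for "ja") and uses Roman labels with diacritics for the retroflex row; the names TAMIL_LETTER, VARGA, to_tamil and possible_readings are hypothetical.

```python
# The single Tamil letter used for every stop consonant of a varga.
TAMIL_LETTER = {
    "ka": "\u0B95",  # க
    "ca": "\u0B9A",  # ச
    "ṭa": "\u0B9F",  # ட
    "ta": "\u0BA4",  # த
    "pa": "\u0BAA",  # ப
}

# The four stops of each varga: plain, aspirated, soft, soft aspirated.
VARGA = {
    "ka": ["ka", "kha", "ga", "gha"],
    "ca": ["ca", "cha", "ja", "jha"],
    "ṭa": ["ṭa", "ṭha", "ḍa", "ḍha"],
    "ta": ["ta", "tha", "da", "dha"],
    "pa": ["pa", "pha", "ba", "bha"],
}


def to_tamil(sound):
    """Return the Tamil letter used to write the given consonant sound."""
    for head, members in VARGA.items():
        if sound in members:
            return TAMIL_LETTER[head]
    raise KeyError(sound)


def possible_readings(tamil_letter):
    """Return every consonant sound the given Tamil letter may stand for."""
    for head, letter in TAMIL_LETTER.items():
        if letter == tamil_letter:
            return VARGA[head]
    return []


print(to_tamil("gha"))              # க
print(possible_readings("\u0B95"))  # ['ka', 'kha', 'ga', 'gha']
```

Writing is a many-to-one mapping and reading is therefore one-to-many, which is the sense in which the script is not strictly phonetic.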