Some distinguishing features included in the Editor

1. Very flexible means of data entry

The local language files are stored in a language-independent format, which allows uniform data entry for all the languages. The Editor offers the following data entry methods.

  • A universal phonetically mapped keyboard for uniform data entry across all the Indian Languages. This mapping covers a superset of aksharas from all the Indian languages.
  • A transliteration based data entry where data is entered in English conforming to a transliteration scheme. Though many schemes are in use today, the popular ITRANS scheme is supported by the Editor.
  • The Editor also incorporates a special Transliteration Scheme, which uses only the lower case Roman letters for data entry. This version will be of use to disabled persons who may not be able to use both their hands during data entry.
  • An input method in which data entry is identical to that on a manual local-language typewriter. This is supported for Hindi and Tamil.

  (The Linux version, as of Nov. 2002, supports only the first of the above methods.)
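The transliteration-based entry method described above can be illustrated with a minimal sketch. It converts a small subset of ITRANS-style Roman input to Devanagari. The function name, the tables, and the use of Unicode for the output are assumptions made for this illustration only. The Editor uses its own internal representation and the full ITRANS scheme, neither of which is reproduced here.

```python
# Illustrative sketch only: a small subset of ITRANS mapped to Devanagari.
# Unicode output is used for display here; it is not the Editor's own format.
VIRAMA = "\u094D"  # suppresses the inherent vowel of a consonant

CONSONANTS = {
    "k": "\u0915", "kh": "\u0916", "g": "\u0917", "gh": "\u0918",
    "t": "\u0924", "th": "\u0925", "d": "\u0926", "dh": "\u0927",
    "n": "\u0928", "p": "\u092A", "ph": "\u092B", "b": "\u092C",
    "bh": "\u092D", "m": "\u092E", "y": "\u092F", "r": "\u0930",
    "l": "\u0932", "v": "\u0935", "s": "\u0938", "h": "\u0939",
}

# Each vowel maps to (independent form, matra used after a consonant).
VOWELS = {
    "a": ("\u0905", ""), "aa": ("\u0906", "\u093E"),
    "i": ("\u0907", "\u093F"), "ii": ("\u0908", "\u0940"),
    "u": ("\u0909", "\u0941"), "uu": ("\u090A", "\u0942"),
    "e": ("\u090F", "\u0947"), "ai": ("\u0910", "\u0948"),
    "o": ("\u0913", "\u094B"), "au": ("\u0914", "\u094C"),
}


def itrans_to_devanagari(text):
    """Convert Roman input to Devanagari using greedy longest-match lookup."""
    out = []
    pending = False  # True while the last consonant still awaits its vowel
    i = 0
    while i < len(text):
        for size in (2, 1):  # try two-letter tokens before one-letter tokens
            token = text[i:i + size]
            if token in CONSONANTS:
                if pending:
                    out.append(VIRAMA)  # consonant cluster: join with virama
                out.append(CONSONANTS[token])
                pending = True
                break
            if token in VOWELS:
                independent, matra = VOWELS[token]
                out.append(matra if pending else independent)
                pending = False
                break
        else:
            # Unmapped character (space, punctuation): copy it through.
            if pending:
                out.append(VIRAMA)
            token = text[i]
            out.append(token)
            pending = False
        i += len(token)
    if pending:
        out.append(VIRAMA)  # word-final consonant without a vowel
    return "".join(out)


print(itrans_to_devanagari("bhagavad giitaa"))
```

The two-letter lookup is tried first so that "bh" and "aa" are read as single units rather than as "b" + "h" or "a" + "a".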

    2. Automatic transliteration across all the languages

    A document prepared in one language can be viewed automatically in any other language, because the text is represented uniformly across all the languages. Thus if one enters the text of the `Bhagavad Gita` in Sanskrit, one can view the same text in all the other languages without error. Aksharas which are present in one language but not in another are transliterated using equivalent aksharas in the latter. The current version of the Editor supports on-screen transliteration: users can enter text in one script, make a copy of it on screen using the copy feature, and immediately transliterate the copy into any other script.
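The idea of a uniform representation can be sketched as follows. Each syllable is stored once as a script-independent pair, and each script supplies its own rendering table plus a fallback for aksharas it lacks. The table layout, the names, and the Unicode output are assumptions chosen for illustration. They do not describe the Editor's actual file format or fonts.

```python
# Illustrative sketch only: one script-independent text, rendered in two scripts.
# A syllable is a hypothetical (consonant name, vowel name) pair.
SCRIPTS = {
    "devanagari": {
        "consonants": {"ka": "\u0915", "kha": "\u0916",
                       "ga": "\u0917", "ta": "\u0924"},
        "matras": {"a": "", "aa": "\u093E", "i": "\u093F", "ii": "\u0940"},
        "fallback": {},
    },
    "tamil": {
        # Tamil has no separate letters for kha and ga; ka stands in for both.
        "consonants": {"ka": "\u0B95", "ta": "\u0BA4"},
        "matras": {"a": "", "aa": "\u0BBE", "i": "\u0BBF", "ii": "\u0BC0"},
        "fallback": {"kha": "ka", "ga": "ka"},
    },
}


def render(syllables, script):
    """Render script-independent syllables in the chosen script."""
    table = SCRIPTS[script]
    out = []
    for consonant, vowel in syllables:
        if consonant not in table["consonants"]:
            # Substitute the equivalent akshara available in this script.
            consonant = table["fallback"][consonant]
        out.append(table["consonants"][consonant] + table["matras"][vowel])
    return "".join(out)


gita = [("ga", "ii"), ("ta", "aa")]  # the word "Gita", entered once
print(render(gita, "devanagari"))
print(render(gita, "tamil"))
```

Because the stored text never names a script, switching the display language only changes the table that is consulted.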

    3. Truly Multilingual Capabilities
    The Editor permits the language of data entry to be selected dynamically while typing. Thus one can easily start the data entry in, say, Tamil and, after entering some text, switch to Gurmukhi and type in an explanation in Punjabi. The software supports more than eleven scripts from which a choice can be made. The Roman script used for English is also directly supported by the Editor. Unlike applications under Microsoft Windows, which require a switch to a different keyboard layout, the IITM Editor allows switching scripts using a pull-down menu.

    4. Support for a very comprehensive character set

    The Editor permits effortless data entry and display of complex conjunct characters (familiarly known as Samyuktakshars), and allows Matras (vowel extensions) and special symbols to be typed in without any difficulty. These Matras and special symbols will be very useful for a teacher preparing lessons on the writing methods for different scripts. The system supports data entry for sixteen vowels, forty-six consonants (a superset of the consonants from different languages) and more than eight hundred conjunct characters which can be formed from the basic set of consonants. Thus the system caters to more than twelve thousand individually recognizable aksharas.

    Besides these, standard Punctuation marks and numerals are supported. These can be entered in the local language input mode directly.

    In this respect, the IITM Editor scores over other approaches based on Unicode or ISCII, where data entry proceeds from a set of aksharas that, in general, does not provide many of the symbols required by the writing systems in vogue. If required, text in Unicode or ISCII may be converted to the syllable-based representation used by the software, thus permitting fairly comprehensive text processing.
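A first step in converting Unicode text to a syllable-based form is to group code points into aksharas. The sketch below does this for Devanagari with a regular expression. The function name and the simplified pattern are assumptions for illustration. The pattern does not cover nukta, Vedic marks, or other special symbols, and it says nothing about the codes the IITM software assigns to each syllable.

```python
import re

# Illustrative sketch only: split Unicode Devanagari text into aksharas.
CONSONANT = "[\u0915-\u0939]"          # ka .. ha
VIRAMA = "\u094D"                      # joins consonants into a conjunct
MATRA = "[\u093E-\u094C]"              # dependent vowel signs
MODIFIER = "[\u0901-\u0903]"           # chandrabindu, anusvara, visarga
INDEPENDENT_VOWEL = "[\u0905-\u0914]"  # a .. au

# An akshara is either a consonant cluster with an optional vowel sign
# (or a final virama), or an independent vowel; modifiers may follow.
AKSHARA = re.compile(
    f"(?:{CONSONANT}(?:{VIRAMA}{CONSONANT})*(?:{MATRA}|{VIRAMA})?"
    f"|{INDEPENDENT_VOWEL}){MODIFIER}*"
)


def split_aksharas(text):
    """Return the aksharas found in text, in order; other characters are skipped."""
    return AKSHARA.findall(text)


# The word "Sanskrit" in Devanagari: seven code points, three aksharas.
word = "\u0938\u0902\u0938\u094D\u0915\u0943\u0924"
print(split_aksharas(word))
```

Once text is grouped this way, each akshara can be looked up or counted as a unit instead of as a run of separate code points.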

    5. Support for Vedic symbols and Music notation
    The set of characters supported by the Editor also includes special symbols used in Vedic texts and classical (Carnatic) music. This feature is very useful for academicians and teachers who produce printed text illustrating different aspects of Vedic and musical rendering. The Editor permits Vedic notation conforming to Yajur Vedic texts as well as Sama Vedic texts.

    6. Support for other World Languages

    One may use the Editor to prepare documents in other languages whose writing systems are phonetic. Sinhalese, Bali, Hebrew, Greek and Japanese Hiragana are supported along with a few other scripts. One may also add a new language to the Editor, provided the language support files and the font files for the language are included.




    Languages/Scripts supported by the Editor

    The Multilingual Editor is based on the concept of "syllable based coding" or simply, a representation that corresponds to the sounds of the Aksharas of Indian languages. This way, any language whose writing system follows a phonetic approach may be accommodated into the set of languages supported by the Editor. It is this representation which has helped us develop an enhanced version of the Editor which can speak the text during data entry and thus help visually handicapped persons use a computer in their own mother tongue. The syllable based representation is very useful for transliteration across the languages which are based on the same set of sounds (phonemes).

    The Editor supports text preparation in all the official languages of India. All the official languages are covered in about ten Scripts, even though more than one script may be used for a language.

    The scripts are Devanagari (for Hindi and Marathi besides Sanskrit), Gujarati, Gurmukhi, Bengali (and Assamese), Oriya, Telugu, Tamil, Kannada and Malayalam. Urdu is also a national language, though its writing system is quite different. A special version of the Editor caters to Urdu.

    The latest version of the Editor includes support for preparing text in Bharati Braille including Nemeth codes so as to permit direct printing of Braille Documents on standard Braille Embossers.

    The following languages have writing systems similar to those of the Indian languages, in that the letter or character shape seen in text corresponds directly to a sound. These languages are also supported in the Editor:

    Sinhalese, Bali, Oromo (Ethiopia), Hebrew, Greek, Japanese Hiragana, Avesta and Arabic.

    It is possible to accommodate almost all the South East Asian languages which include Thai, Burmese, Malay, Tibetan and others.

    One might view the Editor as a data entry method for typing in composite letters, i.e., characters which are built up from multiple shapes. A syllable is precisely that. Text involving accent marks may also be viewed as consisting of such characters, so the Editor may be used for data entry of such text as well. It will be of help in preparing text in the International Phonetic Alphabet.




    Application Areas for IITM Software

    In the earlier section it was mentioned that the IITM Editor not only serves as a simple and useful word processor but also produces a file which may be processed using the text processing functions provided by the local language library. The local language library forms the application programmer's interface for handling text strings in Indian scripts.

    Given below are the typical application areas where the text prepared by the Editor would find immediate use.

    1. Text processing for linguistic analysis

    There are countless possibilities for linguistic analysis of Indian languages. The text prepared by the Editor may be analyzed for:

      • The grammatical structure of sentences.
      • The frequencies of words and aksharas in ancient texts and manuscripts.
      • The metres used in poetry, with automatic identification of the "Chandas".
      • Concordance generation.
      • General linguistic analysis, e.g., morphological studies, parsing, etc.
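Two of the analyses listed above, word frequencies and concordance generation, can be sketched in a few lines. The example works on whitespace-separated words in Roman transliteration. The function names and the toy input are assumptions for illustration and are not part of the local language library. Counting aksharas rather than words would additionally need a syllable splitter.

```python
from collections import Counter, defaultdict

# Illustrative sketch only: word-level frequency and concordance.


def word_frequencies(lines):
    """Count how often each whitespace-separated word occurs."""
    return Counter(word for line in lines for word in line.split())


def concordance(lines):
    """Map each word to the (1-based) line numbers on which it occurs."""
    index = defaultdict(list)
    for number, line in enumerate(lines, start=1):
        for word in line.split():
            if number not in index[word]:
                index[word].append(number)
    return dict(index)


# Toy input in Roman transliteration, made up for this example.
lines = ["dharma artha kama", "artha dharma", "moksha"]
print(word_frequencies(lines).most_common(2))
print(concordance(lines)["artha"])
```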

    2. Typesetting Documents
    The documents prepared by the Editor may be immediately typeset using TeX to produce high quality multilingual printouts. The utilities for typesetting are also provided with the IITM Software Package. The Rich Text Format files generated by the Editor may also be used in Microsoft Windows based typesetting applications.


    3. Generating multilingual Web Pages

    The Editor can be used to prepare web pages containing multilingual text in all the different Indian languages. One can prepare the required HTML documents by typing them in directly with the Editor. The llf2html.exe utility supplied with the Editor may be used to produce the final HTML file. Under Linux a similar utility is available (llf2html). Another useful utility is the converter that generates files in the PDF format, suitable for preparing e-texts.


    4. Importing Multilingual text into other applications.

    The Editor permits compatible Windows applications to import multilingual text with great ease. The cut, copy and paste features of the Editor will be of great help in preparing data for applications such as email and DTP (using Microsoft Word or a similar application). The Editor may be used to generate a .rtf file which compatible applications may freely import. You can also cut and paste into a Microsoft Instant Messenger window and send a message effortlessly.

    Scholars will find this feature very useful for preparing articles in English (using Word) which incorporate quotations or other references to Multilingual text.

    Under Windows, the Editor will also generate a .rtf file which may be viewed using WordPad or Word, or with the specially designed RTF viewer for Linux. The .rtf format is fairly standard, so it is possible to exchange documents between Windows and Linux in this format. Of course, the .llf file itself could be exchanged, since a version of the Editor runs under Linux as well.

    The IITM Package also provides utilities for converting the local language documents to Unicode text as well as PDF. Since these two are generally accepted as universal formats, one gets the advantage of using the editor for preparing text for several other applications which are based on these formats. Specifically, the PDF format can be used to prepare e-books in Indian languages.


    5. Preparation of Statistical Tables and Schedules which can be seen and understood in all the languages of India.

    The ability of the IITM software to display a given text in all the different languages makes it ideal for preparing tables, maps and other announcements involving names of places, persons, etc. The automatic transliteration capability allows for transparent handling of the information across all the languages.

    6. Direct feature to handle Roman Text.

    English text (Roman) may be typed into the document very easily without having to change any software setting. A hot key is provided for this (details follow). The "tconvert" utility (as well as "tview") may be used to convert Roman transliterated text into Indian scripts. The utility supports four different transliteration schemes. This way, many manuscripts which have been prepared in the past using transliteration may be converted directly into the required Indian scripts. Transliteration is covered in a subsequent section.




     

    Contents

    Some distinguishing features supported by the Editor

    Languages/Scripts supported by the Editor

    Areas of application for the Editor