 
The Multilingual Editor
Introduction
  The purpose of the Multilingual Editor is to allow easy preparation of text in all the Indian languages so that many different applications can use that text. An important aspect of text prepared with the Editor is that it is represented in a form suited to easy and effective linguistic processing. The Editor supports a uniform user interface across all the languages and scripts and offers a number of flexible data entry schemes. 

  The Editor package also includes utilities to convert the representation into formats compatible with other applications. Text prepared using the Editor can be taken into Word (or similar applications) to produce very high quality printed documents.  The main idea behind the design of the Editor is the concept of "One program for all of India". The program has achieved this distinction by also supporting Urdu, which is included in the list of national languages.

  The version of the Editor described here is meant for use on Microsoft Windows based systems. The version for Linux includes the same features and is discussed on a separate page.

Basic Features of the Editor

 1. Flexible data entry

  Text preparation using the recommended data entry methods may be mastered in just a few hours. Four different data entry schemes are available for all the scripts except as indicated.

  In addition to the above, a data entry scheme recommended and standardized for Tamil (during the Tamilnet99 conference), is also supported.

   Seen below is the phonetic mapping scheme standardized at IIT Madras, illustrated here in Devanagari. The mapping accommodates about 58 basic vowels and consonants across eleven languages and covers aksharas from all of them.

2. Edit large files.

   The Editor can handle large text files, typically 20,000 lines or more in any of the scripts.

3. Dynamic selection of the script

   The Editor is truly multilingual and allows free mixing of all the scripts even on a single line. English letters (i.e., text in English) can always be typed in along with Indian scripts. See the illustration at the beginning where the selection of languages is shown.


4. On-screen transliteration

   Text entered in one script may be immediately converted to another dynamically. In the screen shot shown below, the first line entered in Devanagari has been duplicated using the copy feature and each line dynamically changed to a script of choice. Transliteration is based on the phonetic nature of the languages of India and the Editor permits correct transliteration of Aksharas across all the scripts, using phonetically equivalent aksharas. Thus aksharas not present in a language may also be shown using phonetic equivalents for them. In the screen image below, see how Devanagari is transliterated into  Gurmukhi and Malayalam. It is quite possible that modern Gurmukhi may not show the conjunct in the form shown. The fourth line is in Sinhalese and the same has been transliterated into Devanagari in the fifth line. 
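The idea of transliterating through phonetically equivalent aksharas can be illustrated with a small sketch. This is not the Editor's implementation: it assumes Unicode text, and the phonetic keys, the `TABLES` and `FALLBACK` dictionaries and the `transliterate` function are all hypothetical names introduced for illustration. Only four consonants are covered, and the Tamil fallback is one assumed example of showing a missing akshara through a phonetic equivalent.

```python
# Hypothetical phonetic keys ("ka", "ga", ...) stand in for the Editor's
# internal akshara codes, which are not documented on this page.
TABLES = {
    "devanagari": {"ka": "\u0915", "ga": "\u0917", "ma": "\u092E", "la": "\u0932"},
    "gurmukhi":   {"ka": "\u0A15", "ga": "\u0A17", "ma": "\u0A2E", "la": "\u0A32"},
    "malayalam":  {"ka": "\u0D15", "ga": "\u0D17", "ma": "\u0D2E", "la": "\u0D32"},
    "tamil":      {"ka": "\u0B95", "ma": "\u0BAE", "la": "\u0BB2"},  # no "ga"
}

# Assumed phonetic equivalents for aksharas that a script lacks.
FALLBACK = {"tamil": {"ga": "ka"}}


def transliterate(text, src, dst):
    """Map each known glyph of `src` to the phonetically equal glyph of `dst`."""
    to_key = {glyph: key for key, glyph in TABLES[src].items()}
    out = []
    for ch in text:
        key = to_key.get(ch)
        if key is None:
            out.append(ch)  # English letters, spaces, etc. pass through
            continue
        if key not in TABLES[dst]:
            # Use the assumed phonetic equivalent when the akshara is absent.
            key = FALLBACK.get(dst, {}).get(key, key)
        out.append(TABLES[dst].get(key, ch))
    return "".join(out)


word = "\u0915\u092E\u0932"  # Devanagari ka-ma-la
print(transliterate(word, "devanagari", "gurmukhi"))
print(transliterate(word, "devanagari", "malayalam"))
```

Because the lookup goes through a script-neutral key rather than directly from glyph to glyph, adding another target script only requires another table.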

 

5. Cut/Copy and Paste into other applications.

   Text prepared using the Editor may be pasted into applications such as Microsoft Word, Wordpad, Instant Messenger, Outlook Express and many others.  In essence, the IITM Editor allows many Windows applications to be enabled with all Indian languages. One need not, therefore, look for Word in Indian languages, with its limited features for handling the Indian scripts. Seen below are examples of cut and paste. In one case, the text from the Editor is copied into Word, where it can be formatted further.  A more interesting application is seen where the text from the Editor is copied into the composer window of Outlook Express. Email in Indian languages is just a click away from the Editor.

The Editor also supports Find/Replace on strings in local languages, as the screen image given below illustrates. The keystrokes are echoed in Roman and the text string itself is displayed in a separate window. The language selection allows strings to be entered in specific languages.


6. Support for more than 10,000 aksharas across all the Indian scripts.

 The Editor allows correct data entry for a great many conjuncts (Samyuktaksharas) across the different languages. Approximately 800 conjuncts are recognized by the Editor, and each of these may combine with one of up to 16 vowels to yield the above number. The data entry scheme also permits new conjuncts to be typed in, consistent with the rules of the writing system for each script.
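As a quick check on the figure in the heading, the count can be worked out from the two numbers quoted in this paragraph. Treating 800 and 16 as exact is an assumption made here; the text gives the first as approximate and the second as an upper limit, so the product is an upper bound rather than a precise count.

```python
conjuncts = 800            # approximate number of recognized conjuncts
vowels_per_conjunct = 16   # upper limit on vowels a conjunct may take

upper_bound = conjuncts * vowels_per_conjunct
print(upper_bound)  # 12800, consistent with "more than 10,000"
```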

    In each script, up to 13 punctuation marks and 10 numerals (in their respective scripts) are supported. Traditionally, Indian scripts have used few punctuation marks, if any. However, current requirements for publishing text in Indian languages presuppose the availability of most of the Roman punctuation symbols.

   Data entry allows for typing in Vedic accent marks in Devanagari and the Grantha scripts. Samavedic accent marks are also supported for Grantha, the script used in South India for writing Sanskrit.

7. Support for Urdu.

   Urdu, Arabic, Hebrew and other languages whose scripts are written from right to left are supported in the right-to-left version of the Multilingual Editor.  There are two versions of the right-to-left Editor as well. In the first, the text generated conforms to the correct sorting order (alphabetical order) of the native script. In the second, the text conforms to the sorting order of Indian aksharas. The first version thus makes linguistic processing in Urdu, Arabic and similar languages very easy.

Details of the Arabic/Urdu Editor are presented on a separate page.
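The difference between the two sorting orders can be sketched with a custom sort key. This is an illustration only, not the Editor's encoding: it assumes Unicode Arabic letters, uses just seven of them, and `ARABIC_ORDER`, `AKSHARA_ORDER` and `sort_words` are hypothetical names. The akshara order simply arranges the same letters by the position of their nearest sounds in the Indian consonant sequence (ka, ja, ta, da, ba, ma, la).

```python
# Seven Arabic letters in native alphabetical order: ba ta jim dal kaf lam mim
ARABIC_ORDER = "\u0628\u062A\u062C\u062F\u0643\u0644\u0645"

# The same letters in the order of the equivalent Indian aksharas
# (assumed equivalences): kaf jim ta dal ba mim lam
AKSHARA_ORDER = "\u0643\u062C\u062A\u062F\u0628\u0645\u0644"


def sort_words(words, order):
    """Sort words letter by letter according to the given alphabet order."""
    rank = {ch: i for i, ch in enumerate(order)}
    # Letters outside the table sort after all known letters.
    return sorted(words, key=lambda w: [rank.get(ch, len(rank)) for ch in w])


words = ["\u0643\u0644", "\u0628\u062F", "\u0645\u062A"]  # kaf-lam, ba-dal, mim-ta
print(sort_words(words, ARABIC_ORDER))   # ba-dal first
print(sort_words(words, AKSHARA_ORDER))  # kaf-lam first
```

The same word list comes out in a different sequence under each table, which is why a version of the Editor has to commit to one order or the other.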

 

8. New scripts.

   The design of the Editor allows new scripts to be introduced without difficulty. The basic principle of the design rests on the concept of the akshara, and the internal representation is the equivalent of the akshara (i.e., a sound). Hence the display of the akshara can be effected in any script through look-up tables.  The Multilingual Editor will also accommodate new fonts for any script. The tools for introducing new fonts are included in the IITM package. However, the wide variations seen in the fonts designed for Indian scripts make it virtually impossible to guarantee that all the aksharas will be properly rendered. The set of fonts recommended by IIT Madras fulfills the requirements for correct rendering of all the aksharas in all the languages. The Editor package includes these fonts. For an interesting discussion on the vagaries of fonts for Indian scripts, please visit the corresponding page.
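A rough sketch of why a new script or font is easy to add but hard to guarantee: if the text is stored as script-neutral akshara codes, a new script is just one more look-up table, and a coverage check reveals which aksharas that table cannot render. The names `REQUIRED`, `register_script` and `render`, the five-entry inventory and the Unicode glyphs are assumptions made for illustration, not part of the IITM package.

```python
# Hypothetical inventory of akshara codes every script table should cover.
REQUIRED = ["ka", "kha", "ga", "ma", "la"]


def register_script(registry, name, table):
    """Add a look-up table for a script; return the aksharas it cannot show."""
    registry[name] = dict(table)
    return [a for a in REQUIRED if a not in table]


def render(registry, name, aksharas, placeholder="?"):
    """Display a sequence of akshara codes using the named script's table."""
    table = registry[name]
    return "".join(table.get(a, placeholder) for a in aksharas)


registry = {}
# A deliberately incomplete Kannada table (Unicode glyphs, assumed here).
missing = register_script(
    registry, "kannada", {"ka": "\u0C95", "ma": "\u0CAE", "la": "\u0CB2"}
)
print(missing)                                       # aksharas without a glyph
print(render(registry, "kannada", ["ka", "ma", "la"]))
```

A font that leaves entries in the missing list is the situation the paragraph warns about: the text is stored correctly, but some aksharas have no glyph to display them.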
 

9. Special Versions for Tamil

 There are two specially designed versions of the Editor that conform to the data entry standards recommended during the Tamilnet99 conference held at Chennai. The first conforms to the phonetic standard and the second allows data entry based on a standard manual Tamil typewriter keyboard. Details are available.
 


Contents

Introduction

Flexible data entry

Edit large files

Dynamic selection of scripts

On-screen transliteration

Cut and Paste

Large set of Aksharas

Support for Urdu

Adding new scripts

Special versions of the Editor for Tamil


Download page



Download and installation instructions are given in the readme file linked below.

Readme File
(To be read first)



Editor Help

HTML document describing the use of the Editor. Also deals with aspects of Indian languages and scripts.



Data Entry

Information on data entry schemes in different versions of the Editor


System Requirements

IBM PC compatibles running Microsoft Windows (98/Me/2000/XP) or Linux

16 MB of main memory (32 MB will help for good performance)

About 5 MB of hard disk space

An SVGA graphics card with a resolution of 800x600 or better.


Last updated on 11/08/12     Best viewed at 800x600 or better