Using the Multilingual Editor

  The multilingual editor is used for preparing text documents in different Indian languages. The Editor is an application written for use under Linux as well as Microsoft Windows (98/Me/2000/XP). It provides a graphical user interface, as seen in the screen shot.

   As in normal applications supporting drop down menus, the editor features menus for file operations, editing and selecting the script. The currently selected script and the mode of input are displayed at the bottom of the window along with the line number where the cursor is located. It must be kept in mind that the application is a text editor and therefore word processing features are almost completely absent. Yet, the text prepared by the editor can be taken to a word processor (e.g., Abiword or Microsoft Word) and formatted as desired.

  Features of the editor which are useful for multilingual text preparation are discussed below along with details of use. 

Opening Files

  A new file can be opened by clicking 'New' in the 'File' option in the main menu. The name for the file will be required only when you save the file.

  An existing file can be opened by selecting 'Open' in the 'File' option of the main menu. A new window then lists the available .llf files. Click on a file name, or type the file name in the text area beside 'File Name' and click the 'Open' button. This is the standard feature seen in many Microsoft Windows applications.

  The names of some recently opened files are listed on the menu displayed when the 'File' option is clicked, and any of them can be opened just by clicking on its name. This is a standard feature seen in many Windows applications.

  In the present Editor, the file names will have to be specified in English. In future versions, it is envisaged that the user will be able to name the files in local languages.

Selecting a Language
  While opening an already existing file, the file is checked for the information about the language in which it was prepared. (Each file features a header that contains self-identifying information about the file.) In case the header is absent, the file is opened in the Default language (which has been set to SANSKRIT). It is possible that a text file conforming to the coding scheme of IIT Madras is prepared by another text processing application. In such a case, the pure text may not have a header. 

  The current version of the Editor will give an error message about the missing header but will allow you to open the file if you say yes. It is possible to fool the Editor by taking an arbitrary file and naming it with a .llf extension. The Editor will open the file, but you will end up editing a local language text string that may have no meaning for you, just as you would when editing a binary file with a text editor.

  During the editing process, a new language may be selected by clicking on the 'Language' option on the main menu and then clicking on the required language. If you have added a new language, choose the 'Other Languages' option in the menu and then choose the entry under which you have added your language in the IITMfcEd.ini file. The screen shot below shows this.

On Screen Transliteration
  It is possible to change a portion of text seen in one script into another dynamically. Just select the text using the mouse and open the language menu. Choose the language and the selected text will immediately change to the newly specified script. Please note that this powerful on screen transliteration facility will work properly even for a single akshara. However, it may be difficult to identify exactly where an akshara begins and where it ends if it includes a few zero width glyphs (as in the case of a Matra). So we recommend that you transliterate whole words and do a bit of editing if necessary. Transliteration will be possible only if the selected text is in one single script. If English letters are included in the selected text, transliteration will be effected on that text as well, leading to strange results!


Fonts used by the Editor
  The current version of the Editor has basic support for all the Indian languages/scripts. The practical use of the Editor is, however, restricted to those languages for which the TrueType fonts and the associated .tab files are available. As of May 2001, all the languages/scripts of India including Urdu are supported. In the list below, the names in the second column refer to the names of the TrueType fonts required for displaying the script corresponding to the indicated language. Where other fonts may also be used, appropriate mention is made in the third column.
Sanskrit 1.2, Sanskrit98, Xdvng
Adhawin, Iweb-kambar, tamnet99_fonts
ltml_manoj, kerala
Basic Editing Operations
  The text can be edited by moving the cursor using the arrow keys and the Page Up and Page Down keys, and keying in the desired text. While editing, the cursor moves one akshara at a time. That is, once you have typed a full character, you cannot delete only a part of that character. This is because the character is obtained by combining different keystrokes, all of which are assembled into a single character. This lets you add vowels to consonants to form a full character even in the middle of the text. This behaviour is consistent with the fact that the internal representation of text is in syllable form.

  Whenever you move to a different line, the cursor may land at a place not directly above or below the old position. If there is no text on that line, the language is set to the default language; if there is some text on that line, the language is set to the language of that text. A general recommendation is to type no more than fifty characters per line, though many more can be typed in. The language can be set by selecting the required language from the 'Language' option in the main menu. Other editing options provided are Cut and Paste, Search and Replace, and inserting text from other files.

Mixing many Languages
  A line of text in the Editor can have many languages, which can be selected from the 'Language' option in the main menu. A general recommendation is to use around two or three languages per file. The languages can be selected in any order. A file having multiple languages is saved in a manner where the languages are preserved when it is reopened in the Editor or browser. You may save a file in a different language, but the change will apply only to the default language used in the entered text. Please see the section on Saving and "Saving As" Options.
Cut and Paste Options
  This option for cutting and pasting of text is useful for generating text that has many repeated parts. The text to be replicated can be selected by dragging the mouse over it (keeping the mouse button down as the mouse is moved). The Select All option in the Edit menu of the main menu can be used to select the whole file. Selecting the Copy option in the Edit menu copies the selected text into a buffer in memory. After placing the cursor at the desired target location, the Paste option in the Edit menu can be used to replicate the text there (any number of times).

  The only limitation of this method is that whole lines of text must be cut and pasted at a time; this is currently a design limitation. Another file can be pasted using the Insert File option in the Edit menu. The cut/copy and paste operation can also be used to take text into other Windows applications such as Word, Excel, Outlook Express (email composer window) or even the Microsoft instant messenger. This is a very useful feature of the Editor. The copy and paste operation combined with the on-screen transliteration feature can save hours of work in preparing multilingual documents where the same text is seen in different scripts.


Search and Replace Options
  When this option is selected, a new window appears on the screen with three fields. The first is for the language specification and the second is for the input string. The third field shows the string in the local language as letters are typed into the second field.

  The string to be searched should be typed in the text area where the keystrokes are echoed in Roman. This string is dynamically transformed into the chosen script and is displayed in the text area below. The 'Find' button should then be clicked to locate the text. 'Reset' will clear the search string. The search can be repeated through the entire file by pressing the F3 key or by clicking on the 'Find Next' option in the Edit menu of the main menu.

  It must be borne in mind that this somewhat different approach to inputting the search string is caused by some restrictions imposed by the design of the Editor, which relies on the Microsoft Foundation Classes. It turns out that there is no easy way to accept a string in a local language for the find option. Hence the input string is really presented as an ASCII string, which is then processed to display the equivalent local language characters. Since the input has to conform to ASCII conventions, use of the Ctrl key while forming conjuncts will cause problems, as the control key will be interpreted differently. To avoid this, the user should use the "^" key to indicate combinations while entering strings with conjuncts. Please note that the use of the caret key will cause some problems for Vedic accents, where the caret key has been assigned a specific function!

  Text can be searched and replaced using the Replace option in the Edit option of the main menu. Select a language for the text to be searched and replaced. Type the text to be searched in English in the text area; it is dynamically transformed into the selected language in the text area below. Then type the transliterated replacement text in English in the 'Replace with' text area. Click on the 'Replace' button to replace the first occurrence of the text, or on the 'Replace All' button to replace the text in the entire file. This option is useful for correcting wrongly keyed-in text at a later point of time.
The search and replace operation is not yet supported in the Linux version (Jan. 2003).


Inserting Text

  Text can be inserted at a point in a file by the Cut and Paste option or by simply moving the cursor to that point and keying in the text. The Insert File option in the Edit menu also allows the user to add text from another file. This can be very useful in association with the Save and "Save As" options. A file can be typed in one language and saved as a file in another language using the Save As option. These two files can then be concatenated into one file by using the Insert File option. In this way an important file can be converted into any of the languages provided, and concatenated into one file having many languages, in a matter of a few minutes! The sample Vande.llf file was prepared this way. The same file could also have been prepared using the on-screen transliteration feature. In the current version of the Editor, new data from another file may be inserted at any selected line and not merely at the end.

  Please note that cut/paste and insert file operations work with whole lines only. This is a limitation in the current implementation of the editor.

Saving and "Saving As" Options
  An interesting feature of the IITM Editor is that the data input during the editing process is retained in the memory of the computer in two different formats simultaneously. Hence the file may be saved in either of the formats. The first of these is the .llf format, in which the characters of different Indian languages are stored as 16 bit codes. The .llf format is universal in the sense that it is a language independent representation of the text, which allows automatic transliteration across different languages. The .llf format is also recognized by IITM software running on other computer systems such as UNIX machines, DOS machines and the Macintosh.

  The .llf format is a compact format where each character, be it a vowel, consonant, conjunct or combination, occupies two bytes. This format is a BINARY format and when the Editor saves the text in this format, it produces a binary file consisting of 16 bit codes. The Editor attaches a header to the file when it is saved. This header consists of specific Multilingual information relating to the contents of the file.
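
  As a rough illustration of the two-bytes-per-character layout described above, the sketch below splits a payload into 16 bit codes. This manual does not state the byte order or the size of the header, so the function expects a payload with the header already removed and takes the byte order as a parameter. The name decode_codes is hypothetical and is not part of the IITM Local Language library.

```python
from typing import List


def decode_codes(payload: bytes, byteorder: str = "little") -> List[int]:
    """Split a header-free payload into 16 bit codes, one per character.

    Assumptions (not confirmed by the manual): the caller has already
    removed the file header, and the caller knows the byte order.
    """
    if len(payload) % 2 != 0:
        raise ValueError("payload length must be a multiple of two bytes")
    return [
        int.from_bytes(payload[i:i + 2], byteorder)
        for i in range(0, len(payload), 2)
    ]


# Four bytes always hold exactly two characters, whatever they are.
codes = decode_codes(bytes([0x01, 0x00, 0x02, 0x01]))
print(len(codes), codes)
```

  Because every character has the same width, the number of characters in a file is simply half the payload length, which is what makes indexing and similar processing straightforward.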

  The name of the file may be typed in when the window for specifying the file name appears. By default, the Editor will save a file as "untitled.llf" if no file name is specified.

  The .llf format is very useful for saving the text if further processing of the entered text is required, e.g., indexing the words, generating concordances, or other linguistic processing. The IITM Local Language library may be used to write applications that work with 16 bit codes.

  The second format is known as the Rich Text Format (RTF), a standard used by Microsoft to produce documents which may be easily imported into other application software (such as Word Perfect, Microsoft Word or Wordpad). The Rich Text Format consists purely of ASCII text, incorporating mechanisms to denote formatting information.

  Text saved in Rich Text Format may be easily imported into applications such as Wordpad, Microsoft Word etc., thus permitting global formatting of the entered text using the features of these applications. Rich Text Format can also be easily translated into the HTML format useful for generating web documents. To view a .rtf file generated by the IITM Editor on another system, the corresponding fonts must be available on that system.

  The text saved in the Rich Text Format may be directly printed from any application that can handle the format. This is also the preferred way of inserting text in Indian languages into other documents, say those prepared using Word or Word Perfect.

  It must be remembered that a file saved by the Editor in .rtf format cannot be opened again by the Editor. The Editor will open only files saved in the .llf format. The Rich Text Format embeds the information relating to the specific fonts which must be used in viewing the text. Thus, while the .rtf format is truly portable, one must also install the corresponding font(s) on the system where the file is viewed.
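
  To make the point about embedded font information concrete, here is a minimal sketch of an RTF document that names a single font in its font table. It is not the Editor's actual output; the function name make_rtf and the sample text are hypothetical, and the font name is taken from the font list in this manual.

```python
def make_rtf(text: str, font_name: str) -> str:
    """Wrap ASCII text in a minimal RTF document that names one font."""
    # Backslash and braces are RTF control characters and must be escaped.
    escaped = (
        text.replace("\\", "\\\\").replace("{", "\\{").replace("}", "\\}")
    )
    return (
        "{\\rtf1\\ansi"
        "{\\fonttbl{\\f0\\fnil " + font_name + ";}}"
        "\\f0\\fs24 " + escaped + "}"
    )


# "Sanskrit 1.2" is one of the fonts listed in this manual. The text is a
# placeholder, not real glyph codes for that font.
doc = make_rtf("sample text", "Sanskrit 1.2")
print(doc)
```

  The file stores only the font's name, not the font itself. A viewer without that font will substitute another one, and the glyph codes will no longer display as the intended script.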

  When you select the 'Save as' option, you can choose both the language in which the file is to be saved and the format. Note that a new file has to be first saved using the 'Save' option before it can be saved using the 'Save as' option.


Saving the file in a different language/script

  The Editor works with a universal representation for the characters in all the Indian Languages. Thus when data is entered using one script say Malayalam, the text can be saved so as to identify it as a document in, say Bengali. This means that when you open the saved file it would come up with Bengali as the script.

  What would be the use for such a feature?

  Often when preparing multilingual documents, the same text (typically a couplet from the Gita or a poem of Tagore) may have to be reproduced in different scripts for people to read it in their own mother tongue. In such situations, one need not enter the same text again and again in different languages. (Remember that when we say text in Indian languages we mean text that is phonetically presented.) Once the text is prepared in the base language, a copy of it can be opened by the Editor and saved in another language. This new file, if appended to the first one, will produce a new document with the same text in two different languages. The insert text option of the Editor will be useful for appending files or inserting text from one file in the middle of the document being edited.

  Note: The assumption that there is a common phonetic base across all the Indian languages is the basis for this feature. While it may be thought that this should permit any text to be entered in any language/script, one must keep in mind that during data entry, the input is limited to the characters of the specific language chosen. Characters found in other languages but not the one in use in the Editor, cannot be input. Thus the universal representation is useful for uniform display of the phonetic information. Characters specific to one language can be input only in that language.
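
  The idea of a common phonetic base can be illustrated by analogy with Unicode, where the blocks for the major Indian scripts share a parallel layout, so that a letter keeps its position when it is moved from one block to another. The sketch below is only that analogy: the Editor uses its own 16 bit codes rather than Unicode, and the function name transliterate and the table of block starts are assumptions made for illustration. It is deliberately naive and does not treat shared punctuation specially.

```python
# Start of each script's Unicode block; the blocks share a parallel layout.
BLOCK_START = {
    "devanagari": 0x0900,
    "bengali": 0x0980,
    "tamil": 0x0B80,
    "malayalam": 0x0D00,
}
BLOCK_SIZE = 0x80


def transliterate(text: str, source: str, target: str) -> str:
    """Move each source-script character to the same position in the
    target script's block; all other characters are left untouched."""
    src, dst = BLOCK_START[source], BLOCK_START[target]
    out = []
    for ch in text:
        offset = ord(ch) - src
        if 0 <= offset < BLOCK_SIZE:
            # The target block may have no letter at this position, which
            # mirrors the note above about script-specific characters.
            out.append(chr(dst + offset))
        else:
            out.append(ch)
    return "".join(out)


print(transliterate("वन्दे", "devanagari", "malayalam"))
```

  Unlike the Editor's on-screen transliteration, this sketch leaves English letters unchanged instead of converting them.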

  Vande.llf, included with the Editor package, is an example of a file prepared by the Editor, which displays the same text in different languages. 

Output formats generated by the Editor
  The keyed-in local language text can be output as a .llf (Local Language Format) file as well as a .rtf (Rich Text Format) file. The details of these formats are described in the section on Saving and "Saving As" Options. The text displayed on screen is in a format compatible with clipboard-based cut/copy and paste applications; this is the .rtf format.


Printing Option
  The text being edited may be printed using the print option in the File menu. The print preview may be selected to get an idea of the appearance of the printed page. In practice, it would be easier to copy the text into a word processor and print the same after formatting the text to suit one's requirements.

  If multiple printers are installed in the system, the appropriate printer may be selected. Some flexibility is available in orienting the page(s) to be printed.

  The Editor does not permit formatting of the text as in some word processors. However, if the text was saved in the rich text format, a program such as Wordpad or AbiWord may be used to effect the formatting prior to the printing. The application accepting the .rtf file may be used to change the appearance of the page by selecting the fonts, sizes, left or right alignment etc. 

