About IITM Software

  The set of computer programs developed at the Systems Development Laboratory, Department of Computer Science and Engineering, IIT Madras, India, goes by the name IITM Software. The software permits the development of applications supporting multilingual user interfaces in all the Indian languages. With some applications, users will also be able to interact with a computer in their own mother tongue. The IITM software is useful for teaching people about computers and for training them in their use, directly in their own mother tongue.

    The distinguishing aspect of the IIT Madras software is that multilingual issues are handled at the level of the application itself, so support within the operating system is not called for. By taking this approach, the IIT Madras team has been able to provide a useful set of applications for data entry, printing and processing of text on a variety of computer systems. Data prepared on one computer may be moved to other systems without the need for special conversion utilities.

    Almost all multilingual development in the world today seems to go in the direction of providing support within the operating system to handle the specific languages of the world. The IIT Madras team has observed that while such an approach may indeed work, getting a common system to accommodate all the Indian languages is not going to be easy. Providing user interaction by writing applications that depend on both the platform and the language is going to result in too many variations across platforms. Accomplishing the same at the level of an application not only eliminates system-dependent programming but also provides a consistent and uniform user interface across all systems.


  The software developed at the Institute may be classified as under. 

  • Multilingual text and document preparation packages under different operating systems.
  • Conversion utilities to handle text files prepared using other applications such as ISCII or Unicode based word processors, transliterated text, etc.
  • A set of applications to prepare documents for display on the world wide web. These include utilities which can generate graphical images, PDF files, PostScript files etc., from text files prepared using the Multilingual editor. 
  • A set of applications consistent with linguistic processing of text in Indian languages. These include akshara and word frequency count programs, sorting, indexing and concordance generation utilities and search engines for use with the web. 
  • Indian language based command processors which are similar to a DOS or Unix shell. Users who do not know English may use these programs to learn to use computers.
  • A set of tutorials to help understand the IITM software. 
  • Applications to help the visually handicapped learn to use computers in their own mother tongue. These applications are multilingual in nature and have text to speech features built into them to provide synthesized speech output in an interactive environment. Utilities for getting Bharati Braille output are also included, along with a speech enhanced web browser based on Lynx.
  • A set of web based applications supporting on-line databases (MySQL). These applications will be useful for setting up web sites that disseminate information in Indian languages organized as databases.
  • A software development pack for developing multilingual applications using C, C++, Java, Visual C++, etc.
  • Perl modules to work with Indian scripts so that a variety of text processing applications suited to our requirements could be written.



The IITM software project was begun in 1991 and initially concentrated on developing a uniform internal representation for the aksharas of Indian languages. This approach is consistent with writing systems that are syllabic in nature, as is the case with all Indian scripts. It led to a system of sixteen bit codes for the aksharas of the languages, and the scheme supported more than 12,000 different aksharas. Subsequently, a C library supporting functions similar to the curses library of Unix was developed for different platforms, and a simple editor, a viewer and a PostScript printing utility were completed. This library included an effective character rendering utility to display the aksharas of our languages using primitives made up of curves. This approach allowed text to be rendered uniformly on all systems. No fonts were used by the library as there were (and still are) no standards for Indian language fonts on any of the platforms. Besides the Indian languages, the system was also able to deal with Greek, Hebrew and Japanese Hiragana.

   The first three applications developed using the library were 1) a multilingual viewer for viewing text, 2) a screen editor capable of handling all Indian scripts and 3) a printing utility to generate a PostScript file from the text prepared using the editor so that hard copy output may be obtained. The multilingual viewer could be invoked as an external application to view Indian language text from web browsers and email applications that supported MIME attachments. It was a simple but very effective approach to displaying Indian languages on web browsers, and it did not require fonts of any sort to be installed at the browser end. All three applications were developed for use on many different platforms including DOS, Win-3.x, Win95/NT, Unix systems including Linux, Sun workstations, HP systems, IBM RS6000 machines, Silicon Graphics systems and finally the Macintosh.

   The character rendering program was also able to generate a .gif file of the text to be displayed. By generating the .gif file on the fly, it was possible to serve Indian language documents that could be seen on virtually any graphics based browser. A search engine was also developed by adapting an existing indexing program to work with 16 bit characters. Combined with the .gif file generation, this allowed web based search applications to work with Indian language text. Samskritapriyah, a volunteer group in Madras, used the software to put up on-line lessons for learning Sanskrit using this approach, and this was received very well.

   The second phase of the development of the IITM software concentrated on enhancing the system to support fonts. This is one of the most complex problems since there is no standardization whatsoever in respect of Indian language fonts.  Font designers had used arbitrary encodings and arbitrary choices for the glyphs themselves in generating the fonts.  As a consequence, Indian language applications were  necessarily font specific and most certainly language specific. 

   The IIT Madras system handled this effectively by developing a layer between the character codes and the font rendering program. This layer used a table to derive the glyphs for any akshara, so multilingual text could be displayed merely by switching tables. The separation between internal representation and display rendering is truly incorporated in this approach, which makes linguistic processing very effective. Also, any font that has the necessary glyphs to render all the required aksharas could be used for the display.
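
   The table-driven layer described above can be illustrated with a minimal Python sketch. This is not the IITM library's code: the akshara codes, glyph indices, table names and the `glyphs_for` helper are all hypothetical values chosen for illustration. It only shows how the same internal codes can yield different glyph sequences when the table is switched.

```python
# Hypothetical 16-bit akshara codes (NOT the actual IITM code values).
AKSHARA_KA = 0x0100
AKSHARA_KI = 0x0102

# Hypothetical glyph index used when a font table has no entry for a code.
MISSING_GLYPH = 0x00

# One table per font/script: akshara code -> ordered list of glyph indices.
# The glyph indices are invented; a real table would follow a font's layout.
DEVANAGARI_FONT_TABLE = {
    AKSHARA_KA: [0x41],
    AKSHARA_KI: [0x7A, 0x41],  # vowel sign glyph placed before the consonant
}
TAMIL_FONT_TABLE = {
    AKSHARA_KA: [0x30],
    AKSHARA_KI: [0x30, 0x55],  # vowel sign glyph placed after the consonant
}


def glyphs_for(codes, table):
    """Map a sequence of akshara codes to a flat list of glyph indices."""
    glyphs = []
    for code in codes:
        glyphs.extend(table.get(code, [MISSING_GLYPH]))
    return glyphs


text = [AKSHARA_KA, AKSHARA_KI]
devanagari_glyphs = glyphs_for(text, DEVANAGARI_FONT_TABLE)
tamil_glyphs = glyphs_for(text, TAMIL_FONT_TABLE)
```

   The stored text never changes; only the table passed to the lookup does, which is the separation between internal representation and display that the paragraph above describes.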

   By this time it was clear that the key to developing multilingual interfaces was a language and font independent internal representation. Since this had already been implemented as part of the initial design, application development became easier. The font based output also satisfied the requirements of users who wanted quality printed output for publications.

   By June 1998, the development team took a decision to restrict the development of applications to two platforms only, Linux and Microsoft Windows, since the process of development for a variety of systems was getting to be unwieldy. The MFC based Windows editor is a particularly useful piece of software since it provides Word compatible documents in the .rtf format. The multilingual editor, together with a word processor or DTP application, can produce truly high quality printed documents.



   The project taken up at the Systems Development Laboratory represents a unique experiment in system design: development has continued over a period of more than fourteen years, resulting in a set of useful applications for use in the country. During the second half of 1998, the development team worked at enhancing the system to provide support for disabled persons. We have been able to synthesize speech in Indian languages from the text representation supported by the software. It has also been possible for us to produce Braille output in Indian languages consistent with the Bharati Braille recommendations. After preparing the multilingual text, the Braille output may be obtained using any standard Braille embosser connected to the system. These two applications have added strength to the IIT Madras software as they provide visually handicapped persons a means of accessing information much more meaningfully.

  A number of programs listed above were essentially developed during the period 1998-2005. The lab had set up a web site (the one you are currently viewing) to present the IITM software to users and had also included some useful on-line demos with Java applets and search engines. Application development for the visually handicapped continues to receive priority, as do text and linguistic processing applications. The Perl modules for dealing with Indian scripts (16 bit codes) should allow easy development of many linguistic applications. A page describing the different applications is also included at the site.
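
  Because each akshara is a single sixteen bit code, a linguistic utility such as an akshara frequency counter reduces to counting integers. The Python sketch below illustrates that idea only; it is not the project's Perl module, and the function name and sample code values are hypothetical.

```python
from collections import Counter


def akshara_frequencies(codes):
    """Return (code, count) pairs, most frequent first.

    `codes` is any iterable of 16-bit akshara codes held as integers.
    """
    return Counter(codes).most_common()


# Sample values are invented for illustration, not real IITM codes.
sample_text = [0x0100, 0x0102, 0x0100, 0x0230, 0x0100, 0x0102]
frequencies = akshara_frequencies(sample_text)
```

  Since the codes are uniform across languages, the same counting routine would apply to text in any of the supported scripts without script-specific logic.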

A note about the .llf files

  The syllable level coding which forms the internal representation of text in Indian languages is special to the IITM Software. Each akshara, be it a pure vowel, consonant or conjunct, has a unique sixteen bit representation which is also uniform across all the languages. This representation includes numerals and special symbols used in the writing systems of India.

  All applications developed using the IITM software library use the sixteen bit representation for string processing. The .llf format is just a series of 16 bit codes, similar to pure text in ASCII where each letter goes with a seven bit code within a byte. All programs built around the software use the .llf format. The .llf format is a binary format and is not amenable to editing with any other software. The IITM software does include utilities to convert the representation to ISCII, Unicode and Roman transliterated text, and hence one should be able to handle files prepared using other software by converting them to the .llf format.
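
  As a rough illustration of what "a series of 16 bit codes" means in practice, the Python sketch below packs and unpacks such a series using the standard `struct` module. The byte order is not stated on this page, so little-endian is an assumption here and is exposed as a parameter; the function names and the sample code values are likewise hypothetical, and the sketch is not a substitute for the project's own conversion utilities.

```python
import struct


def pack_codes(codes, byte_order="<"):
    """Serialize 16-bit codes to bytes.

    byte_order: "<" for little-endian (assumed default), ">" for big-endian.
    """
    return struct.pack(f"{byte_order}{len(codes)}H", *codes)


def unpack_codes(data, byte_order="<"):
    """Parse bytes back into a list of 16-bit codes."""
    if len(data) % 2 != 0:
        raise ValueError("expected an even number of bytes (2 per code)")
    return list(struct.unpack(f"{byte_order}{len(data) // 2}H", data))


# Sample values are invented for illustration, not real IITM codes.
codes = [0x0100, 0x0102]
raw = pack_codes(codes)
restored = unpack_codes(raw)
```

  Each code occupies exactly two bytes, which is why such a file looks like binary data in an ordinary text editor.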




The IITM Software relates to computer applications which support user interfaces in the different languages of India.

History of the IITM software project






The applications work transparently across all the scripts in use in India. Support for scripts written right to left is also provided.

The Software offers tools for students and professionals to develop their own multilingual applications.

Technical aspects of the IITM software
(The contents of these pages present an overview of the technical issues involved in local language computing)




The approach of handling syllables at the level of an application, as opposed to taking support from the operating system, simplifies multilingual application development. Also, text to speech is easily accomplished, allowing the applications to be used by the visually handicapped.

The IIT Madras syllable level coding scheme is described in a separate page.




As of March 14, 2004, the IITM project has opened up the development of applications to all interested persons. A new project called IMLI (IIT Madras Language Interface) has been started at Sourceforge. Please visit http://sourceforge.net/projects/imli/ for further information.



 
A musical composition written on Palm Leaf. The manuscript dates back to 1790. A full view of the page in the manuscript (in the Grantha Script) is presented in the page discussing the preservation of information in Palm Leaf manuscripts.



Last updated on 10/31/12