Application areas for the IITM Software. The primary aim of the project taken up at the Systems Development Laboratory has been to develop a system of computer programs which permit a uniform and language independent approach to designing computer applications supporting user interfaces in all the Indian languages. |
Multilingual Document preparation The IIT Madras editor program may be readily used to prepare text in all the Indian scripts. Text prepared using the editor may be imported into other applications such as word processors (e.g., Microsoft Word). The text may also be quickly converted to HTML (as well as PostScript and PDF) for display in standard web browsers. Seen below are screen shots of the early version of the multilingual editor, which uses curves to draw the characters on the screen, and the recent version, which uses fonts. ![]() The editor supports a very rich set of aksharas, including many not covered by standard coding schemes such as Unicode or ISCII. A point to keep in mind is that almost any desired representation for an akshara can be provided dynamically through externally specified parameters. Printouts of high quality may be produced via PostScript or through the word processors into which the text is imported. Data entry is natural and uniform across all Indian languages. |
The Multilingual Editor from IIT Madras is a versatile application that supports a uniform user interface for data entry in all the Indian scripts. Scripts from South Asian Regions are also accommodated. As of July 2002, a special version of the editor handles Urdu, Arabic and other scripts written right to left. The multilingual editor can be downloaded free of cost from this site. |
Automatic transliteration across all Indian languages Applications which require the same text to be displayed in two or more languages simultaneously may be easily handled with the editor. Data need be entered in just one script, and the text may be reproduced in other scripts automatically. In the example below the first couplet was entered in Devanagari using the multilingual editor. All the others were automatically transliterated and added. ![]() Incidentally, this is a couplet in the Arthashastra that specifies the physical proportions which must be adhered to in erecting a pillar. Some food for thought for Civil Engineers! |
Transliteration with the IITM software results in the most appropriate representation in the script for sounds that form the basis of the set of Aksharas supported in the software. Sounds present in one language but not in the other are represented in the second case through shapes that are accepted as close equivalents. |
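As a rough illustration of the idea, the sketch below transliterates between scripts using Unicode, not the IITM coding scheme, whose internals are not described here. It assumes the parallel layout of the Unicode Indic blocks (inherited from ISCII), so a character is moved to the same offset in the target block. Where the target script has no character at that offset, a caller-supplied table of close equivalents is used. The function name, the block table and the fallback table are all hypothetical, and the parallel layout is only approximate for some characters.

```python
import unicodedata

# Starting code points of some Unicode Indic blocks (each 128 code points wide).
BLOCK_START = {
    "devanagari": 0x0900,
    "bengali": 0x0980,
    "tamil": 0x0B80,
    "telugu": 0x0C00,
    "kannada": 0x0C80,
    "malayalam": 0x0D00,
}


def transliterate(text, source, target, fallbacks=None):
    """Map each character to the same offset in the target script's block.

    `fallbacks` maps a source character to a replacement string for use
    when the target block has no assigned character at that offset.
    """
    fallbacks = fallbacks or {}
    src, dst = BLOCK_START[source], BLOCK_START[target]
    out = []
    for ch in text:
        offset = ord(ch) - src
        if 0 <= offset < 0x80:
            candidate = chr(dst + offset)
            if unicodedata.category(candidate) == "Cn":  # unassigned code point
                candidate = fallbacks.get(ch, ch)
            out.append(candidate)
        else:
            out.append(ch)  # spaces, punctuation, other scripts pass through
    return "".join(out)


# Tamil has no separate letter for "ga"; a close equivalent is "ka".
TAMIL_FALLBACKS = {"\u0917": "\u0B95"}
print(transliterate("\u0915\u092E\u0932", "devanagari", "telugu"))
print(transliterate("\u0917\u092E", "devanagari", "tamil", TAMIL_FALLBACKS))
```

The fallback table plays the role of the "close equivalents" mentioned above: the choice of equivalent is a linguistic decision, so it is kept as data and not hard-coded.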
Generating indexes and concordances for words The IIT Madras software includes programs for indexing texts so that concordances may be generated for the words in the text. The index generated may also be sorted to yield meaningful word indexes for further study of the manuscripts or text. Seen below is a portion of the concordance generated for words in the Tirukkural, which consists of 1330 couplets. Each couplet consists of seven words, and altogether there are about 6500 distinct words in the text of the Kural. Shown are the words in sorted order with the number of the "Adhikaram" and the number of the couplet itself. ![]() It might appear that the sorting order is not strictly maintained for the word seen on the last line. This is not a bug in the sorting algorithm. The space character in the last line has a code value higher than any Tamil akshara. Hence it falls after the other three words, though one would expect it to be placed before them. When text has to include letters which are not really part of the aksharas, it becomes necessary to treat them differently. In practice, it would be easy enough to handle the space before sorting the string. Likewise, the IITM coding scheme sorts the true consonant (i.e., a pure consonant without a vowel) by placing it after the vowel combinations. This is a matter of choice. The algorithm could equally place it at the beginning. Visitors who have learnt Tamil will appreciate this aspect of ordering the aksharas. |
(Proper sorting order for Indian scripts is maintained in the utilities) |
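The remark about the space character can be illustrated with a small sketch. It assumes words are stored as lists of integer akshara codes and that the space has a code above every akshara. The value 255 and the sample codes are made up for illustration and are not the actual IITM code values. Remapping the space to the lowest value in the sort key is one simple way to "handle the space before sorting".

```python
SPACE = 255  # hypothetical code for the space, above every akshara code


def naive_key(codes):
    """Sort by raw code values, so a space sorts after every akshara."""
    return list(codes)


def space_first_key(codes):
    """Remap the space to 0 and shift aksharas up by one so it sorts first."""
    return [0 if c == SPACE else c + 1 for c in codes]


# Three made-up entries; the second contains a space between two aksharas.
entries = [[10, 20, 30], [10, SPACE, 30], [10, 20]]

naive = sorted(entries, key=naive_key)
fixed = sorted(entries, key=space_first_key)
print(naive)
print(fixed)
```

The same key-function approach could place the pure consonant before or after its vowel combinations, since only the remapping inside the key changes.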
Displaying Roman (ASCII) transliterated text For many years Roman transliteration has been used to represent text in Indian languages. The conversion utilities in the IIT Madras software may be used to convert these texts into a form suitable for display in different scripts. For instance, the line shown below may be converted to give the Devanagari display which follows it. This is very useful for scholars preparing manuscripts in Indian languages who may not be very conversant with the scripts. They can type the text in Roman letters and have it converted automatically. The IIT Madras software includes a utility known as "tconvert" to view ASCII text, prepared according to some transliteration scheme, in the desired Indian script. Several transliteration schemes are handled by the utility. ![]() Incidentally, the IITM Software will also permit text in English to be transliterated into Indian scripts. The words in the given text are converted into appropriate phonemes and displayed in Indian scripts. Who will benefit from this? Apparently some of our politicians! ![]() |
a utility to convert Roman
transliterated text into local scripts
Online transliteration Service (Use this service to get your own copy of text in different Scripts) |
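A minimal sketch of how a Roman-to-Devanagari conversion can work is given below. It does not reproduce "tconvert" or any of the schemes it supports. The tiny mapping tables and the function name are assumptions made for illustration, and the output is Unicode Devanagari. The converter reads the longest matching Roman token, writes a vowel as a dependent sign when it follows a consonant, and joins adjacent consonants with a virama to form a conjunct.

```python
# Illustrative subset of a Roman scheme; not the tconvert tables.
CONSONANTS = {
    "k": "\u0915", "kh": "\u0916", "g": "\u0917", "t": "\u0924",
    "d": "\u0926", "n": "\u0928", "p": "\u092A", "m": "\u092E",
    "y": "\u092F", "r": "\u0930", "l": "\u0932", "v": "\u0935",
    "s": "\u0938", "h": "\u0939",
}
# Each vowel maps to (independent letter, dependent vowel sign).
VOWELS = {
    "a": ("\u0905", ""), "aa": ("\u0906", "\u093E"),
    "i": ("\u0907", "\u093F"), "ii": ("\u0908", "\u0940"),
    "u": ("\u0909", "\u0941"), "uu": ("\u090A", "\u0942"),
    "e": ("\u090F", "\u0947"), "o": ("\u0913", "\u094B"),
}
VIRAMA = "\u094D"


def roman_to_devanagari(text):
    out = []
    pending = False  # True while the last consonant still awaits its vowel
    i = 0
    while i < len(text):
        token = None
        for length in (2, 1):  # prefer the longest match, e.g. "aa" over "a"
            chunk = text[i:i + length]
            if chunk in CONSONANTS or chunk in VOWELS:
                token = chunk
                break
        if token is None:  # unknown character: close any consonant, copy it
            if pending:
                out.append(VIRAMA)
                pending = False
            out.append(text[i])
            i += 1
            continue
        if token in CONSONANTS:
            if pending:  # consonant after consonant forms a conjunct
                out.append(VIRAMA)
            out.append(CONSONANTS[token])
            pending = True
        else:
            independent, sign = VOWELS[token]
            out.append(sign if pending else independent)
            pending = False
        i += len(token)
    if pending:  # a word-final pure consonant keeps its virama
        out.append(VIRAMA)
    return "".join(out)


print(roman_to_devanagari("namaste"))
```

A real scheme needs a much larger table (aspirates, retroflexes, anusvara and so on) and rules for ambiguous sequences, which is why a utility such as "tconvert" supports several named schemes.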
Linguistic Applications The string processing library may be used to perform sophisticated string processing on Indian language texts. Calculations involving word frequencies, the number of occurrences of specific characters, conjuncts, etc. may be done very easily. Lexical analysis and parsing of sentences may also be performed with substantial ease. |
The page relating to Linguistics and Computation
has details on the utilities provided by IITM for linguistic processing
of text prepared using the Multilingual Editor.
Of special interest to linguistic experts will be the frequency analysis utilities for tabulating the frequency of occurrences of aksharas in a corpus specific to a language. Results of frequency analysis of text from Bhagavadgita, Kural and Tevaram are presented as examples. |
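The kind of akshara frequency count described above can be sketched as follows. In the IITM coding scheme each akshara is already a single unit, so the segmentation step below is only needed because this sketch works on Unicode Devanagari text. The regular expression is a simplified assumption (no nukta or other special signs are handled), and the function names are hypothetical.

```python
import re
from collections import Counter

CONSONANT = "[\u0915-\u0939]"
VIRAMA = "\u094D"
VOWEL_SIGN = "[\u093E-\u094C]"
MODIFIER = "[\u0901-\u0903]"  # candrabindu, anusvara, visarga
INDEPENDENT_VOWEL = "[\u0905-\u0914]"

# An akshara is either a consonant cluster with an optional vowel sign
# (or a final virama), or an independent vowel; both may carry a modifier.
AKSHARA = re.compile(
    f"(?:{CONSONANT}{VIRAMA})*{CONSONANT}(?:{VIRAMA}|{VOWEL_SIGN})?{MODIFIER}?"
    f"|{INDEPENDENT_VOWEL}{MODIFIER}?"
)


def split_aksharas(text):
    """Return the aksharas found in the text, in order of occurrence."""
    return AKSHARA.findall(text)


def akshara_frequencies(text):
    """Count how often each akshara occurs in the text."""
    return Counter(split_aksharas(text))


sample = "\u0928\u092E\u0938\u094D\u0924\u0947 \u0928\u092E\u0928"
for akshara, count in akshara_frequencies(sample).most_common():
    print(akshara, count)
```

Counting conjuncts separately would only require filtering the aksharas that contain a virama before tallying.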
Educational aids in Teaching languages and Science The multilingual capabilities of the system may be effectively used in teaching one language through another. Added to this, the ability to set up web pages makes the system especially attractive for designing computer-based training material for use in schools and educational institutions. The link at the right takes you to a sample lesson on Trigonometry (Pythagoras Theorem) in Hindi. The on-line lessons made available at this site for learning Sanskrit stand as excellent examples of educational material prepared using the IITM multilingual software. |
|
Development of Multilingual client applications Large scale resource sharing across computer systems has been rendered easy by the concept of client server applications. The fundamental principle behind a client server application is that the user interface to an application is separated from the actual processing of the information. The IIT Madras software, with its library of string processing functions, is well suited for developing applications which make use of Indian language user interfaces. Such applications find use with databases, searching through archives of information, on-line references, etc. |
On-line reference with script
display in different languages. The codes are maintained in a MySQL
database and can be queried to yield post office, district and state
Sanskrit Dictionary Use this free online reference to the Monier Williams dictionary. This presentation is an excellent example of a search application in Indian languages. |
Applications in the study and preservation of old manuscripts One of the important applications of the IIT Madras software relates to manuscript preservation. Rare palm-leaf manuscripts which are preserved (and should be preserved) are currently being transcribed manually. Many of these manuscripts were written in scripts which are no longer in use, e.g., the scripts in copper plates belonging to the early Chola period in South India. |
An exposition of the methods which may be used to display old manuscripts and provide search capabilities to a client application to locate specific manuscripts. |
Preparation of Manuscripts and documents containing text with Vedic symbols or special notation The multilingual editor is well suited for preparing and displaying Devanagari text containing Vedic notation. Four Vedic symbols are provided: Anudatam, Swaritam, Dheerga Swaritam and Kampitam (Kampa). These marks occur above, below, or both above and below an akshara. These symbols are usually found in printed texts of the Rig Veda and Yajur Veda. ![]() Shown below are the beginning lines of the portion of the Sama Veda referred to as "Ouhagana". The text is shown in Grantha using fonts developed at IIT Madras. ![]() |
(A brief introduction) |
Preparation of text with music notation for South Indian classical music The rich character set supported by the IIT Madras software permits notation for South Indian (Carnatic) music to be handled. The symbols included here correspond to the twelve different notes or swaras in five different octaves. Also supported are symbols to specify Kampitam, Jaru and the duration of the note, from one eighth of the interval to one fourth and one half of the interval. Symbols like Tala marks are also supported. The music notation is based on the recommendation made by Dr. Sambamurthy in his book series on South Indian music. Currently, music notation is supported only for Tamil and the "iitmtam" TrueType font. |
The development team has not pursued this application seriously, for musicologists and musicians seem to have widely differing views on the subject. |
Software for the visually handicapped During 1998, the IIT Madras software was enhanced to include support for use by the visually handicapped. A special version of the Multilingual editor has been developed which features text to speech output and appropriate audible responses for almost all the selections in the menu items. Characters are spoken as they are typed in, and so are words and whole lines, besides the full text of the document itself. A visually handicapped person will be able to use this editor meaningfully for quick and effective data entry both in the vernacular and in English. |
Multilingual data preparation application for use by the visually handicapped.
Free screen reader application
adapted to work with Windows and Linux
Text based web browser enhanced to support screen reading features |
Braille output in Indian languages Another useful application of the IITM software is in generating Braille output in Bharati Braille, the adaptation of the six dot system for Indian languages. Multilingual text prepared by the IITM editor may be instantly converted into Bharati Braille and embossed on a standard Braille embosser either on-line or off-line. Hence lessons in Braille may be prepared and distributed in the form ready for embossing. |
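The conversion to Braille cells can be sketched as below. The part that is certain is the arithmetic: in the Unicode Braille Patterns block, a six dot cell is U+2800 plus one bit per raised dot (dot 1 is the lowest bit). The small letter table is only an illustrative assumption about Bharati Braille assignments and should be checked against the published standard. The function names are hypothetical, and the sketch does not cover vowel signs, conjuncts or the conventions an embosser driver would need.

```python
def cell(*dots):
    """Return the Unicode Braille character with the given dots (1-6) raised."""
    code = 0x2800
    for d in dots:
        code |= 1 << (d - 1)
    return chr(code)


# Illustrative entries only; verify against the Bharati Braille standard.
BHARATI_SAMPLE = {
    "\u0905": (1,),        # a
    "\u0915": (1, 3),      # ka
    "\u092E": (1, 3, 4),   # ma
    "\u0932": (1, 2, 3),   # la
}


def to_braille(text, table=BHARATI_SAMPLE):
    """Convert each known character to a Braille cell; copy the rest as is."""
    return "".join(cell(*table[ch]) if ch in table else ch for ch in text)


print(to_braille("\u0915\u092E\u0932"))
```

Keeping the dot numbers in the table, and not the final characters, makes the table easy to compare with a printed Braille chart.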
Useful introduction to the Braille standard in India Online services for the Visually Handicapped Use these services to gain access to school and college text books prescribed for different classes and educational programs in major universities. Specific educational institutions may also use these services to get documents printed in Braille. |
Setting up Web pages (and web sites) catering to Indian Languages One of the most important applications for the IITM software relates to working with Indian scripts on the World Wide Web. The ability of the software to support data entry in a uniform manner across all the languages allows a quick and effective means of setting up multilingual web pages. The host of utilities available with the software permits interactive web pages to be set up as well, through Java applets. Utilities to present Indian language text in the form of images or PDF files have far-reaching consequences, since they make Indian scripts viewable on the web without the need for special software, fonts or viewers. |
|
Email and Internet applications in Indian Languages Email is one stable application which has retained its simplicity and elegance, even as other internet applications have become sophisticated and complex. The development team responsible for the IITM software realized the importance of this application early enough to provide support for it. There are two ways of looking at email in Indian languages. |
Email in Indian languages The approach to handling email in Indian languages is based on the use of appropriate fonts and rendering text in HTML format. Such text is easily displayed if the specified font is available in the system. Most email services in India which allow email in local scripts use this method. While this is useful, the user has to interact with the system in English only, since the basic email application continues to be an English-based one. |
Perl Modules to work with Indian language text |