Computing with Indian languages
    This is a concept we would like to introduce in the context of multilingual systems. Computing is a general term for information processing, where information is associated with some data: typically a text string, a number, an image and so on. One writes computer programs to manipulate data. Computing in Indian languages broadly relates to computer programs that process Indian language text strings, which may be entered through a keyboard and displayed on a conventional screen.

    A primitive approach to computing in Indian languages is to write computer programs that do string processing on Indian language texts in much the same way as is done for Roman (ASCII) strings. This way, many existing applications may be adapted to work with text strings in Indian languages. It turns out that this is not a simple task, on account of the large set of aksharas that have to be handled. While the older techniques seem well suited for displaying Indian scripts, data entry becomes formidable. Hence, new approaches to handling Indian language text are required.

    People often ask questions such as "Why not have an Operating System run in Tamil or Bengali?", just as there are Arabic or Japanese versions of Windows. There is no satisfactory answer to this question. In the first place, building support for Indian languages within Operating Systems is not an easy job, especially when one looks for uniformity in use across all the Indian languages.

    There is a general feeling among professionals that it is best to deal with multilingual information at the level of the user application. That is to say, keep the language aspects outside the Operating System. This way, Operating Systems are rid of the problem of having to deal with varied character sets across many different languages. While some people point to the success of the Unicode implementation in Microsoft Windows, it must be clearly understood that the system kept away from Indian languages for an important reason. Unicode is just not right for linguistic processing with Indian languages and is too complex to handle even for one Indian script, much less for all the scripts in a uniform way. For all practical purposes, Unicode has retained only the eight-bit coding structure (actually only 128 codes) for each of our scripts, which is the real bottleneck in handling the aksharas. For efficient string processing, it is necessary to work with the basic linguistic quantum of our languages, which is not a letter of the alphabet but an akshara, which is actually a syllable.
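    The gap between code points and aksharas can be illustrated with a short sketch. The Python below is only an illustration and is not part of the IITM software. The function name `split_aksharas` and its simplified grouping rule (consonants joined by the virama, followed by any vowel sign or modifier, Devanagari only) are assumptions made for this example, and the rule does not cover every case found in real text.

```python
import unicodedata

VIRAMA = "\u094D"  # Devanagari sign virama (halant)

def is_dependent(ch):
    """True for combining marks such as vowel signs, virama, anusvara, nukta."""
    return unicodedata.category(ch) in ("Mn", "Mc")

def split_aksharas(text):
    """Group Devanagari code points into approximate aksharas.

    Simplified rule (an assumption for this sketch): a combining mark
    attaches to the current group, and a letter attaches only when the
    group currently ends in a virama (forming a conjunct).
    """
    aksharas = []
    current = ""
    for ch in text:
        joins_conjunct = (current.endswith(VIRAMA)
                          and unicodedata.category(ch) == "Lo")
        if current and (is_dependent(ch) or joins_conjunct):
            current += ch
        else:
            if current:
                aksharas.append(current)
            current = ch
    if current:
        aksharas.append(current)
    return aksharas

# The word "kshatriya", written with escapes: ka, virama, ssa, ta, virama, ra, vowel sign i, ya
word = "\u0915\u094D\u0937\u0924\u094D\u0930\u093F\u092F"
print(len(word))                  # 8 code points
print(len(split_aksharas(word)))  # 3 aksharas
```

    Ordinary string operations such as length, indexing and slicing count code points, so a program that treats this word like an ASCII string sees eight units where a reader sees three syllables.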

    We have a short discussion on the problems arising out of the current implementation of Unicode for Indian languages and, for that matter, of the ISCII code itself, the basic standard that led to the Unicode representation for our languages.

    Any meaningful approach to computing in Indian languages must provide for unique identification of the full set of aksharas through fixed-length codes, and must also provide a uniform approach to data entry of the aksharas across all the languages. Any other approach will suffer from incompatibilities between the user interfaces of different Indian languages. While solutions specific to one language may indeed be feasible, one is looking for solutions that can be used all over the country.
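    What a fixed-length code for an akshara could look like is sketched below. The 16-bit layout, the field widths and the names `pack_akshara` and `unpack_akshara` are all hypothetical, chosen only to illustrate the idea. This page does not specify a layout, and the sketch should not be read as the scheme used by any existing software or standard.

```python
# Hypothetical 16-bit layout (an assumption for illustration only):
#   bits 15-10: base consonant index (0-63)
#   bits  9-4 : conjunct form index  (0-63, 0 meaning "no conjunct")
#   bits  3-0 : vowel index          (0-15)
CONSONANT_BITS, CONJUNCT_BITS, VOWEL_BITS = 6, 6, 4

def pack_akshara(consonant, conjunct, vowel):
    """Pack three small indices into one fixed-length 16-bit code."""
    if not 0 <= consonant < (1 << CONSONANT_BITS):
        raise ValueError("consonant index out of range")
    if not 0 <= conjunct < (1 << CONJUNCT_BITS):
        raise ValueError("conjunct index out of range")
    if not 0 <= vowel < (1 << VOWEL_BITS):
        raise ValueError("vowel index out of range")
    return ((consonant << (CONJUNCT_BITS + VOWEL_BITS))
            | (conjunct << VOWEL_BITS)
            | vowel)

def unpack_akshara(code):
    """Recover (consonant, conjunct, vowel) from a 16-bit code."""
    vowel = code & ((1 << VOWEL_BITS) - 1)
    conjunct = (code >> VOWEL_BITS) & ((1 << CONJUNCT_BITS) - 1)
    consonant = code >> (CONJUNCT_BITS + VOWEL_BITS)
    return consonant, conjunct, vowel

# A text becomes a list of equal-sized codes, one per akshara.
text = [pack_akshara(5, 2, 3), pack_akshara(10, 0, 1)]
print(len(text))                # number of aksharas in the text
print(unpack_akshara(text[0]))  # fields of the first akshara
```

    With one code per akshara, the length of a text, indexing into it and comparison of syllables all work on whole aksharas, and the same field layout could in principle be shared across scripts.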

    For the present, and at least for some years to come, it is best to handle the problem by writing applications which handle multilingual information directly. That is to say, the Operating System's support should not be sought, as there is no consensus among professionals on what this support should be. The total arbitrariness with which data entry in Indian languages has been handled, not to speak of the lack of uniform representation across the languages, makes it necessary for us to take a serious look at standardization. The problem is further compounded by the individual approaches taken by vendors, who seem to think that the concept of a language kit for an Operating System will solve the problem.

    The phonetic base and the concept of the akshara are unique to the languages of India. It is best to deal with the problem of computing with Indian languages by first understanding how aksharas have been used in our ancient as well as modern texts. This will give us an insight into the linguistic base of our languages, allowing us to come up with a universal approach to dealing with all the languages of the country.



A sample of applications in Indian languages supported by the IITM Software is shown below. The links will take you to pages describing the applications. A summary of applications is also available.
 
 

Text Preparation

Linguistic Processing

Web Page creation

Electronic Mail

E-books

Data base applications

Search Engines

Text to Speech applications

Bharati Braille



 
Last updated on 10/26/12