Development of Technology and Tools
Completed Projects:
Standardization of Bangla Fonts and Code
To make computer applications such as document processing, e-mail and calculation possible in Bangla in a way that is interoperable across computing platforms, it was necessary to establish a standard for encoding the data. SNLTR has established Unicode 5.0 and above as the standard to be adopted for e-governance applications, in line with international practice and standards. This contrasts with the many non-standard Bangla software packages on the market, which hinder interoperability between computers. Based on this standardization, several font-encoding converters have been developed; they are discussed in the next section.
Conversion of Legacy Documents
There is a huge repository of Bangla digital documents that were not prepared in accordance with the Unicode 5.0 standard. These documents were written in non-standardized fonts or in other conventions that are not based on Unicode-compliant systems. Hence, any system that is not compatible with these particular fonts or encodings cannot display the Bangla text properly. To make this large corpus of information and data available for further use, SNLTR has developed a number of code-conversion programs that convert the electronic version of legacy data to the Unicode 5.0 format.
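The general idea behind such a converter can be sketched as follows. This is a minimal illustration, not SNLTR's software: the glyph table `LEGACY_TO_UNICODE` and the function `convert_legacy` are hypothetical, and the table stands in for an imaginary legacy font. The sketch assumes the legacy text stores pre-base vowel signs in visual order (before the consonant), whereas Unicode stores them after the consonant.

```python
import unicodedata

# Hypothetical glyph table for an imaginary legacy font (illustration only).
# A real converter needs a separate, much larger table for each legacy font.
LEGACY_TO_UNICODE = {
    "K": "\u0995",  # ka
    "L": "\u0996",  # kha
    "e": "\u09AC",  # ba
    "v": "\u09BE",  # vowel sign aa
    "w": "\u09BF",  # vowel sign i (drawn before the consonant)
}

# Vowel signs that are displayed to the left of the consonant they modify.
PRE_BASE_SIGNS = {"\u09BF", "\u09C7", "\u09C8"}


def is_consonant(ch: str) -> bool:
    """True for the main Bengali consonant range (ka through ha)."""
    return "\u0995" <= ch <= "\u09B9"


def convert_legacy(text: str, table=LEGACY_TO_UNICODE) -> str:
    """Map legacy code points to Unicode and move pre-base signs after the consonant."""
    mapped = [table.get(ch, ch) for ch in text]  # unmapped characters pass through
    out = []
    i = 0
    while i < len(mapped):
        ch = mapped[i]
        nxt = mapped[i + 1] if i + 1 < len(mapped) else ""
        if ch in PRE_BASE_SIGNS and nxt and is_consonant(nxt):
            # Visual order (sign, consonant) becomes logical order (consonant, sign).
            out.extend([nxt, ch])
            i += 2
        else:
            out.append(ch)
            i += 1
    return unicodedata.normalize("NFC", "".join(out))
```

The sketch only reorders a sign in front of a single consonant. A practical converter would also have to move the sign past a whole conjunct and handle glyphs that map to several Unicode characters.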
Based on the standards defined by SNLTR, a bilingual version of the Linux operating system has been developed and named বৈশাখী লিনাক্স (Baishakhi Linux). This operating system is open source and is based on the Ubuntu distribution of Linux. It can be installed on any desktop or laptop, and it lets users carry out the operations and computations supported on other Linux-based systems. Baishakhi Linux supports the usual office tasks, such as document preparation, presentation preparation, spreadsheet computation, e-mail and web browsing. Further, these operations can be carried out in bilingual mode, that is, in both Bangla and English. Baishakhi Linux is distributed free of charge.
To facilitate data and document entry in Bangla, a number of keyboard layouts have been designed and implemented to suit people accustomed to different typing practices. The suite of keyboards is being steadily upgraded. The salient feature of all these keyboards is that they are Unicode 5.0 compatible, so any document entered through them is acceptable on any standard platform and can be displayed with any Unicode-compliant Bangla font. SNLTR has designed a new Bangla keyboard named “Baishakhi Keyboard”, which has a 3-layer structure, with Normal, Shift and Right Alt modes, designed to accommodate all the Bangla letters and signs. The keyboard layout is mostly phonetic in nature.
SNLTR has also customized two other popular keyboard layouts, Inscript and Gitanjali, to make them fully Unicode 5.0 compatible, and has renamed them ‘Baishakhi Inscript’ and ‘Uni-Gitanjali’.
To conform with the existing practice of government employees, another keyboard layout, named ‘Webel’, has also been designed. It retains the layout already used by the employees, with the minor modifications required to make it Unicode compliant.
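A three-layer layout of the kind described for the Baishakhi Keyboard can be pictured as a lookup table keyed by physical key and modifier state. The sketch below is illustrative only: the key assignments in `LAYOUT` and the function `resolve_key` are hypothetical and do not reproduce the actual Baishakhi layout.

```python
from typing import Dict, Optional

NORMAL, SHIFT, RIGHT_ALT = "normal", "shift", "right_alt"

# Hypothetical key assignments, for illustration only.
# Each physical key carries up to three outputs, one per layer.
LAYOUT: Dict[str, Dict[str, str]] = {
    "k": {
        NORMAL: "\u0995",                # ka
        SHIFT: "\u0996",                 # kha
        RIGHT_ALT: "\u0995\u09CD\u09B7",  # conjunct kssa (ka + hasanta + ssa)
    },
    "a": {
        NORMAL: "\u09BE",     # vowel sign aa
        SHIFT: "\u0986",      # independent vowel aa
        RIGHT_ALT: "\u0985",  # independent vowel a
    },
}


def resolve_key(key: str, shift: bool = False, right_alt: bool = False) -> Optional[str]:
    """Return the Unicode text for a key press, or None if the key is unassigned."""
    if right_alt:
        layer = RIGHT_ALT
    elif shift:
        layer = SHIFT
    else:
        layer = NORMAL
    return LAYOUT.get(key, {}).get(layer)
```

Because every layer emits standard Unicode code points, text typed through any of the layouts is stored the same way, which is what makes the documents portable across platforms and fonts.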
Baishakhi for Windows
Although Windows operating systems are more widely used than Linux systems, there are few popular tools (such as keyboard layouts, a keyboard layout viewer and fonts) for typing Bangla on Windows that follow the Unicode standard. SNLTR has therefore developed a number of tools, available on a CD, that can be installed on any PC running Windows XP or Vista to enable users to carry out office tasks in Bangla, such as document preparation, PowerPoint presentation preparation, Excel spreadsheet computation, e-mailing and web browsing. The CD contains the following tools:
After this tool is installed, the system is ready for typing Bangla, with 'Baishakhi' as the default Bangla keyboard layout. Two further keyboard layouts, 'Baishakhi Inscript' and 'Uni Gitanjali', are also installed and available for typing Bangla.
This tool is useful for users who prefer not to type Bangla on the physical keyboard. Installing it provides a soft keyboard GUI, in which the user types Bangla into a text area by clicking the keys of a virtual keyboard with the mouse. The typed text can then be cut and pasted into the desired application.
To display Bangla properly, Unicode fonts must be available on the system. This tool installs two Unicode Bangla fonts, 'Vidya' and 'Bangla Akademi'.
Bangla Firefox 3.0
Installing this tool makes the Firefox 3.0 web browser available in Bangla.
Bangla OpenOffice 2.4 (Windows)
OpenOffice is a freely available alternative to Microsoft Office. As with MS Office, document preparation, presentation preparation and spreadsheet computation can all be done with this application.
This is a unique application that is very useful for working with Bangla text. The tool has the following features:
Normalization of Bangla Documents.
Word and Akshar Count: This feature counts the number of words and akshars (treating a yuktakshar, or conjunct, as a single character) in a given text document. The count is performed after the file has been normalized. The feature works on both .txt files and Microsoft Word files.
Sorting of a Bangla Document File: The ‘Sort’ operation is likewise performed after normalizing the document file. This feature also works on both .txt and Microsoft Word files.
Bangla Word Browser: The ‘Bangla Word Browser’ is used to browse Bangla words and can also be used to check the spelling of a Bangla word. Its word database is the ‘Bangla Academy Banan Abhidhan’, which was digitized by SNLTR. The figure below illustrates Bangla Sahayika working as a Word Browser.
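The normalize-then-count and normalize-then-sort steps listed above can be sketched in a few lines. This is an assumption-laden illustration, not the tool's implementation: the function names are hypothetical, Unicode NFC is assumed as the normalization, anusvara and visarga are assumed to belong to the preceding akshar, and sorting uses plain code point order rather than a full Bangla collation.

```python
import unicodedata
from typing import Iterable, List

HASANTA = "\u09CD"  # joins consonants into a yuktakshar (conjunct)


def normalize(text: str) -> str:
    """Assumed normalization: Unicode NFC, so equivalent sequences compare equal."""
    return unicodedata.normalize("NFC", text)


def count_words(text: str) -> int:
    """Count whitespace-separated words after normalization."""
    return len(normalize(text).split())


def count_akshars(text: str) -> int:
    """Count akshars, treating a conjunct plus its vowel signs as one unit."""
    count = 0
    prev = ""
    for ch in normalize(text):
        category = unicodedata.category(ch)
        if category.startswith("M"):
            pass  # vowel sign, hasanta or other mark: part of the current akshar
        elif prev == HASANTA:
            pass  # consonant joined to the previous one by hasanta
        elif category.startswith("L"):
            count += 1  # independent vowel or consonant starts a new akshar
        prev = ch
    return count


def sort_words(words: Iterable[str]) -> List[str]:
    """Sort normalized words by code point (an approximation of dictionary order)."""
    return sorted(normalize(w) for w in words)
```

Normalizing first matters because the same visible text can be encoded in more than one way. For example, the vowel sign o may be stored as one code point or as the pair e-sign plus aa-sign, and without normalization the two forms would be counted and sorted inconsistently.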
Ongoing Projects:
Development of Bangla Optical Character Recognizer (OCR)
Development of Bangla Spell-Checker
This project aims to develop a robust spell checker for Bengali that will be useful across different Bengali applications. The technology developed should be suitable for use with editors and office applications. A beta version is available online at http://banglabanan.baishakhi.org/index.htm.
Web-browser for the Blind
This project is sponsored by the Ministry of Information Technology, Govt. of India, and is carried out jointly with IIT Kharagpur. Its objective is to build a lightweight web browser with special features that make it easily accessible to visually impaired people. The World Wide Web is a vast resource of information and an excellent medium of communication, and it has brought many advantages to our lives.
While editors for writing musical notation are available for Western scores, none exists for the Indian music system. The present work is developing an editing system that will enable users to write Indian musical notation on a computer. The project further aims to provide a facility for playing the Swaralipi, to help composers with music composition.
This is a fundamental tool required for developing technologies such as language translators. While such facilities are commonplace for Western languages, none exists for Bangla or, for that matter, for most Indian languages.
Multidimensional Corpora of Bengali Speech
There is a huge repository of Bangla digital documents that were prepared in non-standard fonts. To make this large corpus of information and data available for further use, SNLTR has developed a number of code-conversion programs that convert the electronic version of legacy data to the Unicode 5.0 format. Maintaining the Unicode standard, this project aims to archive electronic language resources developed in Bangla as well as in English, along with a bilingual search engine.