Pongal-2000 Project and Tamil Digital Corpus
Indo-German Project to computerise Tamil Literature

From The Indian Express

(Chennai, March 27, 1998) The Institute of Indology and Tamil Studies at Cologne in Germany has undertaken a project named 'Pongal 2000' to digitise and computerise Tamil literature on a fairly large scale.

This is being done with a view to constructing a Tamil national corpus that encompasses all major Tamil text categories: classical works starting from the Sangam period, as well as prose selected from the earliest Portuguese Tamil prints to the latest contemporary works.

Thomas Malten, a lecturer at Cologne University, Germany, and at the Institute of Asian Studies, Chemmanchery, Chennai, said the project is being carried out in collaboration with the Institute at Chennai and the Tamil Department at the South Asia Language Centre at Berkeley, USA. It has received initial funding from the University of Cologne and has so far completed the conversion of approximately 10 million words from printed text to machine-readable format.

Besides, a selection of one lakh Tamil palm-leaf manuscripts known to be scattered in the libraries of India and Europe will be digitised with UNESCO funding at the Institute of Asian Studies.

Four research scholars have already been training in methods of digital editing, indexing and parsing of Tamil texts at the Institute since last year. One of the first steps of the project, in which digital Tamil texts are used as lexical source material, will be to gather and bring together the contents of all major dictionaries written during the last 300 years.

Malten pointed out that at least 1,000 Tamil dictionaries have been produced in printed or manuscript form, recording changes in the meaning of Tamil words. These dictionaries form the base of a new lexicographic effort in Tamil aimed at tracing the development of the language in the manner and scope of the Oxford English Dictionary.

Soon, dictionaries like the Malabar-English Dictionary of Fabricius or the three volumes of the Madurai Sangam Dictionary published early this century and many other rare books will be available for lexicography. The institute is collaborating with Prof Rajamanickam of the De Nobili Research Institute at Loyola College, Chennai.

Available on Internet
Sangam and post-Sangam literature, Silappadikaram, Periyapuranam, Thiruvachakam and Kamparamayanam have been made available in transliterated form on the Internet. These texts are accompanied by an online Tamil-English dictionary of nearly 130,000 entries, based on the Madras Tamil Lexicon.

"It is the first large online depository for any Indian language and further enhancements are being made under the Pongal 2000 project", says Thomas Malten, adding that several scholars are working full-time on the project.

The Institute also plans to bring out a computerised catalogue of Tamil printed books, intended to incorporate bibliographical information from other sources in order to give a much clearer perspective of the history of published Tamil literature of the 19th and 20th centuries.

The first number of a German-language online journal named 'Kolam: A Mirror of Tamil culture' has been launched recently, comprising translations from Tamil into German and documentation of popular Tamil culture, including festivals and cinema.