linguistics

Cognate Identification to improve Phylogenetic trees for Indian Languages

Cognates are present in multiple variants of the same text across different languages. Computational Phylogenetics uses algorithms and techniques to analyze these variants and infer phylogenetic trees for a hypothesized accurate representation based …

Some Strategies to Capture Karaka-Yogyata with Special Reference to apadana

In today’s digital world, language technology has gained importance. Several software tools have been developed and are available in the field of computational linguistics. Such tools play a crucial role in making classical language texts easily …

Hindi Wordnet for Language Teaching: Experiences and Lessons Learnt

This paper reports the work related to making Hindi Wordnet available as a digital resource for language learning and teaching, and the experiences and lessons that were learnt during the process. The language data of the Hindi Wordnet has been …

New Vistas to study Bhartṛhari: Cognitive NLP

A sentence is an important notion in the Indian grammatical tradition. The collection of the definitions of a sentence can be found in the text ‘Vākyapadīya’ written by Bhartṛhari in the fifth century C.E. The grammarian-philosopher Bhartṛhari and his …

Semi-automatic WordNet Linking using Word Embeddings

Wordnets are rich lexico-semantic resources. Linked wordnets are extensions of wordnets, which link similar concepts in wordnets of different languages. Such resources are extremely useful in many Natural Language Processing (NLP) applications, …

Synthesizing Audio for Hindi Wordnet

In this paper, we describe our work on the creation of a voice model using a speech synthesis system for the Hindi language. We use pre-existing 'voices', and we use publicly available speech corpora to create a 'voice' using the Festival Speech Synthesis …

Sarcasm Suite: A browser-based engine for sarcasm detection and generation

Sarcasm Suite is a browser-based engine that deploys five of our past papers in sarcasm detection and generation. The sarcasm detection modules use four kinds of incongruity: sentiment incongruity, semantic incongruity, historical context incongruity …

Mapping it differently: A solution to the linking challenges

This paper reports the work of creating bilingual mappings in English for certain synsets of Hindi wordnet, the need for doing this, the methods adopted and the tools created for the task. Hindi wordnet, which forms the foundation for other Indian …

Sophisticated Lexical Databases - Simplified Usage: Mobile Applications and Browser Plugins For Wordnets

India is a country with 22 officially recognized languages, and 17 of these have WordNets, a crucial resource. Web-browser-based interfaces are available for these WordNets, but they are not suited to mobile devices, which deters people from effectively …

PanchBhoota: Hierarchical phrase based machine translation systems for five Indian languages

We present our work on developing fifteen Hierarchical Phrase Based Statistical Machine Translation (HPBSMT) systems for five Indian language pairs, namely Bengali-Hindi, English-Hindi, Marathi-Hindi, Tamil-Hindi, and Telugu-Hindi, in three domains …