Tamil Paraphrase Detection Using Encoder-Decoder Neural Networks - Computational Intelligence in Data Science
Conference Paper, Year: 2020

Abstract

Detecting paraphrases in Indian languages requires critical analysis of lexical, syntactic and semantic features. Since the structure of Indian languages differs from that of languages such as English, the usage of lexico-syntactic features varies between the Indian languages and plays a critical role in determining the performance of the system. Instead of using various lexico-syntactic similarity features, we aim to apply a complete end-to-end system using deep learning networks with no lexico-syntactic features. In this paper, we exploit the encoder-decoder model of deep neural networks to analyze and classify paraphrase sentences in the Tamil language. In this encoder-decoder model, LSTM and GRU units and gNMT are used as layers along with an attention mechanism. Using this end-to-end model, the F1-measure for subtask-1 increases by 0.5% compared to the state-of-the-art systems. The system was trained and evaluated on the DPIL@FIRE2016 Shared Task dataset. To our knowledge, ours is the first deep learning model that validates the training instances of both the subtask-1 and subtask-2 datasets of the DPIL shared task.
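The abstract reports its result as an F1-measure. As a reminder of what that figure summarizes, here is a minimal sketch of precision, recall and F1 for a binary paraphrase decision. The label names "P" and "NP", the function name `f1_measure` and the toy data are illustrative assumptions and do not come from the paper or the DPIL dataset; the shared task's own scoring procedure may differ.

```python
def f1_measure(gold, predicted, positive="P"):
    """Return (precision, recall, F1) for the `positive` class.

    `gold` and `predicted` are equal-length sequences of labels.
    The label names are hypothetical placeholders.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    pairs = list(zip(gold, predicted))
    # True positives: predicted positive and actually positive.
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    # False positives: predicted positive but actually negative.
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    # False negatives: predicted negative but actually positive.
    fn = sum(1 for g, p in pairs if g == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels, invented for illustration only.
gold = ["P", "NP", "P", "P", "NP"]
pred = ["P", "P", "NP", "P", "NP"]
precision, recall, f1 = f1_measure(gold, pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because F1 is a harmonic mean, it rewards systems that balance precision and recall rather than maximizing one at the expense of the other.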
Main file
507484_1_En_3_Chapter.pdf (358.08 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03434784, version 1 (18-11-2021)

Cite

B. Senthil Kumar, D. Thenmozhi, S. Kayalvizhi. Tamil Paraphrase Detection Using Encoder-Decoder Neural Networks. 3rd International Conference on Computational Intelligence in Data Science (ICCIDS), Feb 2020, Chennai, India. pp.30-42, ⟨10.1007/978-3-030-63467-4_3⟩. ⟨hal-03434784⟩