Influence of compression distance measures on Authorship Attribution

P. Vijayapal Reddy

Abstract


Authorship attribution (AA) can be defined as the task of inferring characteristics of a document's author from the textual characteristics of the document itself. This paper evaluates a compression-based model for AA on Telugu text. It considers the LZW compression model with three compression distance measures: Normalized Compression Distance (NCD), Compression Dissimilarity Measure (CDM), and Conditional Complexity of Compression (CCC). The results show that compression models are a good alternative for authorship attribution. The model is evaluated using micro-averaged F1, macro-averaged F1, and accuracy.
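
The abstract names the three measures but does not state their formulas. The sketch below uses the definitions commonly given in the compression-based classification literature, which may differ in detail from the ones used in the paper: NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), CDM(x, y) = C(xy) / (C(x) + C(y)), and CCC(y | x) = C(xy) - C(x), where C(.) is the compressed size. The function names, the use of the LZW code count as C(.), and the nearest-profile attribution rule are all illustrative assumptions, not the author's implementation.

```python
def lzw_size(data: bytes) -> int:
    """Number of codes a basic LZW encoder emits (assumed proxy for C(.))."""
    table = {bytes([i]): i for i in range(256)}  # start with all single bytes
    current = b""
    codes = 0
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate            # keep extending the current phrase
        else:
            codes += 1                     # emit the code for `current`
            table[candidate] = len(table)  # learn the new phrase
            current = bytes([byte])
    if current:
        codes += 1                         # emit the final phrase
    return codes


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance (commonly used definition)."""
    cx, cy, cxy = lzw_size(x), lzw_size(y), lzw_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def cdm(x: bytes, y: bytes) -> float:
    """Compression Dissimilarity Measure (commonly used definition)."""
    return lzw_size(x + y) / (lzw_size(x) + lzw_size(y))


def ccc(x: bytes, y: bytes) -> int:
    """Conditional complexity of y given x: extra codes needed for y after x."""
    return lzw_size(x + y) - lzw_size(x)


def attribute(profiles: dict, query: bytes, distance=ncd) -> str:
    """Return the author whose profile text is closest to the query.

    `profiles` maps an author name to that author's concatenated training
    text (hypothetical setup; the paper's exact protocol is not given here).
    """
    return min(profiles, key=lambda author: distance(profiles[author], query))
```

Under these assumptions, Telugu documents would be passed as UTF-8 bytes, for example `attribute({name: text.encode("utf-8") for name, text in training.items()}, unknown.encode("utf-8"), distance=cdm)`, where `training` and `unknown` are hypothetical placeholders. For all three measures a smaller value means the query compresses better alongside the author's profile.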


Copyright (c) 2019 Edupedia Publications Pvt Ltd

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

 

All published articles are Open Access at https://journals.pen2print.org/index.php/ijr/


Paper submission: ijr@pen2print.org