Privacy Policy Annotation for Semi-automated Analysis: A Cost-Effective Approach

Dhiren A. Audich; Rozita Dara; Blair Nonnecke

doi:10.1007/978-3-319-95276-5_3

Conference Papers

Year : 2018

Dhiren A. Audich

University of Guelph [Guelph, Ontario, Canada]

Rozita Dara

University of Guelph [Guelph, Ontario, Canada]

Blair Nonnecke

University of Guelph [Guelph, Ontario, Canada]

Abstract

Privacy policies go largely unread because they are not standardized, are often written in jargon, and are frequently long. Several attempts have been made to simplify them and improve their readability, with varying degrees of success. This paper looks at keyword extraction, comparing human extraction to natural language algorithms as a first step in building a taxonomy for creating an ontology (a key tool in improving access to and usability of privacy policies). We present two alternatives to using costly domain experts to perform keyword extraction: first, trained participants (non-domain experts) read and extracted keywords from online privacy policies; second, supervised and unsupervised learning algorithms extracted keywords. Results show that supervised learning algorithms outperform unsupervised learning algorithms over a large corpus of 631 policies, and that trained participants outperform the algorithms, but at a much higher cost.

Domains

Computer Science [cs]

Main file

470710_1_En_3_Chapter.pdf (294.74 KB)

Origin : Files produced by the author(s)
Licence : CC BY 4.0 - Attribution

https://inria.hal.science/hal-01855985

Submitted on : Thursday, August 9, 2018, 10:41:36 AM

Last modification on : Thursday, June 12, 2025, 12:22:02 PM

Long-term archiving on : Saturday, November 10, 2018, 12:57:08 PM

Dates and versions

hal-01855985 , version 1 (09-08-2018)

Identifiers

HAL Id : hal-01855985 , version 1
DOI : 10.1007/978-3-319-95276-5_3

Cite

Dhiren A. Audich, Rozita Dara, Blair Nonnecke. Privacy Policy Annotation for Semi-automated Analysis: A Cost-Effective Approach. 12th IFIP International Conference on Trust Management (TM), Jul 2018, Toronto, ON, Canada. pp.29-44, ⟨10.1007/978-3-319-95276-5_3⟩. ⟨hal-01855985⟩
