%0 Conference Proceedings
%T Identification of Access Control Policy Sentences from Natural Language Policy Documents
%+ University of North Texas (UNT)
%A Narouei, Masoud
%A Khanpour, Hamed
%A Takabi, Hassan
%Z Part 1: Access Control
%< peer-reviewed
%( Lecture Notes in Computer Science
%B 31st IFIP Annual Conference on Data and Applications Security and Privacy (DBSEC)
%C Philadelphia, PA, United States
%Y Giovanni Livraga
%Y Sencun Zhu
%I Springer International Publishing
%3 Data and Applications Security and Privacy XXXI
%V LNCS-10359
%P 82-100
%8 2017-07-19
%D 2017
%R 10.1007/978-3-319-61176-1_5
%K Access control policy
%K Attribute-based access control
%K Policy engineering
%K Natural language processing
%Z Computer Science [cs]
%Z Conference papers
%X Access control mechanisms are a necessary and crucial design element of any application’s security. There is a plethora of accepted access control models in the information security realm. However, attribute-based access control (ABAC) has been proposed as a general model that could overcome the limitations of the dominant access control models (i.e., role-based access control) while unifying their advantages. One issue with migrating to an ABAC model is that the information that needs to be encoded in the model is typically buried within existing natural language artifacts and is hence difficult to interpret. This requires processing natural language documents and extracting policies from those documents. Software requirements and policy documents are the main sources in which organizational policies are declared, but they are often large and contain many general descriptive sentences that lack any access control content. Manually processing these documents to extract policies and then using them to build a model is a laborious and expensive process. This paper is the first step towards a new policy engineering approach for ABAC by processing policy documents and identifying access control contents.
We take advantage of multiple natural language processing techniques, including pointwise mutual information, to identify access control policy sentences within natural language documents. We evaluate our approach on documents from different domains including conference management, education, and healthcare. Our methodology effectively identifies policy sentences with an average recall and precision of 90% on all datasets, outperforming the state of the art by 5%.
%G English
%Z TC 11
%Z WG 11.3
%2 https://inria.hal.science/hal-01684368/document
%2 https://inria.hal.science/hal-01684368/file/453481_1_En_5_Chapter.pdf
%L hal-01684368
%U https://inria.hal.science/hal-01684368
%~ IFIP-LNCS
%~ IFIP
%~ IFIP-TC
%~ IFIP-WG
%~ IFIP-TC11
%~ IFIP-WG11-3
%~ IFIP-DBSEC
%~ IFIP-LNCS-10359