A Novel Rhetorical Structure Approach for Classifying Arabic Security Documents

Document Type: Original Article

Author

Department of Computer Science, King Saud University, Riyadh 11653, Saudi Arabia

Abstract

Security document classification aims to protect documents from unauthorized disclosure. Whether a portion of a document is classified as 'secret' depends on the effect its disclosure would have on an organization. Information is therefore classified according to its critical semantics (i.e., its context or value, and its intended uses or audience at a particular time or in a particular situation). Understanding the semantics of a document is not an easy task. Rhetorical structure theory (RST) is one of the leading theories that have been applied successfully in text processing and understanding. In this paper, we describe a novel approach to automatically classifying Arabic security documents using RST.

Keywords