Determining the semantic similarity between documents is crucial to many tasks, such as plagiarism detection, automatic technical surveys, and semantic search. In this paper, we focus on detecting the semantic similarity between queries and the documents of a large document collection, in the setting of an Arabic search engine. We investigate MapReduce as a framework for managing the distributed processing of the dataset and of the semantic similarity measures between documents. We then review the state of the art of approaches for computing document similarity. We propose an approach based on a parallel algorithm for semantic similarity measures that uses MapReduce and WordNet, applied after a translation phase, to detect the documents relevant to an Arabic query. The numerical results presented show the efficiency and performance of the adopted technique.
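The pipeline described above (translate the query, map words to concepts, score documents in parallel) can be illustrated with a minimal single-process sketch. It is not the paper's implementation: the `TRANSLATION` and `CONCEPTS` tables are tiny hypothetical stand-ins for the translation phase and for WordNet, the score (the fraction of query concepts that a document covers) is an assumed placeholder for the semantic similarity measures, and the map and reduce steps are plain Python functions rather than jobs on a real MapReduce cluster.

```python
from collections import defaultdict

# Hypothetical stand-in for the translation phase (Arabic -> English).
TRANSLATION = {"سيارة": "car", "سريعة": "fast"}

# Hypothetical stand-in for WordNet: words with the same value are
# treated as belonging to the same concept (synset-like grouping).
CONCEPTS = {
    "car": "vehicle",
    "automobile": "vehicle",
    "fast": "quick",
    "quick": "quick",
    "rapid": "quick",
}


def translate(query):
    """Translate each query word; unknown words are kept unchanged."""
    return [TRANSLATION.get(word, word) for word in query.split()]


def to_concepts(tokens):
    """Map tokens to concept identifiers; unknown tokens map to themselves."""
    return {CONCEPTS.get(token, token) for token in tokens}


def map_phase(doc_id, text, query_concepts):
    """Emit (doc_id, 1) for each query concept found in the document."""
    doc_concepts = to_concepts(text.lower().split())
    yield doc_id, 0  # guarantees that every document reaches the reducer
    for _ in query_concepts & doc_concepts:
        yield doc_id, 1


def reduce_phase(doc_id, values, query_size):
    """Turn the matches of one document into a coverage score in [0, 1]."""
    return doc_id, sum(values) / query_size


def rank(docs, query):
    """Rank documents by the share of query concepts they cover."""
    query_concepts = to_concepts(translate(query))
    if not query_concepts:
        return []
    grouped = defaultdict(list)  # simulates the shuffle step
    for doc_id, text in docs.items():
        for key, value in map_phase(doc_id, text, query_concepts):
            grouped[key].append(value)
    scores = [
        reduce_phase(key, values, len(query_concepts))
        for key, values in grouped.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Because each document is mapped independently and each reducer sees only the values for one document identifier, the same two functions could in principle be distributed across workers; the in-memory `grouped` dictionary only simulates the shuffle between the two phases.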
Copyright of Procedia Computer Science is the property of Elsevier B.V.