Ibrahim Bounhas, Yahya Slimani



This paper discusses the social constraints that should be taken into account in document analysis. A document is viewed and analysed as the object of a transaction between realizers and beneficiaries. The first condition for a successful document reading process is therefore ensuring confidence between the two parties. Document analysis should thus support authority evaluation, which is based on identifying the names and roles of the realizers and on studying their behaviours. Our proposed approach is social-based. Besides the study of authority, it relies on identifying document usage types. A usage type defines how a community of users can access a document. Designing a document parser requires solving complex problems. We argue that authority evaluation requirements and document usage types are important parameters that can reduce this complexity. Thus, the first step of our approach consists in identifying social parameters. Documents are then fragmented according to their logical structure. We distinguish two levels: the macro-logical level and the micro-logical level. These two levels allow us to design a generic approach that can be applied to documents of different styles and types.



Paper Citation

in Harvard Style

Bounhas I. and Slimani Y. (2009). A SOCIAL APPROACH FOR DOCUMENT ANALYSIS. In Proceedings of the International Conference on Knowledge Management and Information Sharing - Volume 1: KMIS, (IC3K 2009) ISBN 978-989-674-013-9, pages 95-102. DOI: 10.5220/0002274300950102
