
6 CONCLUDING REMARKS 
Search engines support automatic information
retrieval, which helps users find data sources.
However, the task of extracting the relevant
information is still left to the user. Some
bottlenecks therefore remain to be overcome
(Fensel, 2001), such as the lack of a means for
representation and translation and the lack of a
means for content description.
P2P systems raise an additional issue: improving
result quality while reducing the search space. In
this scenario, two problems must be addressed: how
to find files relevant to the user query and how to
enrich the semantics of the information resources.
To address them, we proposed an ontology-based
approach that can be used to improve search
techniques. The proposal reduces the extra traffic
that traditional flooding techniques generate when
optimal results are required, and it enriches the
semantics of information storage and search. The
search space is reduced by clustering files into
super peers according to file similarity. The
semantics are enriched by adopting ontologies, which
make the information content explicit in a manner
independent of the underlying structures used to
store the information.
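As an illustration of the clustering idea summarized above, the following minimal sketch routes a file to the super peer whose ontology it resembles most. It is not the implementation described in this paper: the Jaccard measure over term sets, the `threshold` parameter, and the names `jaccard`, `assign_super_peer` and `super_peers` are assumptions made only for this example.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of terms."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def assign_super_peer(file_terms, super_peers, threshold=0.5):
    """Return the name of the best-matching super peer, or None.

    file_terms  -- element names extracted from the peer's XML file
    super_peers -- dict mapping a super-peer name to its ontology concept names
    threshold   -- hypothetical minimum similarity required to join a cluster
    """
    best_name, best_score = None, 0.0
    for name, concepts in super_peers.items():
        score = jaccard(file_terms, concepts)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


# Hypothetical super peers, each described by a small set of concept names.
super_peers = {
    "publications": {"article", "author", "title", "year"},
    "music": {"album", "artist", "track", "year"},
}
chosen = assign_super_peer({"article", "author", "title"}, super_peers)
print(chosen)
```

A query can then be sent only to the chosen super peer instead of being flooded to every peer, which is the intuition behind the reduced search space.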
We have presented the ontology manager and defined
and implemented a tool for matching ontologies to
XML documents. By matching an ontology to an XML
file, the system can connect the peer to a suitable
super peer, which is described by a specific
ontology. The matching phase considers concept
names, structural similarity and stemming
algorithms. The ontologies are generated by an
integration process over the conceptual schemas
that describe the XML files.
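The three ingredients of the matching phase (concept names, structural similarity and stemming) can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm of The Matcher: the toy suffix stripper stands in for a real stemmer, names are compared with normalized edit distance, structure is reduced to the direct children of the root element, and the weights `w_name` and `w_struct` and the 0.8 cut-off are hypothetical.

```python
import xml.etree.ElementTree as ET


def stem(word):
    """Toy suffix stripper (a stand-in for a real stemmer such as Porter's)."""
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def edit_distance(a, b):
    """Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def name_similarity(a, b):
    """Similarity in [0, 1] between two names after stemming."""
    a, b = stem(a), stem(b)
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def structure_similarity(children, properties):
    """Fraction of XML child names with a close match among concept properties."""
    if not children:
        return 0.0
    hits = sum(
        1 for c in children
        if any(name_similarity(c, p) >= 0.8 for p in properties)
    )
    return hits / len(children)


def match_score(xml_text, concept, properties, w_name=0.5, w_struct=0.5):
    """Weighted combination of name and structure similarity (assumed weights)."""
    root = ET.fromstring(xml_text)
    children = [child.tag for child in root]
    return (w_name * name_similarity(root.tag, concept)
            + w_struct * structure_similarity(children, properties))


# Hypothetical XML file and ontology concept used only for this example.
doc = "<Articles><title>P2P</title><authors>A. B.</authors><year>2008</year></Articles>"
score = match_score(doc, "Article", ["title", "author", "year"])
print(score)
```

A peer could compute such a score against the ontology of each super peer and connect to the one with the highest value.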
We implemented a tool named The Matcher that
identifies the similarity between XML files and
OWL ontologies. To evaluate it, we performed a set
of experimental tests, which showed that the tool
produces accurate results. As future work, we will
incorporate this tool into DetVX, a framework for
detecting, managing and querying XML replicas and
versions in P2P scenarios. We are currently
developing a graphical tool for peer management
based on the JXTA platform (Gong, 2001). The system
will allow managing the super peers, peers and
corresponding files, as well as assessing the
performance of the presented approach.
ACKNOWLEDGEMENTS 
This work has been partially supported by CNPq 
under grant No. 142396/2004-4 for Deise de Brum 
Saccol; Pronex Project – FAPERGS under grant No. 
0408933 and CNPq Universal under grant No. 
481055/2007-0 for Renata de Matos Galante. 
REFERENCES 
Baeza-Yates, R.A. and Ribeiro-Neto, B.A., 1999. Modern 
Information Retrieval. ACM Press / Addison-Wesley. 
Bertino, E.; Guerrini, G. and Mesiti, M., 2004. A
Matching Algorithm for Measuring the Structural
Similarity between an XML Document and a DTD and
its Applications. Information Systems, Elsevier
Science Ltd., 29, 23-46.
Dalamagas, T.; Cheng, T.; Winkel, K.J. and Sellis,
T., 2004. Clustering XML Documents using Structural
Summaries. In: EDBT Work. on Clustering
Information over the Web, Greece.
Fensel, D., 2001. Ontologies: A Silver Bullet for 
Knowledge Management and Electronic Commerce. 
Springer. 
Francesca, F.D.; Gordano, G.; Ortale, R. and Tagarelli,
A., 2003. Distance-based Clustering of XML
Documents. In: Work. on Mining Graphs, Trees and
Sequences, Croatia.
Gong, L., 2001. JXTA: A Network Programming
Environment. IEEE Internet Computing, 5(3):88–95,
May/June.
Kantrowitz, M., Mohit, B. and Mittal, V., 2000.
Stemming and its effects on TFIDF ranking. In: SIGIR 
Conf. on Research and Development in Information 
Retrieval. Athens. 
Levenshtein, V., 1966. Binary Codes capable of correcting 
deletions, insertions, and reversals. Cybernetics and 
Control Theory, 10(8):707–710. 
Lian, W.; Cheung, D.; Mamoulis, N. and Yiu, S., 2004.
An Efficient and Scalable Algorithm for Clustering
XML Documents by Structure. IEEE Trans. on
Knowledge and Data Engineering, 16, 82-96.
Madhavan, J., Bernstein, P. A. and Rahm, E., 2001.
Generic schema matching using Cupid. In:
VLDB’01, Rome, Italy.
Maedche, A. and Staab, S., 2002. Measuring similarity
between ontologies. In: EKAW.
Manning, C. D. and Schütze, H., 1999. Foundations of
Statistical Natural Language Processing. 1st ed.
Cambridge, MA: MIT Press.
Mena, E., and Illarramendi, A., 2001. Ontology-based 
query processing for global information systems. 
Springer. 
Nejdl, W., Wolf, B., Qu, C., Decker, S., Sintek, M. et al.,
2002. EDUTELLA: A P2P Networking Infrastructure
Based on RDF. In: WWW’02, Hawaii, USA.