CALCULATING SEMANTIC SIMILARITY BETWEEN

COMPUTER-UNDERSTANDABLE DESCRIPTORS OF

SCIENTIFIC RESEARCH

Steven B. Kraines and Weisen Guo

University of Tokyo, 5-1-5 Kashiwa-No-Ha, Kashiwa-Shi, Chiba Prefecture, Japan

Keywords: Knowledge Representation, Semantic Matching, Semantic Similarity, Logic Inference.

Abstract: If researchers created computer-understandable descriptors as part of the process of authoring journal

articles and other expert knowledge resources, intelligent computer-aided matching and searching

applications that are critical for addressing complex and large-scale problems in society could be realized.

The EKOSS system enables knowledge experts to create computer-understandable descriptors of their

knowledge resources using description logics ontologies as formal knowledge representation languages. The

descriptors, called semantic statements, are authored as description logic ABoxes in reference to a shared

domain ontology in the form of a TBox. Reasoners using logic-based inference can then measure the

semantic similarity between semantic statements, which can be applied in knowledge searching, mining and

integration applications. A method for semantic matching that uses logic inference based on a DL ontology

TBox to increase both the precision and recall of matching descriptors created as ABoxes is described, and

the accuracy of the method compared to matching without logic inference is analyzed between a set of 15

semantic statements created using EKOSS to describe research articles related to sustainability science.

1 INTRODUCTION

Integration of knowledge from a wide range of

academic and non-academic domains is needed to

address the complex problems of today’s society, e.g.

related to achievement of sustainable societies

(Takeuchi and Komiyama, 2006). However, just

coming to grips with the different terminologies

used in different disciplines, e.g. resolving different

usages of the same term, is difficult (Allenby, 2006).

If the accumulating knowledge resources remain

disconnected, then soon it will be impossible to

uncover the potential interrelationships and structure

of all of that knowledge, which is necessary in order

to solve the problems that contemporary science

attempts to address (Lane and Bertuzzi, 2011).

Information technologies can be used to generate

networks of expert knowledge related to specific

areas such as global sustainability and bioscience

(Neumann and Prusak, 2007; Cahlik, 2000). Some

of these studies include semantic relationships by

using automated natural language processing (NLP)

techniques, such as keyword extraction (Kajikawa et

al., 2007). However, even the most advanced NLP

techniques today, such as relationship extraction,

cannot determine meaningful relationships between

keywords with high accuracy (Erhardt et al., 2006).

Technologies that enable creators of knowledge

resources to provide computer interpretable

descriptors of their resources themselves, rather than

relying on third-party annotators or automated

computer “bots”, could make computer-aided

knowledge sharing more effective (Gerstein et al.,

2007; Uren et al., 2006; Power, 2009). In particular,

technologies emerging in the context of the

Semantic Web, such as ontologies, could be utilized

to create an interactive knowledge sharing platform

that would act as a forum for exchanging and

integrating different forms of scientific knowledge

related to a wide range of social issues (Allenby,

2006; Berners-Lee and Hendler, 2001; for examples

see Kraines et al., 2005; Davis et al., 2009;

Kumazawa et al., 2009).

If expert scientific knowledge resources, such as

research articles or research project reports, were

accompanied with highly accurate and semantically

rich descriptors that can be interpreted by a

computer, then inference and reasoning technologies

146

B. Kraines S. and Guo W..

CALCULATING SEMANTIC SIMILARITY BETWEEN COMPUTER-UNDERSTANDABLE DESCRIPTORS OF SCIENTIFIC RESEARCH.

DOI: 10.5220/0003637001460151

In Proceedings of the International Conference on Knowledge Management and Information Sharing (KMIS-2011), pages 146-151

ISBN: 978-989-8425-81-2

 2011 SCITEPRESS (Science and Technology Publications, Lda.)

could be used to provide a wide range of knowledge

processing services (Power, 2009; Alani et al., 2005;

Hess and Schliedera, 2006). For example, by

mapping the concepts contained in ontologies that

have been constructed as knowledge models for

different domains of knowledge, it should be

possible to translate the concepts and relationships

expressed in the descriptors between different

domains of knowledge, e.g. between chemical

engineering and macroeconomics.

Any third party effort, e.g. by a group of

professional curators, to create such descriptors for

all published research articles could not keep up

with the rate of scientific publication (Attwood et al.,

2009). However, knowledge processing based on

computer-understandable descriptors authored by

humans could be made sustainable by providing

incentives for knowledge experts such as researchers

and policy makers to author the descriptors for their

own knowledge resources, perhaps as a part of the

process of submitting research articles or project

reports. Once the larger community is engaged in

creating such descriptors of their expertise, then the

knowledge processing based on those descriptors

could be scaled up to the size of that community

(DeRose et al., 2007; Ceol et al., 2008).

EKOSS is a web-based platform that supports

computer-mediated sharing and integration of expert

knowledge resources based on computer-

understandable descriptors that are authored by

human knowledge creators (Kraines et al., 2006).

The TBox of a description logics ontology provides

a simplified, unambiguous language for describing

expert scientific knowledge with semantics that can

be interpreted accurately by a computer reasoning

engine. Expert knowledge is described in the form of

ABoxes that instantiate the TBox. Those ABoxes,

called “semantic statements” in EKOSS, are

“computer-understandable” in that a computer can

use logical inference and background knowledge

encoded in the ontology to derive new

“understanding” from a semantic statement. Each

semantic statement is made of one or more triples

consisting of a two ontology class instances and a

typed, directed property between them.

Here, we describe work to develop and test a

method for computing the semantic similarities

between a set of research articles based on semantic

statements that have been created for those articles.

By reasoning about semantic statements with logical

inference, we can identify similarities between

research articles that “tell similar stories” but do not

have any bibliographic evidence for similarity, such

as co-citation. Thus, in comparison to conventional

social networks that describe “who knows who”, we

aim to discover knowledge networks of “who should

know who” because their work is similar in a

meaningful way (Neumann and Prusak, 2007).

In Section 2, we describe the method for

calculating the semantic similarity between two

statements, and the process we used to create the

gold standard for evaluating the calculated semantic

similarities. In section 3, we present the results of

the analysis of semantic similarity calculations using

several different levels of inference. We discuss the

results and conclude the paper in sections 4 and 5.

2 METHODS

Our hypothesis is that given a set of semantic

statements of knowledge resources authored by the

human creators of those resources, we can find more

accurate and more “interesting” matches between

knowledge resources than by using conventional

matching techniques. To test this hypothesis, we

examine the effectiveness of using logical inference

to find which pairs of semantic statements, each of

which describes the research presented in a single

research article, have the highest semantic similarity.

In the following sections, we describe the method

for calculating the semantic similarity between a set

of semantic statements, and the gold standard we

have created to evaluate the matching results.

2.1 Semantic Statement Matching

In order to study the different kinds of semantic

matching techniques described in the previous

section, we have developed a semantic matching

tool that supports three of the basic types of

matching described by Guo and Kraines (2008):

matching of classes only, matching of triples without

logical inference, and matching using DL inference.

The process flow for the matching tool is illustrated

in figure 1. All matching tasks take a set of semantic

statements together with a complete set of matching

options as inputs, and they output a list of matching

results giving the calculated matching score between

each pair of semantic statements together with the

specific bindings between triples or instances,

depending on whether or not triples are used in the

semantic matching. In addition to the matching type,

the matching options include class and property

generalization rules and inclusion of property

inverses and symmetry for triple-based matching.

The semantic matching tool generates a search

query set from each semantic statement as follows.

CALCULATING SEMANTIC SIMILARITY BETWEEN COMPUTER-UNDERSTANDABLE DESCRIPTORS OF

SCIENTIFIC RESEARCH

147

First, the semantic statement is decomposed into a

set of atomic search queries, which are all of the

classes of the instances in the statement in the case

of “class-based” matching and all of the semantic

triples in the statement otherwise. Because queries

are matched using class and property taxonomies,

queries containing general classes and properties

will match with many semantic statements, but

queries with specific classes and properties will

often not find any matches. In order to compensate

for this difference, we provide users with the option

to specify class and property generalization rules.

These are essentially lists of classes (properties) that

are to be substituted for any subclasses

(subproperties) that occur in the atomic search

queries. For example, the analysis described here

uses property generalization rules that include the

property “has participant”. Therefore, all of the

properties in all search queries that are subproperties

of the property “has participant”, such as “produces”,

“consumes” and “has actor”, are replaced with “has

participant”. The result of this process is a set of

atomic search queries that represent the different

semantic assertions contained in the original

semantic statement at a particular level of semantic

specificity that is specified by the user.

Figure 1: Flow diagram of the semantic matching

algorithm.

After the search query sets are generated, each

atomic search query in each search query set is

matched with all of the original semantic statements,

which we call “target statements”. In the case of

“class-based” matching, each atomic search query is

just a single (possibly generalized) class; property

information is ignored. Consequently, a match is

recorded between the atomic search query and each

instance in the target statement whose class is the

same as or a subclass of the class given by the query.

For “triple-based” matching, each atomic search

query is a semantic triple comprised of a directed

property from a domain class to a range class. Each

atomic search query is matched with each semantic

triple in each target statement. A match occurs if the

class of the domain instance, the class of the range

instance and the property of the triple from the target

statement are all equal to or subsumed by the

respective classes and properties in the atomic

search query. If the user has included the matching

option “use property inverses” and the property in

the atomic search query has an inverse, then the

inverse property is substituted into the search query

and the classes of the domain and range are reversed.

The modified atomic search query is then matched

once again with all of the triples in the target

statement. Similarly, if the user has included “use

property symmetry” and the property is declared in

the ontology to be symmetric, then the classes of the

domain and range are reversed and the modified

atomic search query is matched again with all of the

triples in the target statement.

In “DL-based” matching, once again each atomic

search query is a semantic triple. First, all instances

and properties of one target statement are loaded

into the knowledge base of the DL reasoner (we use

RacerPro here) as the ABox, together with the

ontology which comprises the TBox. Then each

atomic search query is evaluated by the reasoner

against the knowledge base. If the reasoner finds an

answer set to the query, then all pairs of instances in

the ABox that can be bound to the class variables in

the search query are recorded. The ABox of the

knowledge base is then cleared, and the next target

statement is loaded into the knowledge base.

In both “triple-based” and “DL-based” semantic

matching, multiple pairs of instances in the target

statement may match with a particular atomic search

query. Also, it is possible that more than one search

query may match with a particular pair of instances.

To obtain the score of a match between a search

query set and a target statement, we calculate

weights for each atomic search query using inverse

document frequency (IDF) (Spark Jones, 1972):

weight of atomic query =

ln[(total # of statements)/

(# of statements having at least 1

match with the atomic query)]

(1)

Begin

End

Receive matching parameters:

- matching type, generalization rules, etc.

Generate search query sets from each semantic

statement

- Decompose statement into triples or instances

- Replace properties and classes according to the

generalization rules

Calculate matching score from IDFs of each atomic query

Match each atomic query in each search query set

with each target statement

- Class-based: match atomic queries to instances

- Triple-based: match atomic queries to triples

- DL-based: query DL reasoner for bindings to atomic

query

KMIS 2011 - International Conference on Knowledge Management and Information Sharing

148

The matching score is then just the normalized

sum of the weights of the matching atomic queries:

score

(

s, t

)

∑

(weight o

matched atomic quer



)





∑

(weight o

atomic quer



)





(2)

Where n is the total number of atomic queries in

the search query set s, and m is the number of

atomic queries that have at least one match in target

statement t.

2.2 Creating the Gold Standard

In order to evaluate the precision and recall of the

matching results using each of the semantic

matching techniques, we have created a gold

standard that gives the “correct” matches between a

set of 15 semantic statements that were created using

the EKOSS system to describe research articles on

topics related to sustainability science. The semantic

statements were authored using the SCINTENG

ontology implemented in OWL-DL, which makes

extensive use of the logical constructs provided in

the DL framework such as domain and range

restrictions on properties, logical characteristics of

properties such as being transitive or functional, and

universal, existential, and cardinal restrictions on

classes (Kraines and Guo, 2011). All of these

constructs can be used for semantic inference.

We have chosen to focus on a small set of the

semantic statements in order to be able to thoroughly

investigate all of the matching results. The 15

semantic statements were selected for research

articles that are recent, published in internationally

recognized journals, and are representative of the

coverage of the SCINTENG ontology: five articles

focus on experimental studies in material science,

four articles focus on modeling studies of energy

devices, three articles focus on studies of natural or

agrarian ecosystems, and three articles focus on

analyses of economic systems. Each semantic

statement contains on average about 40 semantic

triples, so there are over 500 triples, not including

the triples in the TBox and the triples obtained

through logic inference.

We created the gold standard by examining the

actual semantic similarity between pairs of semantic

statements that had non-zero scores when DL

inference was used for semantic matching. This set

of matches necessarily includes all of the matches

generated using triples, both with and without

property inverses and symmetry. However, many of

the matches generated using class matching will not

be in this set, and these will all be treated as negative

matches. We discuss this issue in section 4.

Figure 2 shows the results of matching the 15

research articles using DL inference. Each of the

articles appears on both axes, so the diagonal

consists of 100% matches of the semantic statement

for a research article with itself. The matrix is not

diagonal, however, because the semantic matching

techniques are based on logical inference matches

from the search query sets (on the X axis) to the

target statements (on the Y axis).

Figure 2: Results of matching semantic statements using

DL inference. Labels on the X axis are short versions of

the article titles. Target statements are on the Y-axis and

search query sets for each of the semantic statements are

on the X-axis. Shades of gray show match scores: a white

cell is a zero match and a black cell is a match with a score

of 20% or more. The gold standard scores are shown as

circled numbers from 0 to 9 overlaying the shaded cells.

Black circles are used on cells where the actual matching

score using DL inference was less than 8%, and white

circles are used where the match score was 8% or more.

We examined each of the 81 non-zero matches

off the diagonal to evaluate the actual semantic

similarity. There are 210 pairs of different semantic

statements, so this gives a coverage of 39%. We

manually assigned a semantic similarity score to

each of the matches based on our knowledge of the

CALCULATING SEMANTIC SIMILARITY BETWEEN COMPUTER-UNDERSTANDABLE DESCRIPTORS OF

SCIENTIFIC RESEARCH

149

matching domain, as

shown in figure 2. The

occasional disagreements between the match

scores and the gold standard scores, e.g. the black

cell in the first column that is labelled with a “1”,

are indication that even matching using DL

inference does not give perfectly accurate matches.

3 RESULTS

We ran the semantic matching process using class-

based matching (classes), triple-based matching

without inverses or symmetry (triples), triple-based

matching with both inverses or symmetry (triples+),

and DL inference (DL). All of the matching

techniques include inference over class and property

hierarchies defined in the ontology. We used a set of

property generalization rules but no class

generalization rules for all of the semantic matching.

To calculate precision and recall, we treat all of

the matches in the gold standard having scores

greater than or equal to 5 as true positives. We

calculated PR curves by adjusting the cut off for the

semantic matching results. Only 9 matches were

greater than 15% in the DL semantic matching case,

as shown in figure 2, so we have evaluated the

number of true and false positives and negatives for

cut off values of 1%, 2%, 5%, 8%, and 10%.

The resulting PR curves are shown in figure 3.

Except for the last point for the 10% cutoff,

matching with DL inference outperformed all other

matching techniques. While class matching tended

to have high recall, the large number of matches that

did not fulfil our criteria for the gold standard (at

least one matching triple) meant very low values for

precision. Even at the 10% cut off, the precision for

the class matching case was just 60%.

The PR value for the DL inference case at the

10% cut off is clearly a low performance result,

having both lower precision and lower recall than

the previous cutoff at 8%. The reason for the

simultaneous decrease in precision and recall is as

follows. The number of true positives decreased

from 18 to 13 when the cutoff was raised from 8% to

10%. However, the cut off did not actually result in

the removal of any false positives because already at

8% cutoff there were only 5 false positives. A

decrease in true positives with no change in false

positives resulted in a decrease in both precision and

recall. In comparison, the triples+ case at 8% cutoff

had one less false positive than the DL inference

case at 10% cutoff with the same number of true

positives, so although the recall is the same, the

precision is slightly better for the triples + case.

The simultaneous decrease in precision and

recall for the triples case from 5% cutoff to 10%

cutoff occur for reasons similar to the DL inference

case from 8% cutoff to 10% cutoff.

Figure 3: PR curves for the four semantic matching runs

described in the text.

4 DISCUSSION

The approach we have used for creating the gold

standard favors the DL matching results. Therefore,

the comparison of PR curves in the previous section

should be viewed as a measure of the degree to

which simpler forms of semantic matching can

reproduce the results given by the DL inference

based semantic matching. However, we believe that

this is a useful evaluation because it is reasonable to

think that the manual (rather than computer-

extracted) addition of rich semantic information

should result in more semantically accurate matches.

Ontologies formalized in logic have well known

limitations in regard to expression of fuzzy concepts

and uncertainty. Furthermore, the logical formalisms

provided by OWL-DL do not allow us to express the

propagation of a relationship from one property to

another (Horrocks et al., 2006). For example, we

might want to infer that if a particular instance of

activity, such as “driving”, is identified to be located

in a particular city “Tokyo” and have a particular

participant “diesel truck”, then we know that the

“diesel truck” has location “Tokyo” (at least for the

duration of the activity). This function is supported

in the new OWL2 protocol (Grau et al., 2008). The

propagation function could also be implemented by

defining Horn clause rules that create new properties

KMIS 2011 - International Conference on Knowledge Management and Information Sharing

150

every time a propagating combination is detected.

For the example given above, the Horn clause would

have the head “A has location B and A has

participant C” and the implication “C has location

B”. We plan to examine the effect of adding rules

such as this on semantic matching in future work.

5 CONCLUSIONS

We have described a method and tool for matching

semantic statements that represent the expert

knowledge reported in research articles based on a

DL ontology. The precision and recall of matching

using DL inference versus matching triples or

classes directly without inference were measured

using a gold standard prepared specifically for this

study. The results indicate that the non-inference

matching techniques were significantly less accurate

than matching with DL inference.

ACKNOWLEDGEMENTS

This work has been supported by funding from the

Japan Office of the Alliance for Global

Sustainability and the Office of the President of the

University of Tokyo.

REFERENCES

Alani, H., Kalfoglou, Y., O'Hara, K., Shadbolt, N., 2005.

Towards a Killer App for the Semantic Web. ISWC

2005 LNCS, 3729, pp. 829-843.

Allenby, B., 2006. The ontologies of industrial ecology?

Progress in Industrial Ecology, 3(1): 28-40.

Attwood, T. K., Kell, D. B., McDermott, P., Marsh, J.,

Pettifer, S. R., Thorne, D., 2009. Calling International

Rescue: knowledge lost in literature and data

landslide! Biochemical Journal, 242: 317-333.

Berners-Lee, T., Hendler, J., 2001. Publishing on the

Semantic Web. Nature, 410: 1023—1024.

Cahlik, T., 2000. Comparison of the maps of science.

Scientometrics, 49: 373-387.

Ceol, A., Chatr-Aryamontri, A., Licata, L., Cesareni, G.,

2008. Linking Entries in Protein Interaction Database

to Structured Text: the FEBS Letters Experiment.

FEBS letters, 582(8), 1171-1177.

Davis, C., Nikolic, I., Dijkema, G. P. J., 2009. Integration

of Life Cycle Assessment Into Agent-Based Modeling.

J. Industrial Ecology, 13: 306-325.

DeRose, P., Shen, W., Chen, F., Doan, A., Ramakrishnan,

R., 2007. Building structured web community portals:

a top-down, compositional, and incremental approach.

In VLDB '07: Proc 33rd Intl Conf on very large data

bases, Vienna, Austria, pp. 399—410.

Erhardt, R. A-A., Schneider, R., Blaschke, C., 2006.

Status of text-mining techniques applied to biomedical

text. Drug Discovery Today, 11(7-8), 315-325.

Gerstein, M., Seringhaus, M., Fields, S., 2007. Structured

digital abstract makes text mining easy. Nature, 447:

142.

Grau, B. C., Horrocks, I., Motik, B., Parsia, B., Patel-

Schneider, P., and Sattler, U., 2008. OWL 2: The next

step for OWL. Web Semantics: Science, Services and

Agents on the World Wide Web 6(4): 309—322.

Guo, W., Kraines, S. B., 2008. Explicit scientific

knowledge comparison based on semantic description

matching. Annual meeting of the ASIST 2008,

Columbus, Ohio.

Hess, C., Schliedera, C., 2006. Ontology-based

verification of core model conformity in conceptual

modeling. Comp, Environ Urban Sys, 30(5):543-561.

Horrocks, I., Kutz, O., Sattler, U., 2006. The even more

irresistible SROIQ. In KR, AAAI Press, pp: 57-67.

Kajikawa, Y., Ohno, J., Takeda, Y., Matsushima, K.,

Komiyama, H., 2007. Creating an academic landscape

of sustainability science: an analysis of the citation

network. Sustainability Science, 2(2): 221—231.

Kraines, S. B., Guo, W., 2011. A system for ontology-

based sharing of expert knowledge in sustainability

science. Data Science Journal, 9: 107—123.

Kraines, S. B., Batres, R., Koyama, M., Wallace, D. R.,

Komiyama, H., 2005. Internet-based integrated

environmental assessment: using ontologies to share

computational models. J. Industrial Ecology

, 9: 31-50.

Kraines, S. B., Guo, W., Kemper, B., Nakamura, Y., 2006.

EKOSS: A knowledge-user centered approach to

knowledge sharing, discovery, and integration on the

Semantic Web. ISWC 2006 LNCS, 4273: 833—2091.

Kumazawa, T., Saito, O., Kozaki, K., Matsui, T.,

Mizoguchi, R., 2009. Toward knowledge structuring

of Sustainability Science based on ontology

engineering. Sustainability Science, 4(1):99—116.

Lane, J., Bertuzzi, S., 2011. Measuring the results of

science investments. Science, 331: 678—680.

Neumann, E., Prusak, L., 2007. Knowledge networks in

the age of the Semantic Web. Briefings in

Bioinfomatics, 8 (3):141-149.

Power, R., 2009. Towards a generation-based semantic

web authoring tool. In ENLG '09: Proc. 12th

European Workshop on Natural Language Generation,

Athens, Greece, pp. 9-15.

Sparck Jones, K., 1972. A statistical interpretation of term

specificity and its application in retrieval. Journal of

Documentation, 28(1): 11—21.

Takeuchi, K., Komiyama, H., 2006. Sustainability science:

building a new discipline. Sustainability Sci, 1(1): 1-6.

Uren, V., Cimiano, P., Iria, J., Handschuh, S., Vargas-

Vera, M., Motta, E., Ciravegna, F., 2006. Semantic

annotation for knowledge management: requirements

and a survey of the state of the art. Journal of Web

Semantics, 4 (1): 14—28.

CALCULATING SEMANTIC SIMILARITY BETWEEN COMPUTER-UNDERSTANDABLE DESCRIPTORS OF

SCIENTIFIC RESEARCH

151