Parsimonious Representation of Knowledge Uncertainty using Metadata about Validity and Completeness

Célia da Costa Pereira¹, Didier Dubois², Henri Prade² and Andrea G. B. Tettamanzi³
¹Université Côte d'Azur, CNRS, I3S, Sophia Antipolis, France
²IRIT – CNRS, 118, route de Narbonne, Toulouse, France
³Université Côte d'Azur, Inria, CNRS, I3S, Sophia Antipolis, France
Keywords:
Knowledge Representation, Possibility Theory.
Abstract:
We investigate how metadata about the uncertainty of knowledge contained in a knowledge base can be ex-
pressed parsimoniously and used for reasoning. We propose an approach based on possibility theory, whereby
a classical knowledge base plus metadata about the degree of validity and completeness of some of its portions
are used to represent a possibilistic belief base. We show how reasoning on such a belief base can be done using
a classical reasoner.
1 INTRODUCTION AND
RELATED WORK
In general, the process of getting a piece of information from a Knowledge Base (KB) is driven by practical purposes: such a piece of information will be used, for example, to justify certain decisions. Its quality therefore plays an important role in the success of the decisions made. The quality of a piece of information can be measured along different dimensions. Most contributions in the literature concentrate exclusively on the amount of true (known) facts in a KB when assessing its quality. (Wick et al., 2013), for example, propose several algorithms for estimating a confidence value based on the probability that a fact in a KB is true. In (Dong et al., 2014), the authors studied the applicability of data fusion techniques to the problem of populating a knowledge base. The criterion they used to construct quality knowledge bases was to identify the true values of data items among multiple observed values provided by different (and possibly unknown) sources with different reliabilities. However, as has been pointed out, for example, by (Razniewski et al., 2016),
While quite some facts are known about the
world, little is known about how much is un-
known.
In other words, a knowledge base is in general incomplete. This obviously has an impact on the overall quality of a KB: the more incomplete it is, the lower its quality, and the more the pieces of information extracted from it have to be used with caution.
The problem of representing both validity and completeness was addressed for information stored in databases many years before being addressed for knowledge bases (KBs). For example, the model of database integrity proposed by (Motro, 1989) and the work by (Demolombe, 1996), who used modal logic for reasoning about the validity and completeness of information stored in relational databases, can be regarded as precursors of ideas that were later adopted for KBs. However, the representation of incompleteness in information stored in databases was itself inspired by earlier work on the representation of incompleteness in knowledge bases, such as that of (Levesque, 1980; Levesque, 1982), and by the approach proposed by (Collins et al., 1975) for reasoning with this kind of knowledge base.
Recent work on annotating KBs with metadata
about their completeness has been done, in the con-
text of the semantic Web, by (Darari et al., 2013;
Razniewski et al., 2016), who studied the way in
which statements about completeness can be used
when answering queries. According to their ap-
proach, it is then possible, given a statement about
a topic, to specify if information about it in the base is
complete or not. However, the gradual view of com-
pleteness in data sources has not been considered.
Solutions to construct a possibilistic belief base
from a crisp KB using topical validity and complete-
ness metadata, like (da Costa Pereira et al., 2017), suf-
fer from some limitations, mainly due to the fact that,
in order to guarantee consistency, they have to sacri-
fice much of the expressive power of the knowledge
representation language. In particular, the "facts" that can be recorded in the knowledge base are restricted to ground formulas without negation or disjunction. Indeed, negative information (i.e., facts that
do not hold) is critical for the correctness of queries
involving negation (Razniewski et al., 2016). Com-
pleteness and negative information are closely re-
lated: if we know that a portion of a KB is complete, it
is as if we knew an infinity of negated facts (all those
relevant to that portion that are not in the KB).
Motivated by the above considerations, we want to answer the following research question: can a classical knowledge base, plus metadata on its (gradual) validity and completeness with respect to a few configurations (groups, portions, subjects, or topics of the statements it contains), be used to represent a possibilistic belief base and to perform possibilistic inferences with a classical reasoner? This research question leads us to a further sub-question: what would be a suitable definition of such a configuration, which in turn will allow us to define appropriate validity and completeness functions?
We propose a framework based on possibility the-
ory to represent and reason about gradual notions of
validity and completeness in KBs. Since it would be
impractical to associate values of possibility and ne-
cessity to each single assertion in a knowledge base
(KB), we show that, thanks to validity and complete-
ness metadata, it is possible to express degrees of pos-
sibility/necessity for formulas entailed by the KB in a
parsimonious way (i.e., without having to associate a
weight to each single formula) and to perform possi-
bilistic inferences on top of a classical KB, consider-
ing, in addition to possibilistic uncertainty, also nega-
tive information.
In particular, we present a way to represent validity and completeness information (with respect to particular slices of information) in a knowledge base that allows this information to simulate a possibilistic knowledge base, where the possibilistic information is derived exclusively from the validity and completeness metadata. The advantage of this approach is that validity and completeness are assessed only at the slice level, whereas in a possibilistic knowledge base the possibility distribution has to be defined over all facts. Therefore, when the number of slices is much smaller than the number of facts, the proposed representation is much more parsimonious.
The paper is organized as follows: Section 2 gives some background about the formal tools we use.
We present our proposal in Section 3, which explains
how (gradual) validity and completeness are related
to the beliefs of an agent. Finally, Section 4 discusses
some possible applications of our proposal.
2 BACKGROUND
We first provide a brief refresher on possibility the-
ory, before recalling the basics of possibilistic logic, a
logic where classical formulas are weighted in terms
of certainty.
2.1 Possibility Theory
Fuzzy sets (Zadeh, 1965) are sets whose elements
have degrees of membership in [0, 1]. Possibility the-
ory (Dubois and Prade, 1988) is a mathematical the-
ory of uncertainty that relies upon fuzzy set theory,
in that the (fuzzy) set of possible values for a vari-
able of interest is used to describe the uncertainty as
to its precise value. At the semantic level, the membership function of such a set, π, is called a possibility distribution and its range is [0, 1]. A possibility distribution can represent the available knowledge of an agent: when uncertain pieces of knowledge are represented, π(I) is the degree of compatibility of the interpretation I with the available knowledge about the real world. By convention, π(I) = 1 means that it is totally possible for I to be the real world, 1 > π(I) > 0 means that I is only somewhat possible, while π(I) = 0 means that I is certainly not the real world.
A possibility distribution π is said to be normalized if there exists at least one interpretation I_0 s.t. π(I_0) = 1, i.e., there exists at least one possible situation which is consistent with the available knowledge.
Definition 1. (Possibility and Necessity Measures) A possibility distribution π induces a possibility measure and its dual necessity measure, denoted by Π and N respectively. Both measures apply to a classical set S and are defined as follows:

Π(S) = max_{I ∈ S} π(I);   (1)

N(S) = 1 − Π(S̄) = min_{I ∈ S̄} {1 − π(I)}.   (2)
In words, Π(S) expresses to what extent S is consistent with the available knowledge. Conversely, N(S) expresses to what extent S is entailed by the available knowledge. It is equivalent to the impossibility of its complement S̄: the more S̄ is impossible, the more S is certain. A few properties of Π and N induced by a normalized possibility distribution on a finite universe of discourse Ω are the following. For all subsets A, B ⊆ Ω:
1. Π(A ∪ B) = max{Π(A), Π(B)};
2. Π(A ∩ B) ≤ min{Π(A), Π(B)};
3. Π(∅) = N(∅) = 0; Π(Ω) = N(Ω) = 1;
4. N(A ∩ B) = min{N(A), N(B)};
5. N(A ∪ B) ≥ max{N(A), N(B)};
6. Π(A) = 1 − N(Ā) (duality);
7. N(A) > 0 ⇒ Π(A) = 1; Π(A) < 1 ⇒ N(A) = 0.
A consequence of these properties is that max{Π(A), Π(Ā)} = 1. In case of complete ignorance on A, Π(A) = Π(Ā) = 1.
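To make the interplay between π, Π, and N concrete, the following minimal Python sketch (not part of the original paper) computes the two measures over a finite, explicitly enumerated set of interpretations; the interpretation names and the distribution values are purely illustrative.

```python
# Minimal sketch (not from the paper): possibility and necessity measures
# induced by a possibility distribution over a finite set of interpretations.

def possibility(pi, event):
    """Pi(S) = max_{I in S} pi(I); the possibility of the empty event is 0."""
    return max((pi[i] for i in event), default=0.0)

def necessity(pi, event, universe):
    """N(S) = 1 - Pi(complement of S)."""
    return 1.0 - possibility(pi, universe - event)

# Toy, normalized distribution over three interpretations (values invented).
universe = {"I1", "I2", "I3"}
pi = {"I1": 1.0, "I2": 0.7, "I3": 0.2}

A = {"I1", "I2"}
print(possibility(pi, A))          # 1.0
print(necessity(pi, A, universe))  # 0.8, i.e. 1 - pi(I3)
# Duality: for a normalized distribution, max{Pi(A), Pi(complement of A)} = 1.
assert max(possibility(pi, A), possibility(pi, universe - A)) == 1.0
```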
2.2 Possibilistic Logic
Before going into details about possibilistic logic, we
would like to put forward our motivation for such a
logic for handling uncertainty in this work. Informa-
tion is often pervaded with uncertainty, and it may
be convenient to associate pieces of information with
certainty levels. These certainty levels can often be
qualitatively assessed only using a finite completely
ordered scale ranging from “fully certain” to “not cer-
tain at all”, with intermediary levels such as “almost
certain”, or “somewhat certain”. Possibility theory
offers such a qualitative setting, when a finite sub-
set of [0, 1] including 0 and 1 is used and then only
the ordering of the degrees in [0, 1] is meaningful,
in agreement with the use of max and min operators. Moreover, the order-reversing mapping 1 − (·) exchanges the necessity scale with a possibility scale, such as "fully possible", "quite possible", "somewhat possible", "not possible at all (= impossible)". In the following, the pieces of information are associated with certainty levels which are viewed as lower bounds of necessity measures. Then, the min-decomposability of necessity measures with respect to conjunction reflects the fact that, to be certain at least at some level α that a conjunction of facts is true, we must be certain at least at level α of the truth of each fact.
Possibilistic logic (Dubois et al., 1994) has been
originally motivated by the need to manipulate syn-
tactic expressions of the form (φ, α) where φ is a clas-
sical logic formula, and α is a certainty level, with the
intended semantics that N(φ) ≥ α, where N is a neces-
sity measure. It is then possible to consider that all the
propositions of the considered language can be totally
ordered on a given scale. In our case, propositions are
formulas. Besides, in possibilistic logic, a level of in-
consistency can be associated with a knowledge base
as recalled now.
A possibilistic knowledge base B is a set of possibilistic logic formulas {(φ_i, α_i) | i = 1, ..., m}. Clearly, B can be layered into a set of nested classical bases B_α = {φ_i | (φ_i, α_i) ∈ B and α_i ≥ α} such that B_α ⊆ B_β if α ≥ β. Proving syntactically B ⊢ (φ, α) amounts to proceeding by refutation and proving B ∪ {(¬φ, 1)} ⊢ (⊥, α) by repeated application of the resolution rule (¬φ ∨ ψ, α), (φ ∨ ν, β) ⊢ (ψ ∨ ν, min(α, β)). Moreover, B ⊢ (φ, α) if and only if B_α ⊢ φ and α > inc(B), where inc(B) is the inconsistency level of B, defined as inc(B) = max{α | B ⊢ (⊥, α)}. It can be shown that inc(B) = 0 iff B^* is consistent, with B^* = {φ_i | (φ_i, α_i) ∈ B}. Thus reasoning from a possibilistic base just amounts to reasoning classically with the subparts of the base made of formulas whose certainty levels are strictly above the inconsistency level.
A possibilistic knowledge base B = {(φ_i, α_i) | i = 1, ..., m} encodes the constraints N(φ_i) ≥ α_i. B is thus semantically associated with a possibility distribution (Dubois et al., 1994)

π_B(I) = min_{i=1,...,m} max(φ_i^I, 1 − α_i),

where φ_i^I = 1 if I is a model of φ_i, and φ_i^I = 0 otherwise. As it can be seen, π_B(I) is all the larger as the interpretation I makes false only formulas with low certainty levels. π_B is the largest possibility distribution, i.e., the least committed distribution, assigning the largest possibility levels in agreement with the constraints N(φ_i) ≥ α_i for i = 1, ..., m. The distribution π_B rank-orders the interpretations I of the language induced by the φ_i's according to their plausibility, on the basis of the strength of the pieces of information in B. If the set of formulas B^* is consistent, then the distribution π_B is normalized (i.e., ∃I, π_B(I) = 1). The semantic entailment is defined by B ⊨ (φ, α) iff ∀I, π_B(I) ≤ π_{(φ,α)}(I). Reasoning by refutation in propositional possibilistic logic is sound and complete, applying the syntactic resolution rule. Namely, it can be shown that B ⊨ (φ, α) iff B ⊢ (φ, α), and inc(B) = 1 − max_I π_B(I).
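As an illustration of this semantics, here is a small Python sketch (ours, not the authors') that builds π_B for a toy propositional base, recovers the inconsistency level inc(B) = 1 − max_I π_B(I), and evaluates the necessity of a query; formulas are encoded as Python predicates over an interpretation, and the atoms and weights are invented for the example.

```python
# Minimal sketch (not from the paper): the possibility distribution pi_B of a
# toy propositional possibilistic base. Formulas are Python predicates over an
# interpretation (a dict atom -> bool); atoms and weights are invented.
from itertools import product

atoms = ["p", "q"]
# B = {(p, 0.8), (p -> q, 0.5)} encodes N(p) >= 0.8 and N(p -> q) >= 0.5.
B = [
    (lambda I: I["p"], 0.8),
    (lambda I: (not I["p"]) or I["q"], 0.5),
]

def interpretations():
    for values in product([False, True], repeat=len(atoms)):
        yield dict(zip(atoms, values))

def pi_B(I):
    # pi_B(I) = min_i max(phi_i^I, 1 - alpha_i)
    return min(max(1.0 if phi(I) else 0.0, 1.0 - alpha) for phi, alpha in B)

def necessity(query):
    # N(query) = 1 - max of pi_B over the countermodels of the query.
    return 1.0 - max((pi_B(I) for I in interpretations() if not query(I)), default=0.0)

inc_B = 1.0 - max(pi_B(I) for I in interpretations())
print(inc_B)                        # 0.0: B* is consistent
print(necessity(lambda I: I["q"]))  # 0.5: (q, 0.5) follows from B
```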
Algorithms for reasoning in possibilistic logic and
an analysis of their complexity, which is similar to the
one of classical logic, multiplied by the logarithm of
the number of levels used in the necessity scale, can
be found in (Lang, 2001).
3 REPRESENTING AND
REASONING WITH VALIDITY
AND COMPLETENESS
As was hypothesized by (Motro, 1989) for the case of relational databases, here, to formalize the concepts of validity and completeness in the case of knowledge bases, we shall assume the existence of a hypothetical knowledge base that captures a designated environment of the real world perfectly. The knowledge base K mentioned in this paper is then an approximation of such a hypothetical knowledge base.
When dealing with relational databases, only the
statements explicitly present in the database are con-
sidered as true (valid). The others are considered as
false (closed world assumption). When dealing with
sets of formulas, the true statements are those explic-
itly represented in the dataset, plus those which can
be inferred thanks to a reasoner. However, due to the
open world assumption, we cannot suppose that the
other statements are false—the truth status of some
statements may be unknown in case of incomplete
knowledge.
In this section, we recall the notions of validity
and completeness, first introduced in (Demolombe,
1996), and made gradual in the setting of possibilis-
tic logic (Dubois and Prade, 1997) for dealing with
relational databases and adapt them to the more gen-
eral setting of knowledge bases, where (i) the open
world assumption holds, (ii) implicit knowledge can
be inferred by logical deduction, and (iii) negative in-
formation is also taken into account unlike what was
proposed in (da Costa Pereira et al., 2017).
It is often the case that the knowledge contained in
a knowledge base is not all certain to the same degree.
There will be statements whose truth is absolutely cer-
tain. This might be the case of ontological axioms
or integrity constraints. Other groups of statements,
obtained for example from the same source or cov-
ering the same subject, might have the same degree
of certainty, but statements from different portions of
the knowledge base might be believed with greater or
lower certainty.
Our working hypothesis is that, as suggested
by (da Costa Pereira et al., 2017), the degree of cer-
tainty of every piece of information depends on the
degree to which the knowledge base is valid and com-
plete with respect to all groups, portions, subjects,
topics (or whatever else we wish to call them) of state-
ments it contains. We think that an intuitive name
for this notion of a semantically determined homoge-
neous portion of a knowledge base may be a slice and
we will stick to this term from now on.
While it is true that the term slice might lead to confusion with the same term as used in the hypercube data model (Gray et al., 1997), the suggestion that a slice may be a subset defined by fixing one or more dimensions is, as a matter of fact, a good and useful intuition. Indeed, if we interpret slices as knowledge "topics" or "domains", then this is exactly what slices are, with the specificity that here every "dimension" can be viewed as a binary truth assignment to a formula, e.g., in a hypothetical knowledge base about travel, "x is a flight and x departs from London", thus giving the slice of the knowledge base that provides information about flights departing from London.
3.1 Postulates
To be able to talk about the validity and complete-
ness of information stored in a knowledge base with
respect to a particular slice, we need a formal way of
defining the latter. We begin with the most general
and neutral definition, whereby a slice T is just a set
of formulas. A more precise definition is deferred to
when we will have discussed the properties that a slice
must satisfy.
A few basic postulates for slices, based on com-
mon sense arguments, are the following.
P1. Slices are non-empty. We assume slices are de-
fined by the designer or a user of a knowledge
base in order to state metadata about the validity
and completeness of portions of knowledge in the
base; defining an empty slice would defeat its pur-
pose.
P2. Slices are all distinct. Defining two equivalent
slices would be redundant and of no practical use;
therefore, we can safely bar this possibility.
P3. For every slice not contained in another slice (we may call it a "top-level" slice), there exists a formula entailed by the knowledge base that belongs in that slice and in no other slice.
What justifies stating Postulate P3 is that for every
portion of a knowledge base a knowledge engineer
might want to define in order to state metadata on it,
one would expect that either that portion is a proper
subset of another portion (i.e., a sort of sub-topic or
sub-domain), or, if it is not, then its very definition is
motivated by the existence of some facts that are not
covered by other slices. For instance, in a knowledge
base about travel, I might want to define a slice about
“airports” because there are assertions involving air-
ports, like “London Heathrow (LHR) has four opera-
tional terminals”, that do not deal with any other pos-
sible slices, like “flights”, “airlines”, “aircraft”, and so
on. Or I might choose to define “aviation”, which in-
cludes all of them, including the assertion about LHR,
which is not covered by any other existing slice.
Let K be a set of formulas in a decidable logical language L, for which there exists a reasoner capable of performing inferences and of deducing formulas which are not explicitly contained in K.
Under the closed-world hypothesis typical of
databases, which is the setting in which (Dubois and
Prade, 1997) was stated, it would be reasonable to
admit that what cannot be deduced from an agent’s
knowledge base corresponds to what the agent be-
lieves to be false.
However, in the case of a knowledge base, the
open-world assumption holds and the agent is capa-
ble of performing logical inferences (e.g., thanks to a
reasoner). Therefore, we must think in terms of logi-
cal entailment of formulas.
Without loss of generality, we will assume a Her-
brand semantics for L.
Definition 2. The Herbrand base of L is the set H_L of all ground atoms in L. An interpretation (or model) is a function I : H_L → {0, 1}, which can also be viewed as a subset of the Herbrand base, I ⊆ H_L (the set of all atoms φ such that φ^I = 1). We denote by Ω = 2^{H_L} the set of all interpretations.
We write K ⊨ φ to denote the fact that formula φ is a logical consequence of all the formulas in K. Assuming the usual definition of satisfaction (given an interpretation I and a formula φ ∈ L, I ⊨ φ if and only if φ evaluates to true in I), we define the notion of entailment as follows: K ⊨ φ if and only if, for every interpretation I, I ⊨ K implies I ⊨ φ. Using a sound and complete reasoner, if K ⊨ φ, then φ can also be deduced from K by the agent (which we write K ⊢ φ), whereas if K ⊬ φ (φ cannot be deduced from K), this means that K ⊭ φ (φ is not a logical consequence of K). Finally, given a set S of formulas, K ⊨ S if and only if ∀φ ∈ S, K ⊨ φ.
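For readers who want to experiment, the following sketch (an assumption-laden illustration, not part of the paper) implements this brute-force notion of entailment over a tiny, finite Herbrand base; the ground atoms and the encoding of formulas as predicates are ours.

```python
# Minimal sketch (not from the paper): brute-force entailment over a small,
# finite Herbrand base. An interpretation is a subset of the Herbrand base;
# formulas are predicates over interpretations. All names are illustrative.
from itertools import chain, combinations

herbrand_base = ("Flight(af123)", "DepartsFrom(af123, CDG)")

def interpretations(base):
    """Every subset of the Herbrand base, viewed as an interpretation."""
    return chain.from_iterable(combinations(base, r) for r in range(len(base) + 1))

def entails(kb, query):
    """K |= query iff every interpretation satisfying all of K satisfies query."""
    return all(
        query(I)
        for I in map(frozenset, interpretations(herbrand_base))
        if all(phi(I) for phi in kb)
    )

# K = { Flight(af123), Flight(af123) -> DepartsFrom(af123, CDG) }
K = [
    lambda I: "Flight(af123)" in I,
    lambda I: ("Flight(af123)" not in I) or ("DepartsFrom(af123, CDG)" in I),
]
print(entails(K, lambda I: "DepartsFrom(af123, CDG)" in I))   # True
print(entails(K, lambda I: "Flight(ba456)" in I))             # False: not entailed
```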
3.2 Graded Validity and Completeness
The purpose of a knowledge base is to store axioms
and assertions that summarize an agent’s knowledge
about the world or, at least, a limited portion of the
world, which is relevant to the problem the agent is in-
tended to deal with. We will take an objectivist stance
by assuming that there exists, among all the possible
interpretations of language L, one that reflects the actual state of affairs. Let us denote such an interpretation by I*. Then we may say that the objective truth of any formula φ ∈ L is given by φ^{I*}. To be absolutely clear about that, we are assuming that I* is real and that an objective truth exists for every formula, independently of a knowledge base K and of the agent using it. As a matter of fact, the knowledge represented in (a slice of) K might (and, in general, will) not reflect reality perfectly or accurately.
Given this premise, the notions of validity and completeness of a knowledge base K with respect to a slice may be defined as follows:
- K is valid with respect to a slice iff, for every formula φ in that slice, K ⊨ φ implies that I* ⊨ φ, i.e., φ is objectively true;
- K is complete with respect to a slice iff, for every formula φ in that slice, K ⊭ φ implies that I* ⊭ φ, i.e., φ is objectively false.
As pointed out in Section 1, we will use the term
beliefs to refer to (possibly partial, incomplete, or in-
valid) information held by an agent. An agent may
then believe something to different degrees. We sup-
pose that these degrees depend on both the degree of
completeness of the sets of statements and on the re-
liability or trustworthiness of the information source.
For example (da Costa Pereira et al., 2017), informa-
tion related to an Air France flight should be complete
if the source is the Air France carrier itself. However,
the completeness could be lower if the source is a private travel agency with partial coverage of the current flights of the different companies, including those of Air France. Similarly, the degree of trust to be associated with information fed by a clerk should be lower than that associated with information fed by a supervisor. Still, we would like to stress that the way in which such degrees are obtained is outside the scope of this paper.
As pointed out in Section 2, possibility theory is
well suited to model degrees of certainty or, dually,
degrees of possibility. Besides, possibility theory, unlike other theories of uncertainty such as probability theory, is well suited to model total ignorance, which is necessary to represent situations in which we have, for example, both K ⊭ φ and K ⊭ ¬φ. This is the reason why, here, we adopt this theory to represent the gradual nature of both the reliability of an information source and the completeness of information regarding a particular slice.
We assume that K is a consistent, classical (as op-
posed to possibilistic) knowledge base, i.e., K con-
tains statements (such as axioms and assertions) ex-
pressed in one of the decidable logical languages usu-
ally employed to represent knowledge in practical ap-
plications (examples might be Datalog, description
logics, RDF + RDFS, or one of the profiles of OWL).
We assume that, in addition to K, metadata about
validity and completeness of information stored in K
are given in the form of two functions, Val and Comp,
which associate a degree of validity and complete-
ness, respectively, to a number of slices defined on
K. Let S_K ⊆ 2^L be the set of such slices. In practice, these two functions might be implemented by a look-up table, listing their values for each defined slice.
Definition 3. Let Val : S_K → [0, 1] be such that, for each slice T ∈ S_K, Val(T) is the degree to which K contains valid information about slice T, which means, for all formulas φ such that K ⊨ φ and φ ∈ T, N(φ) ≥ Val(T).
Intuitively, if we can deduce φ from the knowledge base K, φ belongs in slice T, and we know that the source that fed K is (somehow) reliable for the domain of T, then the agent should believe φ at least as much as the degree to which the source is reliable. If the source is fully reliable, then φ will be certain for the agent.
Definition 4. Let Comp : S_K → [0, 1] be such that, for each slice T ∈ S_K, Comp(T) is the degree to which K contains complete information about slice T, which means, for all formulas φ such that K ⊭ φ and φ ∈ T, Π(φ) ≤ 1 − Comp(T).
Intuitively, if we cannot deduce φ from the knowl-
edge base K, and φ belongs in slice T , and we know
that information in K about slice T is (somehow)
complete, then φ should be certainly false.
It is reasonable to assume that, given two slices T and T′,

T ⊆ T′ ⇒ Val(T) ≥ Val(T′),   (3)
T ⊆ T′ ⇒ Comp(T) ≥ Comp(T′).   (4)

Indeed, if we are told that K contains reliable (resp. complete) information about a broader domain T′ to a given degree α, then K cannot be less reliable (complete) about a narrower (i.e., more specific) domain T; if anything, it might be more reliable (complete) about T if a more reliable (complete) source is available just for T.
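A knowledge engineer could check declared metadata against constraints (3) and (4) mechanically; the sketch below (illustrative only, with invented slice names, formulas, and degrees) does exactly that for extensionally given slices.

```python
# Minimal sketch (not from the paper): checking that slice metadata respects
# the monotonicity constraints (3) and (4). Slices are given extensionally;
# slice names, formulas and degrees are invented for the example.
slices = {
    "aviation": {"Flight(af123)", "Airline(af)", "Terminal(lhr, 4)"},
    "flights":  {"Flight(af123)"},
}
Val  = {"aviation": 0.6, "flights": 0.9}
Comp = {"aviation": 0.3, "flights": 0.8}

def monotone(slices, degree):
    """degree(T) >= degree(T') whenever slice T is included in slice T'."""
    return all(
        degree[t] >= degree[t_prime]
        for t, members in slices.items()
        for t_prime, members_prime in slices.items()
        if t != t_prime and members <= members_prime
    )

print(monotone(slices, Val))   # True: Val(flights) >= Val(aviation)
print(monotone(slices, Comp))  # True: Comp(flights) >= Comp(aviation)
```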
The extent to which the agent believes φ depends on (i) what is supposed to be known about φ (can we deduce φ from K?) and (ii) the validity and completeness of K with respect to the slices that contain φ. That being the case, K, together with the metadata provided by Val and Comp, should allow us to compute the degree of possibility and necessity of any arbitrary formula φ, as follows:

Π(φ) = 1                              if K ⊨ φ,
       min_{T : φ∈T} {1 − Comp(T)}    otherwise;   (5)

N(φ) = max_{T : φ∈T} Val(T)           if K ⊨ φ,
       0                              otherwise.   (6)

Let us call π the hypothetical possibility distribution that induces the possibility and necessity measures of Equations 5 and 6, and let B be a hypothetical possibilistic belief base corresponding to it. Furthermore, among all possibility distributions compatible with Π and N, we will select the one that makes the least commitment, i.e., the maximal (most general) one.
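Equations 5 and 6 translate directly into code once a classical entailment oracle is available. The sketch below is our illustration, not the authors' implementation; `entails` is assumed to wrap a call to an external reasoner, and the convention used when a formula belongs to no slice (Π = 1, N = 0) is our reading of the equations.

```python
# Minimal sketch (not from the paper) of Equations 5 and 6. `entails(K, phi)`
# is assumed to wrap a classical reasoner; slices are extensional sets of
# formulas; all concrete names and degrees below are invented.

def possibility(K, phi, slices, Comp, entails):
    """Equation 5."""
    if entails(K, phi):
        return 1.0
    # Convention (our reading): if no slice contains phi, nothing constrains it.
    return min((1.0 - Comp[T] for T, members in slices.items() if phi in members),
               default=1.0)

def necessity(K, phi, slices, Val, entails):
    """Equation 6."""
    if entails(K, phi):
        return max((Val[T] for T, members in slices.items() if phi in members),
                   default=0.0)
    return 0.0

# Toy usage with a deliberately naive "reasoner" that only checks explicit facts.
slices = {"flights": {"Flight(af123)", "Flight(af999)"}}
Val, Comp = {"flights": 0.9}, {"flights": 0.75}
K = {"Flight(af123)"}
entails = lambda kb, phi: phi in kb
print(necessity(K, "Flight(af123)", slices, Val, entails))     # 0.9
print(possibility(K, "Flight(af999)", slices, Comp, entails))  # 0.25, i.e. 1 - 0.75
```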
3.3 Existence Conditions
We now derive a necessary condition for the existence
of such a possibility distribution π.
Let φ be such that K ⊨ φ; if K is consistent, K ⊭ ¬φ. By the duality property of the possibility and necessity measures, it must be Π(φ) = 1 − N(¬φ); this is satisfied by Equations 5 and 6, since Π(φ) = 1 and N(¬φ) = 0.
Proposition 1. Let us assume that K is consistent and that there exists a possibility distribution π that induces the possibility and necessity measures of Equations 5 and 6. Then, for all φ such that K ⊨ φ,

max_{T : φ∈T} Val(T) = max_{T′ : ¬φ∈T′} Comp(T′).   (7)

Proof. If the measures N and Π are induced by the same possibility distribution π, it must be N(φ) = 1 − Π(¬φ); therefore, by Equations 5 and 6, we can write

max_{T : φ∈T} Val(T) = N(φ) = 1 − Π(¬φ)
                     = 1 − min_{T′ : ¬φ∈T′} {1 − Comp(T′)}
                     = max_{T′ : ¬φ∈T′} Comp(T′),

which proves the thesis.
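In practice, Equation 7 can be checked over a finite set of formulas of interest before trusting the metadata; the helper below is a hypothetical sketch, with `entails` and `negate` assumed to be supplied by the application.

```python
# Minimal sketch (not from the paper): testing the existence condition of
# Proposition 1 (Equation 7) over a finite set of formulas of interest.
# `entails` and `negate` are assumed to be supplied by the application.

def satisfies_equation_7(K, formulas, slices, Val, Comp, entails, negate):
    for phi in (f for f in formulas if entails(K, f)):
        lhs = max((Val[T] for T, members in slices.items() if phi in members),
                  default=0.0)
        rhs = max((Comp[T] for T, members in slices.items()
                   if negate(phi) in members), default=0.0)
        if lhs != rhs:
            return False  # no single distribution can induce both measures
    return True
```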
A formula φ such that K ⊭ φ and K ⊭ ¬φ poses no problem, because Comp and Val do not interact:

Π(φ) = min_{T : φ∈T} {1 − Comp(T)},
Π(¬φ) = min_{T′ : ¬φ∈T′} {1 − Comp(T′)},
N(φ) = N(¬φ) = 0.
We now prove that a formula and its negation cannot belong in the same slice, unless the two functions Val and Comp are identical.
Proposition 2. Either Val(T) = Comp(T) for every slice T, or, for every formula φ and every slice T, φ ∈ T ⇒ ¬φ ∉ T.
Proof. By contradiction: we show that if φ ∈ T and ¬φ ∈ T, a contradiction can be derived. Let φ be a formula belonging in just one slice T. If the slices for which Val or Comp are defined are all distinct, there will always be at least one such formula. If not, the equivalent slices can be merged together so that all slices are distinct. By the assumption, ¬φ ∈ T
too. Now, we apply Equations 5 and 6 to compute the possibility and necessity of both φ and ¬φ: without loss of generality, let us assume K ⊨ φ and, therefore, K ⊭ ¬φ; then

Π(φ) = 1,   Π(¬φ) = 1 − Comp(T),
N(φ) = Val(T),   N(¬φ) = 0.

By the duality property,

Val(T) = N(φ) = 1 − Π(¬φ) = Comp(T).
We can observe that the case in which Val = Comp defeats the very purpose of having the two complementary notions of validity and completeness; if that were the case, it would suffice to call the single function into which those two notions collapse "trust", because it would reflect a general notion of reliability of information about a given slice. Therefore, since we are interested in investigating the use of validity and completeness as two distinct notions, in what follows we will make the assumption that, in general, Val ≠ Comp. As a consequence, we now know that any acceptable definition of what a slice is will have to satisfy the postulate that a formula and its negation cannot belong in the same slice: formally, for every slice T and formula φ, φ ∈ T ⇒ ¬φ ∉ T. For instance, a slice might be defined as the set of (ground) formulas that are satisfied by a formula with free variables (i.e., a query).
With this notion of slice (i.e., such that if a for-
mula belongs to a slice, then the negation of that for-
mula does not belong in the slice), we are able to
prove possibilistic generalizations of results obtained
by (Demolombe, 1999) in a KD doxastic logic.
First of all, the fact that a formula and its negation
cannot belong to the same slice motivates the defini-
tion of the dual of a slice.
Definition 5. Let T be a slice. The dual of T, denoted ¬T, is the slice such that φ ∈ T iff ¬φ ∈ ¬T.
The dual of a slice is thus a sort of complement, but not in the set-theoretic sense, because there may exist a formula ψ such that ψ ∉ T and ψ ∉ ¬T; therefore, in general, ¬T differs from the set-theoretic complement of T. A straightforward consequence of Definition 5 is that ¬(¬T) = T.
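One concrete way to realize slices and their duals, in line with the query-based definition suggested above, is to encode a slice as a membership predicate over ground formulas; the sketch below is purely illustrative (the formula syntax and names are ours).

```python
# Minimal sketch (not from the paper): a slice defined intensionally as a
# query (a membership predicate over ground formulas) and its dual, which
# contains exactly the negations of the slice's formulas. Syntax is invented.

def flights_from_london(formula: str) -> bool:
    """Slice "flights departing from London": a query acting as membership test."""
    return formula.startswith("DepartsFrom(") and formula.endswith(", LHR)")

def dual(slice_pred):
    """phi belongs to T iff its negation belongs to the dual of T."""
    return lambda formula: formula.startswith("¬") and slice_pred(formula[1:])

T = flights_from_london
not_T = dual(T)
print(T("DepartsFrom(af123, LHR)"))         # True
print(not_T("¬DepartsFrom(af123, LHR)"))    # True
print(T("¬DepartsFrom(af123, LHR)"))        # False: a formula and its negation
                                            # never share a slice
```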
The intuition behind the next proposition is that if
the knowledge base is consistent and if information is
complete concerning both T and ¬T , then, for each
formula φ in T or in ¬T , either the agent believes φ or
it believes ¬φ.
Proposition 3. If K is consistent, then, for every slice T,

min{Comp(T), Comp(¬T)} ≤ min_{φ∈T} max{N(φ), N(¬φ)}.

Proof. We prove this proposition by showing that, for every formula φ ∈ T,

min{Comp(T), Comp(¬T)} ≤ max{N(φ), N(¬φ)},

from which the thesis follows. Given a formula φ ∈ T, there are just three mutually exclusive cases.
Case I. K ⊨ φ and, therefore, by the consistency of K, K ⊭ ¬φ; we thus have, by Def. 4,
N(φ) = 1 − Π(¬φ) ≥ Comp(¬T);
hence,
max{N(φ), N(¬φ)} ≥ N(φ) ≥ Comp(¬T) ≥ min{Comp(T), Comp(¬T)}.
Case II. K ⊭ φ and K ⊭ ¬φ; we thus have, by Def. 4,
N(φ) = 1 − Π(¬φ) ≥ Comp(¬T);
N(¬φ) = 1 − Π(φ) ≥ Comp(T);
hence,
max{N(φ), N(¬φ)} ≥ max{Comp(¬T), Comp(T)} ≥ min{Comp(T), Comp(¬T)}.
Case III. K ⊨ ¬φ and, therefore, by the consistency of K, K ⊭ φ; we thus have, by Def. 4,
N(¬φ) = 1 − Π(φ) ≥ Comp(T);
hence,
max{N(φ), N(¬φ)} ≥ N(¬φ) ≥ Comp(T) ≥ min{Comp(T), Comp(¬T)}.
Since the three cases above are exhaustive for every formula φ ∈ T, this concludes the proof.
Proposition 4. If π is the least-commitment possibility distribution such that Equations 5 and 6 hold, then, for every slice T,

Val(T) = min_{φ∈T : K⊨φ} N(φ).   (8)

Proof. By Def. 3, Val(T) ≤ N(φ) for all φ ∈ T such that K ⊨ φ; therefore, we can write

Val(T) ≤ min_{φ∈T : K⊨φ} N(φ).   (9)

On the other hand, by Equation 6, we can write

min_{φ∈T : K⊨φ} N(φ) = min_{φ∈T : K⊨φ} max_{T′ : φ∈T′} Val(T′).

Now, we observe that the set of slices {T′ : φ ∈ T′} always includes T as one of its elements, because φ ∈ T; therefore,

max_{T′ : φ∈T′} Val(T′) ≥ Val(T);

furthermore,
- either T is a top-level slice and, by Postulate P3, there exists a formula φ* ∈ T that does not belong to any other slice, in which case

max_{T′ : φ*∈T′} Val(T′) = max{Val(T)} = Val(T),

whence we can conclude that

min_{φ∈T : K⊨φ} max_{T′ : φ∈T′} Val(T′) = Val(T),

which is the thesis;
- or there exists another slice T̂ such that T ⊆ T̂, for which, by Equation 3, Val(T̂) ≤ Val(T), which leads us to conclude that

min_{φ∈T : K⊨φ} max_{T′ : φ∈T′} Val(T′) ≤ Val(T),

which, together with Equation 9, yields the thesis.
It has been proven (Demolombe, 1999) that, for a consistent base K, if K is complete about ¬T (the complement of slice T), then K is valid about slice T; the following is an extension of that result to the gradual case.
Proposition 5. If K is consistent, then, for every slice T, Comp(¬T) ≤ Val(T).
Proof. By Def. 4, Comp(¬T) ≤ 1 − Π(¬ψ) = N(ψ) for all ¬ψ ∈ ¬T (and, therefore, ψ ∈ T) such that K ⊭ ¬ψ; let

β = min_{ψ∈T : K⊭¬ψ} N(ψ);

then we can write Comp(¬T) ≤ β. Clearly, β ≤ min_{φ∈T : K⊨φ} N(φ), because {φ ∈ T : K ⊨ φ} ⊆ {ψ ∈ T : K ⊭ ¬ψ}, since K ⊨ φ ⇒ K ⊭ ¬φ. Therefore, by Proposition 4, we have

Comp(¬T) ≤ β ≤ min_{φ∈T : K⊨φ} N(φ) = Val(T),

which proves the thesis.
3.4 Complexity of Reasoning
The results that have been derived above show that
one can “simulate”, as it were, a possibilistic belief
base by means of a crisp base K together with meta-
data about the validity and completeness of K with
respect to a number of “slices” (i.e., sets of formulas).
It is not important to know a least-commitment
possibility distribution that induces the possibility and
necessity measures of Equations 5 and 6 or to rep-
resent one of its corresponding possibilistic bases B
explicitly, since K, together with its associated meta-
data Val and Comp, is sufficient to compute any pos-
sibilistic inference using any available classical rea-
soner, as demonstrated by the algorithm shown in Fig-
ure 1, adapted from (da Costa Pereira et al., 2017).
Require: K ⊆ L: a consistent KB; φ ∈ L: a formula.
Ensure: N(φ).
1: α ← 0
2: if K ⊨ φ then
3:   for all slices T ∈ S_K do
4:     if φ ∈ T and α < Val(T) then
5:       α ← Val(T)
6:     end if
7:   end for
8: else if K ⊭ ¬φ then
9:   for all slices T ∈ S_K do
10:    if ¬φ ∈ T and α < Comp(T) then
11:      α ← Comp(T)
12:    end if
13:   end for
14: end if
15: return α
Figure 1: An algorithm that "simulates" a possibilistic inference from B using K, Val, and Comp.
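A direct transcription of Algorithm 1 in Python might look as follows; this is our sketch, with `entails` standing for a call to a classical reasoner and `negate` for syntactic negation, and with slices given extensionally so that membership is a plain set lookup. With intensionally defined slices (queries), each membership test may itself require a classical inference, which is what Property 2 below accounts for.

```python
# Minimal sketch (not from the paper): a direct Python transcription of the
# algorithm in Figure 1. `entails(K, phi)` stands for a call to a classical
# reasoner and `negate(phi)` for syntactic negation; slices are extensional
# sets here, so the membership tests are plain lookups.

def simulated_necessity(K, phi, slices, Val, Comp, entails, negate):
    """Return N(phi) computed from K and the metadata Val and Comp."""
    alpha = 0.0
    if entails(K, phi):
        # Lines 3-7: best validity bound among the slices containing phi.
        for T, members in slices.items():
            if phi in members and alpha < Val[T]:
                alpha = Val[T]
    elif not entails(K, negate(phi)):
        # Lines 9-13: best completeness bound among the slices containing
        # the negation of phi (duality with Equation 5).
        for T, members in slices.items():
            if negate(phi) in members and alpha < Comp[T]:
                alpha = Comp[T]
    return alpha
```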
Property 1. Algorithm 1 is correct (i.e., it computes
N(φ)).
Proof. If K ⊨ φ, Equation 6 is applied; otherwise, Equation 5 is used together with duality: N(φ) = 1 − Π(¬φ).
Property 2. The cost of Algorithm 1 is at most ‖S_K‖ + 2 classical inferences, where ‖S_K‖ is the number of slices defined for K.
Proof. Algorithm 1 needs, first of all, to execute at most two classical inferences: the one in Line 2 and, in case K ⊭ φ, the one in Line 8. Then, checking whether a formula belongs in a slice costs at most one classical inference, and this has to be done for all the slices defined for K.
Notice that, according to this result, while the cost of a possibilistic inference is higher than the cost of a classical inference, it is so only by a factor which depends on the number of slices defined on the KB. It is to be expected that this number will, in general, be much smaller than the number of facts contained in the KB. In other words, the overall complexity of possibilistic inference is in the same class as that of classical inference.
4 DISCUSSION AND
CONCLUSION
We have shown that a classical knowledge base plus
metadata information on the (gradual) validity and
completeness of its “slices” enables one to represent
a possibilistic belief base and perform possibilistic in-
ferences by using a classical reasoner at a cost which,
albeit larger than the classical counterpart by a mul-
tiplicative factor proportional to the number of slices,
lies in the same complexity class.
All of our results are valid for the general case of
a decidable fragment of first-order logic and thus they
can be readily transferred to state-of-the-art and popu-
lar knowledge representation languages, like Datalog
and RDF + OWL and their reasoners. This also means
that our suggestion to use gradual metadata about va-
lidity and completeness may be applied to represent-
ing and reasoning with possibilistic uncertainty on top
of the standard infrastructure of the semantic Web,
without requiring any ad hoc extension and at a rea-
sonable cost. In that setting, one way of implement-
ing the notion of a slice might be through RDF named
graphs.
Future work includes demonstrating how our pro-
posal can be deployed on the semantic Web in-
frastructure to represent and reason about uncertain
knowledge with a proof-of-concept implementation.
ACKNOWLEDGEMENTS
This work has been partially supported by the French government, through the 3IA Côte d'Azur "Investments in the Future" project managed by the National Research Agency (ANR) with the reference number ANR-19-P3IA-0002.
REFERENCES
Collins, A., Warnock, E. H., Aiello, N., and Miller, M. L. (1975). Reasoning from incomplete knowledge. In Bobrow, D. G. and Collins, A., editors, Representation and Understanding, pages 383–415. Morgan Kaufmann, San Diego.
da Costa Pereira, C., Dubois, D., Prade, H., and Tettamanzi,
A. G. B. (2017). Handling topical metadata regarding
the validity and completeness of multiple-source in-
formation: A possibilistic approach. In SUM, volume
10564 of Lecture Notes in Computer Science, pages
363–376, Berlin. Springer.
Darari, F., Nutt, W., Pirrò, G., and Razniewski, S. (2013). Completeness statements about RDF data sources and their use for query answering. In The Semantic Web - ISWC 2013 - 12th International Semantic Web Conference, Sydney, NSW, Australia, October 21-25, 2013, Proceedings, Part I, pages 66–83, Berlin. Springer.
Demolombe, R. (1996). Answering queries about validity
and completeness of data: From modal logic to rela-
tional algebra. In FQAS, volume 62 of Datalogiske
Skrifter (Writings on Computer Science), pages 265–
276, Roskilde. Roskilde University.
Demolombe, R. (1999). Database validity and complete-
ness: Another approach and its formalisation in modal
logic. In KRDB, volume 21 of CEUR Workshop Pro-
ceedings, pages 11–13, Aachen. CEUR-WS.org.
Dong, X. L., Gabrilovich, E., Heitz, G., Horn, W., Murphy,
K., Sun, S., and Zhang, W. (2014). From data fusion
to knowledge fusion. Proc. VLDB Endow., 7(10):881–
892.
Dubois, D., Lang, J., and Prade, H. (1994). Possibilistic
logic. In Handbook of logic in artificial intelligence
and logic programming (vol. 3): nonmonotonic rea-
soning and uncertain reasoning, pages 439–513. Ox-
ford University Press, New York, NY, USA.
Dubois, D. and Prade, H. (1988). Possibility Theory—An
Approach to Computerized Processing of Uncertainty.
Plenum Press, New York.
Dubois, D. and Prade, H. (1997). Valid or complete infor-
mation in databases - A possibility theory-based anal-
ysis. In DEXA, volume 1308 of Lecture Notes in Com-
puter Science, pages 603–612, Berlin. Springer.
Gray, J., Chaudhuri, S., Bosworth, A., Layman, A., Re-
ichart, D., Venkatrao, M., Pellow, F., and Pirahesh, H.
(1997). Data cube: A relational aggregation operator
generalizing group-by, cross-tab, and sub totals. Data
Min. Knowl. Discov., 1(1):29–53.
Lang, J. (2001). Possibilistic logic: complexity and al-
gorithms. In Kohlas, J. and Moral, S., editors, Al-
gorithms for Uncertainty and Defeasible Reasoning,
Vol. 5 of Handbook of Defeasible Reasoning and
Uncertainty Management Systems (Gabbay, D. M.
and Smets, Ph., eds.), pages 179–220. Kluwer Acad.
Publ., Dordrecht.
Levesque, H. J. (1980). Incompleteness in knowledge
bases. SIGART Bull., 74:150–152.
Levesque, H. J. (1982). The logic of incomplete knowledge
bases. In On Conceptual Modelling (Intervale), pages
165–189, New York. Springer.
Motro, A. (1989). Integrity = validity + completeness. ACM
Trans. Database Syst., 14(4):480–502.
Razniewski, S., Suchanek, F. M., and Nutt, W. (2016). But
what do we actually know? In AKBC@NAACL-HLT,
pages 40–44, Stroudsburg, PA. The Association for
Computer Linguistics.
Wick, M. L., Singh, S., Kobren, A., and McCallum, A.
(2013). Assessing confidence of knowledge base con-
tent with an experimental study in entity resolution.
In Proceedings of the 2013 workshop on Automated
knowledge base construction, AKBC@CIKM 13, San
Francisco, California, USA, October 27-28, 2013,
pages 13–18, New York. ACM.
Zadeh, L. A. (1965). Fuzzy sets. Information and Control,
8:338–353.