Parsimonious Representation of Knowledge Uncertainty using Metadata about Validity and Completeness

Célia da Costa Pereira¹, Didier Dubois², Henri Prade² and Andrea G. B. Tettamanzi³
¹Université Côte d'Azur, CNRS, I3S, Sophia Antipolis, France
²IRIT – CNRS, 118, route de Narbonne, Toulouse, France
³Université Côte d'Azur, Inria, CNRS, I3S, Sophia Antipolis, France
Keywords:
Knowledge Representation, Possibility Theory.
Abstract:
We investigate how metadata about the uncertainty of knowledge contained in a knowledge base can be ex-
pressed parsimoniously and used for reasoning. We propose an approach based on possibility theory, whereby
a classical knowledge base plus metadata about the degree of validity and completeness of some of its portions
are used to represent a possibilistic belief base. We show how reasoning on such a belief base can be done using
a classical reasoner.
1 INTRODUCTION AND
RELATED WORK
In general, the process of getting a piece of information from a Knowledge Base (KB) is driven by practical purposes: such a piece of information will be used, for example, to justify certain decisions. Its quality therefore plays an important role in the success of the decisions made. The quality of a piece of information can be measured along different dimensions. Most contributions in the literature concentrate exclusively on the amount of true (known) facts in a KB when assessing its quality. (Wick et al., 2013), for example, propose several algorithms for estimating a confidence value based on the probability that a fact in a KB is true. In (Dong et al., 2014), the authors studied the applicability of data fusion techniques to the problem of populating a knowledge base. The criterion they used to construct quality knowledge bases was to identify the true values of data items among multiple observed values provided by different (and possibly unknown) sources with different reliabilities. However, as has been pointed out, for example, by (Razniewski et al., 2016),
While quite some facts are known about the
world, little is known about how much is un-
known.
In other words, a knowledge base is in general incomplete. This obviously has an impact on the overall quality of a KB: the more incomplete it is, the lower its quality, and the more the pieces of information extracted from it have to be used with caution.
The problem of representing both validity and completeness was addressed for information stored in databases many years before being addressed for knowledge bases (KBs). For example, the model of database integrity proposed by (Motro, 1989) and the work by (Demolombe, 1996), who used modal logic for reasoning about the validity and completeness of information stored in relational databases, can be regarded as precursors of ideas that were later adopted for KBs. However, the representation of incompleteness in information stored in databases was itself inspired by earlier work on the representation of incompleteness in knowledge bases, such as that of (Levesque, 1980; Levesque, 1982), and by the approach proposed by (Collins et al., 1975) for reasoning with this kind of knowledge base.
Recent work on annotating KBs with metadata
about their completeness has been done, in the con-
text of the semantic Web, by (Darari et al., 2013;
Razniewski et al., 2016), who studied the way in
which statements about completeness can be used
when answering queries. According to their ap-
proach, it is then possible, given a statement about
a topic, to specify if information about it in the base is
complete or not. However, the gradual view of com-
pleteness in data sources has not been considered.
Solutions to construct a possibilistic belief base
from a crisp KB using topical validity and complete-
ness metadata, like (da Costa Pereira et al., 2017), suf-
fer from some limitations, mainly due to the fact that,
in order to guarantee consistency, they have to sacri-
fice much of the expressive power of the knowledge
representation language. In particular, the "facts" that can be recorded in the knowledge base are restricted to ground formulas without negation or disjunction. Indeed, negative information (i.e., facts that
do not hold) is critical for the correctness of queries
involving negation (Razniewski et al., 2016). Com-
pleteness and negative information are closely re-
lated: if we know that a portion of a KB is complete, it
is as if we knew an infinity of negated facts (all those
relevant to that portion that are not in the KB).
Motivated by the above considerations, we want to answer the following research question: can a classical knowledge base, plus metadata on its (gradual) validity and completeness with respect to a few configurations (groups, portions, subjects, or topics of the statements it contains), be used to represent a possibilistic belief base and to perform possibilistic inferences with a classical reasoner? This research question leads us to a further sub-question: what would be a suitable definition of such a configuration, which in turn will allow us to define appropriate validity and completeness functions?
We propose a framework based on possibility the-
ory to represent and reason about gradual notions of
validity and completeness in KBs. Since it would be
impractical to associate values of possibility and ne-
cessity to each single assertion in a knowledge base
(KB), we show that, thanks to validity and complete-
ness metadata, it is possible to express degrees of pos-
sibility/necessity for formulas entailed by the KB in a
parsimonious way (i.e., without having to associate a
weight to each single formula) and to perform possi-
bilistic inferences on top of a classical KB, consider-
ing, in addition to possibilistic uncertainty, also nega-
tive information.
In particular, we present a way to represent validity and completeness information (with respect to particular slices of information) in a knowledge base that allows this information to simulate a possibilistic knowledge base, where the possibilistic information is derived exclusively from the validity and completeness metadata. The advantage of this approach is that validity and completeness are assessed only at the slice level, whereas in a possibilistic knowledge base the possibility distribution has to be defined over all facts. Therefore, when the number of slices is much smaller than the number of facts, the proposed representation is much more parsimonious.
The paper is organized as follows: Section 2 gives some background about the formal tools we use.
We present our proposal in Section 3, which explains
how (gradual) validity and completeness are related
to the beliefs of an agent. Finally, Section 4 discusses
some possible applications of our proposal.
2 BACKGROUND
We first provide a brief refresher on possibility the-
ory, before recalling the basics of possibilistic logic, a
logic where classical formulas are weighted in terms
of certainty.
2.1 Possibility Theory
Fuzzy sets (Zadeh, 1965) are sets whose elements
have degrees of membership in [0, 1]. Possibility the-
ory (Dubois and Prade, 1988) is a mathematical the-
ory of uncertainty that relies upon fuzzy set theory,
in that the (fuzzy) set of possible values for a vari-
able of interest is used to describe the uncertainty as
to its precise value. At the semantic level, the membership function of such a set, π, is called a possibility distribution and its range is [0, 1]. A possibility distribution can represent the available knowledge of an agent: when uncertain pieces of knowledge are represented, π(I) is the degree of compatibility of the interpretation I with the available knowledge about the real world. By convention, π(I) = 1 means that it is totally possible for I to be the real world, 1 > π(I) > 0 means that I is only somewhat possible, while π(I) = 0 means that I is certainly not the real world.
A possibility distribution π is said to be normalized if there exists at least one interpretation I_0 s.t. π(I_0) = 1, i.e., there exists at least one possible situation which is consistent with the available knowledge.
Definition 1. (Possibility and Necessity Measures) A possibility distribution π induces a possibility measure and its dual necessity measure, denoted by Π and N respectively. Both measures apply to a classical set S and are defined as follows:

Π(S) = max_{I ∈ S} π(I);   (1)

N(S) = 1 − Π(S̄) = min_{I ∈ S̄} {1 − π(I)}.   (2)
In words, Π(S) expresses to what extent S is consistent with the available knowledge. Conversely, N(S) expresses to what extent S is entailed by the available knowledge. It is equivalent to the impossibility of its complement S̄: the more S̄ is impossible, the more S is certain. A few properties of Π and N induced by a normalized possibility distribution on a finite universe of discourse Ω are the following. For all subsets A, B ⊆ Ω:
1. Π(A ∪ B) = max{Π(A), Π(B)};
2. Π(A ∩ B) ≤ min{Π(A), Π(B)};
3. Π(∅) = N(∅) = 0; Π(Ω) = N(Ω) = 1;
4. N(A ∩ B) = min{N(A), N(B)};
5. N(A ∪ B) ≥ max{N(A), N(B)};
6. Π(A) = 1 − N(Ā) (duality);
7. N(A) > 0 ⇒ Π(A) = 1; Π(A) < 1 ⇒ N(A) = 0.
A consequence of these properties is that max{Π(A), Π(Ā)} = 1. In case of complete ignorance on A, Π(A) = Π(Ā) = 1.
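To make the interplay between π, Π, and N concrete, the following minimal Python sketch (not part of the original paper) computes the two measures over a finite, explicitly enumerated set of interpretations; the interpretation names and the distribution values are purely illustrative.

```python
# Minimal sketch (not from the paper): possibility and necessity measures
# induced by a possibility distribution over a finite set of interpretations.

def possibility(pi, event):
    """Pi(S) = max_{I in S} pi(I); the possibility of the empty event is 0."""
    return max((pi[i] for i in event), default=0.0)

def necessity(pi, event, universe):
    """N(S) = 1 - Pi(complement of S)."""
    return 1.0 - possibility(pi, universe - event)

# Toy, normalized distribution over three interpretations (values invented).
universe = {"I1", "I2", "I3"}
pi = {"I1": 1.0, "I2": 0.7, "I3": 0.2}

A = {"I1", "I2"}
print(possibility(pi, A))          # 1.0
print(necessity(pi, A, universe))  # 0.8, i.e. 1 - pi(I3)
# Duality: for a normalized distribution, max{Pi(A), Pi(complement of A)} = 1.
assert max(possibility(pi, A), possibility(pi, universe - A)) == 1.0
```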
2.2 Possibilistic Logic
Before going into details about possibilistic logic, we
would like to put forward our motivation for such a
logic for handling uncertainty in this work. Informa-
tion is often pervaded with uncertainty, and it may
be convenient to associate pieces of information with
certainty levels. These certainty levels can often be
qualitatively assessed only using a finite completely
ordered scale ranging from “fully certain” to “not cer-
tain at all”, with intermediary levels such as “almost
certain”, or “somewhat certain”. Possibility theory
offers such a qualitative setting, when a finite sub-
set of [0, 1] including 0 and 1 is used and then only
the ordering of the degrees in [0, 1] is meaningful,
in agreement with the use of max and min operators. Moreover, the order-reversing mapping 1 − (·) exchanges the necessity scale with a possibility scale, such as "fully possible", "quite possible", "somewhat possible", "not possible at all (= impossible)". In the following, the pieces of information are associated with certainty levels which are viewed as lower bounds of necessity measures. Then, the min-decomposability of necessity measures with respect to conjunction reflects the fact that, to be certain at least at some level α that a conjunction of facts is true, we must be certain at least at level α of the truth of each fact.
Possibilistic logic (Dubois et al., 1994) has been
originally motivated by the need to manipulate syn-
tactic expressions of the form (φ, α) where φ is a clas-
sical logic formula, and α is a certainty level, with the
intended semantics that N(φ) ≥ α, where N is a neces-
sity measure. It is then possible to consider that all the
propositions of the considered language can be totally
ordered on a given scale. In our case, propositions are
formulas. Besides, in possibilistic logic, a level of in-
consistency can be associated with a knowledge base
as recalled now.
A possibilistic knowledge base B is a set of possibilistic logic formulas {(φ_i, α_i) | i = 1, ..., m}. Clearly, B can be layered into a set of nested classical bases B_α = {φ_i | (φ_i, α_i) ∈ B and α_i ≥ α} such that B_α ⊆ B_β if α ≥ β. Proving syntactically B ⊢ (φ, α) amounts to proceeding by refutation and proving B ∪ {(¬φ, 1)} ⊢ (⊥, α) by repeated application of the resolution rule (¬φ ∨ ψ, α), (φ ∨ ν, β) ⊢ (ψ ∨ ν, min(α, β)). Moreover, B ⊢ (φ, α) if and only if B_α ⊢ φ and α > inc(B), where inc(B) is the inconsistency level of B, defined as inc(B) = max{α | B ⊢ (⊥, α)}. It can be shown that inc(B) = 0 iff B^* is consistent, with B^* = {φ_i | (φ_i, α_i) ∈ B}. Thus reasoning from a possibilistic base just amounts to reasoning classically with the subparts of the base made of formulas whose certainty levels are strictly above the inconsistency level.
A possibilistic knowledge base B = {(φ_i, α_i) | i = 1, ..., m} encodes the constraints N(φ_i) ≥ α_i. B is thus semantically associated with a possibility distribution (Dubois et al., 1994)

π_B(I) = min_{i=1,...,m} max(φ_i^I, 1 − α_i),

where φ_i^I = 1 if I is a model of φ_i, and φ_i^I = 0 otherwise. As it can be seen, π_B(I) is all the larger as the interpretation I makes false only formulas with low certainty levels. π_B is the largest possibility distribution, i.e., the least committed distribution, assigning the largest possibility levels in agreement with the constraints N(φ_i) ≥ α_i for i = 1, ..., m. The distribution π_B rank-orders the interpretations I of the language induced by the φ_i's according to their plausibility, on the basis of the strength of the pieces of information in B. If the set of formulas B^* is consistent, then the distribution π_B is normalized (i.e., ∃I, π_B(I) = 1). The semantic entailment is defined by B ⊨ (φ, α) iff ∀I, π_B(I) ≤ π_{(φ,α)}(I). Reasoning by refutation in propositional possibilistic logic is sound and complete, applying the syntactic resolution rule. Namely, it can be shown that B ⊨ (φ, α) iff B ⊢ (φ, α), and inc(B) = 1 − max_I π_B(I).
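As an illustration of this semantics, here is a small Python sketch (ours, not the authors') that builds π_B for a toy propositional base, recovers the inconsistency level inc(B) = 1 − max_I π_B(I), and evaluates the necessity of a query; formulas are encoded as Python predicates over an interpretation, and the atoms and weights are invented for the example.

```python
# Minimal sketch (not from the paper): the possibility distribution pi_B of a
# toy propositional possibilistic base. Formulas are Python predicates over an
# interpretation (a dict atom -> bool); atoms and weights are invented.
from itertools import product

atoms = ["p", "q"]
# B = {(p, 0.8), (p -> q, 0.5)} encodes N(p) >= 0.8 and N(p -> q) >= 0.5.
B = [
    (lambda I: I["p"], 0.8),
    (lambda I: (not I["p"]) or I["q"], 0.5),
]

def interpretations():
    for values in product([False, True], repeat=len(atoms)):
        yield dict(zip(atoms, values))

def pi_B(I):
    # pi_B(I) = min_i max(phi_i^I, 1 - alpha_i)
    return min(max(1.0 if phi(I) else 0.0, 1.0 - alpha) for phi, alpha in B)

def necessity(query):
    # N(query) = 1 - max of pi_B over the countermodels of the query.
    return 1.0 - max((pi_B(I) for I in interpretations() if not query(I)), default=0.0)

inc_B = 1.0 - max(pi_B(I) for I in interpretations())
print(inc_B)                        # 0.0: B* is consistent
print(necessity(lambda I: I["q"]))  # 0.5: (q, 0.5) follows from B
```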
Algorithms for reasoning in possibilistic logic and
an analysis of their complexity, which is similar to the
one of classical logic, multiplied by the logarithm of
the number of levels used in the necessity scale, can
be found in (Lang, 2001).
3 REPRESENTING AND
REASONING WITH VALIDITY
AND COMPLETENESS
As was hypothesized by (Motro, 1989) for the case of relational databases, here, to formalize the concepts of validity and completeness in the case of knowledge bases, we shall assume the existence of a hypothetical knowledge base that captures a designated environment of the real world perfectly. The knowledge base K mentioned in this paper is then an approximation of such a hypothetical knowledge base.
When dealing with relational databases, only the
statements explicitly present in the database are con-
sidered as true (valid). The others are considered as
false (closed world assumption). When dealing with
sets of formulas, the true statements are those explic-
itly represented in the dataset, plus those which can
be inferred thanks to a reasoner. However, due to the
open world assumption, we cannot suppose that the
other statements are false—the truth status of some
statements may be unknown in case of incomplete
knowledge.
In this section, we recall the notions of validity
and completeness, first introduced in (Demolombe,
1996), and made gradual in the setting of possibilis-
tic logic (Dubois and Prade, 1997) for dealing with
relational databases and adapt them to the more gen-
eral setting of knowledge bases, where (i) the open
world assumption holds, (ii) implicit knowledge can
be inferred by logical deduction, and (iii) negative in-
formation is also taken into account unlike what was
proposed in (da Costa Pereira et al., 2017).
It is often the case that the knowledge contained in
a knowledge base is not all certain to the same degree.
There will be statements whose truth is absolutely cer-
tain. This might be the case of ontological axioms
or integrity constraints. Other groups of statements,
obtained for example from the same source or cov-
ering the same subject, might have the same degree
of certainty, but statements from different portions of
the knowledge base might be believed with greater or
lower certainty.
Our working hypothesis is that, as suggested
by (da Costa Pereira et al., 2017), the degree of cer-
tainty of every piece of information depends on the
degree to which the knowledge base is valid and com-
plete with respect to all groups, portions, subjects,
topics (or whatever else we wish to call them) of state-
ments it contains. We think that an intuitive name
for this notion of a semantically determined homoge-
neous portion of a knowledge base may be a slice and
we will stick to this term from now on.
While it is true that the term slice might lead to confusion with the same term as used in the hypercube data model (Gray et al., 1997), the suggestion that a slice may be a subset defined by fixing one or more dimensions is, as a matter of fact, a good and useful intuition. Indeed, if we interpret slices as knowledge "topics" or "domains", then this is exactly what slices are, with the specificity that here every "dimension" can be viewed as a binary truth assignment to a formula, e.g., in a hypothetical knowledge base about travel, "x is a flight and x departs from London", thus giving the slice of the knowledge base that provides information about flights departing from London.
3.1 Postulates
To be able to talk about the validity and complete-
ness of information stored in a knowledge base with
respect to a particular slice, we need a formal way of
defining the latter. We begin with the most general
and neutral definition, whereby a slice T is just a set
of formulas. A more precise definition is deferred to
when we will have discussed the properties that a slice
must satisfy.
A few basic postulates for slices, based on com-
mon sense arguments, are the following.
P1. Slices are non-empty. We assume slices are de-
fined by the designer or a user of a knowledge
base in order to state metadata about the validity
and completeness of portions of knowledge in the
base; defining an empty slice would defeat its pur-
pose.
P2. Slices are all distinct. Defining two equivalent
slices would be redundant and of no practical use;
therefore, we can safely bar this possibility.
P3. For every slice not contained in another slice (we may call it a "top-level" slice), there exists a formula entailed by the knowledge base that belongs in that slice and in no other slice.
What justifies stating Postulate P3 is that for every
portion of a knowledge base a knowledge engineer
might want to define in order to state metadata on it,
one would expect that either that portion is a proper
subset of another portion (i.e., a sort of sub-topic or
sub-domain), or, if it is not, then its very definition is
motivated by the existence of some facts that are not
covered by other slices. For instance, in a knowledge
base about travel, I might want to define a slice about
“airports” because there are assertions involving air-
ports, like “London Heathrow (LHR) has four opera-
tional terminals”, that do not deal with any other pos-
sible slices, like “flights”, “airlines”, “aircraft”, and so
on. Or I might choose to define “aviation”, which in-
cludes all of them, including the assertion about LHR,
which is not covered by any other existing slice.
Let K be a set of formulas in a decidable logical language L, for which there exists a reasoner capable of performing inferences and of deducing formulas which are not explicitly contained in K.
Under the closed-world hypothesis typical of
databases, which is the setting in which (Dubois and
Prade, 1997) was stated, it would be reasonable to
admit that what cannot be deduced from an agent’s
knowledge base corresponds to what the agent be-
lieves to be false.
However, in the case of a knowledge base, the
open-world assumption holds and the agent is capa-
ble of performing logical inferences (e.g., thanks to a
reasoner). Therefore, we must think in terms of logi-
cal entailment of formulas.
Without loss of generality, we will assume a Her-
brand semantics for L.
Definition 2. The Herbrand base of L is the set H_L of all ground atoms in L. An interpretation (or model) is a function I : H_L → {0, 1}, which can also be viewed as a subset of the Herbrand base, I ⊆ H_L (the set of all atoms φ such that φ^I = 1). We denote by Ω = 2^{H_L} the set of all interpretations.
We write K ⊨ φ to denote the fact that formula φ is a logical consequence of all the formulas in K. Assuming the usual definition of satisfaction (given an interpretation I and a formula φ ∈ L, I ⊨ φ if and only if φ evaluates to true in I), we define the notion of entailment as follows: K ⊨ φ if and only if, for every interpretation I, I ⊨ K implies I ⊨ φ. Using a sound and complete reasoner, if K ⊨ φ, then φ can also be deduced from K by the agent (which we write K ⊢ φ), whereas if K ⊬ φ (φ cannot be deduced from K), this means that K ⊭ φ (φ is not a logical consequence of K). Finally, given a set S of formulas, K ⊨ S if and only if ∀φ ∈ S, K ⊨ φ.
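For readers who want to experiment, the following sketch (an assumption-laden illustration, not part of the paper) implements this brute-force notion of entailment over a tiny, finite Herbrand base; the ground atoms and the encoding of formulas as predicates are ours.

```python
# Minimal sketch (not from the paper): brute-force entailment over a small,
# finite Herbrand base. An interpretation is a subset of the Herbrand base;
# formulas are predicates over interpretations. All names are illustrative.
from itertools import chain, combinations

herbrand_base = ("Flight(af123)", "DepartsFrom(af123, CDG)")

def interpretations(base):
    """Every subset of the Herbrand base, viewed as an interpretation."""
    return chain.from_iterable(combinations(base, r) for r in range(len(base) + 1))

def entails(kb, query):
    """K |= query iff every interpretation satisfying all of K satisfies query."""
    return all(
        query(I)
        for I in map(frozenset, interpretations(herbrand_base))
        if all(phi(I) for phi in kb)
    )

# K = { Flight(af123), Flight(af123) -> DepartsFrom(af123, CDG) }
K = [
    lambda I: "Flight(af123)" in I,
    lambda I: ("Flight(af123)" not in I) or ("DepartsFrom(af123, CDG)" in I),
]
print(entails(K, lambda I: "DepartsFrom(af123, CDG)" in I))   # True
print(entails(K, lambda I: "Flight(ba456)" in I))             # False: not entailed
```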
3.2 Graded Validity and Completeness
The purpose of a knowledge base is to store axioms
and assertions that summarize an agent’s knowledge
about the world or, at least, a limited portion of the
world, which is relevant to the problem the agent is in-
tended to deal with. We will take an objectivist stance
by assuming that there exists, among all the possible
interpretations of language L, one that reflects the actual state of affairs. Let us denote such an interpretation by I*. Then we may say that the objective truth of any formula φ ∈ L is given by φ^{I*}. To be absolutely clear about that, we are assuming that I* is real and that an objective truth exists for every formula, independently of a knowledge base K and of the agent using it. As a matter of fact, the knowledge represented in (a slice of) K might (and, in general, will) not reflect reality perfectly or accurately.
Given this premise, the notions of validity and completeness of a knowledge base K with respect to a slice may be defined as follows:
- K is valid with respect to a slice iff, for every formula φ in that slice, K ⊨ φ implies that I* ⊨ φ, i.e., φ is objectively true;
- K is complete with respect to a slice iff, for every formula φ in that slice, K ⊭ φ implies that I* ⊭ φ, i.e., φ is objectively false.
As pointed out in Section 1, we will use the term
beliefs to refer to (possibly partial, incomplete, or in-
valid) information held by an agent. An agent may
then believe something to different degrees. We sup-
pose that these degrees depend on both the degree of
completeness of the sets of statements and on the re-
liability or trustworthiness of the information source.
For example (da Costa Pereira et al., 2017), informa-
tion related to an Air France flight should be complete
if the source is the Air France carrier itself. However,
the completeness could be lower if the source is a private travel agency with partial coverage of the current flights of the different companies, including those of Air France. Similarly, the degree of trust to be associated with information fed by a clerk should be lower than that associated with information fed by a supervisor. Still, we would like to stress that the way in which such degrees are obtained is outside the scope of this paper.
As pointed out in Section 2, possibility theory is
well suited to model degrees of certainty or, dually,
degrees of possibility. Besides, possibility theory, unlike other theories of uncertainty such as probability theory, is well suited to model total ignorance, which is necessary to represent situations in which we have, for example, both K ⊭ φ and K ⊭ ¬φ. This is the reason why, here, we adopt this theory to represent the gradual nature of both the reliability of an information source and the completeness of information regarding a particular slice.
We assume that K is a consistent, classical (as op-
posed to possibilistic) knowledge base, i.e., K con-
tains statements (such as axioms and assertions) ex-
pressed in one of the decidable logical languages usu-
ally employed to represent knowledge in practical ap-
plications (examples might be Datalog, description
logics, RDF + RDFS, or one of the profiles of OWL).
We assume that, in addition to K, metadata about
validity and completeness of information stored in K
are given in the form of two functions, Val and Comp,
which associate a degree of validity and complete-
ness, respectively, to a number of slices defined on
K. Let S_K ⊆ 2^L be the set of such slices. In practice, these two functions might be implemented by a look-up table, listing their values for each defined slice.
Definition 3. Let Val : S_K → [0, 1] be such that, for each slice T ∈ S_K, Val(T) is the degree to which K contains valid information about slice T, which means, for all formulas φ such that K ⊨ φ and φ ∈ T, N(φ) ≥ Val(T).
Intuitively, if we can deduce φ from the knowledge base K, φ belongs in slice T, and we know that the source that fed K is (somehow) reliable for the domain of T, then the agent should believe φ at least as much as the degree to which the source is reliable. If the source is fully reliable, then φ will be certain for the agent.
Definition 4. Let Comp : S_K → [0, 1] be such that, for each slice T ∈ S_K, Comp(T) is the degree to which K contains complete information about slice T, which means, for all formulas φ such that K ⊭ φ and φ ∈ T, Π(φ) ≤ 1 − Comp(T).
Intuitively, if we cannot deduce φ from the knowl-
edge base K, and φ belongs in slice T , and we know
that information in K about slice T is (somehow)
complete, then φ should be certainly false.
It is reasonable to assume that, given two slices T and T′,

T ⊆ T′ ⇒ Val(T) ≥ Val(T′),   (3)
T ⊆ T′ ⇒ Comp(T) ≥ Comp(T′).   (4)

Indeed, if we are told that K contains reliable (resp. complete) information about a broader domain T′ to a given degree α, then K cannot be less reliable (complete) about a narrower (i.e., more specific) domain T; if anything, it might be more reliable (complete) about T if a more reliable (complete) source is available just for T.
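A knowledge engineer could check declared metadata against constraints (3) and (4) mechanically; the sketch below (illustrative only, with invented slice names, formulas, and degrees) does exactly that for extensionally given slices.

```python
# Minimal sketch (not from the paper): checking that slice metadata respects
# the monotonicity constraints (3) and (4). Slices are given extensionally;
# slice names, formulas and degrees are invented for the example.
slices = {
    "aviation": {"Flight(af123)", "Airline(af)", "Terminal(lhr, 4)"},
    "flights":  {"Flight(af123)"},
}
Val  = {"aviation": 0.6, "flights": 0.9}
Comp = {"aviation": 0.3, "flights": 0.8}

def monotone(slices, degree):
    """degree(T) >= degree(T') whenever slice T is included in slice T'."""
    return all(
        degree[t] >= degree[t_prime]
        for t, members in slices.items()
        for t_prime, members_prime in slices.items()
        if t != t_prime and members <= members_prime
    )

print(monotone(slices, Val))   # True: Val(flights) >= Val(aviation)
print(monotone(slices, Comp))  # True: Comp(flights) >= Comp(aviation)
```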
The extent to which the agent believes φ depends on (i) what is supposed to be known about φ (can we deduce φ from K?) and (ii) the validity and completeness of K with respect to the slices that contain φ. That being the case, K, together with the metadata provided by Val and Comp, should allow us to compute the degree of possibility and necessity of any arbitrary formula φ, as follows:

Π(φ) = 1                              if K ⊨ φ,
       min_{T : φ∈T} {1 − Comp(T)}    otherwise;   (5)

N(φ) = max_{T : φ∈T} Val(T)           if K ⊨ φ,
       0                              otherwise.   (6)

Let us call π the hypothetical possibility distribution that induces the possibility and necessity measures of Equations 5 and 6, and let B be a hypothetical possibilistic belief base corresponding to it. Furthermore, among all possibility distributions compatible with Π and N, we will select the one that makes the least commitment, i.e., the maximal (most general) one.
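Equations 5 and 6 translate directly into code once a classical entailment oracle is available. The sketch below is our illustration, not the authors' implementation; `entails` is assumed to wrap a call to an external reasoner, and the convention used when a formula belongs to no slice (Π = 1, N = 0) is our reading of the equations.

```python
# Minimal sketch (not from the paper) of Equations 5 and 6. `entails(K, phi)`
# is assumed to wrap a classical reasoner; slices are extensional sets of
# formulas; all concrete names and degrees below are invented.

def possibility(K, phi, slices, Comp, entails):
    """Equation 5."""
    if entails(K, phi):
        return 1.0
    # Convention (our reading): if no slice contains phi, nothing constrains it.
    return min((1.0 - Comp[T] for T, members in slices.items() if phi in members),
               default=1.0)

def necessity(K, phi, slices, Val, entails):
    """Equation 6."""
    if entails(K, phi):
        return max((Val[T] for T, members in slices.items() if phi in members),
                   default=0.0)
    return 0.0

# Toy usage with a deliberately naive "reasoner" that only checks explicit facts.
slices = {"flights": {"Flight(af123)", "Flight(af999)"}}
Val, Comp = {"flights": 0.9}, {"flights": 0.75}
K = {"Flight(af123)"}
entails = lambda kb, phi: phi in kb
print(necessity(K, "Flight(af123)", slices, Val, entails))     # 0.9
print(possibility(K, "Flight(af999)", slices, Comp, entails))  # 0.25, i.e. 1 - 0.75
```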
3.3 Existence Conditions
We now derive a necessary condition for the existence
of such a possibility distribution π.
Let φ be such that K ⊨ φ; if K is consistent, K ⊭ ¬φ. By the duality property of the possibility and necessity measures, it must be Π(φ) = 1 − N(¬φ); this is satisfied by Equations 5 and 6, since Π(φ) = 1 and N(¬φ) = 0.
Proposition 1. Let us assume that K is consistent and that there exists a possibility distribution π that induces the possibility and necessity measures of Equations 5 and 6. Then, for all φ such that K ⊨ φ,

max_{T : φ∈T} Val(T) = max_{T′ : ¬φ∈T′} Comp(T′).   (7)

Proof. If the measures N and Π are induced by the same possibility distribution π, it must be N(φ) = 1 − Π(¬φ); therefore, by Equations 5 and 6, we can write

max_{T : φ∈T} Val(T) = N(φ) = 1 − Π(¬φ)
                     = 1 − min_{T′ : ¬φ∈T′} {1 − Comp(T′)}
                     = max_{T′ : ¬φ∈T′} Comp(T′),

which proves the thesis.
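In practice, Equation 7 can be checked over a finite set of formulas of interest before trusting the metadata; the helper below is a hypothetical sketch, with `entails` and `negate` assumed to be supplied by the application.

```python
# Minimal sketch (not from the paper): testing the existence condition of
# Proposition 1 (Equation 7) over a finite set of formulas of interest.
# `entails` and `negate` are assumed to be supplied by the application.

def satisfies_equation_7(K, formulas, slices, Val, Comp, entails, negate):
    for phi in (f for f in formulas if entails(K, f)):
        lhs = max((Val[T] for T, members in slices.items() if phi in members),
                  default=0.0)
        rhs = max((Comp[T] for T, members in slices.items()
                   if negate(phi) in members), default=0.0)
        if lhs != rhs:
            return False  # no single distribution can induce both measures
    return True
```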
A formula φ such that K ⊭ φ and K ⊭ ¬φ poses no problem, because Comp and Val do not interact:

Π(φ) = min_{T : φ∈T} {1 − Comp(T)},
Π(¬φ) = min_{T′ : ¬φ∈T′} {1 − Comp(T′)},
N(φ) = N(¬φ) = 0.
We now prove that a formula and its negation cannot belong in the same slice, unless the two functions Val and Comp are identical.
Proposition 2. Either Val(T) = Comp(T) for every slice T, or, for every formula φ and every slice T, φ ∈ T ⇒ ¬φ ∉ T.
Proof. By contradiction: we show that if φ ∈ T and ¬φ ∈ T, a contradiction can be derived. Let φ be a formula belonging in just one slice T. If the slices for which Val or Comp are defined are all distinct, there will always be at least one such formula. If not, the equivalent slices can be merged together so that all slices are distinct. By the assumption, ¬φ ∈ T
too. Now, we apply Equations 5 and 6 to compute the possibility and necessity of both φ and ¬φ: without loss of generality, let us assume K ⊨ φ and, therefore, K ⊭ ¬φ; then

Π(φ) = 1,   Π(¬φ) = 1 − Comp(T),
N(φ) = Val(T),   N(¬φ) = 0.

By the duality property,

Val(T) = N(φ) = 1 − Π(¬φ) = Comp(T).
We can observe that the case in which Val = Comp defeats the very purpose of having the two complementary notions of validity and completeness; if that were the case, it would suffice to call the single function into which those two notions collapse "trust", because it would reflect a general notion of reliability of information about a given slice. Therefore, since we are interested in investigating the use of validity and completeness as two distinct notions, in what follows we will make the assumption that, in general, Val ≠ Comp. As a consequence, we now know that any acceptable definition of what a slice is will have to satisfy the postulate that a formula and its negation cannot belong in the same slice: formally, for every slice T and formula φ, φ ∈ T ⇒ ¬φ ∉ T. For instance, a slice might be defined as the set of (ground) formulas that are satisfied by a formula with free variables (i.e., a query).
With this notion of slice (i.e., such that if a for-
mula belongs to a slice, then the negation of that for-
mula does not belong in the slice), we are able to
prove possibilistic generalizations of results obtained
by (Demolombe, 1999) in a KD doxastic logic.
First of all, the fact that a formula and its negation
cannot belong to the same slice motivates the defini-
tion of the dual of a slice.
Definition 5. Let T be a slice. The dual of T, denoted ¬T, is the slice such that φ ∈ T iff ¬φ ∈ ¬T.
The dual of a slice is thus a sort of complement, but not in the set-theoretic sense, because there may exist a formula ψ such that ψ ∉ T and ψ ∉ ¬T; therefore, in general, ¬T differs from the set-theoretic complement of T. A straightforward consequence of Definition 5 is that ¬(¬T) = T.
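One concrete way to realize slices and their duals, in line with the query-based definition suggested above, is to encode a slice as a membership predicate over ground formulas; the sketch below is purely illustrative (the formula syntax and names are ours).

```python
# Minimal sketch (not from the paper): a slice defined intensionally as a
# query (a membership predicate over ground formulas) and its dual, which
# contains exactly the negations of the slice's formulas. Syntax is invented.

def flights_from_london(formula: str) -> bool:
    """Slice "flights departing from London": a query acting as membership test."""
    return formula.startswith("DepartsFrom(") and formula.endswith(", LHR)")

def dual(slice_pred):
    """phi belongs to T iff its negation belongs to the dual of T."""
    return lambda formula: formula.startswith("¬") and slice_pred(formula[1:])

T = flights_from_london
not_T = dual(T)
print(T("DepartsFrom(af123, LHR)"))         # True
print(not_T("¬DepartsFrom(af123, LHR)"))    # True
print(T("¬DepartsFrom(af123, LHR)"))        # False: a formula and its negation
                                            # never share a slice
```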
The intuition behind the next proposition is that if
the knowledge base is consistent and if information is
complete concerning both T and ¬T , then, for each
formula φ in T or in ¬T , either the agent believes φ or
it believes ¬φ.
Proposition 3. If K is consistent, then, for every slice T,

min{Comp(T), Comp(¬T)} ≤ min_{φ∈T} max{N(φ), N(¬φ)}.

Proof. We prove this proposition by showing that, for every formula φ ∈ T,

min{Comp(T), Comp(¬T)} ≤ max{N(φ), N(¬φ)},

from which the thesis follows. Given a formula φ ∈ T, there are just three mutually exclusive cases.
Case I. K ⊨ φ and, therefore, by the consistency of K, K ⊭ ¬φ; we thus have, by Def. 4,
N(φ) = 1 − Π(¬φ) ≥ Comp(¬T);
hence,
max{N(φ), N(¬φ)} ≥ N(φ) ≥ Comp(¬T) ≥ min{Comp(T), Comp(¬T)}.
Case II. K ⊭ φ and K ⊭ ¬φ; we thus have, by Def. 4,
N(φ) = 1 − Π(¬φ) ≥ Comp(¬T);
N(¬φ) = 1 − Π(φ) ≥ Comp(T);
hence,
max{N(φ), N(¬φ)} ≥ max{Comp(¬T), Comp(T)} ≥ min{Comp(T), Comp(¬T)}.
Case III. K ⊨ ¬φ and, therefore, by the consistency of K, K ⊭ φ; we thus have, by Def. 4,
N(¬φ) = 1 − Π(φ) ≥ Comp(T);
hence,
max{N(φ), N(¬φ)} ≥ N(¬φ) ≥ Comp(T) ≥ min{Comp(T), Comp(¬T)}.
Since the three cases above are exhaustive for every formula φ ∈ T, this concludes the proof.
Proposition 4. If π is the least-commitment possibility distribution such that Equations 5 and 6 hold, then, for every slice T,

Val(T) = min_{φ∈T : K⊨φ} N(φ).   (8)

Proof. By Def. 3, Val(T) ≤ N(φ) for all φ ∈ T such that K ⊨ φ; therefore, we can write

Val(T) ≤ min_{φ∈T : K⊨φ} N(φ).   (9)

On the other hand, by Equation 6, we can write

min_{φ∈T : K⊨φ} N(φ) = min_{φ∈T : K⊨φ} max_{T′ : φ∈T′} Val(T′).

Now, we observe that the set of slices {T′ : φ ∈ T′} always includes T as one of its elements, because φ ∈ T; therefore,

max_{T′ : φ∈T′} Val(T′) ≥ Val(T);

furthermore,
- either T is a top-level slice and, by Postulate P3, there exists a formula φ* ∈ T that does not belong to any other slice, in which case

max_{T′ : φ*∈T′} Val(T′) = max{Val(T)} = Val(T),

whence we can conclude that

min_{φ∈T : K⊨φ} max_{T′ : φ∈T′} Val(T′) = Val(T),

which is the thesis;
- or there exists another slice T̂ such that T ⊆ T̂, for which, by Equation 3, Val(T̂) ≤ Val(T), which leads us to conclude that

min_{φ∈T : K⊨φ} max_{T′ : φ∈T′} Val(T′) ≤ Val(T),

which, together with Equation 9, yields the thesis.
It has been proven (Demolombe, 1999) that, for a consistent base K, if K is complete about ¬T (the complement of slice T), then K is valid about slice T; the following is an extension of that result to the gradual case.
Proposition 5. If K is consistent, then, for every slice T, Comp(¬T) ≤ Val(T).
Proof. By Def. 4, Comp(¬T) ≤ 1 − Π(¬ψ) = N(ψ) for all ¬ψ ∈ ¬T (and, therefore, ψ ∈ T) such that K ⊭ ¬ψ; let

β = min_{ψ∈T : K⊭¬ψ} N(ψ);

then we can write Comp(¬T) ≤ β. Clearly, β ≤ min_{φ∈T : K⊨φ} N(φ), because {φ ∈ T : K ⊨ φ} ⊆ {ψ ∈ T : K ⊭ ¬ψ}, since K ⊨ φ ⇒ K ⊭ ¬φ. Therefore, by Proposition 4, we have

Comp(¬T) ≤ β ≤ min_{φ∈T : K⊨φ} N(φ) = Val(T),

which proves the thesis.
3.4 Complexity of Reasoning
The results that have been derived above show that
one can “simulate”, as it were, a possibilistic belief
base by means of a crisp base K together with meta-
data about the validity and completeness of K with
respect to a number of “slices” (i.e., sets of formulas).
It is not important to know a least-commitment
possibility distribution that induces the possibility and
necessity measures of Equations 5 and 6 or to rep-
resent one of its corresponding possibilistic bases B
explicitly, since K, together with its associated meta-
data Val and Comp, is sufficient to compute any pos-
sibilistic inference using any available classical rea-
soner, as demonstrated by the algorithm shown in Fig-
ure 1, adapted from (da Costa Pereira et al., 2017).
Require: K ⊆ L: a consistent KB; φ ∈ L: a formula.
Ensure: N(φ).
1: α ← 0
2: if K ⊨ φ then
3:   for all slices T ∈ S_K do
4:     if φ ∈ T and α < Val(T) then
5:       α ← Val(T)
6:     end if
7:   end for
8: else if K ⊭ ¬φ then
9:   for all slices T ∈ S_K do
10:    if ¬φ ∈ T and α < Comp(T) then
11:      α ← Comp(T)
12:    end if
13:   end for
14: end if
15: return α
Figure 1: An algorithm that "simulates" a possibilistic inference from B using K, Val, and Comp.
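A direct transcription of Algorithm 1 in Python might look as follows; this is our sketch, with `entails` standing for a call to a classical reasoner and `negate` for syntactic negation, and with slices given extensionally so that membership is a plain set lookup. With intensionally defined slices (queries), each membership test may itself require a classical inference, which is what Property 2 below accounts for.

```python
# Minimal sketch (not from the paper): a direct Python transcription of the
# algorithm in Figure 1. `entails(K, phi)` stands for a call to a classical
# reasoner and `negate(phi)` for syntactic negation; slices are extensional
# sets here, so the membership tests are plain lookups.

def simulated_necessity(K, phi, slices, Val, Comp, entails, negate):
    """Return N(phi) computed from K and the metadata Val and Comp."""
    alpha = 0.0
    if entails(K, phi):
        # Lines 3-7: best validity bound among the slices containing phi.
        for T, members in slices.items():
            if phi in members and alpha < Val[T]:
                alpha = Val[T]
    elif not entails(K, negate(phi)):
        # Lines 9-13: best completeness bound among the slices containing
        # the negation of phi (duality with Equation 5).
        for T, members in slices.items():
            if negate(phi) in members and alpha < Comp[T]:
                alpha = Comp[T]
    return alpha
```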
Property 1. Algorithm 1 is correct (i.e., it computes
N(φ)).
Proof. If K ⊨ φ, Equation 6 is applied; otherwise, Equation 5 is used together with duality: N(φ) = 1 − Π(¬φ).
Property 2. The cost of Algorithm 1 is at most ‖S_K‖ + 2 classical inferences, where ‖S_K‖ is the number of slices defined for K.
Proof. Algorithm 1 needs, first of all, to execute at most two classical inferences: the one in Line 2 and, in case K ⊭ φ, the one in Line 8. Then, checking whether a formula belongs in a slice costs at most one classical inference, and this has to be done for all the slices defined for K.
Notice that, according to this result, while the cost of a possibilistic inference is higher than the cost of a classical inference, it is so only by a factor which depends on the number of slices defined on the KB. It is to be expected that this number will, in general, be much smaller than the number of facts contained in the KB. In other words, the overall complexity of possibilistic inference is in the same class as that of classical inference.
4 DISCUSSION AND
CONCLUSION
We have shown that a classical knowledge base plus
metadata information on the (gradual) validity and
completeness of its “slices” enables one to represent
a possibilistic belief base and perform possibilistic in-
ferences by using a classical reasoner at a cost which,
albeit larger than the classical counterpart by a mul-
tiplicative factor proportional to the number of slices,
lies in the same complexity class.
All of our results are valid for the general case of
a decidable fragment of first-order logic and thus they
can be readily transferred to state-of-the-art and popu-
lar knowledge representation languages, like Datalog
and RDF + OWL and their reasoners. This also means
that our suggestion to use gradual metadata about va-
lidity and completeness may be applied to represent-
ing and reasoning with possibilistic uncertainty on top
of the standard infrastructure of the semantic Web,
without requiring any ad hoc extension and at a rea-
sonable cost. In that setting, one way of implement-
ing the notion of a slice might be through RDF named
graphs.
Future work includes demonstrating how our pro-
posal can be deployed on the semantic Web in-
frastructure to represent and reason about uncertain
knowledge with a proof-of-concept implementation.
ACKNOWLEDGEMENTS
This work has been partially supported by the French government, through the 3IA Côte d'Azur "Investments in the Future" project managed by the National Research Agency (ANR) with the reference number ANR-19-P3IA-0002.
REFERENCES
Collins, A., Warnock, E. H., Aiello, N., and Miller, M. L. (1975). Reasoning from incomplete knowledge. In Bobrow, D. G. and Collins, A., editors, Representation and Understanding, pages 383–415. Morgan Kaufmann, San Diego.
da Costa Pereira, C., Dubois, D., Prade, H., and Tettamanzi,
A. G. B. (2017). Handling topical metadata regarding
the validity and completeness of multiple-source in-
formation: A possibilistic approach. In SUM, volume
10564 of Lecture Notes in Computer Science, pages
363–376, Berlin. Springer.
Darari, F., Nutt, W., Pirrò, G., and Razniewski, S. (2013). Completeness statements about RDF data sources and their use for query answering. In The Semantic Web - ISWC 2013 - 12th International Semantic Web Conference, Sydney, NSW, Australia, October 21-25, 2013, Proceedings, Part I, pages 66–83, Berlin. Springer.
Demolombe, R. (1996). Answering queries about validity
and completeness of data: From modal logic to rela-
tional algebra. In FQAS, volume 62 of Datalogiske
Skrifter (Writings on Computer Science), pages 265–
276, Roskilde. Roskilde University.
Demolombe, R. (1999). Database validity and complete-
ness: Another approach and its formalisation in modal
logic. In KRDB, volume 21 of CEUR Workshop Pro-
ceedings, pages 11–13, Aachen. CEUR-WS.org.
Dong, X. L., Gabrilovich, E., Heitz, G., Horn, W., Murphy,
K., Sun, S., and Zhang, W. (2014). From data fusion
to knowledge fusion. Proc. VLDB Endow., 7(10):881–
892.
Dubois, D., Lang, J., and Prade, H. (1994). Possibilistic
logic. In Handbook of logic in artificial intelligence
and logic programming (vol. 3): nonmonotonic rea-
soning and uncertain reasoning, pages 439–513. Ox-
ford University Press, New York, NY, USA.
Dubois, D. and Prade, H. (1988). Possibility Theory—An
Approach to Computerized Processing of Uncertainty.
Plenum Press, New York.
Dubois, D. and Prade, H. (1997). Valid or complete infor-
mation in databases - A possibility theory-based anal-
ysis. In DEXA, volume 1308 of Lecture Notes in Com-
puter Science, pages 603–612, Berlin. Springer.
Gray, J., Chaudhuri, S., Bosworth, A., Layman, A., Re-
ichart, D., Venkatrao, M., Pellow, F., and Pirahesh, H.
(1997). Data cube: A relational aggregation operator
generalizing group-by, cross-tab, and sub totals. Data
Min. Knowl. Discov., 1(1):29–53.
Lang, J. (2001). Possibilistic logic: complexity and al-
gorithms. In Kohlas, J. and Moral, S., editors, Al-
gorithms for Uncertainty and Defeasible Reasoning,
Vol. 5 of Handbook of Defeasible Reasoning and
Uncertainty Management Systems (Gabbay, D. M.
and Smets, Ph., eds.), pages 179–220. Kluwer Acad.
Publ., Dordrecht.
Levesque, H. J. (1980). Incompleteness in knowledge
bases. SIGART Bull., 74:150–152.
Levesque, H. J. (1982). The logic of incomplete knowledge
bases. In On Conceptual Modelling (Intervale), pages
165–189, New York. Springer.
Motro, A. (1989). Integrity = validity + completeness. ACM
Trans. Database Syst., 14(4):480–502.
Razniewski, S., Suchanek, F. M., and Nutt, W. (2016). But
what do we actually know? In AKBC@NAACL-HLT,
pages 40–44, Stroudsburg, PA. The Association for
Computer Linguistics.
Wick, M. L., Singh, S., Kobren, A., and McCallum, A.
(2013). Assessing confidence of knowledge base con-
tent with an experimental study in entity resolution.
In Proceedings of the 2013 workshop on Automated
knowledge base construction, AKBC@CIKM 13, San
Francisco, California, USA, October 27-28, 2013,
pages 13–18, New York. ACM.
Zadeh, L. A. (1965). Fuzzy sets. Information and Control,
8:338–353.