Linguistic Alerts in Information Filtering Systems

Towards Technical Implementations of Cognitive Semantics

Radoslaw P. Katarzyniak, Wojciech A. Lorkiewicz and Ondrej Krejcar

Faculty of Computer Science and Management, Wroclaw University of Technology,

Wybrzeze Wyspianskiego 27, 50-370 Wroclaw, Poland

Center for Basic and Applied Research, Faculty of Informatics and Management, University of Hradec Kralove,

Rokitanskeho 62, 500 03 Hradec Kralove, Czech Republic

Keywords:

Information Filtering, Linguistic Alerts, Computational Semiotics, Epistemic Modalities, Cognitive Semantics.

Abstract:

An original model of natural language alert production is proposed. The alerts are produced by an information filtering system and stated in a quasi-natural language, potentially both written and vocalized. The alerts are chosen with respect to a collection of uncertain decision rules, and thus inherit various levels of epistemic uncertainty. The quasi-natural language statements include linguistic operators of epistemic modality as their necessary parts. The proposed model implements, in a technical context, an adequate cognitive semantics captured by an original theory of epistemic modality grounding defined elsewhere.

1 INTRODUCTION

Selective dissemination of information to users and the related task of information filtering (IF for short) are important challenges for modern information systems (Hanani et al., 2001). They seem particularly crucial for management executives, who are interested in and strongly dependent on up-to-date information related to their everyday business activities (Xu et al., 2011). The way in which the selected (filtered) information is presented needs to be designed with substantial regard to the real environments in which the executives work, including the frequent mobility that nowadays characterizes their daily work. In such circumstances easily comprehensible presentation modes, for instance quasi-natural written and sometimes even vocalized languages, have become a very important theoretical and practical problem for the computer science community.

Unfortunately, in actual settings on-line indexing of documents incoming to executives' knowledge repositories is often practically impossible due to the inherent characteristics of the documents. For instance, a typical document can consist of extensive multimedia elements and therefore require advanced and time-consuming processing to elaborate a semantic description of its content. Fortunately, at least in some practical contexts, an approximate (yet still effective) solution is to base executive-oriented filtering solely on attributes of incoming documents whose values can be easily determined. Such attributes may include origin, author(s), affiliated institutions, attached general keywords, etc. However, a rather obvious inconvenience of the approximate solution is that filtering decisions may be uncertain to some extent. This is due, in particular, to the underlying soft classification rules, in which preconditions are defined by means of easy-to-determine attributes and post-conditions are built from the subjects (topics) the executives are interested in. Another inconvenience might be that such IF systems need to be based on processes of classification rule management (namely, effective extraction, storage, retrieval and update of the rules).

In this paper we provide a theoretical background for solving the highlighted problem for a particular class of IF systems. Namely, a theoretical foundation for the production of alerts about incoming documents, founded on uncertain classification rules, is discussed. An important functional assumption is that alerts are to be stated in quasi-natural languages with linguistic markers communicating levels of epistemic uncertainty. Such linguistic markers exist in perhaps all languages, usually in the form of well-known and widely used basic modal operators of knowledge (I know that ...), belief (I believe that ...) and possibility (I find it possible/it is possible that ...), as well as their possible extensions, e.g. I strongly believe that ... (Nuyts, 2001). The way in which the natural language operators of epistemic modality should be chosen and used as components of information filtering alerts is an original contribution of this work, compared to other works, which usually deal with other classes of language vagueness, e.g. (Herrera-Viedma et al., 2004).

Katarzyniak, R., Lorkiewicz, W. and Krejcar, O. Linguistic Alerts in Information Filtering Systems - Towards Technical Implementations of Cognitive Semantics. In Proceedings of the 18th International Conference on Enterprise Information Systems (ICEIS 2016) - Volume 1, pages 512-519. ISBN: 978-989-758-187-8

The overall organization of the presentation is as follows. In section 2, our original model of the knowledge base and of the basic knowledge management processes underlying the extraction and application of classification rules to alert generation is presented. In this section the concept of mental language holons is introduced as a key knowledge structure participating in the choice and extraction of adequate alerts. According to their definition, the holons cover complementary language-oriented summarizations of experience. In section 3, the so-called theory of modality grounding, originally proposed elsewhere (Katarzyniak, 2005), is applied to define strong and logically consistent support for choosing the adequate epistemic modality of alerts. The theory is based on a technical model of so-called cognitive semantics for natural language statements, limited to a certain scope of quasi-natural language statements that are modal in the epistemic sense. The section contains a brief note on the novelty of our approach, considered with respect to technical applications of cognitive linguistics. In section 4 a computational example is presented. Finally, section 5 summarizes the presented results and points out possible future extensions.

2 MODEL DEFINITION

2.1 Profile of User Information Needs and Filtering Task

The system's user is focused on a given aspect of the information stored within the system that is important to him or her. This user-relevant context strictly depends on the user's current preferences and is inherently individual. The user's focus represents a set of topics of interest (document subjects or themes) that are of special importance or significance to the user.

The user's information needs are represented by a set of subjects (also called themes) $S = \{sub_1, sub_2, \ldots, sub_{|S|}\}$ that are of potential interest to him or her. Moreover, we further assume that information needs represent the sole information that the system captures about the user. Consequently, the user's information needs represent the user's profile stored within the system.

Note: Due to strict editorial limitations, we focus solely on the single-user case. However, in a more practical realisation the proposed approach can be easily extended to the case of multiple users.

We further assume that all incoming documents are processed (filtered and indexed) in order to determine whether they cover any of the highlighted topics of interest (the user's information needs). The sole purpose of the filtering stage is to identify documents that are significant to the end user. In particular, the goal is to identify those documents $d$ for which complete knowledge about them (obtained, for instance, through thorough manual examination by an expert) would lead to the formulation of basic statements: "$d$ is about $sub_j$" or "$d$ is about $sub_j$ and $sub_k$" (where $j$ is different from $k$). Recognizing such documents should further lead to the generation of adequate alerts by the system, informing the end user about the appearance of important documents.

At first glance the introduced form of user profile is extremely simple compared to the apparently more complex models of user profiles studied and applied in the field of IF systems, e.g. (Brown and Jones, 2001; Shapira et al., 1999; Xu et al., 2011). However, as it turns out, in the context of the presented approach to the generation of linguistic alerts even such an oversimplified profile poses significant problems. In particular, the linguistic approach requires the application of unconventional linguistic semantic models that provide a solid and adequate theoretical background for technical implementations of cognitive semantics for such linguistic alerts.

Importantly, the aforementioned task of document filtering is highly complex. In automatic approaches it requires a complex and computationally exhaustive procedure that is able to analyse the content of a document and determine its relevance against the set of identified subjects. Moreover, in some settings a fully automatic approach might not even be available; as such, a set of semi-automatic or even manual methods must be utilised. Furthermore, in systems with strict processing time restrictions the filtering process poses significant technical problems, both methodologically and computationally.

2.2 The Repository Databases

The repository consists of two classes of documents: already stored documents $D = \{d_1, d_2, \ldots, d_{|D|}\}$ with complete descriptions (including a description of their semantic content) stored in a regular database of the repository, and new documents (new arrivals) $D^{new} = \{d^{new}_1, d^{new}_2, \ldots, d^{new}_{|D^{new}|}\}$ with incomplete descriptions of their thematic content, thus awaiting off-line semantic analysis.

Formally, the repository sub-databases can be described as an information system in the sense of Pawlak (Pawlak and Skowron, 2007), tailored to our practical context. Let $Rep = (D \cup D^{new}, A, V, \rho)$ be further considered, where $D$ and $D^{new}$ are the sets of stored documents and new arrivals, respectively, $A = \{w_1, w_2, \ldots, w_L, sub_1, sub_2, \ldots, sub_{|S|}\}$ is a set of attributes, $V = \bigcup_{a \in A} V_a$, where $V_{w_i} = W_i$ and $V_{sub_j} = \{\varepsilon, 0, 1\}$, is a set of attribute values, and $\rho : (D \cup D^{new}) \times A \to V$ is a partial information function.

The partiality of function $\rho$ reflects the extent to which documents are described with regard to their thematic content (their semantics). Namely, it is assumed that $W = \{w_1, w_2, \ldots, w_L\}$ consists of multivalued attributes, called conditional attributes. Values of conditional attributes are usually delivered at a document's arrival, as these attributes represent a set of easily computed parameters (characteristics) of the document, computed on-line. In contrast, $S = \{sub_1, sub_2, \ldots, sub_{|S|}\}$ consists of attributes, called thematic attributes, representing the content of documents (with respect to a given profile of information needs). Determining the values of thematic attributes requires intensive (both methodologically and computationally) off-line semantic analysis of the document.

For the sake of clarity and ease of presentation some additional symbols are further introduced. Namely, for each document $d \in D \cup D^{new}$, $\rho_{d|W} : W \to \bigcup_{i=1}^{L} W_i$ is a conditional-part information function related to document $d$, such that for each attribute $x \in W$, $\rho_{d|W}(x) \in W_x$ holds, provided that $W_x$ consists of all possible values of $x$.

Similarly, for each document $d \in D \cup D^{new}$, $\rho_{d|S} : S \to \{\varepsilon, 0, 1\}$ is a thematic-part information function related to document $d$. However, in this case the rules for assigning attribute values differ for $d \in D$ and $d \in D^{new}$. Namely, for each attribute $x \in S$ and each document $d \in D$, $\rho_{d|S}(x) = 1$ if and only if document $d$ is indexed as being about subject $x$; otherwise $\rho_{d|S}(x) = 0$. At the same time, for each attribute $x \in S$ and $d \in D^{new}$, the value of $x$ is treated as unknown, which is formally represented by $\rho_{d|S}(x) = \varepsilon$.
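The two partial information functions can be pictured as simple lookups over a table of documents. The following is a minimal Python sketch under stated assumptions: the document identifiers, the two conditional attributes and all values are hypothetical toy data, and `None` is used to stand for the unknown value ε.

```python
from typing import Dict, Union

Value = Union[str, int, None]  # None stands for the unknown value epsilon

W = ("w1", "w2")      # conditional attributes (hypothetical names)
S = ("sub1", "sub2")  # thematic attributes, i.e. subjects of interest

# Stored documents D: every thematic attribute is already 0 or 1.
D: Dict[str, Dict[str, Value]] = {
    "d1": {"w1": "B", "w2": "A", "sub1": 1, "sub2": 0},
    "d2": {"w1": "C", "w2": "A", "sub1": 0, "sub2": 1},
}

# New arrivals D_new: thematic attributes await off-line analysis.
D_NEW: Dict[str, Dict[str, Value]] = {
    "n1": {"w1": "B", "w2": "A", "sub1": None, "sub2": None},
}


def rho_w(doc_id: str) -> Dict[str, Value]:
    """Conditional-part information function for one document."""
    row = {**D, **D_NEW}[doc_id]
    return {w: row[w] for w in W}


def rho_s(doc_id: str) -> Dict[str, Value]:
    """Thematic-part information function for one document."""
    row = {**D, **D_NEW}[doc_id]
    return {s: row[s] for s in S}
```

In this representation a new arrival is distinguishable from a stored document only by its thematic part, which is entirely unknown until the off-line analysis is done.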

2.3 Mental Language Holons as Representation of Subject Distribution

As mentioned above, the introduced IF system is dedicated to analysing incoming documents with regard to individual subjects $sub \in S$ and/or their conjunctions $sub_j \wedge sub_k$, where $sub_j, sub_k \in S$ and $sub_j \neq sub_k$. The results of this analysis may be uncertain predictions, communicated by means of natural language operators of epistemic modality.

Below, an adequate model of the database meta-descriptions used in the filtering process is proposed. Its purpose is to enable an effective and semantically valid realization of the assumed functional goal of the IF system. The model is fully compatible with an original theory of epistemic modality grounding, partially presented in (Katarzyniak, 2005; Katarzyniak, 2006b; Katarzyniak, 2006a). The main assumption of the theory is that linguistic alerts are inseparably connected to (in a sense, grounded in) so-called mental language holons. Language holons represent embedded summarizations of empirical episodic experiences, i.e., experiences strictly related to particular subjects or their binary conjunctions. In many ways language holons are similar to mental models, known from cognitive linguistics and psychology (Johnson-Laird, 1985). For the sake of completeness it is worth mentioning that, at the technical level, mental language holons can be treated as complexes of complementary classification rules.

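In order to formally capture the latter, the following three retrieval languages are introduced:

$$
\begin{aligned}
KS &= \{sub_1, sub_2, \ldots, sub_{|S|}\},\\
KB &= \{sub_x \wedge sub_y \mid sub_x, sub_y \in S \wedge x < y\},\\
KL &= \Big\{\bigwedge_{i=1}^{L} (w_i = x_i) \;\Big|\; w_i \in W,\ x_i \in W_i,\ i = 1..L\Big\}.
\end{aligned}
\tag{1}
$$

The semantics of the retrieval languages is given by the following functions:

$$
\delta_{|KS} : KS \to 2^{D}, \qquad
\delta_{|KB} : KB \to 2^{D}, \qquad
\delta_{|KL} : KL \to 2^{D \cup D^{new}},
\tag{2}
$$

where:

$$
\begin{aligned}
\delta_{|KS}(sub) &= \{d \in D \mid \rho_{d|S}(sub) = 1\},\\
\delta_{|KB}(sub_x \wedge sub_y) &= \delta_{|KS}(sub_x) \cap \delta_{|KS}(sub_y),\\
\delta_{|KL}\Big(\bigwedge_{i=1}^{L} (w_i = x_i)\Big) &= \Big\{d \in D \cup D^{new} \;\Big|\; \bigwedge_{i=1}^{L} \big(\rho_{d|W}(w_i) = x_i\big)\Big\}.
\end{aligned}
\tag{3}
$$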
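The semantics of the three retrieval languages amounts to three set-valued queries. The sketch below is an illustration only: the toy documents are hypothetical, and it assumes, as the codomain stated in (2) suggests, that conditions from KL are evaluated over both stored and new documents.

```python
from typing import Dict, Optional, Set

Row = Dict[str, Optional[object]]

# Hypothetical toy data: stored documents D and new arrivals D_NEW.
D: Dict[str, Row] = {
    "d1": {"w1": "B", "w2": "A", "sub1": 1, "sub2": 1},
    "d2": {"w1": "B", "w2": "A", "sub1": 0, "sub2": 1},
    "d3": {"w1": "C", "w2": "A", "sub1": 1, "sub2": 0},
}
D_NEW: Dict[str, Row] = {
    "n1": {"w1": "B", "w2": "A", "sub1": None, "sub2": None},
}


def delta_ks(sub: str) -> Set[str]:
    """Stored documents indexed as being about the simple subject `sub`."""
    return {d for d, row in D.items() if row[sub] == 1}


def delta_kb(sub_x: str, sub_y: str) -> Set[str]:
    """Stored documents indexed as being about both subjects."""
    return delta_ks(sub_x) & delta_ks(sub_y)


def delta_kl(condition: Dict[str, str]) -> Set[str]:
    """Documents (stored or new) whose conditional attributes match
    every (attribute, value) pair of `condition`."""
    everything = {**D, **D_NEW}
    return {
        d for d, row in everything.items()
        if all(row[w] == x for w, x in condition.items())
    }
```

Only `delta_kl` can return new arrivals, because their thematic attributes are unknown and therefore never equal to 1.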

Mental language holons are defined for simple subjects in $KS$ and conjunctive subjects in $KB$, with respect to particular conditions from the retrieval language $KL^{+} \subseteq KL$, where the subset $KL^{+}$ (of non-empty conditions) is defined as $KL^{+} = \{k \in KL \mid \delta_{|KL}(k) \cap D \neq \emptyset\}$.

Having defined $KL^{+}$, we can introduce two auxiliary symbols: class $K_i$ and class extension $EXT(K_i)$. In particular, a class $K_i$ is the set of already processed documents that are indistinguishable with respect to the conditional attributes (all of them satisfy $\kappa_i$), whereas the class extension $EXT(K_i)$ is the set of new documents that are indistinguishable in the same sense. Namely, if (and only if) $|KL^{+}| = Q \geq 1$ and $KL^{+} = \{\kappa_1, \kappa_2, \ldots, \kappa_Q\}$, then for $i = 1..Q$,

$$
K_i = \delta_{|KL}(\kappa_i) \cap D, \qquad
EXT(K_i) = \delta_{|KL}(\kappa_i) \cap D^{new}.
\tag{4}
$$
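The classes and their extensions in (4) can be obtained by grouping documents on their conditional attribute values. The following sketch assumes hypothetical toy data with two conditional attributes; a condition κ is represented simply as the tuple of values it requires.

```python
from typing import Dict, Set, Tuple

W = ("w1", "w2")  # conditional attributes (hypothetical names)
Condition = Tuple[str, ...]

# Hypothetical toy data; only conditional attributes matter here.
D = {
    "d1": {"w1": "B", "w2": "A"},
    "d2": {"w1": "B", "w2": "A"},
    "d3": {"w1": "C", "w2": "A"},
}
D_NEW = {
    "n1": {"w1": "B", "w2": "A"},
    "n2": {"w1": "C", "w2": "C"},  # matches no stored document
}


def signature(row: Dict[str, str]) -> Condition:
    """Values of the conditional attributes, in the fixed order of W."""
    return tuple(row[w] for w in W)


def classes_and_extensions(stored, new):
    """Return (classes, extensions), both keyed by condition.

    Only conditions satisfied by at least one stored document are
    kept, which mirrors the restriction to non-empty conditions.
    """
    classes: Dict[Condition, Set[str]] = {}
    for doc_id, row in stored.items():
        classes.setdefault(signature(row), set()).add(doc_id)
    extensions: Dict[Condition, Set[str]] = {k: set() for k in classes}
    for doc_id, row in new.items():
        kappa = signature(row)
        if kappa in extensions:
            extensions[kappa].add(doc_id)
    return classes, extensions


CLASSES, EXTENSIONS = classes_and_extensions(D, D_NEW)
```

A new arrival such as `n2`, whose attribute values were never seen among stored documents, falls into no extension, so no experience is available to support an alert about it.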

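For each $sub \in S$ and $\kappa_i \in KL^{+}$, the (simple subject) mental language holon is given as a vector simholon:

$$
simholon[\kappa_i, sub, \lambda^{+}(sub), \lambda^{-}(sub)],
\tag{5}
$$

where

$$
\lambda^{+}(sub) = \frac{|\delta_{|KS}(sub) \cap K_i|}{|K_i|}, \qquad
\lambda^{-}(sub) = \frac{|(D \setminus \delta_{|KS}(sub)) \cap K_i|}{|K_i|}.
\tag{6}
$$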
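A simple-subject holon can be computed directly from a class and the set of documents indexed under the subject. The sketch below is an illustration under one explicit assumption: the two strengths are taken as shares of the class size, in line with their description as relative shares of complementary experience. The function and argument names are hypothetical.

```python
from typing import Set, Tuple


def simholon(kappa: str, sub: str, klass: Set[str],
             about: Set[str]) -> Tuple[str, str, float, float]:
    """Build a simple-subject holon for one class.

    klass -- stored documents satisfying the condition kappa
    about -- stored documents indexed as being about `sub`
    """
    if not klass:
        raise ValueError("holons are defined for non-empty classes only")
    n = len(klass)
    lam_plus = len(about & klass) / n    # share of the class about sub
    lam_minus = len(klass - about) / n   # share of the class not about sub
    return (kappa, sub, lam_plus, lam_minus)


# Four documents in the class, one of them about the subject.
example = simholon("k1", "sub1", {"d1", "d2", "d3", "d4"}, {"d1", "d9"})
```

Because every stored document is indexed as either about or not about a subject, the two strengths of a holon built this way always sum to one.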

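For each conjunctive subject $(sub_x \wedge sub_y) = sub_{x,y} \in KB$ and $\kappa_i \in KL^{+}$, the (conjunctive subject) mental language holon is given as a vector conholon:

$$
conholon[\kappa_i, sub_{x,y}, \lambda^{++}(sub_{x,y}), \lambda^{+-}(sub_{x,y}), \lambda^{-+}(sub_{x,y}), \lambda^{--}(sub_{x,y})],
\tag{7}
$$

where

$$
\begin{aligned}
\lambda^{++}(sub_{x,y}) &= \frac{|\delta_{|KS}(sub_x) \cap \delta_{|KS}(sub_y) \cap K_i|}{|K_i|},\\
\lambda^{+-}(sub_{x,y}) &= \frac{|\delta_{|KS}(sub_x) \cap (D \setminus \delta_{|KS}(sub_y)) \cap K_i|}{|K_i|},\\
\lambda^{-+}(sub_{x,y}) &= \frac{|(D \setminus \delta_{|KS}(sub_x)) \cap \delta_{|KS}(sub_y) \cap K_i|}{|K_i|},\\
\lambda^{--}(sub_{x,y}) &= \frac{|(D \setminus \delta_{|KS}(sub_x)) \cap (D \setminus \delta_{|KS}(sub_y)) \cap K_i|}{|K_i|}.
\end{aligned}
\tag{8}
$$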
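The conjunctive holon splits a class into four disjoint parts. The following sketch mirrors the simple-subject case and again assumes that the strengths are shares of the class size; the function and argument names are hypothetical.

```python
from typing import Set, Tuple


def conholon(kappa: str, sub_x: str, sub_y: str, klass: Set[str],
             about_x: Set[str], about_y: Set[str]
             ) -> Tuple[str, Tuple[str, str], float, float, float, float]:
    """Build a conjunctive-subject holon for one class.

    klass   -- stored documents satisfying the condition kappa
    about_x -- stored documents indexed as being about sub_x
    about_y -- stored documents indexed as being about sub_y
    """
    if not klass:
        raise ValueError("holons are defined for non-empty classes only")
    n = len(klass)
    pp = len(klass & about_x & about_y) / n    # about x and about y
    pm = len((klass & about_x) - about_y) / n  # about x, not about y
    mp = len((klass & about_y) - about_x) / n  # not about x, about y
    mm = len(klass - about_x - about_y) / n    # about neither
    return (kappa, (sub_x, sub_y), pp, pm, mp, mm)


# Four documents, all about sub1, exactly one of them also about sub2.
example = conholon("k2", "sub1", "sub2",
                   {"a", "b", "c", "d"}, {"a", "b", "c", "d"}, {"a"})
```

The four parts partition the class, so the four strengths sum to one, and adding the first two recovers the simple-subject strength of `sub_x` for the same class.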

From the pragmatic point of view, mental language holons are higher-level summarizations (semantic generalizations) of the relative shares of complementary bodies of experience related to particular subjects (or their conjunctions). The whole repository of language holons, available to the IF system's processes and, in particular, to the alert production procedures, is given as follows:

$$
\begin{aligned}
HOLONS &= SIMHOLONS \cup CONHOLONS,\\
SIMHOLONS &= \{simholon[\kappa, x, \lambda^{+}(x), \lambda^{-}(x)] \mid \kappa \in KL^{+},\ x \in KS\},\\
CONHOLONS &= \{conholon[\kappa, x, \lambda^{++}(x), \lambda^{+-}(x), \lambda^{-+}(x), \lambda^{--}(x)] \mid \kappa \in KL^{+},\ x \in KB\}.
\end{aligned}
\tag{9}
$$

3 ALERTS PRODUCTION

3.1 Alerts and their Semantic Proto-forms

Examples of the possible structure and content of the alerts considered in our research are given as follows:

IF SYSTEM ALERT: There is a new [document: x]. I believe it is about [subject: sub]. You may be interested in reading it!

IF SYSTEM ALERT: Documents [documents: $x_1, \ldots, x_n$] are new. It is possible that they are about [subjects: $sub_j$ and $sub_k$]. Should I put them on your pending list?

IF SYSTEM ALERT: There is a new [document: x] worth looking at. I believe it is about [subject: $sub_j$], but not about [subject: $sub_k$]. According to what I know about your interests, the first issue may be of interest to you. Should I put the document in your working box? Please answer [YES/NO]!

IF SYSTEM ALERT: Among others, the following documents: $x_1, \ldots, x_n$ have also been received from [source: source]. I believe they are not about [subject: sub], which you pointed out as your main issue. Should I put them on your pending list despite this? Please answer [YES/NO]!

IF SYSTEM ALERT: It is possible that the following [incomings: $x_1, \ldots, x_n$] deal with [subject: sub], which is on your list of interests. Are you interested in reading them before they are moved to our central document base? Please answer [YES/NO]!

The structure of alerts fully depends on the designer's choice and, obviously, should reflect the favoured modes and preferences of interaction of a particular user (or group of users). In our case the alerts are represented (communicated) in a natural language that is a partially controlled version of an actual language. In advanced multimedia systems the alerts can be vocalized, too.

The common feature of the above examples is their underlying sense. Namely, regardless of their form (individual document vs. group of documents, simple subject vs. conjunctive subject), they are all founded on the same propositional aspect: being about or not being about a particular simple subject (or conjunction of simple subjects). Moreover, for $x \in D \cup D^{new}$ and $sub \in KS \cup KB$, each example is originally created as an instantiation (concretization) of one of the following basic linguistic proto-forms:

knowing([document(s):x] is about [subject(s):sub])

believing([document(s):x] is about [subject(s):sub])

possible([document(s):x] is about [subject(s):sub])

or another proto-form, complementary to the ones enumerated above.
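Instantiating a proto-form into a written alert can be sketched as a template lookup. The code below is only an illustration: the wording of the templates is adapted from the example alerts, while the dictionary, the function name and the single-document layout are assumptions made for this sketch.

```python
# Hypothetical sentence templates, one per epistemic operator.
TEMPLATES = {
    "knowing": "I know it is about [subject: {sub}].",
    "believing": "I believe it is about [subject: {sub}].",
    "possible": "It is possible that it is about [subject: {sub}].",
}


def render_alert(modality: str, doc: str, sub: str) -> str:
    """Turn a proto-form such as believing([doc] is about [sub])
    into a single-document alert in a controlled language."""
    if modality not in TEMPLATES:
        raise ValueError(f"unknown modality: {modality}")
    body = TEMPLATES[modality].format(sub=sub)
    return f"IF SYSTEM ALERT: There is a new [document: {doc}]. {body}"


alert = render_alert("believing", "x", "sub")
```

Keeping the modality as an explicit key separates the choice of the epistemic operator from the surface wording, so the same proto-form could be rendered in another language or vocalized without changing the selection logic.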

It is worth mentioning that for a fixed document $x$ and a fixed subject $sub$ (a simple subject or a binary conjunction of simple subjects) one and only one proto-form should be instantiated as the proper representation of the epistemic state. This constraint follows from a common-sense rule of natural language pragmatics, which says that knowing, believing and finding something merely possible (in the epistemic sense) are mutually exclusive, different states of the same mental epistemic attitude. Thus, in our research an adequate extraction of natural language alerts from the IF system's knowledge base (or, more strictly, a proper and adequate choice and further instantiation of a proto-form) becomes a fundamental issue to be elaborated on both the technical and the theoretical level.

In conclusion, similarly to other natural language statements, three aspects of alerts need to be taken into account: propositional element, modality, and temporal frame. As mentioned above, the propositional element is given by predication, which on the written (or vocalized) level is referred to by elements of the sets $KS$ and $KB$. The alerts' temporal dimension is quite apparent: they are stated in the present tense. A more problematic issue is the choice of the alerts' modality, which in our case should reflect the epistemic uncertainty of the IF system itself. An important question, of both theoretical and technical nature, is how to properly choose adequate modality markers in order to extend the written (or vocalized) representation of predication (applied to incoming documents). An answer to this question is strongly supported by an original theory of grounding of modal epistemic statements, briefly presented below.

3.2 Applying the Theory of Epistemic Modality Grounding to Alerts' Production

The decision rules for the proper choice of an adequate modal proto-form and its instantiation (and further presentation to an end user in a written and/or vocalized form) follow from an original theory of grounding, presented elsewhere. Namely, for the case of simple subject-based predication the introductory theoretical results can be found in (Katarzyniak, 2005), and for binary conjunctive subject-based predication in (Katarzyniak, 2006b; Katarzyniak, 2006a).

It is assumed in the theory (following multiple models of language production (Evans and Green, 2006; Stachowiak, 2013; Wlodarczyk, 2013)) that particular epistemic operators of modality are related to summarized empirical experience supporting the related language proto-forms. However, these proto-forms are never stored and processed as separate entities, for they are conceptually (mentally) related to their complementary counterparts. In particular, such complexes of complementary proto-forms constitute linguistic holons, which in our technical approach are strongly related to the concept of mental language holons defined in the previous sections. In consequence, to each linguistic proto-form, which is always related to one and only one part of a relevant mental language holon, a certain intensity of summarized (embodied) experience of a subject (or binary conjunctive subject) is assigned. In the theory of grounding this intensity is numerically represented by the relative grounding strength.

According to the theory of grounding of simple modalities, the proper choice of an adequate linguistic proto-form is possible if and only if a proper system of so-called modality thresholds is applied (and technically realized in a system). In our case the system needs to consist of two interrelated sub-systems of thresholds, $\{\lambda_{Know}, \lambda_{maxBel}, \lambda_{minBel}, \lambda_{maxPos}, \lambda_{minPos}\}$ and $\{\lambda^{\wedge}_{Know}, \lambda^{\wedge}_{maxBel}, \lambda^{\wedge}_{minBel}, \lambda^{\wedge}_{maxPos}, \lambda^{\wedge}_{minPos}\}$, for effective control of simple-subject predication instantiation and conjunctive-subject predication instantiation, respectively.

An interesting result from the theory of grounding, perhaps the most important one for practice, is that the system of modality thresholds cannot be freely chosen. Namely, in order to guarantee common-sense consistency of (written and verbal) language behaviour, the system of modality thresholds has to fulfil a predefined set of requirements, accepted in the theory of grounding as a reflection of the common-sense pragmatics applied in actual contexts to the natural language operators of knowledge, belief, and possibility. The fact that written and/or verbal behaviour produced by a technical system based on the theory of grounding is actually consistent from the semiotic and pragmatic point of view can be analytically proved and verified (some of the results can be found in (Katarzyniak, 2005; Katarzyniak, 2006b; Katarzyniak, 2006a)).

Moreover, within the numerical scope permissible according to the theory of grounding, values for thresholds can be chosen in an arbitrary manner (Katarzyniak, 2005). However, for the case of populations of artificial agents it is possible to obtain them from computationally realized processes of artificial language semiosis (Lorkiewicz et al., 2011).

In order to omit a deeper discussion of the theory of grounding (which is outside the scope of this work), we further present an original application of the theory to the definition of basic rules for the acceptability and adequacy of modal alerts. The fundamental assumption is that a given modal alert can be produced (by the IF system) if and only if its underlying linguistic proto-form is well grounded in the IF system's knowledge base. It also means that, in this practical context, for a certain alert being well grounded is equivalent to adequately describing the related state of the IF system's knowledge about the possibility that a certain document $d \in D \cup D^{new}$ deals with a certain subject $sub \in KS \cup KB$. In particular, for any document $d \in D^{*}$, $d \in EXT(K_i)$, and $sub \in KS$, the following set of so-called grounding relations constitutes the theoretical foundation of the IF alerting processes:

• between $simholon[\kappa_i, sub, \lambda^{+}(sub), \lambda^{-}(sub)]$ and possible([d] is about [sub]), the grounding relation holds if and only if $\lambda_{minPos} \leq \lambda^{+}(sub) < \lambda_{maxPos}$;

• between $simholon[\kappa_i, sub, \lambda^{+}(sub), \lambda^{-}(sub)]$ and believing([d] is about [sub]), the grounding relation holds if and only if $\lambda_{minBel} \leq \lambda^{+}(sub) < \lambda_{maxBel}$;

• between $simholon[\kappa_i, sub, \lambda^{+}(sub), \lambda^{-}(sub)]$ and knowing([d] is about [sub]), the grounding relation holds if and only if $\lambda^{+}(sub) = \lambda_{Know} = 1$.

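Rather obviously, complementary alerts on a document $d \in D^{*}$ not being about a particular subject $sub \in KS$ are produced with respect to the next three definitions:

• between $simholon[\kappa_i, sub, \lambda^{+}(sub), \lambda^{-}(sub)]$ and possible([d] is not about [sub]), the grounding relation holds if and only if $\lambda_{minPos} \leq \lambda^{-}(sub) < \lambda_{maxPos}$;

• between $simholon[\kappa_i, sub, \lambda^{+}(sub), \lambda^{-}(sub)]$ and believing([d] is not about [sub]), the grounding relation holds if and only if $\lambda_{minBel} \leq \lambda^{-}(sub) < \lambda_{maxBel}$;

• between $simholon[\kappa_i, sub, \lambda^{+}(sub), \lambda^{-}(sub)]$ and knowing([d] is not about [sub]), the grounding relation holds if and only if $\lambda^{-}(sub) = \lambda_{Know} = 1$.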
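The six definitions for simple subjects reduce to comparing a grounding strength against threshold intervals. The following Python sketch shows one way to do this; the class and function names are hypothetical, and the threshold values are illustrative ones (they coincide with those used in the computational example of section 4).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Thresholds:
    min_pos: float
    max_pos: float
    min_bel: float
    max_bel: float
    know: float = 1.0


def modality(strength: float, t: Thresholds) -> Optional[str]:
    """Return the epistemic operator grounded by `strength`, if any."""
    if strength == t.know:
        return "knowing"
    if t.min_bel <= strength < t.max_bel:
        return "believing"
    if t.min_pos <= strength < t.max_pos:
        return "possible"
    return None  # too weak to ground any proto-form


def grounded_protoforms(doc: str,
                        holon: Tuple[str, str, float, float],
                        t: Thresholds) -> List[str]:
    """Proto-forms about `doc` that are well grounded in a simholon."""
    _kappa, sub, lam_plus, lam_minus = holon
    forms = []
    positive = modality(lam_plus, t)
    if positive:
        forms.append(f"{positive}([{doc}] is about [{sub}])")
    negative = modality(lam_minus, t)
    if negative:
        forms.append(f"{negative}([{doc}] is not about [{sub}])")
    return forms


T = Thresholds(min_pos=0.20, max_pos=0.60, min_bel=0.60, max_bel=1.0)
```

With non-overlapping intervals, as here, at most one operator is returned for each strength, which reflects the requirement that knowing, believing and finding possible are mutually exclusive.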

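Obviously, a similar set of definitions for $d \in D^{*}$, $d \in EXT(K_i)$, and $(sub_x \wedge sub_y) = sub_{x,y} \in KB$ can also be formulated and used, if needed. However, in such a case other mental language holons must be referred to:

• between $conholon[\kappa_i, sub_{x,y}, \lambda^{++}(sub_{x,y}), \lambda^{+-}(sub_{x,y}), \lambda^{-+}(sub_{x,y}), \lambda^{--}(sub_{x,y})]$ and possible([d] is about [$sub_x$] and [$sub_y$]), the grounding relation holds if and only if $\lambda^{\wedge}_{minPos} \leq \lambda^{++}(sub_{x,y}) < \lambda^{\wedge}_{maxPos}$;

• between $conholon[\kappa_i, sub_{x,y}, \lambda^{++}(sub_{x,y}), \lambda^{+-}(sub_{x,y}), \lambda^{-+}(sub_{x,y}), \lambda^{--}(sub_{x,y})]$ and believing([d] is about [$sub_x$] and [$sub_y$]), the grounding relation holds if and only if $\lambda^{\wedge}_{minBel} \leq \lambda^{++}(sub_{x,y}) < \lambda^{\wedge}_{maxBel}$;

• between $conholon[\kappa_i, sub_{x,y}, \lambda^{++}(sub_{x,y}), \lambda^{+-}(sub_{x,y}), \lambda^{-+}(sub_{x,y}), \lambda^{--}(sub_{x,y})]$ and knowing([d] is about [$sub_x$] and [$sub_y$]), the grounding relation holds if and only if $\lambda^{++}(sub_{x,y}) = \lambda^{\wedge}_{Know} = 1$.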

For purely editorial reasons, we do not deal with the complementary conjunctive alerts, i.e., alerts on new documents being about [$sub_x$ and not $sub_y$], [not $sub_x$ and $sub_y$], [not $sub_x$ and not $sub_y$]. Clearly, they have to be verified in a similar way, but against the values of $\lambda^{+-}(sub_{x,y})$, $\lambda^{-+}(sub_{x,y})$, and $\lambda^{--}(sub_{x,y})$, respectively.

3.3 A Brief Note on Cognitive Semantics

The novelty of our approach to the generation of quasi-natural language alerts falls outside of previous linguistic models. Namely, it is an original proposal consistent with the paradigms of cognitive linguistics (Evans and Green, 2006) and interactive linguistics (Wlodarczyk, 2013). Both of them relate our work to the concept of cognitive semantics (Talmy, 2000), which describes the way a particular natural language sentence embraces the pre-linguistic knowledge corpora accessible to the mind of a communicating agent. Obviously, in our R&D context the communicating subjects are IF systems.

Cognitive semantics is always characterised by high specificity, because in each case it reflects the pragmatics and meaning of a very narrow class of linguistic phenomena. In our model this specificity is clearly visible in the internally related and complex structure of mental language holons. The proposal of how to realize the cognitive semantics of alerts in our IF system should be treated as the most original contribution of the model.

4 COMPUTATIONAL EXAMPLE

In this section we introduce a basic example that illustrates the entire process of generating linguistic alerts in IF systems. For the sake of simplicity let us assume an elementary information system comprising a document repository that consists of 10 processed documents $D = \{d_1, d_2, \ldots, d_{10}\}$ and 3 new documents $D^{new}$, all evaluated on a set of 4 conditional attributes $W = \{w_1, w_2, w_3, w_4\}$. Further, let us assume that the user's information needs are limited to two subjects $S = \{sub_1, sub_2\}$. Consequently, the set of all attributes available in the system is defined as $A = \{w_1, w_2, w_3, w_4, sub_1, sub_2\}$. Furthermore, let the domains of the introduced attributes be given as $W_1 = W_2 = W_3 = W_4 = \{A, B, C\}$.

Documents stored in the document repository are processed. In particular, each document is analysed by a set of indexing mechanisms (or other processing mechanisms) that are able, based on the document's content and structure, to assign values to each of the conditional attributes. Further, information about each document's subjects is determined and stored. In this way the information function of the repository, i.e., the attribute-value mapping, is determined, as given in Table 1.

Focusing on three simple classes $\kappa_1$, $\kappa_2$, and $\kappa_3$, given as $\kappa_1 = \{(w_1, B), (w_2, A), (w_3, A), (w_4, A)\}$, $\kappa_2 = \{(w_1, C), (w_2, C), (w_3, A), (w_4, B)\}$, and $\kappa_3 = \{(w_1, B), (w_2, B), (w_3, C), (w_4, A)\}$, we can determine three non-empty clusters of stored documents $K_1$, $K_2$, $K_3$ and their extensions $EXT(K_1)$, $EXT(K_2)$, $EXT(K_3)$, of which $EXT(K_1)$ and $EXT(K_2)$ each contain one new document, while $EXT(K_3) = \emptyset$. It must be mentioned that one of the newly received documents does not belong to any of these sets. This fact will be commented on in the final remarks section.

The resulting summarization of data is represented by the following set of holons $HOLONS = SIMHOLONS \cup CONHOLONS$:

$$
\begin{aligned}
SIMHOLONS = \{\, & simholon[\kappa_1, sub_1, 0.25, 0.75],\\
& simholon[\kappa_1, sub_2, 1.00, 0.00],\\
& simholon[\kappa_2, sub_1, 1.00, 0.00],\\
& simholon[\kappa_2, sub_2, 0.25, 0.75],\\
& simholon[\kappa_3, sub_1, 0.50, 0.50],\\
& simholon[\kappa_3, sub_2, 0.00, 1.00]\,\}.
\end{aligned}
\tag{10}
$$

$$
\begin{aligned}
CONHOLONS = \{\, & conholon[\kappa_1, sub_1 \wedge sub_2, 0.25, 0.50, 0.25, 0.00],\\
& conholon[\kappa_2, sub_1 \wedge sub_2, 0.25, 0.75, 0.00, 0.00],\\
& conholon[\kappa_3, sub_1 \wedge sub_2, 0.00, 0.50, 0.00, 0.50]\,\}.
\end{aligned}
\tag{11}
$$

Having the relative grounding strengths computed and stored in each holon, we can now determine all proto-forms for the new arrivals from the non-empty extensions $EXT(K_1)$ and $EXT(K_2)$.

Table 1: Processed repository of documents.

w_1  w_2  w_3  w_4  sub_1  sub_2
B    A    A    A    1      1
B    A    A    A    0      1
C    C    A    B    1      0
B    A    A    A    0      1
C    C    A    B    1      0
B    B    C    A    0      0
B    B    C    A    1      0
C    C    A    B    1      1
B    A    A    A    ε      ε
C    C    A    B    ε      ε
B    B    C    B    ε      ε

To give an example, simple subjects will be considered. Let the modality thresholds be set to the following values: $\lambda_{Know} = \lambda_{maxBel} = 1$, $\lambda_{minBel} = \lambda_{maxPos} = 0.60$, and $\lambda_{minPos} = 0.20$. These values are not accidental. Namely, they have been chosen taking into account theorems from the theory of grounding of simple modalities (Katarzyniak, 2005). It follows that the threshold values should preserve the consistency of sets of grounded proto-forms with common-sense interpretation. Below we provide examples of well-grounded proto-forms:

• possible([d] is about [sub]) AND possible([d] is not about [sub])

• believing([d] is about [sub]), BUT STILL possible([d] is not about [sub])

• knowing([d] is about [sub])
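As a worked check, the thresholds chosen above can be applied by hand to two pairs of strengths that appear in (10), namely 0.25/0.75 and 1.00/0.00. The derivation below only restates the grounding definitions of section 3.2 with these numbers; which document and subject each pair belongs to is left open here.

$$
\begin{aligned}
\lambda^{+} = 0.25 &: \quad \lambda_{minPos} = 0.20 \leq 0.25 < 0.60 = \lambda_{maxPos} && \Rightarrow\ \text{possible([d] is about [sub])},\\
\lambda^{-} = 0.75 &: \quad \lambda_{minBel} = 0.60 \leq 0.75 < 1 = \lambda_{maxBel} && \Rightarrow\ \text{believing([d] is not about [sub])},\\
\lambda^{+} = 1.00 &: \quad 1.00 = \lambda_{Know} && \Rightarrow\ \text{knowing([d] is about [sub])},\\
\lambda^{-} = 0.00 &: \quad 0.00 < 0.20 = \lambda_{minPos} && \Rightarrow\ \text{no proto-form is grounded}.
\end{aligned}
$$

Since the intervals $[0.20, 0.60)$, $[0.60, 1)$ and $\{1\}$ do not overlap, each strength grounds at most one operator.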

It is worth mentioning that these proto-forms are logically consistent, which is ensured by the proper choice of modality thresholds. A possible natural language alert founded on the established proto-forms is:

IF SYSTEM ALERT: There is a new [document:

doc

] available. I believe it is about [subject: sub