Context-aware Retrieval and Classification: Design and Benefits

Kurt Englmeier

Abstract

Context is the classification of an environment by its key attributes. It is an abstract representation of a particular data environment. In texts, the context classifies and represents a piece of text in a generalized form. Context can be a recursive construct when text is summarized on a coarser-grained level. Context-aware information retrieval and classification has many aspects. This paper presents the identification and standardization of context on different levels of granularity, which support faster and more precise location of relevant text sections. The prototypical system presented here applies supervised learning in a semiautomatic approach to extract, distil, and standardize data from text. The approach is based on named-entity recognition and simple ontologies for the identification and disambiguation of context. Even though the prototype shown here still represents work in progress, it demonstrates its potential for information retrieval on different levels of context granularity. The paper presents the application of the prototype in the realms of economic information and hate speech detection.
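
The idea of context on different levels of granularity can be illustrated with a minimal sketch. It is not the prototype described in the paper: it replaces supervised named-entity recognition with a plain dictionary lookup, and the lexicon, the ontology, and the names `ENTITY_LEXICON`, `BROADER`, `find_entities`, and `context_of` are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical toy lexicon: surface term -> (entity type, fine-grained context).
# A real system would learn or curate this; here it is hard-coded for illustration.
ENTITY_LEXICON = {
    "inflation": ("INDICATOR", "prices"),
    "consumer prices": ("INDICATOR", "prices"),
    "unemployment": ("INDICATOR", "labour market"),
    "wages": ("INDICATOR", "labour market"),
}

# Hypothetical simple ontology: fine-grained context -> coarser context.
BROADER = {"prices": "economy", "labour market": "economy"}


def find_entities(text):
    """Return (offset, term, entity type, context) for every lexicon term found."""
    lowered = text.lower()
    hits = []
    for term, (etype, ctx) in ENTITY_LEXICON.items():
        for match in re.finditer(r"\b" + re.escape(term) + r"\b", lowered):
            hits.append((match.start(), term, etype, ctx))
    return sorted(hits)


def context_of(text, level="fine"):
    """Return the most frequent context label of a text, or None if no entity matches.

    level="fine" uses the labels attached to the entities;
    level="coarse" first maps each label to its broader concept.
    """
    counts = Counter(ctx for _, _, _, ctx in find_entities(text))
    if level == "coarse":
        coarse = Counter()
        for ctx, n in counts.items():
            coarse[BROADER.get(ctx, ctx)] += n
        counts = coarse
    return counts.most_common(1)[0][0] if counts else None
```

Under these assumptions, `context_of("Unemployment fell while wages rose.")` yields the fine-grained label `"labour market"`, while the same call with `level="coarse"` yields `"economy"`. The mapping through `BROADER` is what makes the context recursive in the sense used above: a label on one level is summarized by a label on the next coarser level.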
