Towards a Knowledge Graph-specific Definition of Digital
Transformation: An Account Networking View for Auditing
Florina Livia Covaci, Robert Andrei Buchmann and Radu Dragos
Business Information Systems Department, "Babes-Bolyai" University, Cluj-Napoca, Romania
Computer Science Department, "Babes-Bolyai" University, Cluj-Napoca, Romania
Keywords: Knowledge Graph, Accounting Auditing, Digital Transformation, RDF.
Abstract: This paper reports on an experimental Digital Transformation project where RDF graphs are adopted in an
organization’s accounting and document management system as a novel approach to accounting digitization,
going beyond traditional ERP systems to enable account-centric network analysis and more insightful master
data management - having accounting contextualized in a relationship-rich Knowledge Graph that captures
some of the tacit knowledge that accountants and auditors apply during their common tasks. Legacy ERP
systems that are based on relational databases face challenges when aggregating information regarding the
transactions that an account was involved in, which sometimes involve multihop JOINs, links to contextual
documents that may reside elsewhere, or rules that mimic (at least partially) an auditor’s reasoning. The paper
reports on a knowledge capture effort for mapping accounting information into an RDF graph in order to over-
come limitations of legacy systems with auditing support, currently implemented in a feasibility demonstrator
of low Technological Readiness Level. As theoretical implications, we also derive from this experience a
novel, specialized definition of Digital Transformation.
Every organization maintains systematic records of
financial transactions, typically employing an ERP
system to support accounting activities. Records of
financial transactions are carried out based on ac-
counting principles which primarily treat data as ta-
ble structures and reports, although implied semantics
are always present in the form of an accountant’s tacit
knowledge about how accounts participate together
in double-entry records in various situations and how
those entries are co-dependent based on either practi-
cal patterns or accounting-specific regulations. More-
over, occurrence of errors in financial statements is
a problem to be dealt with on a monthly basis and
legacy accounting systems offer limited capabilities
sometimes requiring SQL skills or the support of an
IT department to look for non-compliant patterns in
multi/self-JOIN chains. The objective of this paper
is to report on a Design Science project that adopted
RDF graphs (Bizer C., 2009) as a treatment to a mas-
ter data management problem in a legacy accounting
system (and not only, but the current paper’s scope
is limited to the work of the accounting digitization
An accounting information system should hold
not only financial data records, it should also accu-
mulate and manipulate knowledge (i.e. semantic links
and domain-specific rules), and then use that to mimic
the reasoning and data navigation patterns of an ac-
counting professional. Knowledge representation has
gained importance in recent years because of the rise
of semantic technology advertised as ”Knowledge
Graphs” (KG) which are indicated in recent Gartner
reports as being both an Artificial Intelligence hype
ingredient (Gartner, 2021a) and a data analytics trend
(Gartner, 2021b), with possible ramifications in vari-
ous Knowledge Management aspects.
The work at hand is an effort to translate some of
those qualities to accounting data analysis that is rel-
evant for catching accounting errors or investigating
accounting patterns that an auditor would also look
for, typically by visual scrutiny. Therefore, the prob-
lem statement is hereby formulated in terms of the
Design Science problem template (Wieringa, 2014):
Improve auditing capabilities with an existing
accounting information system (problem context)
by treating it with a Knowledge Graph layer
over existing ledger data (artifact)
to enable quasi-social views focusing on financial
records connectedness (requirement)
in order to enable auditing reasoning patterns
and contextualized accounting data management.
Covaci, F., Buchmann, R. and Dragos, R.
Towards a Knowledge Graph-specific Definition of Digital Transformation: An Account Networking View for Auditing.
DOI: 10.5220/0010875000003116
In Proceedings of the 14th International Conference on Agents and Artificial Intelligence (ICAART 2022) - Volume 3, pages 637-644
ISBN: 978-989-758-547-0; ISSN: 2184-433X
2022 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved
Design science is often used to build new sys-
tems in order to evaluate whether their prescriptions
are feasible and useful, and to gain deeper insights
into the problem being investigated. The field of
accounting information systems can benefit greatly
from this methodology as domain-specific informa-
tion, tacit knowledge and implicit reasoning patterns
can be captured in innovative design decisions.
The larger context for the problem tackled here
is an experimental Digital Transformation project
towards a data-centric IT architecture, moving away
from an application-centric mindset that has led over
time to financial data silos requiring scheduled
synchronization, as well as time-consuming manual
consolidation and verification. To obtain a contextualized account
network, a GraphDB instance (Ontotext, b) is popu-
lated through its OntoRefine plug-in for lifting legacy
tabular data to RDF graphs (Ontotext, c). Seman-
tic queries and SPARQL-based reasoning patterns are
then employed in an account analytics workbench to
build a network of how accounts interact with each
other depending on their co-occurrences in a double-
entry ledger system, and to navigate those relation-
ships according to some patterns of domain-specific
The remainder of the paper is structured as fol-
lows: Section 2 summarizes the Accounting Cycle,
afterwards Section 3 describes the knowledge capture
process and derived semantic patterns. Section 4 dis-
cusses evaluation challenges and Section 5 presents a
SWOT analysis.
In a business, every transaction affects at least two ac-
counts - as a debit (increase in assets) and as a credit
(increase in liability, equity, income). The debit and
credit entries must always be equal. After the trans-
actions are recorded in the general journal the finan-
cial statements are prepared starting with a trial bal-
ance sheet. A balance sheet is a financial snapshot of
a company’s financial position at a specific point in
time. It includes a list of assets, liabilities, and the
difference between the two, known as net worth. The
balance sheet is built on the accounting equation (as-
sets = liabilities + owner’s equity).
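The double-entry constraint and the accounting equation stated above can be checked mechanically. The following is a minimal Python sketch, assuming a simplified journal in which each entry lists its debit and credit lines as (account, amount) pairs. The names JournalEntry, is_balanced and equation_holds, the account labels and the amounts are hypothetical and are not part of the system reported in this paper.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class JournalEntry:
    """One journal entry; debit and credit lines are (account, amount) pairs."""
    debits: tuple
    credits: tuple


def is_balanced(entry: JournalEntry) -> bool:
    """True when the debit total equals the credit total of the entry."""
    total_debit = sum((amount for _, amount in entry.debits), Decimal("0"))
    total_credit = sum((amount for _, amount in entry.credits), Decimal("0"))
    return total_debit == total_credit


def equation_holds(assets: Decimal, liabilities: Decimal, equity: Decimal) -> bool:
    """Accounting equation: assets = liabilities + owner's equity."""
    return assets == liabilities + equity


# Hypothetical purchase with VAT: two debit lines against one supplier credit line.
purchase = JournalEntry(
    debits=(("inventory", Decimal("750")), ("deductible_vat", Decimal("142.5"))),
    credits=(("suppliers", Decimal("892.5")),),
)
# Hypothetical entry with a posting mistake on the credit side.
unbalanced = JournalEntry(
    debits=(("inventory", Decimal("750")),),
    credits=(("suppliers", Decimal("700")),),
)
```

Decimal is used instead of float so that monetary totals compare exactly; a balanced entry is a necessary condition only, since several of the error types discussed later leave the totals equal.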
The accounting cycle is a series of steps that transform
a company's basic financial data into financial
statements. The accounting cycle ensures that the
company's financial statements are consistent, accurate,
and in compliance with official accounting standards.
The accounting cycle consists of the following six steps:
Step 1: Gather and analyze documentation for the
current accounting period, such as receipts, invoices,
and bank statements.
Step 2: The ledger is made up of journal entries,
which are a chronological list of all of a company’s
transactions, written down according to double-entry
accounting procedures. To ensure that the company’s
bookkeeping is always up to date, journal entries are
recorded to the ledger on a continual basis, as soon as
business transactions occur.
Step 3: Prepare a trial balance: this is analyzed
mainly for the accounts that involve expenses and in-
come accounts in correlation with the result of assets
and liabilities accounts. Based on this correlation pos-
sible errors in the recorded transactions are investi-
gated by analyzing the account statement.
The error detecting process requires a significant
amount of time because it implies the human anal-
ysis of multiple inter-related account statements that
reflect an economic event. The following types of
errors may occur in the process of recording
transactions (Renu G., 2013):
A. Clerical Errors. Clerical errors are errors in
recording, posting, totaling, and balancing. Clerical
errors are further separated into two types: (i) omission
errors and (ii) commission errors (e.g. posting
to the wrong account, errors in totaling and balancing,
errors in carrying totals forward to the trial balance and so on).
Clerical errors may or may not have an impact on the trial
balance.
B. Errors of Principle. When commonly accepted
accounting principles are not followed when record-
ing the transactions in the books of account, it is
called a principle error. For example, selecting the in-
correct account head or declaring capital expenditure
as revenue. A trial balance or routine inspection will
not reveal such an inaccuracy. It can only be discov-
ered through searching or independent verification.
C. Compensating Errors. Compensating errors are
ones that occur as a result of previous faults being
compensated for. This is difficult to detect because
the net effect is zero. The totals and posts can all be
checked for these inaccuracies, which have no effect
on the trial balance.
D. Errors of Duplication. Duplication errors arise
when the identical transaction is entered twice in the
books of original entry and hence twice in the ledger
accounts. These have no impact on trial balance.
Step 4: Adjust the identified errors: based on
the identified errors in the previous step the recorded
transactions are adjusted in order to correctly reflect
the economic event.
Step 5: Prepare an adjusted trial balance, with the
adjusted transactions and amounts. This balance sheet
will be the starting point for preparing the company
financial statements.
Step 6: Prepare financial statements that will offer
a view about the performance of the company.
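Errors of duplication, listed under Step 3, do not disturb the trial balance, so they have to be searched for in the records themselves. A minimal Python sketch of such a search follows, assuming each ledger row is a dictionary with debit, credit, amount and date fields; the field names, the ISO date format and the sample rows are hypothetical. Rows that agree on all four fields are only candidates for duplication, since a legitimate transaction may be repeated on the same day.

```python
from collections import Counter


def find_duplicates(transactions):
    """Return the (debit, credit, amount, date) keys that occur more than once."""
    counts = Counter(
        (t["debit"], t["credit"], t["amount"], t["date"]) for t in transactions
    )
    return sorted(key for key, occurrences in counts.items() if occurrences > 1)


# Hypothetical ledger rows; row 3 repeats row 1.
ledger = [
    {"id": 1, "debit": "303 01 00", "credit": "401 01 00", "amount": 750, "date": "2021-03-10"},
    {"id": 2, "debit": "442 60 00", "credit": "401 01 00", "amount": 142.5, "date": "2021-03-10"},
    {"id": 3, "debit": "303 01 00", "credit": "401 01 00", "amount": 750, "date": "2021-03-10"},
]
duplicates = find_duplicates(ledger)
```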
Knowledge on accounting patterns can be a ba-
sis for designing accounting intelligence systems and
this knowledge often relates to aggregating informa-
tion across multi-hop relationships that exist over the
network of interacting accounts. Knowledge Graphs
are a natural machine-readable information structure
that can capture this network.
The data-centric mindset (DCManifesto, 2021) is in
line with an auditor's default mindset, one which reasons
in terms of accounting principles while navigating a data
context. The navigable "data context" is formed of all
the direct and indirect relationships in which a data
point participates, limited by a relevance threshold
(e.g., the number of "hops" in indirect relationships
after which the semantic connectedness becomes too
weak to be of interest for the operational purpose).
This highly synthetic notion of "context" was recently
formulated in (Cagle, 2021) and it is the asset primarily
harnessed during KG-based Digital Transformation.
In the host organization, such context navigation
is not typically supported by the existing ERP’s rigid
tabular views, leaving it to a human auditor to man-
ually browse printed account cards (as suggested in
Figure 1) in a manner that resembles Knowledge
Graph navigation but is performed on paper, a strenu-
ous exploration - at best assisted by search-and-filter
features on tabular views in the ERP system.
The legacy accounting system has limited capabil-
ities (mostly basic form validation) for providing au-
diting support in identifying errors. The identification
of errors is usually made based on the account state-
ment - a periodic summary of that account’s activity.
The individual verification is a time consuming pro-
cess since the number of accounting transactions can
be in a range of 20000-25000/month, but it is fundamentally
a navigation of the graph of account associations,
not unlike the KG browsing that open knowledge
repositories like DBpedia provide in a browser
(DBpedia, 2021).
In order to build the Knowledge Graph for our De-
sign Science project we followed a knowledge capture
approach that involved three sources of knowledge:
Database Engineer staff that contributed to the
development and implementation of the existing
ERP software, thus having a good understanding
of the boundaries between legacy data silos and the
improvised conventions for synchronizing them;
Domain-specific Operators - staff with experience
in financial-accounting field, the actual users of
the legacy systems able to demonstrate how work
is performed;
National legislation - certain accounting rules are
prescribed as double-entry associations in natural
language form - available to domain-specific op-
erators as professional knowledge but not embed-
ded in ERP systems functionality. Although some
changes in legislation are occasionally observed,
they are rather rare and should be easily edited as
association rules that an information system can
automatically confront with ledger records.
The knowledge capture process comprised the fol-
lowing steps:
1. Identification of the main entities (in ER sense)
and corresponding tables of the existing ERP system.
The structure of the tables together with
data samples were isolated in spreadsheets and
subjected to discussions to clarify the (sometimes
cryptic) meaning of data fields;
2. Person-to-person interactions with think aloud
operations recorded to capture the data brows-
ing experience and reasoning assumptions of the
domain-specific operators, isolating the hints they
look for during their monthly verifications;
3. Discussions between the Knowledge Engineers
and the Database Engineers to identify gaps in the
accounting cycle that are not explicitly captured
in current data silos schemas;
4. Retrieval of accounting rules from the national
legislation, those that can take the form of double-
entry association rules - i.e. which accounts are
allowed to credit a given account;
5. Several iterations of this process were necessary
to eliminate understanding gaps between the three
categories of stakeholders (legacy database engineers,
operators and knowledge engineers).
Figure 1: Manual Account Statement Analysis Process.
Graph building was partly manual (the language
of the national legislation does not have reasonable
natural language support and creating such support is
out of the project’s scope) and partly based on the On-
toRefine ETL-like approach provided by GraphDB,
capable of semantic lifting and reconciling tabular
data sources. In the following we will reveal some
of the semantic patterns resulting from this process.
In a traditional ERP each transaction is recorded
in a double-entry ledger based on the raw information
that characterizes it, with attributes such as Debit Ac-
count, Credit Account, Value of transaction, Date of
transaction. Depending on the complexity of the or-
ganization, the recording of the transactions may re-
quire additional information like funding sources or
detailed subtypes of expense/income.
To illustrate the design decisions underlying the
Knowledge Graph approach, in the following we
showcase the accounting transactions required by
the economic event of acquiring inventory items.
The value at registration price of the inventory objects
purchased from third parties, based on the invoice
issued, is recorded in the debit of account 303 01 00
"Inventory items in the warehouse" and in the credit
of account 401 01 00 "Suppliers" (1). At the same time,
the value added tax paid for the purchased inventory
items is recorded in the debit of account 442 60 00
"Deductible value added tax" and in the credit of
account 401 01 00 (2). When the inventory item is put
into use, its price is registered in the debit of
account 303 02 00 "Inventory items in use" and in the
credit of account 303 01 00 (3). The payment of the
invoice is registered in the debit of account 401 00 00
and in the credit of account 770 00 00 "Available funds" (4).
When the inventory item is out of use the value at reg-
istration price of the inventory objects is registered in
the debit of account 603 00 00 ”Expenditure on in-
ventory items” and the credit of account 302 00 00
Table 1 provides a summary of the transactions
described above, in a manner similar to how they are
recorded in a table of a relational database.
Such tables were converted into a network of ac-
counts whose pairing can be weighted and enriched
by a number of aggregate properties. The work does
not currently employ a full-fledged ontology (e.g.
FIBO) as it focuses on SPARQL-based RDF-star reasoning
that can enrich the graph or aggregate information
over relationship-rich data. Figure 2 shows some
of these records in a linked data form revealing chains
of how accounts influence each other by participating
in the same ledger double-entry.
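To make the conversion from tabular records to linked data concrete, the following Python sketch turns ledger rows into subject-predicate-object triples. It is an illustration only and not the OntoRefine mapping used in the project: the function lift_row and the :transaction_N subject naming are hypothetical, while the predicate names follow those used in the queries of this paper and the two sample rows are taken from Table 1.

```python
def lift_row(row):
    """Turn one ledger row into (subject, predicate, object) triples."""
    subject = f":transaction_{row['id']}"
    return [
        # Account codes become resources; spaces are replaced to form local names.
        (subject, ":debitAccount", ":" + row["debit"].replace(" ", "_")),
        (subject, ":creditAccount", ":" + row["credit"].replace(" ", "_")),
        # Amount and date stay literal values attached to the transaction.
        (subject, ":amount", row["amount"]),
        (subject, ":date", row["date"]),
    ]


rows = [
    {"id": 1, "debit": "303 01 00", "credit": "401 01 00", "amount": 750, "date": "03-10-2021"},
    {"id": 3, "debit": "303 02 00", "credit": "303 01 00", "amount": 750, "date": "03-17-2021"},
]
triples = [triple for row in rows for triple in lift_row(row)]
```

Because account 303 01 00 appears as debit account in one row and as credit account in the other, the two transactions become connected through a shared node once lifted, which is what enables the chained navigation discussed in this section.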
Regulations on how accounts can be associated
are available in the accounting law and belong to the
profession’s domain knowledge - they form a depen-
dency graph that can support certain levels of error
checking. Figure 3 shows a graph fragment with ac-
counting rules related to the recorded transactions in
Figure 2, also suggesting on the left side the national
law (in textual form) from which these rules have
been derived.
The knowledge graph can be exploited for iden-
tifying errors using SPARQL-based reasoning rules
which are deterministic. For compliance with the na-
tional law, the graph fragment with accounting rules
in Figure 3 becomes a reasoning premise to detect
principle accounting errors:
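# Flag transactions whose debit/credit pairing is not covered by a rule
# (negation assumed: a pairing absent from the rules graph is an error)
INSERT {?transaction a :principleError}
WHERE {
  ?transaction :debitAccount ?cd;
               :creditAccount ?cc.
  FILTER NOT EXISTS {GRAPH :AccountingRules {?cd
    :mayHaveCreditEntry ?cc}}
}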
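The same check can be sketched outside the triple store. The Python fragment below assumes a closed-world reading of the rules, i.e. a debit/credit pairing that is not listed is treated as a principle error. The rule set ALLOWED, the function principle_errors and transaction 9 are hypothetical and are not taken from the national legislation.

```python
# Hypothetical rule set: debit account -> accounts allowed as its credit entry.
ALLOWED = {
    "303 01 00": {"401 01 00"},
    "442 60 00": {"401 01 00"},
    "303 02 00": {"303 01 00"},
}


def principle_errors(transactions, allowed):
    """Return the ids of transactions whose pairing is not covered by any rule."""
    return [
        t["id"]
        for t in transactions
        if t["credit"] not in allowed.get(t["debit"], set())
    ]


ledger = [
    {"id": 1, "debit": "303 01 00", "credit": "401 01 00"},
    {"id": 2, "debit": "442 60 00", "credit": "401 01 00"},
    {"id": 9, "debit": "303 02 00", "credit": "770 00 00"},  # hypothetical non-compliant pairing
]
flagged = principle_errors(ledger, ALLOWED)
```

Keeping the rules as data rather than as code mirrors the design choice of storing them in a separate named graph, so that a change in legislation only requires editing the rule set.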
Table 1: Accounting transactions related to a purchase of inventory items.
Transaction ID | Debit Account | Credit Account | Transaction Amount | Transaction Date | Other relevant attrs (e.g. doc links)
1              | 303 01 00     | 401 01 00      | 750                | 03-10-2021       | ...
2              | 442 60 00     | 401 01 00      | 142.5              | 03-10-2021       | ...
3              | 303 02 00     | 303 01 00      | 750                | 03-17-2021       | ...
4              | 401 00 00     | 770 00 00      | 892.5              | 04-10-2021       | ...
5              | 603 00 00     | 302 00 00      | 750                | 11-10-2021       | ...
Figure 2: Knowledge Graph fragment lifted from legacy transaction data.
We go beyond error-checking to employ the RDF
approach for richer, relationship-focused analysis. A
network of accounts is generated by the following
rule based on two accounts participating in the same
double-entry transaction:
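# Total amount exchanged between each debit/credit pair; plain SUM is used
# so that equal amounts from different transactions are all counted
INSERT { << ?x :hasCreditEntry ?y >>
    :totalMonetaryAmount ?z }
WHERE {
  SELECT ?x ?y (SUM(?amount) AS ?z)
  WHERE {?transaction :debitAccount ?x;
    :creditAccount ?y; :amount ?amount}
  GROUP BY ?x ?y
}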
In addition each co-participation is marked using
the RDF-star extension that allows RDF statements
to have their own properties - i.e., properties of prop-
erty instances aimed to close the functional gap be-
tween RDF graphs and labelled property graphs (Har-
tig, 2021). The rule example above computes the to-
tal amount involved by all interactions between the
connected accounts. Similarly other valuable aggre-
gations may be obtained as suggested in Figure 4 (e.g.
lists of partner organizations, lists of links to primary
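The aggregation over account pairs can be illustrated with a short Python sketch, assuming the ledger is a list of dictionaries. The function total_by_pair and transaction 6 are hypothetical. Every transaction contributes its amount, including transactions that happen to carry the same value between the same two accounts.

```python
from collections import defaultdict
from decimal import Decimal


def total_by_pair(transactions):
    """Sum transaction amounts per (debit account, credit account) pair."""
    totals = defaultdict(Decimal)  # Decimal() is zero, the neutral start value
    for t in transactions:
        totals[(t["debit"], t["credit"])] += Decimal(str(t["amount"]))
    return dict(totals)


ledger = [
    {"id": 1, "debit": "303 01 00", "credit": "401 01 00", "amount": 750},
    {"id": 2, "debit": "442 60 00", "credit": "401 01 00", "amount": 142.5},
    {"id": 6, "debit": "303 01 00", "credit": "401 01 00", "amount": 750},  # hypothetical second purchase
]
totals = total_by_pair(ledger)
```

Each resulting total corresponds to a property value attached to one edge of the account network, in the way the RDF-star annotation attaches it to a statement.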
In order to query the graph about all the accounts
that have directly or indirectly influenced account
401 01 00 we can use the following SPARQL query:
SELECT ?x
WHERE {:401_01_00 (^:hasCreditEntry|
:hasCreditEntry)+ ?x}
In contrast, in a relational database a query about
all the accounts that 401 00 00 was involved with
in the same operation needs to perform n inner join
operations on the accounting transactions table. The
challenge of performing such a query has to do, among
others, with the indefinite value of n (depending on
the account that we query on), which makes it an
ideal case for the navigation of the accounts' network.
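The reason why the number of joins is not known in advance becomes visible when the navigation is written as a graph traversal. The sketch below is a plain breadth-first search in Python and does not reproduce SPARQL property path semantics exactly; the function connected_accounts is hypothetical, the pairs are the debit/credit pairs of Table 1, and links are followed in either direction, in the spirit of the inverse/forward alternative of the path query.

```python
from collections import defaultdict, deque


def connected_accounts(pairs, start):
    """Accounts reachable from start over debit/credit links, in either direction."""
    neighbours = defaultdict(set)
    for debit, credit in pairs:
        neighbours[debit].add(credit)
        neighbours[credit].add(debit)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}  # the start account itself is left out


pairs = [
    ("303 01 00", "401 01 00"),
    ("442 60 00", "401 01 00"),
    ("303 02 00", "303 01 00"),
    ("401 00 00", "770 00 00"),
    ("603 00 00", "302 00 00"),
]
reached = connected_accounts(pairs, "401 01 00")
```

The loop runs until no new account is found, so the depth of the traversal depends on the data rather than on a fixed number of self-joins written into the query.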
Figure 3: Transformation of legal accounting norms to Knowledge Graph.
Figure 4: Networked View on the Account Interactions.
Since the network is not separated, but kept together
with the legacy system from which it was derived,
navigation through the network may also collect relevant
data attached to every operational pair of accounts
(amounts, document links etc.). Detection of
arbitrary-length paths has recently been made available
as a plug-in service on GraphDB (Ontotext, a); due to
weak default support for path detection in the
SPARQL standard, our early-stage attempts had
to involve an overhead of network analysis libraries.
Multihop directed chains of relationships can now be
detected and highlighted - e.g. the shortest path connecting
two recorded transactions through a chain of
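# PREFIX and SERVICE IRIs are assumed from the GraphDB path search documentation
PREFIX path: <http://www.ontotext.com/path#>
SELECT ?start ?property ?end ?index
WHERE {
  SERVICE <http://www.ontotext.com/path#search> {
    <urn:path> path:findPath path:shortestPath;
      path:sourceNode ?pathSource;
      path:destinationNode ?pathDestination;
      path:startNode ?start;
      path:propertyBinding ?property;
      path:endNode ?end;
      path:resultBindingIndex ?index. }
  # The pattern binding ?pathSource and ?pathDestination
  # to the two endpoints is truncated in the source.
}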
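For readers without access to the plug-in, the idea of a shortest directed chain can be sketched as a breadth-first search with parent pointers. This is a generic textbook technique and not the algorithm implemented by the GraphDB service; the function shortest_path is hypothetical, and the edges are the debit-to-credit pairs of Table 1.

```python
from collections import defaultdict, deque


def shortest_path(edges, source, destination):
    """Shortest directed chain of accounts from source to destination, or None."""
    successors = defaultdict(list)
    for debit, credit in edges:
        successors[debit].append(credit)
    parent = {source: None}  # also serves as the set of visited nodes
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == destination:
            # Walk the parent pointers back to the source, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in successors[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


edges = [
    ("303 01 00", "401 01 00"),
    ("442 60 00", "401 01 00"),
    ("303 02 00", "303 01 00"),
    ("401 00 00", "770 00 00"),
    ("603 00 00", "302 00 00"),
]
chain = shortest_path(edges, "303 02 00", "401 01 00")
```

Since the edges are directed from debit account to credit account, no chain exists in the opposite direction for this sample, and the function returns None in that case.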
Traditional tabular analytics are not excluded - the
following builds a time series of all transactions using
some reference account as debit account:
SELECT (SUM(?amount) AS ?dailyAmount) ?date
{?transaction :debitAccount :603_02_00;
:amount ?amount; :date ?date}
GROUP BY ?date
ORDER BY ?date
These are some relevant patterns prescribed for
the analysis workbench, which makes it possible to
gain a networking view on account interactions without
making use of any network analysis tools. A
rudimentary interface for graph browsing/path highlighting,
CONSTRUCT-ing subgraphs and running
SPARQL* queries was built on top to showcase key
operations (see Fig. 5).
Design Science artifacts can be evaluated according
to a large variety of criteria which have been orga-
nized in a taxonomy by (Prat-N., 2014). Limited by
the current Technological Readiness Level of the pro-
posed solution, we are currently focusing on:
Consistency with organization: the proposed artifact
satisfies the set of competency questions derived
from interviews with operators (accountants) of the
legacy ERP systems. The competency questions are
answered by the queries and reasoning patterns detailed
in the previous section.
Figure 5: Account Graph Navigator.
Consistency with technology: a Knowledge Graph can be
populated with legacy data in several ways: either an
ETL strategy lifts legacy data to a new form of data
warehouse, or a virtual graph layer is deployed to unify
access to existing data. The project reported here
employs for now the first approach, with the help of
GraphDB's OntoRefine RDFizer (Ontotext, c). The legacy ERP
system is however very limited in terms of interoperability
channels, and a future phase of the research
aims to build demonstrators of accounting operations
that can advocate replacement of the legacy systems;
obstacles in this respect are not only technological,
but also have to do with extensive reskilling and change
A second tier of evaluation priorities is postponed
for future work, i.e., consistency with people: no data
is currently available on technology acceptance, as
the proposed artifact still requires some technical
understanding of graph technology and query-level
interaction that should be hidden entirely behind a
graph-aware user experience. However, current iterations
focused on feasibility as a prerequisite for
acceptance. Digitally savvy human capital is needed
in any Digital Transformation project; however, KG
savviness is a major challenge due to lack of educational
content: less than ten percent of the project
team had prior awareness of KG and a major effort
of upskilling developers was needed. This effort
should not be further pushed to end-users, for whom
a graph-driven user experience must be designed - a
distinct challenge that will be tackled by follow-up
We follow the practice of Design Science re-
search of trying to inform design-based theorizing,
by proposing the following key characteristic for a
KG-based Digital Transformation: it is a form of in-
formation system transformation supported by digital
means that specifically provide - both to relationship-
rich data governance and to relationship-aware dig-
ital assets (services, agents etc.) - an ability to nav-
igate data context, as it was defined at the beginning
of Section 3.
Knowledge representation and reasoning are inspired
by human problem solving, and can empower intel-
ligent systems to mimic deterministic pattern seek-
ing. Considering the current Technological Readiness
Level of the proposal, we summarize a SWOT evalu-
ation to inform future iterations of the proposal, as the
project is open-ended and will evolve towards further
data integrability with heterogeneous sources that can
enrich the discussed patterns:
Strengths: We defined a method of mapping existing
accounting transactions and accountants' knowledge
to semantic query patterns based on RDF graphs
and the RDF-star extension. These patterns exploit a
networked view of account interactions based on their
co-occurrence in the same ledger double-entry, or
check them against regulatory constraints that are
domain-specific and only available in textual regulations.
Weaknesses: The current proof-of-concept is at a low
Technological Readiness Level. An ontology is still
to be designed to further enrich the graph; we are
considering FIBO (FIBO, 2021), although it currently
appears to be overkill relative to the project
objectives. For the work reported here we are
not inclined towards developing a novel ontology,
as data was lifted from legacy SQL-based silos
and inherits concepts and properties from the source
schemas. The focus of the reported project iteration
was on fulfilment of certain key use cases under prototypical
feasibility conditions, in order to theorize on the Digital
Transformation characteristics that were distilled
in Section 5.
Opportunities: Detection of complex fraudulent patterns
may be devised and prescribed as graph-driven features. The
accounting ecosystem is fundamentally link-oriented,
something already demonstrated by the XBRL standard
(XBRLInternational), which heavily relies on
XLink - we conclude that XBRL was the XML-era
compromise for a tentative accounting-specific
knowledge representation. Since financial authorities are
preoccupied with cross-checking business activities
that are fundamentally federated across partner
businesses, Knowledge Graphs may be a key enabler
for cross-organization auditing if such technology is
adopted at large.
Threats: The uptake of Knowledge Graphs is still limited
and the data analytics methods practiced by auditors
are still fundamentally table-biased. Promoting
a mindset based on semantic networks (in terms of the
accounting domain) meets the obstacle that most accountants
have no awareness of the existence of non-tabular
data models, even if they are closer to how
accounting principles are applied by a human operator.
As future work, we propose to extend this approach
in order to identify possible benefits with respect to
fraud detection.
The current work was supported by the project
POC/398/1/1/124155 - co-financed by the European
Regional Development Fund (ERDF) through the
Competitiveness Operational Programme for Roma-
nia 2014-2020.
Bizer C., Heath T., Berners-Lee T. (2009). Linked data - the story so far. In International Journal on Semantic Web and Information Systems.
Cagle, K. (Accessed June 20th, 2021). The role of context in data.
DBpedia (Accessed April 1st, 2021).
DCManifesto (Accessed April 1st, 2021). The Data-centric Manifesto.
FIBO (Accessed April 1st, 2021). Enterprise Data Management Council. FIBO: the Financial Industry Business Ontology.
Gartner (Accessed April 1st, 2021a). Gartner: 2 megatrends dominate the Gartner hype cycle for artificial intelligence.
Gartner (Accessed April 1st, 2021b). Gartner: Top 10 data and analytics trends for 2021.
Hartig, O. (Accessed April 1st, 2021). Position statement: RDF* and SPARQL*.
Ontotext (a). Graph path search service on GraphDB.
Ontotext (b). GraphDB.
Ontotext (c). OntoRefine.
Prat-N., Comyn-Wattiau I., A. J. (2014). Artifact evaluation in information systems design science research - a holistic view. In PACIS 2014 Proceedings.
Renu G., B. M. (2013). Errors and frauds in financial transactions: Auditors opinion. In The Global eLearning
Wieringa, R. (2014). Design science methodology for information systems and software engineering. Springer.
XBRLInternational. The XBRL standard.