Semantic Prompting over Knowledge Graphs
for Next-Generation Recommender Systems
Antony Seabra, Claudio Cavalcante and Sergio Lifschitz
Departamento de Informatica, PUC-Rio, Brazil
Keywords:
Recommender Systems, Knowledge Graphs, Large Language Models (LLMs), Semantic Prompt Generation,
RDF Triples, Natural Language Interfaces.
Abstract:
This paper presents a novel recommender system framework that integrates Knowledge Graphs (KGs) and
Large Language Models (LLMs) through dynamic semantic prompt generation. Rather than relying on static
templates or embeddings alone, the system dynamically constructs natural language prompts by traversing
RDF-based knowledge graphs and extracting relevant entity relationships tailored to the user and recommen-
dation task. These semantically enriched prompts serve as the interface between structured knowledge and the
generative capabilities of LLMs, enabling more coherent and context-aware suggestions. We validate our ap-
proach in three practical scenarios: personalized product recommendation, identification of users for targeted
marketing, and product bundling optimization. Results demonstrate that aligning prompt construction with
domain semantics significantly improves recommendation quality and consistency. The paper also discusses
strategies for prompt generation, template abstraction, and knowledge selection, highlighting their impact on
the robustness and adaptability of the system.
1 INTRODUCTION
Recommender systems play a key role in tailor-
ing digital experiences across domains such as e-
commerce, media streaming, and online services.
By leveraging user profiles, contextual signals, and
historical interactions, these systems aim to suggest
items that align with user interests, thus boosting en-
gagement and driving decision-making. Traditional
recommendation approaches—ranging from collab-
orative filtering to content-based methods—have
evolved significantly with the integration of semantic
knowledge and natural language technologies.
Recent advances in Large Language Models
(LLMs) have transformed the field of natural lan-
guage processing, enabling models to interpret, gen-
erate, and reason over text with remarkable fluency.
This progress has opened new avenues for building in-
telligent, conversational recommendation interfaces.
However, LLMs alone lack domain-specific ground-
ing, which can lead to generic or inconsistent sugges-
tions when applied to structured decision-making sce-
narios.
Knowledge Graphs (KGs) offer a powerful mech-
anism to enrich recommendation processes with do-
main semantics. By organizing information into
entities and relationships using formal representa-
tions such as Resource Description Framework (RDF)
triples, KGs capture intricate, structured knowledge
about products, users, and their interrelations. The in-
tegration of KGs with LLMs has the potential to com-
bine the expressiveness of natural language with the
precision of structured data.
In this work, we propose a hybrid recommenda-
tion framework that bridges KGs and LLMs through
dynamic semantic prompt generation. Instead of stat-
ically encoding knowledge into embeddings or manu-
ally defining rules, our system dynamically traverses
RDF graphs to extract relevant information, which
is then used to formulate natural language prompts
tailored to the recommendation task at hand. These
prompts guide the LLM in generating contextually
aligned and semantically grounded recommendations.
The central research questions addressed in this
paper are:
P1. How can semantic knowledge from a Knowl-
edge Graph be dynamically transformed into prompts
that effectively guide an LLM?
P2. How can the alignment between task-specific
goals and KG-derived semantics improve the quality
of recommendations generated by LLMs?
P3. What are the implementation strategies and ar-
chitectural components required to integrate semantic
prompting into a web-based recommender system?
To address these questions, we design and eval-
uate a recommendation pipeline that operationalizes
RDF triples as input for LLM-guided reasoning via
prompt engineering. The system is validated through
three application scenarios involving product sugges-
tion, promotional targeting, and bundling strategies.
The remainder of the paper is organized as fol-
lows: Section 2 provides background on recom-
mender systems, semantic technologies, and LLM in-
tegration. Section 3 details our methodology for se-
mantic prompt generation. Section 4 introduces the
system architecture and implementation. Section 5
presents the experimental evaluation. Section 6 re-
views related work, and Section 7 concludes with fi-
nal remarks and directions for future research.
2 BACKGROUND
2.1 Recommender Systems
Recommender systems have become a crucial com-
ponent of many online platforms, offering personal-
ized suggestions to users based on their preferences
and behaviors. Over the years, various types of rec-
ommender systems have been developed, each with
unique strengths and weaknesses. The three main
types are collaborative filtering, content-based filter-
ing, and hybrid approaches. Each of these methods
employs different techniques to generate recommen-
dations and addresses different aspects of the recom-
mendation problem (Ricci et al., 2010).
Collaborative filtering is one of the most widely
used techniques in recommender systems. It works
by analyzing user behavior and preferences, typically
through user-item interaction matrices, to find sim-
ilarities between users or items (Su and Khoshgof-
taar, 2009). The two primary approaches within col-
laborative filtering are user-based and item-based fil-
tering. User-based filtering recommends items to a
user based on the preferences of similar users, while
item-based filtering suggests items similar to those the
user has previously liked. The strengths of collabora-
tive filtering include its ability to provide recommen-
dations without needing explicit content information
and its effectiveness in leveraging the wisdom of the
crowd. However, it suffers from the cold start prob-
lem, where new users or items with insufficient inter-
actions are challenging to recommend accurately, and
it can struggle with sparsity in the user-item interac-
tion matrix (Schafer et al., 2007).
Content-based filtering, on the other hand, relies
on the features of the items themselves to make rec-
ommendations (Lops et al., 2011). This approach
builds user profiles based on the attributes of items
they have previously interacted with and recommends
new items that share similar characteristics. Content-
based filtering is particularly effective in domains
where item features are well-defined and structured,
such as in recommending movies based on genres,
actors, and directors. One of the main advantages of
content-based filtering is its ability to handle the cold
start problem more effectively for new items, as long
as their features are known. However, it has limita-
tions in terms of recommendation diversity, as it tends
to suggest items that are too similar to those the user
has already seen, potentially leading to a narrow user
experience (Aggarwal et al., 2016).
Hybrid recommender systems combine the
strengths of collaborative filtering and content-based
filtering to overcome their individual limitations
(Burke, 2002). By integrating multiple recommen-
dation strategies, hybrid systems can provide more
accurate and diverse suggestions. These systems can
use various methods to combine recommendations,
such as switching between techniques based on
the context, weighting the contributions of differ-
ent methods, or blending their outputs. Hybrid
systems can address the cold start problem more
effectively by using content-based approaches for
new items and collaborative filtering for items with
sufficient interaction data. They also tend to provide
a better balance between relevance and diversity in
recommendations. However, hybrid systems can
be more complex to implement and require more
computational resources, making them potentially
more expensive to deploy and maintain (Zhang et al.,
2019).
In recent years, the development of deep learning
and advanced machine learning techniques has further
enhanced the capabilities of recommender systems.
Techniques such as neural collaborative filtering,
graph-based recommendation, and the use of knowl-
edge graphs have shown promising results in improv-
ing recommendation accuracy and explainability (He
et al., 2017) and (Wang et al., 2019). These advanced
methods leverage large-scale datasets and complex
models to capture intricate patterns in user behav-
ior and item characteristics. While they offer signif-
icant improvements in performance, they also intro-
duce challenges related to model interpretability and
the need for substantial computational resources. As
recommender systems continue to evolve, balancing
accuracy, diversity, explainability, and resource effi-
ciency remains a key focus for researchers and prac-
titioners in the field (Wang et al., 2020).
2.2 Knowledge Graphs
Knowledge Graphs (KGs) are structured represen-
tations of knowledge that connect entities, such as
people, places, and concepts, through relationships
or edges. These graphs consist of nodes represent-
ing entities and edges depicting the relationships be-
tween them (Hogan et al., 2021). The primary pur-
pose of KGs is to provide a comprehensive and inter-
connected view of knowledge, allowing for efficient
querying and inference. Knowledge Graphs lever-
age semantic information to create meaningful con-
nections and are widely used in various applications,
including search engines, natural language process-
ing, and AI systems. Their ability to integrate and
organize vast amounts of heterogeneous data makes
them valuable tools for managing complex informa-
tion landscapes (Paulheim, 2017).
The construction of Knowledge Graphs involves
several key processes, such as entity extraction, re-
lationship extraction, and graph embedding. En-
tity extraction identifies and categorizes entities from
unstructured data sources, while relationship extrac-
tion identifies the connections between these entities
(Shen et al., 2014). Graph embedding techniques then
transform the graph structure into low-dimensional
vector spaces, enabling machine learning algorithms
to process and analyze the data effectively. Knowl-
edge Graphs can be manually curated, automatically
generated, or created using a combination of both ap-
proaches (Nickel et al., 2015). The rich, intercon-
nected nature of KGs enables advanced data analysis,
supporting tasks like link prediction, entity resolution,
and semantic search (Ji et al., 2021).
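To make the triple structure concrete, a minimal sketch using the rdflib Python library (with invented example entities, not drawn from any particular dataset) could look as follows:

from rdflib import Graph, Namespace

# A tiny illustrative Knowledge Graph: nodes are entities, edges are named
# relationships, and every edge is one (subject, predicate, object) triple.
EX = Namespace("http://example.org/")
g = Graph()
g.add((EX.David, EX.buy, EX.Notebook))
g.add((EX.Notebook, EX.belongs_to_brand, EX.HP))
g.add((EX.Notebook, EX.also_buy, EX.Printer))

# Iterating over the graph yields the raw triples.
for subject, predicate, obj in g:
    print(subject, predicate, obj)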
In the context of recommender systems, Knowl-
edge Graphs enhance recommendation quality by
providing additional contextual information and rela-
tionships between items. By integrating KGs, recom-
mender systems can move beyond simple user-item
interactions and incorporate richer data about item
attributes, user preferences, and domain knowledge
(Zhang et al., 2019). For instance, a movie recom-
mendation system can leverage a KG to understand
relationships between actors, directors, genres, and
user ratings, allowing it to generate more nuanced and
accurate recommendations. The semantic relation-
ships captured in KGs enable the system to make in-
ferences and discover hidden patterns, leading to im-
proved recommendation diversity and relevance (Cao
et al., 2019).
Furthermore, Knowledge Graphs facilitate ex-
plainability in recommender systems by providing
transparent and interpretable insights into the rec-
ommendation process. When a recommendation is
made, the system can trace the reasoning through
the KG, offering explanations such as ”This movie is
recommended because it shares similar themes with
movies you have previously enjoyed and features an
actor you frequently watch” (Wang et al., 2018). This
transparency enhances user trust and satisfaction, as
users can understand the rationale behind the recom-
mendations. Additionally, KGs help address the cold
start problem by leveraging the semantic relationships
to recommend new items based on their attributes and
connections within the graph. As a result, integrating
Knowledge Graphs into recommender systems not
only improves overall accuracy but also boosts user
engagement and trust through explainable AI.
2.3 Large Language Models
Large Language Models (LLMs) have revolutionized
the field of Natural Language Processing (NLP) with
their ability to understand and generate human-like
text. At the heart of the most advanced LLMs is
the Transformers architecture, a deep learning model
introduced in the seminal paper Attention Is All You
Need by (Vaswani et al., 2017). Transformers lever-
age a mechanism called attention, which allows the
model to weigh the influence of different parts of the
input data at different times, effectively enabling it to
focus on relevant parts of the text when making pre-
dictions.
Prior to Transformers, Recurrent Neural Networks
(RNNs) and their variants like Long Short-Term
Memory (LSTM) networks were the standard in NLP.
These architectures processed input data sequentially,
which naturally aligned with the sequential nature of
language. However, they had limitations, particularly
in dealing with long-range dependencies within text
due to issues like vanishing gradients (Pascanu et al.,
2013). Transformers overcome these challenges by
processing all parts of the input data in parallel, dras-
tically improving the model’s ability to handle long-
distance relationships in text.
Chat models, a subset of LLMs, are special-
ized in generating conversational text that is coher-
ent and contextually appropriate. This specialization
is achieved through the training process, where the
models are fed vast amounts of conversational data,
enabling them to learn the nuances of dialogue. Chat-
GPT, for instance, is fine-tuned on a dataset of conver-
sational exchanges and it was optimized for dialogue
by using Reinforcement Learning from Human Feedback (RLHF), a method that uses human demonstra-
tions and preference comparisons to guide the model
toward desired behavior (OpenAI, 2023a).
The transformative impact of LLMs, and partic-
ularly those built on the Transformers architecture,
has been profound. By moving away from the con-
straints of sequential data processing and embracing
parallelization and attention mechanisms, these mod-
els have set new standards for what is possible in the
realm of NLP. With the ability to augment generation
with external data or specialize through fine-tuning,
LLMs have become not just tools for language gen-
eration but platforms for building highly specialized,
knowledge-rich applications that can retrieve information in a dialogue-like way and generate insights for decision making.
The ability to augment the generation capabilities
of LLMs using enriched context from external data
sources is a significant advancement in AI-driven sys-
tems. An LLM context refers to the surrounding in-
formation provided to an LLM to enhance its under-
standing and response generation capabilities. This
context can include a wide array of data, such as text
passages, structured data, and external data sources
like Knowledge Graphs. Utilizing these external data
sources allows the LLM to generate more accurate
and relevant responses without the need for retrain-
ing. By providing detailed context, such as product
attributes, user reviews, or categorical data, the model
can produce insights that are tailored and contextually
aware.
2.4 Prompt Engineering
One key aspect of providing context to LLMs is the ability to design and optimize prompts that guide LLMs in generating answers, a practice known as Prompt Engineering. Its main goal is to maximize the potential of LLMs by providing them with instructions and context (OpenAI, 2023b).
In the realm of Prompt Engineering, instructions
are the crucial first steps. Through them, engineers
can detail the roadmap to an answer, outlining the de-
sired task, style and format for the LLM’s response
(White et al., 2023). For instance, to define the style
of a conversation, a prompt could be phrased as ”Use
professional language and address the client respect-
fully” or ”Use informal language and emojis to con-
vey a friendly tone”. To specify the format of dates
in answers, a prompt instruction could be ”Use the
American format, MM/DD/YYYY, for all dates”.
On the other hand, as mentioned earlier, context
refers to the information provided to LLMs alongside
the core instructions. The most important aspect of
a context is that it can provide information that sup-
ports the answer given by the LLM, and it is very
useful when implementing question-answering sys-
tems. This supplemental context can be presented in
various formats. One particularly effective format is
RDF triples, which represent information as subject-
predicate-object statements. RDF triples are a stan-
dardized way of encoding structured data about enti-
ties and their relationships, making them ideal for em-
bedding precise information into prompts. By includ-
ing RDF triples in a prompt, we can clearly convey
complex relationships and attributes in a format that
the LLM can easily process, leading to more accurate
and relevant responses. According to (Wang et al.,
2023), prompts provide guidance to ensure that Chat-
GPT generates responses aligned with the user’s in-
tent. As a result, well-engineered prompts greatly im-
prove the efficacy and appropriateness of ChatGPT’s
responses.
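As a minimal illustration (the entities and wording are placeholders, not the system's actual prompt), a few RDF-style triples can be embedded as context next to an instruction like so:

# Sketch: an instruction plus RDF-style triples supplied as context.
# Entity names and phrasing are illustrative placeholders.
triples = [
    ("David", "buy", "Notebook"),
    ("Notebook", "also_buy", "Mouse"),
    ("Notebook", "belongs_to_brand", "HP"),
]
context = "\n".join(f"({s}, {p}, {o})" for s, p, o in triples)

prompt = (
    "Use professional language and address the client respectfully.\n"
    "Consider the following relationships represented as triples:\n"
    f"{context}\n"
    "Question: What products could David buy?"
)
print(prompt)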
3 METHODOLOGY
Our proposal in this study is to use the LLM's context and Prompt Engineering to build a recommender system based on a Knowledge Graph that integrates data on products, features, categories, brands, sales and users. Figure 1 shows a segment of this KG, with relationships between a ”Notebook” and other entities such as user ”David”, brand ”HP”, products ”Printer” and ”Router”, and feature ”portable”. The KG structures the relationships between these entities, which enables the generation of a context to be presented to an LLM.
Figure 1: Consumer behaviour graph.
There are various types of relationships that ex-
ist within the KG used in our recommender system.
The following table summarizes them. Each row rep-
resents a specific relationship type, connecting differ-
ent types of entities. The relationship ”buy” indicates
that a user has purchased a particular product. The
relationship ”mention” originating in ”User” captures
instances where a user has mentioned specific features
of a product in their comments or reviews. It provides
insight into what aspects of a product are important to
users.
Table 1: Summary of relationship types in the Knowledge
Graph.
Entity Type 1 Relationship Entity Type 2
User buy Product
User mention Feature
Product mention Feature
Product also buy Product
Product also view Product
Product belongs to category Category
Product belongs to brand Brand
The relationship ”mention” originating in ”Prod-
uct” shows which features are associated with spe-
cific products based on user comments and reviews.
It helps in understanding the attributes and character-
istics commonly linked to products. The ”also buy”
relationship indicates that users who bought one prod-
uct also bought another product. It is useful for iden-
tifying complementary products and making bundle
recommendations. The ”also view” relationship sig-
nifies that users who viewed one product also viewed
another product. It helps in recommending products
that are often considered together by users. Finally,
the ”belongs to category” and ”belongs to brand” re-
lationships help in organizing products and enabling
category-based and brand-based recommendations.
One key aspect of our methodology is that relationships in the KG are assigned weights, which influence the recommendation outcomes by prioritizing certain connections over others. These weights are derived from the significance of the relationships as determined by domain knowledge and data analysis. Table 2 presents them as minimum occurrence thresholds and qualitative purchase probabilities; a short sketch of how they might be applied follows the table.
Table 2: Summary of Product Relationships and Associated
Purchase Probabilities.
Relationship Min Occ. Purchase Prob.
also buy 5 High
also view 1 Medium
belongs to brand - Medium
belongs to category - Low
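A minimal sketch of how such weights might be applied when ranking candidate products is shown below; the numeric values and the scoring rule are our own assumptions, chosen only to mirror Table 2:

# Illustrative mapping from relationship type to a purchase probability,
# mirroring Table 2. The numeric values are assumptions.
RELATIONSHIP_WEIGHTS = {
    "also_buy": 0.8,              # High
    "also_view": 0.5,             # Medium
    "belongs_to_brand": 0.5,      # Medium
    "belongs_to_category": 0.2,   # Low
}
MIN_OCCURRENCES = {"also_buy": 5, "also_view": 1}

def score_candidate(relationship: str, occurrences: int = 0) -> float:
    # A candidate only counts if the relationship meets its minimum
    # occurrence threshold; otherwise it contributes nothing.
    if occurrences < MIN_OCCURRENCES.get(relationship, 0):
        return 0.0
    return RELATIONSHIP_WEIGHTS.get(relationship, 0.0)

# Example: a product reached via 'also_buy' observed 7 times scores 0.8.
print(score_candidate("also_buy", occurrences=7))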
Our recommender system incorporates explain-
able AI principles, ensuring that the rationale be-
hind each recommendation is transparent to and in-
terpretable by users. By leveraging the relationships
and their corresponding weights in the KG, the sys-
tem can provide detailed explanations for its sugges-
tions. For instance, a recommendation might be jus-
tified based on the strong association between a prod-
uct and its brand, or the positive user comments it has
received, or both. This explainability enhances user
trust and satisfaction, as users can understand why
certain products are being recommended.
Regarding the extraction and transformation of
data from the KG, we use RDF triples as the funda-
mental data units, comprising three components: a
subject, a predicate, and an object. These triples en-
capsulate the semantic relationships between entities,
forming the backbone of the knowledge representa-
tion. To integrate these RDF triples effectively within
the context of the LLM, we transform them into a
structured format that the LLM can readily interpret
and utilize during inference. This transformation in-
volves reformatting the RDF triples into a natural lan-
guage or structured template that preserves the origi-
nal semantic relationships while making the informa-
tion accessible to the LLM. The following example
illustrates an RDF triple and its corresponding format-
ted version:
Figure 2: Formatted triple for LLM context.
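Figure 2's exact rendering is not reproduced here, but a small sketch of one plausible way to turn RDF triples into natural-language statements for the LLM context is shown below; the sentence templates are our own assumption:

# Sketch: rendering (subject, predicate, object) triples as short
# natural-language statements. The templates are illustrative.
TEMPLATES = {
    "buy": "{s} bought the product '{o}'.",
    "also_buy": "Customers who bought '{s}' also bought '{o}'.",
    "also_view": "Customers who viewed '{s}' also viewed '{o}'.",
    "belongs_to_brand": "'{s}' belongs to the brand '{o}'.",
    "belongs_to_category": "'{s}' belongs to the category '{o}'.",
    "mention": "'{s}' is associated with the feature '{o}'.",
}

def format_triple(s: str, p: str, o: str) -> str:
    # Fall back to the raw triple if no template is defined for the predicate.
    return TEMPLATES.get(p, "({s}, {p}, {o})").format(s=s, p=p, o=o)

formatted_triplets = "\n".join(
    format_triple(*t) for t in [("David", "buy", "Notebook"),
                                ("Notebook", "also_buy", "Mouse")]
)
print(formatted_triplets)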
In the final step, we leverage Prompt Engineering
to steer the LLM in generating targeted recommen-
dations. When RDF triples are formatted for inclu-
sion in a prompt, they can subsequently be appended
to questions directed at the LLM. The prompt presents the formatted triplets alongside instructions on how to interpret the relationships they express, giving the assistant both a concrete dataset to analyze and a structured way to reason over it, as shown in the excerpt below.
messages = [
    {"role": "system",
     "content": "You are a helpful assistant."},
    {"role": "user",
     "content": f"Consider the following relationships "
                f"represented as triplets:\n{formatted_triplets}"},
    {"role": "user",
     "content": "Consider that if a person has bought a product "
                "that participates in a relation 'also_buy' with "
                "other products, there is a high probability for "
                "this person to buy these other products."},
    ...
In addition to incorporating the formatted RDF
triples, the prompts also include specific instructions
for the LLM on how to interpret and weight the re-
lationships represented in these triples. This ensures
that the LLM not only understands the entities and
their connections but also prioritizes certain relation-
ships based on their relevance to the user’s query. Fur-
thermore, the user’s question is integrated into the
prompt, guiding the LLM to focus on the specific
needs and preferences of the user. By combining
these elements—formatted triples, interpretative in-
structions, and the user query—the prompt provides
a comprehensive framework that enables the LLM to
generate highly tailored and contextually rich recom-
mendations. This approach ensures that the LLM’s
outputs are not only aligned with the knowledge graph
but also finely tuned to the nuances of the user’s re-
quest.
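To make this composition concrete, the sketch below assembles a complete message list with the formatted triples, an interpretative instruction derived from Table 2, and the user question; the exact wording and the question are illustrative assumptions rather than the system's verbatim prompt:

# Sketch: the full prompt combines the formatted triples, interpretative
# instructions, and the user's question. Wording is illustrative.
user_question = "What products could David buy?"

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user",
     "content": f"Consider the following relationships "
                f"represented as triplets:\n{formatted_triplets}"},
    {"role": "user",
     "content": "Relationships of type 'also_buy' indicate a high "
                "purchase probability, 'also_view' and "
                "'belongs_to_brand' a medium one, and "
                "'belongs_to_category' a low one."},
    {"role": "user", "content": user_question},
]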
4 ARCHITECTURE
The architecture of the application is structured
around three main components: the backend layer for
data acquisition and preparation, the integration layer
with language models, and the user interface layer.
The backend layer is responsible for data acquisition and processing to prepare the formatted triples.
The data used in this study was sourced from CSV
files, which contain the entities and the relationships
between them.
To efficiently manage and process this data, we
employed Apache Spark in conjunction with the
GraphFrames library. The data processing begins
with initializing a Spark session, providing the foun-
dation for creating a graph-based structure, where ver-
tices represent entities, and edges represent the rela-
tionships. Data is read from the CSV files into lists, which are then transformed into Spark DataFrames and used to construct a graph with the GraphFrames library. This
graph structure allows us to represent the data as RDF
triples, where each edge in the graph corresponds to a
triple consisting of a subject, predicate, and object.
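A minimal sketch of this construction step is given below; the file names and column names are assumptions, since the CSV schema is not specified in the paper:

from pyspark.sql import SparkSession
from graphframes import GraphFrame

spark = SparkSession.builder.appName("kg-recommender").getOrCreate()

# GraphFrames expects an 'id' column for vertices and 'src'/'dst'
# columns for edges; the CSV files and column names are illustrative.
vertices = spark.read.csv("entities.csv", header=True)
edges = (spark.read.csv("relationships.csv", header=True)
              .withColumnRenamed("subject", "src")
              .withColumnRenamed("object", "dst")
              .withColumnRenamed("predicate", "relationship"))

graph = GraphFrame(vertices, edges)
graph.edges.show(5)  # each edge corresponds to one RDF triple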
Figure 3: Recommender system architecture.
Once the data is prepared, the challenge is to format
this information in a manner that can be seam-
lessly integrated into a prompt for the LLM. This in-
volves creating a textual representation of the RDF
triples that is both comprehensible to the LLM and ca-
pable of providing the necessary context for answer-
ing a given question. The formatting process includes
the conversion of RDF triples into natural language
sentences or structured statements that retain the se-
mantic relationships encoded in the RDF format. This
step ensures that the rich semantic information con-
tained within the Knowledge Graph is preserved and
made accessible to the LLM.
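Building on the GraphFrame above, one way to dynamically extract the subgraph relevant to a single user and render it as prompt text could look like the sketch below; the motif pattern, field names, and the chosen user are assumptions:

from pyspark.sql.functions import col

# Sketch: collect the products David bought, then every edge leaving
# those products, and render each edge as a triple for the prompt.
bought = (graph.find("(u)-[b]->(p)")
               .filter((col("b.relationship") == "buy") &
                       (col("u.id") == "David"))
               .select(col("p.id").alias("product")))
bought_ids = [row["product"] for row in bought.collect()]

related = (graph.find("(p)-[r]->(q)")
                .filter(col("p.id").isin(bought_ids)))

formatted_triplets = "\n".join(
    f"({row['p']['id']}, {row['r']['relationship']}, {row['q']['id']})"
    for row in related.collect()
)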
The Integration Layer with Language Models han-
dles the interaction between the prepared data and the
Large Language Model. It incorporates the formatted
RDF triples into the context fed to the LLM, ensuring
that the semantic relationships captured in the knowl-
edge graph are effectively utilized during inference.
Additionally, this layer implements Prompt Engineer-
ing techniques, where specific prompts are crafted to
guide the LLM in interpreting and prioritizing rela-
tionships within the triples, as well as responding to
the user’s query with accurate and contextually rich
recommendations.
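The paper does not show the call itself; a minimal sketch of this layer, assuming the OpenAI Python client and an illustrative model name, could be:

from openai import OpenAI

client = OpenAI()  # assumes an OpenAI-style chat completions client

# Submit the composed messages (formatted triples, interpretative
# instructions, and the user question) and read back the recommendation.
response = client.chat.completions.create(
    model="gpt-4o-mini",   # illustrative model name
    messages=messages,
)
print(response.choices[0].message.content)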
The user interface layer is designed to enable
seamless interaction between the user and the rec-
ommender system, serving as a bridge between user
intent and semantically grounded recommendations.
It not only facilitates the input of natural language
queries but also presents the output of the system
in a transparent and interpretable manner. By sur-
facing recommendations that are generated through
the reasoning capabilities of the LLM and grounded
in structured knowledge graph data, the interface en-
sures that semantic relationships and decision ratio-
nale are clearly conveyed to the user.
5 EVALUATION
The dataset used in our evaluation is a propri-
etary one, consisting of approximately 5,200 enti-
ties and 16,600 relationships. There are relation-
ships in the dataset such as: ”buy” relationships indi-
cating purchases made by users, ”mention” relation-
ships reflecting product attributes mentioned by users,
”also buy” relationships denoting products that are
frequently bought together, ”also view” relationships
showing products that are often viewed together, ”be-
longs to category” relationships classifying products
into specific categories, and ”belongs to brand” rela-
tionships linking products to their respective brands.
To assess the explainability of the proposed recom-
mender system, we evaluated it using the following
three questions: (1) What products could David buy?,
(2) Which product categories should be prioritized for
a discount? and (3) Which two products would be the
most suitable candidates for a bundle discount?
Each of these questions was submitted to the sys-
tem, and the results were analyzed with a focus on
how effectively the system leveraged semantic knowl-
edge to generate contextually relevant and semanti-
cally coherent recommendations. Particular attention
was given to the system’s ability to incorporate struc-
tured domain relationships into the output, ensuring
that the suggested items aligned with the underlying
meaning and intent of each query.
5.1 What Products Could David Buy?
For this query, the system was asked to recommend
potential products for David based on his previous
purchases and mentions, as well as his preferences
indicated by the RDF triples. The system incorpo-
rated formatted triples and relevant prompts to iden-
tify products that align with David’s purchasing his-
tory, and presents the graph shown in Figure 4, indicating products and associated probabilities.
Figure 4: Recommendations for a particular user.
The recommendation explains that, analyzing cus-
tomer behavior patterns, the system identified that
products commonly purchased alongside a Notebook,
such as a Mouse, Router, and Printer, hold a high
probability (80%) of being appealing to David. The
strong connection between these items is reinforced
by the ’also buy’ relation, which shows that these
products are often bought together with a Notebook.
Additionally, the Router is recommended with the
same high probability due to its frequent co-viewing
with a Notebook, as indicated by the ’also view’ rela-
tion. Furthermore, considering brand loyalty, the sys-
tem suggests that David might be interested in pur-
chasing a Monitor from the same brand (HP) as the
Notebook, assigning it a medium probability (50%)
under the ’belongs to brand’ relation. Lastly, the sys-
tem identifies a Webcam as a product in the same cat-
egory as the Notebook (Personal Computer), although
with a lower probability (20%), suggesting it as a less
likely but still relevant option. We present other ex-
plainable recommendations in the following table.
Table 3: Product Recommendations for users.
User Recommendation
Sophia Sophia bought a ’USB Drive’. There is a
high probability she would buy a ’Router’
because both are linked by the ’also buy’
relationship.
Eva Eva bought a ’Headset’. There is a high
probability she would buy a ’Tablet’ or
’Smartphone’ because these products are
linked by the ’also buy’ relationship.
Chris Chris bought a ’Webcam’. There is a high
probability he would buy a ’Keyboard’ due
to the ’also view’ relationship.
Liam Liam bought a ’Mouse’. There is a high
probability he would buy a ’Notebook’ be-
cause both are linked by the ’also buy’ re-
lationship.
Alice Alice bought a ’Tablet’. There is a high
probability she would buy a ’Headset’ or
’Keyboard’ as they are connected by the
’also buy’ relationship.
5.2 Which Product Categories Should
Be Prioritized for a Discount?
The recommender system suggests prioritizing dis-
counts on products within the ’Input Device’ cat-
egory to increase sales. This category, which in-
cludes items like Mice, Keyboards, Webcams, Smart-
phones, Headsets, and Tablets, has a medium prob-
ability of leading to additional purchases when dis-
counted. The medium probability indicates that cus-
tomers who have purchased one product in this cate-
gory are somewhat likely to buy another, especially if
brands like Logitech, Samsung, and Sony are consid-
ered.
On the other hand, categories such as ’Storage De-
vice’, ’Output Device’, and ’Personal Computer’ ex-
hibit low probabilities for additional purchases. Prod-
ucts in these categories, such as USB Drives, SSDs,
Monitors, Printers, and Notebooks, typically have
longer lifecycles or higher price points, which reduces
the likelihood of repeat purchases or cross-category
purchases.
Figure 5: Recommendations for product categories.
Figure 5 shows the probabilities for each product
category in the context of a discount strategy. The
’Input Device’ category stands out with a medium
probability (50%), indicating it as the most promising
category for increasing sales through discounts. The
other categories, ’Storage Device’, ’Output Device’,
and ’Personal Computer’, all have lower probabilities
(20%), suggesting they are less likely to benefit from
discounting.
5.3 Which Two Products Would Be the
Most Suitable Candidates for a
Bundle Discount?
The recommender system identified four pairs of
products as suitable candidates for a bundle dis-
count, based on the analysis of relationships such as
’also buy’, ’also view’, ’belongs to brand’, and ’be-
longs to category’. Here are the recommended pairs:
Tablet and Headset: The ’also buy’ relationship
indicates that customers who purchase a Tablet of-
ten also buy a Headset. This pair is highly likely
to benefit from a bundle discount.
Notebook and Router: Both ’also view’ and
’also buy’ relationships suggest that customers
who are interested in a Notebook are also likely
to want a Router, making this pair a strong candi-
date for bundling.
SSD and Router: Data shows that customers who
purchase a Router also frequently buy SSDs, mak-
ing this pair another potential bundle option.
Keyboard and Mouse: These products are linked
by both brand (Logitech) and category (’Input De-
vice’). While the probability of purchasing one
after the other may be medium to low, bundling
them could increase this likelihood.
Figure 6 shows a bar graph representing the probabilities for each pair of products identified as suitable candidates for a bundle discount.
The graph shows that the Tablet & Headset, Note-
book & Router, and SSD & Router pairs all have a
high probability of 80% for cross-purchasing, mak-
ing them strong candidates for bundling. The Key-
board & Mouse pair has a slightly lower probability
of 50%, but bundling them could still encourage ad-
ditional sales due to their shared brand and category.
Figure 6: Recommendations for bundles.
6 RELATED WORK
There is a growing trend underscoring the intersec-
tion of Recommender Systems and Natural Language
Processing, specially with LLMs serving as a pow-
erful tool for advancing recommendation strategies.
The emergence of LLMs has opened new frontiers in
the recommender systems domain, as they possess the
ability to comprehend and generate human-like text,
which has led to a growing number of studies explor-
ing their potential in enhancing recommendation sys-
tems (Zhao et al., 2023) and (Balloccu et al., 2024).
Knowledge Graphs (KGs) have gained attention
for their ability to encode structured, semantic in-
formation, which can be invaluable in enhancing the
reasoning capabilities of LLMs. Recent studies, like
(Pan et al., 2024) and (Zhu et al., 2023), have explored
integrating KGs with LLMs to improve the quality of
responses in various tasks, including question answer-
ing, entity extraction, and knowledge graph reason-
ing. Approaches typically involve either using KGs
as input context for LLMs or leveraging LLMs for
dynamic KG construction. According to (Pan et al.,
2024), KGs can enhance LLMs by providing external
knowledge for inference and interpretability. The syn-
ergy between KGs and LLMs has shown promising
results in capturing rich, domain-specific knowledge
and enhancing explainability, particularly in systems
requiring complex reasoning.
On the other hand, numerous techniques have
emerged to enhance the extraction abilities of LLMs,
improving their effectiveness in various applications
like question answering, knowledge retrieval, and
reasoning tasks. Prompt engineering, Retrieval-
Augmented Generation (RAG), GraphRAG and Text-
to-SQL are among these popular techniques.
Prompt engineering has been increasingly recognized
for its potential to significantly improve the perfor-
mance of LLMs by instructing them to behave differ-
ently from their default. (White et al., 2023) demon-
strated how carefully designed prompts can enable
more precise responses from LLMs across a range of
tasks, underscoring the importance of prompt design
in leveraging model capabilities. According to the au-
thors, prompt patterns significantly enrich the capa-
bilities that can be created in a conversational LLM.
Indeed, this approach is essential for guiding LLMs to
understand and respond to queries more effectively,
by encapsulating the query within a context that the
model is more likely to comprehend and respond to
accurately.
(Giray, 2023) states that, by employing prompt
engineering techniques, academic writers and re-
searchers can unlock the full potential of language
models, harnessing their capabilities across various
domains, and that this discipline opens up new av-
enues for improving AI systems and enhancing their
performance in a range of applications, from text gen-
eration to image synthesis and beyond. The author
presents the prompt components that can be manipu-
lated by engineers to guide the text generation of an
LLM. These components include an instruction, a
context, input data and an output indicator. The instruction out-
lines what the LLM is expected to do, providing clear
directions to guide the model’s response. The con-
text gives background information necessary for the
model to generate relevant and informed responses.
The input data refers to the actual data fed into the
model for processing, like a question, an image or a
set of data points. And the output indicator tells the
model how to format its response and what type of
output is expected.
7 CONCLUSIONS
This article has elucidated the critical role of
techniques such as Prompt Engineering, Retrieval-
Augmented Generation (RAG) and text-to-SQL in en-
hancing the functionality and applicability of LLMs
in accessing and integrating external data sources.
The Python scripts utilized in our analyses are openly
accessible at (Seabra, 2024). These methodologies
are fundamental for interacting with and leveraging vast, dynamic external knowledge repositories without the need to retrain a model.
These approaches have been applied with notable
success to various data-intensive environ-
ments, including documents, knowledge graphs, and
databases. By enabling LLMs to dynamically query
and retrieve relevant information from these struc-
tured and unstructured data sources, the techniques
enhance the model’s ability to generate informed and
contextually accurate outputs. This synergy not only
maximizes the utility of existing data but also expands
the potential applications of LLMs across different
sectors, including business intelligence, legal advise-
ment, and academic research.
The promising results obtained from these tech-
niques underscore the potential for data interaction
and retrieval. However, to fully ascertain their ef-
fectiveness and scalability, future work should focus
on testing these methodologies across more volumi-
nous and diverse data sets, encompassing extensive
documents, knowledge graphs (KGs), and expansive
databases. Such rigorous testing is essential to vali-
date the robustness and adaptability of the strategies
employed, ensuring that they maintain high levels of
accuracy and efficiency when scaled.
Moreover, exploring these techniques in larger,
more complex data environments will also shed light
on their limitations and the potential need for refine-
ment or adaptation. The continuous expansion of
token limits in LLMs also marks a significant trend
in the evolution of artificial in-
telligence technologies. As these limits grow, de-
velopers are empowered to work with increasingly
larger blocks of text in a single submission, enabling
a deeper and more comprehensive analysis of data.
This future exploration will not only bolster the con-
fidence in deploying these techniques in real-world
scenarios but also pave the way for their optimiza-
tion and potential customization to specific domains
or data types, ultimately enhancing the utility and im-
pact of LLMs across various sectors.
REFERENCES
Aggarwal, C. C. et al. (2016). Recommender systems, vol-
ume 1. Springer.
Balloccu, G., Boratto, L., Fenu, G., Malloci, F. M., and
Marras, M. (2024). Explainable recommender sys-
tems with knowledge graphs and language models. In
European Conference on Information Retrieval, pages
352–357. Springer.
Burke, R. (2002). Hybrid recommender systems: Survey
and experiments. User modeling and user-adapted in-
teraction, 12:331–370.
Cao, Y., Wang, X., He, X., Hu, Z., and Chua, T.-S. (2019).
Unifying knowledge graph learning and recommen-
dation: Towards a better understanding of user pref-
erences. In The world wide web conference, pages
151–161.
Giray, L. (2023). Prompt engineering with chatgpt: a guide
for academic writers. Annals of biomedical engineer-
ing, 51(12):2629–2633.
He, X., Liao, L., Zhang, H., Nie, L., Hu, X., and Chua, T.-S.
(2017). Neural collaborative filtering. In Proceedings
of the 26th international conference on world wide
web, pages 173–182.
Hogan, A., Blomqvist, E., Cochez, M., d’Amato, C., Melo,
G. D., Gutierrez, C., Kirrane, S., Gayo, J. E. L.,
Navigli, R., Neumaier, S., et al. (2021). Knowledge
graphs. ACM Computing Surveys (Csur), 54(4):1–37.
Ji, S., Pan, S., Cambria, E., Marttinen, P., and Philip, S. Y.
(2021). A survey on knowledge graphs: Representa-
tion, acquisition, and applications. IEEE transactions
on neural networks and learning systems, 33(2):494–
514.
Lops, P., De Gemmis, M., and Semeraro, G. (2011).
Content-based recommender systems: State of the art
and trends. Recommender systems handbook, pages
73–105.
Nickel, M., Murphy, K., Tresp, V., and Gabrilovich, E.
(2015). A review of relational machine learning
for knowledge graphs. Proceedings of the IEEE,
104(1):11–33.
OpenAI (2023a). Chatgpt fine-tune descrip-
tion. https://help.openai.com/en/articles/
6783457-what-is-chatgpt. Accessed: 2024-03-
01.
OpenAI (2023b). Chatgpt prompt engineer-
ing. https://platform.openai.com/docs/guides/
prompt-engineering. Accessed: 2024-04-01.
Pan, S., Luo, L., Wang, Y., Chen, C., Wang, J., and Wu, X.
(2024). Unifying large language models and knowl-
edge graphs: A roadmap. IEEE Transactions on
Knowledge and Data Engineering.
Pascanu, R., Mikolov, T., and Bengio, Y. (2013). On the
difficulty of training recurrent neural networks. In
International conference on machine learning, pages
1310–1318. Pmlr.
Paulheim, H. (2017). Knowledge graph refinement: A sur-
vey of approaches and evaluation methods. Semantic
web, 8(3):489–508.
Ricci, F., Rokach, L., and Shapira, B. (2010). Introduction
to recommender systems handbook. In Recommender
systems handbook, pages 1–35. Springer.
Schafer, J. B., Frankowski, D., Herlocker, J., and Sen, S.
(2007). Collaborative filtering recommender systems.
In The adaptive web: methods and strategies of web
personalization, pages 291–324. Springer.
Seabra, A. (2024). Github repository. https://github.com/
antonyseabramedeiros/qasystems. Accessed: 2024-
04-01.
Shen, W., Wang, J., and Han, J. (2014). Entity linking with
a knowledge base: Issues, techniques, and solutions.
IEEE Transactions on Knowledge and Data Engineer-
ing, 27(2):443–460.
Su, X. and Khoshgoftaar, T. M. (2009). A survey of col-
laborative filtering techniques. Advances in artificial
intelligence, 2009(1):421425.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones,
L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I.
(2017). Attention is all you need. Advances in neural
information processing systems, 30.
Wang, H., Zhang, F., Xie, X., and Guo, M. (2018). Dkn:
Deep knowledge-aware network for news recommen-
dation. In Proceedings of the 2018 world wide web
conference, pages 1835–1844.
Wang, M., Wang, M., Xu, X., Yang, L., Cai, D., and Yin,
M. (2023). Unleashing chatgpt’s power: A case study
on optimizing information retrieval in flipped class-
rooms via prompt engineering. IEEE Transactions on
Learning Technologies.
Wang, P., Shi, T., and Reddy, C. K. (2020). Text-to-sql gen-
eration for question answering on electronic medical
records.
Wang, X., He, X., Wang, M., Feng, F., and Chua, T.-S.
(2019). Neural graph collaborative filtering. In Pro-
ceedings of the 42nd international ACM SIGIR con-
ference on Research and development in Information
Retrieval, pages 165–174.
White, J., Fu, Q., Hays, S., Sandborn, M., Olea, C., Gilbert,
H., Elnashar, A., Spencer-Smith, J., and Schmidt,
D. C. (2023). A prompt pattern catalog to enhance
prompt engineering with chatgpt. arXiv preprint
arXiv:2302.11382.
Zhang, S., Yao, L., Sun, A., and Tay, Y. (2019). Deep
learning based recommender system: A survey and
new perspectives. ACM computing surveys (CSUR),
52(1):1–38.
Zhao, Z., Fan, W., Li, J., Liu, Y., Mei, X., Wang, Y., Wen,
Z., Wang, F., Zhao, X., Tang, J., et al. (2023). Recom-
mender systems in the era of large language models
(llms). arXiv preprint arXiv:2307.02046.
Zhu, Y., Wang, X., Chen, J., Qiao, S., Ou, Y., Yao, Y.,
Deng, S., Chen, H., and Zhang, N. (2023). Llms for
knowledge graph construction and reasoning: Recent
capabilities and future opportunities. arXiv preprint
arXiv:2305.13168.