LLMs and Knowledge Discovery in Low-Resource Language

Parliamentary Corpora: The PQ Dashboard Case Study

Joel Azzopardi (ORCID: https://orcid.org/0000-0001-6709-8530)

Department of AI, Faculty of ICT, University of Malta, Malta

Keywords:

Parliamentary Questions (PQs), Large Language Models (LLMs), Artiﬁcial Intelligence (AI), Natural

Language Processing (NLP), Civic Engagement, Distant Reading, Interactive Dashboard.

Abstract:

Parliamentary Questions (PQs) are a critical mechanism for democratic oversight and accountability. However,

their comprehensive analysis can be hindered by limitations such as single-language availability (especially

when the language is a low-resource language such as Maltese) and a lack of structured thematic organisation

or interlinking. This paper introduces the PQ Dashboard, a web-based platform developed to enhance the

accessibility and analytical utility of Maltese Parliamentary Questions. The system employs AI and open Large

Language Models (LLMs) to automate PQ collection, translate content into English, classify it according to the

COFOG-99 taxonomy, extract key terms, and identify interconnections. The interactive dashboard provides

users – including the public, journalists, and academic researchers – with functionalities to navigate PQs by

category or keyword, visualise thematic distributions, and analyse trends in MPs’ activity and ministerial

responses. This enhanced data accessibility aims to facilitate deeper insights into parliamentary discourse,

policy development, and governmental accountability. The PQ Dashboard demonstrates a practical application

of AI-driven solutions for transforming unstructured public data into a more accessible and analysable format,

thereby contributing to increased transparency and informed public engagement.

1 INTRODUCTION

Parliamentary Questions (PQs) constitute a corner-

stone of democratic governance, serving as a vital

mechanism through which Members of Parliament

(MPs) hold ministries accountable, seek information,

and scrutinise government policy. The effective func-

tioning of this oversight process relies heavily on the

accessibility and interpretability of these parliamen-

tary records for various stakeholders, including the

public, journalists, and academic researchers.

However, traditional parliamentary portals, such

as the ofﬁcial Maltese Parliament website hosting

the PQs (https://pq.gov.mt; last accessed: Septem-

ber 2025), typically offer basic search functionalities

and categorisation based on metadata like date, MP,

or ministry. While these provide fundamental access,

they often lack advanced semantic search capabilities,

thematic overviews, or tools for longitudinal analy-

sis. This limitation is particularly pronounced for par-

liamentary data in low-resource languages, such as

Maltese, where readily available Natural Language

Processing (NLP) tools and pre-trained models are

less common, posing a signiﬁcant barrier to compre-

hensive analysis and public engagement (Koehn and

Knowles, 2017; Ranathunga et al., 2022). Early work

on Maltese parliamentary data, such as Analysing and

Visualising Parliamentary Questions: A Linked Data

Approach (Abela and Azzopardi, 2018), has explored

methods to enhance accessibility through linked data

and visualisations, but these did not incorporate ad-

vanced Artiﬁcial Intelligence (AI) for semantic en-

richment or multilingual access to the extent now pos-

sible with Large Language Models (LLMs).

Speciﬁcally, the current Maltese parliamentary

portal publishes all answered PQs exclusively in Mal-

tese. While it offers basic search and categorises PQs

by criteria such as category, heading, MP, ministry,

and sitting, it does not readily reveal overarching top-

ics, enable tracking of trends in topics’ popularity, or

facilitate analysis of MPs’ activity over time. Further-

more, references within PQs to other questions are

not directly linked, which limits navigability and the

ability to trace interconnected legislative discourse.

These deﬁciencies collectively restrict the ability of

users to gain high-level overviews, thematic insights,

Azzopardi, J.

LLMs and Knowledge Discovery in Low-Resource Language Parliamentary Corpora: The PQ Dashboard Case Study.

DOI: 10.5220/0013835100004000

Paper published under CC license (CC BY-NC-ND 4.0)

In Proceedings of the 17th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2025) - Volume 1: KDIR, pages 159-170

and a deeper understanding of the parliamentary pro-

cess.

This paper introduces the PQ Dashboard, a novel

online platform designed to address these critical lim-

itations. By leveraging advanced AI and NLP tech-

niques, namely open LLMs, the PQ Dashboard aims

to transform the accessibility and analytical utility

of Maltese Parliamentary Questions, thereby promot-

ing greater transparency and informed civic partici-

pation. As a publicly accessible web portal avail-

able at https://pq.ir.mt, the PQ Dashboard’s innova-

tive features considerably enhance the value of the

available data. It achieves this by offering content

in both Maltese and English, providing categorisation

into COFOG-99 categories, identifying relevant key-

words, and pinpointing citations between Parliamen-

tary Questions. This comprehensive approach facil-

itates the presentation of rich insights and aggregate

information, including the proﬁling of MPs and Min-

istries’ activities through the PQs being submitted,

and enables seamless navigation between PQs that are

linked together via citations.

The rest of this paper is structured as follows: Sec-

tion 2 ﬁrst provides an overview of similar systems.

Section 3 then describes the underlying process of

the system, from data acquisition via web scraping to

the processing of the data utilising LLMs, and ﬁnally,

the updating of the server hosting the web applica-

tion. Section 4 details the functionalities of the de-

veloped online dashboard and presents some key in-

sights that can be extracted. In Section 5, we describe

the evaluation that was carried out. Finally, Section

6 presents the conclusions and outlines our plans for

future work.

2 SIMILAR SYSTEMS

Modern legislatures table thousands of Parliamentary

Questions (PQs) each session – Malta alone recorded

almost 30,000 PQs between May 2022 and July

2025, and this legislature is not yet complete (Par-

liament of Malta, 2022). While most parliaments

now publish PQs and responses online, practices vary

widely. Basic portals that provide documents (in

docx or pdf formats) remain common, while only

a minority offer machine-readable formats or public

APIs (Inter-Parliamentary Union (IPU) and UN/IPU

Global Centre for ICT in Parliament, 2024). Meta-

data is often inconsistent, and keyword-based search

systems limit the discoverability of relevant mate-

rial (Inter-Parliamentary Union (IPU) and Parliamen-

tary Data Science Hub, Centre for Innovation in Par-

liament, 2024). Further hindrances to transparency

include delayed responses or evasive replies such as

“data not held” (TMID Editorial, 2024).

Recent advances in Artiﬁcial Intelligence (AI)

and Natural Language Processing (NLP), particularly

through the development of Large Language Mod-

els (LLMs), offer the potential to overcome these

limitations (Zhuang et al., 2025). These tools can

deliver semantic search that transcends literal key-

word matching (Alvarez and Morrier, 2025), gener-

ate concise summaries of technical responses, and

extract topics for building interactive dashboards.

They can also support classiﬁcation and trend detec-

tion, link related questions, and assess response qual-

ity—enabling deeper analysis of political discourse.

2.1 Comparative Practices and

International Benchmarks

Several research initiatives and civic technology

projects demonstrate best-practice applications of AI

using data from different national parliaments:

• United Kingdom: Provides comprehensive APIs and machine-readable formats via https://explore.data.parliament.uk, which supports civic tools like TheyWorkForYou (Parliament, 2024).

• Brazil: Uses machine learning in its “Ulysses

Suite” to categorise citizen input and assist

legislative drafting (Inter-Parliamentary Union

(IPU), 2022).

• Italy: Applies generative AI to cluster and as-

sess thousands of amendments before committee

review (Citino, 2024).

• Finland: Leverages semantic web technologies in

the ParliamentSampo platform for concept- and

speaker-based exploration (Hyvönen et al., 2022).

• France: Employs the LLaMandement LLM

to summarise complex legislative amend-

ments (Gesnouin et al., 2024).

These systems are commonly underpinned

by open data standards, cross-parliamentary

collaboration, and responsible AI frame-

works (Inter-Parliamentary Union (IPU) and

UN/IPU Global Centre for ICT in Parliament, 2024;

Inter-Parliamentary Union (IPU) and Parliamen-

tary Data Science Hub, Centre for Innovation in

Parliament, 2024).

2.2 Current Capabilities in Malta

While Malta has established a foundational level of

transparency through its ofﬁcial PQ portal (Abela and

Azzopardi, 2018), the absence of a structured data

KDIR 2025 - 17th International Conference on Knowledge Discovery and Information Retrieval

pipeline or a public API continues to limit both in-

ternal analysis and external reuse. A linked data pro-

totype developed in 2017–2018, known as PQViz,

demonstrated the feasibility of graph-based explo-

ration (Abela and Azzopardi, 2018). Building on this

foundation, the PQ Dashboard presented in this paper

– accessible at https://pq.ir.mt – incorporates a range

of LLM-based features, as detailed in Section 4.

Despite these advancements, persistent delays and

incomplete responses from ministries remain a sig-

niﬁcant barrier to effective scrutiny (TMID Editorial,

2024). Table 1 provides a comparative overview of

PQ access capabilities across various national parlia-

ments.

2.3 AI Techniques Utilised in

Parliamentary Analysis

2.3.1 Semantic Search and Retrieval

Conventional search in legislative portals is limited

by exact term matching (Bryłkowski and Klikowski,

2025). Semantic search techniques, driven by LLMs,

enable retrieval based on meaning and intent (Al-

varez and Morrier, 2025). Embedding-based mod-

els transform language into vector representations, al-

lowing for similarity matching even in the absence of

direct keyword overlap (Bryłkowski and Klikowski,

2025). This signiﬁcantly enhances search and re-

trieval through large, complex corpora.
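To make the idea of similarity matching concrete, the following is a minimal sketch of ranking documents by cosine similarity between vectors. The three-dimensional vectors and the document identifiers are invented for illustration only; real embedding models produce vectors with hundreds of dimensions, and nothing here reflects a specific system cited above.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return document ids ordered from most to least similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored]


# Toy "embeddings" (hypothetical values); no keyword overlap is needed for a match.
docs = {
    "pq_hospital_waiting_lists": [0.9, 0.1, 0.0],
    "pq_road_resurfacing": [0.0, 0.2, 0.9],
}
query = [0.8, 0.3, 0.1]  # stands in for a query about delays in health care
print(rank_by_similarity(query, docs))
```

The ranking depends only on the direction of the vectors, which is why two texts with no shared terms can still be judged similar.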

Retrieval-Augmented Generation (RAG) archi-

tectures combine traditional retrieval with gener-

ative capabilities, enabling factual, grounded re-

sponses (Sharma, 2025). This is particularly valu-

able for civic applications, where hallucinated content

could erode public trust. Although RAG improves ac-

curacy, it also introduces challenges such as retrieval

noise and outdated knowledge bases. Ongoing re-

search aims to reﬁne these systems for reliable use

in governance (Sharma, 2025).

2.3.2 Summarisation and Topic Modelling

LLMs can generate high-quality summaries of PQ re-

sponses, aiding readers in quickly understanding de-

tailed content (Alvarez and Morrier, 2025; Siino et al.,

2025). France’s LLaMandement project exempliﬁes

the effectiveness of such summarisation in legislative

settings (Gesnouin et al., 2024). Topic modelling fur-

ther enables the discovery of dominant themes and

emerging trends (Polat and Korpe, 2022; Hyvönen et al., 2022), supporting policy research and media

scrutiny.

2.3.3 Discourse and Answer Quality Analysis

LLMs can classify PQs based on rhetorical purpose

(e.g., factual, policy, accusatory) and assess the qual-

ity of responses (e.g., explanatory, evasive) (Alvarez

and Morrier, 2025). This supports analysis of dis-

course strategies and highlights avoidance tactics (Al-

varez and Morrier, 2025). Additional work has in-

vestigated potential algorithmic bias and the need for

domain-aware evaluation frameworks (Cunningham

et al., 2025; Rozado, 2024).

2.4 AI in Broader Public Sector

Applications

AI is increasingly embedded in public-sector work-

ﬂows to optimise service delivery, automate ap-

provals, and assist decision-making (Zhao et al.,

2025). These applications have been shown to en-

hance efﬁciency and public trust by enabling respon-

sive and transparent government operations.

In the United States, over 1,700 AI use cases were documented in December 2024, including fraud detection at the Veterans Administration and

decision support at the Social Security Administra-

tion (Martorana, 2025). Malta has also seen innova-

tion through AI-based analysis of legal judgments us-

ing cross-lingual information retrieval and rhetorical

role labelling (Azzopardi, 2024).

2.5 Ethical and Regulatory

Considerations

The deployment of AI in public-sector contexts raises

important concerns around transparency, bias, and

privacy. The Artiﬁcial Intelligence Act adopted by

the European Union (European Parliament and Coun-

cil of the European Union, 2024) establishes a regu-

latory framework for trustworthy AI, including clas-

siﬁcation of high-risk applications such as those

used in governance and public services. The Inter-

Parliamentary Union (IPU) has complemented this

with its Guidelines for AI in Parliaments, which out-

line principles for responsible AI use within leg-

islative contexts, emphasising human oversight, fair-

ness, and accountability (Inter-Parliamentary Union

(IPU) and Parliamentary Data Science Hub, Centre

for Innovation in Parliament, 2024). These frame-

works encourage parliaments and related organisa-

tions to adopt ethical practices that safeguard public

trust while enabling innovation.

Table 1: Comparative Analysis of PQ Access Features.

Feature | Malta (Current State) | UK (Best Practice) | Canada (Best Practice) | Brazil/Italy/Finland (Advanced AI Examples)

Online Portal Access | Yes, via https://parlament.mt | Yes, via https://questions-statements.parliament.uk and data API | Yes, Open Parliament portal and LEGISinfo | Integrated into main websites

Search Functionality | Basic keyword search | Advanced filtering and API support | Advanced portal search and filtering | Semantic/NL search, AI dashboards

Data Format Availability | PDF and DOC only | Open formats (XML, JSON, CSV) | CSV/XML via API | Structured data and Linked Open Data

Public API | Not available | Fully documented API | API access supported | APIs power civic and internal tools

Visualisation / Analysis | PQViz (legacy) | External tools (e.g., TheyWorkForYou) | Tools like OpenParliament.ca | AI-driven dashboards and semantic portals

AI Integration Status | Minimal / exploratory | Emerging | Early-stage interest | Fully operational (Brazil, Italy), custom LLMs (France)

Answer Timeliness / Quality | Significant delays and evasive answers | Some evasions but procedural accountability | Formal processes | AI being used to assess quality

3 METHODOLOGY

The PQ Dashboard is an innovative online plat-

form developed using entirely open-source technolo-

gies, demonstrating a commitment to transparency,

reusability, and data sovereignty. This section details

the system architecture, operational workﬂow, and the

speciﬁc Artiﬁcial Intelligence (AI) and Natural Lan-

guage Processing (NLP) techniques employed.

3.1 System Architecture and

Technologies

The system is built upon a robust, Linux-based in-

frastructure, speciﬁcally utilising Ubuntu 24.04 for its

operating environment. The core components of the

web application are developed using Python Flask,

served by Gunicorn, with the front-end interface con-

structed using standard HTML, CSS, and JavaScript.

Data persistence is managed by MongoDB, a NoSQL

database, chosen for its schema-less design which is

optimal for handling document-oriented data, typi-

cally in Binary JSON (BSON) format. MongoDB

also supports efﬁcient data retrieval through indexing,

crucial for the interactive dashboard.

The PQ Dashboard operates across a two-server

architecture:

• Processing Server: This in-house server, located

within the University of Malta, is dedicated to

data acquisition and intensive processing tasks. Its

speciﬁcations include an Nvidia RTX 4090 GPU

with 24GB of dedicated memory, 32GB of RAM,

and an Intel i7-13700K CPU (providing 24 vir-

tual processors). This hardware conﬁguration is

speciﬁcally chosen to support the computational

demands of Large Language Models (LLMs) and

other NLP operations, enabling the hosting of

LLMs up to approximately 12 billion parameters.

• Publicly Accessible Virtual Private Server

(VPS): This server hosts the web application,

making the PQ Dashboard publicly available. It

is conﬁgured with 8GB of RAM and an Intel(R)

Xeon(R) CPU E5-2680 (2 virtual processors).

The VPS focuses solely on serving user queries

via the web interface, with all heavy data process-

ing ofﬂoaded to the dedicated processing server.

3.2 Operational Workﬂow and Data

Acquisition

The system operates on a daily automated cycle to

ensure the PQ Dashboard remains up-to-date with the

latest parliamentary information. Each night, the data

processing server performs the following sequence of

operations:

1. It connects to the ofﬁcial Maltese Parliament por-

tal (https://pq.gov.mt) to check for newly available

Parliamentary Questions.

2. The system maintains a record of previously pro-

cessed parliamentary sittings, identifying any new

sittings for which data has been submitted.

3. New PQs from these sittings are then scraped

from the portal. The scraping process is exe-

cuted using Selenium, which operates a headless

Google Chrome browser to interact with the web-

site. JavaScript is employed to identify and extract

the salient parts of each PQ from the web page,

with Python then retrieving this data and harvest-

ing it into the MongoDB database.

4. The newly acquired PQ data is then processed us-

ing the LLM.

5. All processed PQs are stored in the MongoDB

database on the data processing server.

6. To maintain synchronisation, the processed PQs

are also pushed to the publicly accessible VPS

via a custom-built API, ensuring data consistency

across both servers.
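The six steps above can be sketched as a small orchestration routine. This is an illustrative outline rather than the system's actual code: the function names and the callables `scrape`, `enrich`, `store` and `push` are hypothetical stand-ins for the Selenium scraper, the LLM processing, the MongoDB write and the synchronisation API respectively.

```python
def find_new_sittings(portal_sittings, processed_sittings):
    """Sittings listed on the portal that have not been processed yet, in portal order."""
    seen = set(processed_sittings)
    return [sitting for sitting in portal_sittings if sitting not in seen]


def run_nightly_cycle(portal_sittings, processed_sittings, scrape, enrich, store, push):
    """Run one scrape -> enrich -> store -> push pass and return the sittings handled.

    scrape(sitting) -> list of raw PQ dicts  (step 3; hypothetical scraper hook)
    enrich(pq)      -> processed PQ dict      (step 4; hypothetical LLM hook)
    store(pq)       -> None                   (step 5; hypothetical local database hook)
    push(pq)        -> None                   (step 6; hypothetical sync hook)
    """
    handled = []
    for sitting in find_new_sittings(portal_sittings, processed_sittings):  # steps 1-2
        for raw_pq in scrape(sitting):
            processed = enrich(raw_pq)
            store(processed)
            push(processed)
        handled.append(sitting)
    return handled
```

Passing the four stages in as callables keeps the control flow testable without a browser, a GPU or a database, which is a design choice of this sketch and not a claim about the deployed pipeline.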

Each Parliamentary Question is accompanied by

several core metadata ﬁelds, including: title, question

and answer texts, the Member of Parliament (MP)

submitting the question, the responding minister and

ministry, and supplementary information such as the

legislature, PQ number, sitting, date, and classiﬁca-

tion category.
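As a rough illustration of how such a record might be represented before being written to a document database, the sketch below models the metadata fields listed above as a Python dataclass. The field names, the `keywords` placeholder and the `_id` scheme (legislature plus PQ number) are assumptions made for this example; the paper does not specify the stored schema.

```python
import datetime
from dataclasses import asdict, dataclass, field


@dataclass
class PQRecord:
    # Core metadata fields named in the text (attribute names are hypothetical).
    legislature: int
    pq_number: int
    sitting: int
    sitting_date: datetime.date
    title: str
    question: str
    answer: str
    mp: str
    minister: str
    ministry: str
    category: str  # classification category from the scraped metadata
    # Placeholder for later enrichment output (assumed, not from the paper).
    keywords: list = field(default_factory=list)

    def to_document(self):
        """Return a plain dict suitable for a schema-less document store."""
        doc = asdict(self)
        doc["sitting_date"] = self.sitting_date.isoformat()
        # Assumed key: PQ numbers are taken to be unique within a legislature.
        doc["_id"] = f"{self.legislature}-{self.pq_number}"
        return doc
```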

3.3 Large Language Model (LLM)

Processing

The core of the data enrichment process relies on an

open Large Language Model. The system currently

utilises the Gemma 2 9B Instruction Tuned model,

released by Google and available via Hugging Face

(Google, 2024). This model was selected following

comparative testing with other open LLMs available

at the time of development, including Meta Llama

3 and Mistral. Gemma demonstrated superior per-

formance in processing Maltese language texts. The

choice of LLM size was constrained by the available

24GB GPU memory on the processing server, which

efﬁciently supports models up to approximately 12

billion parameters. Running the LLMs on CPU was

tested but found to be prohibitively slow.
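A back-of-the-envelope calculation helps to show why GPU memory bounds the model size. The sketch below assumes 16-bit weights (2 bytes per parameter) and counts the weights only, ignoring activations and the key-value cache; the paper does not state the numeric precision used, so the figures are indicative at best.

```python
def weight_memory_gb(num_params, bytes_per_param=2):
    """Approximate memory needed for model weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


# Nominal parameter counts; 2 bytes per parameter is an assumption (16-bit weights).
for name, params in [("9B model", 9e9), ("12B model", 12e9)]:
    print(f"{name}: ~{weight_memory_gb(params):.0f} GB of weights at 16-bit precision")
```

Under these assumptions a nominal 9-billion-parameter model needs roughly 18 GB for its weights, leaving some headroom on a 24GB card, whereas a 12-billion-parameter model would occupy about 24 GB for weights alone. The upper end of the stated range would therefore presumably rely on reduced precision or quantisation, which is an inference and not something reported in the text.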

The LLM performs several critical tasks on the

scraped PQ data:

• Translation: The Title, Question, Answer, and

associated ministry ﬁelds of each PQ are automat-

ically translated from Maltese into English.

• COFOG-99 Categorisation: The LLM identi-

ﬁes and assigns the relevant COFOG-99 (Classiﬁ-

cation of the Functions of Government) category

to each PQ (United Nations Statistics Division,

2024). This United Nations standard provides a

thematic classiﬁcation of government activities.

• Keyword Extraction: Key terms are extracted

from the PQ content to facilitate thematic brows-

ing and analysis.

• Inter-PQ Link Identiﬁcation: The LLM identi-

ﬁes references within PQs to other Parliamentary

Questions, establishing outgoing links.
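The four tasks listed above could, for example, be requested in a single structured prompt whose reply is validated before storage. The sketch below is hypothetical: the prompt wording, the JSON keys and the validation rules are assumptions, and the category list uses the ten top-level COFOG divisions for illustration, since the granularity used by the dashboard is not stated here.

```python
import json

# The ten top-level COFOG divisions.
COFOG_DIVISIONS = [
    "General public services", "Defence", "Public order and safety",
    "Economic affairs", "Environmental protection",
    "Housing and community amenities", "Health",
    "Recreation, culture and religion", "Education", "Social protection",
]


def build_prompt(title, question, answer):
    """Assemble one instruction covering translation, category, keywords and citations."""
    categories = "\n".join(f"- {name}" for name in COFOG_DIVISIONS)
    return (
        "You are given a Maltese Parliamentary Question.\n"
        "Reply with a JSON object with the keys: title_en, question_en, "
        "answer_en, category, keywords, cited_pqs.\n"
        f"Choose category from this list:\n{categories}\n\n"
        f"Title: {title}\nQuestion: {question}\nAnswer: {answer}\n"
    )


def parse_enrichment(raw_reply):
    """Validate a model reply; raise ValueError if it does not match the expected shape."""
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    if data.get("category") not in COFOG_DIVISIONS:
        raise ValueError(f"unknown category: {data.get('category')!r}")
    keywords = data.get("keywords", [])
    cited = data.get("cited_pqs", [])
    if not isinstance(keywords, list) or not isinstance(cited, list):
        raise ValueError("keywords and cited_pqs must be lists")
    if not all(isinstance(number, int) for number in cited):
        raise ValueError("cited_pqs must contain PQ numbers as integers")
    # Drop empty or non-string keywords and trim surrounding whitespace.
    data["keywords"] = [k.strip() for k in keywords if isinstance(k, str) and k.strip()]
    data["cited_pqs"] = cited
    return data
```

Rejecting replies with an unknown category or malformed fields is one way to keep free-form model output from reaching the database unchecked.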

Given the inherent limits of LLM queries, particu-

larly for longer PQs, the system implements a strat-

egy where lengthy PQs are split into sections, pro-

cessed separately by the LLM, and then the results

are merged to ensure comprehensive analysis.
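One possible shape for this split-and-merge strategy is sketched below. The paragraph-based splitting, the character budget and the merge rules (concatenating translations, de-duplicating keywords, taking the most frequent category) are all assumptions chosen for illustration; the paper does not describe how sections are delimited or how partial results are reconciled.

```python
from collections import Counter


def split_into_sections(text, max_chars=2000):
    """Greedily pack paragraphs into sections of at most max_chars characters.

    A single paragraph longer than max_chars becomes its own oversized section.
    """
    sections, current = [], ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            sections.append(current)
            current = para
    if current:
        sections.append(current)
    return sections


def merge_results(results):
    """Combine per-section outputs (hypothetical keys: translation, keywords, category)."""
    if not results:
        raise ValueError("at least one section result is required")
    keywords = []
    for result in results:
        for keyword in result["keywords"]:
            if keyword not in keywords:  # keep first occurrence, preserve order
                keywords.append(keyword)
    # Most frequent category wins; ties go to the category seen first.
    category = Counter(r["category"] for r in results).most_common(1)[0][0]
    return {
        "translation": "\n\n".join(r["translation"] for r in results),
        "keywords": keywords,
        "category": category,
    }
```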

3.4 Additional Data Processing and

Synchronisation

Beyond the LLM-driven tasks, additional processing

is performed to enhance data navigability. Once out-

going links from a PQ to other PQs are identiﬁed,

the system automatically establishes corresponding

incoming links for the referenced PQs. This allows

the user interface to display both outgoing and in-

coming links, improving the user’s ability to trace in-

terconnected legislative discourse, a feature not avail-

able on the ofﬁcial parliamentary portal. All extracted

and processed data is then harvested into the Mon-

goDB database on the processing server.
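Deriving incoming links amounts to inverting the outgoing-link mapping. A minimal sketch follows, assuming PQs are identified by integer numbers and that outgoing links are held as a dictionary; both are assumptions about the data layout.

```python
from collections import defaultdict


def build_incoming_links(outgoing):
    """Invert {source_pq: [cited_pq, ...]} into {cited_pq: [source_pq, ...]}."""
    incoming = defaultdict(set)
    for source, targets in outgoing.items():
        for target in targets:
            if target != source:  # ignore self-references
                incoming[target].add(source)
    return {pq: sorted(sources) for pq, sources in incoming.items()}


# Hypothetical PQ numbers: 101 cites 50 and 75, 102 cites 50, 103 cites nothing.
outgoing_links = {101: [50, 75], 102: [50], 103: []}
print(build_incoming_links(outgoing_links))
```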

Data synchronisation between the processing server (located within the University of Malta's secure network and not publicly accessible) and the public-facing VPS is managed via a custom-built API. The API resides on the VPS, listens for incoming requests over HTTPS, and is only accessible from a predefined range of trusted IP addresses, enhancing security. It exposes calls to check which parliamentary sittings are currently stored in the VPS database and which PQs are stored for a given sitting, together with methods to store the details of a new parliamentary sitting and to add PQs to a sitting. The processing server uses these calls to push newly processed PQs to the VPS, so the two databases are kept automatically synchronised and the public dashboard stays up to date without exposing the processing infrastructure.
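Two ingredients of this synchronisation scheme can be sketched with the standard library alone: an allow-list check on the caller's IP address, and a comparison of what each side holds to decide what must be pushed. The function names, the data layout and the example address ranges (taken from the blocks reserved for documentation) are hypothetical.

```python
import ipaddress


def is_trusted(remote_addr, trusted_networks):
    """True if remote_addr falls inside any of the allowed CIDR ranges."""
    addr = ipaddress.ip_address(remote_addr)
    return any(addr in ipaddress.ip_network(net) for net in trusted_networks)


def plan_sync(local, remote):
    """Compare {sitting: [pq_number, ...]} on both servers.

    Returns (new_sittings, missing_pqs): sittings the remote side lacks entirely,
    and, per sitting, the PQ numbers that still have to be pushed.
    """
    new_sittings = sorted(set(local) - set(remote))
    missing_pqs = {}
    for sitting, pqs in local.items():
        absent = sorted(set(pqs) - set(remote.get(sitting, [])))
        if absent:
            missing_pqs[sitting] = absent
    return new_sittings, missing_pqs


TRUSTED = ["192.0.2.0/24"]  # documentation-only range, not a real deployment value
print(is_trusted("192.0.2.10", TRUSTED), is_trusted("198.51.100.7", TRUSTED))
```

Querying the remote side first and pushing only the difference makes the nightly push safe to repeat after a partial failure.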

3.5 Web Portal

The online PQ Dashboard, accessible at https://pq.ir.mt,

serves as the user-facing component, hosted on

the VPS. It does not perform additional data process-

ing but provides an intuitive interface for users to

query and explore the processed parliamentary data.

The system was developed using open-source tech-

nologies. Further details on the functionalities pro-

vided within this dashboard are described in Section

4 below.

The backend is a Python Flask application served by Gunicorn

and proxied behind Apache2. The frontend is built with

HTML, CSS, and JavaScript, utilising Bootstrap to

ensure a responsive design across various screen

sizes. Other libraries incorporated include DataTa-

bles for efﬁcient data presentation and quick search-

ing within tables, and Chart.js for data visualisations.

MongoDB is used as the database.

4 PQ DASHBOARD:

FUNCTIONALITIES AND

EXTRACTED INSIGHTS

The PQ Dashboard is publicly available at https://pq.ir.mt.

It currently covers Parliamentary Questions (PQs) from the current (14th) legislature, and

the system is updated nightly with newly published

PQs available on the ofﬁcial PQ portal (process de-

scribed in Section 3). Figure 1 shows a screenshot of

the dashboard.

Key functionalities available to users include:

• Browse by Categories: Users can explore PQs

classiﬁed into COFOG-99 Categories, enabling

thematic navigation of government functions.

• Browse by Keywords: The dashboard allows

users to search and browse PQs using extracted

keywords, facilitating the discovery of frequently

discussed topics.

• Analysing MPs’ Activity: The platform provides

tools to analyse the activity patterns of Members

of Parliament (MPs).

• Overview by Ministry: Users can view PQs or-

ganised by the ministries they were directed to,

offering insights into ministerial engagement.

• Advanced Filtering: Users can ﬁlter by date,

MPs, ministry, category and keyword. Any com-

bination is possible, and multiple values for MPs,

ministry, category and keyword can be entered.

Select2 is utilised to allow users easy search

through the dropdown values.

• Bilingual Access: The dashboard provides access

to PQs in both their original Maltese and the auto-

matically translated English versions, catering to

a wider audience.

• Official Source Linking: For each PQ, a direct

link is provided to the official version on https://pq.gov.mt.

This ensures transparency and allows

users to verify any information or inconsistencies

with the authoritative source.

• Inter-PQ Navigation: Users can seamlessly

browse between PQs via identiﬁed incoming and

outgoing links. This feature signiﬁcantly en-

hances research capabilities, as PQs often refer to

previous questions, and the dashboard eliminates

the need for manual searches for referenced doc-

uments.
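The combinable filters described in the list above map naturally onto a MongoDB-style query document, in which `$in` matches any of several values and `$gte`/`$lte` bound a range. The sketch below only builds such a filter as a plain dictionary; the field names and the assumption that dates are stored as ISO 8601 strings are hypothetical, and the dashboard's real query code is not described in the paper.

```python
from datetime import date


def build_filter(date_from=None, date_to=None, mps=(), ministries=(), categories=(), keywords=()):
    """Build a MongoDB-style filter; empty selections place no constraint."""
    query = {}
    date_range = {}
    if date_from:
        date_range["$gte"] = date_from.isoformat()
    if date_to:
        date_range["$lte"] = date_to.isoformat()
    if date_range:
        query["date"] = date_range
    selections = (
        ("mp", mps),
        ("ministry", ministries),
        ("category", categories),
        ("keywords", keywords),  # on an array field, $in matches if any element matches
    )
    for field_name, values in selections:
        if values:
            query[field_name] = {"$in": list(values)}
    return query


# Example with placeholder names: two MPs, one category, from 1 May 2025 onwards.
print(build_filter(date_from=date(2025, 5, 1), mps=["MP A", "MP B"], categories=["Education"]))
```

Conditions on different fields are combined conjunctively, while multiple values within one field act as alternatives, which matches the "any combination, multiple values" behaviour described for the filter controls.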

The subsequent subsections provide further de-

tails and some insights obtained when utilising the

different functionalities.

4.1 Browse by Categories

Users can view the distribution of PQs across the dif-

ferent COFOG-99 categories. They can then drill

down by selecting a particular category to view the

list of PQs categorised within that category. This ac-

tion opens the list of PQs as a clickable table, allowing

users to view individual PQs. Furthermore, horizon-

tal bar charts are displayed, showing the distribution

of ministries targeted within these PQs and the distri-

bution of MPs who posed these questions.

For instance, the largest share of PQs published from the start of the current legislature (16th May 2022) to mid-July 2025 related to General Public Services (8855 PQs). Of these, 866 PQs were addressed to the Minister for National Heritage, the Arts and Local Government, and 594 were addressed to the Ministry for Transport, Infrastructure and Public Works. The MP who asked the most PQs in this category was Jerome Caruana Cilia (929 PQs), followed by Ivan Bartolo (658 PQs). The second most popular COFOG-99 category within the same period is Economic Affairs, with 4358 PQs.
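The category overview and its drill-down charts are, in essence, grouped counts over the PQ records. A minimal sketch using toy records follows; the record keys and the placeholder names are invented for illustration and do not reproduce the figures reported above.

```python
from collections import Counter


def distribution(pqs, field_name, **where):
    """Count PQs per value of field_name, restricted to records matching `where`.

    Returns (value, count) pairs ordered from most to least frequent.
    """
    selected = (pq for pq in pqs if all(pq.get(k) == v for k, v in where.items()))
    return Counter(pq[field_name] for pq in selected).most_common()


# Toy records with placeholder names (not real data).
toy_pqs = [
    {"category": "Education", "ministry": "Ministry X", "mp": "MP A"},
    {"category": "Education", "ministry": "Ministry X", "mp": "MP B"},
    {"category": "Education", "ministry": "Ministry Y", "mp": "MP A"},
    {"category": "Health", "ministry": "Ministry Z", "mp": "MP C"},
]

print(distribution(toy_pqs, "category"))                        # overview by category
print(distribution(toy_pqs, "ministry", category="Education"))  # drill-down: ministries
print(distribution(toy_pqs, "mp", category="Education"))        # drill-down: MPs
```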

4.2 Analysing MPs’ Activity

Users can view a list of MPs who posed questions, along with statistics on the number of PQs posed as per the filtering selected. For instance, Jerome Caruana Cilia submitted the most PQs (2195) in this legislature, followed by Graziella Galea (2006 PQs). However, if only PQs related to Environmental Protection are considered, Rebekah Borg submitted the most questions (117), followed by Graziella Galea (95).

Figure 1: The PQ Dashboard landing page showing the distribution of PQs by Jerome Caruana Cilia according to the different COFOG-99 categories.

Users can then select a particular MP to drill down

into that MP’s activity. This opens a clickable table

listing PQs, along with horizontal bar charts showing

the distribution of ministries targeted by that MP, the

distribution of categories to which those PQs belong,

and the distribution of keywords associated with those

PQs.

For instance, when considering the PQs related

to Environmental Protection by Rebekah Borg, the

majority of them were submitted to the Ministry for

the Environment, Energy and the Regeneration of the

Grand Harbour (54 PQs), the Ministry for the En-

vironment, Energy and Public Health (32 PQs) and

the Ministry for the Environment, Energy and Enter-

prise (9 PQs). It should be noted that when ministe-

rial portfolios are modiﬁed by the government, the af-

fected ministries appear as ‘new’ ministries. Most of

the PQs submitted by Rebekah Borg related to Envi-

ronmental Protection concerned the Environment and

Resources Authority. This was clearly highlighted by

the visualisation of keyword distribution.

4.3 Overview by Ministry

This section displays a list of ministries along with the

number of PQs addressed to each of them. Users can

select a ministry to view a list of PQs addressed to that

ministry (subject to any additional ﬁltering provided

by the users), a horizontal bar chart showing the distribution of MPs who posed these questions, a horizontal bar chart showing the distribution of categories

to which those PQs belong, and a horizontal bar chart

showing the distribution of keywords associated with

those PQs.

For example, the Ministry for Education, Sport,

Youth, Recreation and Innovation is the most com-

monly addressed ministry (2911 PQs). Over 20%

of these PQs (632) were posed by Justin Schembri,

with Graziella Galea submitting the second most (249

PQs). The great majority of these PQs were cate-

gorised within the obvious COFOG-99 category ‘Ed-

ucation’. Others were categorised within ‘General

Public Services’, and ‘Recreation, Culture and Re-

ligion’, amongst others. Some of the most popular

keywords include ‘Schools’, ‘Students’ and ‘Sport’.

Figure 2 provides additional insights, showing the

breakdown of PQs directed to the Ministry for Gozo

and Planning between May and July 2025, classiﬁed

by the Member of Parliament (MP) posing the ques-

tion and the corresponding COFOG-99 categories.

Figure 2: Insights into Parliamentary Questions (PQs) di-

rected to the Ministry for Gozo and Planning from May to

July 2025, highlighting the activity of posing MPs and the

distribution across COFOG-99 categories.

4.4 Viewing a PQ

Each PQ listed in a data table can be clicked to open its detailed view. The details presented for a PQ include:

• Details available in the original ofﬁcial PQ,

namely:

– PQ number

– Sitting

– MP posing the question

– PQ Type (Oral vs. Written)

– The Ministry to which the PQ is addressed (in

Maltese)

– The Minister answering the PQ

– PQ Title (in Maltese)

– Question Text (in Maltese)

– Answer Text (in Maltese)

• Translated version of the applicable ﬁelds, more

speciﬁcally:

– Ministry name

– Title

– Question Text

– Answer Text

• Other details retrieved from the LLM processing:

– COFOG-99 Category (in both Maltese and En-

glish)

– Extracted Keywords (in both Maltese and En-

glish)

– Links to PQs cited within this PQ (a.k.a. out-

going links)

– Links to PQs citing this PQ (a.k.a. incoming

links)

• Link to the original PQ from the ofﬁcial Parlia-

ment of Malta portal (https://pq.gov.mt).

Figure 3 shows the PQ Dashboard’s detailed view

of a speciﬁc PQ.

It should be noted that the origi-

nal PQ may contain links to documents laid on the

Clerk’s table. These ‘documents laid’ are to date not

processed or displayed in the PQ Dashboard.

5 EVALUATION

The PQ Dashboard was informally evaluated through demonstrations to various key stakeholders, including officials from the Parliament of Malta, civil service officials within the Government of Malta, and former Members of Parliament. Qualitative feedback was recorded during these sessions. Although a more rigorous and systematic evaluation would have been desirable, this feedback still proved valuable in assessing the system’s utility and identifying areas for future enhancement.

The original PQ shown in Figure 3 is accessible at https://pq.gov.mt/pqweb.nsf/06d013e9f9ab0283c12568f50054014f/c1257d2e0046dfa1c1258c8100253c15 (last accessed: September 2025).

KDIR 2025 - 17th International Conference on Knowledge Discovery and Information Retrieval


Figure 3: A detailed view of a Parliamentary Question (PQ) within the dashboard, displaying all relevant information.

Positive Feedback

The overall feedback received was overwhelm-

ingly positive, particularly highlighting several key

strengths of the system:

• Multilingual Search Capability: A signiﬁcant

advantage highlighted was the ability to search

Parliamentary Questions (PQs) using English,

which greatly enhances accessibility for non-

Maltese speakers and facilitates broader research.

• Simpliﬁed Navigation through Inter-PQ Links:

Users appreciated the simpliﬁcation of PQ

searches, especially when different PQs refer to

each other. The system’s ability to identify and

link related questions streamlines the research

process, eliminating the need for users to initiate

new searches for referenced PQs, a limitation of

the ofﬁcial parliamentary portal.

• Thematic Overviews: The dashboard provides

clear overviews of the typical subjects of ques-

tions, allowing users to quickly grasp prevailing

topics and trends in parliamentary discourse.

Beyond general research into PQs, speciﬁc useful

applications of such a system were highlighted:

• Tracking Civil Service Projects: The system

was identiﬁed as a valuable tool for keeping track

of current projects related to the civil service. It

was noted that, until recently, PQs were often


the only means to acquire certain governmental

data (e.g., data related to governmental authori-

ties) when not shared through other ofﬁcial chan-

nels, making the dashboard a key tool for data ac-

quisition in such scenarios.

• Ministerial Consistency Checks: Ministerial of-

ﬁcials found the system useful for preparing an-

swers to new PQs, enabling them to ensure con-

sistency with information previously provided in

past PQ answers.

Identified Shortcomings and Areas for Improvement

Despite the positive reception, a number of shortcom-

ings and areas for improvement were also noted, pro-

viding crucial guidance for future development:

• Translation Nuances: While the automatic trans-

lations were generally considered to be of good

quality, it was observed that they occasionally

failed to capture the exact meaning implied in the

original Maltese text. Such issues were typically

present in “canned” parts of the text—frequently

recurring phrases in PQs. For instance, the phrase

“Ninforma lill-Onor. Interpellant illi...” (“I in-

form the Honourable Member that...”) was some-

times translated as “I inform the Honourable...”,

omitting the word ‘Member’. A suggested

workaround involves implementing rules to apply

pre-prepared, accurate translations for such com-

mon phrases, thereby correcting the AI-generated

output.

• Clear Source Attribution: It was recommended

that the system explicitly mark which parts of the

displayed text were obtained directly from ofﬁ-

cial sources (e.g., original Maltese question and

answer texts) and which were generated or pro-

cessed by AI. This would enhance transparency

and user trust.

• System Independence Disclaimer: To avoid

misconceptions due to potential inaccuracies in

AI-generated data, it was suggested that the sys-

tem explicitly state its separation from the ofﬁcial

parliamentary version. This clariﬁes that the PQ

Dashboard is a supplementary tool and defers to

https://pq.gov.mt as the authoritative source.

• Limited Historical Coverage: The current sys-

tem is limited to PQs from the 14th legislature.

Stakeholders expressed a strong desire for the sys-

tem to be extended to cover all PQs available on-

line, from the 9th legislature onwards, as provided

by the ofﬁcial PQ portal. This expansion would

unlock a wealth of historical data for more com-

prehensive longitudinal analysis.

• User Language Preference: The current inter-

face displays both Maltese and English versions

of PQs side-by-side. Feedback suggested imple-

menting a user-selectable language ﬂag, allow-

ing users to choose their preferred language for

all subsequent interactions within the dashboard,

rather than a dual display.
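The rule-based workaround suggested under Translation Nuances could be sketched as a small post-processing step applied to the machine translation. This is only an illustration under stated assumptions: the rule table `CANNED_RULES` and the function `apply_canned_translations` are hypothetical names, and the single rule shown covers just the example phrase quoted in the feedback.

```python
import re

# Hypothetical rule table. Each rule holds: the canned Maltese phrase, a
# pattern matching the faulty English rendering, and the approved rendering.
CANNED_RULES = [
    (
        "Ninforma lill-Onor. Interpellant illi",
        # Matches the rendering that drops the word 'Member'.
        re.compile(r"I inform the Honourable(?!\s+Member)"),
        "I inform the Honourable Member",
    ),
]


def apply_canned_translations(source_mt: str, translation_en: str) -> str:
    """Correct known faulty renderings of canned phrases in a translation."""
    for phrase_mt, faulty_en, approved_en in CANNED_RULES:
        # Only apply a rule when the Maltese source contains the canned phrase.
        if phrase_mt in source_mt:
            translation_en = faulty_en.sub(approved_en, translation_en)
    return translation_en
```

Checking the Maltese source before rewriting the English output keeps a rule from firing on unrelated text that happens to resemble the faulty rendering.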

The qualitative feedback, despite its informal na-

ture, has been instrumental in validating the core util-

ity of the PQ Dashboard and in clearly delineating

a roadmap for its future development, ensuring that

subsequent enhancements directly address user needs

and improve the system’s accuracy and usability.

6 CONCLUSIONS AND FUTURE WORK

The PQ Dashboard demonstrates how public data can be made more accessible using relatively low-cost hardware and Machine Translation. By offering access to Parliamentary Questions (PQs) in both their original Maltese and automatically translated English versions, the system substantially broadens the potential audience for this vital public information. In addition, the extraction of key details from each document, such as COFOG-99 categories, keywords, and inter-document citations, enhances accessibility further. These extracted details are then aggregated, enabling users to ‘profile’ Members of Parliament (MPs) and Ministries and to identify their primary interests, thereby offering deeper insights into parliamentary activity.
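As a rough illustration of such an aggregation, the sketch below counts COFOG categories per MP or per ministry. It is not the dashboard's implementation: the function name `profile_by`, the dictionary keys `mp`, `ministry`, and `cofog`, and the sample values in the usage note are assumptions made for the example.

```python
from collections import Counter, defaultdict


def profile_by(pqs, key):
    """Count COFOG categories per value of `key` (e.g. "mp" or "ministry")."""
    profiles = defaultdict(Counter)
    for pq in pqs:
        profiles[pq[key]][pq["cofog"]] += 1
    return profiles
```

For example, `profile_by(pqs, "mp")["MP A"].most_common(1)` would return the category that the hypothetical "MP A" asks about most often, together with its count.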

A fundamental principle underpinning the sys-

tem’s development was the exclusive reliance on free

and publicly available open-source technologies, pur-

posefully eschewing commercial services. This ap-

proach not only ensures inherent cost-effectiveness

but also serves as a robust showcase for the devel-

opment of secure and privacy-preserving Artiﬁcial

Intelligence (AI) solutions within the public sector.

This is particularly relevant even when dealing with

publicly available information where immediate pri-

vacy concerns might appear less pressing, as it un-

derscores a commitment to data sovereignty and ethi-

cal AI practices through on-premise Large Language

Model (LLM) deployment.

Future work will expand the capabilities and scope

of the PQ Dashboard. The ﬁrst priority is to ad-


dress the limitations identiﬁed during evaluation (see

Section 5), with emphasis on extending coverage

to earlier legislatures—starting from the 9th—whose

records are already publicly available. Secondly, the

system will be updated to incorporate more recent

or advanced Large Language Models, ensuring con-

tinued state-of-the-art performance. In addition, dis-

course analysis is planned to capture different ques-

tion types (e.g., factual, policy-oriented, or account-

ability questions) and to categorise response types

(e.g., factual, explanatory, or evasive answers), offer-

ing a more nuanced understanding of parliamentary

dialogue. Furthermore, a more formal and structured

user study is planned to systematically assess the sys-

tem’s strengths and weaknesses.

ACKNOWLEDGEMENTS

The authors acknowledge the use of Artiﬁcial Intel-

ligence (AI) language models in the drafting and re-

ﬁnement of this manuscript. Speciﬁcally, the Gem-

ini family of Large Language Models (developed

by Google) and OpenAI’s ChatGPT (version GPT-

4, July 2025) were utilised to support summarisation,

language reﬁnement and structural organisation. All

AI-assisted content was thoroughly reviewed, edited,

and validated by the authors to ensure accuracy, orig-

inality, and compliance with academic standards.
