
longer periods of time, such as stop-and-go scenarios
or long overtaking manoeuvres. One reason for this
performance drop could be the segmentation of the
data into 15-second sequences. As a result, these
scenarios cannot be adequately queried because they
are not fully contained within this short time window. In
these cases, the LLM reports that there are few or no
such instances in the dataset, which again demonstrates
a good understanding of the data but limits the retrieval
performance score for these queries. Variable time
windows that cover both short- and long-term manoeuvre
development could help the presented method capture
these manoeuvres. Further performance challenges are
particularly notable in the Combination category, which
covers combinatorial queries over multiple behavioural
elements. In many cases, the response covers specific
subsets of the requested combination, but the model
often fails to capture all related aspects of highly
combinatorial queries at L2 and L3.
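The idea of variable time windows can be sketched as follows. This is a minimal illustration and not the segmentation used in this work: the function name `multi_scale_windows`, the window lengths, the `overlap` parameter and the frame rate in the usage example are assumptions made for the example.

```python
from typing import Iterator, Sequence, Tuple


def multi_scale_windows(
    n_frames: int,
    frame_rate: float,
    window_lengths_s: Sequence[float],
    overlap: float = 0.5,
) -> Iterator[Tuple[float, int, int]]:
    """Yield (window_length_s, start_frame, end_frame) for each window length.

    Hypothetical helper: a recording of n_frames frames is cut into sliding
    windows of several durations, so that short manoeuvres (e.g. lane changes)
    and long ones (e.g. stop-and-go phases) each fit into at least one window.
    end_frame is exclusive.
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    for length_s in window_lengths_s:
        size = int(round(length_s * frame_rate))
        # Skip window lengths that are empty or longer than the recording.
        if size <= 0 or size > n_frames:
            continue
        # Step between consecutive window starts, at least one frame.
        step = max(1, int(round(size * (1.0 - overlap))))
        for start in range(0, n_frames - size + 1, step):
            yield length_s, start, start + size


if __name__ == "__main__":
    # Assumed example: a 60 s recording at 25 frames per second.
    for length_s, start, end in multi_scale_windows(1500, 25.0, [15, 30, 60]):
        print(f"{length_s:>4} s window: frames {start} to {end}")
```

Each window could then be embedded and indexed separately, at the cost of a larger index and overlapping samples that the retrieval step would need to deduplicate.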
Qualitative analysis of the model responses shows
that the LLM has a comprehensive understanding of the
data samples and the associated clusters in the
geometric embedding space. Providing both data and
knowledge graph references, together with a full-text
answer and detailed reasoning about the retrieved
scenarios, facilitates interpretability.
7 CONCLUSION AND FUTURE WORK
This paper presents a comprehensive approach for
retrieving behavioural scenarios from unlabelled
trajectories in real driving data. The geometric
embedding based on HD captures detailed scenario
attributes such as infrastructure, specific features of
object trajectories, and their interactions. The
guided description of the trajectory space by GPT-4o,
combined with indexing by a GraphRAG pipeline,
allows users to query and analyse the generated
representation using natural language and behavioural
scenario descriptions, without prior annotation or
additional information extraction. The open-context
nature of the LLM provides an open vocabulary, so
queries do not need to be anticipated before data
extraction.
Future research could focus on improving the pro-
posed approach in several ways. First, the introduc-
tion of different segment lengths can be considered
to accommodate different granularities in manoeuvre
combinations, which would address performance
issues such as the observed challenges with
stop-and-go scenarios and long overtaking manoeuvres.
Second, the use of alternative models to the VRAE, such
as Transformers, may also prove beneficial for
extended context representation. This can be coupled
with the inclusion of additional scenario parameters
based on the 6LM for enhanced context representation
and search. The database should be extended
beyond the highway scenarios of the highD dataset
to include intersections, urban scenarios and other
real-world use cases. To evaluate the scenario retrieval
performance, ground truth datasets should be created
that allow for additional performance metrics such as
Recall@k. This can be combined with an evaluation
of alternative prompting and RAG approaches to
improve retrieval specifically for combinatorial
queries. Besides the HD, which introduces a spatial
distance, additional metrics should be evaluated with
respect to the resulting embedding space for driving
scenarios, specifically including temporal and
spatial distances. Finally, the method should be
evaluated and improved on the basis of user studies
involving V&V engineers, investigating the most
important types of queries and their relation to
applications under the SOTIF standard.
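As an illustration of the Recall@k metric proposed above, the following minimal sketch computes it for a single query. It assumes that a ground truth set of relevant scenario IDs exists for the query, which is not yet the case for the presented dataset; the function name `recall_at_k` and the scenario IDs are hypothetical.

```python
from typing import Sequence, Set


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant scenarios that appear in the top-k results.

    retrieved: scenario IDs returned for a query, ordered by rank.
    relevant:  ground truth scenario IDs for the same query (assumed given).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant:
        raise ValueError("Recall@k is undefined without relevant scenarios")
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


if __name__ == "__main__":
    # Hypothetical ranked result list and ground truth for one query.
    ranked = ["s3", "s7", "s1", "s9"]
    truth = {"s1", "s3", "s5"}
    for k in (1, 2, 3, 4):
        print(f"Recall@{k} = {recall_at_k(ranked, truth, k):.2f}")
```

Averaging this value over all queries of a category would give a per-category score. Some works normalise by min(k, number of relevant items) instead, which is a design choice that should be stated alongside the results.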
REFERENCES
Edge, D., Trinh, H., Cheng, N., Bradley, J., Chao, A., Mody,
A., Truitt, S., and Larson, J. (2024). From local to
global: A graph RAG approach to query-focused
summarization. arXiv preprint arXiv:2404.16130.
Elspas, P., Klose, Y., Isele, S. T., Bach, J., and Sax, E.
(2021). Time series segmentation for driving scenario
detection with fully convolutional networks. In VE-
HITS, pages 56–64.
Elspas, P., Langner, J., Aydinbas, M., Bach, J., and Sax,
E. (2020). Leveraging regular expressions for flexible
scenario detection in recorded driving data. In 2020
IEEE International Symposium on Systems Engineer-
ing (ISSE), pages 1–8. IEEE.
Hoseini, F., Rahrovani, S., and Chehreghani, M. H. (2021).
Vehicle motion trajectories clustering via embedding
transitive relations. In 2021 IEEE International In-
telligent Transportation Systems Conference (ITSC),
pages 1314–1321.
International Organization for Standardization (2022).
Road vehicles - safety of the intended functionality. ISO
21448:2022(E). ICS: 43.040.10.
Krajewski, R., Bock, J., Kloeker, L., and Eckstein, L.
(2018). The highD dataset: A drone dataset of
naturalistic vehicle trajectories on German highways for
validation of highly automated driving systems. In 2018
21st International Conference on Intelligent Trans-
portation Systems (ITSC), pages 2118–2125.
McInnes, L., Healy, J., and Astels, S. (2017). hdbscan: Hi-
VEHITS 2025 - 11th International Conference on Vehicle Technology and Intelligent Transport Systems