
Authors: Arman Arzani; Theodor Josef Vogl; Marcus Handte and Pedro José Marrón

Affiliation: University of Duisburg-Essen, Essen, Germany

Keyword(s): Innovation Management, Data Mining, University Structure Extraction, Web Page Classification.

Abstract: To support innovation coaches in scouting activities such as discovering expertise and trends inside a university and finding potential innovators, we designed INSE, an innovation search engine which automates the data gathering and analysis processes. The primary goal of INSE is to provide comprehensive system support across all stages of innovation scouting, reducing the need for manual data collection and aggregation. To provide innovation coaches with the necessary information on individuals, INSE must first establish the structure of the organization. This includes identifying the associated staff and researchers in order to assess their academic activities. While this could in theory be done manually, the task is error-prone and virtually impossible for large organizations. In this paper, we propose a generic organization mining approach that combines a rule-based algorithm, LLMs and a fine-tuned sequence-to-sequence classifier on university websites, independent of web technologies, content management systems or website layout. We implement the approach and evaluate the results against four different universities, namely Duisburg-Essen, Münster, Dortmund, and Wuppertal. The evaluation indicates that our approach is generic and enables the identification of university aggregator pages with an F1 score above 85% and of entity landing pages with F1 scores of 100% for faculties and above 78% for institutes and chairs.
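
The abstract reports its results as F1 scores. As a reminder of what that metric measures, the following is a minimal sketch of an F1 computation for one page class. It is not taken from the paper: the function name `f1_score`, the set-of-URLs representation and the example URLs are assumptions made for illustration only.

```python
def f1_score(true_pages: set[str], predicted_pages: set[str]) -> float:
    """F1 for a single class, given gold and predicted sets of page URLs.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    true_positives = len(true_pages & predicted_pages)
    if true_positives == 0:
        # Also covers empty gold or predicted sets, avoiding division by zero.
        return 0.0
    precision = true_positives / len(predicted_pages)
    recall = true_positives / len(true_pages)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Made-up example: four gold landing pages, four predictions, three overlap.
gold = {"/faculty/a", "/faculty/b", "/faculty/c", "/faculty/d"}
predicted = {"/faculty/a", "/faculty/b", "/faculty/c", "/institute/x"}
score = f1_score(gold, predicted)
```

In this made-up example precision and recall are both 3/4, so the harmonic mean is also 3/4. An F1 of 100%, as reported for faculties, would require the predicted set to match the gold set exactly.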

CC BY-NC-ND 4.0

Paper citation in several formats:
Arzani, A., Vogl, T. J., Handte, M. and Marrón, P. J. (2025). A Hybrid Approach for Mining the Organizational Structure from University Websites. In Proceedings of the 17th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - KDIR; ISBN ; ISSN 2184-3228, SciTePress, pages 188-199. DOI: 10.5220/0013658600004000

@conference{kdir25,
author={Arman Arzani and Theodor Josef Vogl and Marcus Handte and Pedro José Marrón},
title={A Hybrid Approach for Mining the Organizational Structure from University Websites},
booktitle={Proceedings of the 17th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - KDIR},
year={2025},
pages={188-199},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013658600004000},
isbn={},
issn={2184-3228},
}

TY - CONF

JO - Proceedings of the 17th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - KDIR
TI - A Hybrid Approach for Mining the Organizational Structure from University Websites
SN -
IS - 2184-3228
AU - Arzani, A.
AU - Vogl, T.
AU - Handte, M.
AU - Marrón, P.
PY - 2025
SP - 188
EP - 199
DO - 10.5220/0013658600004000
PB - SciTePress