Authors:
Yasuhiro Yamada (1) and Tetsuya Nakatoh (2)
Affiliations:
(1) Institute of Science and Engineering, Academic Assembly, Shimane University, 1060 Nishikawatsu-cho, Matsue-shi, Shimane, 690-8504, Japan
(2) Research Institute for Information Technology, Kyushu University, 744 Motooka, Nishi-ku, Fukuoka, 819-0395, Japan
Keyword(s):
Open Government Data, E-Government, Tag Recommendation, Multi-label Classification, Metadata.
Related Ontology Subjects/Areas/Topics:
Applications; Artificial Intelligence; e-Business; Enterprise Information Systems; Government; Intelligent Information Systems; Knowledge Management and Information Sharing; Knowledge-Based Systems; Metadata and Structured Documents; Society, e-Business and e-Government; Symbolic Systems; Web Information Systems and Technologies
Abstract:
Open government data (OGD) is statistical data produced and published by governments. Administrators often assign tags to the metadata of OGD. A tag, which consists of a single word or multiple words, describes the data. Tags help users understand the data without actually reading it and also help them search for OGD. However, administrators must understand the data in detail in order to assign tags. We take two different approaches to assigning appropriate tags to OGD. First, we use a multi-label classification technique to assign tags to OGD from the tags in the training data. Second, we extract particular noun phrases from the metadata of OGD by calculating the difference between the frequency of a noun phrase and the frequencies of the single words within the noun phrase. Experiments using 196,587 datasets on Data.gov show that the prediction accuracy of the multi-label classification method is sufficient for developing a tag recommendation system. The experiments also show that our method for extracting particular noun phrases extracts some infrequent tags of the datasets.
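
As a rough illustration of the second approach, the sketch below ranks candidate noun phrases by the gap between the mean frequency of their constituent words and the frequency of the phrase itself. The abstract does not give the exact formula, so the scoring function `frequency_gap`, the helper names, the toy documents, and the assumption that candidate noun phrases are already available as token tuples are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def count_frequencies(documents, candidates):
    """Count single-word frequencies and contiguous occurrences of each candidate phrase."""
    word_freq = Counter()
    phrase_freq = Counter()
    for tokens in documents:
        word_freq.update(tokens)
        for phrase in candidates:
            n = len(phrase)
            for i in range(len(tokens) - n + 1):
                if tuple(tokens[i:i + n]) == phrase:
                    phrase_freq[phrase] += 1
    return word_freq, phrase_freq


def frequency_gap(phrase, word_freq, phrase_freq):
    """Hypothetical score: mean word frequency minus phrase frequency.

    A gap near zero means the words rarely occur outside the phrase.
    This is an assumed formula, not the one used in the paper.
    """
    mean_word_freq = sum(word_freq[w] for w in phrase) / len(phrase)
    return mean_word_freq - phrase_freq[phrase]


def rank_phrases(documents, candidates):
    """Return the candidates that occur at least once, smallest gap first."""
    word_freq, phrase_freq = count_frequencies(documents, candidates)
    seen = [p for p in candidates if phrase_freq[p] > 0]
    return sorted(seen, key=lambda p: frequency_gap(p, word_freq, phrase_freq))


# Toy metadata titles, tokenized by whitespace (illustrative data only).
documents = [
    "sea ice extent monthly data".split(),
    "sea ice thickness data".split(),
    "monthly data on air quality".split(),
    "air quality data by county".split(),
]
# Candidate noun phrases are assumed to come from an upstream chunker.
candidates = [("sea", "ice"), ("air", "quality"), ("monthly", "data")]

ranked = rank_phrases(documents, candidates)
word_freq, phrase_freq = count_frequencies(documents, candidates)
```

In this toy data, "sea ice" and "air quality" have a gap of zero because their words never appear outside the phrase, while "monthly data" ranks last because "data" is frequent on its own. A real pipeline would need part-of-speech tagging or chunking to produce the candidates.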