# Mapping Distance Graph Kernels using Bipartite Matching

### Tetsuya Kataoka, Eimi Shiotsuki, Akihiro Inokuchi

#### Abstract

The objective of graph classification is to classify graphs of similar structures into the same class. This problem is of key importance in areas such as cheminformatics and bioinformatics. Support Vector Machines can efficiently classify graphs if graph kernels are used instead of feature vectors. In this paper, we propose two novel and efficient graph kernels called Mapping Distance Kernel with Stars (MDKS) and Mapping Distance Kernel with Vectors (MDKV). MDKS approximately measures the graph edit distance using star structures of height one. The method runs in $O(\upsilon^3)$ time, where $\upsilon$ is the maximum number of vertices in the graphs. However, when the height of the star structures is increased to avoid structural information loss, this graph kernel is no longer efficient. Hence, MDKV represents star structures of height greater than one as vectors and sums their Euclidean distances. It runs in $O(h(\upsilon^3 +|\Sigma|\upsilon^2))$ time, where $\Sigma$ is the set of vertex labels and graphs are iteratively relabeled $h$ times. We verify the computational efficiency of the proposed graph kernels on artificially generated datasets. Further, results on three real-world datasets show that the classification accuracy of the proposed graph kernels is higher than that of three conventional graph kernel methods.
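
As a rough illustration of the idea behind MDKS, the Python sketch below decomposes each graph into height-one stars, builds a pairwise star-distance matrix, and solves the resulting assignment problem with a compact $O(n^3)$ Hungarian-style routine. The graph encoding (a `labels` dict plus an `adj` dict), all helper names, the star distance (modelled on the star edit distance of Zeng et al.), and the final `exp(-gamma * d)` kernel form are assumptions made for illustration; the paper's exact cost function and kernel definition may differ.

```python
import math
from collections import Counter

def stars(labels, adj):
    """One height-one star per vertex: (root label, multiset of neighbor labels)."""
    return [(labels[v], Counter(labels[u] for u in adj[v])) for v in labels]

def star_distance(s, t):
    """Assumed star distance: root mismatch + degree gap + unmatched leaf labels."""
    (r1, l1), (r2, l2) = s, t
    n1, n2 = sum(l1.values()), sum(l2.values())
    common = sum((l1 & l2).values())  # size of the multiset intersection
    return int(r1 != r2) + abs(n1 - n2) + max(n1, n2) - common

def min_cost_assignment(cost):
    """Minimum total cost of a perfect matching on a square cost matrix (O(n^3))."""
    n = len(cost)
    inf = float("inf")
    u, v = [0] * (n + 1), [0] * (n + 1)      # row / column potentials
    p, way = [0] * (n + 1), [0] * (n + 1)    # p[j]: row matched to column j
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [inf] * (n + 1), [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:  # augment along the alternating path back to column 0
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return sum(cost[p[j] - 1][j - 1] for j in range(1, n + 1))

def mapping_distance(g1, g2):
    """Cost of the cheapest star-to-star matching between two graphs."""
    s1, s2 = stars(*g1), stars(*g2)
    n = max(len(s1), len(s2))
    dummy = (None, Counter())  # pads the smaller graph (assumed convention)
    s1 = s1 + [dummy] * (n - len(s1))
    s2 = s2 + [dummy] * (n - len(s2))
    return min_cost_assignment([[star_distance(a, b) for b in s2] for a in s1])

def mapping_distance_kernel(g1, g2, gamma=0.1):
    """Hypothetical kernel form; gamma is an illustrative parameter."""
    return math.exp(-gamma * mapping_distance(g1, g2))

# Two labeled paths that differ only in the last vertex label.
path = {0: {1}, 1: {0, 2}, 2: {1}}
g_cco = ({0: "C", 1: "C", 2: "O"}, path)
g_ccn = ({0: "C", 1: "C", 2: "N"}, path)
print(mapping_distance(g_cco, g_ccn), mapping_distance_kernel(g_cco, g_ccn))
```

The cubic cost of the assignment step dominates here, which matches the $O(\upsilon^3)$ bound quoted for MDKS. A library routine such as SciPy's `linear_sum_assignment` could replace the hand-written solver.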

#### Paper Citation

#### in Harvard Style

Kataoka T., Shiotsuki E. and Inokuchi A. (2017). **Mapping Distance Graph Kernels using Bipartite Matching**. In *Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM*, ISBN 978-989-758-222-6, pages 61-70. DOI: 10.5220/0006112900610070

#### in Bibtex Style

@conference{icpram17,

author={Tetsuya Kataoka and Eimi Shiotsuki and Akihiro Inokuchi},

title={Mapping Distance Graph Kernels using Bipartite Matching},

booktitle={Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},

year={2017},

pages={61-70},

publisher={SciTePress},

organization={INSTICC},

doi={10.5220/0006112900610070},

isbn={978-989-758-222-6},

}

#### in EndNote Style

TY - CONF

JO - Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM

TI - Mapping Distance Graph Kernels using Bipartite Matching

SN - 978-989-758-222-6

AU - Kataoka T.

AU - Shiotsuki E.

AU - Inokuchi A.

PY - 2017

SP - 61

EP - 70

DO - 10.5220/0006112900610070