UniQue: An Approach for Unified and Efficient Querying of Heterogeneous Web Data Sources

Markku Laine, Jari Kleimola, Petri Vuorimaa

2016

Abstract

Governments, organizations, and people are publishing open data on the Web more than ever before. Consuming the data, however, requires substantial effort from web mashup developers, as they have to familiarize themselves with a diversity of data formats and query techniques specific to each data source. While several solutions have been proposed to improve web querying, none of them covers the aforementioned aspects in a developer-friendly and efficient manner. Therefore, we devised a unified querying (UniQue) approach and a proxy-based implementation that provides a uniform and declarative interface for querying heterogeneous data sources across the Web. Besides hiding the differences between the underlying data formats and query techniques, UniQue relies heavily on open W3C standards to minimize the learning effort required by developers. Building on this, we propose the Unified Query Language (UQL), which combines the expressiveness of CSS Selectors and XPath into a single, flexible selector language. We show that the adoption of UniQue and UQL can effectively streamline web querying, leverage developers’ existing knowledge, and reduce generated network traffic compared to the current state-of-the-art approach.
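
The idea of hiding format differences behind one selector interface can be illustrated with a minimal sketch. It is not the UniQue implementation and does not use UQL syntax, neither of which is specified in this abstract. It assumes a hypothetical `query` function that normalizes JSON and XML into one tree and evaluates the limited XPath subset supported by Python's `xml.etree.ElementTree`. The JSON-to-tree mapping, the function names, and the sample data are all assumptions made for illustration.

```python
import json
import xml.etree.ElementTree as ET


def json_to_element(name, value):
    """Map a parsed JSON value onto an element tree (hypothetical mapping)."""
    element = ET.Element(name)
    if isinstance(value, dict):
        # Each object member becomes a child element named after its key.
        for key, child in value.items():
            element.append(json_to_element(key, child))
    elif isinstance(value, list):
        # Array entries have no name in JSON, so wrap each one in <item>.
        for item in value:
            element.append(json_to_element("item", item))
    elif value is not None:
        element.text = str(value)
    return element


def query(document, media_type, selector):
    """Return the text of every node matching `selector`, whatever the format."""
    if media_type == "application/json":
        root = json_to_element("root", json.loads(document))
    elif media_type == "application/xml":
        root = ET.fromstring(document)
    else:
        raise ValueError(f"unsupported media type: {media_type}")
    # ElementTree supports only a subset of XPath; that is enough for this sketch.
    return [node.text for node in root.findall(selector)]


# Sample data invented for this example.
JSON_SOURCE = '{"city": [{"name": "Helsinki"}, {"name": "Espoo"}]}'
XML_SOURCE = (
    "<root><city><name>Helsinki</name></city>"
    "<city><name>Espoo</name></city></root>"
)
```

With these definitions, `query(JSON_SOURCE, "application/json", ".//name")` and `query(XML_SOURCE, "application/xml", ".//name")` both return the same list of city names, so the caller writes one selector regardless of the source format.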

Paper Citation


in Harvard Style

Laine M., Kleimola J. and Vuorimaa P. (2016). UniQue: An Approach for Unified and Efficient Querying of Heterogeneous Web Data Sources. In Proceedings of the 12th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 978-989-758-186-1, pages 84-94. DOI: 10.5220/0005764100840094


in Bibtex Style

@conference{webist16,
author={Markku Laine and Jari Kleimola and Petri Vuorimaa},
title={UniQue: An Approach for Unified and Efficient Querying of Heterogeneous Web Data Sources},
booktitle={Proceedings of the 12th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST},
year={2016},
pages={84-94},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005764100840094},
isbn={978-989-758-186-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 12th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST
TI - UniQue: An Approach for Unified and Efficient Querying of Heterogeneous Web Data Sources
SN - 978-989-758-186-1
AU - Laine M.
AU - Kleimola J.
AU - Vuorimaa P.
PY - 2016
SP - 84
EP - 94
DO - 10.5220/0005764100840094