Insights for Manage Geospatial Big Data in Ecosystem Monitoring using Processing Chains and High Performance Computing

Fabián Santos, Gunther Menz

2015

Abstract

Managing Geospatial Big Data is a challenging and increasingly frequent task in ecosystem monitoring, owing to the accelerating growth and accessibility of geographical technologies and archives. This research therefore focuses on the design and development of two reproducible processing chains in open source software, using the High Performance Computing approach to manage the volume, variety and velocity dimensions of two cases of Geospatial Big Data. The first case is a large collection of Landsat satellite images, which must be processed sequentially to prepare a time-series analysis of the regeneration process of disturbed tropical forests in Ecuador. The second case is a unique, complex database of Geospatial data of different sources and types, which must be organized and harmonized to allow an exploratory statistical analysis and pattern extraction of the drivers that influence the restoration process of disturbed tropical forests in Ecuador. To this end, the design of the processing chains is based on parallel computing, which divides the data into small pieces and distributes them among the available processing units. The design therefore allows the computing resources to be scaled up when more are available. Our first results, obtained on a multi-core computer, showed that the processing chain designed for the large collection of Landsat images is the only way to manage the volume and velocity dimensions of Geospatial Big Data.
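
The divide-and-distribute idea behind the processing chains can be illustrated with a minimal Python sketch. It is not the authors' implementation: the scene identifiers, the per-scene step `process_scene`, and the helper names `split_into_chunks` and `run_chain` are hypothetical placeholders, and the standard-library `concurrent.futures` module is assumed here only as one possible way to use the cores of a multi-core computer.

```python
import concurrent.futures as cf
from itertools import chain


def split_into_chunks(items, n_chunks):
    """Split items into at most n_chunks contiguous, near-equal pieces."""
    items = list(items)
    n_chunks = max(1, min(n_chunks, len(items)))
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def process_scene(scene_id):
    """Hypothetical stand-in for the real per-image processing step."""
    return (scene_id, len(scene_id))


def process_chunk(chunk):
    """Process every scene of one chunk inside a single worker."""
    return [process_scene(scene_id) for scene_id in chunk]


def run_chain(scene_ids, n_workers=2, executor_cls=cf.ProcessPoolExecutor):
    """Distribute chunks of scenes among n_workers processing units."""
    chunks = split_into_chunks(scene_ids, n_workers)
    with executor_cls(max_workers=n_workers) as pool:
        # map() returns the results in the same order as the input chunks.
        results = pool.map(process_chunk, chunks)
        return list(chain.from_iterable(results))


if __name__ == "__main__":
    # Hypothetical scene identifiers, not real Landsat product names.
    scenes = [f"scene_{i:03d}" for i in range(10)]
    print(run_chain(scenes, n_workers=4))
```

Raising `n_workers` is the only change needed to use more processing units, which mirrors the scale-up property stated in the abstract. The `executor_cls` parameter is an illustrative design choice that lets the same chain run on a thread pool when the per-scene step is dominated by input and output.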



Paper Citation


in Harvard Style

Santos F. and Menz G. (2015). Insights for Manage Geospatial Big Data in Ecosystem Monitoring using Processing Chains and High Performance Computing. In Doctoral Consortium - DCCLOSER, (CLOSER 2015) ISBN Not Available, pages 3-9


in Bibtex Style

@conference{dccloser15,
author={Fabián Santos and Gunther Menz},
title={Insights for Manage Geospatial Big Data in Ecosystem Monitoring using Processing Chains and High Performance Computing},
booktitle={Doctoral Consortium - DCCLOSER, (CLOSER 2015)},
year={2015},
pages={3-9},
publisher={SciTePress},
organization={INSTICC},
doi={},
isbn={Not Available},
}

