
 
 
Figure 14: Class Diagram reduction proposed by the 
domain expert. 
5  CONCLUSIONS 
In this paper we presented a process for inferring a 
conceptual data model from a spreadsheet-based 
information system. The process was defined in 
an industrial context and validated by an experiment 
involving three different spreadsheet-based 
information systems from the automotive 
industrial domain. The results of the experiment 
showed the applicability of the process and the 
acceptability of the inferred models, according to the 
judgment of experts in the application domain of 
the spreadsheets. 
Our work differs from other approaches described in the 
literature in that the proposed approach is 
tailored to spreadsheets used to implement 
information systems, rather than calculation sheets. 
In future work, we plan to perform further 
experiments involving other spreadsheet-based 
systems belonging to different application domains. 
Moreover, we want to extend our approach 
by proposing reverse engineering techniques aimed 
at inferring the functional dependencies between the 
data embedded in the spreadsheets by analyzing the 
VBA functionalities they include. 
ACKNOWLEDGEMENTS 
This work was carried out in the context of the 
research project IESWECAN (Informatics for 
Embedded SoftWare Engineering of Construction 
and Agricultural machiNes - PON01-01516), 
partially funded by the Italian Ministry for 
University and Research (MIUR). 
REFERENCES 
Abraham R. and Erwig M., Header and unit inference for 
spreadsheets through spatial analyses. In Proceedings 
of the IEEE International Symposium on Visual 
Languages and Human-Centric Computing 
(VL/HCC), 2004, pages 165–172. 
Abraham R. and Erwig M., Inferring templates from 
spreadsheets. In Proceedings of the 28th International 
Conference on Software Engineering (ICSE), ACM, 
New York, NY, USA, 2006, pages 182–191. 
Abraham R., Erwig M. and Andrew S., A type system 
based on end-user vocabulary. In Proceedings of the 
IEEE Symposium on Visual Languages and Human-
Centric Computing (VL/HCC), Washington, DC, 
USA, IEEE Computer Society, 2007, pages 215–222. 
Abraham R. and Erwig M., Mutation operators for 
spreadsheets. IEEE Transactions on Software 
Engineering, 35(1):94–108, 2009. 
Ahmad Y., Antoniu T., Goldwater S. and Krishnamurthi 
S., A type system for statically detecting spreadsheet 
errors. In Proceedings of the IEEE International 
Conference on Automated Software Engineering, 
2003, pages 174–183. 
Amalfitano D., Fasolino A.R., Maggio V., Tramontana P., 
Di Mare G., Ferrara F., Scala S., Migrating legacy 
spreadsheets-based systems to Web MVC architecture: 
An industrial case study, Proceedings of CSMR-
WCRE, 2014, pages 387–390. 
Amalfitano D., Fasolino A.R., Maggio V., Tramontana P., 
De Simone V., Reverse Engineering of Data Models 
from Legacy Spreadsheets-Based Systems: An 
Industrial Case Study, Proceedings of the 22nd Italian 
Symposium on Advanced Database Systems, 2014, 
pages 123–130. 
Bovenzi D., Canfora G., Fasolino A.R., Enabling legacy 
system accessibility by Web heterogeneous clients. In 
Proceedings of the Seventh European Conference on 
Software Maintenance and Reengineering, IEEE CS 
Press, 2003, pages 73–81. 
Canfora G., Fasolino A.R., Frattolillo G., Tramontana P., 
A wrapping approach for migrating legacy system 
interactive functionalities to Service Oriented 
Architectures. Elsevier, Journal of Systems and 
Software, 2008, vol. 81(4):463–480. 
Chen Z. and Cafarella M., Automatic web spreadsheet 
data extraction. In Proceedings of the 3rd International 
Workshop on Semantic Search Over the Web (SS@ 
'13). ACM, New York, NY, USA, 2013, 8 pages. 
Cunha, J., Erwig M., Saraiva J., Automatically Inferring 
ClassSheet Models from Spreadsheets. In IEEE 
Symposium on Visual Languages and Human-Centric 
Computing (VL/HCC), IEEE CS Press, 2010, pages 
93–100. 
De Lucia A., Francese R., Scanniello G., Tortora G., 
Developing legacy system migration methods and 
tools for technology transfer. In Software Practice and 
Experience 38(13), Wiley, 2008, pages 1333–1364. 
Di Lucca G.A., Fasolino A.R., De Carlini U., Recovering 
class diagrams from data-intensive legacy systems. In 