
The entities in charge of the administration, provision
and regulation of subsidized healthcare services
have an even greater responsibility to identify the
users of the health system without making any
mistakes. The information registered in the DB is
necessary for accessing the government funds that
cover the health plan of a subsidized patient. Thus,
the local government control and monitoring
entities periodically review the data
registered in the information systems and check for
inconsistencies reported with regard to subsidized
users.
Given the current situation, the information
system plays an important role in guaranteeing the
stability of the healthcare system. DBs are a key
element in administering the information of
users in the subsidized healthcare system. To
guarantee that the system has complete and
clean data (information free of errors), several DBs
of users must be integrated, such as the DBs that
contain the deceased, the new affiliations, the
withdrawals and the transfers, among others. This
integration must be done in a timely manner, because
reports are due periodically and the efficiency
of the system must be maintained. DM is considered
for this task because it involves techniques and
algorithms that support the correct and efficient
management of DBs, as well as the use of
information to gain knowledge about the population
involved.
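
As an illustration of the integration step described above, the
following minimal sketch reconciles a master list of users with the
lists of new affiliations, deceased users, withdrawals and transfers.
It is not the procedure used by the entities discussed here; the
Affiliate record, its fields and the integrate function are
hypothetical names chosen for the example, and the sample data are
invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Affiliate:
    # Hypothetical record layout; real affiliation DBs carry more fields.
    doc_id: str        # identification document number
    name: str
    municipality: str


def integrate(master, new_affiliations, deceased_ids, withdrawn_ids, transfers):
    """Return the updated registry as a dict keyed by doc_id.

    transfers maps a doc_id to the user's new municipality.
    """
    registry = {a.doc_id: a for a in master}

    # Add new affiliations without overwriting an existing entry.
    for a in new_affiliations:
        registry.setdefault(a.doc_id, a)

    # Remove users who are deceased or who have withdrawn.
    for doc_id in set(deceased_ids) | set(withdrawn_ids):
        registry.pop(doc_id, None)

    # Apply transfers only to users who are still active.
    for doc_id, municipality in transfers.items():
        if doc_id in registry:
            old = registry[doc_id]
            registry[doc_id] = Affiliate(old.doc_id, old.name, municipality)

    return registry


# Invented sample data, for illustration only.
master = [
    Affiliate("100", "Ana Perez", "Cali"),
    Affiliate("200", "Luis Gomez", "Cali"),
    Affiliate("300", "Maria Ruiz", "Palmira"),
]
updated = integrate(
    master,
    new_affiliations=[Affiliate("400", "Jose Diaz", "Cali")],
    deceased_ids={"200"},
    withdrawn_ids={"300"},
    transfers={"100": "Bogota"},
)
print(sorted(updated))
```

The sketch assumes that every list shares a reliable identifier. When
identifiers are missing or mistyped, a simple key lookup like this is
not enough and records have to be compared by their content.
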
This research addresses the problem described
above from the point of view of the
correct identification and management of
inconsistencies in the data registered in the
healthcare system. In particular, this research
identifies DM as an appropriate tool for the
timely detection of inconsistencies in many
information systems. While some believe that DM is
too heavyweight and complex a tool for the
detection of duplicate records in the DB of any
information system, we believe it is
necessary in order to obtain clean data and, at the same
time, to gain new knowledge from the data and a
deeper analysis of their behavior with respect to the
anomalies found, which can be
decisive for the overall quality of the system.
The paper is organized as follows. Section 2
reviews the state of the art in DM applied to the health sector,
supported by some applications of
duplicate detection in DBs. Section 3
presents the case study developed, a Unique
Identification System for Users (SIUU) for the
health sector, and proposes guidelines for the
solution. Section 4, Advantages and
Disadvantages of DM in a SIUU, discusses the
importance of DM for the detection of
duplicates, patients' needs and the complexity of the
solution. The last section presents concluding
remarks and considerations to take into account
when implementing the project.
2 DATA MINING APPLIED TO 
THE HEALTH SECTOR 
A DB is a set of data that belong to the same context
and are stored in a structured way for later use
(Date and Date, 1990). A DB gives institutions
access to information so that it can be
viewed, managed and updated according to the
access rights granted (Batra, Parashar, Sachdeva, and
Mehndiratta, 2013). In the case study
developed in this research, the DB identified as
FOSYGA (MinSalud, 2014) stores
the affiliation information of the Colombian
healthcare system. This DB
provides access to sensitive information on the users
registered in the system, who represent close to
91.69% of the entire Colombian population (DANE,
2013).
One of the most demanding activities in
the administration of information is keeping
the DB up to date. In the Colombian subsidized
healthcare system (RSS), local authorities must
guarantee that the updated data are free of errors,
since payments made for the healthcare
of a user who no longer belongs to the system are
absorbed by the entities that offer the service and do
not benefit any other user. Identifying
multiple registrations in this type of DB allows
government funds for healthcare services
to be used correctly.
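
To make the idea of identifying multiple registrations concrete, the
following minimal sketch flags pairs of records that probably refer to
the same person. It is a simple string-similarity comparison, not a DM
algorithm and not the method proposed in this paper. The record
layout (record id, name, birth date), the function names and the 0.9
threshold are assumptions made for the example, and the sample
records are invented.

```python
import unicodedata
from difflib import SequenceMatcher
from itertools import combinations


def normalize(name):
    """Lowercase a name, strip accents and collapse repeated whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_accents.lower().split())


def likely_duplicates(records, threshold=0.9):
    """Return pairs of record ids that probably describe the same person.

    records is a list of (record_id, name, birth_date) tuples. Two records
    are paired when the birth dates match and the normalized names have a
    similarity ratio of at least `threshold` (an assumed cut-off).
    """
    pairs = []
    for (id_a, name_a, dob_a), (id_b, name_b, dob_b) in combinations(records, 2):
        if dob_a != dob_b:
            continue
        score = SequenceMatcher(None, normalize(name_a), normalize(name_b)).ratio()
        if score >= threshold:
            pairs.append((id_a, id_b))
    return pairs


# Invented sample records, for illustration only.
records = [
    ("r1", "María Fernández", "1980-05-01"),
    ("r2", "Maria  Fernandez", "1980-05-01"),
    ("r3", "Carlos Ramírez", "1975-11-23"),
    ("r4", "María Fernández", "1992-02-14"),
]
print(likely_duplicates(records))
```

Comparing every pair of records grows quadratically with the size of
the DB, so a sketch like this only suits small samples. On a DB of
national scale, records would first have to be grouped (for example by
birth date or municipality) before any pairwise comparison.
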
The same issue has been identified and
addressed in other countries, such as New Zealand,
England and Spain, among others. These countries
have created a unique identification system for
patients and have established technological
and legal frameworks to support and
regulate the processes of affiliation and registration
of patients in the system (Oviedo and Fernández,
2010). Yet the problem persists with or
without a unique identification
system, because the DBs must be integrated
and the data must be clean before this
information can be used in decision making. DM has
been applied to this issue because it
gives the controlling entities the capacity to
automatically classify and correct errors in the data.
ICORES 2015 - International Conference on Operations Research and Enterprise Systems