
 
counting is considerably more time consuming than 
candidate generation in Apriori-based methods. 
The second experiment examined how well the 
algorithms scale with an increasing number of 
concurrently executed queries. To keep the queries 
equally similar, the overlap between each pair of 
consecutive queries in a batch was fixed at 75%. As 
can be seen in Figure 3, the candidate generation time 
of CCT grows linearly with the number of queries in 
a batch, while that of CCan remains largely 
insensitive to it. Total execution times increase 
similarly for both methods, with CCan performing 
slightly better, especially for larger batches. 
 
 
Figure 3: Generation and total execution times for 
different numbers of similar queries. 
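The batch construction used in this experiment can be illustrated with a minimal sketch. It assumes, purely for illustration, that each query selects a contiguous range of transaction identifiers of equal size and that "overlap" is the fraction of that range shared with the preceding query. The function names and the range-based query model are hypothetical and are not taken from the experimental setup itself.

```python
def overlapping_ranges(n_queries, size, overlap=0.75, start=0):
    """Return n_queries half-open ranges (lo, hi) of equal size.

    Assumption: a query is modelled as a contiguous range of transaction
    ids, and each range is shifted so that the given fraction of it is
    shared with the previous one.
    """
    shift = round(size * (1 - overlap))
    return [(start + i * shift, start + i * shift + size)
            for i in range(n_queries)]


def overlap_fraction(a, b):
    """Fraction of range a that is also covered by range b."""
    shared = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    return shared / (a[1] - a[0])


batch = overlapping_ranges(n_queries=4, size=1000)
for prev, cur in zip(batch, batch[1:]):
    print(prev, cur, overlap_fraction(prev, cur))
```

Under this model, only consecutive queries are guaranteed the fixed overlap; queries further apart in the batch share progressively less data.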
6 CONCLUSIONS 
In this paper we addressed the problem of efficient 
processing of batches of frequent itemset queries in 
the context of the Apriori algorithm. We proposed a 
new algorithm, called Common Candidates, which 
builds upon Common Candidate Tree and further 
integrates the computations performed for a batch of 
queries by means of an integrated candidate 
generation procedure. 
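The idea of generating candidates once for a whole batch can be sketched as follows. This is an illustrative sketch, not the procedure of the Common Candidates algorithm itself. It assumes that each frequent (k-1)-itemset is tagged with the set of queries for which it is frequent, and that a k-candidate belongs to exactly those queries for which all of its (k-1)-subsets are frequent. The function name and data layout are hypothetical.

```python
from itertools import combinations


def generate_common_candidates(frequent):
    """Apriori-style join and prune performed once for a batch of queries.

    frequent: dict mapping a sorted tuple of items (a (k-1)-itemset) to
              the set of query ids for which that itemset is frequent.
    Returns a dict mapping each k-candidate to the set of query ids for
    which it has to be counted.
    """
    candidates = {}
    itemsets = sorted(frequent)
    for a, b in combinations(itemsets, 2):
        # Join step: the two itemsets must share their first k-2 items.
        if a[:-1] != b[:-1]:
            continue
        cand = a + (b[-1],)
        # Prune step: keep only the queries for which every
        # (k-1)-subset of the candidate is frequent.
        queries = None
        for subset in combinations(cand, len(cand) - 1):
            owners = frequent.get(subset, set())
            queries = owners if queries is None else queries & owners
            if not queries:
                break
        if queries:
            candidates[cand] = set(queries)
    return candidates


frequent_2 = {
    ("a", "b"): {1, 2},
    ("a", "c"): {1, 2},
    ("b", "c"): {1},
    ("b", "d"): {2},
}
print(generate_common_candidates(frequent_2))
```

In this sketch each candidate is produced by a single join regardless of how many queries share it, which is one plausible reason why generation cost would depend only weakly on the batch size.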
The experiments showed that the new method 
significantly reduces the total time spent on 
candidate generation. The impact of the integrated 
candidate generation procedure on the overall 
execution time is smaller but still noticeable. 
In the future we plan to investigate how several 
optimizations applied to Apriori by its practical 
implementations affect our batch processing 
algorithms. 
REFERENCES 
Agrawal, R., Imielinski, T., Swami, A., 1993. Mining 
Association Rules Between Sets of Items in Large 
Databases, In Proc. of the 1993 ACM SIGMOD Conf. 
Agrawal, R., Mehta, M., Shafer, J., Srikant, R., Arning, 
A., Bollinger, T., 1996. The Quest Data Mining 
System, In Proc. of the 2nd KDD Conference. 
Agrawal, R., Srikant, R., 1994. Fast Algorithms for 
Mining Association Rules, In Proc. of the 20th VLDB 
Conference. 
Baralis, E., Psaila, G., 1999. Incremental Refinement of 
Mining Queries, In Proceedings of the 1st DaWaK 
Conference. 
Blockeel, H., Dehaspe, L., Demoen, B., Janssens, G., 
Ramon, J., Vandecasteele, H., 2002. Improving the 
Efficiency of Inductive Logic Programming Through 
the Use of Query Packs, Journal of Artificial 
Intelligence Research, Vol. 16. 
Grudzinski, P., Wojciechowski, M., 2007. Integration of 
Candidate Hash Trees in Concurrent Processing of 
Frequent Itemset Queries Using Apriori, In Proc. of 
the 3rd ADMKD Workshop. 
Imielinski, T., Mannila, H., 1996. A Database Perspective 
on Knowledge Discovery, Communications of the 
ACM, Vol. 39. 
Jin, R., Sinha, K., Agrawal, G., 2005. Simultaneous 
Optimization of Complex Mining Tasks with a 
Knowledgeable Cache, In Proc. of the 11th KDD 
Conference. 
Meo, R., 2003. Optimization of a Language for Data 
Mining, In Proc. of the ACM SAC Conference. 
Pei, J., Han, J., 2000. Can We Push More Constraints into 
Frequent Pattern Mining?, In Proc. of the 6th KDD 
Conference. 
Sellis, T., 1988. Multiple-query optimization, ACM 
Transactions on Database Systems, Vol. 13. 
Wojciechowski, M., Zakrzewicz, M., 2002. Methods for 
Batch Processing of Data Mining Queries, In Proc. of 
the 5th DB&IS Conference. 
KDIR 2010 - International Conference on Knowledge Discovery and Information Retrieval