Authors: Mikhail Melnik; Denis Nasonov and Nikolay Butakov

Affiliation: ITMO University, Saint-Petersburg, Russia

Keyword(s): Stream Data Processing, Adaptive Scheduling, Performance Modeling, Simulation, Cloud Computing.

            
Related Ontology Subjects/Areas/Topics: Artificial Intelligence; Computational Intelligence; Evolutionary Computing; Genetic Algorithms; Informatics in Control, Automation and Robotics; Intelligent Control Systems and Optimization; Soft Computing

Abstract: The growing demand for streaming data processing drives the development of distributed streaming platforms such as Apache Storm and Flink. However, both the volume of data and the complexity of its processing are growing extremely fast, which calls for new tools and methods to improve the efficiency of streaming data processing. One of the main ways to improve system performance is effective scheduling combined with proper configuration of the computing platform. Running large-scale streaming applications, especially in the cloud, incurs high computing-resource costs and requires additional effort to deploy and support the application itself. Thus, there is a need to estimate the performance and behaviour of the system before real computations are run. In this work we therefore propose a model of distributed data stream processing, a statement of the stream scheduling problem, and a simulator of the streaming platform that allows the behaviour of the system to be explored directly under various conditions. In addition, we propose a genetic algorithm for efficient stream scheduling and report experimental studies.