Ensemble classifiers for drift detection and monitoring in dynamical environments

Published Oct 14, 2013
Imen Khamassi, Moamar Sayed-Mouchaweh, Moez Hammami, Khaled Ghédira

Abstract

Detecting and monitoring changes during the learning process are important areas of research in many industrial applications. The challenge is to diagnose and analyze these changes so that the accuracy of the learning model is preserved. Recently, ensemble classifiers have achieved good results when dealing with concept drift. This paper presents two ensemble learning algorithms, BagEDIST and BoostEDIST, which combine Online Bagging and Online Boosting, respectively, with the drift detection method EDIST. EDIST is a new drift detection method that monitors the distance between two consecutive classification errors. The idea behind this combination is to develop an ensemble learning algorithm that handles concept drift explicitly by describing the location, speed, and severity of drifts. Moreover, this paper presents a new drift diversity measure for studying the diversity of the base classifiers and how they cope with concept drift. Across various experiments, this new measure provided a clearer view of the ensemble’s behavior when dealing with concept drift.
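
The idea of monitoring the distance between consecutive classification errors can be illustrated with a small Python sketch. This is not the EDIST procedure from the paper: the class name `ErrorDistanceMonitor`, the parameters `recent_size`, `ratio`, and `min_history`, and the fixed-ratio decision rule are all assumptions chosen for illustration. The sketch only shows the general intuition that errors arriving closer together than usual may indicate a drift.

```python
from collections import deque
from statistics import mean


class ErrorDistanceMonitor:
    """Tracks how many examples separate consecutive misclassifications.

    Hypothetical helper for illustration; the decision rule below is an
    assumed heuristic, not the statistical test used by EDIST.
    """

    def __init__(self, recent_size=5, ratio=0.5, min_history=10):
        self.recent = deque(maxlen=recent_size)  # latest error distances
        self.history = []                        # older distances (reference)
        self.since_last_error = 0                # examples seen since last error
        self.ratio = ratio                       # assumed shrink factor for a signal
        self.min_history = min_history           # distances needed before testing

    def update(self, is_error):
        """Feed one prediction outcome; return True if a drift is signalled."""
        self.since_last_error += 1
        if not is_error:
            return False
        distance = self.since_last_error
        self.since_last_error = 0
        # The oldest recent distance moves to the reference history.
        if len(self.recent) == self.recent.maxlen:
            self.history.append(self.recent[0])
        self.recent.append(distance)
        if len(self.history) < self.min_history:
            return False
        # Signal when recent errors are much closer together than before.
        return mean(self.recent) < self.ratio * mean(self.history)


# Synthetic outcome stream: one error every 10 examples, then every 2.
stream = [i % 10 == 9 for i in range(300)] + [i % 2 == 1 for i in range(100)]

monitor = ErrorDistanceMonitor()
drift_points = [t for t, err in enumerate(stream) if monitor.update(err)]
print(drift_points[0] if drift_points else "no drift signalled")
```

In this toy stream the error rate changes at example 300, so any signal raised before that point would be a false alarm. A fixed ratio is used here only to keep the sketch short; a real detector would need a principled test on the distance distributions.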

 

How to Cite

Khamassi, I., Sayed-Mouchaweh, M., Hammami, M., & Ghédira, K. (2013). Ensemble classifiers for drift detection and monitoring in dynamical environments. Annual Conference of the PHM Society, 5(1). https://doi.org/10.36001/phmconf.2013.v5i1.2324

Keywords

Classification, drift detection and monitoring, non-stationary environments

Section
Technical Research Papers