Resampling-Based Ensemble Methods for Online Class Imbalance Learning

Wang, Shuo and Minku, Leandro L. and Yao, Xin (2014) Resampling-Based Ensemble Methods for Online Class Imbalance Learning. IEEE Transactions on Knowledge and Data Engineering, 27 (5). ISSN 1041-4347

Resampling-Based Ensemble Methods for Online Class Imbalance Learning.pdf - Accepted Version
Available under License Creative Commons Attribution.

Abstract

Online class imbalance learning is a new learning problem that combines the challenges of both online learning and class imbalance learning. It deals with data streams having very skewed class distributions. This type of problem commonly exists in real-world applications, such as fault diagnosis of real-time control monitoring systems and intrusion detection in computer networks. In our earlier work, we defined class imbalance online, and proposed two learning algorithms, OOB and UOB, that build an ensemble model overcoming class imbalance in real time through resampling and time-decayed metrics. In this paper, we further improve the resampling strategy inside OOB and UOB, and look into their performance in both static and dynamic data streams. We give the first comprehensive analysis of class imbalance in data streams, in terms of data distributions, imbalance rates and changes in class imbalance status. We find that UOB is better at recognizing minority-class examples in static data streams, and OOB is more robust against dynamic changes in class imbalance status. The data distribution is a major factor affecting their performance. Based on the insight gained, we then propose two new ensemble methods that maintain both OOB and UOB with adaptive weights for final predictions, called WEOB1 and WEOB2. They are shown to possess the strength of OOB and UOB with good accuracy and robustness.
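
The abstract mentions resampling driven by time-decayed metrics but does not give the formulas. The following is a minimal illustrative sketch of that general idea, not the authors' implementation. It assumes a two-class stream, an exponentially decayed estimate of each class's share, and a resampling rate that would typically serve as the Poisson parameter in online bagging. The function names, the `decay` value, and the `mode` strings are all hypothetical.

```python
def update_class_sizes(sizes, label, decay=0.9):
    """Return time-decayed class shares after observing one example.

    `sizes` maps each class label to its current estimated share of the
    stream; older examples are discounted by `decay` (an assumed value).
    """
    return {
        k: decay * w + (1.0 - decay) * (1.0 if k == label else 0.0)
        for k, w in sizes.items()
    }


def resampling_rate(sizes, label, mode):
    """Return the expected number of times to train on the current example.

    Assumes exactly two classes. In "oversample" mode, examples of the
    currently smaller class get a rate above 1; in "undersample" mode,
    examples of the currently larger class get a rate below 1.
    Everything else is trained on at rate 1.
    """
    other = next(k for k in sizes if k != label)
    w_own, w_other = sizes[label], sizes[other]
    if mode == "oversample" and 0 < w_own < w_other:
        return w_other / w_own
    if mode == "undersample" and w_own > w_other:
        return w_other / w_own
    return 1.0


# Start from equal shares, then observe one majority-class example.
sizes = update_class_sizes({"pos": 0.5, "neg": 0.5}, "neg")
pos_rate = resampling_rate(sizes, "pos", "oversample")   # above 1
neg_rate = resampling_rate(sizes, "neg", "undersample")  # below 1
```

In an online bagging setting, each base learner would draw its number of updates for the example from a Poisson distribution with this rate, so that the ensemble adapts as the estimated class shares drift.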

Item Type: Article
Additional Information: © 2014 IEEE.  Personal use of this material is permitted.  Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Uncontrolled Keywords: Class imbalance, resampling, online learning, ensemble learning, Bagging
Subjects: G400 Computer Science
Divisions: Faculty of Computing, Engineering and the Built Environment > School of Computing and Digital Technology > Cyber Security
Depositing User: Shuo Wang
Date Deposited: 12 Jul 2019 06:31
Last Modified: 12 Jul 2019 08:37
URI: http://www.open-access.bcu.ac.uk/id/eprint/7722
