0% found this document useful (0 votes)

2 views6 pages

Combining_active_learning_with_concept_drift_detection_for_data_stream_mining

The paper presents novel active learning strategies for data stream mining that effectively handle concept drift detection. It emphasizes the importance of adapting learning models to evolving data distributions while minimizing labeling costs, particularly in scenarios where true class labels are costly and delayed. The proposed strategies leverage drift detection to optimize the labeling process, ensuring efficient updates to classifiers in real-time data streams.

Uploaded by

MAINAK PATRA

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

2 views6 pages

Combining_active_learning_with_concept_drift_detection_for_data_stream_mining

Uploaded by

MAINAK PATRA

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 6

2018 IEEE International Conference on Big Data (Big Data)

Combining active learning with concept drift

detection for data stream mining

Bartosz Krawczyk Bernhard Pfahringer Michał Woźniak

Dept of Computer Science Dept of Computer Science Dept of Systems and Computer Networks
Virginia Commonwealth University Univ. of Waikato Wrocław Univ. of Science and Technology
Richmond VA, USA Hamilton, New Zealand Wrocław, Poland
Email: [email protected] Email: [email protected] Email: [email protected]

Abstract—Most of data stream classifier learning methods here properties of data may change over time. For example let
assume that a true class of an incoming object is available us look at malware detection problem. Such malicious software
right after the instance has been processed and new and labeled is far from being static, as it evolves over time to elude ever-
instance may be used to update a classifier’s model, drift detection improving security systems. Such a phenomenon is known as
or capturing novel concepts. However, assumption that we have concept drift [5]. Efficient data stream mining methods must
an unlimited and infinite access to class labels is very naive and
assume the presence of this problem and be able to tackle it
usually would require a very high labeling cost. Therefore the
applicability of many supervised techniques is limited in real-life efficiently by constantly adapting to non-stationary distribution
stream analytics scenarios. Active learning emerges as a potential of data [6]. The challenge lies in how to properly use the
solution to this problem, concentrating on selecting only the most incoming objects to keep our learning model updated and limit
valuable instances and learning an accurate predictive model with the costs imposed by constantly modifying our recognition
as few labeling queries as possible. However learning from data system [7].
streams differ from online learning as distribution of examples
may change over time. Therefore, an active learning strategy must When mining data streams unlabeled objects are abundant
be able to handle concept drift and quickly adapt to evolving as they arrive over time with given ratio specific to the analyzed
nature of data. In this paper we present novel active learning problem. However labels assigned to these instances may be
strategies that are designed for effective tackling of such changes. costly to obtain due to required human input (labor cost).
We assume that most labeling effort is required when concept In some applications we may obtain true class labels with
drift occurs, as we need a representative sample of new concept very small cost (e.g., weather prediction), but this is not true
to retrain properly the predictive model. Therefore, we propose for most of problems and is connected with the label delay
active learning strategies that are guided by drift detection issue. In most problems obtaining a label would require a
module to save budget for difficult and evolving instances. Three constant access to human expert or a kind of oracle. This is
proposed strategies are based on learner uncertainty, dynamic
allocation of budget over time and search space randomization.
subject to various constraints, e.g., financial (we need to pay
Experimental evaluation of the proposed methods prove their the expert), time (objects may appear faster than the expert
usefulness for reducing labeling effort in learning from drifting is able to handle them), logistical (expert may not be able to
data streams. work 24/7) or resources (some expert-based procedures, like
laboratory tests, cannot be repeated continuously). Sometimes,
Keywords—machine learning; data stream mining; concept access to the true label is delayed, even if we have access to
drift; active learning; drift detection an oracle, e.g., the true label for the problem of the credit
approval is available ca. 2 years after the decision, while
I. I NTRODUCTION some medical diagnosis could be confirmed after laboratory
test and may take a few weeks. However, most of the methods
Contemporary machine learning problems are much more described in the literature for learning from streams naively
complex than the ones we have faced 10 or 20 years ago. With assume that true labels for objects are available all the time
the advance of the big data era we need to address emerging upon request. This assumption is a highly unrealistic one and
problems deeply connected with the nature of analyzed in- limits the usefulness of many supervised techniques in real-
stances. We can identify the 5 V s of big data: volume, velocity, life tasks [8], [9]. Therefore, methods for selecting only the
variety, veracity and value. Let us take a look on the problem most valuable samples for labeling are of crucial importance
of velocity. This paradigm assume that data is in constant mo- to data stream mining community. Here active learning have
tion, arrives constantly and thus must be handled in real-time. been identified as a promising solution to this challenge [10],
This is further connected with the notion of volume as data [11]. This approach concentrates on how to select objects for
will arrive for potentially infinite amount of time, flooding both labeling instead of requesting for all objects to be labeled. This
the processing and storage systems [1], [2]. Such a problem problem is well-known and extensively discussed in static [12]
is known in the literature as data stream [3], [4]. This forces and online scenarios [13]. However there are only few works
us to develop new methods that are able to handle learning discussing the problem of active learning for data streams [14],
from such ever-growing collection of instances under certain [15], [16], [17], [18], especially in the presence of concept drift
constrains as time and memory limitations. However, learning [10]. The difference between active learning in online and data
from data streams differ from traditional online learning as stream scenario is the changes expectation. In online scenario

978-1-5386-5035-6/18/$31.00 ©2018 IEEE 2239

Authorized licensed use limited to: Nitte Meenakshi Institute of Technology. Downloaded on June 10,2025 at 16:35:53 UTC from IEEE Xplore. Restrictions apply.
we usually fix a predefined threshold associated with e.g., methods should meet the following requirements: process each
certainty of the used learner and allow it to guide the labeling instance only once, work under time and memory constraints
process. However, some regions of the decision space may and when the training procedure is being interrupted its quality
never be queried (due to an initial high certainty associated should not be lower than classifier trained using batch mode on
with them) and therefore change appearing in these areas the same data. Online classifiers offer high working speed and
will never be detected. Additionally, as data streams evolve are able to flexibly adapt themselves to evolving data (which
over time it seems far from rational to keep a fixed certainty is also known as implicit drift detection). Some of popular
threshold. Instead it should adapt to the nature of stream at classifiers like Naı̈ve Bayes or Neural Networks may work
a given moment. When concept drift takes place, examples in online mode, but there is a number of more sophisticated
from the new distributions are of highest importance to the approaches like Concept-Adapting Very Fast Decision Tree
learner, as they must be utilized to maintain its competence. algorithm [22], which ensures consistency with incoming data
At the same time, querying a stable data stream would not by maintaining alternative subtrees [23].
bring any new information for the learner. However, active
learning strategies proposed so far have no direct connection Methods based on sliding windows are based on the
with the drift detection procedure. instance forgetting mechanisms. They assume that recently
arrived objects represent the current state of the analyzed data
In this paper we propose new active learning strategies stream and hence should be more relevant to the recognition
guided directly by drift detectors. Information about shift system. Usually such a window has a fixed size and cuts off
detection is being used by the active learning strategy to older instances, or applies a data weighting scheme in which
increase querying ratio in order to accumulate a high number of important objects have assigned higher weights. When dealing
new and valuable samples for classifier update procedure. This with the sliding window the main question is how to adjust
allows us to balance the budget assigned to analyzed stream the window size. On the one hand, a shorter window allows
by limiting querying in static moments and saving it for when focusing on the emerging context, though data may not be
the change occurs. Three strategies are proposed, based on representative for a longer lasting context. On the other hand, a
learner uncertainty, dynamic allocation of budget over time wider window may result in mixing the instances representing
and search space randomization. We apply a semi-supervised different contexts. Therefore, recent proposals include dynamic
drift detection method that works only with previously labeled window size allocation or combining multiple windows [24].
samples. Therefore, when uncertain samples begin to appear a
drift detector is able to properly evaluate if we are dealing Finally, ensemble learning have gained a significant popu-
with outliers or a real drift. We put no restriction on the larity in the stream mining community along recent years [25].
classifier used, which makes our methods highly suitable for They maintain the advantages present in static scenarios, such
adapting any supervised learning methods to real-life data as exploiting local competencies of classifiers or robustness to
stream mining with limited budget. We compare our proposals overfitting. At the same time one may view the drifting context
to a fully labeled data stream and reference methods that do not as an additional way to ensure diversity among committee
use feedback from the drift detection module. Their accuracies members. Here dynamic combiners, online ensembles and en-
and reaction to changes are being examined over a number of sembles with dynamic line-up are the most popular approaches
artificial and real stream benchmarks. Obtained results prove [26].
the usefulness of the proposed methods and show that they are
able to better allocate the labeling budget to when we really
need to update our predictive model. III. P ROPOSED ACTIVE LEARNING STRATEGIES

II. M INING DATA STREAMS WITH CONCEPT DRIFT In this section we will describe the proposed active learning
strategies guided by drift detection for evolving data streams.
Four main categories of approaches for handling concept
drift can be distinguished. Let us present shortly all of them.
Methods with triggers, which base on so-called drift de- A. Preliminaries
tectors methods aim at identifying a moment when change
appears or is likely to appear and alarm the recognition system Let us assume that our stream consist of a potentially infi-
[19]. This is an external module that monitors the properties of nite set of examples DS = {(x1 , j1 ), (x2 , j2 ), ..., (xk , jk ), ...},
data stream in a supervised, semi-supervised or unsupervised where xk stands for feature vector (xk ∈ X ) describing the kth
manner. It is important to point out that using supervised drift object and jk its label jk ∈ M, which should be assigned by
detectors require a full access to true class labels or to the oracle and of course learning algorithm should pay for it. As
performance of using classifier, which in real-life scenarios we mentioned before we want to reduce the label querying cost
is almost impossible as discussed in the previous section. On then we introduce a budget B that shows how many instances
the other hand unsupervised drift detection methods cannot we can afford to label. We assume that 0 < B < 1. In cases
detect a real concept drift in cases where statistical properties of B = 0 and B = 1 we would have a fully unlabeled and
of data did not changed (e.g., classes have swapped places) fully labeled data stream respectively. A labeling strategy is
[20]. Therefore, semi-supervised drift detection seems as the an realization of active learning paradigms that allow us to
best option. evaluate if for a currently analyzed sample we are interested
in obtaining its true label. Output of such an strategy is realized
Online learners are classifiers that constantly update their as a Boolean variable, indicating a decision regarding the label
structure while processing the incoming instances [21]. Such query.

2240
Authorized licensed use limited to: Nitte Meenakshi Institute of Technology. Downloaded on June 10,2025 at 16:35:53 UTC from IEEE Xplore. Restrictions apply.
B. Proposed framework C. Random strategy++
This is a very simple active learning strategy that randomly
Let us now present a general framework for the proposed draws instance labels with probability equal to the assumed
active learning strategies. We propose to construct it on an budget B. We propose to improve label query if the change is
online learning scenario from data streams with concept drift. being detected by increasing the labeling probability according
For detecting changes in data we use a drift detector module. to the output of drift detector (alarm or change detected). The
It is realized as ADWIN2 drift detector [27], due to its low details of this strategy are given in Algorithm 2.
computational complexity and proved efficiency. It uses only
labeled samples coming from an active learning, thus not Algorithm 2: Labeling strategy RAND++(x,r,B)
imposing any additional costs on the proposed system. When input: new object x, labeling rate adjustment
the accuracy of the classifier begins to decrease we start to train R ∈ [0, 1], budget B
a new classifier in the background using arriving objects. In Result: labeling ∈ [TRUE, FALSE]
case of change being detected the new classifier replaces the generate a uniform random variable λ ∈ [0, 1]
old one. For each incoming object we check if the labeling if drift warning then
strategy conditions are being fulfilled (they are triggered ran- λ←λ−R
domly or by a loss of classifiers’ confidence). However, we are else
most interested in obtaining labels for new objects appearing if drift detected then
after concept drift. Quickly gathering a representative sample λ ← λ − 2R
would allow for early preparation of new classifier and efficient
replacement of the outdated learner. Therefore, we should labeling ← I(λ ≤ B)
maintain our budget and dynamically allocate it over time
when needed. We propose to create a feedback between the
labeling strategies and drift detection module. In case of alarm
being raised or change being detected we increase the labeling D. Variable uncertainty strategy++
rate in order to probe the emerging concept. Let’s R stands
for labeling ratio, which should depend on answer of the drift This strategy is based on monitoring the certainty of clas-
detector. The labeling ratios should be ordered in the following sifier Ψ decision expressed as its support functions FΨ (x, j)
way: R(static) < R(alarm) < R(change). This allows us for object x belonging to j-th class. It aims to label the
to control budget and save it for obtaining new knowledge for least certain instances within a time interval. A time-variable
the recognition system. The details of the proposed framework threshold imposed on classifier’s certainty is being used. It
are given in Algorithm 1. adjusts itself depending on the incoming data to balance
the budget use over time. For static parts of the stream the
classifier’s certainty stabilizes and threshold is being increased
to allow for labeling of only the most uncertain objects. When
Algorithm 1: Proposed general framework for active the drift detector returns information about detected alarm or
learning from drifting data streams. change we start to rapidly decrease the threshold in order
to allow for gathering a higher number of labeled objects to
input: budget B, labeling rate R, labeling strategy S(x, quickly adapt a new model to the current state of the stream.
R), classifier Ψ, drift detector D The details of this strategy are given in Algorithm 3.
labeling cost b ← 0
while end of stream = FALSE do
obtain new object x from the stream Algorithm 3: Labeling strategy VAR-UN++(x,s,θ,r,Ψ)
if b < B then input: new object x, threshold θ, threshold adjustment
if S(x, R) = TRUE then s ∈ [0, 1], labeling rate adjustment R ∈ [0, 1],
obtain label y of object x R > s, trained classifier Ψ
b←b+1 Result: labeling ∈ [TRUE, FALSE]
update classifier Ψ with (x, y) initialize θ and store its latest value
update drift detector D with (x, y) if maxm∈M FΨ (x, m) < θ then
if drift warning = TRUE then decrease the uncertainty threshold as follows:
start a new classifier Ψnew if drift warning then
increase labeling rate R θ ←θ−R
else else
if drift detected = TRUE then if drift detected then
replace Ψ with Ψnew θ ← θ − 2R
further increase labeling rate R else
else θ ←θ−s
return to initial labeling rate R
labeling ← TRUE
if Ψnew exists then else
update classifier Ψnew with (x, y) increase the uncertainty threshold θ ← θ + s
labeling ← FALSE

2241
Authorized licensed use limited to: Nitte Meenakshi Institute of Technology. Downloaded on June 10,2025 at 16:35:53 UTC from IEEE Xplore. Restrictions apply.
E. Randomized variable uncertainty strategy++ We compare the proposed strategies (RAND++, VAR-
UN++ and R-VAR-UN++) with their basic versions that do
This is a modification of the previous strategy that modifies not use information from the drift detector [10]. We use the
the threshold by a random factor. This allows for labeling some following parameters for these strategies: threshold adjustment
of the examples to which classifier displays high certainty in s = 0.01, labeling rate adjustment r = 0.03 and thresh-
order not to miss any possible drift that may appear in any part old random variance δ = 1. We analyze the budget size
of the decision space. However, this happens at the expense B ∈ [0.05, 0.10, · · · , 0.60] that is being calculated over a time
of sacrificing some of uncertain instances. Thus this strategy window of 2500 instances. Hoeffding tree is selected as a base
is expected to perform worse than its predecessor for static classifier.
streams, but adapt faster to occurring changes. The details of
this strategy are given in Algorithm 4. For evaluating classifiers we use the prequential accu-
racy metric. Wilcoxon signed-rank test is adopted as a non-
parametric statistical procedure to perform pairwise compar-
Algorithm 4: Labeling strategy R-VAR-
isons between the classifier trained on fully labeled stream
UN++(x,s,θ,δ,R,Ψ)
and using active learning strategies with varying budgets.
input: new object x, threshold θ, threshold adjustment
s ∈ [0, 1], labeling rate adjustment R ∈ [0, 1], r
> s, threshold random variance δ, trained B. Results and discussion
classifier Ψ Figure 1 presents a detailed prequential accuracies for six
Result: labeling ∈ [TRUE, FALSE] examined strategies with varying budget sizes over six stream
initialize θ and store its latest value benchmarks, while Table II depicts a comparison of single
η ← random multiplier ∈ N (1, δ) best accuracies for each of three proposed active learning
θrand ← θ × η strategies and a classifier trained on a fully labeled dataset.
if maxj∈M FΨ (x, j) < θrand then Please note that our aim is to get as close as possible to
decrease the uncertainty threshold as follows: accuracies displayed by a classifier with a full access to class
if drift warning then labels, at the same time using as lowest budget as possible.
θ ←θ−R
else
if drift detected then TABLE II: Comparison of averaged prequential accuracies for
θ ← θ − 2R Hoeffding tree trained on a fully labeled data stream (FULL)
else and the best one obtained from active learning strategies.
θ ←θ−s
labeling ← TRUE Dataset FULL RAND++ VAR-UN++ R-VAR-UN++
else Airlines 69.38 67.25 65.69 66.02
increase the uncertainty threshold θ ← θ + s Electricity 81.17 78.95 79.59 80.20
Forest Cover 80.34 71.67 74.28 74.39
labeling ← FALSE RBF 93.47 92.07 92.26 92.98
Hyperplanes 83.16 81.95 82.03 82.21
Tree 69.98 68.05 69.11 69.71

IV. E XPERIMENTAL STUDY From these results we can observe that only for the
Electricity datasets the proposed active learning strategies were
In this section we present the experimental evaluation of similar to the reference ones. For remaining stream bench-
the proposed active learning methods for drifting data streams. marks we can observe a significant gain in accuracy when
the feedback from drift detector is being utilized by the label
A. Set-up query. Additionally, the proposed improved strategies perform
very well even with limited budgets, offering a balanced
For non-stationary data streams there is still just a few effectiveness regardless of the budget setting. This is especially
publicly available data sets to work with. Most of them are vivid for Airlines, Forest Cover and Hyperplanes datasets. This
artificially generated ones, with only some real-life examples. allows us to conclude that for limited budget the introduced
Following the standard approaches found in literature we labeling queries are concentrated mainly on moments when
decided to use both artificial and real-life data sets, details drift takes place, thus better sampling the changed distribution
of which can be found in Table I. and allowing for rapid construction of a more competent
classifier for the current concept.
TABLE I: Details of data stream benchmarks used in the Results of Wilcoxon test over multiple datasets are pre-
experiments. sented in Table III. We can see that the proposed strategies
obtain very similar results to a classifier trained on a fully
Data set Objects Features Classes Drift type labeled stream. Using as little as 15% of data were are able to
Airlines 539 383 7 2 unknown induce a classifier that is not statistically significantly differ
Electricity 45 312 7 2 unknown
Forest Cover 581 012 53 7 unknown
from one that has access to all of labels. This is a very
RBF 1 000 000 20 4 gradual important observation, which proves that by careful labeling
Hyperplane 1 000 000 10 2 incremental of only the most difficult and evolving instances we are able
Tree 1 000 000 10 6 sudden recurring
to obtain comparable accuracy at greatly decreased cost.

2242
Authorized licensed use limited to: Nitte Meenakshi Institute of Technology. Downloaded on June 10,2025 at 16:35:53 UTC from IEEE Xplore. Restrictions apply.
68

74
80
67

72
78
accuracy[%]

accuracy[%]
66
accuracy[%]

70
65

76
64

RAND RAND
RAND

68
RAND++ RAND++
RAND++
VAR−UN VAR−UN
63

VAR−UN

74
VAR−UN++ VAR−UN++ VAR−UN++
R−VAR−UN R−VAR−UN R−VAR−UN

66
R−VAR−UN++ R−VAR−UN++ R−VAR−UN++
62

0.1 0.2 0.3 0.4 0.5 0.6 0.1 0.2 0.3 0.4 0.5 0.6 0.1 0.2 0.3 0.4 0.5 0.6
budget used budget used budget used

(a) Airlines (b) Electricity (c) Forest Cover

70
82

69
92

68
accuracy[%]

accuracy[%]

accuracy[%]
91

67
90

66
RAND RAND RAND
RAND++ RAND++ RAND++
89

65
VAR−UN VAR−UN VAR−UN
VAR−UN++ VAR−UN++ VAR−UN++
R−VAR−UN R−VAR−UN R−VAR−UN
R−VAR−UN++ R−VAR−UN++ R−VAR−UN++
88

0.1 0.2 0.3 0.4 0.5 0.6 0.1 0.2 0.3 0.4 0.5 0.6 64 0.1 0.2 0.3 0.4 0.5 0.6
budget used budget used budget used

(d) RBF (e) Hyperplane (f) Tree

Fig. 1: Accuracies on examined datasets for Hoeffding tree and a given labeling budget using different active learning strategies.

TABLE III: Wilcoxon tests for comparing a Hoeffding tree trained using a fully labeled stream (FULL) and a Hoeffding tree
trained with selected labeling strategy and fixed budget. Symbol ”<” stands for situation when classifier trained on a fully labeled
data stream is statistically significantly better and symbol ”=” for situation when there are no statistically significant differences
between the proposed active learning approach and fully labeled stream.
Budget
Comparison 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50 0.55 0.60
RAND++ vs. FULL <0.3643 <0.2017 <0.0740 =0.0483 =0.4724 =0.4503 =0.4025 = 0.3977 = 0.3643 =0.3428 = 0.3215 =0.3194
VAR-UN++ vs. FULL <0.2916 <0.1866 =0.0492 =0.0446 =0.0418 =0.0378 =0.0382 =0.0357 =0.0321 =0.0277 =0.0273 =0.0270
R-VAR-UN++ vs. FULL <0.2848 <0.1609 =0.0487 =0.0431 =0.0409 =0.0351 =0.0369 =0.0334 =0.0313 =0.0258 =0.0249 =0.0246

Finally let us analyze how well the proposed and reference V. C ONCLUSIONS AND FUTURE WORKS
active learning strategies managed the concept drift occurrence.
Figure 2 depicts the percentage of drift examples in the labeled In this paper we have proposed three improved active learn-
set. From these figures we can see that the proposed strategies ing strategies for mining drifting data streams. The novelty of
are able to better identify instances that appear during the drift our proposal lied in a direct feedback from the drift detection
and use them to adapt classification system. Our improved mechanism that controlled the labeling ratio. This way we were
strategies label 2-3 times more examples during drift has a able to dynamically allocate our budget and obtain labels for
direct influence on the obtained accuracies. Additionally, we objects coming from evolved distribution. This had a direct
may see that R-VAR-UN++ strategy is able to detect the link to the accuracy of the classification procedure as we were
highest number of drifting instances, thus proving our claim able to quicker capture the changes in streams. We showed
from Section III-E that usage of threshold randomization will that our proposed strategies allowed for a highly accurate
be beneficial to detection of changes occurring in any point of stream classification by increased label querying in the drifting
the decision space. moments, even when the available budget was small. Using
statistical test we have showed that proposed active learning

2243
Authorized licensed use limited to: Nitte Meenakshi Institute of Technology. Downloaded on June 10,2025 at 16:35:53 UTC from IEEE Xplore. Restrictions apply.
40 40 40
[8] B. Cyganek and S. Gruszczynski, “Hybrid computer vision system for
ratio of drift examples [%]

ratio of drift examples [%]

31.45
30
26.53
28.17 30 30 27.48 drivers’ eye recognition and fatigue monitoring,” Neurocomputing, vol.
24.43

20 20 20 18.73
126, pp. 78–94, 2014.
16.02
13.56 14.31

10
11.23 10.28 11.04
10
6.09 6.49 7.01 10 7.26
12.59
[9] Z. S. Abdallah, M. M. Gaber, B. Srinivasan, and S. Krishnaswamy,
5.31

0 0 0
“Adaptive mobile activity recognition system with evolving data
streams,” Neurocomputing, vol. 150, pp. 304–317, 2015.
RAND

RAND++

RAND

RAND++

RAND

RAND++
VAR−UN

VAR−UN++

R−VAR−UN

R−VAR−UN++

VAR−UN

VAR−UN++

R−VAR−UN

R−VAR−UN++

VAR−UN

VAR−UN++

R−VAR−UN

R−VAR−UN++
[10] I. Zliobaite, A. Bifet, B. Pfahringer, and G. Holmes, “Active learning
with drifting streaming data,” IEEE Trans. Neural Netw. Learning Syst.,
(a) Airlines (b) Electricity (c) Forest Cover vol. 25, no. 1, pp. 27–39, 2014.
[11] S. Mohamad, A. Bouchachia, and M. Sayed Mouchaweh, “A bi-criteria
40
34.09
40
34.74
40
active learning algorithm for dynamic data streams,” IEEE Trans.
ratio of drift examples [%]

ratio of drift examples [%]

31.96 32.98 32.19
30
26.17
30
30.83
30
24.82
28.17 Neural Netw. Learning Syst., vol. 29, no. 1, pp. 74–86, 2018.
23.84 24.57
21.9 22.85
21.07
20 18.75 20
19.63 19.86 19.97
20 [12] C. C. Aggarwal, X. Kong, Q. Gu, J. Han, and P. S. Yu, “Active learning:
10 10 10
A survey,” in Data Classification: Algorithms and Applications, 2014,
0 0 0
pp. 571–606.
RAND

RAND++

RAND

RAND++

RAND

RAND++
VAR−UN

VAR−UN++

R−VAR−UN

R−VAR−UN++

VAR−UN

VAR−UN++

R−VAR−UN

R−VAR−UN++

VAR−UN

VAR−UN++

R−VAR−UN

R−VAR−UN++
[13] L. Ma, S. Destercke, and Y. Wang, “Online active learning of decision
trees with evidential data,” Pattern Recognition, vol. 52, pp. 33–45,
2016.
(d) RBF (e) Hyperplane (f) Tree [14] M. Bouguelia, Y. Belaı̈d, and A. Belaı̈d, “An adaptive streaming active
learning strategy based on instance weighting,” Pattern Recognition
Letters, vol. 70, pp. 38–44, 2016.
Fig. 2: Percentage of drift examples in the labeled set by
examined active learning strategies. Results averaged over all [15] B. Kurlej and M. Woźniak, “Active learning approach to concept drift
problem,” Logic Journal of the IGPL, vol. 20, no. 3, pp. 550–559, 2012.
budgets examined.
[16] H. Nguyen, W. K. Ng, and Y. Woon, “Concurrent semi-supervised
learning with active learning of data streams,” Trans. Large-Scale Data-
and Knowledge-Centered Systems, vol. 8, pp. 113–136, 2013.
[17] M. Woźniak, B. Cyganek, A. Kasprzak, P. Ksieniewicz, and
strategies are not significantly worse than using a fully labeled K. Walkowiak, “Active learning classifier for streaming data,” in Hybrid
data stream. This contribution was a step forward to using Artificial Intelligent Systems - 11th International Conference, HAIS
supervised learning methods in realistic stream settings. In our 2016, Seville, Spain, April 18-20, 2016, Proceedings, 2016, pp. 186–
197.
future works we plan to apply active learning strategies with
dynamic budget for multi-class novelty detection in streaming [18] S. Mohamad, M. Sayed Mouchaweh, and A. Bouchachia, “Active
learning for classifying data streams with unknown number of classes,”
data. Neural Networks, vol. 98, pp. 1–15, 2018.
[19] P. M. G. Jr., S. G. T. de Carvalho Santos, R. S. M. de Barros, and D. C.
D. L. Vieira, “A comparative study on concept drift detectors,” Expert
ACKNOWLEDGMENT Syst. Appl., vol. 41, no. 18, pp. 8144–8156, 2014.
This was supported by the Polish National Science Center [20] P. Sobolewski and M. Woniak, “Concept drift detection and model se-
lection with simulated recurrence and ensembles of statistical detectors,”
under the grant no. DEC-2013/09/B/ST6/02264. Journal of Universal Computer Science, vol. 19, no. 4, pp. 462–483,
feb 2013.
All experiments were carried out using computer equip-
[21] G. Melki, V. Kecman, S. Ventura, and A. Cano, “OLLAWV: online
ment sponsored by EC under FP7, Coordination and Support learning algorithm using worst-violators,” Appl. Soft Comput., vol. 66,
Action, Grant Agreement Number 316097, ENGINE - Euro- pp. 384–393, 2018.
pean Research Centre of Network Intelligence for Innovation [22] G. Hulten, L. Spencer, and P. M. Domingos, “Mining time-changing
Enhancement (http://engine.pwr.wroc.pl/). data streams,” in Proceedings of the seventh ACM SIGKDD inter-
national conference on Knowledge discovery and data mining, San
Francisco, CA, USA, August 26-29, 2001, 2001, pp. 97–106.
R EFERENCES [23] P. Domingos and G. Hulten, “A general framework for mining massive
data streams.” Journal of Computational and Graphical Statistics,
[1] A. Cano, “A survey on graphic processing unit computing for large-
vol. 12, pp. 945–949, 2003.
scale data mining,” Wiley Interdiscip. Rev. Data Min. Knowl. Discov.,
vol. 8, no. 1, 2018. [24] U. Yun and G. Lee, “Sliding window based weighted erasable stream
pattern mining for stream data applications,” Future Generation Comp.
[2] H. T. Nguyen, M. T. Thai, and T. N. Dinh, “A billion-scale approxima- Syst., vol. 59, pp. 1–20, 2016.
tion algorithm for maximizing benefit in viral marketing,” IEEE/ACM
Trans. Netw., vol. 25, no. 4, pp. 2419–2429, 2017. [25] B. Krawczyk, L. L. Minku, J. Gama, J. Stefanowski, and M. Woźniak,
“Ensemble learning for data stream analysis: A survey,” Information
[3] M. M. Gaber, “Advances in data stream mining,” Wiley Interdisc. Rew.: Fusion, vol. 37, pp. 132–156, 2017.
Data Mining and Knowledge Discovery, vol. 2, no. 1, pp. 79–85, 2012.
[26] P. R. L. Almeida, L. S. Oliveira, A. S. B. Jr., and R. Sabourin, “Adapting
[4] S. Ramı́rez-Gallego, B. Krawczyk, S. Garcı́a, M. Wozniak, and F. Her- dynamic classifier selection for concept drift,” Expert Syst. Appl., vol.
rera, “A survey on data preprocessing for data stream mining: Current 104, pp. 67–85, 2018.
status and future directions,” Neurocomputing, vol. 239, pp. 39–57,
2017. [27] A. Bifet and R. Gavaldà, “Learning from time-changing data with
adaptive windowing,” in Proceedings of the Seventh SIAM Interna-
[5] J. Gama, I. Zliobaite, A. Bifet, M. Pechenizkiy, and A. Bouchachia, “A tional Conference on Data Mining, April 26-28, 2007, Minneapolis,
survey on concept drift adaptation,” ACM Comput. Surv., vol. 46, no. 4, Minnesota, USA, 2007, pp. 443–448.
pp. 44:1–44:37, 2014.
[6] M. Woźniak, “A hybrid decision tree training method using data
streams,” Knowl. Inf. Syst., vol. 29, no. 2, pp. 335–347, 2011.
[7] I. Zliobaite, M. Budka, and F. T. Stahl, “Towards cost-sensitive adap-
tation: When is it worth updating your predictive model?” Neurocom-
puting, vol. 150, pp. 240–249, 2015.

2244
Authorized licensed use limited to: Nitte Meenakshi Institute of Technology. Downloaded on June 10,2025 at 16:35:53 UTC from IEEE Xplore. Restrictions apply.

Buy ebook Learning from Data Streams in Evolving Environments Moamar Sayed-Mouchaweh cheap price
100% (2)
Buy ebook Learning from Data Streams in Evolving Environments Moamar Sayed-Mouchaweh cheap price
65 pages
An Active Learning Algorithm Based On Parzen Window Classification
No ratings yet
An Active Learning Algorithm Based On Parzen Window Classification
14 pages
IJETR033424
No ratings yet
IJETR033424
8 pages
DDD: A New Ensemble Approach For Dealing With Concept Drift: Leandro L. Minku, Member, IEEE, and Xin Yao, Fellow, IEEE
No ratings yet
DDD: A New Ensemble Approach For Dealing With Concept Drift: Leandro L. Minku, Member, IEEE, and Xin Yao, Fellow, IEEE
15 pages
A Review On Concept Drift
No ratings yet
A Review On Concept Drift
7 pages
Smart Pools of Data With Ensembles For Adaptive Learning in Dynamic Data Streams With Class Imbalance
No ratings yet
Smart Pools of Data With Ensembles For Adaptive Learning in Dynamic Data Streams With Class Imbalance
9 pages
Ida 2009 Slides
No ratings yet
Ida 2009 Slides
10 pages
Cano-Krawczyk2020 Article KappaUpdatedEnsembleForDriftin
No ratings yet
Cano-Krawczyk2020 Article KappaUpdatedEnsembleForDriftin
44 pages
Gama ACMCS AdaptationCD Accepted
No ratings yet
Gama ACMCS AdaptationCD Accepted
44 pages
Online Classification of Nonstationary Data Streams
No ratings yet
Online Classification of Nonstationary Data Streams
36 pages
a-survey-on-online-active-learning-t41pz1uj
No ratings yet
a-survey-on-online-active-learning-t41pz1uj
64 pages
IJRET - in Data Streams Using Classification and Clustering Different Techniques To Find Novel Class
No ratings yet
IJRET - in Data Streams Using Classification and Clustering Different Techniques To Find Novel Class
3 pages
Mathematics 11 00820
No ratings yet
Mathematics 11 00820
38 pages
Adaptive Parameter-Free Learning From Evolving Data Streams
No ratings yet
Adaptive Parameter-Free Learning From Evolving Data Streams
12 pages
Review Paper On Concept Drifting Data Stream Mining
No ratings yet
Review Paper On Concept Drifting Data Stream Mining
4 pages
A Survey On Learning From Imbalanced Data Streams: Taxonomy, Challenges, Empirical Study, and Reproducible Experimental Framework
No ratings yet
A Survey On Learning From Imbalanced Data Streams: Taxonomy, Challenges, Empirical Study, and Reproducible Experimental Framework
63 pages
Learning From Time-Changing Data With Adaptive Windowing
No ratings yet
Learning From Time-Changing Data With Adaptive Windowing
17 pages
Mining Noisy Data Streams via a Discriminative Model 1st Edition by Fang Chu, Yizhou Wang, Carlo Zaniolo 9783540233572 pdf download
No ratings yet
Mining Noisy Data Streams via a Discriminative Model 1st Edition by Fang Chu, Yizhou Wang, Carlo Zaniolo 9783540233572 pdf download
54 pages
Bonus Tema Grupiranje Tijekovnih Podataka
No ratings yet
Bonus Tema Grupiranje Tijekovnih Podataka
36 pages
Applied Sciences: Fast Reaction To Sudden Concept Drift in The Absence of Class Labels
No ratings yet
Applied Sciences: Fast Reaction To Sudden Concept Drift in The Absence of Class Labels
16 pages
Fading Histograms in Detecting Distribution and Concept Changes
No ratings yet
Fading Histograms in Detecting Distribution and Concept Changes
30 pages
Active Learning For Data Streams A Survey
No ratings yet
Active Learning For Data Streams A Survey
48 pages
Scalable and Efficient Multi-Label Classification For Evolving Data Streams
No ratings yet
Scalable and Efficient Multi-Label Classification For Evolving Data Streams
30 pages
Learning in Nonstationary Environments: A Survey: Gregory Ditzler
No ratings yet
Learning in Nonstationary Environments: A Survey: Gregory Ditzler
14 pages
GFJHFN
No ratings yet
GFJHFN
21 pages
surv
No ratings yet
surv
45 pages
6 Tracking Recurring Contexts Using Ensemble Classifiers
No ratings yet
6 Tracking Recurring Contexts Using Ensemble Classifiers
21 pages
1 s2.0 S1877050914010850 Main
No ratings yet
1 s2.0 S1877050914010850 Main
10 pages
The Problem of Concept Drift - Definitions and Related Work
No ratings yet
The Problem of Concept Drift - Definitions and Related Work
7 pages
Streaming Active Learning With Deep Neural Networks: Ash & Adams 2020
No ratings yet
Streaming Active Learning With Deep Neural Networks: Ash & Adams 2020
17 pages
1 s2.0 S1568494624008251 Main
No ratings yet
1 s2.0 S1568494624008251 Main
12 pages
Bayesian Nonparametric Unsupervised Concept Drift Detection for Data Stream Mining
No ratings yet
Bayesian Nonparametric Unsupervised Concept Drift Detection for Data Stream Mining
22 pages
A_Novel_Drift_Detection_Algorithm_Based
No ratings yet
A_Novel_Drift_Detection_Algorithm_Based
12 pages
Adversarial concept drift detection
No ratings yet
Adversarial concept drift detection
36 pages
5 Tracking - Recurrent - Concept - Drift - in - Streaming - Data - Using - Ensemble - Classifiers
No ratings yet
5 Tracking - Recurrent - Concept - Drift - in - Streaming - Data - Using - Ensemble - Classifiers
6 pages
KRAWXZYKINFFUS2017
No ratings yet
KRAWXZYKINFFUS2017
86 pages
Concept Drift
No ratings yet
Concept Drift
13 pages
ecmlpkdd2019
No ratings yet
ecmlpkdd2019
17 pages
TR1648
No ratings yet
TR1648
47 pages
StreamDFP_A_General_Stream_Mining_Framework_for_Adaptive_Disk_Failure_Prediction(1)
No ratings yet
StreamDFP_A_General_Stream_Mining_Framework_for_Adaptive_Disk_Failure_Prediction(1)
15 pages
Active Sample Learning and Feature Selection: A Unified Approach
No ratings yet
Active Sample Learning and Feature Selection: A Unified Approach
11 pages
KNN Classifier With Self Adjusting Memory For Heterogeneous Concept Drift
No ratings yet
KNN Classifier With Self Adjusting Memory For Heterogeneous Concept Drift
10 pages
hospedalesEtAl_pakdd2011
No ratings yet
hospedalesEtAl_pakdd2011
12 pages
A survey on machine learning for recurring concept drifting data streams
No ratings yet
A survey on machine learning for recurring concept drifting data streams
17 pages
1212.6018v1
No ratings yet
1212.6018v1
20 pages
li2017
No ratings yet
li2017
12 pages
CDSC Al
No ratings yet
CDSC Al
7 pages
Drift Survey Paper JETIR2411319
No ratings yet
Drift Survey Paper JETIR2411319
9 pages
Human Annotator For Imbalanced Dossier
No ratings yet
Human Annotator For Imbalanced Dossier
11 pages
Applying Temporal Dependence To de
No ratings yet
Applying Temporal Dependence To de
19 pages
Learning_under_Concept_Drift_A_Review
No ratings yet
Learning_under_Concept_Drift_A_Review
18 pages
1-s2.0-S0950705121000411-main
No ratings yet
1-s2.0-S0950705121000411-main
15 pages
An_Impact_Study_of_Concept_Drift_in_Federated_Learning
No ratings yet
An_Impact_Study_of_Concept_Drift_in_Federated_Learning
6 pages
Active Learning
No ratings yet
Active Learning
102 pages
2303.16906v1
No ratings yet
2303.16906v1
11 pages
A Pdf-Free Change Detection Test Based On Density Difference Estimation
No ratings yet
A Pdf-Free Change Detection Test Based On Density Difference Estimation
11 pages
A PDF Free Change Detection Test Based on Density Difference Estimation
No ratings yet
A PDF Free Change Detection Test Based on Density Difference Estimation
11 pages
Introduction To Group Theory
100% (1)
Introduction To Group Theory
77 pages
2024 JC2 H2 Math Prelim - ASRJC
No ratings yet
2024 JC2 H2 Math Prelim - ASRJC
49 pages
A New Decision Tree Learning Approach For Novel Class Detection in Concept Drifting Data Stream Classification
No ratings yet
A New Decision Tree Learning Approach For Novel Class Detection in Concept Drifting Data Stream Classification
8 pages
Vat-Ly-1 - Lecture - Gphysics-1.1 (ch1-6) .PPTX - (Cuuduongthancong - Com)
No ratings yet
Vat-Ly-1 - Lecture - Gphysics-1.1 (ch1-6) .PPTX - (Cuuduongthancong - Com)
116 pages
6632-Bootcamp in Credit Risk
No ratings yet
6632-Bootcamp in Credit Risk
167 pages
Clinical Application With VT K
No ratings yet
Clinical Application With VT K
82 pages
Fuzzy Logic Dissertation
100% (2)
Fuzzy Logic Dissertation
10 pages
Introduction To Machine Learning and Data Mining: Arturo J. Patungan, Jr. University of Sto. Tomas Strandasia
No ratings yet
Introduction To Machine Learning and Data Mining: Arturo J. Patungan, Jr. University of Sto. Tomas Strandasia
103 pages
(Chapter 2) Linear Function
No ratings yet
(Chapter 2) Linear Function
17 pages
A Statistical Analysis of The Holy Quran
100% (1)
A Statistical Analysis of The Holy Quran
14 pages
5.7 Rational Exponents
No ratings yet
5.7 Rational Exponents
29 pages
New Magicks For A New Age: Bibliography Book 1
100% (1)
New Magicks For A New Age: Bibliography Book 1
18 pages
Gali-Ch3
No ratings yet
Gali-Ch3
34 pages
Economic Analysis of Milk Production and Consumpti
No ratings yet
Economic Analysis of Milk Production and Consumpti
11 pages
Laboratory Exercise 1: Discrete-Time Signals: Time-Domain Representation
No ratings yet
Laboratory Exercise 1: Discrete-Time Signals: Time-Domain Representation
16 pages
Ds Notes 2011
No ratings yet
Ds Notes 2011
44 pages
Table: Advanced Modeling Options: Coba 1.fdb SAFE v12.3.2 - License # 09 Juni 2016
No ratings yet
Table: Advanced Modeling Options: Coba 1.fdb SAFE v12.3.2 - License # 09 Juni 2016
15 pages
Article Review 10 Eng
No ratings yet
Article Review 10 Eng
28 pages
Analysis of Recursive Algorithms
No ratings yet
Analysis of Recursive Algorithms
18 pages
Chp2-Binary Numbers and Codes (15.1.09)
No ratings yet
Chp2-Binary Numbers and Codes (15.1.09)
16 pages
01_01_exploring_life
No ratings yet
01_01_exploring_life
4 pages
Holiday Homework Class X - 2024-25
No ratings yet
Holiday Homework Class X - 2024-25
2 pages
Unit 5B TestReview 2022
No ratings yet
Unit 5B TestReview 2022
4 pages
Factoring Trinomials Test PDF
No ratings yet
Factoring Trinomials Test PDF
2 pages
Mean, Median, Mode, Range: By: P.K Sir
No ratings yet
Mean, Median, Mode, Range: By: P.K Sir
9 pages
Final CT8 Summary Notes PDF
100% (2)
Final CT8 Summary Notes PDF
110 pages
Project Report_ Combinatorial Game Theory
No ratings yet
Project Report_ Combinatorial Game Theory
3 pages
Assignment 3
No ratings yet
Assignment 3
4 pages
Kami Export - 33
No ratings yet
Kami Export - 33
2 pages
2022 MID-TERM EXAM - Fundamentals of Mathematics
No ratings yet
2022 MID-TERM EXAM - Fundamentals of Mathematics
2 pages
Variation of Parameters I I
No ratings yet
Variation of Parameters I I
3 pages
Cloud-Based Multi-Modal Information Analytics
From Everand
Cloud-Based Multi-Modal Information Analytics
Tanushri Kaniyar
No ratings yet
Uncertainty Theories and Multisensor Data Fusion
From Everand
Uncertainty Theories and Multisensor Data Fusion
Alain Appriou
No ratings yet

Combining_active_learning_with_concept_drift_detection_for_data_stream_mining

Uploaded by

Combining_active_learning_with_concept_drift_detection_for_data_stream_mining

Uploaded by

2018 IEEE International Conference on Big Data (Big Data)

Combining active learning with concept drift

Bartosz Krawczyk Bernhard Pfahringer Michał Woźniak

978-1-5386-5035-6/18/$31.00 ©2018 IEEE 2239

(a) Airlines (b) Electricity (c) Forest Cover

(d) RBF (e) Hyperplane (f) Tree

ratio of drift examples [%]

ratio of drift examples [%]

ratio of drift examples [%]

ratio of drift examples [%]

You might also like