A New Feature Subset Selection Using Bottom-Up Clustering
https://doi.org/10.1007/s10044-016-0565-8
THEORETICAL ADVANCES
Received: 6 December 2015 / Accepted: 10 June 2016 / Published online: 18 June 2016
© Springer-Verlag London 2016
method tries to divide data into clusters according to data similarity, such that the between-cluster similarity is minimized and the within-cluster similarity is maximized [10]. K-means [11] and hierarchical algorithms [12] are popular clustering methods. Hierarchical clustering is performed in a top-down (divisive) or bottom-up (agglomerative) manner [12]. In hierarchical clustering, a tree structure called a dendrogram is constructed [13]. The root of the tree is a cluster containing all data, and the leaves are clusters of single data items. By cutting the tree at a given level, the clusters are obtained [14]. Many papers have tried to solve the feature selection problem by clustering [15–18]. In these methods, similar features are grouped into clusters according to their similarity or distance, and then representative features are selected from each cluster.

In [16], an unsupervised feature selection based on clustering and the k-nearest neighbor (kNN) [19] algorithm was proposed. This method reaches a feature subspace by selecting the features with minimum distance and then removing their k neighbors until all features are either selected or removed. Removing the neighbors, however, does not seem reasonable; gathering the neighbors into a group and then analyzing the interactions among them could achieve better results. As a supervised method, a feature selection based on feature clustering was proposed in [17]. This algorithm uses an agglomerative clustering approach based on Ward's linkage method [20] and a new combination of conditional mutual information as the distance measure between features. In this method, the agglomerative clustering algorithm is run until k + 1 clusters are obtained. The cluster with the lowest mutual information is eliminated in order to remove the irrelevant features. The representative feature of each cluster is the feature which has maximum mutual information with the target. Though this algorithm tries to reduce redundancy, it does not determine the number of required clusters. Instead, it repeats the process with different values of k and uses the average accuracy to find the optimal number of clusters.

In [18], a feature selection algorithm was presented that works in two steps. In the first step, irrelevant and redundant features are eliminated and the remaining features are divided into some clusters. For this purpose, a graph-theoretic clustering method is used to create the minimum spanning tree (MST) of a weighted complete graph and then partition the MST into a forest, each of whose trees represents a cluster [18]. In the second step, the representative feature of each cluster is chosen as the feature having maximum mutual information with the target. As a semi-supervised approach, a feature selection based on divisive clustering and the k-means algorithm was also proposed [15]. This algorithm initializes clusters by likelihood estimation and then iteratively finds the optimal clusters until their objective function does not change. The k-means algorithm iteratively bisects the feature space and tries to form clusters which maximize the information gain.

In this paper, an extension of agglomerative hierarchical clustering in feature space is presented. In this clustering-based feature subset selection (CFSS) method, the dissimilarity measure of two clusters is the distance between two representative members of those clusters, instead of their nearest members (e.g., single linkage) or farthest members (e.g., complete linkage) [21]. The representative feature of each cluster is the feature with the maximum mutual information against the other features in that cluster. CFSS works as a filter method, but among the similar features within a cluster. That is, the selected feature of each cluster represents that cluster when measuring its distance to other clusters. As an advantage of hierarchical clustering, it does not need the number of clusters to be determined. In CFSS, the clustering process is repeated until all features are distributed into clusters with at least two features. However, to spread the features over a reasonable number of clusters, we have used the method of GACH (a grid-based algorithm for hierarchical clustering of high-dimensional data) [22] to obtain a suitable level for cutting the clustering tree.

The rest of this paper is organized as follows. In Sect. 2, our proposed CFSS method is explained. In Sect. 3, the experimental results are presented. Section 4 concludes the paper.

2 Proposed method

In feature selection, finding the nature of the features and selecting a few features among similar ones is an important issue. In this regard, clustering methods can solve this problem, at the cost of more computation than filter approaches. The reason for the greater simplicity and lower complexity of filter methods is that they evaluate features individually and do not capture the structure of the features [23]. By combining clustering and filter methods, we can gain their advantages while discarding their deficiencies. Our proposed CFSS uses a clustering method to discover the features' structure and applies a filter method to rank the features in each cluster. In this manner, the individual feature evaluation of the filter method is bypassed in the clustering phase and is needed only for selecting the representative features. Selecting the best feature among the similar ones in a cluster leads to redundancy reduction. To attain this goal, an effective approach for selecting the representative feature of each cluster is also proposed.

An evaluation criterion for feature similarity is mutual information (MI). This is an efficient strategy to measure the relevancy between two features or the dependency between a feature and its target.
The MI between features fi and fj is defined, in terms of entropy, as [24]:

I(f_i; f_j) = \sum_{f_i} \sum_{f_j} p(f_i, f_j) \log \frac{p(f_i, f_j)}{p(f_i) p(f_j)}    (1)

where p(f) is the probability distribution of feature f and p(fi, fj) is the joint probability distribution of features fi and fj. According to (1), a higher value of MI between two features fi and fj means that the similarity (correlation) between them is greater (i.e., the distance between them is smaller), as illustrated in Fig. 1, where H(f) denotes the entropy of feature f.

Fig. 1 Mutual information between two features fi and fj
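For illustration, the MI of Eq. (1) can be estimated from two feature vectors by simple histogram binning, as in the Python sketch below (the binning scheme, the function name and the use of NumPy are our own assumptions, not part of the original method):

import numpy as np

def mutual_information(fi, fj, bins=10):
    """Estimate I(fi; fj) of Eq. (1) by discretizing both features into a 2-D histogram."""
    joint, _, _ = np.histogram2d(fi, fj, bins=bins)   # joint counts of (fi, fj)
    p_ij = joint / joint.sum()                        # joint distribution p(fi, fj)
    p_i = p_ij.sum(axis=1, keepdims=True)             # marginal p(fi)
    p_j = p_ij.sum(axis=0, keepdims=True)             # marginal p(fj)
    nz = p_ij > 0                                     # skip empty cells (0 log 0 = 0)
    return float((p_ij[nz] * np.log(p_ij[nz] / (p_i * p_j)[nz])).sum())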
CFSS tries to discover the structure of the data by hierarchically collecting similar features into the same groups. At first, each feature is assumed to be an individual cluster. These clusters (features) are placed at the leaves (lowest level) of the hierarchical clustering tree and are denoted L = (C1, ..., CN), as shown in Fig. 2a.

Using (1), the two nearest clusters Ci and Cj (those with the highest MI) in level 1 (indeed, features fi and fj) are selected and grouped into a new cluster Cl in level 2 (see Fig. 2b). Similarly, cluster Ck is constructed. When introducing the representative of these clusters, only one of their features is selected, because using highly similar features is not reasonable and might increase the redundancy.

In the next step of the algorithm, the two clusters being merged might have one or two features each. In this case, the similarity between two clusters, or between one cluster and one feature, should be measured. There are several methods for measuring the distance between clusters, such as single linkage [25] and complete linkage [26]. These methods are suitable for measuring the distance between clusters of data instances, where the difference between two data instances is computed along their features.

Consider a dataset with M data instances {x1, ..., xM} and N features {f1, ..., fN}, whose labels {t1, ..., tM} come from T classes. In order to find the distance between instances xk and xl, it is required to compute xk − xl = (xk1 − xl1, ..., xkN − xlN). For example, if fi is weight and fj is height, the difference xki − xli is meaningful, since both xki and xli represent weights. Similarly, xkj − xlj has meaning, as both xkj and xlj are heights. However, in feature clustering the notion of distance does not seem reasonable: to measure the distance between features fi and fj, we would have to compute fi − fj = (x1i − x1j, ..., xMi − xMj), while the difference between a height and a weight is meaningless. Thus, in feature clustering, similarity is more suitable than distance, and using similarity-based criteria instead of single linkage and complete linkage is more reasonable.

The similarity method presented here measures the similarity between the representative features of clusters. The representative feature of each cluster is the feature that has maximum dependency on the target (class label). Since this feature carries the maximum information, it is the best candidate to represent the similar features in its cluster. Using this criterion, the similarity between the representative feature of cluster Cl and the other features (see Fig. 2b) is measured, and the maximum similarity determines the next feature to merge. In the new level 3 (Fig. 2c), cluster Cl is merged with another feature and a new cluster Cp is created. Continuing this process, all features are merged into clusters until no feature remains in a leaf (i.e., all clusters of level 1 have been merged). At this point, CFSS terminates and the constructed clusters with their members are reported.

One of the advantages of hierarchical clustering is that it does not need to know the number of clusters. Similarly, in this work the number of clusters is not set initially. Instead, the process terminates when all features are distributed into clusters with at least two features. After that, the representative features of these clusters should be found.

The number of features selected from each cluster differs among methods. In some methods, for redundancy reduction, only one feature is selected from each cluster.
Because of the high similarity between the features in a cluster, this single feature might be a sufficiently good representative. On the other hand, some authors believe that selecting only top-ranked features is not enough, since a combination of good and bad features might be more appropriate. In other methods, no feature is selected from some clusters at all. For example, in [17] the cluster with the lowest mutual information is eliminated in order to remove the irrelevant features.

For selecting the features in CFSS, the clusters are ranked according to their importance (i.e., a ranking order). In this regard, possibly more than one feature is selected from some clusters and fewer, or even no, features from others. In order to determine how many features are sufficient to represent a given dataset, the approach proposed in GACH [22] is used here. In GACH, as a hierarchical clustering method, the clusters are merged in a bottom-up manner until a terminating condition occurs. This condition helps GACH determine the optimal number of clusters. Though GACH clusters the data instances (not the features), its stopping scheme is customized here to end the merging process of the feature clusters. Accordingly, all features are first considered as individual clusters. Then, in a merging loop, the two nearest clusters Ci and Cj are selected and merged into cluster Cl according to this criterion:

P(C_i, C_j) = \frac{\sum_{f \in C_l} d(f, c_l) - \left\{ \sum_{f \in C_i} d(f, c_i) + \sum_{f \in C_j} d(f, c_j) \right\}}{\sum_{k=1}^{N} d(f_k, c) - \sum_{k=1}^{J} \sum_{f \in C_k} d(f, c_k)}    (2)

where c is the representative (centroid) of all N features, and ci and cj are the representatives of the feature clusters Ci and Cj, respectively. Also, cl is the centroid of the new cluster Cl, and J is the number of feature clusters. The function d(·,·) is any dissimilarity (distance) metric between a pair of features.
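To make the criterion concrete, the following sketch (an illustration with our own function and variable names; each cluster is assumed to be stored as a matrix whose rows are feature vectors) computes P(Ci, Cj) of Eq. (2) for a candidate merge:

import numpy as np

def merge_criterion(clusters, i, j, d):
    """P(Ci, Cj) of Eq. (2): extra scatter caused by merging clusters i and j, normalized by
    the total scatter of all N features minus the within-cluster scatter of the J clusters."""
    all_feats = np.vstack(clusters)                   # all N feature vectors
    c_global = all_feats.mean(axis=0)                 # centroid c of all N features
    merged = np.vstack([clusters[i], clusters[j]])    # candidate cluster Cl
    c_l = merged.mean(axis=0)                         # centroid cl of Cl

    def scatter(feats, centroid):
        return sum(d(f, centroid) for f in feats)

    numerator = scatter(merged, c_l) - (scatter(clusters[i], clusters[i].mean(axis=0))
                                        + scatter(clusters[j], clusters[j].mean(axis=0)))
    denominator = scatter(all_feats, c_global) - sum(scatter(C, C.mean(axis=0)) for C in clusters)
    return numerator / denominator

# e.g., with Euclidean dissimilarity:
# p = merge_criterion(clusters, 0, 1, lambda a, b: np.linalg.norm(a - b))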
In addition to the Euclidean distance, the distance proposed in [27] is extended here (called DistFR) in order to measure the distance between two features. In [27], each feature is evaluated individually using this measure:

D(f) = \frac{(\mu - \mu')^2}{\sigma^2 + \sigma'^2}    (3)

where \mu and \sigma are the mean and standard deviation of feature f in the target class (\mu' and \sigma' belong to the non-target class). This measure prefers feature fi to feature fj if D(fi) > D(fj). To measure the distance between two features fi and fj, the criterion in (3) is rewritten as:

d(f_i, f_j) = \max_{t=1,\ldots,T} \frac{(\mu_{it} - \mu'_{it})^2}{\sigma_{it}^2 + \sigma_{it}'^2} - \max_{t=1,\ldots,T} \frac{(\mu_{jt} - \mu'_{jt})^2}{\sigma_{jt}^2 + \sigma_{jt}'^2}    (4)

where \mu_{it} and \sigma_{it} are the mean and standard deviation of feature fi in the t-th class (\mu'_{it} and \sigma'_{it} belong to all classes except t). Criterion (4) is used for the similarity comparison in (2) between two features.
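A possible reading of the DistFR distance of Eq. (4) in code is sketched below (fi, fj and labels are assumed to be NumPy arrays over the M instances; the small constant guarding against a zero denominator is our own addition):

import numpy as np

def dist_fr(fi, fj, labels):
    """DistFR of Eq. (4): difference of the best per-class separation scores of fi and fj."""
    def best_separation(f):
        scores = []
        for t in np.unique(labels):
            in_t, out_t = f[labels == t], f[labels != t]      # class t versus the rest
            scores.append((in_t.mean() - out_t.mean()) ** 2
                          / (in_t.var() + out_t.var() + 1e-12))
        return max(scores)                                    # max over t = 1, ..., T, as in Eq. (3)
    return best_separation(fi) - best_separation(fj)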
Fig. 3 P(·,·) values during the merging of the Sensor dataset's feature clusters. a Euclidean distance, b DistFR distance

Using (2) to merge the clusters of features and employing the GACH approach to end the merging process, the number of clusters is predicted. The value of P lies in (0, 1) and grows in regular, ascending steps. At some stages of merging, however, P(·,·) shows jumps, which are good candidates for stopping the merging process and thus for determining the optimal number of features [22]. Figure 3 shows these stopping positions in the P(·,·) values for the Sensor dataset (with 24 features), when the Euclidean and DistFR distances are used.

According to Fig. 3, the P(·,·) values increase steadily, but at the 21st step there is a sudden jump. This point is a good position to stop the merging process. Consequently, 3 (= 24 − 21) features are optimal for the Sensor dataset.
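The stopping step itself can be located as the largest jump in the sequence of P values, for instance as in the simplified sketch below (this argmax rule is our own simplification of GACH's stopping scheme, shown only to mirror the 3 = 24 − 21 arithmetic above):

import numpy as np

def optimal_num_features(p_values, n_features):
    """Return N minus the merge step at which P rises most sharply."""
    jumps = np.diff(np.asarray(p_values, dtype=float))    # rise of P at each merge step
    jump_step = int(np.argmax(jumps)) + 1                 # 1-based step with the biggest rise
    return n_features - jump_step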
On the other hand, the features of the Sensor dataset are grouped into 7 clusters by the GACH approach. This means that we need only 3 out of the 7 clusters in order to select one feature from each as a representative. For this purpose, the clusters should be ranked in some manner. As a heuristic criterion, the number of features in each cluster (its crowd) is considered here. Since some of the best features of each cluster should be selected as its representatives, another feature-ranking criterion is also required; in this regard, the mutual information is used.

According to these explanations, the algorithm of CFSS is given below.

Algorithm: CFSS (clustering-based feature subset selection)
Inputs: M data instances {x1, ..., xM} with features {f1, ..., fN} and labels {t1, ..., tM}
Output: the best features in F′
1. Compute the similarity between each pair of features fi and fj using (1): sij = I(fi, fj), which forms S = [sij]N×N
2. Compute the similarity (relevancy) of each feature fi to the class labels using (1): ri = I(fi, t), which forms R = [ri]N×1
3. Consider each feature fi as an individual cluster Ci, that is Ci = {fi}
4. Gather all clusters Ci in L, which gives L = {C1, ..., CN}
5. Let each feature fi be the representative feature of cluster Ci
6. Gather all representative features fi in F, which gives F = {f1, ..., fN}
7. Repeat
   7.1. Find the two most similar clusters Ci and Cj in L (according to S, via their representative features fi and fj in F)
   7.2. Construct a new cluster Cl by merging Ci and Cj, that is Cl = Ci ∪ Cj
   7.3. Remove clusters Ci and Cj from L, that is L = L − {Ci, Cj}
   7.4. Include cluster Cl in L, that is L = L ∪ {Cl}
   7.5. Find fl as the most relevant feature in cluster Cl (according to R)
   7.6. Introduce fl as the representative feature of cluster Cl
   7.7. Remove features fi and fj from F, that is F = F − {fi, fj}
   7.8. Include feature fl in F, that is F = F ∪ {fl}
8. Until there is no single-feature cluster, that is ∄ Ci ∈ L : |Ci| = 1
9. Determine the optimum number of features, n, using the GACH approach
10. Rank the clusters in L in descending order of their cardinalities
11. If n < |L|
       F′ = {representative features of the n top-ranked clusters in L}
    Else
       F′ = {representative features of all clusters in L} ∪ {n − |L| features from the ranked clusters in L}
    End if
12. Return F′ as the set of best features
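The following Python sketch is one possible reading of this pseudocode (an illustration, not the authors' reference implementation; the MI estimator is assumed to be the one sketched after Eq. (1), and the number n of features is assumed to be supplied, e.g., by the GACH-based estimate):

import numpy as np

def cfss(X, y, n, mutual_information):
    """Clustering-based feature subset selection (sketch of the CFSS pseudocode).
    X: (M, N) data matrix; y: (M,) array of class labels; n: number of features to return."""
    M, N = X.shape
    # Steps 1-2: pairwise feature similarities S and feature-to-label relevancies R
    S = np.array([[mutual_information(X[:, i], X[:, j]) for j in range(N)] for i in range(N)])
    R = np.array([mutual_information(X[:, i], y.astype(float)) for i in range(N)])
    # Steps 3-6: every feature starts as its own cluster and as its own representative
    clusters = [[i] for i in range(N)]
    reps = list(range(N))
    # Steps 7-8: merge until no single-feature cluster remains
    while any(len(c) == 1 for c in clusters) and len(clusters) > 1:
        best, (a, b) = -np.inf, (0, 1)
        for p in range(len(clusters)):                # 7.1: most similar pair of clusters,
            for q in range(p + 1, len(clusters)):     #      judged via their representatives
                if S[reps[p], reps[q]] > best:
                    best, (a, b) = S[reps[p], reps[q]], (p, q)
        merged = clusters[a] + clusters[b]            # 7.2: Cl = Ci U Cj
        rep = max(merged, key=lambda f: R[f])         # 7.5-7.6: most label-relevant member
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
        reps = [r for k, r in enumerate(reps) if k not in (a, b)] + [rep]
    # Steps 10-11: rank clusters by cardinality and take n features
    order = np.argsort([-len(c) for c in clusters])
    selected = [reps[k] for k in order[:n]]
    for k in order:                                   # top up when n exceeds the number of clusters
        if len(selected) >= n:
            break
        extras = sorted((f for f in clusters[k] if f not in selected), key=lambda f: -R[f])
        selected.extend(extras[:n - len(selected)])
    return selected[:n]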
3 Experimental results

In this work, eleven datasets from the UCI ML repository [28] are used to compare CFSS against some common and recent methods. Table 1 summarizes their statistics.

Table 1 Datasets used in the experiments
Dataset               No. of features   No. of instances   No. of classes
Wine                  13                178                3
Vehicle               18                946                4
Sensor readings       24                5456               4
WDBC                  30                569                2
Prognostic D cancer   30                569                2
Prognostic B cancer   33                198                2
Ionosphere            34                351                2
Spect heart           44                267                2
Lung cancer           56                32                 3
Spam base             57                4601               2
Sonar                 60                208                2

Firstly, the capability of CFSS in clustering features is compared against GACH, the most similar method. For this purpose, the hierarchical view of the clusters obtained by CFSS for the Wine dataset is depicted in Fig. 4. As shown, the clustering phase of CFSS terminates when four clusters with 4, 3, 2 and 4 features are established.

Fig. 4 Hierarchical clustering of Wine's features by CFSS

Figure 5 shows the dendrogram of features obtained by GACH. In this case, three clusters with 5, 5 and 3 features are obtained. Comparing the two big clusters of CFSS and GACH shows that about 3/4 of the features are in common.

Since the dendrogram of features for higher-dimensional data is very large, the clusters of three datasets (Vehicle, Prognostic B cancer and Ionosphere) are gathered in Table 2. In this table, the clusters of features obtained by CFSS and GACH are included, where the similarity of
Fig. 5 Hierarchical clustering of Wine's features by GACH

Table 2 Clusters of features obtained by CFSS and GACH for three datasets
Vehicle
  CFSS: (1, 3, 4, 7, 8, 9, 11, 12, 16), (2, 10, 13, 15), (5, 6, 14, 17, 18)
  GACH: (1, 3, 4, 7, 9, 11, 12, 14), (2, 10, 13, 16), (5, 6, 15, 17, 18)
Prognostic B cancer
  CFSS: (1, 2, 4, 5), (3, 13, 23), (6, 11, 26), (7, 8, 9, 29), (10, 30, 32, 33), (12, 14, 15), (16, 17, 20, 21), (18, 19), (22, 24, 25), (27, 28, 31)
  GACH: (1, 2, 4, 5, 22, 24, 25), (3, 13, 23), (6, 11, 26), (7, 8, 9, 29), (10, 30, 32, 33), (12, 14, 15), (16, 17, 18, 19, 20, 21), (27, 28, 31)
Ionosphere
  CFSS: (1, 2, 3, 5, 7, 9), (4, 6, 8, 10, 12, 14, 16), (11, 15, 17), (13, 25, 27, 29), (18, 20), (19, 21, 23), (22, 24), (26, 28, 30), (31, 33), (32, 34)
  GACH: (1, 2, 3, 5, 7), (4, 6, 16), (8, 10, 12, 14, 34), (9, 11, 13, 27), (15, 21, 17, 19, 23), (18, 24, 25, 26, 30, 32), (20, 22, 28, 29), (31, 33)

Table 3 Effect of mutual information versus Euclidean distance on CFSS
Dataset   Classification accuracy (Mutual information)   Classification accuracy (Euclidean distance)

Table 4 Effect of our heuristic criterion versus CC1 on CFSS
Dataset               No. of features (by GACH)   Heuristic   CC1
Wine                  4                            95.35       95.35
Vehicle               6                            68.49       66.15
Sensor readings       3                            93.67       85.42
WDBC                  5                            93.93       95.36
Prognostic D cancer   6                            93.86       95.98
Prognostic B cancer   4                            71.63       74.47
Ionosphere            5                            89.23       87.43
Spect heart           3                            75.27       78.92
Lung cancer           5                            62.33       50.33
Spam base             9                            87.77       81.00
Sonar                 5                            83.35       76.15
Average                                            83.17       80.6
select those instances in a cluster which can decrease the distances within that cluster. It uses the CC1 criterion, which is customized here for use by CFSS:

CC1 = \frac{\sum_{f \in C_j} d(f, c_j)}{|C_j|}    (5)

where cj is the centroid of the feature cluster Cj, as before. According to (5), the cluster with the smallest CC1 is the best.
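In code, Eq. (5) amounts to the short sketch below (the cluster is assumed to be stored as a matrix whose rows are the feature vectors of Cj, and Euclidean dissimilarity is used by default as our own assumption):

import numpy as np

def cc1(cluster, d=lambda a, b: np.linalg.norm(a - b)):
    """CC1 of Eq. (5): mean dissimilarity of the features in Cj to the cluster centroid cj."""
    centroid = cluster.mean(axis=0)
    return sum(d(f, centroid) for f in cluster) / len(cluster)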
Table 4 shows the results of our heuristic versus the CC1 criterion, where the higher accuracy for each dataset is shown in boldface. In this table, the optimum number of clusters reported by GACH is also included. These results show that choosing the final features heuristically from appropriate clusters is more efficient than the CC1 criterion.

To assess this visually, Fig. 6 compares the distribution of the Ionosphere dataset with respect to its two best features when the CC1 criterion and our criterion are used. The data instances are clearly less correlated when they are viewed through the two best features returned by CFSS using the heuristic method.

Fig. 6 Distribution of the Ionosphere dataset with respect to its two best features. a CFSS using the heuristic criterion, b CFSS using the CC1 criterion

In this part, the positive cooperation between GACH, in estimating the number of features, and CFSS, in selecting good features, is confirmed via experiments. In this regard, the number of features estimated by GACH is used by CFSS for feature selection. The optimal number of features is used as well (to find the optimum, CFSS is run several times with different numbers of features and the best one is obtained by trial and error). The performance of these features for each dataset is shown in Table 5.
Obviously, the cooperative performance of GACH and CFSS is reasonable, as the classification accuracies are near optimal. This is achieved by using, on average, only one-third of the features.

In order to examine the robustness of CFSS to outliers, some remote instances of each dataset are treated as outliers and removed temporarily. For this purpose, the centroid of the instances in each class is computed and some percentage of the instances farthest from the centroid are set aside. Table 6 reports the performance of the features selected by CFSS when the outlier instances are filtered out. In addition to outliers, some random noisy data with uniform distribution are also added to each dataset and then their best features are extracted. The results of CFSS on these noisy datasets are also included in Table 6. For each dataset, the best performance is highlighted in bold. Clearly, CFSS is robust to outliers and noise, as the classification accuracies confirm.

In this part, our CFSS is compared against two common methods, mRMR [8] and ReliefF [31], and a newer method, L1-LSMI [32]. mRMR is an information-theoretic method which tries to select features with high dependency on the class labels and low relevancy to the other features. ReliefF is a filter method which selects data instances at random and then updates the weights of the relevant (nearest) features. L1-LSMI is a least-squares feature selection by L1-penalized squared-loss mutual information. The comparison results are given in Table 7, where the performance of CFSS is the best on some datasets and also on average (as shown in bold).

CFSS is assessed in terms of time complexity as well. For this purpose, the CPU time needed to run CFSS, ReliefF, mRMR and L1-LSMI for feature selection is computed and reported in Table 8, where the best method is shown in bold.
Table 8 CPU time (in seconds) of CFSS against three other methods
Dataset               ReliefF   mRMR    L1-LSMI   CFSS
Wine                  0.53      1.04    19.14     0.73
SpectF heart          0.59      0.89    5.57      1.75
Vehicle               1.00      2.82    23.00     0.80
Sensor readings       11.11     11.25   50        3.96
Prognostic D cancer   0.85      13.7    53.36     1.44
Prognostic B cancer   0.46      5.03    23.68     1.25
Ionosphere            0.59      14.37   34.18     1.42
Spam base             18.09     20.00   53.00     11.48
Sonar                 0.52      12.80   11.99     3.72
WDBC                  0.84      58.62   29.67     1.46
Lung cancer           0.30      0.70    5.44      0.90
Average               3.17      12.84   28.09     2.63

Based on the averages, our CFSS is the fastest, though for most datasets ReliefF has the least computational complexity. This might be because of the filter nature of ReliefF.

4 Conclusion

In this work, we presented a new feature subset selection method based on hierarchical clustering. In each level of agglomeration, it uses a similarity measure among features, instead of their distance, to merge the two most similar feature clusters. Gathering similar features into clusters and then using a filter method among the similar features leads to redundancy reduction. Our method does not need the number of clusters to be determined in advance. Instead of choosing features from all clusters, only the more important clusters are used. To estimate an appropriate number of features for each dataset, the method of GACH is used.

By applying the CFSS algorithm to extract the best features of some UCI datasets and then using them with a kNN classifier, we assessed our proposed method in comparison with some feature selection methods. Via experimental results, we showed that CFSS is reasonably efficient, since it tries to merge the similar clusters and then selects some good representatives from each cluster of features. Moreover, it is noticeably fast, since it works in a filter manner to choose the representative features.

The stopping condition for merging the clusters of features, together with the appropriate number of representative features per cluster, remain two important open issues in our method. Future extensions of CFSS should concentrate on these two drawbacks.

References

1. Roweis S, Saul L (2000) Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500):2323–2326
2. Kohavi R, John GH (1997) Wrappers for feature subset selection. Artif Intell 97(1–2):273–324
3. Pudil P, Novovicova J, Kittler J (1994) Floating search methods in feature selection. Pattern Recognit Lett 15:1119–1125
4. Reunanen J (2003) Overfitting in making comparisons between variable selection methods. J Mach Learn Res 3:1371–1382
5. Goldberg D (1989) Genetic algorithms in search, optimization and machine learning. Addison-Wesley, Reading
6. Kennedy J, Eberhart RC (1995) Particle swarm optimization. IEEE Int Conf Neural Netw 4:1942–1948
7. Chandrashekar G, Sahin F (2014) A survey on feature selection methods. Comput Electr Eng 40(1):16–28
8. Peng H, Long F, Ding C (2005) Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans Pattern Anal Mach Intell 27(8):1226–1238
9. Dubes R, Jain AK (1980) Clustering methodologies in exploratory data analysis. In: Yovits MC (ed) Advances in computers. Academic Press Inc., New York, pp 113–125
10. Kasim S, Deris S, Othman RM (2013) Multi-stage filtering for improving confidence level and determining dominant clusters in clustering algorithms of gene expression data. Comput Biol Med 43:1120–1133
11. MacQueen JB (1967) Some methods for classification and analysis of multivariate observations. In: Proceedings of the 5th Berkeley symposium on mathematical statistics and probability, vol 1. University of California Press, pp 281–297
12. Rokach L, Maimon O (2005) Clustering methods. In: Data mining and knowledge discovery handbook. Springer, New York, pp 321–352
13. Manning CD, Schütze H (1999) Foundations of statistical natural language processing. MIT Press, Cambridge
14. Rafsanjani MK, Varzaneh ZA, Chukanlo NE (2012) A survey of hierarchical clustering algorithms. J Math Comput Sci 5(3):229–240
15. Yu-chieh WU (2014) A top-down information theoretic word clustering algorithm for phrase recognition. Inf Sci 275:213–225
16. Mitra P, Murthy C, Pal SK (2002) Unsupervised feature selection using feature similarity. IEEE Trans Pattern Anal Mach Intell 24(3):301–312
17. Sotoca JM, Pla F (2010) Supervised feature selection by clustering using conditional mutual information based distances. Pattern Recogn 43(6):325–343
18. Song Q, Ni J, Wang G (2013) A fast clustering-based feature subset selection algorithm for high-dimensional data. IEEE Trans Knowl Data Eng 25(1):1–14
19. Altman NS (1992) An introduction to kernel and nearest neighbor nonparametric regression. Am Stat 46(3):175–185
20. Ward JH (1963) Hierarchical grouping to optimize an objective function. J Am Stat Assoc 58(301):236–244
21. Song Y, Jin S, Shen J (2011) A unique property of single-link distance and its application in data clustering. Data Knowl Eng 70:984–1003
22. Mansoori EG (2014) GACH: a grid-based algorithm for hierarchical clustering of high-dimensional data. Soft Comput 18(5):905–922
23. Khedkar SA, Bainwad AM, Chitnis PO (2014) A survey on clustered feature selection algorithms for high dimensional data. Int J Comput Sci Inf Technol (IJCSIT) 5(3):3274–3280
24. Cover TM, Thomas JA (1991) Elements of information theory. Wiley, New York
25. Sibson R (1973) SLINK: an optimally efficient algorithm for the single-link cluster method. Comput J (Br Comput Soc) 16(1):30–34
26. Defays D (1977) An efficient algorithm for a complete link method. Comput J (Br Comput Soc) 20(4):364–366
27. Mansoori EG (2013) Using statistical measures for feature ranking. Int J Pattern Recognit Artif Intell 27(1):1–14
28. Asuncion A, Newman DJ (2007) UCI machine learning repository. Department of Information and Computer Science, University of California, Irvine, CA. http://www.ics.uci.edu/mlearn/MLRepository.html
29. McLachlan GJ, Do KA, Ambroise C (2004) Analyzing microarray gene expression data. Wiley, New York
30. Raskutti B, Leckie C (1999) An evaluation of criteria for measuring the quality of clusters. In: Proceedings of the international joint conference on artificial intelligence, pp 905–910
31. Robnik-Šikonja M, Kononenko I (1997) An adaptation of Relief for attribute estimation in regression. In: Machine learning: proceedings of the fourteenth international conference (ICML), pp 296–304
32. Jitkrittum W, Hachiya H, Sugiyama M (2013) Feature selection via L1-penalized squared loss mutual information. IEICE Trans Inf Syst 96(7):1513–1524