The Effect Of Statistical Feature Selections And Feature Extraction On Cancer Classification

Authors

  • Muhammad Azharuddin Arif
  • Zuraini Ali Shah, Universiti Teknologi Malaysia
  • Ashraf Osman Ibrahim, Faculty of Computer Science and Information Technology, Alzaiem Alazhari University, Khartoum North

Abstract

Cancer classification research has adopted advanced technologies such as microarrays. A microarray allows thousands of genes to be measured simultaneously. The technology has been applied successfully to many problems, for example in medical science, and has shown its ability to diagnose patients with specific diseases. It is therefore used to detect diseases such as cancer, which is usually treated as a binary-class problem. The major drawback for classification is that the gene expression data produced by microarrays are high-dimensional. To counter this problem, the important genes should be identified and the dimensionality of the microarray data reduced. In this research, six feature selection methods (Receiver Operating Characteristic curve, Wilcoxon rank sum test, t-statistic, Kruskal-Wallis test statistic, Fisher score, and Gini index) were each combined with Principal Component Analysis (feature extraction) to address the high-dimensionality problem and produce a new, reduced subset of the original datasets. The new dataset is then classified according to class. Three classifiers (K-Nearest Neighbour, Linear Discriminant Analysis, and Support Vector Machine) are used in this research, and the performance of each classifier is calculated and compared. The experimental results show that two combinations achieved the highest correct rate, 96%, outperforming the other feature selection methods: the Wilcoxon rank sum test with Principal Component Analysis for the Linear Discriminant Analysis classifier, and the Receiver Operating Characteristic curve with Principal Component Analysis for the Support Vector Machine classifier.
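To make the feature selection step concrete, the following is a minimal sketch of ranking genes by a Fisher score and keeping the top-scoring ones. It is not the authors' implementation: the function names `fisher_score` and `select_top_genes`, the toy expression values, and the choice of `k` are all assumptions made for illustration, and the score uses one common two-class form (squared difference of class means divided by the sum of class variances). In the workflow the abstract describes, the reduced matrix would then be passed to Principal Component Analysis and a classifier, which are omitted here.

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Fisher score of one gene for a two-class problem (labels 0 and 1)."""
    class0 = [v for v, y in zip(values, labels) if y == 0]
    class1 = [v for v, y in zip(values, labels) if y == 1]
    # Squared gap between class means, relative to the within-class spread.
    gap = (mean(class0) - mean(class1)) ** 2
    spread = pvariance(class0) + pvariance(class1)
    if spread == 0:
        # Degenerate case: no within-class variation at all.
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def select_top_genes(samples, labels, k):
    """Return the column indices of the k highest-scoring genes."""
    n_genes = len(samples[0])
    scores = [
        fisher_score([row[g] for row in samples], labels)
        for g in range(n_genes)
    ]
    ranked = sorted(range(n_genes), key=lambda g: scores[g], reverse=True)
    return ranked[:k]


# Toy expression matrix (made-up values): 6 samples (rows) x 4 genes (columns).
samples = [
    [1.0, 5.0, 0.2, 3.1],
    [1.2, 5.1, 0.9, 2.9],
    [0.9, 4.9, 0.4, 3.0],
    [3.0, 5.0, 0.3, 3.2],
    [3.1, 5.2, 0.8, 2.8],
    [2.9, 4.8, 0.5, 3.0],
]
labels = [0, 0, 0, 1, 1, 1]  # hypothetical binary class labels

top = select_top_genes(samples, labels, k=2)
# Keep only the selected genes; this reduced matrix is what a feature
# extraction step such as PCA would receive next.
reduced = [[row[g] for g in top] for row in samples]
```

In this toy data only the first gene separates the two classes clearly, so it ranks first. Real microarray data would have thousands of columns, and the number of genes to keep would need to be tuned.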

Section

Articles