TY - JOUR
T1 - Evolutionary framework for coding area selection from cancer data
AU - Kamal, Sarwar
AU - Dey, Nilanjan
AU - Nimmy, Sonia Farhana
AU - Ripon, Shamim H.
AU - Ali, Nawab Yousuf
AU - Ashour, Amira S.
AU - Karaa, Wahiba Ben Abdessalem
AU - Nguyen, Gia Nhu
AU - Shi, Fuqian
N1 - Publisher Copyright: © 2016, The Natural Computing Applications Forum.
PY - 2018/2/1
Y1 - 2018/2/1
AB - Cancer data analysis is important for detecting the codes that are responsible for cancer. It is equally important to identify the coding regions in disease-infected biological data. Such data help in designing appropriate drugs and support laboratory assessments. Codes carry specific meaning about the features and symptoms of diseases. Coding of biological data is a key step in obtaining exact information on animals in order to discover the desired medicine. In the current work, four machine learning approaches, namely support vector machine (SVM), principal component analysis (PCA), neural mapping skyline filtering (NMSF) and Fisher’s discriminant analysis (FDA), were applied for data reduction and coding area selection. The experimental analysis established that SVM outperforms PCA and FDA. However, owing to its mapping facility, NMSF outperforms SVM and thus achieved the best results among the four techniques. Matthews’s correlation coefficient was used to evaluate the accuracy, specificity, sensitivity, F-measure and error rate of the four methods used to determine the coding area. The detailed experimental analysis included a comparative study of the four classifiers on the deoxyribonucleic acid (DNA) dataset.
KW - Cancer DNA dataset
KW - Fisher’s discriminant analysis (FDA)
KW - Matthews’s correlation coefficient (MCC)
KW - Neural mapping skyline filtering (NMSF)
KW - Principal component analysis (PCA)
KW - Support vector machine (SVM)
UR - http://www.scopus.com/inward/record.url?scp=84982840469&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84982840469&partnerID=8YFLogxK
UR - https://rdcu.be/dxAQX
U2 - 10.1007/s00521-016-2513-3
DO - 10.1007/s00521-016-2513-3
M3 - Article
AN - SCOPUS:84982840469
SN - 0941-0643
VL - 29
SP - 1015
EP - 1037
JO - Neural Computing and Applications
JF - Neural Computing and Applications
ER -