
 
reduction. Furthermore, our results indicate that 
peak detection may not be the optimal choice for 
pre-processing proteomics data. 
ACKNOWLEDGEMENTS 
Support for this research received from the Institute 
of Complex Additive Systems Analysis, a unit of 
New Mexico Tech, is gratefully acknowledged. 
CLASSIFICATION OF MASS SPECTROMETRY DATA - Using Manifold and Supervised Distance Metric Learning