Train a (statistical) classifier
 
  CC = train_sc(D,classlabel)
  CC = train_sc(D,classlabel,MODE)
  CC = train_sc(D,classlabel,MODE, W)
	weighting D(k,:) with weight W(k) (not all classifiers support weighting)

 CC contains the model parameters of a classifier which can be applied 
   to test data using test_sc. 
   R = test_sc(CC,D,...) 

   D		training samples (each row is a sample, each column is a feature)	
   classlabel	labels of each sample, must have the same number of rows as D. 
 		Two different encodings are supported: 
		{-1,1}-encoding (multiple classes with separate columns for each class) or
		1..M encoding. 
 		So [1;2;3;1;4] is equivalent to 
			[+1,-1,-1,-1;
			 -1,+1,-1,-1;
			 -1,-1,+1,-1;
			 +1,-1,-1,-1;
			 -1,-1,-1,+1]
		Note: samples with classlabel = 0 are ignored. 
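
  A minimal usage sketch (random toy data; the output field R.classlabel
  for the predicted labels is an assumption based on TEST_SC, check its
  help for the exact output fields):

	D          = [randn(50,2)+1; randn(50,2)-1];   % 100 samples, 2 features
	classlabel = [ones(50,1); 2*ones(50,1)];       % 1..M encoding with M=2
	MODE.TYPE  = 'LDA';
	CC = train_sc(D, classlabel, MODE);            % train the classifier
	R  = test_sc(CC, D);                           % apply it to (test) data
	% R.classlabel: predicted class of each sample (assumed field name)
	% train_sc(D, classlabel, MODE, ones(100,1)) would weight each sample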

  The following classifier types are supported as MODE.TYPE:
    'MDA'      Mahalanobis distance based classifier [1]
    'MD2'      Mahalanobis distance based classifier [1]
    'MD3'      Mahalanobis distance based classifier [1]
    'GRB'      Gaussian radial basis function     [1]
    'QDA'      quadratic discriminant analysis    [1]
    'LD2'      linear discriminant analysis (see LDBC2) [1]
		MODE.hyperparameter.gamma: regularization parameter [default 0] 
    'LD3', 'FDA', 'LDA', 'FLDA'
               linear discriminant analysis (see LDBC3) [1]
		MODE.hyperparameter.gamma: regularization parameter [default 0] 
    'LD4'      linear discriminant analysis (see LDBC4) [1]
		MODE.hyperparameter.gamma: regularization parameter [default 0] 
    'LD5'      another LDA (motivated by CSP)
		MODE.hyperparameter.gamma: regularization parameter [default 0] 
    'RDA'      regularized discriminant analysis [7]
		MODE.hyperparameter.gamma: regularization parameter 
		MODE.hyperparameter.lambda: regularization parameter 
		gamma = 0, lambda = 0 : MDA
		gamma = 0, lambda = 1 : LDA [default]
		Hint: the hyperparameters are used only in test_sc.m; testing
		different hyperparameters does not require repeated calls to
		train_sc, it is sufficient to modify CC.hyperparameter before
		calling test_sc (see the example after the list of supported
		types below).
    'GDBC'     general distance based classifier  [1]
    ''         statistical classifier, requires MODE argument in TEST_SC	
    '###/DELETION'  if the data contains missing values (encoded as NaNs), 
		row-wise or column-wise deletion (whichever removes fewer 
		data values) is applied; the selection rule is sketched 
		after the list of supported types below;  
    '###/GSVD'	GSVD and statistical classifier [2,3], 
    '###/sparse'  sparse  [5] 
		'###' can be 'LDA' or any other statistical classifier 
    'PLS'	(linear) partial least squares regression 
    'REG'      regression analysis;
    'WienerHopf'	Wiener-Hopf equation  
    'NBC'	Naive Bayesian Classifier [6]     
    'aNBC'	Augmented Naive Bayesian Classifier [6]
    'NBPW'	Naive Bayesian Parzen Window [9]     

    'PLA'	Perceptron Learning Algorithm [11]
		MODE.hyperparameter.alpha = alpha [default: 1]
		 w = w + alpha * e'*x   (see the sketch below)
    'LMS', 'AdaLine'  Least mean squares, adaptive line element, Widrow-Hoff, delta rule 
		MODE.hyperparameter.alpha = alpha [default: 1]
    'Winnow2'  Winnow2 algorithm [12]
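
  A rough illustration of the PLA update rule above (a hand-rolled
  sketch, not the toolbox code; the bias handling and the two-class
  {-1,+1} target mapping are assumptions):

	alpha = 1;                         % MODE.hyperparameter.alpha
	t = sign(classlabel - 1.5);        % map labels {1,2} to {-1,+1}
	w = zeros(size(D,2)+1, 1);         % weight vector incl. bias term
	for k = 1:size(D,1)
		x = [1, D(k,:)];           % sample with bias input prepended
		e = t(k) - sign(x*w);      % prediction error
		w = w + alpha * e * x';    % update: w = w + alpha*e'*x
	end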

    'PSVM'	Proximal SVM [8] 
		MODE.hyperparameter.nu  (default: 1.0)
    'LPM'      Linear Programming Machine
                 uses and requires train_LPM of the ILOG CPLEX optimizer 
		MODE.hyperparameter.c_value = 
    'CSP'	Common Spatial Patterns; very experimental, just a hack;
		uses a smoothing window of 50 samples.
    'SVM','SVM1r'  support vector machines, one-vs-rest
		MODE.hyperparameter.c_value: cost parameter C
    'SVM11'    support vector machines, one-vs-one + voting
		MODE.hyperparameter.c_value: cost parameter C
    'RBF'      support vector machines with RBF kernel
		MODE.hyperparameter.c_value: cost parameter C
		MODE.hyperparameter.gamma: RBF kernel width parameter
    'SVM:LIB'    libSVM [default SVM algorithm]
    'SVM:bioinfo' uses and requires svmtrain from the bioinfo toolbox        
    'SVM:OSU'   uses and requires mexSVMTrain from the OSU-SVM toolbox 
    'SVM:LOO'   uses and requires svcm_train from the LOO-SVM toolbox 
    'SVM:Gunn'  uses and requires the svc functions from the Gunn-SVM toolbox 
    'SVM:KM'    uses and requires svmclass-function from the KM-SVM toolbox 
    'SVM:LINz'  LibLinear [10] (requires train.mex from LibLinear somewhere in the path)
            z=0 (default) LibLinear with -- L2-regularized logistic regression
            z=1 LibLinear with -- L2-loss support vector machines (dual)
            z=2 LibLinear with -- L2-loss support vector machines (primal)
            z=3 LibLinear with -- L1-loss support vector machines (dual)
    'SVM:LIN4'  LibLinear with -- multi-class support vector machines by Crammer and Singer
    'DT'	decision tree - not implemented yet.  

 The complete list of supported MODE.TYPE values is:
 {'REG','MDA','MD2','QDA','QDA2','LD2','LD3','LD4','LD5','LD6','NBC','aNBC','WienerHopf','LDA/GSVD','MDA/GSVD', 'LDA/sparse','MDA/sparse', 'PLA', 'LMS','LDA/DELETION','MDA/DELETION','NBC/DELETION','RDA/DELETION','REG/DELETION','RDA','GDBC','SVM','RBF','PSVM','SVM11','SVM:LIN4','SVM:LIN0','SVM:LIN1','SVM:LIN2','SVM:LIN3','WINNOW', 'DT'};
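
 Following the hint for 'RDA' above, different hyperparameters can be
 evaluated without retraining, because they take effect only in test_sc.
 A minimal sketch (the hyperparameter field names follow the text above;
 computing the error via R.classlabel is an assumption):

	MODE.TYPE = 'RDA';
	CC = train_sc(D, classlabel, MODE);        % train once
	for lambda = 0:0.25:1
		CC.hyperparameter.lambda = lambda; % no retraining needed
		CC.hyperparameter.gamma  = 0;      % lambda=1: LDA, lambda=0: MDA
		R = test_sc(CC, D);                % hyperparameters applied here
		err = mean(R.classlabel(:) ~= classlabel(:));
		fprintf('lambda=%4.2f  training error=%5.3f\n', lambda, err);
	end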
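
 The '###/DELETION' selection rule can be pictured as follows (a sketch
 of the idea only; the toolbox implementation is ROW_COL_DELETION):

	rowNaN = any(isnan(D), 2);         % samples containing missing values
	colNaN = any(isnan(D), 1);         % features containing missing values
	% pick the deletion that removes fewer data values
	if sum(rowNaN) * size(D,2) <= sum(colNaN) * size(D,1)
		D2 = D(~rowNaN, :);        % row-wise deletion
	else
		D2 = D(:, ~colNaN);        % column-wise deletion
	end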

 CC contains the model parameters of the classifier. Originally, CC was
 a statistical classifier containing the mean and the covariance of the
 data of each class (encoded in the so-called "extended covariance
 matrices"). Nowadays, other classifiers are supported as well.

 see also: TEST_SC, COVM, ROW_COL_DELETION

 References: 
 [1] R. Duda, P. Hart, and D. Stork, Pattern Classification, second ed. 
       John Wiley & Sons, 2001. 
 [2] Peg Howland and Haesun Park,
       Generalizing Discriminant Analysis Using the Generalized Singular Value Decomposition
       IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(8), 2004.
       dx.doi.org/10.1109/TPAMI.2004.46
 [3] http://www-static.cc.gatech.edu/~kihwan23/face_recog_gsvd.htm
 [4] Jieping Ye, Ravi Janardan, Cheong Hee Park, Haesun Park
       A new optimization criterion for generalized discriminant analysis on undersampled problems.
       The Third IEEE International Conference on Data Mining, Melbourne, Florida, USA
       November 19 - 22, 2003
 [5] J.D. Tebbens and P. Schlesinger,
       Improving Implementation of Linear Discriminant Analysis for the Small Sample Size Problem,
       Computational Statistics & Data Analysis, 52(1):423-437, 2007.
       http://www.cs.cas.cz/mweb/download/publi/JdtSchl2006.pdf
 [6] H. Zhang, The optimality of Naive Bayes, 
	 http://www.cs.unb.ca/profs/hzhang/publications/FLAIRS04ZhangH.pdf
 [7] J.H. Friedman. Regularized discriminant analysis. 
	Journal of the American Statistical Association, 84:165–175, 1989.
 [8] G. Fung and O.L. Mangasarian, Proximal Support Vector Machine Classifiers, KDD 2001.
        Eds. F. Provost and R. Srikant, Proc. KDD-2001: Knowledge Discovery and Data Mining, August 26-29, 2001, San Francisco, CA.
 	p. 77-86.
 [9] Kai Keng Ang, Zhang Yang Chin, Haihong Zhang, Cuntai Guan.
	Filter Bank Common Spatial Pattern (FBCSP) in Brain-Computer Interface.
	IEEE International Joint Conference on Neural Networks, 2008. IJCNN 2008. (IEEE World Congress on Computational Intelligence). 
	1-8 June 2008 Page(s):2390 - 2397
 [10] R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin. 
       LIBLINEAR: A Library for Large Linear Classification, Journal of Machine Learning Research 9(2008), 1871-1874. 
       Software available at http://www.csie.ntu.edu.tw/~cjlin/liblinear 
 [11] http://en.wikipedia.org/wiki/Perceptron#Learning_algorithm
 [12] N. Littlestone (1988),
       "Learning Quickly When Irrelevant Attributes Abound: A New Linear-threshold Algorithm",
       Machine Learning, 2(4):285-318.
       http://en.wikipedia.org/wiki/Winnow_(algorithm)

Package: nan