This chapter explains how to train classifiers and convert their output into decisions.
- 7.1. Introduction
- 7.1.1. Training a statistical model
- 7.2. Working with pipelines
- 7.2.1. Pipeline output types
- 7.2.2. Pipeline labels
- 7.2.3. Pipeline list
- 7.2.4. Executing the trained model on new data
- 7.2.5. Visualizing model outputs using sdscatter
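- 7.3. Pipeline objects
- 7.3.1. Training a pipeline
- 7.3.2. Inspecting pipeline structure and parameters
- 7.3.3. Constructing pipelines manually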
7.1. Introduction ↩
Statistical classifiers are trained from labeled examples to discriminate between user-defined classes. When applied to a new observation, a trained classifier returns a crisp decision. In PRSD Studio, we consider a classifier to be composed of two components, as illustrated in the image below. The first is the statistical model, which estimates the confidence that a data sample belongs to each of the considered classes. The second is a decision function, which converts the estimated confidences into a crisp decision.

The distinction between the statistical model and the decision function is important for practical applications, as it enables fine-tuning of trained models to specific performance requirements. We will discuss this process later in the chapter on ROC analysis.
We use the term soft output for the real-valued result of the statistical model. Depending on the type of model, the soft output may take different forms. For a probabilistic classifier, it may be the posterior probability of class membership; for the nearest neighbor classifier, the distance to the nearest neighbor; for a neural network, simply the network output.
7.1.1. Training a statistical model ↩
Let us illustrate training of a statistical classifier on a simple example. We load an artificial "fruit" data set with two features and three classes, called apple, banana and stone.
In this example, we train a linear classifier assuming normal densities using
the sdlinear function. It estimates a Gaussian model for each of the
classes, assuming that they share the same covariance matrix:
>> load fruit
260 by 2 sddata, 3 classes: 'apple'(100) 'banana'(100) 'stone'(60)
>> p=sdlinear(a)
sequential pipeline 2x3 'Linear discriminant'
1 Gauss eq.cov. 2x3 3 classes, 3 components (sdp_normal)
2 Output normalization 3x3 (sdp_norm)
The result of the training is a pipeline object p. In PRSD Studio,
pipelines serve as a basis for training and execution of pre-defined
pattern recognition algorithms.
The pipeline p has two inputs corresponding to the two input features in
our fruit problem and three outputs representing the three classes.
The pipeline comprises two steps. The first is a Gaussian model and the second is an output normalization, which ensures that the pipeline output is an estimate of the posterior probability of class membership.
When applied to the data, this pipeline will produce soft outputs of the model.
>> out=a*p
260 by 3 sddata, 3 classes: 'apple'(100) 'banana'(100) 'stone'(60)
Looking inside the output set out, we will find three per-class
outputs.
>> +out(1:4)
ans =
0.9860 0.0140 0.0000
0.9947 0.0053 0.0000
0.8437 0.1562 0.0001
0.9933 0.0067 0.0000
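The two steps can also be reproduced by hand. The following is a minimal sketch in plain Python, not the sdlinear implementation: it assumes that the model uses class means, one pooled covariance matrix and priors taken from class frequencies, and that the normalization divides the prior-weighted densities by their sum. The toy data and all function names are made up for illustration.

```python
import math

def class_mean(rows):
    """Mean vector of a list of 2D samples."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def pooled_covariance(classes, means):
    """2x2 covariance matrix shared by all classes (pooled estimate)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    n = 0
    for rows, m in zip(classes, means):
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
        n += len(rows)
    dof = n - len(classes)
    return [[s[i][j] / dof for j in range(2)] for i in range(2)]

def gauss_density(x, m, cov):
    """Density of a 2D Gaussian with mean m and covariance cov at x."""
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    inv = [[cov[1][1] / det, -cov[0][1] / det],
           [-cov[1][0] / det, cov[0][0] / det]]
    d = [x[0] - m[0], x[1] - m[1]]
    maha = (d[0] * (inv[0][0] * d[0] + inv[0][1] * d[1])
            + d[1] * (inv[1][0] * d[0] + inv[1][1] * d[1]))
    return math.exp(-0.5 * maha) / (2 * math.pi * math.sqrt(det))

def posteriors(x, means, cov, priors):
    """Step 1: per-class densities. Step 2: normalize them to sum to one."""
    weighted = [p * gauss_density(x, m, cov) for m, p in zip(means, priors)]
    total = sum(weighted)
    return [w / total for w in weighted]

# Hypothetical toy data: three classes with four 2D samples each.
toy = [
    [(0, 0), (1, 0), (0, 1), (1, 1)],   # 'apple'
    [(4, 4), (5, 4), (4, 5), (5, 5)],   # 'banana'
    [(0, 4), (1, 4), (0, 5), (1, 5)],   # 'stone'
]
means = [class_mean(rows) for rows in toy]
cov = pooled_covariance(toy, means)
n_total = sum(len(rows) for rows in toy)
priors = [len(rows) / n_total for rows in toy]

post = posteriors([0.5, 0.5], means, cov, priors)
print(post)
```

Each row of the soft output shown above plays the role of post here: one value per class, with the values summing to one.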
To return crisp decisions, we need to add a decision step to the
pipeline. The simplest way is to call the sddecide function:
>> pd=sddecide(p)
sequential pipeline 2x1 'Gauss eq.cov.+Output normalization+Decision'
1 Gauss eq.cov. 2x3 3 classes, 3 components (sdp_normal)
2 Output normalization 3x3 (sdp_norm)
3 Decision 3x1 weighting, 3 classes, 1 ops at op 1 (sdp_decide)
Executed on an input data set, the pipeline pd returns decisions in an
sdlab object.
>> dec=a*pd
sdlab with 260 entries, 3 groups: 'apple'(105) 'banana'(99) 'stone'(56)
>> +dec(1:7)
ans =
apple
apple
apple
apple
banana
banana
apple
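The decision step can be pictured as a small function applied to each row of soft outputs. The sketch below is plain Python, not the sddecide implementation. It assumes that the "weighting" shown in the pipeline display means that each soft output is multiplied by a per-class weight before the largest one is selected; the function name and the weights are hypothetical.

```python
def decide(soft, labels, weights=None):
    """Return the label with the largest weighted soft output."""
    if weights is None:
        weights = [1.0] * len(soft)          # equal weights: plain maximum
    scores = [w * s for w, s in zip(weights, soft)]
    return labels[scores.index(max(scores))]

labels = ['apple', 'banana', 'stone']

# The first four rows of the soft output printed earlier in this section.
soft = [
    [0.9860, 0.0140, 0.0000],
    [0.9947, 0.0053, 0.0000],
    [0.8437, 0.1562, 0.0001],
    [0.9933, 0.0067, 0.0000],
]
decisions = [decide(row, labels) for row in soft]
print(decisions)

# Raising the banana weight moves the third sample across the boundary
# without retraining the model.
shifted = decide(soft[2], labels, weights=[1.0, 10.0, 1.0])
print(shifted)
```

Changing only the weights, and not the model, is what makes it possible to tune a trained classifier to specific performance requirements.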
Pipelines are the PRSD Studio vehicle for delivering trained classifiers into external applications. They are always executed through the C library libPRSD, which is linked into Matlab as a MEX library.
Pipelines are fully self-contained and do not need any further information for their execution. We may, therefore, apply a pipeline directly to a matrix of measurements:
>> [0 0; -8 -5; -3 5; 5 10]*p
ans =
0.6167 0.3640 0.0193
0.0070 0.9849 0.0081
0.0039 0.1592 0.8369
0.1438 0.0245 0.8317
>> dec=double(a(1:7))*pd
dec =
-101
-101
-101
-101
-102
-102
-101
7.2. Working with pipelines ↩
7.2.1. Pipeline output types ↩
Pipelines may provide different types of output. We may query the output
type using the getoutput function:
>> getoutput(p)
ans =
class similarity
>> getoutput(pd)
ans =
decision
The complete list of pipeline outputs includes:
- class similarity - probability or confidence per class (sdlinear, sdparzen, sdmixture etc.)
- class distance - distance to nearest neighbor per class (sdknn)
- proto similarity - similarity to prototypes (sdprox)
- proto distance - distance to prototypes
- data - output without affinity to any specific class (sdpca, sdlda)
- decision - crisp decision output (sddecide)
Distinguishing similarity from distance outputs lets the decision function set its polarity automatically: for a similarity, the largest output indicates the best-matching class; for a distance, the smallest.
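The role of the output type can be sketched in a few lines of plain Python. This is an illustration under the assumption that polarity simply selects between a maximum and a minimum; the function name and the distance values are hypothetical and not part of PRSD Studio.

```python
def decide_with_polarity(outputs, labels, output_type):
    """Pick the best class, with polarity given by the output type."""
    indices = range(len(outputs))
    if output_type.endswith('similarity'):
        best = max(indices, key=lambda i: outputs[i])   # larger is better
    elif output_type.endswith('distance'):
        best = min(indices, key=lambda i: outputs[i])   # smaller is better
    else:
        raise ValueError('no decision polarity for output type: ' + output_type)
    return labels[best]

labels = ['apple', 'banana', 'stone']

# Posterior-like outputs: the largest value wins.
by_similarity = decide_with_polarity([0.6167, 0.3640, 0.0193], labels,
                                     'class similarity')

# Made-up per-class distances: the smallest value wins.
by_distance = decide_with_polarity([2.5, 0.4, 7.1], labels, 'class distance')

print(by_similarity, by_distance)
```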
7.2.2. Pipeline labels ↩
When a pipeline that does not return decisions is applied to a data set, it sets the feature names of the resulting data set object from its labels. This helps us to interpret the pipeline output.
For example, the labels of the pipeline p trained above are:
>> getlab(p)
sdlab with 3 entries, 3 groups: 'apple'(1) 'banana'(1) 'stone'(1)
>> +getlab(p)
ans =
apple
banana
stone
Alternatively, we may just access the pipeline lab field:
>> p.lab
sdlab with 3 entries, 3 groups: 'apple'(1) 'banana'(1) 'stone'(1)
For pipelines returning decisions, the lab field is empty:
>> pd.lab
ans =
[]
7.2.3. Pipeline list ↩
Pipelines returning decisions describe the decisions they can return in the list field:
>> pd.list
sdlist (3 entries)
ind name
1 apple
2 banana
3 stone
For pipelines not returning decisions, the list is empty:
>> p.list
ans =
[]
7.2.4. Executing the trained model on new data ↩
The pipeline p may be executed on a 2D feature vector or matrix with two columns:
>> [0 0]*p
ans =
0.6167 0.3640 0.0193
Note that the pipeline outputs are posteriors and thus sum to one. Each of
the outputs corresponds to one of the classes. We can request the labels
assigned to pipeline outputs using the getlab method:
>> getlab(p)
sdlab with 3 entries, 3 groups: 'apple'(1) 'banana'(1) 'stone'(1)
>> +getlab(p)
ans =
apple
banana
stone
Alternatively, we can just access the pipeline lab field:
>> p.lab
sdlab with 3 entries, 3 groups: 'apple'(1) 'banana'(1) 'stone'(1)
Pipelines may be executed on matrices with rows corresponding to samples and columns to features:
>> [0 0; -8 -5; -3 5; 5 10]*p
ans =
0.6167 0.3640 0.0193
0.0070 0.9849 0.0081
0.0039 0.1592 0.8369
0.1438 0.0245 0.8317
7.2.5. Visualizing model outputs using sdscatter ↩
In order to visualize the pipeline output, we may pass it together with the
data to the sdscatter function:
>> p=sdgauss(a)
Gaussian model pipeline 2x3 3 classes, 3 components (sdp_normal)
>> sdscatter(a,p)
Warning: rendering only the default first output feature. Use 'out' option to visualize other outputs.

The scatter plot now contains a backdrop image showing the pipeline output computed in a grid over our 2D feature space. We can see the shape of the Gaussian probability density function estimated for the first class in the problem.
To visualize the soft output for the second class, use the out option:
>> sdscatter(a,p,'out',2)
Note that the visualization of model output is supported only for 2D feature spaces.
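The backdrop is obtained by evaluating one soft output at every node of a regular grid over the feature space. The sketch below shows this idea in plain Python with a hand-written isotropic Gaussian standing in for the pipeline output. It is not how sdscatter is implemented; the grid size, the value range and the text rendering are arbitrary choices for illustration.

```python
import math

def density(x, y, mean=(0.0, 0.0), sigma=1.0):
    """Isotropic 2D Gaussian density, standing in for one soft output."""
    d2 = (x - mean[0]) ** 2 + (y - mean[1]) ** 2
    return math.exp(-0.5 * d2 / sigma ** 2) / (2 * math.pi * sigma ** 2)

def evaluate_on_grid(f, x_range, y_range, steps):
    """Evaluate f(x, y) on a steps-by-steps grid; rows follow y, columns x."""
    xs = [x_range[0] + i * (x_range[1] - x_range[0]) / (steps - 1)
          for i in range(steps)]
    ys = [y_range[0] + i * (y_range[1] - y_range[0]) / (steps - 1)
          for i in range(steps)]
    return [[f(x, y) for x in xs] for y in ys]

grid = evaluate_on_grid(density, (-3.0, 3.0), (-3.0, 3.0), 7)

# Crude text rendering: darker characters mean higher output.
shades = ' .:*#'
peak = max(max(row) for row in grid)
for row in grid:
    print(''.join(shades[int(v / peak * (len(shades) - 1))] for v in row))
```

A grid of this kind needs one axis per feature, which is why the backdrop is limited to two-dimensional feature spaces.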
7.3. Pipeline objects ↩
The concept of a pipeline is fundamental both for the design of classifiers in PRSD Studio and for their deployment in custom applications.
Pipelines represent pattern recognition algorithms. They may be trained
using built-in routines like sdquadratic or sdmixture. They may also be
created by converting algorithms trained elsewhere, for example in PRTools
or using LIBSVM. Pipelines allow the composition of complex chains of
algorithms, such as sequences, classifier combiners or even hierarchical
classifiers.
Pipelines enable a fast transition from algorithm design under Matlab to production deployment, because pipeline execution is always performed through the libPRSD library written in C. Under Matlab, libPRSD is available through the MEX interface; outside Matlab, it is available as a DLL without any dependency on Matlab or external libraries. The benefit of this solution is that the identical execution implementation is used during algorithm design, testing and production.
7.3.1. Training a pipeline ↩
We will illustrate training a pipeline on the example of a general mixture
model implemented by sdmixture.
>> a=sddata(gendatf(1000))
1000 by 2 sddata, 3 classes: 'apple'(333) 'banana'(333) 'stone'(334)
Used without additional options, sdmixture optimizes the number of Gaussian components per class.
>> p=sdmixture(a)
[class 'apple' initialization:......................... 4 clusters EM:.............................. 4 comp]
[class 'banana' initialization:......................... 2 clusters EM:.............................. 2 comp]
[class 'stone' initialization:......................... 1 cluster EM:.............................. 1 comp]
Mixture of Gaussians pipeline 2x3 3 classes, 7 components (sdp_normal)
We may also specify the number of components per class:
>> p=sdmixture(a,'comp',5)
[class 'apple' EM:.............................. 5 comp]
[class 'banana' EM:.............................. 5 comp]
[class 'stone' EM:.............................. 5 comp]
Mixture of Gaussians pipeline 2x3 3 classes, 15 components (sdp_normal)
>> sdscatter(a,p)

The training is implemented using the Expectation-Maximization (EM) algorithm, which maximizes the model likelihood. By default, it stops when the change in likelihood falls below a threshold. We can also stop it after a given number of iterations:
>> p=sdmixture(a,'comp',[5 5 1],'iter',10)
[class 'apple' EM:.......... 5 comp]
[class 'banana' EM:.......... 5 comp]
[class 'stone' EM:.......... 1 comp]
Mixture of Gaussians pipeline 2x3 3 classes, 11 components (sdp_normal)
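The two stopping rules can be seen in a compact EM loop. The following is a minimal sketch in plain Python for a one-dimensional mixture fitted to a single class. It is not the sdmixture implementation: the quantile-based initialization is a simple stand-in for the clustering initialization shown in the log above, and the function names, the tolerance and the variance floor are assumptions made for illustration.

```python
import math
import random

def normal_pdf(x, mu, var):
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)

def log_likelihood(data, weights, means, variances):
    return sum(
        math.log(sum(w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)))
        for x in data)

def em_mixture(data, n_comp, max_iter=100, tol=1e-6):
    """Fit a 1D Gaussian mixture; stop on small likelihood gain or max_iter."""
    n = len(data)
    ordered = sorted(data)
    # Initialization (assumption): spread the means over the data quantiles.
    means = [ordered[(2 * k + 1) * n // (2 * n_comp)] for k in range(n_comp)]
    mu_all = sum(data) / n
    var_all = sum((x - mu_all) ** 2 for x in data) / n
    variances = [var_all] * n_comp
    weights = [1.0 / n_comp] * n_comp
    history = [log_likelihood(data, weights, means, variances)]
    for _ in range(max_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            joint = [w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: re-estimate weights, means and variances.
        for k in range(n_comp):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2
                        for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)   # guard against collapse
        history.append(log_likelihood(data, weights, means, variances))
        if history[-1] - history[-2] < tol:   # likelihood-change criterion
            break
    return weights, means, variances, history

# Synthetic class with two clusters, around 0 and around 5.
rng = random.Random(1)
data = ([rng.gauss(0.0, 1.0) for _ in range(100)]
        + [rng.gauss(5.0, 1.0) for _ in range(100)])

weights, means, variances, history = em_mixture(data, 2)
_, _, _, history_short = em_mixture(data, 2, max_iter=10)  # iteration limit
print(means, len(history), len(history_short))
```

The max_iter argument plays the role of the iter option: the loop ends after that many iterations even if the likelihood is still improving.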
Pipelines may be combined. The example below constructs a trained pipeline composed of two steps, namely a PCA projection followed by a quadratic classifier.
>> a
381 by 1024 sddata, 17 classes: [31 28 24 33 19 21 57 26 21 9 13 15 14 1 14 29 26]
>> p1=sdpca(a,6) % PCA projection on the first 6 eigenvectors
PCA pipeline 1024x6 75% of variance (sdp_affine)
>> p2=sdquadratic(a*p1)
sequential pipeline 6x17 'Quadratic discr.'
1 Gauss full cov. 6x17 17 classes, 17 components (sdp_normal)
2 Output normalization 17x17 (sdp_norm)
>> p=p1*p2
sequential pipeline 1024x17 'PCA+Quadratic discr.'
1 PCA 1024x6 75% of variance (sdp_affine)
2 Gauss full cov. 6x17 17 classes, 17 components (sdp_normal)
3 Output normalization 17x17 (sdp_norm)
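The behaviour of the * operator, concatenating pipelines and applying a pipeline to data, can be mimicked in a few lines. The class below is a toy sketch in plain Python and is unrelated to the actual pipeline object or to libPRSD; the class name, the step functions and the data are made up for illustration. Note that Python indexes steps from zero, whereas the pipeline display above numbers them from one.

```python
class Pipeline:
    """Toy sequential pipeline: an ordered list of (name, function) steps."""

    def __init__(self, steps):
        self.steps = list(steps)

    def __mul__(self, other):
        # p1 * p2: a new pipeline running the steps of p1, then those of p2.
        return Pipeline(self.steps + other.steps)

    def __rmul__(self, data):
        # data * p: push every row (sample) through all steps in order.
        for _, fn in self.steps:
            data = [fn(row) for row in data]
        return data

    def __getitem__(self, index):
        # p[i]: a single step, returned as a one-step pipeline.
        return Pipeline([self.steps[index]])

    def __repr__(self):
        return 'Pipeline(' + '+'.join(name for name, _ in self.steps) + ')'

# Two hypothetical steps: a fixed rescaling and an output normalization.
p1 = Pipeline([('scale', lambda row: [v / 10.0 for v in row])])
p2 = Pipeline([('normalize', lambda row: [v / sum(row) for v in row])])

p = p1 * p2
data = [[1.0, 3.0], [2.0, 2.0]]

out = data * p
print(p, out)
print(p[1])
```

Applying the combined pipeline gives the same result as applying p1 and then p2, which is why a classifier trained on the projected data a*p1 can be attached directly behind the projection.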
7.3.2. Inspecting pipeline structure and parameters ↩
A pipeline object behaves as a sequence of actions. Each action is accessible
by its index, shown on the left in the pipeline display string. We can
access pipeline actions using parentheses (). In our example above, we
can inspect the second step in the pipeline p by:
>> p(2)
Gauss full cov. pipeline 6x17 17 classes, 17 components (sdp_normal)
To look inside a pipeline step and retrieve its parameters, we can use curly brackets {}:
>> p{2}
ans =
mean: [17x6x17 sddata]
cov: [6x6x17 double]
prior: [1x17 double]
The parameters of pipeline actions are returned in a structure. This structure contains all the information needed to execute the pipeline on new data. For the quadratic classifier, it contains the means and covariances of the classes, and class priors. Accessing the value is straightforward:
>> +p{2}.mean(1:2) % gives the mean value for the first 2 classes
ans =
888.0094 490.5106 87.9525 -314.7964 -241.2580 54.5189
-139.8726 -397.0727 -559.8701 -283.7282 -89.5977 234.9844
7.3.3. Constructing pipelines manually ↩
Pipelines may be constructed manually using the functions with the sdp_
prefix. This gives us the freedom to train in an arbitrary toolbox or library.
The only requirement is that we are able to extract the classifier
parameters.
For example, we may construct a Parzen classifier by supplying the smoothing
parameter and the matrix of prototypes. We might use this approach to
model the two fruit classes in the data a and so protect them from outliers. We might select a proper smoothing parameter, for example, using a grid search minimizing the detector error on the existing outliers (the stone class). We will discuss the construction of detectors in detail in Chapter 8.
>> load fruit
260 by 2 sddata, 3 classes: 'apple'(100) 'banana'(100) 'stone'(60)
>> p=sdp_parzen('gauss',2,+a(:,:,{'apple','banana'}))
sequential pipeline 2x1 ''
1 sdp_parzen 2x1 one class, 200 prototypes
>> sdscatter(a,p)
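What such a one-class Parzen model computes from its two parameters can be sketched in plain Python. This is an illustration, not the sdp_parzen implementation. It assumes a Gaussian kernel in which the smoothing parameter acts as the kernel standard deviation; whether PRSD Studio interprets the value 2 in this way is not stated here. The prototypes, the threshold and the function names are made up.

```python
import math

def parzen_density(x, prototypes, sigma):
    """Average of Gaussian kernels of width sigma centred on the prototypes."""
    dim = len(x)
    norm = (2 * math.pi * sigma ** 2) ** (dim / 2)
    total = 0.0
    for proto in prototypes:
        d2 = sum((a - b) ** 2 for a, b in zip(x, proto))
        total += math.exp(-0.5 * d2 / sigma ** 2) / norm
    return total / len(prototypes)

def is_outlier(x, prototypes, sigma, threshold):
    """Reject samples whose density under the model is below the threshold."""
    return parzen_density(x, prototypes, sigma) < threshold

# Hypothetical prototypes standing in for the apple and banana samples.
prototypes = [(0.0, 0.0), (1.0, 1.0), (4.0, 5.0)]
sigma = 2.0

near = parzen_density((0.5, 0.5), prototypes, sigma)
far = parzen_density((20.0, 20.0), prototypes, sigma)
print(near, far, is_outlier((20.0, 20.0), prototypes, sigma, threshold=1e-3))
```

A grid search over sigma would repeat this evaluation for several candidate values and keep the one with the lowest detector error on the known outliers.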

Another example of manual pipeline construction is the knowledge base article on training a support vector classifier through LIBSVM.
