- 1.1. This manual
- 1.2. Introduction to PRSD Studio
- 1.2.1. Versions
- 1.2.2. System requirements
- 1.2.3. Useful general commands
- 1.2.3.1. Displaying PRSD Studio version and license information
- 1.2.3.2. Demo examples
- 1.2.3.3. Provide direct feedback to PR Sys Design
- 1.2.3.4. Control messages displayed by PRSD Studio
- 1.3. Release notes
1.1. This manual
This manual assumes basic knowledge of pattern recognition and of the Matlab environment. To embed trained classifiers into custom applications, basic familiarity with the C language is also assumed.
The manual is structured in four parts:
- User's guide - explains software functionality
- Reference manuals for the PRSD Toolbox and for the libPRSD C library - describe the programming interface
- Knowledge base - collects a number of step-by-step usage examples and "howtos"
- Glossary - explains basic pattern recognition terminology
1.2. Introduction to PRSD Studio
PRSD Studio is a software toolkit simplifying design and deployment of pattern recognition algorithms. It consists of two components, the Matlab-based PRSD Toolbox facilitating algorithm design and the libPRSD library delivering the execution of trained classifiers to custom applications.

PRSD Studio provides tools for:
- Construction of data sets
- Handling of multiple sets of labels and arbitrary meta-data
- Interactive visualization of data and meta-data
- Training statistical detectors and discriminants
- Quick evaluation of classifiers
- Optimizing classifier decisions according to performance requirements using two-class and multi-class ROC analysis
- Building hierarchies of classifiers
- Deploying trained classifiers in custom applications outside Matlab
1.2.1. Versions
PRSD Studio comes in the following versions:
- Lite: Free limited version for non-commercial use, intended for people who are learning about pattern recognition. It contains only the PRSD Toolbox and is limited to data sets with a maximum of 300 samples and three classes.
- Student: One-year license of PRSD Studio for MSc and PhD students. The license is bound to a single machine using activation and limited to a data set size of 10 000 samples and 20 classes.
- Academic toolbox: PRSD Studio Matlab toolbox discounted for use by university students and researchers for non-commercial projects. The license is permanent and bound to a hardware dongle, which allows researchers to move between different machines.
- Academic full: Full version of PRSD Studio discounted for use by university students and researchers for non-commercial projects only. The license is permanent, bound to a hardware dongle, and includes both the PRSD Toolbox and the libPRSD library for execution of trained classifiers outside Matlab.
- Commercial: Full version of PRSD Studio for commercial use. It includes a personal license of the PRSD Toolbox bound to a hardware dongle and unlimited royalty-free deployment of trained classifiers in custom commercial applications using the libPRSD library.
For the Academic and Commercial versions, group licensing is also available using floating licenses provided by a license server.
1.2.2. System requirements
PRSD Studio is supported on the following platforms:
- MS Windows 32-bit
- MS Windows 64-bit
- GNU Linux 32-bit (x86)
- GNU Linux 64-bit (x86)
- Apple Mac OS X (x86)
- Apple Mac OS X 64-bit (x86), experimental
PRSD Studio requires Matlab 7.x. A limited part of the functionality depends on PRTools 4.1.
1.2.3. Useful general commands
1.2.3.1. Displaying PRSD Studio version and license information
The PRSD Studio version may be displayed using prsdversion. It consists of a
numerical part (e.g. 2.0.9) and a build date (e.g. 08-Mar-2010).
prsdversion also provides several license-related details such as the license
type (Commercial, Academic or Lite), the licensee name and the license expiration
date.
>> prsdversion
PRSD Toolbox 2.0.9 (08-Mar-2010), Copyright (C) 2007-2010, PR Sys Design, All rights reserved
Commercial license for PRSD. The license will expire on 20-apr-2010.
1.2.3.2. Demo examples
prsd_demo lists several basic examples to get started:
>> prsd_demo
run prsd_demo(num) where num is the index of the desired example
1 : Working with data sets
2 : Training a classifier and visualizing decisions
3 : Tuning a classifier using ROC analysis
4 : Multi-class ROC analysis
5 : Building detectors
6 : Building a detector-classifier cascade
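As the listing says, an individual example is started by passing its index to prsd_demo. A minimal sketch, using index 3 purely as an illustration:

```matlab
% Run example 3 from the list printed by prsd_demo
% ("Tuning a classifier using ROC analysis").
prsd_demo(3)
```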
1.2.3.3. Provide direct feedback to PR Sys Design
The prsd_feedback command allows users to submit feedback, such as error
messages, to PR Sys Design directly from within Matlab. Running
prsd_feedback without arguments opens an edit dialog where the user may
paste or type the desired message. An alternative is to provide the message
to prsd_feedback as a string.
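A short sketch of the string form. The message text is a made-up placeholder, and passing it as a single string argument is assumed from the description above:

```matlab
% Hypothetical message text; replace it with the actual error or comment.
msg = 'sdroc stops with an error on my data set (full message pasted here)';

% Submit the message directly, without opening the edit dialog.
prsd_feedback(msg)
```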
1.2.3.4. Control messages displayed by PRSD Studio
The prsd_display command provides global verbosity control in PRSD Studio.
Running prsd_display without arguments prints the current display state
(on/off). To switch off messages printed by PRSD Studio, use:
>> prsd_display off
The default prsd_display state is on. When the prsd_mex library is reloaded
into memory, this default state is restored.
Alternatively, you may use the 'nodisplay' option in the functions that support it: sdrelab, sdroc, sddetector and sdcrossval.
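A typical pattern is to silence the toolbox around a noisy block of code and then restore the output. This is a minimal sketch; the "prsd_display on" form is an assumption inferred from the on/off state described above and is not shown elsewhere in this manual:

```matlab
prsd_display        % print the current display state (on/off)
prsd_display off    % switch off messages printed by PRSD Studio

% ... run the training or evaluation code that should stay quiet ...

prsd_display on     % assumed counterpart of 'off'; returns to the default state
```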
1.3. Release notes
development version 2.2.3 (29-July-2010)
- sdsvc allows identifying training samples that became support vectors (using the 'original' property of the support vectors set p{1}.proto)
- sddetector support for an externally defined test set using the 'test' option
- sdfeatsel floating search provides the history of feature subsets selected by individual steps
- sdfeatsel adds a 'test' option which may be used to supply an external data set used for evaluation of the 1-NN error criterion
- sddecide allows construction of an operating point manually. Support for both weighting-based discriminants and thresholding-based detectors.
- sdsvc support for setting an external data set used for error estimation in parameter grid search with the 'test' option
- sddata supports cell array properties
- sdscatter user callbacks are now accessible using the 'callback' option
- untrained classifier pipelines now return names using getname
development version 2.2.2 (22-June-2010)
- fast feature selection sdfeatsel scalable to large data sets (forward search with 1000 samples, 50 features under five seconds). Individual, random selection, forward, backward and floating searches are supported using 1-NN error on a validation set as a criterion. Feature subset size is selected automatically.
- sdscatter user callback called when clicking on a data sample. This allows custom visualization such as loading an image corresponding to a sample from disk and showing it in a separate figure.
- support for untrained high-level operations on data (subset, randsubset, sdrelab, sdroc). This allows one to easily express complex sequences of training operations.
- extended sdscale supporting also robust domain scaling (robust in presence of outliers)
- cascades may now be trimmed to return output after a specific stage using sdconvert, e.g. pc2=sdconvert(pc,'until',2). This helps us to understand how the later stages of hierarchical classifiers improve performance.
- experimental support for Mac OS X 64-bit platform
- sdscatter fix for decision colormap when showing classifier decisions
development version 2.2.1 (3-May-2010)
- interactive visualization of feature distributions in sdscatter for both axes (use 'Show feature distribution' in 'Scatter' menu or press 'd'). This greatly simplifies understanding of overlap in very large data sets where the scatter plot is not too informative.
- sdkmeans classifier and clustering scalable to very large data sets (1 million samples, 10 clusters in 3.3 sec). sdkmeans provides a fast prototype selection method for k-NN classifiers. Classification performance is further improved by prototype pruning (similar effect to editing the training set).
- sdkcentres classifier and clustering
- randsubset allows limiting the maximum number of samples using the 'atmax' option. This is useful to limit sample size but tolerate that some classes have fewer samples.
- find and subset now allow that some of the class names do not exist and return what is present (and not empty [] as before)
stable version 2.1.0 (21-Apr-2010)
- fixing a bug in sddecide related to adding an operating point in an ROC object
- fixing an error message in sdlab constructor
- adding RBF support vector machine training using the sdsvc command. sdsvc is based on libSVM and offers automatic grid search for sigma and C parameters and one-against-all multi-class support.
- adding a reject option to a trained discriminant using the sdreject function (also for multi-class classifiers; both outlier rejection and rejection close to the decision boundary)
- sdcrossval support for estimating ROC with variances using operating point averaging (cross-validate a pipeline returning soft outputs and provide fixed operating points using the 'ops' option)
- adding sdcrossval support for custom sdalg algorithms that are not convertible into a pipeline (algorithm needs to return the list of all possible decisions)
- sddrawroc now saves complete sdroc objects back in the workspace, not only operating points (by pressing 's' key)
- sddecide support for default op.point based on thresholding (e.g. for sdsvc on two-class problems)
- support for clustering using sdmixture with 'cluster' option
- sdscatter adding the "show only this class" command (press 'o' key)
- default mean-error performance measure in sdcrossval is no longer included if the user requests a specific set of measures
- sdneural may switch off the default use of validation for teaching purposes (to illustrate overfitting of the network). Use 'valfrac',[] to suppress the use of the validation set.
- fixing the problem with sdroc using 'confmat' and 'reject' options together
- fixing the bug in sdlab constructor for single label per class
- improving compatibility with PRTools (sdimage, sddetector, sdreject, sdcrossval, sdstackgen, sdscatter visualizing images using sample inspector)
development version 2.0.9 (8-Mar-2010)
- adding support for subset by logical array for sddata and sdlab objects (example: a( a.lab=='banana' ))
- sdtest raises a warning if some of the true classes are not matched to classifier decisions (all samples from these classes are considered misclassified)
- fixed sdscatter problem with the order of classes in "class on top" and "change markers"
- usability improvements in sdfeatplot (click to change figure title; legend properly displaying special characters)
- 'mean-error' performance measure may specify optional class priors used for weighting the class errors
- global display verbosity may be handled using prsd_display command (use prsd_display off to switch off display output of PRSD Studio functions).
- 'nodisplay' options added in sdmixture, sdparzen, sdcrossval
- randsubset supports random selection of objects from some classes only (example: [tr,ts]=randsubset(a,[0.5 0]) returns 50% of the first class for the training)
- sdcrossval outputs string with the result summary, result struct and the evaluation object.
development version 2.0.8 (19-Feb-2010)
- fix in sdimage for multi-dimensional images (image cubes)
- pipelines now provide operating points via p.ops field
- API interface simplification and cleanup
- low-level output of pipelines on matrices and using C API returns indices to decision list as decisions, not the internal codes
- sdlist and sdlab internal numerical representation is not exposed to the user anymore
- feature selection pipeline sdp_fsel now may get the feature labels directly from the data set: pf=sdp_fsel(data,[3 4])
- sddetector handles output polarity automatically (k-NN output is distance, mixture output is similarity)
- adding easy display of sdlab object details (class sizes, fractions) using the transpose operator (lab')
development version 2.0.5 (22-Dec-2009)
- classifier output visualization using sdscatter can now switch between different soft outputs interactively using cursor keys
- added constrain method for easy application of ROC performance constraints
- enhanced setcurop method to choose operating point minimizing or maximizing specific performance measure or setting op.point based on costs
- new performance measure nconfmat - the entry in normalized confusion matrix
- 'target' and 'non-target' options in sddetector and sdroc setting the desired target/non-target names
- setstate method in sdalg algorithm allows to call algorithm function directly (instead of using the multiplication operator)
development version 2.0.4 (14-Dec-2009)
- sdrelab now allows to add string prefix to all classes in all labels present using 'add to all' option. This makes it easy to compare two data sets with multiple labelings (classes, patients, tissues).
- adding sdscale command for data scaling
development version 2.0.3 (9-Dec-2009)
- isclass method for quick check if certain classes are present (useful for custom algorithms)
- sdnorm function adding normalization step to a trained pipeline (this constructs a general discriminant)
- sdlab fix for incorrect class size when initialized with a list and indices
- adding initial version of auto-conversion for older-format @sdppl/sdppl and @sdops/sdops objects
- fix for the inMathOverflow warning/error in sdtree training
development version 2.0.2 (4-Dec-2009)
- new sdlab object simplifies handling of labels, decisions and indexed meta-data
- new sddata object brings easy handling of sample meta-data
- multiple sets of labels or meta-data in a dataset, unified access to sample properties
- simple queries using multiple criteria (give me all samples labeled as "Cancer" from patients 1, 2 and 5 using subset(a,'class','Cancer','patient',[1 2 5]))
- access to classes is greatly simplified
- sdroc handles classifier output polarity automatically (sdexe stores the output type in output_type data property)
- user may change class markers. Data set remembers class markers. Scatter markers are stored in the 'marker' property inside the class list.
- dissimilarity representation contains as feature properties all prototype sample properties
- labels and decisions may be easily concatenated. This allows us to add new labels with a breakdown of errors (confusion-matrix entries) in one command.
- writing custom sdalg algorithms is significantly simplified
1.x Compatibility changes
- sdppl objects use new internal format.
- sderror replaced by sdtest
stable version 1.3 (30-Nov-2009)
- fix in sdnmean classifier: now computing pooled diagonal covariance using class priors
- adding missing parse_measures.p file
- fixing p-code compatibility problem with Matlab 7.4
development version 1.2.5 (12-Oct-2009)
- fixes in findprop for numerical properties
- adding 'all' and 'nodisplay' options to sdrelab
development version 1.2.4 (13-Aug-2009)
- sdtree implements training of decision tree classifier scalable to large number of samples
- fix in prsd_feedback correcting the problem with PRTools not on Matlab path
development version 1.2.3 (15-Jul-2009)
- visualization: sdscatter provides more detailed information in sample inspector including all sample meta-data
- sdrelab: adding prefix or suffix to all class names.
- sdrelab: renaming a single class by relative index
- simpler installation: PRSD Studio Lite installation does not anymore require software activation
- sdroc: support for reject option on classifiers with distance soft output (sdknn)
- selprop, findprop support for set of property values defined by cell array
development version 1.2.2 (16-Jun-2009)
- libPRSD: support for AdaBoost execution using decision tree as base classifiers
- visualization: sdscatter allows interactive change of classifier parameters using slider (k in k-NN, smoothing in Parzen, number of base classifiers in AdaBoost)
- visualization: sdimage may be connected to ROC plot and visualize decisions at different operating points in real-time
- sdneural provides a 'target' option that allows one to approximate trained classifiers
- sdroc: fraction of all objects may be rejected by specifying fraction after the 'reject' option
development version 1.2.1 (27-May-2009)
- sdnbayes implementing Naive Bayes classifier with automatic selection of number of histogram bins
- sdroc now supports cost-based selection of operating point for two-class scenario (in addition to the existing multi-class cost-based optimization)
- sddecide may be used in pipelines to define default operating point
- sdp_affine can construct simple feature scaling pipelines
stable version 1.2 (19-May-2009)
- sdmixture supports automatic estimation of number of components
- sdneural implementing feed-forward neural network training
- sdcrossval now supports untrained pipelines
development version 1.1.6 (9-May-2009)
- sdparzen Parzen classifier implementing scalar and vector smoothing
- sdknn k-th nearest neighbor classifier with prototype selection and support for both detection and multi-class classification
development version 1.1.5 (1-May-2009)
- libPRSD now supports loading pipelines also from a buffer using prsd_LoadPipelineFromBuffer (pipelines may now be stored in application resources or sent over network).
- sdroc supports rejection both far away and close to the decision boundary using the 'reject' option.
- sdscatter: the figure title may be selected interactively by clicking on the title area
- simplified selection of performance measures in sdroc
development version 1.1.4 (19-Mar-2009)
- adding support for group licenses via license server
- support for construction of arbitrary hierarchical classifiers using decision-level fusion and their execution through libPRSD
- sddetector brings one-command construction of detectors based on arbitrary model (both in one-class setting specifying a threshold using fraction of rejected samples and in two-class setup using ROC analysis to fix the threshold minimizing mean error).
- sddrawroc allows saving the current operating point into any relevant object (sdroc, sdops, pipelines, sddecide mappings, custom sdalg algorithms)
- introducing sdmixture for training Gaussian mixture models (one- or multi-class, variable number of components per class, different stopping criteria (iterations or likelihood delta))
- sdrelab allows defining classes by the ~ (tilde) negation operator (e.g. turn all that is not apple into "non-apple")
- sdscatter allows the user to flip through order of classes (z-order) by + and - key-strokes
- a number of usability improvements in construction of pipelines and interaction with PRTools (sdroc and sdops objects can now be directly concatenated into pipelines; sdmap wraps pipelines for use in PRTools)
- many improvements in confusion matrix estimation: sdconfmat
- setprop now allows quickly setting a property to a constant value. This makes it very easy to quickly tag a group of samples with a specific label.
- sdconfmat can now add new labels with all confusion matrix combinations as a property. This can be used to quickly visualize different types of error directly in the feature space.
- sdconfmat cosmetic fix: string confusion matrix scales nicely with long class names
- new function selprop returning a subset of a dataset with given property values
- significant improvements in scalability of sdroc to large datasets in speed and memory usage. Practical even for datasets with 100 000 samples and tens of thousands of operating points.
- improved ROC optimizer brings better quality sets of operating points
- sdconfmat can now estimate confusion matrices for sets of operating points from the soft outputs
- sdexe can return numerical decision codes ('code' option). This is useful for low-level work with classifier outputs.
- pipelines can return numerical decision codes using the .* operator (e.g. dec=data.*p)
- sdeaclust clustering can now be executed on new data. Scalable to very large datasets (images).
development version 1.1.3 (26-Jan-2009)
- fix in sdscatter allowing to paint labels with legend switched on
- fix in sdscatter retaining the type of numerical properties in a dataset saved back to workspace
- sdscatter can now switch visibility of classes or groups on/off. That's helpful when inspecting large datasets with many overlapping sample groups (patients). See context menu in sdscatter Figure windows. Painting now applies only to visible samples.
- initial support for hierarchical systems composed of multiple classifiers returning decisions (sdp_cascade)
- support for meta-classes and different features at each classifier node.
- ROC analysis for hierarchical systems
- sdconfmat added
- the order of labels and decisions (lablists) can be fixed by the user
- sdconfmat can correctly handle situations where only some classes/decisions are present in the test set (given the full lablists)
- sdconfmat can return the string with a table
- support for normalization of confusion matrices
- lablists may be supplied as cell arrays of strings or string arrays
- support for weight-based operating points with reject option (rejection both close to the boundary and distance-based)
- sdroc automatically shows rejected fraction and all per-class TPrs
- support for similarity-based and distance-based classifier outputs
- adding reject fraction estimate to sdroc
- support for leave-one-out over a property (object, person, patient...)
- fix for the bug where sdscatter made error when mouse pointer was moved too quickly over the new window
development version 1.1.2 (18-Nov-2008)
- adding fast approximated k-NN (see the example in our blog)
- adding a k-centres classifier capable of both one-class classification and multi-class discrimination
- feature selection algorithm sda_featsel now supports also backward feature selection
development version 1.1.1 (09-Nov-2008)
- adding leave-one-out evaluation to sdcrossval
- adding sdfeatsel: robust feature selection using internal cross-validation loop. It supports custom-made feature selection algorithms
- two example algorithms added illustrating the use of feature selection during training (sda_featsel_example1) and in inner cross-validation loop based on sdfeatsel (sda_featsel_example2)
stable version 1.1 (04-Nov-2008)
- fixing critical bug in 1.1 26-Oct-2008 related to problem with dongles
- fixing the issues with one-sample test sets in ROC
development version 1.1 (26-Oct-2008)
- sdscatter gets full support for GUI menus and class renaming
- new sdimage command visualizing image stored in a dataset. Support for label painting, class renaming, multiple sample groupings, connection to sdscatter
- sdscatter support for interactive sample inspector (datasets with 1D data using bar plot or 2D images)
- sddrawroc can now show confusion matrices at the cursor and at the selected operating point (if present, i.e. if the 'confmat' flag was specified in the sdroc command)
- sdexe now automatically converts sdalg algorithms into pipelines
- sdstackgen now returns also a robust base classifier (mean fusion of per-fold trained base classifiers) as a second output
- improved support for prtools classifiers with output conversion
- fix: sdroc now stores confusion matrices in multi-class situations using 'confmat' flag
- fix to scaling using affine projection. scalem is now supported for all affine scaling types
stable version 1.0 (15-Sep-2008)
- added randomization cross-validation scheme sdcrossval(nmc,data,'method','random')
- ROC object may be queried using short names of measurements r(:,'err(Cancer)')
- activation support for commercial demos
development version 1.0 (02-Sep-2008)
- fix: included missing sda_prtools wrapper
- new feature: sdscatter now allows for user-defined titles (sample details moved to the figure title bar)
