This chapter describes how to handle labels and indexed properties.
- 4.1. Introduction
- 4.2. Basic handling of labels
- 4.3. Creating labels
- 4.3.1. Providing per-sample class names
- 4.3.2. Consecutive labeling
- 4.3.3. One entry per class
- 4.3.4. Using label list
- 4.4. Operations on labels
- 4.4.1. Accessing label information
- 4.4.2. Retrieving class sizes and priors
- 4.4.3. Searching for samples with specific labels
- 4.4.4. Subsets of labels
- 4.4.5. Relabeling: Changing class names and defining meta-classes
- 4.4.6. Concatenating label sets
- 4.5. Creating label lists
- 4.5.1. From list of class names
- 4.5.2. Using string prefix
- 4.6. Operations on label lists
- 4.6.1. Accessing list content
- 4.6.2. Converting between names and indices
4.1. Introduction ↩
Label is an assignment of an object into specific category. For example, an image of a road sign may be assigned into a class 'no stopping'. In pattern recognition we usually work with sets of labels corresponding to sets of data samples. In addition to common class label per data sample, PRSD Studio also uses labels for any indexable meta-data such as patient identifiers or video frames. Labels help us to categorize observations and define concepts.
In PRSD Studio, a set of labels is represented by an object of class
sdlab. In this manual, we refer to one sdlab object by the plural
'labels'.
The sdlab object is composed of two components, namely the label
list and the string vector representation.

The label list is a table containing all information on concepts represented by the label set. These may be individual classes, patients, or video frames. Each concept has a unique string name. Each entry of the label represents one object. For example, if a label set describes sample class labels, it shows to the user one string per sample. Internally a numerical representation is used to enhance speed in the operations.
4.2. Basic handling of labels ↩
Set of labels may be constructed by providing the class names per sample:
>> lab=sdlab({'banana','banana','apple','apple','banana'})
sdlab with 5 entries, 2 groups: 'apple'(2) 'banana'(3)
We have created a label object lab holding information on 5 data
samples. The samples are labeled into two classes, 'apple' and 'banana'.
Label objects may be created in many different ways, see Section Creating labels.
Class name in PRSD Studio is always a string and is unique within the label set.
In order to view per-sample labels, we may use the getnames method or
the unary plus operator:
>> +lab
ans =
banana
banana
apple
apple
banana
The classes are described in the list object stored within the label set:
>> lab.list
sdlist (2 entries)
ind name
1 apple
2 banana
As we can see, the list has two entries, as two are the concepts described.
In PRSD Studio, classes may be always accessed by name or index in the list. For example, we may use the find function to obtain indices of label entries for a specific class:
>> find(lab=='apple')
ans =
3
4
>> find(lab==1)
ans =
3
4
4.3. Creating labels ↩
4.3.1. Providing per-sample class names ↩
Label objects may be constructed by providing per-sample string labels in a cell array
>> t={'apple','banana','apple'};
>> lab=sdlab(t)
sdlab with 3 entries, 2 groups: 'apple'(2) 'banana'(1)
or in string array:
>> t=strvcat({'apple','banana','apple'})
t =
apple
banana
apple
>> lab=sdlab(t)
sdlab with 3 entries, 2 groups: 'apple'(2) 'banana'(1)
We may also provide per-sample labels directly into sdlab:
>> lab=sdlab('apple','banana','apple')
sdlab with 3 entries, 2 groups: 'apple'(2) 'banana'(1)
We created a label set for three observations. The set contains two classes (in general called groups), namely 'apple' and 'banana'. Note that for small number of classes, the label set shows also the number of samples available per group.
When given a numerical vector, sdlab converts the numbers into
strings:
>> lab=sdlab([15 10 10 20 15 15])
sdlab with 6 entries, 3 groups: '10'(2) '15'(3) '20'(1)
4.3.2. Consecutive labeling ↩
Often, we need to create labels for sets of observations grouped by
class. For example, we know there are first 3 apples, followed by 2 lemons
and 2 bananas. We may create sdlab object by providing name and count
for each class:
>> lab=sdlab('apple',3,'lemon',2,'banana',2)
sdlab with 7 entries, 3 groups: 'apple'(3) 'lemon'(2) 'banana'(2)
To display the content of the sdlab object, we may use unary plus
operator or getnames function:
>> +lab
ans =
apple
apple
apple
lemon
lemon
banana
banana
When constructing the consecutive labeling inside Matlab functions, we may directly provide a cell array with names and a vector with class sizes:
>> lab=sdlab({'apple','lemon','banana'},[3 2 2])
sdlab with 7 entries, 3 groups: 'apple'(3) 'lemon'(2) 'banana'(2)
>> +lab
ans =
apple
apple
apple
lemon
lemon
banana
banana
4.3.3. One entry per class ↩
Sometimes, we need to create labels for a set of concepts with a single entry per concept. For example, we need to construct labels for 5 clusters. We may provide the base name and index to be appended
>> lab=sdlab('Cluster ',1:5)
sdlab with 5 entries, 5 groups:
'Cluster 1'(1) 'Cluster 2'(1) 'Cluster 3'(1) 'Cluster 4'(1) 'Cluster 5'(1)
We may, of course, provide directly the cluster identifiers, if needed:
>> lab=sdlab('Cluster ',[123 152 182])
sdlab with 3 entries, 3 groups: 'Cluster 123'(1) 'Cluster 152'(1) 'Cluster 182'(1)
4.3.4. Using label list ↩
List object sdlist in PRSD Studio describes individual concepts such as
classes. Labels may be also created using from any sdlist object:
>> ll=sdlist('apple','lemon','banana')
sdlist (3 entries)
ind name
1 apple
2 lemon
3 banana
For detailed information on creating and using sdlist object, see this Section. The simplest possibility is
construction of a label set with one entry per list entry:
>> lab=sdlab(ll)
sdlab with 3 entries, 3 groups: 'apple'(1) 'lemon'(1) 'banana'(1)
We can also construct labels by supplying the sdlist object and a vector
of indices to the list.
>> lab=sdlab(ll,[1 1 3 2 2 2 2])
sdlab with 7 entries, 3 groups: 'apple'(2) 'lemon'(4) 'banana'(1)
>> +lab
ans =
apple
apple
banana
lemon
lemon
lemon
lemon
Note that sdlab discards class entries not present in the vector with indexes. This is illustrating the basic principle that sdlab always contains only classes with at least one sample.
>> lab=sdlab(ll,[3 2 2])
sdlab with 3 entries, 2 groups: 'lemon'(1) 'banana'(2)
Construction of label set from indexes is useful when working with classifier decisions. Decisions may represent only a subset of trained classes. The decision label list is therefore automatically reduced. % TODO ref needed.
4.4. Operations on labels ↩
4.4.1. Accessing label information ↩
Per-sample class names may be accessed using the plus operator or getnames method:
>> lab=sdlab('apple',3,'lemon',2,'banana',2)
sdlab with 7 entries, 3 groups: 'apple'(3) 'lemon'(2) 'banana'(2)
>> getnames(lab)
ans =
apple
apple
apple
lemon
lemon
banana
banana
Number of samples is equal to length of the label object:
>> length(lab)
ans =
7
Number of classes present in the label set may be retrieved as a length of the label list:
>> length(lab.list)
ans =
3
Per-sample class indices may be accessed using getindices method or the unary minus operator:
>> getindices(lab)
ans =
1
1
1
2
2
3
3
>> -lab(1:4)
ans =
1
1
1
2
Label object contains only entries for classes with at least one sample. Empty classes are removed from the list.
4.4.2. Retrieving class sizes and priors ↩
The sdlab object keeps track of number of sample in each class. Use
getsizes method to retrieve vector of class sizes:
>> getsizes(lab)
ans =
3 2 2
Sizes are presented in the class order defined in the label list:
>> lab.list
sdlist (3 entries)
ind name
1 apple
2 lemon
3 banana
Related to class sizes are the class prior probabilities accessible by the
getpriors method:
>> getpriors(lab)
ans =
0.4286 0.2857 0.2857
Handy shortcut to list the classes, their sizes and priors is the transpose operator:
>> lab'
ind name size percentage
1 apple 3 (42.9%)
2 lemon 2 (28.6%)
3 banana 2 (28.6%)
4.4.3. Searching for samples with specific labels ↩
The find method helps us to identify label entries assigned to a specific
class. We may use both equal or not-equal operators on labels.
>> find(lab=='banana')
ans =
6
7
% all samples that are not labeled as 'lemon'
>> find(lab~='lemon')
ans =
1
2
3
6
7
In addition to the class name, we may also use relative index in the list.
4.4.4. Subsets of labels ↩
Given a list of sample indices, we can get a subset of the labels:
>> lab
sdlab with 7 entries, 3 groups: 'apple'(3) 'lemon'(2) 'banana'(2)
>> lab(2:5)
sdlab with 4 entries, 2 groups: 'apple'(2) 'lemon'(2)
>> +lab(2:5)
ans =
apple
apple
lemon
lemon
4.4.5. Relabeling: Changing class names and defining meta-classes ↩
The sdrelab function allows us to redefine the labeling by changing the
class names or defining meta-classes. It accepts label object and a cell
array with relabeling rules. Each rule is composed of source specifier
defining the classes to be changed and a new class name.
For example, lets say we want to handle lemons and bananas together as 'yellow fruit'.
>> lab
sdlab with 7 entries, 3 groups: 'apple'(3) 'lemon'(2) 'banana'(2)
>> lab.list
sdlist (3 entries)
ind name
1 apple
2 lemon
3 banana
>> lab2=sdrelab(lab,{[2 3] 'yellow fruit'})
1: apple -> apple
2: lemon -> yellow fruit
3: banana -> yellow fruit
sdlab with 7 entries, 2 groups: 'apple'(3) 'yellow fruit'(4)
We have used the indices of 'lemon' and 'banana' classes as the source and 'yellow fruit' as the new class name. Because class name is always unique, we ended up with a single 'yellow fruit' class with four samples.
Multiple rules may be specified in one sdrelab command. Source
specifiers always refer to the original situation in the input data set:
>> lab2=sdrelab(lab,{[2 3] 'yellow fruit' 'apple' 'red fruit'})
1: apple -> red fruit
2: lemon -> yellow fruit
3: banana -> yellow fruit
sdlab with 7 entries, 2 groups: 'red fruit'(3) 'yellow fruit'(4)
We are often interested in a particular class and need to relabel all
remaining classes. To achieve that, we may use the tilde ~ character at
the beginning of the source class name. Such rule will be applied on the
rest of classes:
>> lab2=sdrelab(lab,{'~apple' 'yellow fruit' })
1: apple -> apple
2: lemon -> yellow fruit
3: banana -> yellow fruit
sdlab with 7 entries, 2 groups: 'apple'(3) 'yellow fruit'(4)
By default sdrelab shows the translation table. When executed from inside
our routines, we may want to suppress this extra information by adding the
'nodisplay' option:
>> lab2=sdrelab(lab,{'~apple' 'yellow fruit' },'nodisplay')
sdlab with 7 entries, 2 groups: 'apple'(3) 'yellow fruit'(4)
4.4.6. Concatenating label sets ↩
Labels may be concatenated vertically or horizontally.
Vertical concatenation means we are adding labels of other objects.
>> lab1=sdlab('apple',5,'banana',6)
sdlab with 11 entries, 2 groups: 'apple'(5) 'banana'(6)
>> lab2=sdlab('lemon',2,'apple',10)
sdlab with 12 entries, 2 groups: 'lemon'(2) 'apple'(10)
>> L=[lab1; lab2]
sdlab with 23 entries, 3 groups: 'apple'(15) 'banana'(6) 'lemon'(2)
The resulting label set now represent 23 objects of three classes.
Horizontal concatenation appends the class names for the same set of objects. This is useful for construction of unique labels. Lets say we have a data set describing a set of pixels from multiple images. We added an image label to distinguish pixels from different images. We clustered pixels in each image and for each pixel stored the cluster identifier. We have, therefore, 'Cluster 1' label assigned to some pixels in each image. To construct a unique cluster label we use the horizontal concatenation:
% image labels for 20 pixels:
>> lab1=sdlab('Image 1',20)
sdlab with 20 entries from 'Image 1'
% clustering result for the same 20 pixels:
>> lab2=sdlab('Cluster 1',8,'Cluster 2',7,'Cluster 3',5)
sdlab with 20 entries, 3 groups: 'Cluster 1'(8) 'Cluster 2'(7) 'Cluster 3'(5)
>> L=[lab1 lab2]
sdlab with 20 entries, 3 groups: 'Image 1Cluster 1'(8) 'Image 1Cluster 2'(7) 'Image 1Cluster 3'(5)
We may also provide a string separator between the image name and cluster name during the concatenation:
>> L=[lab1 '-' lab2]
sdlab with 20 entries, 3 groups: 'Image 1-Cluster 1'(8) 'Image 1-Cluster 2'(7) 'Image 1-Cluster 3'(5)
The label set L now uniquely identifies the cluster in a specific image.
4.5. Creating label lists ↩
List object sdlist describes a set of concepts. Each concept has a string
name.
4.5.1. From list of class names ↩
List object may be created by providing the class names directly as parameters:
>> ll=sdlist('apple','lemon','banana')
sdlist (3 entries)
ind name
1 apple
2 lemon
3 banana
or providing the cell array with names:
>> ll=sdlist({'apple','lemon','banana'})
sdlist (3 entries)
ind name
1 apple
2 lemon
3 banana
Alternative is to use a character array:
>> ll=sdlist(strvcat({'apple','lemon','banana'}))
sdlist (3 entries)
ind name
1 apple
2 lemon
3 banana
Note that the user-defined order of classes in the list is preserved.
4.5.2. Using string prefix ↩
The sdlist may be also created given prefix string and numbers to be
appended. For example, we want to define a list representing five features:
>> sdlist('Feature ',1:5)
sdlist (5 entries)
ind name
1 Feature 1
2 Feature 2
3 Feature 3
4 Feature 4
5 Feature 5
The prefix string may be also omitted whatsoever:
>> sdlist(1:4)
sdlist (4 entries)
ind name
1 1
2 2
3 3
4 4
4.6. Operations on label lists ↩
4.6.1. Accessing list content ↩
List entries may be retrieved using getnames method or the plus operator:
>> ll
sdlist (3 entries)
ind name
1 apple
2 lemon
3 banana
>> getnames(ll)
ans =
apple
lemon
banana
>> +ll
ans =
apple
lemon
banana
4.6.2. Converting between names and indices ↩
sdlist class provides two methods for conversions between
class names and indices in the list:
We may convert the per-sample indices into names:
>> ll=lab.list
sdlist (3 entries)
ind name
1 apple
2 lemon
3 banana
>> name2ind(ll,{'banana','lemon'})
ans =
3
2
>> ind2name(ll,[1 2 2 2 1 3])
ans =
apple
lemon
lemon
lemon
apple
banana
If a non-existent name is used, the conversion routine returns
empty matrix []. If index is out of bounds, an error is raised.
