RANDSUBSET Select a random subset of samples
SUB=RANDSUBSET(DATA,PAR)
[SUB,REST,IND1,IND2]=RANDSUBSET(DATA,PAR,OPTIONS)
SUB=RANDSUBSET(DATA,PROP,PAR) % sample the property PROP
INPUT
DATA Data set
PAR Selection parameter (scalar or vector)
0<PAR<1 Fraction of samples per class
PAR>=1 Number of samples per class
PROP Name of property to sample (default: 'lab')
OUTPUT
SUB Selected subset
REST Remainder of the DATA
IND1,IND2 Indices of the samples in SUB and REST
OPTIONS
'all' Select the subset from a complete data set
'atmax' Take at maximum the specified number of samples
DESCRIPTION
RANDSUBSET selects a random subset of samples in DATA. The selection is
performed per class by default. If PAR is a vector, it specifies the
selection for each class separately (zero entries are supported). To
select a fraction or a number of samples from the complete data set,
add the 'all' option:
SUB=RANDSUBSET(DATA,PAR,'all')
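The difference between per-class selection and selection from the complete data set can be sketched in plain MATLAB. This is only an illustration and not the toolbox implementation: the label vector lab, the count n, and the index variables are hypothetical names, and the labels are assumed to be a numeric column vector.

```matlab
% Illustration only: plain MATLAB, not the toolbox implementation.
lab = [1 1 1 1 1 2 2 2 3 3 3 3]';   % hypothetical class label per sample
n   = 2;                             % number of samples requested

% Per-class selection: draw n samples from each class separately.
classes = unique(lab);
ind_per_class = [];
for k = 1:numel(classes)
    members = find(lab == classes(k));               % samples of class k
    pick    = members(randperm(numel(members), n));  % n of them, no repeats
    ind_per_class = [ind_per_class; pick(:)];
end

% Selection from the complete set: draw n samples ignoring the classes.
ind_all = randperm(numel(lab), n)';
```

With three classes, the per-class variant returns 3*n indices, while the complete-set variant returns n indices regardless of the class sizes.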
If property name PROP is specified, RANDSUBSET works on this property.
For example, to select 100 random samples per patient, use:
SUB=RANDSUBSET(DATA,'patient',100)
Note that the PROP property must be indexed (SDLAB object).
To make sure the subset SUB does not contain more than the specified
number of samples, use the 'atmax' option.
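The relation between the index outputs and the effect of a cap can be sketched in plain MATLAB. This is an illustration under stated assumptions, not the toolbox code: it assumes that capping means a class with fewer samples than requested contributes all of its samples, and that IND2 indexes the samples left out of SUB. The names lab, n, ind1 and ind2 are hypothetical.

```matlab
% Illustration only: plain MATLAB, not the toolbox implementation.
% Assumption: with a cap, a class smaller than n contributes all its samples.
lab = [1 1 1 1 1 2 2 3 3 3]';   % hypothetical class label per sample
n   = 3;                         % requested number of samples per class

classes = unique(lab);
ind1 = [];
for k = 1:numel(classes)
    members = find(lab == classes(k));               % samples of class k
    m       = min(n, numel(members));                % cap at the class size
    pick    = members(randperm(numel(members), m));
    ind1    = [ind1; pick(:)];
end
ind2 = setdiff((1:numel(lab))', ind1);               % indices of the remainder
```

In this sketch the selected and remaining indices are disjoint and together cover every sample exactly once.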