Discretisation Options - Auto

 


 

Automatic discretisation algorithms calculate a number of split points within the numeric data and assign each record the bin number of the partition it falls into. The split points are recorded in the log window when the algorithm has finished executing.

 

Discretise Options

 

Equal width

Discretise the values in a field so that the range of the field (minimum to maximum) is divided into the requested number of equal-width partitions.
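
As an illustration only, equal-width binning can be sketched in Python as follows. The function names, the start_at parameter, and the rule that a value lying exactly on a split point goes into the upper bin are assumptions made for this sketch, not a description of how the product itself is implemented.

```python
from bisect import bisect_right

def equal_width_splits(values, n_bins):
    """Return the n_bins - 1 split points that divide [min, max] evenly."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]

def assign_bins(values, splits, start_at=0):
    """Give each value the number of the partition it falls into.

    Assumption: a value equal to a split point is placed in the upper bin.
    """
    return [start_at + bisect_right(splits, v) for v in values]

values = [1.0, 2.0, 3.0, 4.0, 10.0]
splits = equal_width_splits(values, 3)
print(splits)                       # [4.0, 7.0]
print(assign_bins(values, splits))  # [0, 0, 0, 1, 2]
```

Note that the partitions have the same width but not the same number of entries: three values land in the first bin and only one in each of the others.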

 

Equal frequency

Discretise the values in a field so that each partition contains an equal number of entries, or as near to equal as the record count allows.
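
A minimal Python sketch of equal-frequency binning is shown below. It ranks the values and cuts the ranking into equally sized groups. The function name and the handling of tied values (which may end up in different bins here) are assumptions of the sketch; the product may resolve ties differently.

```python
def equal_frequency_bins(values, n_bins, start_at=0):
    """Assign bin numbers so that each bin holds about len(values) / n_bins entries."""
    n = len(values)
    # Indices of the records, ordered by ascending value.
    order = sorted(range(n), key=lambda i: values[i])
    bins = [0] * n
    for rank, idx in enumerate(order):
        # Integer arithmetic maps ranks 0..n-1 onto bins 0..n_bins-1.
        bins[idx] = start_at + rank * n_bins // n
    return bins

values = [50.0, 1.0, 10.0, 2.0, 4.0, 3.0]
print(equal_frequency_bins(values, 3))  # [2, 0, 2, 0, 1, 1]
```

Each of the three bins receives two entries, even though the values themselves are very unevenly spread.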

 

Optimal algorithm

An optimal discretisation algorithm that evaluates every possible partitioning and selects the one that minimises the spread of the entries within each partition. Because the algorithm is enumerative, it is computationally expensive and may take a considerable amount of time.
As a guideline, run this algorithm only on small databases of up to 1000 records.
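
To give a feel for why an enumerative search is slow, the following Python sketch tries every way of cutting the sorted values into contiguous partitions. It assumes that "spread" means the sum of squared deviations from each partition's mean; the product's actual measure and search strategy are not documented here, and the function names are hypothetical.

```python
from itertools import combinations

def spread(chunk):
    """Sum of squared deviations from the mean (assumed spread measure)."""
    mean = sum(chunk) / len(chunk)
    return sum((x - mean) ** 2 for x in chunk)

def optimal_partition(values, n_bins):
    """Enumerate every contiguous partitioning and keep the lowest total spread.

    Requires len(values) >= n_bins.
    """
    data = sorted(values)
    n = len(data)
    best_cost, best_cuts = float("inf"), None
    # Choose n_bins - 1 cut positions between adjacent sorted values.
    for cuts in combinations(range(1, n), n_bins - 1):
        edges = (0, *cuts, n)
        cost = sum(spread(data[a:b]) for a, b in zip(edges, edges[1:]))
        if cost < best_cost:
            best_cost, best_cuts = cost, cuts
    edges = (0, *best_cuts, n)
    return [data[a:b] for a, b in zip(edges, edges[1:])]

print(optimal_partition([1, 2, 3, 10, 11, 12, 50], 3))
# [[1, 2, 3], [10, 11, 12], [50]]
```

Under these assumptions the number of candidate partitionings for n records and k bins is the binomial coefficient C(n - 1, k - 1). For 1000 records and 5 bins that is roughly 4 x 10^10 candidates, which is why small inputs are recommended.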

 

Select fields

Use this button to select which fields the discretisation algorithm is applied to. Only continuous numeric fields are listed in the field selection dialog. A message above the button indicates how many fields are currently selected.

 

Number of bins

The number of bins (or partitions) that the field should be discretised into.

 

Start at bin

The number used to indicate the first bin. By default, numbering starts at 0 and subsequent bins are numbered sequentially.

 

Convert discretised field(s) to categorical

After the discretised data has been generated, convert the field(s) to categorical values. This allows a discretised field to be used as the target field for feature selection or mining activities.
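
The effect of the Start at bin and categorical conversion options can be pictured with the short Python sketch below. The helper names and the "bin_" label prefix are invented for the illustration; the labels the product actually generates may look different.

```python
def renumber(bin_numbers, start_at):
    """Shift zero-based bin numbers so that the first bin is start_at."""
    return [b + start_at for b in bin_numbers]

def to_categorical(bin_numbers, prefix="bin_"):
    """Turn numeric bin numbers into text labels (hypothetical label format)."""
    return [f"{prefix}{b}" for b in bin_numbers]

bins = [0, 0, 1, 2]                # zero-based bins from a discretisation run
shifted = renumber(bins, 1)
print(shifted)                     # [1, 1, 2, 3]
print(to_categorical(shifted))     # ['bin_1', 'bin_1', 'bin_2', 'bin_3']
```

Once the bins are text labels rather than numbers, they are treated as distinct classes instead of as points on a numeric scale, which is what a categorical target field requires.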