FS Node 

Feature Selection Node

 

Feature selection identifies a strongly predictive subset of fields within the database and reduces the number of fields passed to the mining stage of the knowledge discovery in databases (KDD) process. The node ranks the fields by their predictive relationship with the selected target field, using an information-based measure for databases with a categorical or discrete target field, or correlation coefficients for numeric target fields (see Note 1). The node can either display the results of feature selection in an HTML report or discard the weaker fields and pass the stronger fields through to the next node in the stream.
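The ranking step described above can be illustrated with a short sketch. It is not the node's actual implementation: this page does not name the information-based measure, so information gain (the reduction in target entropy) is assumed here, and the Pearson coefficient is assumed for the correlation case. All function names (`information_gain`, `pearson`, `rank_fields`, `keep_strongest`) are hypothetical.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def information_gain(field, target):
    """Assumed measure: drop in target entropy once the field value is known."""
    n = len(target)
    groups = {}
    for f, t in zip(field, target):
        groups.setdefault(f, []).append(t)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(target) - conditional

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either column is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def rank_fields(fields, target, numeric_target):
    """Return (field name, score) pairs, strongest predictor first."""
    if numeric_target:
        score = lambda col: abs(pearson(col, target))
    else:
        score = lambda col: information_gain(col, target)
    scored = [(name, score(col)) for name, col in fields.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

def keep_strongest(ranked, k):
    """Discard the weaker fields, keeping the names of the top k."""
    return [name for name, _ in ranked[:k]]

# Categorical target: 'colour' determines the target exactly, 'size' does not.
fields = {"size": ["s", "l", "s", "l"], "colour": ["r", "r", "b", "b"]}
target = ["yes", "yes", "no", "no"]
ranked = rank_fields(fields, target, numeric_target=False)
print(keep_strongest(ranked, 1))
```

The two modes of the node correspond to either reporting `ranked` in full or passing on only the fields returned by `keep_strongest`.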

 

FS View Dialog

 

Options

Full details of the options available for the feature selection node can be found on the feature selection options page.


Notes

  1. A target field of continuous numeric values taken from the raw data can be used directly to calculate correlation coefficients in the feature selection node. However, other algorithms, such as the simulated annealing engine (part of the discovery node), require a categorical target field, so continuous values must first be transformed into categories.
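As an illustration of the transformation mentioned in Note 1, the sketch below turns a continuous target into categories using equal-width binning. This page does not say which transformation the software applies, so the method and the function name `equal_width_bins` are assumptions chosen only for illustration.

```python
def equal_width_bins(values, n_bins):
    """Map each continuous value to a bin index in the range 0..n_bins-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column falls entirely into a single bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

# A continuous target split into two categories (bin 0 = low, bin 1 = high).
print(equal_width_bins([1.0, 2.0, 3.0, 4.0], 2))
```

The resulting bin indices can then serve as a categorical target for algorithms that cannot work with continuous values.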