Tutorial - Step 3

Database Familiarization Part 1


Part of the problem specification stage of a KDD project is to become familiar with the database. You should understand the fields, values and the semantics of both before proceeding. The statistics node provides a simple method of displaying a summary of all fields.

Create the statistics node

The stream should now look like this:

[Figure: CR Stream 2]

Run the KDD stream

What happens when you run the stream?

The data is loaded into the data source node from the specified file. While it loads, a dialog shows the progress and the number of records loaded so far.

The data then flows along the link into the statistics node, where WITNESS Miner calculates the statistics, again displaying a progress dialog. As each part of the KDD stream is completed, the links turn red.

Finally, a statistics results dialog appears, displaying summary information about each field in the database. This information helps us understand the different types of fields: continuous numeric, discrete numeric and categorical. For numeric fields we can see the range of each field, and for categorical fields the number of unique values.

The statistics dialog looks like this:

[Figure: CR Tutorial Stats Dialog]

Close the dialog to return to the main workspace.
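
For readers who want to see what such a per-field summary involves outside the tool, the following is a minimal Python sketch. It is not how WITNESS Miner computes its statistics. The sample data, the function name summarize_fields and the rule used to separate discrete from continuous numeric fields (whole-number values with at most max_discrete distinct values) are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical sample data; the tutorial's actual database is not reproduced here.
SAMPLE = """age,income,region
34,51200.50,north
45,38900.00,south
29,61050.25,north
"""


def summarize_fields(rows, max_discrete=10):
    """Return a per-field summary for a list of dicts (field name -> string value).

    Assumed heuristic: a field is numeric if every non-empty value parses as a
    float; a numeric field is "discrete" if all its values are whole numbers
    and it has at most max_discrete distinct values.
    """
    summary = {}
    if not rows:
        return summary
    for field in rows[0].keys():
        values = [row[field] for row in rows if row[field] != ""]
        if not values:
            summary[field] = {"type": "empty"}
            continue
        try:
            numbers = [float(v) for v in values]
        except ValueError:
            # At least one value is not a number: treat the field as categorical.
            summary[field] = {"type": "categorical", "unique": len(set(values))}
            continue
        unique = len(set(numbers))
        whole = all(n.is_integer() for n in numbers)
        kind = "discrete numeric" if whole and unique <= max_discrete else "continuous numeric"
        summary[field] = {
            "type": kind,
            "min": min(numbers),
            "max": max(numbers),
            "unique": unique,
        }
    return summary


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
summary = summarize_fields(rows)
for name, stats in summary.items():
    print(name, stats)
```

The summary reports a range (min and max) for numeric fields and a count of unique values for categorical fields, which mirrors the kind of information described for the statistics results dialog.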
