Now that we have summary statistics about the database, it would be useful to know what proportion of the Response values were YES and what proportion were NO.
Create a distribution node (in the same way as you created the other nodes) and link it to the data source node. The stream should now look like this:

Edit the options for the distribution node and set the field to display as Response. Leave all other options with their default values:

Run the stream from the distribution node to get these results:

You can read the following information from this distribution graph:
Number of customers who responded to the newsletter (class YES) = 120 (11.86%)
Number of customers who did not respond (class NO) = 892 (88.14%)
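The percentages above follow directly from the counts: 120 + 892 = 1012 records in total. As a minimal sketch of the same calculation outside the workbench, the Python below counts each class and converts the counts to percentages. The function name `response_distribution` and the in-memory list of values are illustrative assumptions, not part of the tool used in this tutorial.

```python
from collections import Counter


def response_distribution(responses):
    """Return {label: (count, percent)} for a list of class labels."""
    counts = Counter(responses)
    total = sum(counts.values())
    if total == 0:
        return {}  # nothing to summarise
    return {
        label: (n, round(100 * n / total, 2))
        for label, n in counts.items()
    }


# Stand-in for the Response field: 120 YES and 892 NO, as in the tutorial.
responses = ["YES"] * 120 + ["NO"] * 892
distribution = response_distribution(responses)

for label, (count, percent) in distribution.items():
    print(f"{label}: {count} ({percent}%)")
```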
We are particularly interested in the factors that influenced the 120 customers who responded to the newsletter. We could use this information to increase the response rate for future events.
When you have examined the results, close the dialog to return to the main workspace.
It is important that the data we have collected is clean and does not contain spurious entries. We can use the information presented in the statistics node (for example, the minimum and maximum of each field) to identify possible outliers, such as:
an entry of -1 months in a field that describes the number of months since a purchase was made.
an entry of 999 in a field that describes the number of siblings.
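A range check of this kind can also be sketched in a few lines of Python. The field names (`months_since_purchase`, `siblings`), the sample records and the plausible ranges below are all hypothetical, chosen only to mirror the two examples above. In practice the limits would come from your knowledge of the data.

```python
def find_out_of_range(records, field, minimum, maximum):
    """Return (row_index, value) pairs whose value lies outside [minimum, maximum]."""
    return [
        (index, record[field])
        for index, record in enumerate(records)
        if not (minimum <= record[field] <= maximum)
    ]


# Hypothetical records; only the two suspicious values mirror the tutorial text.
records = [
    {"months_since_purchase": 3, "siblings": 2},
    {"months_since_purchase": -1, "siblings": 1},
    {"months_since_purchase": 12, "siblings": 999},
]

# Assumed plausible ranges for each field.
bad_months = find_out_of_range(records, "months_since_purchase", 0, 600)
bad_siblings = find_out_of_range(records, "siblings", 0, 20)

print("Suspicious months values:", bad_months)
print("Suspicious sibling counts:", bad_siblings)
```

Flagged rows still need a human decision: a value such as -1 or 999 is often a placeholder for "unknown" rather than a typing error.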
Another part of the cleansing stage is to split the database into training and testing sets. In normal projects, you would use the sample records node to split the database. For the purpose of this tutorial, however, we will use the database as a whole.
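For readers who want to see what such a split involves, here is a minimal Python sketch of a random training/testing split. It is not how the sample records node works internally. The function name `split_records`, the 30% test fraction and the fixed seed are assumptions made for illustration.

```python
import random


def split_records(records, test_fraction=0.3, seed=42):
    """Shuffle a copy of the records and split it into (training, testing) lists."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = list(records)   # copy, so the caller's data is left untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Stand-in for the 1012 customer records: here just their row numbers.
records = list(range(1012))
training_set, testing_set = split_records(records)

print(len(training_set), "training records,", len(testing_set), "testing records")
```

Every record ends up in exactly one of the two sets, so a model built on the training set can be evaluated on records it has not seen.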