One way to prepare a CSV file is to load the data into a text editor. For extremely large volumes of data, you might need a specialist package that can cope with files of that size.
Create the text file. The first line contains the field names (or headings). Subsequent lines contain records, one record per line. Use commas to separate the fields and a carriage return to indicate the end of a row. If a field is missing data, still include its comma so that the remaining fields stay aligned with their headings.
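If the data already exists in another program, the file layout described above can also be produced by a short script rather than typed by hand. The following is a minimal Python sketch using the standard csv module. The function name write_csv and the field names Age, Region and Income are hypothetical and chosen only for illustration; they do not come from WITNESS Miner.

```python
import csv
import io


def write_csv(headings, records, out):
    """Write a heading line, then one comma-separated line per record.

    A value of None is written as an empty field, so the comma for a
    missing value is still present and later fields stay aligned.
    """
    writer = csv.writer(out, lineterminator="\r\n")
    writer.writerow(headings)
    for record in records:
        writer.writerow(["" if value is None else value for value in record])


# Hypothetical example data: the second record has no Region value.
buffer = io.StringIO()
write_csv(
    ["Age", "Region", "Income"],
    [[34, "North", 28000], [51, None, 41000]],
    buffer,
)
print(buffer.getvalue())
```

In this example the second record is written as 51,,41000, with two adjacent commas marking the missing field. To write to disk instead of an in-memory buffer, pass a file opened with open(path, "w", newline="") as the out argument.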
WITNESS Miner needs to be alerted to the presence of categorical (text) fields. You can either do this when you create the text file (by prefixing each categorical field heading with an asterisk * symbol) or you can select each categorical field manually on the data source options dialog once you've loaded the .csv file.
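For files with many fields, adding the asterisk to each categorical heading by hand can be tedious. The Python sketch below shows one possible way to do it automatically. It rests on two assumptions that are not taken from WITNESS Miner: that a field is categorical whenever any of its non-empty values is not a number, and that the asterisk is placed directly before the heading with no space. The function names is_number and flag_categorical are hypothetical.

```python
def is_number(text):
    """Return True if the text can be read as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def flag_categorical(headings, rows):
    """Prefix the heading of every text field with an asterisk.

    Assumption: a field is treated as categorical if at least one of
    its non-empty values is not numeric. Empty (missing) values are
    ignored, and headings that already start with * are left alone.
    """
    flagged = []
    for index, heading in enumerate(headings):
        values = [row[index] for row in rows if row[index] != ""]
        categorical = any(not is_number(value) for value in values)
        if categorical and not heading.startswith("*"):
            flagged.append("*" + heading)
        else:
            flagged.append(heading)
    return flagged


# Hypothetical example: only Region holds text values.
headings = ["Age", "Region", "Income"]
rows = [["34", "North", "28000"], ["51", "", "41000"]]
print(flag_categorical(headings, rows))
```

This heuristic cannot detect categories that are stored as numeric codes (for example, a region recorded as 1, 2 or 3). Such fields still need to be flagged by hand or selected manually on the data source options dialog.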
Save the text file in .csv format.
Open WITNESS Miner, create a data source node, then double-click on its icon in the workspace to open the data source options dialog.
In the data file field, enter the name and path of the .csv file (use the select button to display the file selector to find the file, if necessary).
If you have already flagged the categorical field headings with an asterisk symbol, select the use (*) flag in datafile option. If you haven't flagged the categorical fields, choose the select manually option, then click on the select button to display the select categorical fields dialog. Choose the categorical field headings from the list of available field headings.
Finally, choose whether to load all of the records available from the database, or to sample a specified number of records as the database is being loaded. Sampling during loading is more efficient than loading the whole database and then sampling the records, and loading the whole database might not be possible due to memory limitations.
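To illustrate why sampling while loading needs less memory, the Python sketch below reads a CSV file one record at a time and never holds more than the requested number of records. It uses reservoir sampling, a well-known general technique; how WITNESS Miner itself selects its sample is not stated here, so this should be read as an illustration of the idea only. The function name sample_records and its parameters are hypothetical.

```python
import csv
import io
import random


def sample_records(lines, sample_size, seed=None):
    """Return the headings and a random sample of records.

    Records are read one at a time (reservoir sampling), so memory use
    depends on sample_size rather than on the size of the file.
    """
    rng = random.Random(seed)
    reader = csv.reader(lines)
    headings = next(reader)  # the first line holds the field names
    sample = []
    for count, record in enumerate(reader, start=1):
        if count <= sample_size:
            # Fill the reservoir with the first sample_size records.
            sample.append(record)
        else:
            # Keep the new record with probability sample_size / count.
            slot = rng.randrange(count)
            if slot < sample_size:
                sample[slot] = record
    return headings, sample


# Hypothetical example: 1000 records, of which 10 are kept.
data = io.StringIO("id\n" + "".join(f"{i}\n" for i in range(1000)))
headings, sample = sample_records(data, 10, seed=1)
print(headings, len(sample))
```

If the file contains fewer records than the requested sample size, every record is returned. A file on disk can be passed in the same way by opening it with open(path, newline="").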
The remaining fields on the data source options dialog do not apply to text files, so close the data source options dialog.
The text file is now linked to the data source. If you create a view node, link it to the data source and then run the view node, you can view the database. The database that results from the Customer Response.csv file looks like this in the data view: