Importing a data file
By Javier Pérez, Artelnics.
You can use your favourite data files with Neural Designer.
- Import data file wizard
- TXT, DAT and CSV files
- OpenOffice Calc ODS files & Microsoft Excel XLSX files
- Weka ARFF files
Once a project has been created, you can import a data file by clicking the Import data file button on the Data set page. The following figure shows the Data set page of Neural Editor, where you can find the Import data file button.
Clicking the Import data file button opens the Import data file wizard. This wizard contains the following two pages:
- The Select data file page.
- The Set file properties page.
The Select data file page allows you to select your data file. The data files supported are:
- Text files (.txt).
- Data files (.dat).
- CSV files (.csv).
- OpenOffice Calc files (.ods).
- Microsoft Excel files (.xlsx).
- Weka files (.arff).
The next figure shows a screenshot of the Select data file page for data file server type.
As we can see, this page is a file dialog which contains:
- Look in: the directory where you are looking in.
- File name: the name of your data file.
- Files of type: different file filters for supported files.
When you have selected a valid file, click the Next button to set the file properties.
The Set file properties page allows you to check your data file and import the results into your project. You can preview your data file and set the file properties. The figure below shows a screenshot of the Set file properties page.
Check Columns name and Rows label if your data file contains them. Change the Separator if the proposed one does not match the separator used in your data file. You should also enter the Missing values label, which must match the missing values label used in your data file. The Import data file wizard sets these fields to recommended values.
Once all the properties are set, click the Finish button. Now you can start to use your data with Neural Designer.
Neural Designer works with .txt and .dat text files. Each line of the file is a data record, and each record consists of one or more fields separated by the same separator on every line.
A comma-separated values (CSV) file stores tabular data in plain text. Each line of the file is a data record and each record consists of one or more fields, separated by commas.
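The record-and-field layout described above can be sketched in a few lines of Python. This is only an illustration of the file layout, not how Neural Designer reads files internally; the function name `read_records` and the sample values are hypothetical.

```python
import csv
import io


def read_records(text, separator=None):
    """Split each non-empty line of `text` into a list of fields.

    separator=None splits on runs of whitespace (spaces or tabs);
    any other single character is passed to the csv module as delimiter.
    """
    if separator is None:
        return [line.split() for line in text.splitlines() if line.strip()]
    reader = csv.reader(io.StringIO(text), delimiter=separator)
    return [row for row in reader if row]


# Illustrative values only, not the example file used in this article.
space_text = "1.0 2.0\n3.0 4.0\n"
csv_text = "1.0,2.0\n3.0,4.0\n"

space_rows = read_records(space_text)               # space-separated .txt/.dat
csv_rows = read_records(csv_text, separator=",")    # comma-separated .csv
```

Both calls yield the same two records of two fields each, which shows that the only difference between the two layouts is the separator.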
The following example represents a text data file (.txt or .dat). It has 11 rows and 2 columns, and the data is separated by spaces.
The image below shows how to load the data from the example.
In a CSV file, the data from the previous example is represented as follows.
In the next figure you can see how the Import data file wizard has recognized the comma separator.
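How the wizard recognizes the separator is not documented here. As a rough idea of how such a guess could be made, the sketch below picks the candidate character that splits every line into the same number of fields. The function `guess_separator` and its candidate list are assumptions for illustration, not Neural Designer's method.

```python
def guess_separator(sample, candidates=(",", ";", "\t", " ")):
    """Return the candidate that splits every non-empty line into the same
    number of fields (more than one), preferring the one that yields the
    most fields. Return None if no candidate qualifies."""
    lines = [line for line in sample.splitlines() if line.strip()]
    best, best_fields = None, 1
    for sep in candidates:
        # Set of distinct field counts produced by this separator.
        counts = {len(line.split(sep)) for line in lines}
        if len(counts) == 1:
            n_fields = counts.pop()
            if n_fields > best_fields:
                best, best_fields = sep, n_fields
    return best


# Illustrative samples only.
comma_guess = guess_separator("1.0,2.0\n3.0,4.0\n")
semicolon_guess = guess_separator("1.0;2.0\n3.0;4.0\n")
no_guess = guess_separator("abc\ndef\n")
```

A heuristic like this can be fooled, for instance by text fields that contain spaces, which is one reason to check the proposed Separator in the preview before finishing the import.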
In the next example we add column names to the first example. We also add some missing values, written as NULL.
As we see in the next figure, the Import data file wizard has recognized the columns name, but we need to change the Missing values label to NULL.
We also add a rows label to the data and change the separator to a semicolon.
When we import the text file, the Import data file wizard recognizes the columns name, the rows label and the separator. As before, we should change the Missing values label to NULL.
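To make the role of each property concrete, here is a minimal Python sketch that applies the same four settings (columns name, rows label, separator and missing values label) to a small text sample. The function `load_table`, its parameter names and the sample data are hypothetical, and the sketch assumes the header line also has a cell above the row labels.

```python
import csv
import io


def load_table(text, separator=";", has_columns_name=True,
               has_rows_label=True, missing_label="NULL"):
    """Return (column_names, row_labels, values) from delimited text.

    Fields equal to `missing_label` become None; all others become floats.
    """
    rows = [r for r in csv.reader(io.StringIO(text), delimiter=separator) if r]
    header = rows.pop(0) if has_columns_name else None
    labels = None
    if has_rows_label:
        labels = [r[0] for r in rows]       # first field of each record
        rows = [r[1:] for r in rows]
        if header is not None:
            header = header[1:]             # drop the cell above the labels
    values = [
        [None if field.strip() == missing_label else float(field) for field in r]
        for r in rows
    ]
    return header, labels, values


# Illustrative data only, not the example file used in this article.
sample = "id;x;y\na;1.0;2.0\nb;NULL;4.0\n"
columns, labels, values = load_table(sample)
```

If the missing values label were left at a value other than NULL, the conversion to a number would fail on the second record, which mirrors why the label has to match the one used in the file.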
An .ods file uses an open XML-based format for spreadsheets. OpenOffice Calc works with .ods files.
An .xlsx file is an Office Open XML workbook for spreadsheets. Microsoft Excel works with .xlsx files.
The next figures show examples of .ods and .xlsx files.
As we can see, the Import data file dialog shows a preview of the data set so that you can check that everything is correct. The image below shows the Set file properties page with an Excel example, but Neural Designer can import different types of files.
As you can see, there are some new fields to set: the Sheet number and the Cell range where your data is located. The Import data file wizard sets these fields to recommended values.
As with a text file, you have to set the Columns name, Rows label and Missing values label.
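The idea of a cell range can be illustrated with a short Python sketch. It assumes the usual spreadsheet notation such as A1:B11; the exact notation expected by the wizard is not stated in this article, and the helper names `column_index`, `parse_cell_range` and `crop` are hypothetical.

```python
import re


def column_index(letters):
    """Convert spreadsheet column letters to a 0-based index (A -> 0, AA -> 26)."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1


def parse_cell_range(cell_range):
    """Turn 'A1:B11' into 0-based (first_row, first_col, last_row, last_col)."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+):([A-Za-z]+)(\d+)", cell_range.strip())
    if match is None:
        raise ValueError(f"not a cell range: {cell_range!r}")
    col1, row1, col2, row2 = match.groups()
    return int(row1) - 1, column_index(col1), int(row2) - 1, column_index(col2)


def crop(grid, cell_range):
    """Keep only the cells of `grid` (a list of rows) inside the range."""
    r1, c1, r2, c2 = parse_cell_range(cell_range)
    return [row[c1:c2 + 1] for row in grid[r1:r2 + 1]]


# Illustrative sheet contents only.
sheet = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
selected = crop(sheet, "B2:C3")
```

Restricting the import to a range in this way is what lets a sheet carry titles or notes around the data without affecting the imported data set.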
An ARFF (Attribute-Relation File Format) file is a text file that describes a list of instances sharing a set of attributes.
The next example shows the content of an .arff data set file.
In the next figure you can see the Set file properties page for an .arff file.
You don't have to set any file properties for an .arff file.
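A likely reason is that an ARFF file describes itself: attribute names are declared in the header, values in the data section are comma-separated, and a question mark marks a missing value. The following Python sketch reads that basic structure. It is a simplified illustration, not Neural Designer's reader: the function `parse_arff` is hypothetical, and quoted names containing spaces and the sparse data format are not handled.

```python
def parse_arff(text):
    """Return (relation, attributes, data) from basic ARFF text.

    attributes is a list of (name, type) pairs; data is a list of rows
    in which a missing value ('?') is represented as None.
    """
    relation, attributes, data = None, [], []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):    # skip blanks and comments
            continue
        if in_data:
            fields = [field.strip() for field in line.split(",")]
            data.append([None if field == "?" else field for field in fields])
            continue
        keyword = line.split(None, 1)[0].lower()  # keywords are case-insensitive
        if keyword == "@relation":
            relation = line.split(None, 1)[1]
        elif keyword == "@attribute":
            _, name, attr_type = line.split(None, 2)
            attributes.append((name, attr_type))
        elif keyword == "@data":
            in_data = True
    return relation, attributes, data


# Illustrative content only, not the example file used in this article.
sample = """% a tiny made-up data set
@relation weather
@attribute temperature numeric
@attribute play {yes,no}
@data
20.5,yes
?,no
"""
relation, attributes, data = parse_arff(sample)
```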