Developing better products faster


   



What's new in INForm's Latest Release (v5)

The latest release of INForm (January 2012) extends its scope to a wider range of data mining problems. The main change is the ability to handle bigger problems, with more inputs and properties and more experimental data records. This has been made possible by a major restructuring of memory usage within the program.

Another new feature is the ability to retrain selected models. If you aren't satisfied with one particular model, you can change the training parameters and develop another. Some new methods of determining the number of nodes in the hidden layer are also included, to help avoid over-training.

There are also enhancements to the user interface, including the display of the Test R² directly on the Training screen.

 

Handling Bigger Data Sets: 500 data fields; 50,000 data records

A number of customers have asked us to lift the limit of 150 'data fields' (the total number of ingredients, processing conditions and properties). Although more than this is generally not required for formulation, some users are tackling other data mining problems, such as Quantitative Structure Activity Relationships, or other large-scale problems. The new version allows up to 500 data fields.

In previous versions of INForm, the data sheet contained 500 rows by default, so it could hold 500 experiments. This could be restrictive when users were investigating problems with large amounts of data, for example plant processing, where data points are generated automatically by a machine. Although the user could use "+500" to add rows to the spreadsheet, it was all too easy to forget to do this. The new version removes this restriction, and up to 50,000 data records can be handled, provided your computer has sufficient memory.
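If you prepare data outside INForm, a quick check against the limits quoted above can be sketched as follows. This is only an illustration in Python: the function name check_limits and its arguments are hypothetical and are not part of INForm.

```python
# Limits quoted in the release notes for INForm v5.
MAX_FIELDS = 500
MAX_RECORDS = 50_000


def check_limits(n_ingredients, n_conditions, n_properties, n_records):
    """Return a list of problems; the list is empty if the data set fits.

    Data fields are counted as ingredients + processing conditions + properties.
    (Hypothetical helper for illustration, not an INForm function.)
    """
    problems = []
    n_fields = n_ingredients + n_conditions + n_properties
    if n_fields > MAX_FIELDS:
        problems.append(f"{n_fields} data fields exceeds the limit of {MAX_FIELDS}")
    if n_records > MAX_RECORDS:
        problems.append(f"{n_records} data records exceeds the limit of {MAX_RECORDS}")
    return problems
```

A data set with 100 ingredients, 20 processing conditions and 30 properties has 150 data fields, which sat exactly at the old limit and is well inside the new one.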



Selective Retraining of Chosen Models

Previously, unless you were working in Interactive mode during training, changing just one model meant aborting the training and developing all the models again. Now, Selective Retraining is available. After all property models have been trained, the Parameters and Train buttons become active. Selecting a line in the training set lets you modify the training parameters for that property, then train that particular model again.


New Methods of Determining Number of Nodes

INForm has always had inbuilt rules for selecting the number of nodes used in neural network models, but in some cases they selected more nodes than were necessary. Two new methods, Bootstrap Validation and -Fold Validation, now provide alternative ways of determining the number of nodes. The inbuilt rules remain the default option.
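These notes do not describe how INForm carries out Bootstrap Validation. As a general illustration of the idea only, the Python sketch below resamples the records with replacement, scores each candidate node count on the records left out of each resample, and keeps the candidate with the best average score. The names bootstrap_splits, choose_node_count and fit_and_score are hypothetical, and the caller is assumed to supply a function that trains a model and returns a score where higher is better.

```python
import random


def bootstrap_splits(n_records, n_rounds, seed=0):
    """Yield (train_idx, test_idx) pairs of record indices.

    Each training set is drawn with replacement; the test set is made of
    the records that were never drawn in that round ("out-of-bag").
    """
    rng = random.Random(seed)
    for _ in range(n_rounds):
        train = [rng.randrange(n_records) for _ in range(n_records)]
        in_bag = set(train)
        test = [i for i in range(n_records) if i not in in_bag]
        if test:  # skip the rare round in which every record was drawn
            yield train, test


def choose_node_count(candidates, n_records, fit_and_score, n_rounds=20, seed=0):
    """Return the candidate node count with the best mean out-of-bag score.

    fit_and_score(n_nodes, train_idx, test_idx) is supplied by the caller
    (an assumption of this sketch) and must return a number, higher = better.
    """
    splits = list(bootstrap_splits(n_records, n_rounds, seed))
    if not splits:
        raise ValueError("no usable bootstrap rounds; use more records or rounds")

    def mean_score(n_nodes):
        scores = [fit_and_score(n_nodes, tr, te) for tr, te in splits]
        return sum(scores) / len(scores)

    return max(candidates, key=mean_score)
```

Because every candidate is scored on the same resamples, differences in the average score reflect the node count rather than the luck of the draw.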


Improvements to Interface and Usability

One of the most useful changes, in our opinion, is that the Test R² value is now displayed on the Training screen, so you no longer need to open the Model Statistics screen to find it. You get information on model quality (as assessed by Test R²) much more quickly.
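For readers who want to reproduce a comparable figure outside the program, R² is commonly computed as the coefficient of determination on the test records. The Python sketch below uses that standard definition. It is an assumption that INForm's Test R² follows the same formula, and the function name r_squared is hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total.

    Standard textbook definition; whether INForm uses exactly this
    formula for its Test R-squared is an assumption.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all observed values are equal")
    return 1 - ss_res / ss_tot
```

A value of 1 means the predictions match the test data exactly, while 0 means the model does no better than predicting the mean of the observed values.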

If you have 'missing' data, it can now be represented by "?", which is more intuitive than the old "-99999" (which still works, of course).
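If you pre-process data in a script before loading it, both markers can be treated the same way. The Python sketch below is a hypothetical helper, not INForm code. It maps "?" and the numeric value -99999 to None and converts everything else to a number.

```python
def parse_cell(text):
    """Return the cell as a float, or None if it holds a missing-data marker.

    Recognises "?" and the older -99999 convention mentioned in the
    release notes. (Hypothetical helper for illustration.)
    """
    value = text.strip()
    if value == "?":
        return None
    number = float(value)  # raises ValueError for empty or non-numeric cells
    if number == -99999:
        return None
    return number
```

Comparing the parsed number, not the raw text, means that variants such as "-99999.0" are also recognised as missing.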


 

 

This document maintained by webmaster@intelligensys.co.uk.
Copyright © 2012 Intelligensys Ltd