The Grammar Settings Dialog


This dialog, invoked from the Analysis... menu item, allows you to specify parameters for AXE's grammar generator.


AXE can build grammars from arbitrary raw data, which is useful for many tasks in data analysis and archaeology. AXE's grammar generator only scratches the surface of the general problem of finding repetitions and developing a structure to represent arbitrary data, but it can still be very useful.

The radio buttons in the Interpret Data As list specify the data type that the document's bytes will be interpreted as. 'Characters' produces the same grammar as 'Bytes', and 'Wide Characters' produces the same grammar as 'Words'; the only difference is that document elements are displayed in the output as characters rather than as hex numbers.
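The relationship between the four interpretations can be sketched as follows. This is only an illustration, not AXE's own code: the function names, the little-endian 16-bit word layout, and the handling of a trailing odd byte are all assumptions made for the example.

```python
import struct

def interpret(data: bytes, mode: str) -> list[int]:
    """Split raw bytes into the integer symbols a grammar would be built from."""
    if mode in ("bytes", "characters"):
        return list(data)
    if mode in ("words", "wide characters"):
        # Assumption: 16-bit little-endian words; a trailing odd byte is dropped.
        usable = len(data) - (len(data) % 2)
        return [w for (w,) in struct.iter_unpack("<H", data[:usable])]
    raise ValueError(f"unknown mode: {mode}")

def render(symbols: list[int], mode: str) -> str:
    """Format symbols as characters or as hex numbers, depending on the mode."""
    if mode in ("characters", "wide characters"):
        return "".join(chr(s) if chr(s).isprintable() else "." for s in symbols)
    width = 2 if mode == "bytes" else 4
    return " ".join(f"{s:0{width}X}" for s in symbols)
```

Here interpret() returns identical symbols for 'bytes' and 'characters' (and for 'words' and 'wide characters'), so the grammar would be the same; only render() differs.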

The radio buttons in the Operate On list determine whether the whole document, or only the selected region, will be used. Generating a grammar from a large document can be very time-consuming, and so can displaying it, so the region used should be as small as possible. Depending on the randomness of the data, a region of about 10k may be a practical upper limit.

The Delete rules that are only used once checkbox, if unchecked, prevents the suppression of redundant rules as the grammar is generated. The result will generally not be as good as the grammar produced by an ordinary Sequitur algorithm, but it may be more regular and thus bring out areas of similarity.
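Deleting a rule that is used only once amounts to substituting its body at the single place where it is referenced. The sketch below shows that idea as a pass over a finished grammar; AXE applies it while the grammar is being generated, and its internal representation is not documented here. The dictionary layout (rule names as strings, terminals as integers) and the function names are hypothetical.

```python
from collections import Counter

def use_counts(grammar: dict[str, list]) -> Counter:
    """Count how many times each rule name is referenced in any rule body."""
    counts = Counter()
    for body in grammar.values():
        for sym in body:
            if isinstance(sym, str):  # strings are rule references, ints are terminals
                counts[sym] += 1
    return counts

def delete_single_use_rules(grammar: dict[str, list], start: str = "S") -> dict[str, list]:
    """Inline every rule (except the start rule) that is referenced exactly once."""
    g = {name: list(body) for name, body in grammar.items()}
    while True:
        counts = use_counts(g)
        once = next((n for n in g if n != start and counts[n] == 1), None)
        if once is None:
            return g
        body = g.pop(once)
        for name, rule in g.items():
            if once in rule:
                i = rule.index(once)
                g[name] = rule[:i] + body + rule[i + 1:]  # splice the body in place
                break
```

For example, with S = A B A, A = 97 98, B = 99 C and C = 100 101, rules B and C are each used once and are folded into S, leaving S = A 99 100 101 A and A = 97 98.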

The Include Extra Rule Info option causes each rule to be output along with a count of how many times that rule is used in the grammar, plus a full expansion of the data represented by the rule. This is extremely useful for making sense of the grammar but can make displaying the grammar window rather slow.
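The two extra pieces of information, a use count and a full expansion, can be computed from a grammar roughly as shown below. This is a minimal sketch under assumptions, not AXE's output routine: the grammar is a hypothetical dictionary mapping rule names (strings) to bodies, and terminals are assumed to be byte values in the range 0 to 255.

```python
from collections import Counter

def expand(grammar: dict[str, list], name: str) -> list[int]:
    """Recursively replace rule references until only terminals remain."""
    out = []
    for sym in grammar[name]:
        if isinstance(sym, str):  # rule reference: expand it in place
            out.extend(expand(grammar, sym))
        else:                     # terminal: copy it through
            out.append(sym)
    return out

def rule_info(grammar: dict[str, list]) -> dict[str, tuple[int, bytes]]:
    """Map each rule name to (times referenced, fully expanded data)."""
    counts = Counter(
        sym for body in grammar.values() for sym in body if isinstance(sym, str)
    )
    return {name: (counts[name], bytes(expand(grammar, name))) for name in grammar}
```

Because every rule is expanded in full, the cost grows with the number of rules and the length of the data each one represents, which is consistent with the slowdown noted above.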

The Output Rule S option determines whether S, the first and usually longest rule of the grammar, is emitted. If a grammar that can be used to recreate the input data is required, then S is required; if not, S is often too long to be intelligible.

The Create Tree option determines whether a tree is generated that shows the full expansion of the grammar and is linked to the document. Usually, this tree is extremely useful, but it can be omitted to save time or if the intent is to generate a grammar as text for analysis elsewhere.