A Note on Performance
There are often several ways to write a query that produces a required output report. Writing queries in different ways can sometimes cause enormous differences in query performance, so this section provides some hints on how to write queries so they are as efficient as possible.
The following topics are covered in this section:
Use of indexes
Wherever possible, use
HAS SUBWORD rather than
HAS SUBSTRING so the datastore indexes are used. The full text index is very good at finding words and phrases.
WHERE clause ordering
WHERE clauses containing
AND expressions, order the parts so that the most restrictive condition comes first, except that it is always better to use a
HAS SUBWORD condition in preference to other kinds. This reduces the search space as soon as possible.
Filtering on related nodes
To filter nodes by characteristics of related nodes, it is better to find the related nodes, then traverse to the required nodes. For example, to find Hosts in London, this query finds the London location in the index then uses a traversal so it directly and quickly finds the required Hosts:
This query starts by building a set of all Host nodes, then checks the location of each one, which is much slower:
This final version also builds a set of all Hosts and checks the location of each one, but in this case it fails when encountering Hosts in more than one location, as well as being inefficient. Queries like this should never be used:
By default, queries with no
SHOW clause are ordered by the label value defined in the taxonomy. If the data is to be exported to an external system that does not care about order, the time spent retrieving data to sort can be avoided by specifying
ORDER BY "" .