A Note on Performance
There are often several ways to write a query that produces a required output report. Writing queries in different ways can sometimes cause enormous differences in query performance, so this section provides some hints on how to write queries so they are as efficient as possible.
Use of indexes
Wherever possible, use HAS SUBWORD
rather than MATCHES
/LIKE
or HAS SUBSTRING
so the datastore indexes are used. The full text index is very good at finding words and phrases.
WHERE clause ordering
In WHERE
clauses containing AND
expressions, order the parts so that the most restrictive condition comes first, except that it is always better to use a HAS SUBWORD
condition in preference to other kinds. This reduces the search space as soon as possible.
Filtering on related nodes
To filter nodes by characteristics of related nodes, it is better to find the related nodes, then traverse to the required nodes. For example, to find Hosts in London, this query finds the London location in the index then uses a traversal so it directly and quickly finds the required Hosts:
SEARCH Location WHERE name = "London" TRAVERSE Location:Location:ElementInLocation:Host
This query starts by building a set of all Host nodes, then checks the location of each one, which is much slower:
SEARCH Host WHERE NODECOUNT(TRAVERSE Location:Location:ElementInLocation:Host WHERE name = "London")
This final version also builds a set of all Hosts and checks the location of each one, but in this case it fails when encountering Hosts in more than one location, as well as being inefficient. Queries like this should never be used:
SEARCH Host WHERE #Location:Location:ElementInLocation:Host.name = "London"
Default ordering
By default, queries with no SHOW
clause are ordered by the label value defined in the taxonomy. If the data is to be exported to an external system that does not care about order, the time spent retrieving data to sort can be avoided by specifying ORDER BY ""
.
Comments
Log in or register to comment.