==================================
Advanced Solr plugin configuration
==================================

.. include:: hint-querqy-5-solr.txt

In the `installation `_ section you have already seen how to set up the Querqy query parser and the Querqy query component in ``solrconfig.xml``:

**Querqy 5**

.. code-block:: xml

**Querqy 4**

.. code-block:: xml

This section explains additional configuration options:

* Defining how to deal with undefined rewriters in the rewrite chain (Querqy 5)
* The term query cache, which avoids building Lucene queries for sub-queries created in query rewriting that never match in specific fields
* The info logging for query rewriters
* The parser for the user query string
* Changing the rewriter request handler path (Querqy 5)
* Configuring how rewriter definitions are stored in ZooKeeper

.. _querqy-unknown-rewriters:

Dealing with undefined rewriters in the rewrite chain (Querqy 5)
----------------------------------------------------------------

In Querqy 5 the rewriter chain is passed as a list of rewriter IDs in a request parameter: :code:`querqy.rewriters=rewriter1,rewriter2`, where each rewriter ID references a previously defined rewriter configuration.

You can use the :code:`skipUnknownRewriters` property of the query parser plugin to define what should happen if no rewriter configuration can be found for a rewriter ID that was passed in :code:`querqy.rewriters`:

.. code-block:: xml
   :linenos:
   :emphasize-lines: 2

   true

If 'skipUnknownRewriters' is set to :code:`true`, the missing rewriter is ignored and the rewrite chain is processed as if this rewriter were not part of it. A warning is written to the log file. If it is set to :code:`false`, Solr replies with a '400 Bad Request' response. This is also the default behaviour when 'skipUnknownRewriters' is not configured.

.. _querqy-term-query-cache:

Term query cache
----------------

When you configure rewriting rules in Querqy's 'Common Rules Rewriter', in most cases you will not specify field names on the right-hand sides. For example, you would use a synonym rule to say that if the user enters the query 'personal computer', Solr should also search for 'pc', and Querqy would automatically create field-specific queries like 'name:pc', 'description:pc', 'color:pc' etc. for the right-hand side of the synonym rule.

On the other hand, it is very unlikely that an input term has matches in all fields that are given in the 'gqf'/'qf' parameters. In the example, it is very unlikely that there is a document with the term 'pc' in the 'color' field.

You can configure Querqy to check on startup, core reloading and commit (that is, whenever a searcher is opened) whether the terms on the right-hand side of the rules have matches in the query fields, and to cache this information. If no document matches the right-hand side term in a given field, the field-specific query will not be executed again until Solr opens a new searcher.

Caching this information can speed up Querqy considerably, especially if there are many query fields.

Version-independent cache configuration (solrconfig.xml):

.. code-block:: xml

   f1 f2 querqy querqyTermQueryCache true f1 f2 querqy querqyTermQueryCache true

Please see below for additional configuration for your Querqy version.

**Querqy 5**

.. code-block:: xml

   querqyTermQueryCache false

If you `changed the request handler name <#changing-the-name-of-the-rewriter-request-handler-querqy-5>`_ of the :code:`querqy.solr.QuerqyRewriterRequestHandler`, you will have to set this name on the :code:`querqy.solr.TermQueryCachePreloader` (line #9):

.. code-block:: xml
   :linenos:

   f1 f2 querqy querqyTermQueryCache true /some/othername

**Querqy 4**

.. code-block:: xml

   querqyTermQueryCache false

.. _solr-query-string-parser:

The query string parser
-----------------------

The query string parser defines how the query string that is passed in request parameter ``q`` is parsed into Querqy's internal query object model before the query is rewritten and turned into a Lucene query. It can be set using an element with name ``parser`` in the configuration:[1]_

.. code-block:: xml

   querqy.parser.WhiteSpaceQuerqyParser

The parser defines how the input is interpreted. The default parser, the ``WhiteSpaceQuerqyParser``, provides only a very minimal syntax:

* Query tokens are delimited by whitespace
* Tokens can be prefixed by a ``-``\ (token must not occur in matches) or a ``+``\ (token must occur in matches)

This syntax should be sufficient for most use cases, especially for e-commerce search. Note that this query parser has no option to express field names. You can configure a ``querqy.parser.FieldAwareWhiteSpaceQuerqyParser`` to allow for field names. However, this can reduce the applicability of query rewriters considerably.

You can implement your own query parser by implementing the ``querqy.parser.QuerqyParser`` interface. If your query parser needs more configuration options, you can provide the parser using a factory:[1]_

.. code-block:: xml

   querqy.solr.MyQuerqyParserFactory value 1 2

The factory must implement ``querqy.solr.SolrQuerqyParserFactory``. It will receive the configuration properties ('myConfProperty1'/'myConfProperty2') in a map passed to its init method. Note that ``SolrQuerqyParserFactory.createParser()`` is called per request, which means that QuerqyParsers are allowed to be stateful.

Changing the name of the rewriter request handler (Querqy 5)
------------------------------------------------------------

The rewriter request handler is normally configured for the path :code:`/querqy/rewriter`:

.. code-block:: xml

This means that Querqy manages rewriters under the URL path :code:`/solr/mycollection/querqy/rewriter`, and you should normally not need to change this path. Should you ever have to change it, you will need to change the :code:`name` attribute of the rewriter request handler:

.. code-block:: xml

... point the query parser to it:

.. code-block:: xml
   :linenos:
   :emphasize-lines: 2

   /my/rewriter-path

... and, if you use term query cache preloading, let the 'TermQueryCachePreloader' know about it:

.. code-block:: xml
   :linenos:
   :emphasize-lines: 4

   /my/rewriter-path

.. _querqy-store-rewriters:

Configuring how rewriter definitions are stored in ZooKeeper (Querqy 5)
-----------------------------------------------------------------------

In SolrCloud, rewriter configurations are stored in ZooKeeper under the path :code:`querqy/rewriters` as part of the collection's config. You can set a number of Querqy configuration properties to control how rewriters are stored:

.. code-block:: xml
   :linenos:

   myconfig __data 500000

If you want to share rewriter configurations across collections, you can make them part of a shared Solr configuration from which you create multiple collections. Set the ``zkConfigName`` property to point to that shared Solr configuration and rewriter configurations will become part of it. :code:`querqy/rewriters` will then become a subpath in this shared configuration.

Querqy only stores meta information about rewriters under :code:`querqy/rewriters` and keeps the actual rewriter configurations in a subpath underneath it. This subpath is named ``.data`` by default. Unfortunately, this path is not restored when you use Solr's collection backup to a file system. You can change that name using Querqy's ``zkDataDirectory`` property, but you need to make sure that it doesn't start with a ``.`` if you want to have it restored from your backup.
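
The storage properties described in this section might be set as in the following sketch. It assumes that they are configured as child elements of the :code:`querqy.solr.QuerqyRewriterRequestHandler` in ``solrconfig.xml``, using Solr's usual ``str``/``int`` element style. The element placement and types are assumptions, and the values are only illustrative:

.. code-block:: xml

   <!-- Sketch only: element placement and element types are assumptions -->
   <requestHandler name="/querqy/rewriter" class="querqy.solr.QuerqyRewriterRequestHandler">
       <!-- Name of the shared Solr configuration that should hold the rewriters -->
       <str name="zkConfigName">myconfig</str>
       <!-- Subpath for the rewriter data; no leading '.', so that a backup can restore it -->
       <str name="zkDataDirectory">__data</str>
       <!-- Maximum compressed chunk size in bytes -->
       <int name="zkMaxFileSize">500000</int>
   </requestHandler>
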
Rewriter configurations will not be moved automatically when you change the configuration property, so you will have to save all rewriters again to move them to the new location. Querqy can still read rewriters that were saved at a location other than the one that the current ``zkDataDirectory`` points to.

(Querqy 5.3 and above) Rewriter configurations will be gzipped for storage in ZK. If the gzipped configuration still exceeds the maximum ZK file size, it will be split into multiple chunks. Querqy takes care of this under the hood, but it cannot know ZK's file size limit. You can let Querqy know about this limit using the :code:`zkMaxFileSize` property, which represents the maximum compressed chunk size in bytes. The default value is 1000000, which fits the default ZooKeeper size limit. If you change this limit, the new limit will only be applied the next time rewriters are saved.

.. [1] The queryParser class in the example below is :code:`querqy.solr.QuerqyDismaxQParserPlugin` for Querqy 5 and :code:`querqy.solr.DefaultQuerqyDismaxQParserPlugin` for Querqy 4.