Elastic Builder To The Rescue

Apr 22, 2020


Previously, we talked about the birth and the agony of expanding complex ES queries with custom generation and all of its disadvantages.

In this post, we show how easily elastic-builder handles the generation and what positive outcomes its integration brought to the framework.

Step 1 – improve the clauses

Instead of the custom generation for the must subtree presented in our previous blog post, we used elastic-builder’s helper methods. Almost the same can be achieved with the class constructors. This is what we ended up with:

const esb = require('elastic-builder');

let tasksSearchObject = esb.requestBodySearch();
// The country clause is always applied.
let tasksSearchBoolQuery = esb.boolQuery().must(esb.matchPhraseQuery('country', country));
// Match any of the requested statuses: "(status_1) OR (status_2)".
if (statusList.length > 0) {
    tasksSearchBoolQuery = tasksSearchBoolQuery
        .must(esb.queryStringQuery(statusList.map(status => '(' + status + ')').join(' OR '))
            .defaultField('status'));
}
// The prediction clause is optional.
if (prediction) {
    tasksSearchBoolQuery = tasksSearchBoolQuery
        .must(esb.matchPhraseQuery('prediction', prediction));
}
let postObject = tasksSearchObject.query(tasksSearchBoolQuery).toJSON();

Short and compact, isn’t it? The output is almost the same as the custom-generated one, except that the country and prediction clauses are optimized. Instead of a match clause with the phrase type, we emit a match_phrase clause directly, which reduces the size of the JSON while achieving the same goal:

"must": [
            {
                "match_phrase": {
                    "country": country
                }
            },
            {
                "query_string": {
                    "query": "(" + status_1 + ") OR (" + status_2 + ")",
                    "default_field": "status"
                }
            },
            {
                "match_phrase": {
                    "prediction": prediction
                }
            }
]

Keep in mind that before using elastic-builder we were not even aware that the match_phrase clause existed.
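The query string in the status clause above comes from a one-line expression. Below is a minimal sketch of that expression as a standalone function. The name buildStatusQueryString is ours for illustration and is not part of elastic-builder, and the sketch assumes the status values contain no characters that are special in the query_string syntax.

```javascript
// Hypothetical helper (not an elastic-builder API): wraps every status in
// parentheses and joins the groups with OR.
// Assumption: statuses contain no query_string special characters to escape.
function buildStatusQueryString(statusList) {
    return statusList.map(status => '(' + status + ')').join(' OR ');
}

const statusQuery = buildStatusQueryString(['open', 'in progress']);
console.log(statusQuery); // (open) OR (in progress)
```

The parentheses group each status as one unit of the OR expression, which becomes relevant once a status consists of several words.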

Step 2 – deal with conditional filtering

When it comes to applying filters under specific conditions, the expected output differs per condition. If multiple conditions apply to different filters, you either have to write a separate generation for each condition or introduce conditions at least for the affected clauses.

The second approach is far more efficient, but you have to be extremely careful with the custom code related to the condition. Let’s take a look at the example below, where you have two different status values for which you need to generate two different queries. The bool compound clause for the first will look like this:

"should": [
            {
                "bool": {
                    "must_not": {
                        "term": {
                            "task_type": taskType
                        }
                    }
                }
            },
            {
                "bool": {
                    "must": [
                        {
                            "term": {
                                "task_type": taskType
                            }
                        },
                        {
                            "range": {
                                "details.update_date": {
                                    "gte": date
                                }
                            }
                        }
                    ]
                }
            }
]

The second status value will only apply different sorting:

"sort": [
            {
                "close_date": sortOrder
            }
]

Even though the second one is simpler, it still cost us more than 40 lines of custom code to generate only 3 + 1 leaf clauses. To avoid mistakes as much as possible, we replaced that large code segment with elastic-builder’s helper methods, in a similar fashion to the match_phrase clause:

case status_1:
    // Keep tasks of any other type, or tasks of this type updated on or after `date`.
    tasksSearchBoolQuery = tasksSearchBoolQuery
        .filter(esb.boolQuery()
            .should(esb.boolQuery()
                .mustNot(esb.termQuery('task_type', taskType)))
            .should(esb.boolQuery()
                .must(esb.termQuery('task_type', taskType))
                .must(esb.rangeQuery('details.update_date')
                    .gte(date))));
    break;
case status_2:
    // Only the sort order changes for this status.
    tasksSearchObject = tasksSearchObject.sort(esb.sort('close_date', sortOrder));
    break;

It is obvious that each condition still affects a different part of the post object: the status_1 case adds 3 leaves to the bool compound clause and status_2 changes the sort clause. However, we avoided generating a new query by using a “chain” of helper methods across two variables (tasksSearchObject and tasksSearchBoolQuery, with the latter attached to the former at the end).

Step 3 – handle the advanced filtering

Another good experience with the compact way of generating queries with elastic-builder is filtering by multiple filters with multiple values per filter. Although we faced far more complicated filters, here we show an example of filtering by only two fields, user and source_name (stored as details.source_name), with 3 and 2 values respectively. Below is the query generated with our custom approach:

"must": [
            {
                "bool": {
                    "filter": {
                        "terms": {
                            "user": [
                                user_id_1,
                                user_id_2,
                                user_id_3
                            ]
                        }
                    }
                }
            },
            {
                "bool": {
                    "filter": {
                        "terms": {
                            "details.source_name": [
                                        source_name_1,
                                        source_name_2
                            ]
                        }
                    }
                }
            }
]

The same query can be generated in two statements with just a few helper methods, in contrast to what we previously had with the custom code:

tasksSearchBoolQuery = tasksSearchBoolQuery
    .must(esb.boolQuery()
    .filter(esb.termsQuery('user', [user_id_1, user_id_2, user_id_3])));
tasksSearchBoolQuery = tasksSearchBoolQuery
    .must(esb.boolQuery()
    .filter(esb.termsQuery('details.source_name', [source_name_1, source_name_2])));
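When the set of filters grows, the same pattern can be driven by data instead of being repeated per field. The sketch below is one possible way to do that, not the code from our framework: addTermsFilters and the shape of the filters object are hypothetical, and esb stands for the elastic-builder module, passed in by the caller. It relies only on the boolQuery, must, filter and termsQuery calls already shown above.

```javascript
// Hypothetical helper, not part of elastic-builder.
// `esb` is the elastic-builder module supplied by the caller,
// `boolQuery` is the root bool query, and `filters` maps a field name
// to the list of accepted values (an assumed shape).
function addTermsFilters(esb, boolQuery, filters) {
    let query = boolQuery;
    for (const [field, values] of Object.entries(filters)) {
        // Skip filters with no selected values so no empty terms clause is emitted.
        if (!Array.isArray(values) || values.length === 0) {
            continue;
        }
        query = query.must(
            esb.boolQuery().filter(esb.termsQuery(field, values))
        );
    }
    return query;
}

// Possible usage, assuming elastic-builder is installed:
// const esb = require('elastic-builder');
// tasksSearchBoolQuery = addTermsFilters(esb, tasksSearchBoolQuery, {
//     'user': [user_id_1, user_id_2, user_id_3],
//     'details.source_name': [source_name_1, source_name_2]
// });
```

Passing esb as an argument is a design choice made for this sketch: it keeps the helper free of a hard dependency, so it can be exercised with a small stub in unit tests.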

Do you need to know the DSL in detail for query generation?

All of the useful code above is written assuming that you know the exact query you want to generate. What if you are not sure what your query should look like due to a lack of knowledge of ES queries?

The desired output depends on applying the correct query, and with elastic-builder you don’t need to dig too deep into the query DSL (domain-specific language) to generate it. Of course, any kind of knowledge you possess about the query DSL is a plus, but it is definitely not a must.

Instead, it is enough to learn the purpose of the elastic-builder classes and helper methods. For example, if you want to apply an efficient custom search by multiple fields (like title and task_type), no matter if they are nested (like details.retailer) or not, you can simply use the following queryStringQuery helper after quickly checking the official documentation:

// fields() and defaultOperator() belong to the query_string query, not to the bool query.
tasksSearchBoolQuery = tasksSearchBoolQuery.must(
    esb.queryStringQuery(searchString)
        .fields(['title', 'task_type', 'details.retailer'])
        .defaultOperator('AND'));
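The defaultOperator choice in the call above decides how the whitespace-separated terms of searchString are combined. The following toy model illustrates the difference in plain JavaScript. It is only a sketch: the matches function is hypothetical, it ignores analysis, scoring, wildcards and explicit operators inside the string, and it simply checks whether terms are present in a piece of text.

```javascript
// Toy model only: real ES analyzes text and scores hits.
// Here a "document" is a plain string and a term matches if it appears as a word.
function matches(docText, searchString, defaultOperator = 'OR') {
    const tokenize = text => text.toLowerCase().split(/\s+/).filter(Boolean);
    const docTerms = new Set(tokenize(docText));
    const queryTerms = tokenize(searchString);
    const isPresent = term => docTerms.has(term);
    // AND: every term must be present; OR: one present term is enough.
    return defaultOperator === 'AND'
        ? queryTerms.every(isPresent)
        : queryTerms.some(isPresent);
}

const title = 'red running shoes';
console.log(matches(title, 'red boots', 'OR'));  // true: "red" alone is enough
console.log(matches(title, 'red boots', 'AND')); // false: "boots" is missing
console.log(matches(title, 'red shoes', 'AND')); // true: both terms are present
```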

However, it is preferable to be familiar with the usage of wildcards and logical operators (like AND); otherwise, you can end up specifying the wrong operator or not defining it at all (it will default to OR). The appropriate leaf clause (in the must subtree) will look like this:

"query_string": {
    "query": searchString,
    "fields": [
        "title",
        "task_type",
        "details.retailer"
    ],
    "default_operator": "AND"
}

By specifying AND as the default operator, ES requires every term in the searchString value to match, whereas with the default OR a match on any single term would be enough. The query therefore has a totally different meaning if you apply another logical operator. Another similar example is the keyword sub-field. We frequently use it due to its effectiveness, for example when the sorting value can contain multiple words, like sorting by details.retailer and by details.source_type:

// Reassign the existing variable; redeclaring it with `let` would throw.
tasksSearchObject = tasksSearchObject
    .sort(esb.sort('details.retailer.keyword', orderRetailer))
    .sort(esb.sort('details.source_type.keyword', orderSourceType));

Notice that keyword is not one of our document fields, even though it is specified in the value of the field parameter; it refers to the non-analyzed keyword sub-field that ES typically maintains alongside a text field. As a simple conclusion derived from answering this question, we would like to point out that elastic-builder can only help you understand and easily construct the queries. It is not responsible for what you put as a parameter value.
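The .keyword suffix in the sort example above works only if the index mapping defines such a sub-field. The sketch below assumes a mapping of the kind that ES dynamic mapping usually produces for string values (a text field with a keyword sub-field). The assumedMapping object and the sortFieldFor helper are hypothetical and should be checked against the real mapping of your index.

```javascript
// Assumed mapping, modeled on what ES dynamic mapping typically creates for strings.
const assumedMapping = {
    properties: {
        details: {
            properties: {
                retailer: {
                    type: 'text',
                    fields: { keyword: { type: 'keyword', ignore_above: 256 } }
                }
            }
        }
    }
};

// Hypothetical helper: walks a dotted path through the mapping and returns
// "<path>.keyword" when a keyword sub-field exists, otherwise the path itself.
function sortFieldFor(mapping, path) {
    let node = mapping;
    for (const part of path.split('.')) {
        node = node && node.properties && node.properties[part];
    }
    const hasKeyword = Boolean(
        node && node.fields && node.fields.keyword &&
        node.fields.keyword.type === 'keyword'
    );
    return hasKeyword ? path + '.keyword' : path;
}

console.log(sortFieldFor(assumedMapping, 'details.retailer')); // details.retailer.keyword
console.log(sortFieldFor(assumedMapping, 'close_date'));       // close_date
```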

Results

If you are considering elastic-builder or any similar library for generating ES queries, we encourage you to adopt it from the very beginning, before the queries become too complex. Below are the results we achieved using elastic-builder in our framework:

[Figure: Elastic_Builder_To_The_Rescue_2.webp, the results achieved with elastic-builder in our framework]

Velimir Graorkoski
