Apr 09, 2020
With all due respect to what ES (Elasticsearch) brings to querying data compared with other storage engines, one of its disadvantages is the potential complexity of its queries. In general, if you want to send a post object to an ES endpoint, constructing that object can become trickier than usual, especially when the goal is to retrieve data by filtering. Several years ago, there was little documentation on how to build complex ES queries, and there were few tools, such as libraries and packages, for ES query generation. However, this changed with the emergence of the elastic.js library for JavaScript and especially with its Node.js successor, the elastic-builder package.
How did our story with ES query generation begin?
Over the course of two years, we worked on a particular product where we used ES queries. Over time, we became aware of the need to optimize how those queries were generated. Each time we needed to change something, it required a lot of effort on our side. So, we decided to find a solution to simplify ES query generation.
After multiple considerations and discussions, we ended up using the elastic-builder package to optimize the code for ES query generation in our Angular project.
In this post, we won’t be talking about whether writing custom code is a better solution than using a third-party package or whether Angular is the most suitable FE framework. Instead, we will just focus on some key segments of our solution related to the ES queries and explain them in two parts – Part 1, describing the inherited status together with the main pain points, and Part 2, containing the improved solution with the elastic-builder.
In the first phase, the requirement was simple filtering of tasks by several parameters, such as country and status. At that point, we expected that the shape of the query would not change, so we wrote several lines to manually build the post object required for delivering the needed output. The resulting JSON tree was fairly simple.
To generate only the must subtree, we had to use multiple arrays, objects, and conditions, considering that some of the filters may not exist at all (prediction), or may have multiple values (status):
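// country, statusList and prediction come from the selected filter values.
// The country filter is always present, so it seeds the must array.
const must: Array<any> = [
  {
    "match": {
      "country": {
        "query": country,
        "type": "phrase"
      }
    }
  }
];

// status may have multiple values: join them into a single OR expression.
if (statusList.length > 0) {
  const query = statusList.map(status => "(" + status + ")").join(' OR ');
  must.push({
    "query_string": {
      "default_field": "status",
      "query": query
    }
  });
}

// prediction is optional: add its clause only when a value is set.
if (prediction) {
  must.push({
    "match": {
      "prediction": {
        "query": prediction,
        "type": "phrase"
      }
    }
  });
}

const boolObject = {"must": must};
const postObject = {
  "query": {
    "bool": boolObject
  }
};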
The generated output is exactly what we wanted, but only because we had to know the structure of the query beforehand:
"query": {
  "bool": {
    "must": [
      {
        "match": {
          "country": {
            "query": country,
            "type": "phrase"
          }
        }
      },
      {
        "query_string": {
          "default_field": "status",
          "query": "(" + status_1 + ") OR (" + status_2 + ")"
        }
      },
      {
        "match": {
          "prediction": {
            "query": prediction,
            "type": "phrase"
          }
        }
      }
    ]
  }
}
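To make the effect of the optional filters easier to see, the same manual construction can be wrapped in a small function. This is only a minimal sketch: the names TaskFilters, phraseMatch, buildMust and buildPostObject, as well as the sample values, are hypothetical and not taken from our project code.

```typescript
// Hypothetical types and names for illustration; not the project's actual code.
type Clause = Record<string, unknown>;

interface TaskFilters {
  country: string;       // always present
  statusList: string[];  // may be empty or hold several values
  prediction?: string;   // may be missing entirely
}

// Builds a phrase-style match clause in the same shape as the example above.
function phraseMatch(field: string, value: string): Clause {
  return { match: { [field]: { query: value, type: "phrase" } } };
}

// Assembles the must array, skipping filters that have no value.
function buildMust(filters: TaskFilters): Clause[] {
  const must: Clause[] = [phraseMatch("country", filters.country)];
  if (filters.statusList.length > 0) {
    const query = filters.statusList.map((s) => `(${s})`).join(" OR ");
    must.push({ query_string: { default_field: "status", query } });
  }
  if (filters.prediction) {
    must.push(phraseMatch("prediction", filters.prediction));
  }
  return must;
}

// Wraps the must array in the query/bool envelope.
function buildPostObject(filters: TaskFilters) {
  return { query: { bool: { must: buildMust(filters) } } };
}

// Sample values below are made up for the demonstration.
const onlyCountry = buildPostObject({ country: "DE", statusList: [] });
const allFilters = buildPostObject({
  country: "DE",
  statusList: ["open", "closed"],
  prediction: "high",
});

console.log(JSON.stringify(onlyCountry, null, 2));
console.log(JSON.stringify(allFilters, null, 2));
```

With only a country set, the must array holds a single clause. With all three filters it holds three, in the same order as the output above. Even in this small form, each new filter means another hand-written branch, and that pattern caused us trouble as the requirements grew.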
We used a similar approach for the filter and sort subtrees, believing that the custom code for generating the whole query was probably large enough to even consider an alternative at that point. For the time being, we were stuck with the status quo, not anticipating that more and more requirements would arrive sooner than expected.
Over time, the queries were required to support custom search, advanced filtering, conditional filtering, and so on. On top of that, the number of filter values and conditions increased drastically. Each time we tried to expand the query, more and more custom code was involved, and we didn’t realize that it was slowly polluting the generation process. This resulted in more bugs than initially expected, and it became almost impossible for developers who had recently joined the project to understand the code.
For such complex queries, there is no point in displaying what the JSON tree looks like. It resembles a forest: the generation conditions, which are based primarily on the status values, always result in a separate tree per condition.
The same goes for the code in the generation method. Presenting it in full would be impractical, so in the second part of this blog post we will focus only on the smaller portions that illustrate why you should consider giving elastic-builder a chance.
Velimir Graorkoski
Tanja Zlatanovska