Case Study: Lightning-Fast Performance: Imperva Achieves 95% of Queries in Under 900 Milliseconds with Twingo and SingleStore!

Imperva is a leading cybersecurity company offering a wide range of products, including its well-known web application firewall. With both on-premises and cloud solutions, Imperva serves as a one-stop shop for cyber protection.

To provide real-time protection services and plan strategically against future breach attempts, Imperva relies on data-based reports and insights, and on quick real-time responses to immediate threats and anomalies.

“Our operation lies on the spectrum between real-time data operation through data warehouses, data lakes, reports, analytics, and insights. All of our customers’ data flows through our pipes, and it contains many data types,” explains Elad Tamary, Tech Lead at Imperva.

Encountering three main challenges: real-time reporting, flexibility & speed
Imperva previously used a solution with a client-server setup for every Imperva endpoint, across some 1,000 servers around the world. These servers accumulated the metrics, and the data was then sent through a client to physical machines that saved it on disk. No data was replicated, and with this volume of data the resolution was poor: one data point every 10 minutes, and anything finer was simply impossible.

“To me, it meant we had to pre-decide, from the producer end, how the consumer will see the data,” says Elad. “It translates to no flexibility and a reliability issue – every hiccup on the web ended with us apologizing. Because the data was pre-structured, you could not run different analytics, use all the data, or increase resolution, and it is more than just a paradox.”

Imperva’s data was saved on machines, and a UC-engine, which was installed on one of the machines, indexed the files – it was almost archaic.
In Imperva’s admin console, components had to be loaded and graphs generated by accessing the machine, downloading the files, and building the data – in effect, an on-prem style of operation. The first challenge, then, was the need for real-time reports with no intervals.

The second challenge was flexibility in queries: “We wanted to stream a lot of data, and a point every 10 minutes was not satisfactory. On the other hand, we wanted to stay close to the raw data, which translated to enabling various queries, without structuring them in advance,” Elad explains.

The third challenge was speed. Imperva’s system serves UI systems, so it needed an operational data store that returns query results in less than a second, and definitely not once every 10 minutes.

It was clear that the solution would be based on data streaming through Kafka, creating a buffer between the real-time world and ingestion into the data warehouse.
The biggest concern, however, was that no single answer to all three challenges would be found.
So the first step was to bring in Twingo as consultants.

Testing for a solution
Together with Ilya Gullman, Twingo’s CTO, Imperva compiled a shortlist of vendors that might offer a solution: SingleStore, Druid, and Amazon Timestream.

Initially, due to factors such as the partnership between Imperva and Amazon, the favorite seemed to be Amazon Timestream.
However, Amazon Timestream was then in its preliminary stages, and its query latency was too high for such a customer-facing application.

In addition, it was missing important features required for production systems, such as backup and restore.

That being the case, it was decided to start a POC with SingleStore and Druid, focused on three main demands:
Speed – both ingestion speed and query time
Pricing – the number of machines needed for the solution, including metrics and backup costs
Storage size – with an enormous volume of data, albeit numeric, Imperva needed effective compression

Druid delivered the load performance but did not meet the expected SLA for queries. In addition, it was very complex to deploy and stabilize from a DevOps perspective.

SingleStore proved to have a strong, fast engine, met both the ingestion and the query SLA requirements, and was relatively easy to deploy. On the other hand, it fell short in some aspects of the user interface and backup automation, and its monitoring was missing some features. Overall, advantages such as low TCO and the ability to stably ingest 2 million records per second, alongside hundreds of user queries per second, made it the winning choice.

Choosing a winner
The POC gave Imperva a “Go Card”, a winner if you will, and it was SingleStore.

“Ilya helped us construct a schema to arrange the data, divide it into tables, etc. Simultaneously we ran a stress test that examined and fine-tuned the queries until we reached the rate of 1-second download,” says Elad. “Now we stand at a 95th-percentile query time of 900 milliseconds, over ten different types of queries, with a GraphQL API sitting above the database, and every new field is automatically inserted. It means that as soon as you add something to the schema, the API is immediately updated.”
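
As a rough illustration of what a 95% target of 900 milliseconds means, the following Python sketch checks a batch of measured query latencies against it. The nearest-rank percentile method, the function names, and the sample values are assumptions made for illustration; the case study does not describe how Imperva actually measures its query times.

```python
import math


def percentile(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: the smallest value such that at least pct% of samples are <= it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def meets_latency_target(latencies_ms, pct=95, target_ms=900):
    """True if the pct-th percentile latency is within target_ms."""
    return percentile(latencies_ms, pct) <= target_ms


# Hypothetical sample: 19 fast queries and one slow outlier.
sample_latencies_ms = [400] * 19 + [1500]
target_met = meets_latency_target(sample_latencies_ms)
```

With this made-up sample, the single slow query falls in the top 5% and does not affect the result, which is why a percentile target is a common way to state latency goals for user-facing systems.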

Today, Imperva has a flow: a procedure for adding data to the database, and an API connection. This enables both producer and consumer to update the API after submitting a merge request. Everything on Imperva’s admin console is now based on the database: graphs, charts, dashboards, the works.

“Since we started, many flows were added, such as network monitoring, and use cases are being added at a volume that we can hardly track, as we continue to improve,” adds Elad. “Our current stage is the stats engine, which means we have a statistics engine with a pipeline that works amazingly. We see performance improvement in query time and loading time, both on-site and application. Our next stage will be opening the API externally, enabling our enterprise customers to produce independent reports.”

Letting SingleStore do the heavy lifting
Today, Imperva’s clients see a more dynamic and wide-range resolution than ever before. Clients can simply open a dashboard, play with the resolution, and get query responses in less than a second.
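
One way to picture adjustable resolution is re-bucketing raw data points at query time instead of fixing the interval in advance. The Python sketch below averages raw (timestamp, value) points into buckets of any requested width. The function name, the choice of averaging, and the sample data are hypothetical, and in the setup described here this kind of aggregation would be done by the database rather than in application code.

```python
from collections import defaultdict


def rollup(points, bucket_seconds):
    """Average (epoch_seconds, value) points into fixed-width time buckets.

    Returns a list of (bucket_start, average) tuples sorted by bucket_start.
    """
    if bucket_seconds <= 0:
        raise ValueError("bucket_seconds must be positive")
    totals = defaultdict(lambda: [0.0, 0])  # bucket_start -> [sum, count]
    for ts, value in points:
        bucket_start = ts - ts % bucket_seconds
        totals[bucket_start][0] += value
        totals[bucket_start][1] += 1
    return [(start, total / count) for start, (total, count) in sorted(totals.items())]


# Hypothetical raw points: (seconds since some epoch, metric value).
raw_points = [(0, 1.0), (30, 3.0), (60, 5.0), (599, 7.0), (600, 9.0)]

per_minute = rollup(raw_points, 60)       # fine resolution
per_ten_minutes = rollup(raw_points, 600) # the old 10-minute resolution
```

Because the raw points are kept, the same data can be viewed per minute or per 10 minutes on demand, which is the flexibility a pre-aggregated 10-minute feed cannot offer.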

At the query level, SingleStore does the heavy lifting and thereby solves the flexibility issue. Reliability-wise, the data is replicated and SingleStore reads automatically from Kafka, which leaves Imperva to work only with the pipeline.

Elad recalls: “For example, AWS’s North Virginia went down one day, and as soon as it was online again, we simply restarted the Pods and everything was back up in seconds. Before, we would have lost the data. In the worlds of resolution, reliability, and user experience, today you simply open a dashboard and it opens automatically. No UI opens files, edits them, searches for data, and then starts drawing; you get a JSON answer with the time series, and everything is accessible.”

Cost effective: yet another benefit
It is probably clear by now that data-related costs were cut, not only in the cost of stats machines but also in human resources. Before, adding a metric took some 14 to 21 days; today, apart from a merge request, it is done automatically and is almost self-service.

This slashed Imperva’s R&D time dramatically, as Elad explains: “Before, we had a team of six people just for that. Today, if we get a request that needs attention, someone will do it, but we do not dedicate precious R&D time to it; we can focus on improvement. So even in human resources and working hours, SingleStore proved to be cost-effective. When you add our customers’ satisfaction, the range of features, and speed, it adds up to a wise choice.”

Twingo, leading the way for finding the optimal solution
When asked about Twingo’s contribution to the process, Elad says: “Twingo’s consultation was helpful beyond words, and it is why we are advancing to our stage II in the world of data lake, which is less ‘live’ and time-sensitive, but is required for heavy scale issues. We are looking to the future and into the worlds of data lake 2023, including Iceberg, alongside Twingo to be prepared for whatever comes.”

“Twingo has also proved itself priceless for us in terms of technology and pipeline choice and GA readiness,” he adds. “At our starting point, we were at 15 seconds per query and now we’re down to less than a second. From building our solution around Kafka to the choice of datastore, the queries, and their fine-tuning, Twingo’s help was immeasurable. In terms of mediation and support, the process would not have been streamlined without Twingo’s assistance.”

About Imperva
Imperva is a leading cybersecurity company that aims to protect enterprise data and application software through all stages of digital transformation. Imperva offers a wide range of products and solutions for application, data, network, and cloud-native security, as well as security automation.

Founded in 2002, Imperva now has over 1,200 employees worldwide, with headquarters in San Mateo, CA and 17 offices around the world, over 500 international partners, and over 6,000 enterprise customers in some 150 countries. Imperva leads its field of expertise and has received awards and recognition worldwide.

About Twingo
Twingo specializes in the fields of Big Data, Data Lake, and BI Analytics. The company consults on, implements, and sells software solutions and projects in its areas of expertise.

The company specializes in architecture design, choosing the right technologies, cloud transition, SaaS solutions, OEM solutions, and consulting on ML, scalability, and multi-tenant solution design.
Over the past few years, we have led over 150 Big Data projects for leading companies in every vertical.

We are an AWS Premier Partner in the field of Data & Analytics, and we have guided many of our clients through problem-solving and building environments tailored to their needs.

About SingleStore (formerly MemSQL)
SingleStore is a cloud-native, operational database built for speed and scale, dedicated to helping businesses leverage data to reach their full potential and redefine the limits of what’s possible.
