Big forms, big data, big problems. Gravity Forms’ flexible data structure is a gift until you need to crunch thousands of entries fast. So we built a faster way to process data. Up to 100x faster. Here’s how we did it.
Slow data is no data
At GravityKit, we handle a lot of Gravity Forms entries through all of our add-ons. To name a few: GravityView displays entries however you like, GravityCharts plots entries on various charts and graphs, and GravityMath performs calculations with that data.
What these plugins have in common is that they are user-facing. You want this data to be visible to your users as fast as possible. If a page takes too long to load, the user is already gone, or frustrated at the very least.
But the more entries a form has, the more data needs to be filtered, calculated, and summarized. And this comes at the cost of speed.
We want our customers and their users to be happy. So we set out to tackle the challenges that come with the Gravity Forms data structure, to deliver fast and scalable metrics for forms of all sizes.
The problem: Slow and memory-heavy data crunching
Gravity Forms is one of the best form plugins for WordPress. It’s highly configurable with a range of powerful field types. To achieve this level of configurability, Gravity Forms relies on a data structure known as Entity-Attribute-Value (EAV).
EAV: A primer
EAV is a data structure that uses two database tables. In the case of Gravity Forms, these are:
- gf_entry – contains the unique ID of the entry (the Entity), when it was created and last updated, its status (whether it is active, trashed, or spam), and various other data.
- gf_entry_meta – references the form_id, along with:
  - entry_id – a reference to the unique entry ID in gf_entry (a reference to the Entity)
  - meta_key – holds the ID of a (sub)field of the form (the Attribute)
  - meta_value – contains the value for the (sub)field (the Value)
This data structure allows a form to change over time while using the exact same database. For example, whenever you add a new field to your existing form, the data can be stored in a new row with the new field ID and the provided value. When removing a field, all the rows connected to that form and field can be removed to clear unwanted data.
Additionally, you can store any extra data you might need. A lot of add-ons use this to enrich the entry data so they can do something specific with it. For example, GravityView stores an is_approved value to filter out unapproved entries.
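For illustration, here's how an add-on might attach such a value using Gravity Forms' entry meta functions (a minimal sketch; the value shown is a placeholder, not GravityView's actual approval flow):

// Stores an extra row in gf_entry_meta for this entry.
gform_update_meta( $entry_id, 'is_approved', 1 );

// Reads the value back from gf_entry_meta.
$is_approved = gform_get_meta( $entry_id, 'is_approved' );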
Note: The EAV pattern is not only used by Gravity Forms, but also by WordPress itself. This allows you to create different post types with their own data, without having to create a database table per post type.
While this data structure allows for high configurability, it also has limitations, especially when you’re working with thousands of entries! Let’s take a closer look at those limitations.
Limitation 1: Retrieving entries requires multiple queries
Because the entries and the data are stored in separate tables, it requires at least two queries to retrieve the entries.
First, you have to retrieve the required entry rows from the gf_entry table with a specific form_id. Then, you must retrieve all the rows from gf_entry_meta for that form_id, and where the entry_id is one of the entry IDs retrieved from gf_entry.
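As a rough sketch (simplified; Gravity Forms' real queries also handle paging, sorting, and filtering, and $form_id here is assumed to hold the form's ID), the two steps look something like this:

global $wpdb;

// Step 1: get the entry IDs for the form from gf_entry.
$entry_ids = $wpdb->get_col( $wpdb->prepare(
	"SELECT id FROM {$wpdb->prefix}gf_entry WHERE form_id = %d AND status = 'active'",
	$form_id
) );

// Step 2: get every meta row belonging to those entries from gf_entry_meta.
$meta_rows = $wpdb->get_results( sprintf(
	"SELECT entry_id, meta_key, meta_value FROM {$wpdb->prefix}gf_entry_meta WHERE form_id = %d AND entry_id IN ( %s )",
	(int) $form_id,
	implode( ',', array_map( 'intval', $entry_ids ) )
) );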
When Gravity Forms retrieves entry data, it processes all the rows, one at a time, and combines all the data for the same entry into a single array.
Note: During the combining process, Gravity Forms will also allow the form fields to change the stored data before it is returned. These changes may also require additional database queries, API calls, or other costly transformations.
Limitation 2: Entries require memory
The array of entries that Gravity Forms returns requires memory. This memory is what PHP uses to read the values. The available memory is limited by the server running PHP and its configuration.
Efficiently processing large numbers of entries requires retrieving them in smaller batches and releasing memory after each batch before proceeding to the next (this is why we said at least two queries earlier).
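A minimal batching sketch using Gravity Forms' public GFAPI (the batch size and search criteria are illustrative):

$paging = array( 'offset' => 0, 'page_size' => 500 );

do {
	// Fetch one batch of entries.
	$entries = GFAPI::get_entries( $form_id, array( 'status' => 'active' ), null, $paging );

	foreach ( $entries as $entry ) {
		// ... process a single entry ...
	}

	$batch_count = count( $entries );
	unset( $entries ); // Release this batch's memory before the next query.

	$paging['offset'] += $paging['page_size'];
} while ( $batch_count === $paging['page_size'] );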
Because of this batching, the next query needs to wait until the current batch is processed. Which leads to our next problem: time.
Limitation 3: Processing entries takes time
Imagine having 10,000 entries for a form that has 10 fields. This means that there might be around 100,000 gf_entry_meta rows (empty fields will not add a row). With a batch size of 500 entries, it would take 20 batches to process all entries.
Depending on the complexity of the query (whether you are filtering out certain values, for instance), each batch could take 50-150ms, or even 200ms+ on slower servers.
Add to this the time it takes to create the entry arrays and to process those entries, and you could be looking at a total time of 100-300ms per batch, or 2-6 seconds to process all entries.
Note: The calculation is indicative of an average server. The calculation also does not take into consideration additional queries from WordPress or Gravity Forms.
The reality: Our products were underperforming
GravityCharts aggregates field information and plots it on a chart. For example, a bar chart might show how often certain options were selected, or how many entries were submitted per day.
When it came to displaying large numbers of entries, performance was constrained by every limitation discussed above.
Too much information
For the aggregation, we only needed the values of the fields we were targeting. Because Gravity Forms retrieves entries in their entirety, we had to pull in (and then throw away) a lot of useless data, wasting precious memory.
Memory consumption forced batching
To solve the memory issue, we began retrieving our entries in batches. By using a Generator in PHP, we could query the data batch by batch, while the consuming code could still act as if it were retrieving all values at once.
A Generator is an iterable class in PHP that is very memory efficient. For more information, we recommend reading Generators over arrays.
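Here's a simplified sketch of that approach (the helper name and batch size are ours, for illustration only):

// Yields entries one at a time while fetching them in batches under the hood.
function get_entries_lazily( int $form_id, int $batch_size = 500 ): Generator {
	$offset = 0;

	do {
		$paging  = array( 'offset' => $offset, 'page_size' => $batch_size );
		$entries = GFAPI::get_entries( $form_id, array( 'status' => 'active' ), null, $paging );

		yield from $entries; // Hand out this batch one entry at a time.

		$offset += $batch_size;
	} while ( count( $entries ) === $batch_size );
}

// The calling code can act as if all entries were retrieved at once.
foreach ( get_entries_lazily( 5 ) as $entry ) {
	// ... aggregate a single entry ...
}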
Even though we fixed the memory issue, it still took a long time for the data to be retrieved and calculated. The most frustrating part was that the calculations weren’t even that complex. We were only counting/summing the values or counting the entries per date. This made us wonder: “Should we be doing this in PHP? Maybe there is an alternative.”
Caching to the rescue?
To prevent calculating the same outcome over and over, we thought about introducing a cache. However, caches are reset periodically. Therefore, some users would still need to endure the calculation time.
It had to be better for everyone, not only the lucky ones who happen to hit the cache. We desperately needed a better solution. Then it hit us: the database could already handle this effortlessly using aggregation functions like COUNT, SUM, AVG, even MIN and MAX!
Our solution: Use the right tool for the job
While working on the internal package we use at GravityKit that powers our advanced Conditional Logic for Gravity Forms, we realized that we needed to let the database do the aggregation work! Databases already know how to aggregate values. In fact, they are optimized for this work!
We quickly tested a proof-of-concept that retrieved the data in a single query, and found it was not only way faster, it also used significantly less memory in PHP, since the database did the heavy lifting.
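Roughly, the proof-of-concept boiled down to a single grouped query. The sketch below is hand-written with hypothetical form and field IDs, not our actual code:

global $wpdb;

// Count active entries of form 5, grouped by the value of field 1,
// all in one query, without loading a single entry into PHP.
$counts = $wpdb->get_results(
	"SELECT m.meta_value, COUNT(*) AS count
	 FROM {$wpdb->prefix}gf_entry e
	 JOIN {$wpdb->prefix}gf_entry_meta m ON m.entry_id = e.id AND m.meta_key = '1'
	 WHERE e.form_id = 5 AND e.status = 'active'
	 GROUP BY m.meta_value"
);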
Excited, we got to work building an API that we could use internally in all our packages.
To make this work, we built a fast and reusable layer on top of Gravity Forms’ Query Builder: GF_Query. (Fun fact: GravityKit originally created that system).
We extended it to support SQL-powered summaries through a clean and developer-friendly API.
Now we can get grouped counts, sums, and averages in a single query, without ever loading full entries into PHP.
It’s clean, expressive, and super fast! But it wasn’t easy.
For the full technical breakdown, see the Developer Notes below.
Technical overview for developers
Since GravityKit originally created the GF_Query class that powers Gravity Forms entry retrieval, we know it quite well. It is what makes our advanced Conditional Logic so powerful, and why we wanted to use it as the core for our aggregate abstraction. This would allow us to retrieve aggregates based on existing conditional logic.
We created two classes, Query and Field, that can form an expressive query to get the required aggregated data from the database. We added chainable methods and made the classes immutable, so we could reuse parts of the code without thinking twice.
Here’s an example of what we created. This example assumes a form (ID: 5) with two form fields:
- Category, which is a select field that has 3 values: One, Two, and Three
- Number, which is a number field
$gf_query = new GF_Query( 5, ['status' => 'active'] );
$query = Query::from( $gf_query );
$category_field = Field::from_ids( 5, '1' );
$number_field = Field::from_ids( 5, '2' );

As you can see, we start with a regular GF_Query that already has conditions applied; in this case, we are only interested in active entries.
First, let’s count all the entries for this form:
$query->count(); // Returns e.g. [ [ 'count' => 15 ] ]

As you can see, it returns an array of arrays. This is because, more often than not, you will want to get the count grouped by some field. The key for the count is count. That pattern is the same for all aggregates.
The real power is getting the count grouped by a category. In this example, the value of the selected category is stored under meta key '1', the field ID.
$query->group_by( $category_field )->count();
/**
* Returns e.g.
* [
* [ '1' => 'One', 'count' => '5' ],
* [ '1' => 'Two', 'count' => '4' ],
* [ '1' => 'Three', 'count' => '6' ],
* ]
*/Now, let’s do some aggregations. Every aggregate function needs a reference to the field you want to perform the calculation on. We have min, max,avg, and sum. But we can also combine them all in a single query with all(). We can even group by multiple fields by adding additional Field arguments to the group_by() method. You could, for example, group by category per date.
$by_category = $query->group_by( $category_field /*, $date_field */ );
$results = $by_category->all( $number_field );
/**
* Returns e.g.
* [
* [ '1' => 'One', 'min' => '1', 'max' => '5', 'avg' => '4', 'sum' => '20', 'count' => '5' ],
* [ '1' => 'Two', 'min' => '1', 'max' => '5', 'avg' => '5', 'sum' => '20', 'count' => '4' ],
* [ '1' => 'Three', 'min' => '1', 'max' => '5', 'avg' => '4', 'sum' => '24', 'count' => '6' ],
* ]
*/

Because Query is immutable, when we create the $by_category variable, it is a clone of $query with the group_by applied. This means we still have access to the original $query configuration for a different use case.
For GravityCharts, it meant that we could now perform a query like this:
$results = Query::from( $existing_gf_query )
->group_by( Field::from_field( $existing_group_by_field ) )
->sum( Field::from_field( $existing_value_field ) );

The reason we wrapped existing GF_Field instances in a Field class is so we could add helper methods that make the abstraction even more concise and expressive. It allowed us to add sorting methods like ->asc() and ->desc(), and consolidate the required transformations to their SQL output based on the provided GF_Field.
It also allowed us to add a with_alias() helper, which replaces the ID of the field with a specific key in the results.
Here is a complete example:
$category = Field::from_ids( $form_id, $field_id )->with_alias( 'category' );
$result = Query::from( $gf_query )
->group_by( $category )
->order_by( $category->asc() )
->count();
/**
* Sorted by category name in ascending order.
* [
* ['category' => 'One', 'count' => 5],
* ['category' => 'Three', 'count' => 6],
* ['category' => 'Two', 'count' => 4],
* ]
*/
Challenges we had to overcome
While the idea of letting MySQL do "the heavy lifting" is pretty simple, there were a few things PHP used to do for us that we had to migrate to SQL.
Currency formats
Some values include currency symbols. Others store decimal and thousands separators differently, depending on the default currency. To fix this, we sanitize and normalize all of this inside the SQL query so the aggregate input is consistent, then reapply the correct formatting after the query finishes.
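As a rough sketch of the idea (the symbol and separators here assume a euro-style format; the real query adapts to the configured currency):

// Turn a stored value like '€1.234,56' into 1234.56 so SUM()/AVG() can work on it:
// strip the symbol, drop the thousands separators, and swap the decimal comma for a dot.
$normalized = "CAST( REPLACE( REPLACE( REPLACE( meta_value, '€', '' ), '.', '' ), ',', '.' ) AS DECIMAL( 18, 4 ) )";

// Used inside a query, e.g.: SELECT SUM( {$normalized} ) AS sum ...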
JSON-based multi-select fields
Not every field stores its value(s) in the same way. If a field has a single value, it uses a single row in gf_entry_meta. If it has multiple values (like checkboxes or a complex field like an address), it uses multiple rows.
However, a multi-select field stores its multiple selected values as JSON in a single row. Because of that, we couldn’t GROUP BY <field_id>. But we found two solutions for this problem:
- When available (on MySQL 8+, or the MariaDB equivalent), we JOIN the JSON data with the JSON_TABLE function. This creates a temporary table on which the GROUP BY statement can still be performed (see the sketch after this list).
- For older MySQL versions, we created a fallback PHP solution that post-processes the aggregates that MySQL produces. While this still requires some "manual processing", the amount of data is very small, which means minimal overhead.
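Here's a sketch of the JSON_TABLE approach (MySQL 8+ only; the form ID and field ID are illustrative, not our actual implementation):

global $wpdb;

// Expand the JSON array of a multi-select field (field ID 3) into one row
// per selected option, then count how often each option was picked.
$sql = "
	SELECT jt.option_value, COUNT(*) AS count
	FROM {$wpdb->prefix}gf_entry_meta m,
	     JSON_TABLE(
	         m.meta_value, '$[*]'
	         COLUMNS ( option_value VARCHAR(255) PATH '$' )
	     ) AS jt
	WHERE m.form_id = 5 AND m.meta_key = '3'
	GROUP BY jt.option_value
";
$results = $wpdb->get_results( $sql );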
Adjusting for time zones
The date_created and date_updated columns on the gf_entry table are always stored in GMT. To offset those dates to the correct time zone (as defined in the WordPress settings), we previously used PHP DateTime and DateTimeZone objects during the entry processing.
In order to let MySQL handle this, we check for the availability of time zones and CONVERT_TZ in the database. If this is not available, the aggregation will still be possible, but the results could be slightly off. In that case, we have the option to show a user notification.
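Where CONVERT_TZ is available, the grouping looks roughly like this (a sketch with an illustrative form ID; named time zones require the database's time zone tables to be loaded, which is part of that availability check):

global $wpdb;

// Count entries per day in the site's time zone instead of GMT.
$results = $wpdb->get_results( $wpdb->prepare(
	"SELECT DATE( CONVERT_TZ( date_created, '+00:00', %s ) ) AS day, COUNT(*) AS count
	 FROM {$wpdb->prefix}gf_entry
	 WHERE form_id = %d AND status = 'active'
	 GROUP BY day",
	wp_timezone_string(), // e.g. 'Europe/Amsterdam' or an offset like '+02:00'
	$form_id
) );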
What does this mean for you? Faster GravityKit products!
This abstraction has been working so well that even with 100,000+ entries, the time it takes to get the values is mere milliseconds. And while we are not against caching, it has not been necessary. This means that the aggregates are always up to date, and we do not have to make trade-offs in functionality or limit ourselves to a smaller dataset.
By avoiding PHP processing and aggregating entries directly in the database, we've managed to:
- Make the processing up to 100x faster
- Eliminate the need for batching
- Reduce memory usage significantly
- Preserve our user-friendly conditional logic
Try it out
Try GravityMath or GravityCharts to experience this increased speed for yourself!
