MM-57344: Improve ElasticSearch indexing query performance (#26517)

* Use a row constructor comparison to avoid a filter

The original query had a WHERE condition that looked like:

    (x > a) OR (x = a AND y < b)

This commit changes it to a compound comparison that, given the
lexicographical order it uses, is semantically the same:

    (x, y) > (a, b)

This makes the plan in Postgres use an Index Cond instead of a Filter,
having two main performance improvements:
1. The query's time is more or less constant, not increasing every time
this query is executed. This is important because the query is executed
repeatedly until all posts are indexed.
2. The query's time is much shorter: while the original query eventually
takes 30s, hitting the timeout, the new one takes ~30ms.

Notice the difference in the plans below:
- the original query uses a `Filter` step, while the new one uses an
`IndexCond` + `Filter` step.
- the original query has a shared hit count of 40557115 (109GiB), while
the new one has a shared hit count of 13951 (109MiBA). This does not
show exactly how much data is read from disk, given it's a nested loop
node, but it gives us an idea of the amount of data processed in the
server.

Original plan:

agnivaltdb=> EXPLAIN (ANALYZE, BUFFERS) SELECT Posts.*, Channels.TeamId FROM Posts LEFT JOIN Channels ON Posts.ChannelId = Channels.Id WHERE Posts.CreateAt > '1687424888405' OR (Posts.CreateAt = '1687424888405' AND Posts.Id > 'tpomh9yu1tffmdp6dopobwuc9h') ORDER BY Posts.CreateAt ASC, Posts.Id ASC LIMIT 10000;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1.22..2287.94 rows=10000 width=464) (actual time=23921.236..23954.229 rows=10000 loops=1)
   Buffers: shared hit=40557115
   ->  Incremental Sort  (cost=1.22..13998881.93 rows=61217938 width=464) (actual time=23921.235..23953.205 rows=10000 loops=1)
         Sort Key: posts.createat, posts.id
         Presorted Key: posts.createat
         Full-sort Groups: 311  Sort Method: quicksort  Average Memory: 45kB  Peak Memory: 45kB
         Buffers: shared hit=40557115
         ->  Nested Loop Left Join  (cost=1.00..11421751.31 rows=61217938 width=464) (actual time=23920.970..23947.067 rows=10001 loops=1)
               Buffers: shared hit=40557115
               ->  Index Scan using idx_posts_create_at on posts  (cost=0.57..9889434.54 rows=61217938 width=461) (actual time=23920.930..23931.063 rows=10001 loops=1)
                     Filter: ((createat > '1687424888405'::bigint) OR ((createat = '1687424888405'::bigint) AND ((id)::text > 'tpomh9yu1tffmdp6dopobwuc9h'::text)))
                     Rows Removed by Filter: 40920000
                     Buffers: shared hit=40553119
               ->  Memoize  (cost=0.43..0.70 rows=1 width=30) (actual time=0.001..0.001 rows=1 loops=10001)
                     Cache Key: posts.channelid
                     Cache Mode: logical
                     Hits: 9002  Misses: 999  Evictions: 0  Overflows: 0  Memory Usage: 151kB
                     Buffers: shared hit=3996
                     ->  Index Scan using channels_pkey on channels  (cost=0.42..0.69 rows=1 width=30) (actual time=0.007..0.007 rows=1 loops=999)
                           Index Cond: ((id)::text = (posts.channelid)::text)
                           Buffers: shared hit=3996
 Planning:
   Buffers: shared hit=112
 Planning Time: 0.501 ms
 Execution Time: 23954.974 ms
(25 rows)

New plan:

agnivaltdb=> EXPLAIN (ANALYZE, BUFFERS) SELECT Posts.*, Channels.TeamId FROM Posts LEFT JOIN Channels ON Posts.ChannelId = Channels.Id WHERE (Posts.CreateAt, Posts.Id) > ('1687424888405', 'tpomh9yu1tffmdp6dopobwuc9h') ORDER BY Posts.CreateAt ASC, Posts.Id ASC LIMIT 10000;
                                                                           QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1.15..1666.83 rows=10000 width=464) (actual time=0.366..34.053 rows=10000 loops=1)
   Buffers: shared hit=13951
   ->  Incremental Sort  (cost=1.15..10196977.36 rows=61217938 width=464) (actual time=0.365..33.031 rows=10000 loops=1)
         Sort Key: posts.createat, posts.id
         Presorted Key: posts.createat
         Full-sort Groups: 311  Sort Method: quicksort  Average Memory: 45kB  Peak Memory: 45kB
         Buffers: shared hit=13951
         ->  Nested Loop Left Join  (cost=1.00..7619846.74 rows=61217938 width=464) (actual time=0.059..26.840 rows=10001 loops=1)
               Buffers: shared hit=13951
               ->  Index Scan using idx_posts_create_at on posts  (cost=0.57..6087529.97 rows=61217938 width=461) (actual time=0.040..10.548 rows=10001 loops=1)
                     Index Cond: (createat >= '1687424888405'::bigint)
                     Filter: (ROW(createat, (id)::text) > ROW('1687424888405'::bigint, 'tpomh9yu1tffmdp6dopobwuc9h'::text))
                     Rows Removed by Filter: 2
                     Buffers: shared hit=9955
               ->  Memoize  (cost=0.43..0.70 rows=1 width=30) (actual time=0.001..0.001 rows=1 loops=10001)
                     Cache Key: posts.channelid
                     Cache Mode: logical
                     Hits: 9002  Misses: 999  Evictions: 0  Overflows: 0  Memory Usage: 151kB
                     Buffers: shared hit=3996
                     ->  Index Scan using channels_pkey on channels  (cost=0.42..0.69 rows=1 width=30) (actual time=0.007..0.007 rows=1 loops=999)
                           Index Cond: ((id)::text = (posts.channelid)::text)
                           Buffers: shared hit=3996
 Planning:
   Buffers: shared hit=112
 Planning Time: 0.471 ms
 Execution Time: 34.716 ms
(26 rows)

* Go back to the old query for MySQL

As one could expect, the two databases have completely opposite
behaviours, and for MySQL, the old query is significantly faster than
the new one, exactly for the same reason than in Postgres, but the other
way around: the old query uses an Index range scan while the new one
would use a Filter.

Old query (0.18s):

mysql> EXPLAIN ANALYZE SELECT Posts.*, Channels.TeamId FROM Posts USE INDEX(idx_posts_create_at_id) LEFT JOIN Channels ON Posts.ChannelId = Channels.Id WHERE Posts.CreateAt > 1557752415221 OR (Posts.CreateAt = 1557752415221 AND Posts.Id > 'ad59ire57tfwmjr5r8xqxc75qw') ORDER BY Posts.CreateAt ASC, Posts.Id ASC LIMIT 10000;
--------------
EXPLAIN ANALYZE SELECT Posts.*, Channels.TeamId FROM Posts USE INDEX(idx_posts_create_at_id) LEFT JOIN Channels ON Posts.ChannelId = Channels.Id WHERE Posts.CreateAt > 1557752415221 OR (Posts.CreateAt = 1557752415221 AND Posts.Id > 'ad59ire57tfwmjr5r8xqxc75qw') ORDER BY Posts.CreateAt ASC, Posts.Id ASC LIMIT 10000
--------------

+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Limit: 10000 row(s)  (cost=7472068.91 rows=10000) (actual time=0.063..164.174 rows=10000 loops=1)
    -> Nested loop left join  (cost=7472068.91 rows=5559093) (actual time=0.062..163.450 rows=10000 loops=1)
        -> Index range scan on Posts using idx_posts_create_at_id over (CreateAt = 1557752415221 AND 'ad59ire57tfwmjr5r8xqxc75qw' < Id) OR (1557752415221 < CreateAt), with index condition: ((Posts.CreateAt > 1557752415221) or ((Posts.CreateAt = 1557752415221) and (Posts.Id > 'ad59ire57tfwmjr5r8xqxc75qw')))  (cost=1357066.29 rows=5559093) (actual time=0.043..97.358 rows=10000 loops=1)
        -> Single-row index lookup on Channels using PRIMARY (Id=Posts.ChannelId)  (cost=1.00 rows=1) (actual time=0.006..0.006 rows=1 loops=10000)
 |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.18 sec)

New query (5s):

mysql> EXPLAIN ANALYZE SELECT Posts.*, Channels.TeamId FROM Posts USE INDEX(idx_posts_create_at_id) LEFT JOIN Channels ON Posts.ChannelId = Channels.Id WHERE (Posts.CreateAt, Posts.Id) > (1557752415221, 'ad
59ire57tfwmjr5r8xqxc75qw') ORDER BY Posts.CreateAt ASC, Posts.Id ASC LIMIT 10000;
--------------
EXPLAIN ANALYZE SELECT Posts.*, Channels.TeamId FROM Posts USE INDEX(idx_posts_create_at_id) LEFT JOIN Channels ON Posts.ChannelId = Channels.Id WHERE (Posts.CreateAt, Posts.Id) > (1557752415221, 'ad59ire57tfwmjr5r8xqxc75qw') ORDER BY Posts.CreateAt ASC, Posts.Id ASC LIMIT 10000
--------------

+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Limit: 10000 row(s)  (cost=11119405.48 rows=10000) (actual time=5295.106..5455.285 rows=10000 loops=1)
    -> Nested loop left join  (cost=11119405.48 rows=10000) (actual time=5295.105..5454.572 rows=10000 loops=1)
        -> Filter: ((Posts.CreateAt,Posts.Id) > (1557752415221,'ad59ire57tfwmjr5r8xqxc75qw'))  (cost=221.48 rows=10000) (actual time=5295.078..5388.668 rows=10000 loops=1)
            -> Index scan on Posts using idx_posts_create_at_id  (cost=221.48 rows=10000) (actual time=0.055..5314.753 rows=600000 loops=1)
        -> Single-row index lookup on Channels using PRIMARY (Id=Posts.ChannelId)  (cost=1.00 rows=1) (actual time=0.006..0.006 rows=1 loops=10000)
 |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (5.46 sec)
```

* Add comment explaining the difference between DBs

---------

Co-authored-by: Mattermost Build <build@mattermost.com>
This commit is contained in:
Alejandro García Montoro 2024-03-21 12:57:29 +01:00 committed by GitHub
parent eb817966a4
commit 9a2d96073e
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194

View File

@ -2460,28 +2460,51 @@ func (s *SqlPostStore) GetEditHistoryForPost(postId string) ([]*model.Post, erro
func (s *SqlPostStore) GetPostsBatchForIndexing(startTime int64, startPostID string, limit int) ([]*model.PostForIndexing, error) {
posts := []*model.PostForIndexing{}
table := "Posts"
// We force this index to avoid any chances of index merge intersection.
if s.DriverName() == model.DatabaseDriverMysql {
table += " USE INDEX(idx_posts_create_at_id)"
}
query := `SELECT
Posts.*, Channels.TeamId
FROM ` + table + `
LEFT JOIN
Channels
ON
Posts.ChannelId = Channels.Id
WHERE
Posts.CreateAt > ?
OR
(Posts.CreateAt = ? AND Posts.Id > ?)
ORDER BY
Posts.CreateAt ASC, Posts.Id ASC
LIMIT
?`
err := s.GetSearchReplicaX().Select(&posts, query, startTime, startTime, startPostID, limit)
var err error
// In order to use an index scan for both MySQL and Postgres, we need to
// diverge the implementation of the query, specifically in the WHERE
// condition: for MySQL, we need to do
// (CreateAt > ?) OR (CreateAt = ? AND Id > ?)
// while for Postgres we need
// (CreateAt, Id) > (?, ?)
// The wrong choice for any of the two databases makes the query go from
// milliseconds to dozens of seconds.
// More information in: https://github.com/mattermost/mattermost/pull/26517
// and https://community.mattermost.com/core/pl/ui5dz96shinetb8nq83myggbma
if s.DriverName() == model.DatabaseDriverMysql {
query := `SELECT
Posts.*, Channels.TeamId
FROM Posts USE INDEX(idx_posts_create_at_id)
LEFT JOIN
Channels
ON
Posts.ChannelId = Channels.Id
WHERE
Posts.CreateAt > ?
OR
(Posts.CreateAt = ? AND Posts.Id > ?)
ORDER BY
Posts.CreateAt ASC, Posts.Id ASC
LIMIT
?`
err = s.GetSearchReplicaX().Select(&posts, query, startTime, startTime, startPostID, limit)
} else {
query := `SELECT
Posts.*, Channels.TeamId
FROM Posts
LEFT JOIN
Channels
ON
Posts.ChannelId = Channels.Id
WHERE
(Posts.CreateAt, Posts.Id) > (?, ?)
ORDER BY
Posts.CreateAt ASC, Posts.Id ASC
LIMIT
?`
err = s.GetSearchReplicaX().Select(&posts, query, startTime, startPostID, limit)
}
if err != nil {
return nil, errors.Wrap(err, "failed to find Posts")
}