mirror of
https://github.com/mattermost/mattermost.git
synced 2025-02-25 18:55:24 -06:00
* MM-19548: Add a deadlock retry function for SaveChannel
A deadlock has been seen to occur in the upsertPublicChannelT method
during bulk import.
Here is a brief excerpt:
*** (1) TRANSACTION:
TRANSACTION 3141, ACTIVE 1 sec inserting
INSERT INTO
PublicChannels(Id, DeleteAt, TeamId, DisplayName, Name, Header, Purpose)
VALUES
(?, ?, ?, ?, ?, ?, ?)
ON DUPLICATE KEY UPDATE
DeleteAt = ?,
TeamId = ?,
DisplayName = ?,
Name = ?,
Header = ?,
Purpose = ?
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 76 page no 4 n bits 104 index Name of table `mydb`.`PublicChannels` trx id 3141 lock_mode X locks gap before rec insert intention waiting
** (2) TRANSACTION:
TRANSACTION 3140, ACTIVE 1 sec inserting
mysql tables in use 1, locked 1
5 lock struct(s), heap size 1136, 3 row lock(s), undo log entries 2
MySQL thread id 50, OS thread handle 140641523848960, query id 3226 172.17.0.1 mmuser update
INSERT INTO
PublicChannels(Id, DeleteAt, TeamId, DisplayName, Name, Header, Purpose)
VALUES
(?, ?, ?, ?, ?, ?, ?)
ON DUPLICATE KEY UPDATE
DeleteAt = ?,
TeamId = ?,
DisplayName = ?,
Name = ?,
Header = ?,
Purpose = ?
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 76 page no 4 n bits 104 index Name of table `mydb`.`PublicChannels` trx id 3140 lock_mode X locks gap before rec
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 76 page no 4 n bits 104 index Name of table `mydb`.`PublicChannels` trx id 3140 lock_mode X locks gap before rec insert intention waiting
*** WE ROLL BACK TRANSACTION (1)
Following is my analysis:
From the deadlock output, it can be seen that it's due to a gap lock.
And that's clear because the index is Name which is a multi-column index using Name and TeamId.
But interestingly, both transactions seem to be inserting the same data, which is what is puzzling me.
The multi-column index on Name and TeamId will guarantee that they are always unique. And from looking at the code,
it does not seem possible to me that it will try to insert the same data from 2 different transactions.
But even if they do, why does tx 2 try to acquire the same lock again when it already has that ?
Here is what I think the order of events happening
Tx 2 gets a gap lock.
Tx 1 tries to get the same gap lock.
Tx 2 tries to again get the same gap lock ?
The last step is what is puzzling me. Why does an UPSERT statement acquire 2 gap locks ? From my reading of https://dev.mysql.com/doc/refman/8.0/en/innodb-locks-set.html:
> INSERT ... ON DUPLICATE KEY UPDATE differs from a simple INSERT in that an exclusive lock rather than a shared lock is placed on the row to be updated when a duplicate-key error occurs. An exclusive index-record lock is taken for a duplicate primary key value. An exclusive next-key lock is taken for a duplicate unique key value.
From what I understand, the expectation is that there will be one X lock and one gap lock is taken.
But that's not what the deadlock output seems to say.
The general advice on the internet seems to be that deadlocks will happen and not all of them can be understood.
For now, we add a generic deadlock retry function at the store package which can be reused by other queries too.
P.S.: This is a verbatim copy of my investigation posted at https://dba.stackexchange.com/questions/268652/mysql-deadlock-upsert-query-acquiring-gap-lock-twice
Testing:
This is ofcourse hard to test because it is impossible to reproduce this. I have tested this by manually returning an error
and confirming that it indeed retries.
WARN[2020-06-19T11:18:24.9585676+05:30] A deadlock happened. Retrying. caller="sqlstore/channel_store.go:568" error="Error 1213: mydeadlock"
WARN[2020-06-19T11:18:24.959158+05:30] A deadlock happened. Retrying. caller="sqlstore/channel_store.go:568" error="Error 1213: mydeadlock"
WARN[2020-06-19T11:18:24.9595072+05:30] A deadlock happened. Retrying. caller="sqlstore/channel_store.go:568" error="Error 1213: mydeadlock"
WARN[2020-06-19T11:18:24.9595451+05:30] Deadlock happened 3 times. Giving up caller="sqlstore/channel_store.go:579"
ERRO[2020-06-19T11:18:24.9596426+05:30] Unable to save channel. caller="mlog/log.go:175" err_details="Error 1213: mydeadlock" err_where=CreateChannel http_code=500 ip_addr="::1" method=POST path=/api/v4/channels request_id=745bsj13b7f6mnmsbn3t97grbw user_id=xcof1ipipbrfxpfjf6x4p6kx9e
* Fix tests
* Address review comments
* Address review comments
* Add forgotten test
Co-authored-by: Mattermod <mattermod@users.noreply.github.com>