Files
mattermost/app
Agniva De Sarker 4e154756bd MM-27648: Fix a hub deadlock while revoking session (#15293)
* MM-27648: Fix a hub deadlock while revoking session

This is a bug which has always been there in the codebase.
And it can only occur in the extreme of edge-cases.

Following is the call trace due to which this happens:

```
 0  0x0000000001dfea68 in github.com/mattermost/mattermost-server/v5/app.(*Hub).InvalidateUser // deadlock
    at ./app/web_hub.go:369
 1  0x0000000001dfc0bd in github.com/mattermost/mattermost-server/v5/app.(*App).InvalidateWebConnSessionCacheForUser
    at ./app/web_hub.go:109
 2  0x0000000001db1be5 in github.com/mattermost/mattermost-server/v5/app.(*App).ClearSessionCacheForUserSkipClusterSend
    at ./app/session.go:209
 3  0x0000000001db1763 in github.com/mattermost/mattermost-server/v5/app.(*App).ClearSessionCacheForUser
    at ./app/session.go:170
 4  0x0000000001db2d2f in github.com/mattermost/mattermost-server/v5/app.(*App).RevokeSession
    at ./app/session.go:275
 5  0x0000000001db2c09 in github.com/mattermost/mattermost-server/v5/app.(*App).RevokeSessionById
    at ./app/session.go:260
 6  0x0000000001daf442 in github.com/mattermost/mattermost-server/v5/app.(*App).GetSession
    at ./app/session.go:93
 7  0x0000000001df93f4 in github.com/mattermost/mattermost-server/v5/app.(*WebConn).IsAuthenticated
    at ./app/web_conn.go:271
 8  0x0000000001dfa29b in github.com/mattermost/mattermost-server/v5/app.(*WebConn).shouldSendEvent
    at ./app/web_conn.go:323
 9  0x0000000001e2667e in github.com/mattermost/mattermost-server/v5/app.(*Hub).Start.func1.3 // starting from hub
    at ./app/web_hub.go:491
10  0x0000000001e27c01 in github.com/mattermost/mattermost-server/v5/app.(*Hub).Start.func1
    at ./app/web_hub.go:504
11  0x0000000001e27ee2 in github.com/mattermost/mattermost-server/v5/app.(*Hub).Start.func2
    at ./app/web_hub.go:528
12  0x0000000000473811 in runtime.goexit
    at /usr/local/go/src/runtime/asm_amd64.s:1373
```

The stars have to align in such a way that the session idle timeout _has_ to happen
_exactly_ when a broadcast is happening for that user. Only then, this code path gets
triggered.

Since this is an extreme rabbit hole of calls, I have not attempted any big
refactors and went with the most sensible approach which is to make the RevokeSessionById
call asynchronous.

There are 2 main reasons:
- It was already treated as an asynchronous call because it happened during an error condition
and we were not checking for the return value anyways.
- Session idle timeout is a relatively infrequent event, so creating unbounded goroutines is not a concern.

As a bonus, we also get to check the error return and log it.

https://mattermost.atlassian.net/browse/MM-27648

* Add a test case

* Fix an incorrect comment
2020-08-19 23:27:48 +05:30
..
2020-07-31 15:53:10 +02:00
2020-07-28 10:27:24 +05:30
2020-08-04 16:37:21 +02:00
2020-08-13 01:35:57 -04:00
2020-07-28 10:27:24 +05:30
2020-08-04 16:37:21 +02:00
2020-08-13 11:12:08 +05:30
2020-07-31 15:53:10 +02:00