Files
mattermost/server/einterfaces
Agniva De Sarker cb75a20c54 MM-61904: Make reliable websockets work in HA (#29489)
We do a cluster request to get the active and dead queues
from other nodes in the cluster to sync any missing
information.

We check the dead queue in the other nodes to see
if there's been any message loss or not. Accordingly,
we send just the active queue or both active and dead queues.

There's still an edge case that is left out where
a client could have potentially connected and reconnected
to multiple nodes leaving multiple active queues
in multiple nodes. We don't handle this scenario
because then potentially we need to create
a slice of sendQueueSize * number_of_nodes. And then
this can happen again, leading to an infinite increase
in sendQueueSize.

We leave this edge-case to Redis, acknowledging
a limitation in our architecture.

In this PR, when there's no message loss, we just
take the active queue from the last node it connected
to.

And if there's message loss where the client's
seqNum is within the last node's dead queue, we also
handle that.

But if there's severe message loss where the client's
seqNum falls within the dead queue of another node, then
we just send the data from that node to reconstruct the
data as much as possible. It could be possible to set
a new connection ID in this case, but this involves
more data transfer always from all nodes and recomputing
the state in the requestor node.

https://mattermost.atlassian.net/browse/MM-61904

```release-note
NONE
```

Co-authored-by: Mattermost Build <build@mattermost.com>
2025-01-17 11:11:32 +05:30
..
2024-03-13 10:26:06 +05:30
2025-01-13 20:23:09 +01:00