* MM-24547: Fix writer leak when connection closes
When the connection is closed, the exit path does not
shut down the writer goroutine. In which case, it will keep spinning forever.
Since we already have the CAS mechanism now, we can move the closing
functionality into the main Close method and just call that in the defer block.
This makes closing the websocket client idempotent from both perspective -
- Explicitly closing.
- Closing due to connection tear down.
There are still 2 races left:
- Using the exported Conn to directly write messages. We cannot do anything about
that as long as clients directly using that.
- Setting the wsc.pingTimeoutTimer field in a separate goroutine when calling
.Connect(). This will need to be seen later.
* Fix ineffectual assignment
* Duplicate the closing of writer
The problem with refactoring the writer closing to a common
function was that we needed to wait for the reader to exit
before closing the EventChannel and ResponseChannel.
But then there is another problem that the API can be used in such
a way that the client is liable to call Close without even calling
Listen. In that case, we cannot wait for Listen to quit.
So from Close, we can only close the connection. And therefore
we need to duplicate the writer closing in the read loop's
defer block.
* Cleanup some comments
Given that this query is part of the top 5 most used queries
we want to move it to use raw queries instead of gorp so we
can get rid of the reflection overhead
* MM-24395: Set response header to be attachment for SVG images.
We check the file extension and appropriately set the response header.
SVG images without a file extension aren't proxied at all. So there's no
problem with that.
* Add test
* Improve test a bit
* Capture content type
* incorporate comments
* Set attachment mode in case of error too
* MM-24397: Reusing the read buffer while reading messages from websockets
The core problem was that conn.ReadMessage allocated a buffer every time it was read.
This created heavy amount of allocations every single time we read a message from the websocket.
To avoid this, we bypass the ReadMessage which was more of a helper method,
and actually call the NextReader which returns a reader object.
We can then reuse a single byte.Buffer instance to read the object unmarshal
into a WebSocketEvent object. This gets rid of the allocation in the read path completly
and allows GC more time to do other tasks.
* Incorporate review comments
* Move reset buffer to top of loop
* Cleanup further
* Fix test
* Final fix
* Fixing system messages about non-visible users
* Adding unit tests to verify the new behavior
* Regenerating app layers
Co-authored-by: mattermod <mattermod@users.noreply.github.com>
We invert the skipSend condition and outdent the remaining block
to make the code a bit more idiomatic.
While here, we also change the dropping message level from info to
warn because that's what it should be.
On user activity, we were clearing the job.pendingNotifications map.
But we had already created a copy of the notifications slice while
iterating the map. Therefore, if we pass the copied slice, it would
still have the old notifications which were originally deleted.
The unit tests would not catch this because it was testing the
job.pendingNotifications map and not actually checking if the email
handler was being called or not. We fix that now.
Co-authored-by: mattermod <mattermod@users.noreply.github.com>
* MM-23264 Add api endpoint for get groups with members in channel
Add store tests
Add tests for api func
Gofmt
Apply changes from code review
* MM-23264 Make store layers
* MM-23264 Check read permission on channel member counts
* Trigger CI
* MM-23800: remove goroutineID and stack printing
Each hub has a goroutineID which is calculated with a known hack.
The FAQ clearly explains why goroutines don't have an id:
https://golang.org/doc/faq#no_goroutine_id.
We only added that because sometimes the hub would be deadlocked and
having the goroutineID would be useful when getting the stack trace.
This is also problematic in stress tests because the hubs would
frequently get overloaded and the logs would unnecessarily have stack traces.
But that was in the past, and we have done extensive testing with
load tests and fuzz testing to smooth any rough edges remaining.
Including adding additional metrics for hub buffer size.
Monitoring the metrics is a better way to approach this problem.
Therefore, we remove these kludges from the code.
* Also remove deadlock checking code
There is no need for that anymore since
we are getting rid of the stack printing anyways.
Let's do a wholesale refactor and clean up the codebase.
* MM-23805: Refactor web_hub
This is a beginning of the refactoring of the websocket code.
To start off with, we unexport some methods and constants which did not
need to be exported. There are more remaining but some are out of scope for this PR.
The main chunk of refactor is to unexport the webconn send channel
which was the main cause of panics. Since we were directly sending
to the connection from various parts of the codebase, it would be possible
that the send channel would be closed and we could still send a message.
This would crash the server.
To fix this, we refactor the code to centralize all sending from the main
hub goroutine. This means we can leverage the connections map to check
if the connection exists or not, and only then send the message.
We also move the cluster calls to cluster.go.
* bring back cluster code inside hub
* Incorporate review comments
* Address review comments
* rename index
* MM-23807: Refactor web_conn
- Unexport some struct fields and constants which are not necessary
to be accessed from outside the package. This will help us moving
the entire websocket handling code to a separate package later.
- Change some empty string checks to check for empty string rather
than doing a len check which is more idiomatic. Both of them compile
to the same code. So it doesn't make a difference performance-wise.
- Remove redundant ToJson calls to get the length.
- Incorporate review comments
- Unexport some more methods
* Fix field name
* Run make app-layers
* Add note on hub check
The current code path for `CreatePostAsUser` tries to update the `LastViewedAt` for bots, which logs a warning message if the bot isn't actually in the channel.
This pull request changes the semantics of bot posting to not update the bot's `LastViewedAt` timestamp, avoiding this log altogether. It matches the semantics of `from_webhook`, but notably makes bots slightly less like users in that they no longer "read" channels when they post. This seems reasonable, but I'm both looking for validation of this semantic change in addition to the code review.
Fixes: https://mattermost.atlassian.net/browse/MM-23926
Co-authored-by: mattermod <mattermod@users.noreply.github.com>
* MM-23800: remove goroutineID and stack printing
Each hub has a goroutineID which is calculated with a known hack.
The FAQ clearly explains why goroutines don't have an id:
https://golang.org/doc/faq#no_goroutine_id.
We only added that because sometimes the hub would be deadlocked and
having the goroutineID would be useful when getting the stack trace.
This is also problematic in stress tests because the hubs would
frequently get overloaded and the logs would unnecessarily have stack traces.
But that was in the past, and we have done extensive testing with
load tests and fuzz testing to smooth any rough edges remaining.
Including adding additional metrics for hub buffer size.
Monitoring the metrics is a better way to approach this problem.
Therefore, we remove these kludges from the code.
* Also remove deadlock checking code
There is no need for that anymore since
we are getting rid of the stack printing anyways.
Let's do a wholesale refactor and clean up the codebase.
Co-authored-by: mattermod <mattermod@users.noreply.github.com>
* Remove unnecessary check for PERMISSION_LIST_TEAM_CHANNELS
In the autocompleteChannelsForTeamForSearch method we're checking for
the PERMISSION_LIST_TEAM_CHANNELS permission in order to avoid filtering
channels in the autocomplete search but this check is not necessary.
Now we're going directly to the database to search for those channels
in this specific method and we're filtering by channel membership and
team so there is no chance that we are going to filter undesired
channels to the user.
[Here](https://github.com/mattermost/mattermost-server/blob/v5.22.0/store/sqlstore/channel_store.go#L2014)
is the query where you can see the filtering we're making
* MM-23017 Check group mentions as part of notification logic
* Add nil groups to existing test cases
* MM-23017 Add tests for insertGroupMention and addGroupMention
* MM-23017 Add tests for getExplicitMentions that have groups
* Add tests for group store GetMemberUsersNotInChannel
* MM-23017 Add tests for AllowGroupMentions
* MM-23017 Fix error message name
* MM-23017 Swap Checks to Name
* MM-23017 Code review fixes
* Rename var and fix allowGroupMentions test
* MM-23017 Use GetMemberUsersInTeam inside of insertGroupMentions
* MM-23017 use group mentions permission
* Actually call GetMemberUsersInTeam
* Remove unnecessary new line
* Uncomment filter allow reference
* MM-23017 Fix group channel notifications
* Update store layer
* MM-23017 Improve test coverage for group channels
* Trigger CI
* Trigger CI
* MM-20934: Fixing int overflow in 32 bits on MaxImageSize check
* Adding comments explaining the casting and the bug fixed there
* Apply suggestions from code review
Co-Authored-By: Juho Nurminen <juhonurm@gmail.com>
* Fixing store layers
Co-authored-by: Juho Nurminen <juhonurm@gmail.com>