Why this change?
On CI, we have been seeing the "handles job concurrency" job timing out
on CI after 45 seconds. Upon closer inspection of `Jobs::Base#perform`
when cluster concurrency has been set, we see that a thread is spun up
to extend the expiring of a redis key by 120 seconds every 60 seconds
while the job is still being executed. The thread looks like this before
the fix:
```
keepalive_thread =
Thread.new do
while parent_thread.alive? && !finished
Discourse.redis.without_namespace.expire(cluster_concurrency_redis_key, 120)
sleep 60
end
end
```
In an ensure block of `Jobs::Base#perform`, the thread is stop by doing
something like this:
```
finished = true
keepalive_thread.wakeup
keepalive_thread.join
```
If the thread is sleeping, `keepalive_thread.wakeup` will stop the
`sleep` method and run the next iteration causing the thread to
complete. However, there is a timing issue at play here. If
`keepalive_thread.wakeup` is called at a time when the thread is not
sleeping, it will have no effect and the thread may end up sleeping for
60 seconds which is longer than our timeout on CI of 45 seconds.
What does this change do?
1. Change `sleep 60` to sleep in intervals of 1 second checking if the
job has been finished each time.
2. Add `use_redis_snapshotting` to `Jobs::Base` spec since Redis is
involved in scheduling and we want to ensure we don't leak Redis
keys.
3. Add `ConcurrentJob.stop!` and `thread.join` to `ensure` block in "handles job concurrency"
test since a failing expectation will cause us to not clean up the
thread we created in the test.
We're changing the implementation of trust levels to use groups. Part of this is to have site settings that reference trust levels use groups instead. It converts the min_trust_to_flag_posts site setting to flag_post_allowed_groups.
Note: In the original setting, "posts" is plural. I have changed this to "post" singular in the new setting to match others.
This change converts the min_trust_to_create_topic site setting to
create_topic_allowed_groups.
See: https://meta.discourse.org/t/283408
- Hides the old setting
- Adds the new site setting
- Add a deprecation warning
- Updates to use the new setting
- Adds a migration to fill in the new setting if the old setting was
changed
- Adds an entry to the site_setting.keywords section
- Updates tests to account for the new change
- After a couple of months, we will remove the min_trust_to_create_topicsetting entirely.
Internal ref: /t/117248
The most common thing that we do with fab! is:
fab!(:thing) { Fabricate(:thing) }
This commit adds a shorthand for this which is just simply:
fab!(:thing)
i.e. If you omit the block, then, by default, you'll get a `Fabricate`d object using the fabricator of the same name.
This commit fixes an issue where when some actions were done
(deleting/recovering post, moving posts) we updated the
topic_users.bookmarked column to the wrong value. This was happening
because the SyncTopicUserBookmarked job was not taking into account
Topic level bookmarks, so if there was a Topic bookmark and no
Post bookmarks for a user in the topic, they would have
topic_users.bookmarked set to false, which meant the bookmark would
no longer show in the /bookmarks list.
To reproduce before the fix:
* Bookmark a topic and don’t bookmark any posts within
* Delete or recover any post in the topic
c.f. https://meta.discourse.org/t/disappearing-bookmarks-and-expected-behavior-of-bookmarks/264670/36
We updated scheduled admin checks to run concurrently in their own jobs. The main reason for this was so that we can implement re-check functionality for especially flaky checks (e.g. group e-mail credentials check.)
This works in the following way:
1. The check declares its retry policy using class methods.
2. A block can be yielded to if there are problems, but before they are committed to Redis.
3. The job uses this block to either a) schedule a retry if there are any remaining or b) do nothing and let the check commit.
This PR does some preparatory refactoring of scheduled admin checks in order for us to be able to do custom retry strategies for some of them.
Instead of running all checks in sequence inside a single, scheduled job, the scheduled job spawns one new job per check.
In order to be concurrency-safe, we need to change the existing Redis data structure from a string (of serialized JSON) to a list of strings (of serialized JSON).
If a codeblock contains **exactly** the same markdown as an image which has been retrieved by the 'pull hotlinked' job, then it will be replaced with the new URL. This commit adds failing (skipped) tests for this issue.
This is a bug that happens only when the current date is less than 90 days from a date on which the time zone transitions into or out of Daylight Savings Time.
In these conditions, bulk invites show the time of day of their expiration as being 1 hour later than the current time.
Whereas it should match the time of day the invite was generated.
This is because the server has not been using the user's timezone in calculating the expiration time of day. This PR fixes issue by considering the user's timezone when doing the date math.
https://meta.discourse.org/t/bulk-invite-logic-to-generate-expire-date-bug/274689
After fbe0e4c we always pass a block into these methods.
So yield inside the export methods works and there is no need
anymore to wrap them into enumerators.
Our code assumed the content_range interval was inclusive, but they are open-ended due to Postgres' [discrete range types](https://www.postgresql.org/docs/current/rangetypes.html#RANGETYPES-DISCRETE), meaning [1,2] will be represented as [1,3).
It also fixes some flaky tests due to test data not being correctly setup and the registry not being resetted after each test.
When we receive the stream parameter, we'll queue a job that periodically publishes partial updates, and after the summarization finishes, a final one with the completed version, plus metadata.
`summary-box` listens to these updates via MessageBus, and updates state accordingly.
A previous change updated `ReviewableQueuedPost`'s `created_by`
to be consistent with other reviewable types. It assigns
the the creator of the post being queued to `target_created_by` and sets
the `created_by` to the creator of the reviewable itself.
This fix updates some of the `created_by` references missed during the
intial fix.
We recently introduced this advice to admins when some translation overrides are outdated or using unknown interpolation keys:
However we missed the case where the original translation key has been renamed or altogether removed. When this happens they are no longer visible in the admin interface, leading to the confusing situation where we say there are outdated translations, but none are shown.
Because we don't explicitly handle this case, some deleted translations were incorrectly marked as having unknown interpolation keys. (This is because I18n.t will return a string like "Translation missing: foo", which obviously has no interpolation keys inside.)
This change adds an additional status, deprecated for TranslationOverride, and the job that checks them will check for this status first, taking precedence over invalid_interpolation_keys. Since the advice only checks for the outdated and invalid_interpolation_keys statuses, this fixes the problem.
This PR adds a feature to help admins stay up-to-date with their translations. We already have protections preventing admins from problems when they update their overrides. This change adds some protection in the other direction (where translations change in core due to an upgrade) by creating a notice for admins when defaults have changed.
Terms:
- In the case where Discourse core changes the default translation, the translation override is considered "outdated".
- In the case above where interpolation keys were changed from the ones the override is using, it is considered "invalid".
- If none of the above applies, the override is considered "up to date".
How does it work?
There are a few pieces that makes this work:
- When an admin creates or updates a translation override, we store the original translation at the time of write. (This is used to detect changes later on.)
- There is a background job that runs once every day and checks for outdated and invalid overrides, and marks them as such.
- When there are any outdated or invalid overrides, a notice is shown in admin dashboard with a link to the text customization page.
Known limitations
The link from the dashboard links to the default locale text customization page. Given there might be invalid overrides in multiple languages, I'm not sure what we could do here. Consideration for future improvement.
What is the problem?
When an admin changes the default_sidebar_categories or default_sidebar_tags site settings and opts to backfill the setting,
we currently enqueue a sidekiq job to run the backfilling operation. When an admin changes those settings multiple times
within a short time frame, multiple sidekiq jobs with different backfilling parameters will be enqueued.
This is problematic if multiple jobs are executed concurrently as it may lead to situations where a job
with “outdated” site setting values is completed after a job with the “latest” site setting values.
What is the fix?
By setting `cluster_concurrency` to `1`, we ensure that only one of such
backfilling job will execute across all the sidekiq processes that are
deployed at any point in time. Since Sidekiq pops off job in the order
in which they are pushed, limiting the cluster concurrency here will
allow us to execute the enqueued `Jobs::BackfillSidebarSiteSettings`
jobs serially.
While we are unable to support OAUTH2 with pop3 (due to upstream dependency ruby/net-pop#16), we are adding the support for mail pollers plugin. Doing so, it would be possible to write a plugin which then uses other ways (microsoft graph sdk for example) to poll emails from a mailbox.
The idea is that a plugin would define a class which inherits from Email::Poller and defines a poll_mailbox static method which returns an array of strings. Then the plugin could call register_mail_poller(<class_name>) to have it registered. All the configuration (oauth2 tokens, email, etc) could be managed by sitesettings defined in the plugin.
This change adds support retroactively updating display names in the new quote format when the user's name is changed. It happens through a background job that is triggered by a callback when a user is saved with a new name.
Anonymization is among the most expensive operations we can perform with
extreme potential to impact the database. To mitigate risk we only allow a
single anonymization across the entire cluster concurrently.
This commit introduces support for `cluster_concurrency 1`. When you set that on a Job it will only allow 1 concurrent execution per cluster.
SearchIndexer is only automatically disabled in `before_all` and `before` blocks which means at the start
of test runs. Enabling the SearchIndexer in one `fab!` block will affect
all other `fab!` blocks which is not ideal as we may be indexing stuff
for search when we don't need to.
On the client-side, message-bus subscriptions and reviewable count UI is based on the 'redesigned_user_menu_enabled' boolean. We need to use the same logic on the server-side to ensure things work correctly when legacy navigation is used alongside the new user menu.
Currently, we’re performing a check when a user is suspended in the
`UserEmail` job and we’re assuming a `post` is always available, which
is not the case. The code indeed breaks when the job is called with the
`account_suspended` type option.
This patch fixes this issue by making the check use the safe navigation
operator, thus making it working when `post` is not provided.
There is no need to validate the user's emails when
promoting/demoting their trust level, this can cause
issues in things like Jobs::Tl3Promotions, we don't
need to fail in that case when all we are doing is changing
trust level.
Currently, when a suspended user belongs to a group PM (private message
with more than two people in it) and a staff member sends a message to
this group PM, then the suspended user will receive an email.
This happens because a suspended user can only receive emails from staff
members. But in this case, this can be seen as a bug as the expected
behavior would be instead to not send any email to the suspended user. A
staff member can participate in active discussions like any other
member and so their messages in this context shouldn’t be treated
differently than the ones from regular users.
This patch addresses this issue by checking if a suspended user receives
a message from a group PM or not. If that’s the case then an email won’t
be sent no matter if the post originated from a staff member or not.
* UX: add type tag and design update
* UX: clarify status copy in reviewQ
* DEV: switch to selectKit
* UX: color approve/reject buttons in RQ
* DEV: regroup actions
* UX: add type tag and design update
* UX: clarify status copy in reviewQ
* Join questions for flagged post with "or" with new I18n function
* Move ReviewableScores component out of context
* Add CSS classes to reviewable-item based on human type
* UX: add table header for scoring
* UX: don't display % score
* UX: prefix modifier class with dash
* UX: reviewQ flag table styling
* UX: consistent use of ignore icon
* DEV: only show context question on pending status
* UX: only show table headers on pending status
* DEV: reviewQ regroup actions for hidden posts
* UX: reviewQ > approve/reject buttons
* UX: reviewQ add fadeout
* UX: reviewQ styling
* DEV: move scores back into component
* UX: reviewQ mobile styling
* UX: score table on mobile
* UX: reviewQ > move meta info outside table
* UX: reviewQ > score layout fixes
* DEV: readd `agree_and_keep` and fix the spec tests.
* Fix the spec tests
* fix the quint test
* DEV: readd deleting replies
* UX: reviewQ copy tweaks
* DEV: readd test for ignore + delete replies
* Remove old
* FIX: Add perform_ignore back in for backwards compat
* DEV: add an action alias `ignore` for `ignore_and_do_nothing`.
---------
Co-authored-by: Martin Brennan <martin@discourse.org>
Co-authored-by: Vinoth Kannan <svkn.87@gmail.com>
- Reduce duplication of terms in post index from unlimited to 6. This will
result in reduced index size and reduced weighting for posts containing
a huge amount of duplicate terms. (Eg: a post containing "sam sam sam sam
sam sam sam sam", will index as "sam sam sam sam sam sam", only including
the word up to 6 times.) This corrects a flaw where title weighting could
be ignored.
- Prioritize exact matches of words in titles. Our search always performs
a prefix match. However we want to give special weight to exact title matches
meaning that a search for "sum" will find topics such as "the sum of us" vs
"summer in spring".
- Pick up fixes to our search algorithm which are missing from old indexes.
Specifically pick up the fix that indexes URLs properly. (`https://happy.com`
was stemmed to `happi` in keywords and then was not searchable)
see also:
https://meta.discourse.org/t/refinements-to-search-being-tested-on-meta/254158
Indexing will take a while and work in batches, in the background.
So it can easily be overwritten in a plugin for example.
### Added more tests to provide better coverage
We previously only had `u.silenced_till IS NULL` but I made it consistent with pretty much every other places where we check for "active" users.
These two new lines do change the query a tiny bit though.
**Before**
- You could not get the badge if you were currently silenced (no matter what period is being checked)
- You could get the badge if you were suspended 😬
**After**
- You can't get the badge if you were silenced during the past year
- You can't get the badge if you were suspended during the past year
### Improved the performance of the query by using `NOT EXISTS` instead of `LEFT JOIN / COUNT() = 0`
There is no difference in behaviour between
```sql
LEFT JOIN user_badges AS ub ON ub.user_id = u.id AND ...
[...]
HAVING COUNT(ub.*) = 0
```
and
```sql
NOT EXISTS (SELECT 1 FROM user_badges AS ub WHERE ub.user_id = u.id AND ...)
```
The only difference is performance-wise. The `NOT EXISTS` is 10-30% faster on very large databases (aka. posts and users in X millions). I checked on 3 of the largest datasets I could find.
```
class Jobs::DummyDelayedJob < Jobs::Base
def execute(args = {})
end
end
RSpec.describe "Jobs.run_immediately!" do
before { Jobs.run_immediately! }
it "explodes" do
current_user = Fabricate(:user)
Jobs.enqueue_in(1.seconds, :dummy_delayed_job)
sign_in(current_user)
end
end
```
The test above will fail with the following error if `ActiveRecord::Base.connection_handler.clear_active_connections!` is called before the configured Capybara server checks out a connection from the connection pool.
```
ActiveRecord::ActiveRecordError:
Cannot expire connection, it is owned by a different thread: #<Thread:0x00007f437391df58@puma srv tp 001 /home/tgxworld/.asdf/installs/ruby/3.1.3/lib/ruby/gems/3.1.0/gems/puma-6.0.2/lib/puma/thread_pool.rb:106 sleep_forever>. Current thread: #<Thread:0x00007f437d6cfc60 run>.
```
We're not exactly sure if this is an ActiveRecord bug or not but we've
invested too much time into investigating this problem. Fundamentally,
we also no longer understand why `ActiveRecord::Base.connection_handler.clear_active_connections!` is being called in an ensure block
within `Jobs::Base#perform` which was added in
ceddb6e0da 10 years ago. This
commit moves the logic for running jobs immediately out of the
`Jobs::Base#perform` method into another `Jobs::Base#perform_immediately` method such that
`ActiveRecord::Base.connection_handler.clear_active_connections!` is not
called. This change will only impact the test environment.