Our 'page_view_crawler' / 'page_view_anon' metrics are based purely on the User Agent sent by clients. This means that 'badly behaved' bots which are imitating real user agents are counted towards 'anon' page views.
This commit introduces a new method of tracking visitors. When an initial HTML request is made, we assume it is a 'non-browser' request (i.e. a bot). Then, once the JS application has booted, we notify the server to count it as a 'browser' request. This reliance on a JavaScript-capable browser matches up more closely to dedicated analytics systems like Google Analytics.
Existing data collection and graphs are unchanged. Data collected via the new technique is available in a new 'experimental' report.
In dev/prod, these are absorbed by unicorn. Most commonly, they occur when a client interrupts a message-bus long-polling request.
Also reverts the EPIPE workaround introduced in 011c9b9973
This was previously disabled because of incompatibility with the ember-cli proxy. This commit fixes that incompatibility, and restores the development behaviour to match production.
There were three issues at play:
1. Our bootstrap-js addon handles the forwarding of most requests in the ember-cli proxy. This is not built to handle streaming responses. Solution: skip our custom request processing for `/message-bus/*` and use ember-cli's default `http-proxy`.
2. The request/response size-limiting middleware (`rawMiddleware`) would apply even to unhandled paths, causing request and response bodies to be buffered. Solution: skip it for any paths which are not handled by our custom addon.
3. Expressjs servers will buffer/compress responses. Solution: add `Cache-Control: no-transform` to message-bus responses. For now I've done this in development only, but it may be useful to add it to message-bus's default headers in future
MessageBus::Diagnostics allows anyone with access to carry out certain
operations that may result in a denial of service. The impact of this is
greater on multisiite clusters.
It makes much more sense for these to be GlobalSettings, since, in
multisite clusters, only the default site's settings would be respected.
Co-authored-by: David Taylor <david@taylorhq.com>
Fixes `Rack::Lint::LintError: a header value must be a String, but the value of 'Retry-After' is a Integer`. (see: 14a236b4f0/lib/rack/lint.rb (L676))
I found it when I got flooded by those warning a while back in a test-related accident 😉 (ember CLI tests were hitting a local rails server at a fast rate)
Previously we would consider a user "present" and "last seen" if the
browser window was visible.
This has many edge cases, you could be considered present and around for
days just by having a window open and no screensaver on.
Instead we now also check that you either clicked, transitioned around app
or scrolled the page in the last minute in combination with window
visibility
This will lead to more reliable notifications via email and reduce load of
message bus for cases where a user walks away from the terminal
The message_bus performs a fair amount of work prior to hijacking requests
this change ensures that if there is a situation where the server is flooded
message_bus will inform client to back off for 30 seconds + random(120 secs)
This back-off is ultra cheap and happens very early in the middleware.
It corrects a situation where a flood to message bus could cause the app
to become unresponsive
MessageBus update is here to ensure message_bus gem properly respects
Retry-After header and status 429.
Under normal state this code should never trigger, to disable raise the
value of DISCOURSE_REJECT_MESSAGE_BUS_QUEUE_SECONDS, default is to tell
message bus to go away if we are queueing for 100ms or longer
Sometimes we would like to create a base image without any DB access, this
assists in creating custom base images with custom plugins that already
includes `public/assets`
Following this change set you can run:
```
SPROCKETS_CONCURRENT=1 DONT_PRECOMPILE_CSS=1 SKIP_DB_AND_REDIS=1 RAILS_ENV=production bin/rake assets:precompile
```
Then it is straight forward to create a base image without needing a DB or
Redis.
This backend is a bit faster and well tested, this is part of a longer
term plan to have a `backend: :memory, threaded: false` type config for
message bus which we can use in test.
The threading in message bus causes all sorts of surprises in test, it will
be nice not to be beholden to them.
Adds `DISCOURSE_MESSAGE_BUS_REDIS_ENABLED` env var, that when set
to true, will allow Discourse to connect to a different redis
instance for MessageBus needs.
When enabled you can configure the same env vars user for redis,
but prefixed by `MESSAGE_BUS`, eg:
`DISCOURSE_MESSAGE_BUS_REDIS_HOST`
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.
Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging