Re-organise and re-structure the ``linkcheck`` builder:
- All functions defined within functions are factored out into top-level functions or class methods
- Classes and methods have been re-arranged (Builder, PostTransform, Checker, Worker)
- TLS verification on ``sphinx.util.requests`` has been changed to not pass the ``Config`` object all the way down
- The ``Hyperlink`` object now stores the document path
- ``BuildEnvironment`` and ``Config`` objects are used to extract properties and are not stored as class attributes
TLS operates at a lower layer than HTTP, and so if there is a TLS-related error from a host,
it seems unlikely that retrying with a different higher-layer protocol request
(HTTP GET instead of HTTP HEAD) could succeed.
We should not make additional HTTP requests that we do not believe will succeed.
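The reasoning above can be sketched roughly as follows. This is an illustration only, not the builder's actual code: `check_uri`, `head` and `get` are hypothetical names, and the stdlib `ssl.SSLError` stands in for whatever TLS exception the HTTP library raises.

```python
import ssl


def check_uri(uri, head, get):
    """Check `uri` with HEAD first, falling back to GET where that can help.

    `head` and `get` are hypothetical callables that perform the request
    and raise an OSError subclass on failure.
    """
    try:
        head(uri)
        return "working"
    except ssl.SSLError as exc:
        # TLS failed below the HTTP layer, so a GET to the same host
        # would hit the same failure. Report it without retrying.
        return f"broken: {exc}"
    except OSError:
        # Other failures may be specific to HEAD, so a GET is worth trying.
        pass
    try:
        get(uri)
        return "working"
    except OSError as exc:
        return f"broken: {exc}"
```

The `ssl.SSLError` clause has to come first because `ssl.SSLError` is itself a subclass of `OSError`.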
Close HTTP responses when no content reads are required.
When requests are made in streaming mode, ``requests`` cannot know
whether the caller intends to read content from the streamed HTTP
response object later, so it holds the socket open.
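A minimal sketch of the idea, assuming only that the response object exposes `status_code` and `close()` as a ``requests`` response does. `StubResponse` and `status_only` are hypothetical names used so the example runs without network access.

```python
from contextlib import closing


class StubResponse:
    """Hypothetical stand-in for a streamed HTTP response."""

    def __init__(self, status_code):
        self.status_code = status_code
        self.closed = False

    def close(self):
        self.closed = True


def status_only(response):
    # Only the status is needed, so close the response (and release its
    # connection) as soon as the status code has been read.
    with closing(response):
        return response.status_code
```

With the real library, a response obtained with `stream=True` can be used directly in a `with` statement to the same effect.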
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
When tests run in parallel, they fail with "Address already in use"
errors because all tests attempt to bind to the same port. Fix this
with a shared file-system lock.
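One way such a lock could look, using only the standard library. This is a sketch and not necessarily the mechanism used in the test suite: `port_lock`, `LOCK_PATH` and the polling parameters are assumptions, and stale lock files left by a crashed process are not handled.

```python
import contextlib
import os
import tempfile
import time

# Hypothetical lock file location shared by all test processes.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "linkcheck-test-server.lock")


@contextlib.contextmanager
def port_lock(path=LOCK_PATH, poll=0.05, timeout=30.0):
    """Hold an exclusive lock file while the test server owns the port."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if another process holds the lock.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {path}")
            time.sleep(poll)
    try:
        yield
    finally:
        os.close(fd)
        os.unlink(path)
```

A test would start its HTTP server inside `with port_lock(): ...`, so that only one process binds the port at a time.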
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Add raw directives' source URL to the list of links to check with linkcheck.
Also refactor `HyperlinkCollector` by adding an `add_uri` function.
Add test for linkcheck raw directives source URL
The approach of `rewrite_github_anchor` makes some anchors valid, but
it also makes other kinds of anchors invalid. This disables the handler
to make them valid again (for the 4.1.x releases).
The linkcheck builder now integrates `linkcheck_warn_redirects` into
`linkcheck_allowed_redirects`. As a result, the linkcheck builder will
emit a warning when a "disallowed" redirection is detected via
`linkcheck_allowed_redirects`.
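As an illustration, `linkcheck_allowed_redirects` maps a pattern for the source URI to a pattern for the canonical URI. The sketch below shows a sample setting and a simplified version of the matching; the `example.org` patterns and the `is_allowed_redirect` helper are hypothetical and not the builder's own code.

```python
import re

# conf.py: redirects from the bare domain to the "www" host are acceptable.
linkcheck_allowed_redirects = {
    r"https://example\.org/.*": r"https://www\.example\.org/.*",
}


def is_allowed_redirect(source, target, allowed=linkcheck_allowed_redirects):
    """Return True if the redirect source -> target matches an allowed pair."""
    return any(
        re.match(src_pattern, source) and re.match(dst_pattern, target)
        for src_pattern, dst_pattern in allowed.items()
    )
```

A redirect for which this returns `False` is the "disallowed" case that triggers the warning.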
Add a new confval, `linkcheck_warn_redirects`, to emit a warning when
a hyperlink is redirected. It is useful for detecting unexpected redirects
in warn-is-error mode.
Instead of using application members to access the builder and trigger a
build, use the main app interface.
It ensures the builder setup is realistic, builder cleanups are executed
and the build-finished events are emitted.
So far, linkcheck scans all references and images in the documents and
checks them in parallel. As a result, some URLs could be checked twice (or
more) because of a race condition.
This collects the URL via post-transforms, and removes duplicated URLs
before checking availability.
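The deduplication step might be sketched like this. The `Hyperlink` tuple and `collect_unique` are simplified, hypothetical stand-ins for what a post-transform would gather while walking each document.

```python
from typing import NamedTuple, Optional


class Hyperlink(NamedTuple):
    uri: str
    docname: str
    lineno: Optional[int]


def collect_unique(found):
    """Keep one Hyperlink per URI; the first occurrence wins."""
    hyperlinks = {}
    for uri, docname, lineno in found:
        # setdefault leaves an existing entry untouched, so duplicates
        # of a URI never reach the checker threads.
        hyperlinks.setdefault(uri, Hyperlink(uri, docname, lineno))
    return hyperlinks
```

Because each URI is queued only once, two workers can no longer race to check the same URL.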
refs: #4303
Linkcheck organizes the URLs to check in a PriorityQueue. The items are
tuples (priority, url, docname, lineno).
Tuples where the lineno is `None` are not comparable with tuples that
have an integer lineno, and PriorityQueue items must be comparable (see
https://bugs.python.org/issue31145).
Fixes an issue when a document contains two links to the same URL, one
with an int line number and the other without line number metadata (such
as an image :target: attribute).
Using 0 instead of None to represent no line number should not lead to
observable changes, because the result logger only logs the line number
when it is truthy.
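The comparison problem can be reproduced with the standard library alone. In this sketch, `drain_in_order` and the sample tuples are hypothetical; they only mirror the (priority, url, docname, lineno) layout described above.

```python
from queue import PriorityQueue


def drain_in_order(items):
    """Put every item on a PriorityQueue, then return them in queue order."""
    queue = PriorityQueue()
    for item in items:
        queue.put(item)
    return [queue.get() for _ in items]


URL = "https://example.org"

# Same priority, URL and docname: ordering falls through to lineno.
with_none = [(1, URL, "index", 12), (1, URL, "index", None)]
with_zero = [(1, URL, "index", 12), (1, URL, "index", 0)]

try:
    drain_in_order(with_none)
except TypeError as exc:
    # None and int cannot be ordered, so the second put() fails.
    print("not comparable:", exc)

# With 0 as the "no line number" marker, the tuples compare normally.
print(drain_in_order(with_zero))
```

Since 0 is falsy, code that only reports the line number when it is truthy behaves the same as it did with None.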
Close #8565