For pages that do not specify canonical URL we will default to `https://SITENAME/PATH`.
This ensures that if a URL is crawled on the CDN the search ranking will transfer to the main site.
Additionally we whitelist the `?page` param
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.
Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging