By optionally passing the request object to Page.get_sitemap_urls() it
will now use the cached site root on the request object instead of
retrieving it for each call. This cuts the number of queries required
for a sitemap roughly in half.
After migrating a Wagtail-based site from MySQL to Postgres, we
noticed that malicious requests to the site that included percent-
encoded Unicode NULLs (`%00`) raised a `ValueError` exception that we
hadn't seen when using MySQL: `A string literal cannot contain NUL
(0x00) characters.` This appears to relate to `psycopg2`'s decision to
raise an exception in these situations, as discussed here:
https://github.com/psycopg/psycopg2/issues/420
While newer versions of Django appear to provide some field validation
that addresses these characters, it doesn't look like Wagtail's
redirect middleware is making use of those validators, and so it seemed
reasonable to clean these characters in the context of 'normalizing'
the paths before looking for corresponding redirects -- especially
since a quick investigation on the internet suggests that U+0000 in
URLs can be used as a means of attack, and also since RFC 3986 says:
Note, however, that the "%00" percent-encoding (NUL) may require
special handling and should be rejected if the application is not
expecting to receive raw data within a component.