Cache
By default, the cache system is enabled. However, in order to actually generate cache entries, the following management script needs to be executed:
$ python manage.py publish
This will then generate all pages for which a cache entry exists. By default, all CMS-related pages are cached.
Cached files are written to public_html/cache/. By default, the path to the public_html folder is set up to be relative to the main installation folder of your application:
../../public_html/
For example, if your application is installed in the folder ~/app/myapp/ then the public_html folder is configured to be ~/public_html/. Therefore the cache is generated in ~/public_html/cache/.
The path to the cache folder can be customised via the CACHE_ROOT settings variable.
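As a quick check of the path arithmetic described above, the following sketch resolves the default relative path against an installation folder. The home directory /home/user and the helper name resolve_cache_root are assumptions made for illustration only; this is not Cubane's own code.

```python
import posixpath


def resolve_cache_root(install_dir, public_html='../../public_html/'):
    """Resolve the cache folder for a given installation folder.

    Hypothetical helper: it only mirrors the default layout described
    in the text (public_html two levels above the application folder).
    """
    # Join the relative public_html path onto the installation folder
    # and collapse the '..' segments.
    public_html_dir = posixpath.normpath(posixpath.join(install_dir, public_html))
    # The cache lives in a 'cache' sub-folder of public_html.
    return posixpath.join(public_html_dir, 'cache')


print(resolve_cache_root('/home/user/app/myapp/'))
```

With the assumed installation folder, the result is the cache folder inside public_html, two levels above the application.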
Web-server Integration
The idea is this: if there is a corresponding cached file in the cache folder for a given page, then that page should be served by the web server directly, without invoking WSGI, Python, Django or Cubane. Serving a static file this way is typically far faster than rendering the page dynamically.
A cached file should only be served if all of the items below hold true:
- The request is a GET or HEAD request.
- The request has an empty query string.
- A corresponding cached file exists in the cache folder.
If your web server is Apache 2, then the following configuration options may be used to express this logic:
# redirect non-www. to www.
RewriteCond %{HTTP_HOST} !^www.
RewriteRule ^(.*)$ https://www.%{HTTP_HOST}$1 [R=301,L]
# simple GET requests without query strings are cached (if file exists)
RewriteCond %{THE_REQUEST} ^(GET|HEAD)
RewriteCond %{QUERY_STRING} ^$
RewriteCond %{REQUEST_URI} ^([^.]+)$
RewriteCond %{DOCUMENT_ROOT}/cache/%1index.html -f
RewriteRule ^[^.]+$ /cache/%1index.html [QSA,L]
# accessing cache directly is forbidden
RewriteCond %{REQUEST_URI} ^/cache/.*$
RewriteRule ^/cache/.*$ - [F]
The first block redirects to https://www.example.com/ if the request host name does not start with www, e.g. https://example.com (we assume that HTTPS is used).
The second block is the actual URL rewrite based on the conditions identified above: if the request is a GET or HEAD request with an empty query string and a corresponding cached file exists, then the URL is rewritten to point to that cached file.
The last block prevents cached files from being requested directly. For example, https://www.example.com/cache/index.html should not be served directly unless the URL was produced by the rewrite rule.
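The decision made by the second block can also be sketched in Python, which may help when porting the logic to another web server. The function name cached_file_for and its parameters are hypothetical and not part of Cubane; the sketch only mirrors the three conditions and the path mapping of the rewrite rule.

```python
import os


def cached_file_for(method, path, query_string, cache_root):
    """Return the cached file that could be served directly, or None.

    Hypothetical helper mirroring the rewrite conditions; not Cubane code.
    """
    # Only GET and HEAD requests are candidates.
    if method not in ('GET', 'HEAD'):
        return None
    # The query string must be empty.
    if query_string:
        return None
    # Paths containing a dot (e.g. /media/logo.png) never match the rule.
    if '.' in path:
        return None
    # Same mapping as the rewrite rule: <cache folder>/<path>index.html
    candidate = os.path.normpath(cache_root + '/' + path + 'index.html')
    # A cached file is only used if it actually exists.
    return candidate if os.path.isfile(candidate) else None
```

Note that, like the rewrite rule, this mapping appends index.html directly to the request path, so only paths ending in a slash can resolve to an existing cached file.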
Invalidation
The cache can be invalidated by simply running the following management script:
$ python manage.py invalidate
The cache is also invalidated whenever a relevant entity is changed through the backend system.
When invalidating the cache, all cached files are renamed by prefixing the file names with a dot character (.). This process ensures that:
- Cached files no longer match, so any incoming request is dispatched via Python, Django and Cubane.
- The content of all cached files still exists and can be put back in place very quickly.
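The renaming scheme can be illustrated with a minimal sketch. The function invalidate_cache below is a hypothetical stand-in written for this example, not Cubane's actual implementation; it assumes the cache is a plain directory tree of files.

```python
import os


def invalidate_cache(cache_root):
    """Prefix every cached file with a dot so it no longer matches.

    Hypothetical sketch of the scheme described above. Returns the
    number of files that were renamed.
    """
    renamed = 0
    for folder, _dirs, files in os.walk(cache_root):
        for name in files:
            if name.startswith('.'):
                continue  # already invalidated
            # index.html becomes .index.html; the content is untouched.
            os.replace(os.path.join(folder, name),
                       os.path.join(folder, '.' + name))
            renamed += 1
    return renamed
```

Because only the file name changes, restoring a file later is a single rename in the opposite direction.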
Detecting Changes
When generating the cache, a page does not need to be rendered again if both of the following conditions hold true:
- The cached file has been invalidated but still exists (prefixed with a dot character).
- The content has not changed according to an analysis of last modification timestamps.
If the content did not change then the previously generated cache file is simply renamed back to its original file name.
Otherwise, if the content did change, then the old file is replaced with new content that is generated for the entire page.
Before rendering a page, a template context is derived which contains information about model instances such as the current page, navigation items, footer elements and other entities.
Cubane makes the following assumptions:
- It is safe to materialise all database queries that are provided via the template context prior to rendering the page.
- A template context only contains information that is relevant to rendering the corresponding page.
- All relevant model instances have been derived from cubane.models.DateTimeBase or have a timestamp property with the name updated_on indicating the date and time when the last modification of the entity has been made.
Based on this context, Cubane scans through all entities and determines the latest timestamp value it can find on any entity. This may represent a timestamp of one of the following items:
- The current page
- Any related page
- Any item that is presented in any navigation section
- The general timestamp when the last code deployment has happened.
- A general timestamp when an item has been deleted the last time.
Based on this data, Cubane works out whether the latest timestamp derived from the template context is before or after the timestamp of the cached file. If the cached file’s timestamp is later, the cached file is restored without having to render the page again. Otherwise, the page is rendered again, replacing any cached content.
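The comparison can be sketched as follows. The helper names latest_timestamp and can_restore, the Entity tuple and the example dates are all assumptions made for illustration; the only detail taken from the text is that entities expose an updated_on timestamp and that the cached file is restored when its timestamp is later.

```python
from collections import namedtuple
from datetime import datetime

# Stand-in for a model instance with an updated_on timestamp (hypothetical).
Entity = namedtuple('Entity', ['name', 'updated_on'])


def latest_timestamp(entities, deployed_on, last_deleted_on):
    """Return the most recent timestamp relevant to a page."""
    stamps = [deployed_on, last_deleted_on]
    stamps.extend(entity.updated_on for entity in entities)
    # Ignore timestamps that are not available (None).
    return max(stamp for stamp in stamps if stamp is not None)


def can_restore(cached_file_timestamp, latest):
    """True if the invalidated file is newer than every relevant change."""
    return cached_file_timestamp > latest


entities = [
    Entity('page', datetime(2020, 1, 10)),
    Entity('navigation item', datetime(2020, 1, 12)),
]
latest = latest_timestamp(entities, datetime(2020, 1, 5), None)
print(can_restore(datetime(2020, 1, 15), latest))  # cached file is newer
print(can_restore(datetime(2020, 1, 11), latest))  # navigation item changed later
```

In the first call the cached file would simply be renamed back; in the second the page would be rendered again.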
Clear Cache
The entire cache can be removed by running the following management command:
$ python manage.py clearcache
Adding cached entries
Additional cached entries can be added by adding the corresponding pages to the sitemap with the cached argument set to True.
By default, custom entries that are added to the sitemap are not cached. If an entry should be cached then the cached argument of the cubane.cms.views.CustomSitemap.add() or cubane.cms.views.CustomSitemap.add_url() method needs to be set to True.
For example:
from cubane.cms.views import CMS

class MyCMS(CMS):
    def on_custom_sitemap(self, sitemap):
        super(MyCMS, self).on_custom_sitemap(sitemap)
        sitemap.add_url('/my-custom-url/', cached=True)
Then the cache system will generate a cached version of the corresponding page with the URL /my-custom-url/ once the publish management command is executed.
Backend Integration
By default, the CMS system will add a Publish button. The publish button is only visible if the cache has been invalidated. Once the button has been pressed the cache system will generate all cache items and the button will disappear again.
The button is not visible if the cache system has been disabled via the CACHE_ENABLED settings variable.
Sometimes you may not want to have such a button at all. For example, you could simply execute the publish command periodically as a cron job, so that the cache system always publishes automatically after a certain period of time. This is safe to do since the publish command cannot run multiple times simultaneously. Running the management script will terminate any instance that might be running at the moment and will then continue to execute.
The button can be removed even with the cache system still being activated by setting the settings variable CACHE_PUBLISH_ENABLED to False.
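As a minimal sketch, a project that publishes via a cron job and therefore hides the button might combine the two settings named above as follows. The placement in a settings.py file is an assumption based on the usual Django project layout.

```python
# settings.py (sketch; file name assumed from the usual Django layout)

# Keep the cache system itself active.
CACHE_ENABLED = True

# Hide the Publish button in the backend; publishing is then expected
# to happen by other means, e.g. a periodic run of the publish command.
CACHE_PUBLISH_ENABLED = False
```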