I was developing a small app as a playground and confidence builder, choosing Django, Heroku and S3 as resources. One of the packages I used is easy-thumbnails.
Behaviour
I chose to use the easy-thumbnails app with S3, a possibility granted by the storages framework. At a glance, everything works OK; thumbnails are created and rendered accordingly. However, on a thumbnail-heavy page, I (and my profiling app) noticed significant load times (3s for a page with 7 thumbnails).
Drilling down, it became evident that something was checking the S3 bucket for each thumbnail (an httplib call occurred for each one).
Cause
TL;DR: If you use the default ImageField, the thumbnail storage is created by the Thumbnailer framework. Each time the storage object is created, it checks for the bucket's existence. This means a query to S3. Lots of images equals lots of queries to S3, checking for a bucket's existence.
Logic goes something like this:
- In thumbnail.py: get_thumbnailer(source)[alias]
- In easy_thumbnails.files.py, executing line 52 results in creating a new ThumbnailerFieldFile() object
- easy_thumbnails.fields.py has a ThumbnailerField() object created, which uses the thumbnail_storage parameter. This is empty in the above call, which later results in creating the storage object
- In easy_thumbnails.files.py, the Thumbnailer class constructor contains:
      if not thumbnail_storage:
          thumbnail_storage = get_storage_class(
              settings.THUMBNAIL_DEFAULT_STORAGE)()
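To make the chain above concrete, here is a minimal sketch that runs without Django or S3. FakeS3Storage and FakeThumbnailer are hypothetical stand-ins I made up for illustration, not easy-thumbnails classes; the only part taken from the library is the shape of the "if not thumbnail_storage" fallback. The counter stands in for the bucket-existence request.

```python
class FakeS3Storage:
    """Hypothetical stand-in for an S3-backed storage class (assumption)."""

    bucket_checks = 0  # counts simulated round trips to S3

    def __init__(self):
        # Stands in for the bucket-existence request made on construction.
        FakeS3Storage.bucket_checks += 1


class FakeThumbnailer:
    """Hypothetical stand-in that mirrors the fallback in the constructor."""

    def __init__(self, name, thumbnail_storage=None):
        self.name = name
        if not thumbnail_storage:
            # No storage was passed in, so a fresh one is built for this object.
            thumbnail_storage = FakeS3Storage()
        self.thumbnail_storage = thumbnail_storage


def render_page(image_names):
    """Build one thumbnailer per image, as a thumbnail-heavy page would."""
    return [FakeThumbnailer(name) for name in image_names]


thumbnailers = render_page([f"image_{i}.jpg" for i in range(7)])
checks_without_sharing = FakeS3Storage.bucket_checks
print(checks_without_sharing)  # one simulated bucket check per thumbnail
```

With seven images the counter ends at seven, which matches the one-call-per-thumbnail pattern seen in the profiler.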
Solution
I have thought of two options:
- Either tweak the Thumbnailer class constructor in easy_thumbnails.files.py, or
- Try to find a way not to re-create the storage object
I ended up using in my model something like:
source_image = ThumbnailerImageField(
    blank=True, null=True,
    upload_to=get_file_path,
    storage=picture_log_storage,
    thumbnail_storage=thumbnail_storage)
Here thumbnail_storage is initialised only once, at the top of the models file:
thumbnail_storage = get_storage_class(settings.THUMBNAIL_DEFAULT_STORAGE)()
This way, the storage object is created only once per Django instance and the bucket existence is also queried only once.
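The effect of sharing one storage object can be sketched the same way, again without Django or S3. CountingStorage and SketchThumbnailer are hypothetical names used only for this illustration; the counter stands in for the bucket-existence query, and shared_storage plays the role of the module-level thumbnail_storage.

```python
class CountingStorage:
    """Hypothetical storage stand-in that counts how often it is built."""

    instances_created = 0  # each construction simulates one bucket check

    def __init__(self):
        CountingStorage.instances_created += 1


class SketchThumbnailer:
    """Hypothetical stand-in that only builds a storage when none is given."""

    def __init__(self, name, thumbnail_storage=None):
        self.name = name
        if not thumbnail_storage:
            thumbnail_storage = CountingStorage()
        self.thumbnail_storage = thumbnail_storage


# Built once at import time, like the line at the top of the models file.
shared_storage = CountingStorage()

# Every thumbnailer receives the same object, so the fallback never runs.
shared_thumbnailers = [
    SketchThumbnailer(f"image_{i}.jpg", thumbnail_storage=shared_storage)
    for i in range(7)
]

checks_with_sharing = CountingStorage.instances_created
print(checks_with_sharing)  # a single simulated bucket check for all images
```

The trade-off is that the storage is built at import time, so any settings it reads need to be in place before the models module loads.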
Results
Now, a page query takes about 120ms on average instead of 3s. Wow. Much speedup.