This past week I updated my Agility Course Master API to Django 4.1.4 and Python 3.11. I noticed that I hadn't run some of the database cleanup management commands (e.g. clearsessions) that I should have been running periodically. So I started investigating how to run them automatically.
First, I looked into how to run cron-like jobs in Django. There are a number of Django packages that support importing and running Django application code in cron jobs. I already use django-extensions in all my Django projects; it includes a simple Job class for defining tasks, which are then run by its runjobs management command. So that was my choice.
You don't strictly need a separate package; you could just invoke python in your virtualenv with a script that imports the necessary packages, connects to the database, and so on. But these packages take care of that boilerplate for you, and you get to use familiar Django idioms.
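For a sense of what that boilerplate looks like, here is a minimal sketch of such a standalone script; the api.settings module path is a hypothetical stand-in for your project's own settings module:

# cleanup.py - a standalone script run directly with python, no helper package
import os

import django

# Django must know where its settings live before setup()
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "api.settings")
django.setup()  # loads settings and populates the app registry

from django.core import management

# now ordinary Django code works, e.g. calling a management command
management.call_command("clearsessions")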
But none of these packages can run on their own alongside the webapp/API process in the same Docker container, so you still need some sort of cron process to run the scripts on schedule. After much searching, I figured out that I wanted to create and run a new container, based on my Django API image, that runs cron in the foreground. That way it has the same configuration and code as my API container and can run any Python code or Django management command I want.
Some of my cron configuration is specific to the Debian slim-bullseye image upon which I base my Django API container, but it should be similar enough to other Linux containers to get you started.
1. Install cron in your Dockerfile if it isn't already part of your Docker image:
RUN apt-get update && apt-get -y upgrade && apt-get install --no-install-recommends -y <your packages here> cron
2. Install django-extensions and follow its instructions for creating the job file(s) that will be run by cron.
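As a sketch (the app name, file name, and help text here are hypothetical), a daily job is a module in a jobs/daily/ package inside one of your apps, containing a Job class that subclasses DailyJob from django-extensions:

# yourapp/jobs/daily/cleanup.py
from django.core import management

from django_extensions.management.jobs import DailyJob

class Job(DailyJob):
    help = "Clean expired sessions out of the database."

    def execute(self):
        # equivalent to: python manage.py clearsessions
        management.call_command("clearsessions")

The jobs/ and jobs/daily/ directories each need an __init__.py so they are importable; python manage.py runjobs daily then discovers and runs every daily job, and runjob <job_name> runs a single one while you're testing.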
Unlike the crontab examples in the django-extensions documentation, Debian's /etc/cron.d files use a different format that includes a user field:
# run daily at 7:07 UTC
# has no environment and needs to redirect to stdout so docker can see results
7 7 * * * root source /etc/environment; /app/manage.py runjobs daily >/proc/1/fd/1 2>/proc/1/fd/2
Notes:
- I'm running the runjobs daily management command at 7:07 UTC every day as root. Create as many entries/crontab files as you need.
- I put the above file, named djangocrondaily, in a crontab directory in the root of my repo.
- I found out the hard way that cron jobs don't inherit any environment. In my containers I don't hard-code any values into the settings.py file; they are all read from the container's environment variables (see the sketch after these notes). That's why each entry has to source /etc/environment, which is populated in the entrypoint.sh below.
- You can test running your script with no environment within your Docker container by running env -i /bin/sh scriptname or env -i /bin/bash scriptname, as appropriate.
- In order to get the output of the cron jobs in docker logs, you need to redirect their output to stdout, which is what the >/proc/1/fd/1 2>/proc/1/fd/2 redirection above does.
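For context, here is a sketch of what an environment-driven settings.py fragment might look like; SQL_DATABASE, SQL_HOST, and SQL_PORT appear in the entrypoint below, while the other variable names are hypothetical:

# settings.py fragment - all values come from the container's environment
import os

SECRET_KEY = os.environ["DJANGO_SECRET_KEY"]  # hypothetical variable name

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": os.environ.get("SQL_DATABASE"),
        "USER": os.environ.get("SQL_USER"),          # hypothetical
        "PASSWORD": os.environ.get("SQL_PASSWORD"),  # hypothetical
        "HOST": os.environ.get("SQL_HOST"),
        "PORT": os.environ.get("SQL_PORT"),
    }
}

Without the source /etc/environment prefix in the crontab entry, none of these variables would be set when cron fires, and the job would fail as soon as settings are loaded.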
3. entrypoint.sh, shared by my Django and cron containers:
#!/bin/sh
# store environment variables in a file so cron can source them
env >> /etc/environment

echo "Waiting for $SQL_DATABASE database..."
while ! nc -z $SQL_HOST $SQL_PORT; do
  sleep 1
done
echo "$SQL_DATABASE ready!"

# production/staging environments
if [ "$1" = "gunicorn" ]
then
  mkdir -p staticfiles
  python manage.py collectstatic --noinput
  python manage.py migrate
fi

exec "$@"
Notes:
- This writes the environment variables into /etc/environment at runtime so that file can be source'd by each crontab entry.
- The final exec "$@" executes the CMD for each of the docker-compose.yml services.
4. Install the crontab used by cron in the final build stage of the Dockerfile:
FROM python:3.11-slim-bullseye as base
ENV PYTHONFAULTHANDLER=1 \
    PYTHONHASHSEED=random \
    PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1
ENV VIRTUAL_ENV=/venv
RUN python3 -m venv $VIRTUAL_ENV
ENV PATH="$VIRTUAL_ENV/bin:$PATH"
WORKDIR /app
######################
FROM base as builder
ENV VIRTUAL_ENV=/venv
ENV PATH="$VIRTUAL_ENV/bin:$PATH"
ENV PIP_DEFAULT_TIMEOUT=100 \
    PIP_DISABLE_PIP_VERSION_CHECK=1 \
    PIP_NO_CACHE_DIR=1
RUN apt-get update && apt-get install --no-install-recommends -y git curl netcat build-essential libpq-dev libjpeg-dev zlib1g-dev libffi-dev
RUN curl -sSL https://install.python-poetry.org | python3 -
COPY pyproject.toml poetry.lock ./
# Only non dev dependencies
RUN $HOME/.local/bin/poetry install --no-dev
#####################
FROM base as final
COPY --from=builder /venv /venv
COPY --from=builder /root/.local /root/.local
# install runtime dependencies
RUN apt-get update && apt-get -y upgrade && apt-get install --no-install-recommends -y libffi-dev libpq-dev libjpeg-dev memcached libmemcached-tools netcat cron
# copy project
COPY . /app/
# install crontab that runs django-extensions' runjobs - it is only used in the cron container
COPY crontab/djangocrondaily /etc/cron.d/djangocrondaily
RUN chmod 0644 /etc/cron.d/djangocrondaily
EXPOSE 8000
# run entrypoint.sh
ENTRYPOINT ["/app/entrypoint.sh"]
# entrypoint.sh gets CMD as arguments
CMD ["gunicorn", "api.wsgi:app", "--bind", "0.0.0.0:8000", "--log-level=info", "--access-logfile=-", "--error-logfile=-", "--workers=2", "--threads=4", "--worker-class=gthread", "--worker-tmp-dir=/dev/shm"]
Notes:
- CMD starts gunicorn for my production and staging environments.

5. Simplified docker-compose file that starts my django and cron docker services:
version: '3.7'

services:
  django:
    image: registry.xxx.com/yyy/zzz/api-django
    build:
      context: .
      dockerfile: Dockerfile.prod
    container_name: api-django
    ports:
      - 8000
    env_file:
      - ./env.prod
    depends_on:
      - db
  cron:
    image: registry.xxx.com/yyy/zzz/api-django
    build:
      context: .
      dockerfile: Dockerfile.prod
    container_name: api-cron
    env_file:
      - ./env.prod
    command: >
      cron -f -L 15
Notes:
- Both services share the same image, build context, dockerfile, and env_file settings.
- The cron service overrides the CMD/command to run cron in the foreground instead of the default gunicorn process.

I hope you find this helpful if you need to run cron to execute periodic Django tasks.