This past week I updated my Agility Course Master API to Django 4.1.4 and Python 3.11. While doing so, I noticed that I hadn't been running some of the database cleanup management commands (e.g. clearsessions) that should be run periodically. So I started investigating how to run them automatically.

Overview

First, I looked into how to run cron-like jobs in Django. A number of Django packages support importing and running Django application code from cron jobs. In all my Django projects I use django-extensions, which lets you define tasks as simple Job classes that are then run by its runjobs management command. So that was my choice.

You don't strictly need a separate package: you could just invoke Python in your virtualenv with a script that imports the necessary packages, connects to the database, and so on. But these packages/apps take care of that boilerplate for you and let you use familiar Django idioms.

But none of these apps/packages schedules anything on its own, and the Docker container that serves the Django webapp/API is already busy running its one main process. So you still need some sort of cron process to trigger the scripts.

After much searching, I figured out that I wanted to create and run a new container, based on my Django API image, that runs cron in the foreground. That way, it has the same configuration and code as my API container and can run any Python code or Django management command I want.

Here are some articles I found helpful:

Details

Some of my cron configuration is specific to the Debian-based slim image (python:3.11-slim-bullseye in the Dockerfile below) upon which I base my Django API container. But it should be similar enough to other Linux containers to get you started.

1. Install cron in your Dockerfile if it isn't already part of your Docker image:

RUN apt-get update && apt-get -y upgrade && apt-get install --no-install-recommends -y <your packages here> cron

2. Install django-extensions and follow its instructions for creating the job file(s) that will be run by cron. Unlike their example crontab files, files in Debian's /etc/cron.d use a format with an extra user field between the schedule and the command:

# run daily at 7:07 UTC
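# has no environment and needs to redirect to PID 1's stdout/stderr so docker can see results
# cron runs entries with /bin/sh, so use "." rather than the bash-only "source";
# set -a exports the sourced variables so manage.py can see them
7 7 * * * root set -a; . /etc/environment; set +a; /app/manage.py runjobs daily >/proc/1/fd/1 2>/proc/1/fd/2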
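
The extra user field is what distinguishes a system crontab in /etc/cron.d from a per-user crontab. As a rough illustration, this Python sketch splits a trimmed copy of the entry above into its seven fields. CronDEntry and parse_cron_d_line are hypothetical names made up for this example; they are not part of cron or django-extensions.

```python
from typing import NamedTuple


class CronDEntry(NamedTuple):
    minute: str
    hour: str
    day_of_month: str
    month: str
    day_of_week: str
    user: str  # only present in system crontabs such as /etc/cron.d/*
    command: str


def parse_cron_d_line(line: str) -> CronDEntry:
    """Split one /etc/cron.d entry into its seven fields."""
    # At most 6 splits, so the command keeps its internal spaces.
    fields = line.split(None, 6)
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}: {line!r}")
    return CronDEntry(*fields)


entry = parse_cron_d_line(
    "7 7 * * * root /app/manage.py runjobs daily >/proc/1/fd/1 2>/proc/1/fd/2"
)
print(entry.user)
print(entry.command)
```

A line with fewer than seven fields raises ValueError. This is not a validator: a per-user entry with a long command also splits into seven fields, so the sketch cannot tell whether the user column is really there.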

Notes:

  • I'm running the runjobs daily management command at 7:07 UTC every day as root. Create as many entries/crontab files as you need.

  • I saved this crontab file as djangocrondaily in a crontab directory at the root of my repo.

  • I found out the hard way that cron jobs don't inherit any environment. In my containers I don't hard code any values into the settings.py file; they are all read from the container's environment variables. That's why I had to source /etc/environment, which is populated in the entrypoint.sh below.

  • You can test running your script with no environment within your Docker container by running env -i /bin/sh scriptname or env -i /bin/bash scriptname as appropriate.

  • To get the output of the cron jobs into docker logs, you need to redirect it to the stdout and stderr of the container's main process (PID 1), which is what /proc/1/fd/1 and /proc/1/fd/2 refer to.
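
The empty-environment check from the notes above can also be scripted. This is a minimal sketch, assuming Python 3.9+ on Linux; run_without_environment is a hypothetical helper name, and SQL_HOST stands in for any variable your settings read.

```python
import os
import subprocess
import sys


def run_without_environment(argv: list[str]) -> subprocess.CompletedProcess:
    """Run argv with an empty environment, similar to `env -i argv...`."""
    return subprocess.run(argv, env={}, capture_output=True, text=True, check=False)


# The parent process has the variable set...
os.environ["SQL_HOST"] = "db"

# ...but a child started with env={} should not see it.
probe = [sys.executable, "-c", "import os; print(os.environ.get('SQL_HOST'))"]
result = run_without_environment(probe)
print(result.stdout.strip())
```

Replacing the probe with the command from your crontab entry gives a quick way to see which settings break when nothing is inherited.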

3. entrypoint.sh shared by my Django and cron containers

#!/bin/sh

# store environment variables in a file so cron can source them
env >> /etc/environment

echo "Waiting for $SQL_DATABASE database..."

while ! nc -z "$SQL_HOST" "$SQL_PORT"; do
  sleep 1
done

echo "$SQL_DATABASE ready!"

# production/staging environments
if [ "$1" = "gunicorn" ]
then
  mkdir -p staticfiles
  python manage.py collectstatic --noinput
  python manage.py migrate
fi

exec "$@"

Notes:

  • This writes the environment variables into /etc/environment at runtime so that the file can be sourced by each crontab entry.

  • The final exec "$@" executes the CMD for each of the docker-compose.yml services.
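
One caveat with plain env output is that values are written unquoted, so a value containing spaces or shell metacharacters may not survive being sourced by /bin/sh. A possible alternative, sketched here as an assumption rather than what the entrypoint above does, is to render quoted export lines with Python's shlex.quote. render_env_file is a hypothetical helper name.

```python
import re
import shlex

# Names a POSIX shell can assign to: letters, digits, underscores, no leading digit.
_SHELL_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def render_env_file(environ: dict[str, str]) -> str:
    """Render variables as `export KEY=value` lines that /bin/sh can source."""
    lines = []
    for key, value in sorted(environ.items()):
        if not _SHELL_NAME.fullmatch(key):
            continue  # skip names the shell could not assign anyway
        lines.append(f"export {key}={shlex.quote(value)}\n")
    return "".join(lines)


# Example values are made up for illustration.
text = render_env_file({"SQL_HOST": "db", "SECRET_KEY": "p@ss word$1"})
print(text, end="")
```

Called with os.environ, the result could be written to the file that the crontab entries source. Because each line starts with export, the sourced variables are also passed on to child processes such as manage.py.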

4. Install crontab used by cron in the final build stage of the Dockerfile

FROM python:3.11-slim-bullseye as base

ENV PYTHONFAULTHANDLER=1 \
  PYTHONHASHSEED=random \
  PYTHONUNBUFFERED=1 \
  PYTHONDONTWRITEBYTECODE=1

ENV VIRTUAL_ENV=/venv
RUN python3 -m venv $VIRTUAL_ENV
ENV PATH="$VIRTUAL_ENV/bin:$PATH"

WORKDIR /app

######################
FROM base as builder

ENV VIRTUAL_ENV=/venv
ENV PATH="$VIRTUAL_ENV/bin:$PATH"
ENV PIP_DEFAULT_TIMEOUT=100 \
  PIP_DISABLE_PIP_VERSION_CHECK=1 \
  PIP_NO_CACHE_DIR=1

RUN apt-get update && apt-get install --no-install-recommends -y git curl netcat build-essential libpq-dev libjpeg-dev zlib1g-dev libffi-dev

RUN curl -sSL https://install.python-poetry.org | python3 -

COPY pyproject.toml poetry.lock ./

# Only non dev dependencies (Poetry 1.2+ spelling; --no-dev is the older, deprecated flag)
RUN $HOME/.local/bin/poetry install --only main

#####################
FROM base as final

COPY --from=builder /venv /venv
COPY --from=builder /root/.local /root/.local

# install runtime dependencies
RUN apt-get update && apt-get -y upgrade && apt-get install --no-install-recommends -y libffi-dev libpq-dev libjpeg-dev memcached libmemcached-tools netcat cron

# copy project
COPY . /app/

# install crontab to run django_extensions runjobs - it is only run in the cron container
COPY crontab/djangocrondaily /etc/cron.d/djangocrondaily
RUN chmod 0644 /etc/cron.d/djangocrondaily

EXPOSE 8000

# run entrypoint.sh
ENTRYPOINT ["/app/entrypoint.sh"]
# entrypoint.sh gets CMD as arguments
CMD ["gunicorn", "api.wsgi:app", "--bind", "0.0.0.0:8000", "--log-level=info", "--access-logfile=-", "--error-logfile=-", "--workers=2", "--threads=4", "--worker-class=gthread", "--worker-tmp-dir=/dev/shm"]

Notes:

  • The default CMD starts gunicorn for my production and staging environments.

5. Simplified docker-compose file that starts my django and cron docker services:

version: '3.7'

services:
  django:
    image: registry.xxx.com/yyy/zzz/api-django
    build:
      context: .
      dockerfile: Dockerfile.prod
    container_name: api-django
    ports:
      - 8000
    env_file:
      - ./env.prod
    depends_on:
      - db

  cron:
    image: registry.xxx.com/yyy/zzz/api-django
    build:
      context: .
      dockerfile: Dockerfile.prod
    container_name: api-cron
    env_file:
      - ./env.prod
    command: >
      cron -f -L 15

Notes:

  • Both services use the same dockerfile, image, context, and env_file settings.
  • The cron service overrides the CMD/command to run cron in the foreground instead of the default gunicorn process.

I hope you find this helpful if you need to run cron to execute periodic Django tasks.
