tech.agilitynerd.com

scratching that itch... 

Debug Site for Website Redirects By Referer String

I'm adding an "m" subdomain to agilitycourses.com to provide a better mobile browsing experience. I'm using the referrer string in Django middleware (currently minidetector) to detect whether the client is mobile and redirect them to the mobile site. Since some visitors will likely be redirected when they shouldn't be, or not redirected when they should be, I was looking for an easy way for them to tell me when it happens. To debug it I'd need to know their referrer string.

A little googling turned up a nice single-purpose website: www.whatismyreferrer.com/
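
If you would rather host such a page yourself, here is a minimal sketch of the same idea as a plain WSGI app using only the standard library. The names debug_headers_app and serve, and port 8001, are made up for illustration. It also echoes the User-Agent header, on the assumption that it is useful when debugging mobile detection.

```python
from wsgiref.simple_server import make_server


def debug_headers_app(environ, start_response):
    """Echo the request headers a visitor would need to report back."""
    referer = environ.get("HTTP_REFERER", "(none sent)")
    user_agent = environ.get("HTTP_USER_AGENT", "(none sent)")
    body = "Referer: {}\nUser-Agent: {}\n".format(referer, user_agent)
    payload = body.encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]


def serve(port=8001):
    """Run the debug page locally; the port is an arbitrary choice."""
    with make_server("127.0.0.1", port, debug_headers_app) as server:
        server.serve_forever()
```

Calling serve() and visiting http://127.0.0.1:8001/ shows the two headers as plain text, which a visitor could copy into a bug report.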

Filed under  //   django   mobile   referrer   web development  

My Favorite ORM and Python Anti-Patterns

At work I was looking at improving the performance of one of our slower web pages. It can be rewarding to find a little piece of code that can be easily optimized. This time there were several functions that were adding 10+ seconds to the page load in the worst case. It wasn't a problem for most clients, but clients who are related to many other clients experienced terrible performance when they hit the page. Here's pseudo code for the combination of anti-patterns that caused the problem:

# Projects have users and users are in different organizations
# (a project can contain users from multiple organizations)
activeOrganizationProjectUsers = [
    x for x in project.users
    if x.active and x.organization == organization
]

if activeOrganizationProjectUsers:
    # do something *NOT* using activeOrganizationProjectUsers
    pass

There are two main problems with this code:

  1. It ignores the fact that the project, users, and organization are backed by an ORM.
  2. The list comprehension is being used to find all matching elements when only a single element is needed.

Ignoring the ORM

The code above wouldn't be too bad if these were just lists of objects in memory. But because the objects are instantiated by an ORM, a number of database queries are issued. In this particular case (without eager loading from user to the organization table) the following queries were executed:

  1. Join project to user and get all users for the project's id
  2. For each user load their organization (one by one) if the user is active

So in the case where there were hundreds of users on a project there were hundreds of queries executed and hundreds of User and Organization instances were instantiated. Depending on the size of the objects (and the ORM's behavior) it can take "real time" to fetch and instantiate all these large objects.
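
To make the query count concrete, here is a small simulation using sqlite3 from the standard library in place of the real ORM. The schema, the table names, and the run_query helper are all assumptions made for illustration; the point is only to count how many queries the lazy-loading pattern issues.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE organizations (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        project_id INTEGER,
        organization_id INTEGER,
        active INTEGER
    );
    INSERT INTO organizations (id, name) VALUES (1, 'acme'), (2, 'other');
""")
# 200 users on project 1: half are active, and half of those are in organization 1.
conn.executemany(
    "INSERT INTO users (project_id, organization_id, active) VALUES (1, ?, ?)",
    [(1 if i % 4 == 1 else 2, i % 2) for i in range(200)],
)

query_count = 0


def run_query(sql, params=()):
    """Run one SELECT and count it, much like watching the database log."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


# Query 1: load every user on the project.
users = run_query(
    "SELECT id, organization_id, active FROM users WHERE project_id = ?", (1,)
)

matches = []
for user_id, organization_id, active in users:
    if active:
        # Simulated lazy load: one more query per active user.
        org_rows = run_query(
            "SELECT id FROM organizations WHERE id = ?", (organization_id,)
        )
        if org_rows[0][0] == 1:
            matches.append(user_id)

print(query_count)   # 101: one query for the users plus one per active user
print(len(matches))  # 50
```

With 200 users the loop issues 101 queries to answer a question that needs one, and the count grows with the number of active users on the project.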

This code base has this kind of code sprinkled throughout it. At one time during its development the developers were encouraged to treat ORM-backed objects as though they were Plain Old Python Objects (POPOs). With small data sets a developer wouldn't necessarily notice the performance degradation either. This is one of the reasons I like to tail the database log (or use django-debug-toolbar if I'm using Django) to watch the queries go by.

Using List Comprehensions When a Single Value is Needed

To make this situation worse, the activeOrganizationProjectUsers list wasn't actually used. This is a combination of a Python anti-pattern and the ORM anti-pattern. What was required was to determine if a single active organization user existed.

I believe the original developer(s) arrived at the list comprehension through a combination of ignorance and syntactic sugar. They didn't want to write a new function to do the query and put it in the User class, so they used the existing class's API. The syntactic sugar was the list comprehension, which fetched more values than the one that was needed. If this weren't a (potentially) expensive ORM-backed operation the original code could have been:

activeOrganizationProjectUsers = False
for x in project.users:
    if x.active and x.organization == organization:
        # stop at the first match; one is all we need
        activeOrganizationProjectUsers = True
        break

if activeOrganizationProjectUsers:
    # do something
    pass

But this solution could still query all possible user/organization combinations. The other question is: which set is larger, the organization's users or the project's users? Looping over the organization's users looking for active ones would likely be more efficient anyway.
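
For plain in-memory objects, the same early exit can be written more compactly with any() and a generator expression. This is a sketch with made-up names (User, has_active_org_user, tracked), not code from the application; the tracked generator exists only to show where iteration stops.

```python
from collections import namedtuple

# Hypothetical in-memory stand-in for the ORM-backed user objects.
User = namedtuple("User", "name active organization")


def has_active_org_user(users, organization):
    """Return True as soon as one active user in the organization is found."""
    return any(u.active and u.organization == organization for u in users)


examined = []


def tracked(users):
    """Yield users one at a time, recording which ones were looked at."""
    for u in users:
        examined.append(u.name)
        yield u


users = [
    User("ann", False, "acme"),
    User("bob", True, "acme"),
    User("cy", True, "acme"),
]

found = has_active_org_user(tracked(users), "acme")
print(found)     # True
print(examined)  # ['ann', 'bob'] because any() stops at the first match
```

Unlike the list comprehension, the generator is consumed lazily, so "cy" is never examined. With ORM-backed objects this still pays for loading the users in the first place, so it does not remove the underlying problem.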

Remember the Underlying Representation

When performance matters, it is important to remember that the objects are ORM backed. In this case a single query was all that was required (SQLObject pseudo syntax):

activeOrganizationProjectUsers = \
    Users.selectBy(project=project,
                   active=True,
                   organization=organization).count() > 0

If abstracting out the ORM's methods is important, this new query could be added to the appropriate class as a method. In my case, switching to a single query cut the page load time by two orders of magnitude.
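
As a rough illustration of wrapping the query in a method, here is a sketch that uses sqlite3 rather than SQLObject. The UserQueries class, the has_active_user method, and the schema are assumptions for the example. It also swaps the count() > 0 check for SQL EXISTS, which lets the database stop at the first matching row.

```python
import sqlite3


class UserQueries:
    """Hypothetical stand-in for query helpers on the ORM-backed User class."""

    def __init__(self, conn):
        self.conn = conn

    def has_active_user(self, project_id, organization_id):
        """Return True if the project has an active user in the organization."""
        row = self.conn.execute(
            "SELECT EXISTS ("
            "  SELECT 1 FROM users"
            "  WHERE project_id = ? AND organization_id = ? AND active = 1"
            ")",
            (project_id, organization_id),
        ).fetchone()
        return bool(row[0])


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "  id INTEGER PRIMARY KEY, project_id INTEGER,"
    "  organization_id INTEGER, active INTEGER)"
)
conn.executemany(
    "INSERT INTO users (project_id, organization_id, active) VALUES (?, ?, ?)",
    [(1, 10, 0), (1, 10, 1), (1, 20, 0)],
)

queries = UserQueries(conn)
print(queries.has_active_user(1, 10))  # True
print(queries.has_active_user(1, 20))  # False
```

Callers ask one yes/no question and the class issues one query, so no User or Organization instances need to be built just to check for existence.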

Filed under  //   anti-pattern   development   orm   python  

Configuring Runit for Gunicorn and Django Installed in a Virtualenv on Ubuntu

I couldn't find any documentation that covered all the pieces for configuring my latest Django site so I hope this helps someone else out.

I had used mod_wsgi under Apache for my other Django sites. But now I'm using different Python versions for the sites (until I update the older sites, if I ever do) and I wasn't getting the correct versions of some Python libraries (even though virtualenv appeared to be putting the appropriate Python packages at the start of sys.path). So I decided to configure Apache to ProxyPass to Gunicorn so I could run my Django app in its virtualenv without it picking up any other Python modules.

Installing Gunicorn

I installed Gunicorn into the virtualenv for my application, which simplifies using gunicorn from the command line. Assuming /home/user/virtualenvs/myapp is the location of the virtualenv:

$ source /home/user/virtualenvs/myapp/bin/activate
$ pip install gunicorn
or
$ easy_install gunicorn

This copies gunicorn_django to the /home/user/virtualenvs/myapp/bin directory. Test gunicorn with your app, assuming your Django app is located at /home/user/source/myapp, as follows:

$ source /home/user/virtualenvs/myapp/bin/activate
(myapp)$ cd /home/user/source/myapp
(myapp)$ gunicorn_django

Gunicorn starts myapp on 127.0.0.1:8000 using the settings.py file in the current directory. Press Ctrl-C to stop the process.

Installing Runit on Ubuntu

There are two runit packages. You want the one that only runs services you add to it:

$ sudo apt-get install runit
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Suggested packages:
  runit-run socklog-run
The following NEW packages will be installed:
  runit
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 0B/113kB of archives.
After this operation, 537kB of additional disk space will be used.
Selecting previously deselected package runit.
(Reading database ... 209845 files and directories currently installed.)
Unpacking runit (from .../runit_2.0.0-1ubuntu2_i386.deb) ...
Processing triggers for man-db ...
Setting up runit (2.0.0-1ubuntu2) ...
runsvdir (start) waiting
runsvdir (start) starting
runsvdir (start) pre-start
runsvdir (start) spawned, process 9575
runsvdir (start) post-start, (main) process 9575
runsvdir (start) running, process 9575

You'll want to create a directory for the application and a run script in /etc/service:

$ sudo mkdir /etc/service/myapp
$ sudo vi /etc/service/myapp/run
# enter the run script I'll show below
$ sudo chmod +x /etc/service/myapp/run
# stop runit from trying to run gunicorn until we are ready
$ sudo sv stop myapp
ok: down: myapp: 0s, normally up

The example run script checked into Gunicorn had some syntax errors and wasn't quite what I wanted. Here's my version:

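#!/bin/sh

GUNICORN=/home/user/virtualenvs/myapp/bin/gunicorn_django
ROOT=/home/user/source/myapp
PID=/var/run/myapp.pid

# Remove a stale pid file left behind by a previous run
if [ -f "$PID" ]; then
    rm "$PID"
fi

# exec replaces the shell so runit supervises gunicorn directly
cd "$ROOT" || exit 1
exec "$GUNICORN" -c "$ROOT/gunicorn.conf.py" --pid="$PID"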

You can create a configuration file for gunicorn to use or just create an empty file for now:

$ touch /home/user/source/myapp/gunicorn.conf.py

If you have multiple app servers you'll need to run Gunicorn on a different port for each one. You can put that setting in the gunicorn.conf.py file:

bind = "127.0.0.1:8111"
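
Since the configuration file is ordinary Python, it can hold more than the bind address. Here is a slightly fuller sketch; only the bind value comes from this post, and the workers and timeout values are assumptions you would tune for your own server.

```python
# gunicorn.conf.py: a sketch, values other than bind are assumptions
import multiprocessing

# Address and port this app server listens on (one port per app server).
bind = "127.0.0.1:8111"

# Number of worker processes; (2 x cores) + 1 is a common starting point.
workers = multiprocessing.cpu_count() * 2 + 1

# Seconds a worker may stay silent before it is killed and restarted.
timeout = 30
```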

Putting it Together

Now you can test that the run script works when run as root:

$ sudo /etc/service/myapp/run

Gunicorn should start and launch the app server. If it fails you can debug the script via:

$ sudo bash -x /etc/service/myapp/run

Tell runit to start and keep gunicorn running:

$ sudo sv start myapp
ok: run: myapp: (pid 7540) 0s
$ sudo sv status myapp
run: myapp: (pid 7543) 1s
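
The status output shows that the process is running, not that it is accepting connections. As an optional extra check, here is a small sketch that tries to open a TCP connection to the address Gunicorn binds to. The is_listening helper is a made-up name, and the port is assumed to match the bind setting in your gunicorn.conf.py.

```python
import socket


def is_listening(host, port, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

For example, is_listening("127.0.0.1", 8111) should return True once runit has started the service with the bind address used above.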

Filed under  //   apache   django   gunicorn   runit   ubuntu   virtualenv  
