tech.agilitynerd.com

scratching that itch... 
Filed under

django

 

Django Shrink The Web django-stw 0.2.0 Released

Shrink The Web has announced a new API for free users using their new preview verification feature. This change required changes to my django-stw package.

The changes (lifted from the CHANGELOG.txt):

Changes to the shrinkthewebimage template tag:

  • The shrinkthewebimage template tag is NOT backward compatible with version 0.0.1. The alt argument is no longer accepted.
  • The shrinkthewebimage template tag is now intended for use by free accounts; it adds the required preview feature. It can also be used by PRO account users wanting the preview functionality.
  • The shrinkthewebimage template tag now accepts PRO key-value arguments in the same manner as the stwimage tag. This functionality is shown in the example template but may not yet be fully implemented by the STW web service.

Changes to the stwimage template tag:

  • The stwimage can now only be used for PRO features.

Common changes:

  • Template tags now throw exceptions in their constructors instead of in the render function so configuration errors are visible during development.
  • django-stw defines a key 'lang' for the SHRINK_THE_WEB dictionary that can be passed along as a default to the preview tag. Alternatively, a 'lang' keyword can be supplied in each template tag invocation. If neither is given, django-stw defaults it to 'en'. This functionality is not yet implemented by the STW web service.
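
The precedence described for 'lang' (per-tag keyword, then the SHRINK_THE_WEB dictionary, then 'en') can be sketched in plain Python. This only illustrates the lookup order and is not django-stw's actual code; resolve_lang and the sample dictionaries are hypothetical names.

```python
# Hypothetical sketch of the 'lang' precedence; not taken from django-stw.
DEFAULT_LANG = 'en'


def resolve_lang(tag_kwargs, stw_settings):
    """Return the language for one template tag invocation.

    Precedence: per-tag keyword, then the settings dictionary,
    then the package default.
    """
    if 'lang' in tag_kwargs:
        return tag_kwargs['lang']
    return stw_settings.get('lang', DEFAULT_LANG)


# A tag-level keyword wins over the project-wide setting.
print(resolve_lang({'lang': 'fr'}, {'lang': 'de'}))  # fr
# With no keyword, the settings value is used.
print(resolve_lang({}, {'lang': 'de'}))              # de
# With neither, the default applies.
print(resolve_lang({}, {}))                          # en
```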

The v0.2.0 package is available on PyPI, as a source download on GitHub, or via git clone.

Filed under  //   django   shrink the web  


Obtain Short URLs and QR-Codes for Django Apps

Lately I've been interested in improving the interaction of my agilitycourses website for mobile users. One such improvement is to add QR Codes (aka 2D barcodes) representing the page URLs to the printed representations of pages served as PDFs.

I found that developers have reverse engineered the "api" of the goo.gl URL shortening web site. In my brief testing it is very fast. What makes that service extra useful is that adding ".qr" to a shortened URL returns a PNG image of the QR Code for that URL. That made it perfect for providing both short text and QR Code URL representations for my printed documents.

I threw together a few functions and put them in a module to make it easy to shorten a long URL, obtain the QR Code PNG and store it using Django's Storage functionality:

import os
import urllib
from django.utils import simplejson


def googl_shorten_url(long_url):
    """
    Returns goo.gl shortened url for the provided long_url.
    Code taken from: http://djangosnippets.org/snippets/2220/

    Parameters:

    - `long_url`: the url to supply to goo.gl to be shortened.
    """
    params = urllib.urlencode({'security_token': None, 'url': long_url})
    f = urllib.urlopen('http://goo.gl/api/shorten', params)
    return simplejson.loads(f.read())['short_url']


def googl_qrcode(googl_url):
    """
    Return file containing qr code image file for the given goo.gl url.

    Parameters:

    - `googl_url`: url from which to obtain the qr code.
    """
    return urllib.urlopen(googl_url + '.qr')


def get_url_qr_code_image(long_url, storage, storage_image_file_path=''):
    """
    Return goo.gl shortened url and storage name of qr code corresponding to
    the shortened url for the supplied full url. Contacts goo.gl to shorten
    the supplied long url then downloads and stores the qr code image file
    in the storage instance using the file path and the shortened url name
    as the storage name.

    Parameters:

    - `long_url`: the url to shorten.
    - `storage`: a Django storage instance into which to store the qr code
    image.
    - `storage_image_file_path`: file system path to prepend to shortened
    url. This path must exist prior to calling this function.
    """
    try:
        googl_url = googl_shorten_url(long_url)
        qr_file_name = googl_url.split('/')[-1] + '.qr'
        qr_code_name = os.path.join(storage_image_file_path, qr_file_name)
        if not storage.exists(qr_code_name):
            qr_buffer = storage.open(qr_code_name, 'wb')
            qr_buffer.write(googl_qrcode(googl_url).read())
            qr_buffer.close()
    except:
        googl_url = None
        qr_code_name = None
    return googl_url, qr_code_name

Yes, it has a nasty bare try/except. For my uses this is optional functionality so I never want a failure to stop the main functionality of the views that use it. Add exception handling appropriate for your needs.
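
If you want something a little less blunt than a bare except, one option is a small wrapper that catches only Exception subclasses and logs the failure. This is a minimal sketch, not part of the module above; call_optional and flaky are hypothetical names used for illustration.

```python
import logging

logger = logging.getLogger(__name__)


def call_optional(func, *args, **kwargs):
    """Call func and return its result, or None if it fails.

    Only Exception subclasses are caught, so KeyboardInterrupt and
    SystemExit still propagate; the failure is logged rather than hidden.
    """
    try:
        return func(*args, **kwargs)
    except Exception:
        logger.exception("optional call to %r failed", func)
        return None


def flaky(url):
    # Stand-in for a network call that fails.
    raise IOError("service unavailable: %s" % url)


print(call_optional(flaky, 'http://example.com/'))  # None
print(call_optional(len, 'abc'))                    # 3
```

The view still gets None back when the service is down, but the traceback ends up in the log instead of disappearing.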

The main entry point is get_url_qr_code_image(). Here is an example of its use (assuming you save the code in googl.py):

>>> import googl
>>> from django.core.files.storage import default_storage

>>> short_url, qr_code_storage_name = googl.get_url_qr_code_image('http://google.com', default_storage)
>>> short_url
u'http://goo.gl/mR2d'
>>> qr_code_storage_name
u'mR2d.qr'
>>> default_storage.path(qr_code_storage_name)
u'/home/dev/agilitycourses/static/mR2d.qr'
>>> default_storage.url(qr_code_storage_name)
u'mR2d.qr'
>>> 

Hope you find this useful.

Filed under  //   django   goo.gl   python   qr-code  


Mobile Web Site Redirects in Django

For the mobile version of agilitycourses.com I wanted to follow the approach Google appears to be using on some of its sites:

  • If the user views agilitycourses.com from a desktop browser they should see the standard/desktop version of the site.
  • If the user views agilitycourses.com from a mobile browser they should be redirected to a mobile domain (m.agilitycourses.com).
  • The mobile version of the website includes a link to the standard version.
  • If the mobile user chooses the standard website they should "stick" on that site and not be redirected to the mobile site.

I wanted to run two different websites but share templates and have the templates and CSS change for the mobile site. That meant I'd need to set one or more variables in the request to use when generating the appropriate HTML. I found the simplest mobile device detector, minidetector, and initially used that. I later found that Chris Drackett's fork has a number of useful enhancements and switched to it.

But minidetector didn't provide the ability to redirect to another site. I found Scott Newman's article on using multiple templates which had a section on performing the redirect and storing the user's selection in the session. So I forked Chris' minidetector and modified it to include the redirect and session storage. At the same time I decided to store all the minidetector variables into the session and add them, via middleware, to the request so the raw request wouldn't have to be parsed each time. My fork is available here with details on the new configuration options.
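
The four redirect rules listed at the top of this post boil down to a small decision function. The sketch below is only an illustration of that logic, assuming a dict-like session; redirect_target and the 'prefers_standard' session key are hypothetical names and are not taken from minidetector.

```python
# Hypothetical sketch of the redirect rules; not minidetector's code.
MOBILE_HOST = 'm.agilitycourses.com'


def redirect_target(host, is_mobile, session, mobile_host=MOBILE_HOST):
    """Return the host to redirect to, or None to serve the request as-is.

    session is any dict-like object; 'prefers_standard' is an assumed
    flag set when a mobile user picks the standard site.
    """
    if host == mobile_host:
        return None  # already on the mobile site
    if not is_mobile:
        return None  # desktop browsers always get the standard site
    if session.get('prefers_standard'):
        return None  # the user chose the standard site, so they "stick"
    return mobile_host


print(redirect_target('agilitycourses.com', True, {}))
print(redirect_target('agilitycourses.com', True, {'prefers_standard': True}))
print(redirect_target('agilitycourses.com', False, {}))
```

In middleware the return value would become an HTTP redirect, and the flag would be written to the session when the user follows the "standard site" link.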

I'm using two domains so I can track analytics for the mobile and non-mobile sites separately and allow users to bookmark the desired site's pages. I use Google Analytics (via django-google-analytics) and Awstats for analytics.

Since I'm using two separate domains and sharing everything else, I'm using a setup similar to the one described by Dustin Davis. I have a settings.py file and a mobile_settings.py that only overrides the settings I need:

from settings import *
SITE_ID = 2
CACHE_MIDDLEWARE_KEY_PREFIX = "m.ac-"

I use a different memcached key prefix so the cached pages for the mobile site don't clash with those for the desktop site.

I set up m.agilitycourses on my server using the same Gunicorn setup I used for agilitycourses.com; the only changes are the --bind address/port and the name of the mobile settings file:

#!/bin/sh

GUNICORN=/home/user/virtualenvs/myapp/bin/gunicorn_django
ROOT=/home/user/source/myapp
PID=/var/run/myapp.pid

if [ -f $PID ] 
    then rm $PID 
fi

cd $ROOT
exec $GUNICORN --bind 127.0.0.1:8001 -c $ROOT/gunicorn.conf.py --pid=$PID $ROOT/mobile_settings.py

If my templates/content start to diverge more significantly between the mobile and desktop sites I may set the TEMPLATE_DIRS differently in the mobile_settings file. Or I can move to Dustin's approach and create a new application containing the urls.py and views.py specific to my mobile deployment. I would think diverging further would call for a refactoring of the common functionality to its own application which could be imported into separate code branches for each domain.

Filed under  //   django   gunicorn   minidetector   mobile  


Debug Site for Website Redirects By Referer String

I'm adding an "m" subdomain to agilitycourses.com to provide a better mobile browsing experience. I'm using the referrer string in Django middleware (currently using minidetector) to detect whether the client is mobile and redirect them to the mobile site. Since it is likely that some folks will/won't get appropriately redirected I was looking for an easy way for them to tell me when they were incorrectly redirected. I'd need to know their referer string.

A little googling turned up a nice single-purpose website: www.whatismyreferrer.com/

Filed under  //   django   mobile   referrer   web development  


Configuring Runit for Gunicorn and Django Installed in a Virtualenv on Ubuntu

I couldn't find any documentation that covered all the pieces for configuring my latest Django site so I hope this helps someone else out.

I had used mod_wsgi under Apache for my other Django sites. But now I'm using different Python versions for the sites (until if/when I update the older sites) and I wasn't getting the correct versions of some Python libraries (even though virtualenv appeared to be putting the appropriate Python packages at the start of the sys.path). So I decided to configure Apache to ProxyPass to Gunicorn so I could run my Django app in its virtualenv without it picking up any other Python modules.

Installing Gunicorn

I installed Gunicorn into the virtualenv for my application, which simplifies using gunicorn from the command line. Assuming /home/user/virtualenvs/myapp is the location of the virtualenv:

$ source /home/user/virtualenvs/myapp/bin/activate
$ pip install gunicorn
or
$ easy_install gunicorn

This copies gunicorn_django to the /home/user/virtualenvs/myapp/bin directory. Test gunicorn with your app, assuming your Django app is located at /home/user/source/myapp, as follows:

$ source /home/user/virtualenvs/myapp/bin/activate
(myapp)$ cd /home/user/source/myapp
(myapp)$ gunicorn_django

Gunicorn starts myapp using the settings.py file in the current directory on 127.0.0.1:8000. Ctrl-C to stop the process.

Installing Runit on Ubuntu

There are two runit packages. You want the one that only runs services you add to it:

$ sudo apt-get install runit
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Suggested packages:
  runit-run socklog-run
The following NEW packages will be installed:
  runit
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 0B/113kB of archives.
After this operation, 537kB of additional disk space will be used.
Selecting previously deselected package runit.
(Reading database ... 209845 files and directories currently installed.)
Unpacking runit (from .../runit_2.0.0-1ubuntu2_i386.deb) ...
Processing triggers for man-db ...
Setting up runit (2.0.0-1ubuntu2) ...
runsvdir (start) waiting
runsvdir (start) starting
runsvdir (start) pre-start
runsvdir (start) spawned, process 9575
runsvdir (start) post-start, (main) process 9575
runsvdir (start) running, process 9575

You'll want to create a directory for the application and a run script in /etc/service:

$ sudo mkdir /etc/service/myapp
$ sudo vi /etc/service/myapp/run
# enter the run script I'll show below
$ sudo chmod +x /etc/service/myapp/run
# stop runit from trying to run gunicorn until we are ready
$ sudo sv stop myapp
ok: down: myapp: 0s, normally up

The example run script checked into Gunicorn had some syntax errors and wasn't quite what I wanted. Here's my version:

#!/bin/sh

GUNICORN=/home/user/virtualenvs/myapp/bin/gunicorn_django
ROOT=/home/user/source/myapp
PID=/var/run/myapp.pid

if [ -f $PID ] 
    then rm $PID 
fi

cd $ROOT
exec $GUNICORN -c $ROOT/gunicorn.conf.py --pid=$PID 

You can create a configuration file for gunicorn to use or just create an empty file for now:

$ touch /home/user/source/myapp/gunicorn.conf.py

If you have multiple app servers you'll need to run gunicorn on different ports. You can put that configuration in the gunicorn.conf.py file:

bind = "127.0.0.1:8111"
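
Since gunicorn.conf.py is ordinary Python, other settings can live alongside bind. Here is a slightly fuller sketch; the values are illustrative assumptions rather than what I actually run.

```python
# gunicorn.conf.py (sketch; values are illustrative assumptions)
import multiprocessing

# Each app server needs its own address/port.
bind = "127.0.0.1:8111"

# Number of worker processes; 2 * cores + 1 is a common starting point.
workers = multiprocessing.cpu_count() * 2 + 1

# Seconds a worker may stay silent before it is killed and restarted.
timeout = 30
```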

Putting it Together

Now you can test that the run script works when run as root:

$ sudo /etc/service/myapp/run

Gunicorn should start and launch the app server. If it fails you can debug the script via:

$ sudo bash -x /etc/service/myapp/run

Tell runit to start and keep gunicorn running:

$ sudo sv start myapp
ok: run: myapp: (pid 7540) 0s
$ sudo sv status myapp
run: myapp: (pid 7543) 1s

Filed under  //   apache   django   gunicorn   runit   ubuntu   virtualenv  


Confidently Refactoring Django URLs, Views, and Templates

Googility.com is my first Django website and under the covers the oldest code looked like it. I had originally written it with the sole intent of allowing people to enter dog agility businesses and websites into a database that I could use to create a Dog Agility Google Custom Search Engine. The primary mistake I made was making the "project" (in Django speak) effectively equivalent to the primary application. In other words I didn't divide the major features of the site into standalone applications (which would allow them to be more easily reused, extended and tested).

As I continued to work on it I learned more about organizing Django projects. When I added the periodical search to the website I created it as a standalone application. I recently split out my django-shrinktheweb application from the main code base.

The Custom Search Engine (CSE) functionality is a worthwhile application that I'm planning on releasing as its own reusable application. I had already created an application directory called "cse" into which I had placed my models, views, urls, and tests specific to the CSE functionality. But I wanted to make the following changes:

  • Move CSE templates into a cse template subdirectory
  • Name the templates to match the views that use them
  • Name the urls in the urls.py prefixed with the application name ("cse_")
  • Convert all reverse() calls in the views and url template tags to use the named urls

Those are enough changes that I was concerned that I might miss something that would fail either in the view code or in rendering of the templates.

The Django test client makes it easy to test the forward and reverse url matching, calling the view and rendering the template. It is kind of a coarse grained test but the changes I was making were perfect for this tool. Given a urls.py:

urlpatterns = patterns('cse.views',
                    url(r'^site/view/(?P<id>\d+)/$', 'view', name='cse_view'),
)

and a view:

def view(request, id, template='cse/view.html'):
    """Display an end user read only view of the site information"""
    site = get_object_or_404(Annotation, pk=id)
    return render_to_response(template,
                          {'site': site,
                           'labels': get_labels_for(site, cap=None),
                           },
                          context_instance=RequestContext(request))

I then wrote a test class to create the required test instances and tests for each url to verify that the url can be found by name (via reverse()), the url maps to a view, the view invokes the desired template(s), and the {% url %} calls within the template can all be resolved:

from django.test import TestCase
from django.test.client import Client
from django.conf import settings
from django.core.urlresolvers import reverse
from cse.models import Label, Annotation

class ViewsTestCase(TestCase):

    def setUp(self):
        self.client = Client()
        self.ROOT_URLCONF = settings.ROOT_URLCONF
        # can provide a custom urls.py for testing so the tests can be run when
        # the application is incorporated into another project
        # settings.ROOT_URLCONF = 'cse.tests.cse_test_urls'
        # override the template context processors if there are special ones in place
        # that either you want to test or want to avoid
        self.TEMPLATE_CONTEXT_PROCESSORS = settings.TEMPLATE_CONTEXT_PROCESSORS
        settings.TEMPLATE_CONTEXT_PROCESSORS = ()
        # Create some instances on which we can invoke views
        self.label = Label(name='name', description='description')
        self.label.save()
        self.annotation = Annotation(comment='Site Name', original_url='http://example.com/')
        self.annotation.save()
        self.annotation.labels.add(self.label)
        self.annotation.save()

    def tearDown(self):
        # put settings back so the next tests aren't affected
        settings.ROOT_URLCONF = self.ROOT_URLCONF
        settings.TEMPLATE_CONTEXT_PROCESSORS = self.TEMPLATE_CONTEXT_PROCESSORS


    def test_view(self):
        response = self.client.get(reverse('cse_view', kwargs={"id":self.annotation.id}))
        self.assertEqual(200, response.status_code)
        self.assertTemplateUsed(response, 'cse/view.html')

The normal unittest asserts are available in the tests. I'm using one of the special asserts provided by Django's TestCase to verify that the template I expected was used. All the templates used (due to template inheritance) are collected by the test client and can also be verified.
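
The save-and-restore of settings in setUp()/tearDown() can also be written with unittest's addCleanup(), which keeps the restore next to the override. This is a minimal sketch using a stand-in settings object so it runs without Django; OverrideExample and the stand-in are hypothetical.

```python
import unittest
from types import SimpleNamespace

# Stand-in for django.conf.settings so the sketch runs without Django.
settings = SimpleNamespace(ROOT_URLCONF='project.urls')


class OverrideExample(unittest.TestCase):

    def setUp(self):
        original = settings.ROOT_URLCONF
        # Registered cleanups run even if the rest of setUp() raises.
        self.addCleanup(setattr, settings, 'ROOT_URLCONF', original)
        settings.ROOT_URLCONF = 'cse.tests.cse_test_urls'

    def test_override_is_active(self):
        self.assertEqual('cse.tests.cse_test_urls', settings.ROOT_URLCONF)


# Run the test case directly instead of via a test runner command.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(OverrideExample)
result = unittest.TextTestRunner().run(suite)
```

After the run the original value is back in place, with no tearDown() needed.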

I used these tests in a TDD-ish manner: I wrote the test for a view, ran the tests, and kept resolving errors in the templates as I made the changes in my bullet list. It made a tedious job simple and gave me good confidence that I'd found all the renamed urls, views, and templates.

Filed under  //   django   googility   python   tdd   testing  


Haystack Search Result Ordering and Pre-Rendering Results

I use Haystack and the Python Whoosh project to provide search over ~3400 articles in my Googility.com database. I had originally implemented the search in the "simplest way that works". While making some other enhancements to Googility I noticed the search result page had two undesirable behaviors:

  1. The ordering of results was basically random for all matching articles. For the domain of magazine article search having a bias toward the most recent publications would be more desirable.
  2. Looking at the django-debug-toolbar output each element in the search results was hitting the database twice (once for the Article instance and again for its corresponding Periodical). So a single result page was making as many as 60 database selects.

Haystack provides mechanisms to help with both of these issues.

Imposing an Order on the SearchQuerySet

Haystack models search using an API based on Django's QuerySet. The only thing to remember is that it performs its queries over the Haystack SearchIndex subclass(es) you create instead of over the Django ORM. So you define a SearchIndex subclass that contains the data from the application's model over which you'd like to search. You can also define additional fields that can be used to modify the results of the query. Here is my magazine Article search index:

from haystack.sites import site
from haystack import indexes
from periodicals.models import Article

class ArticleIndex(indexes.SearchIndex):
    text = indexes.CharField(document=True, use_template=True)
    pub_date = indexes.DateTimeField(model_attr='issue__pub_date')

site.register(Article, ArticleIndex)

The text field contains the "document" over which the search engine (Whoosh) will actually perform the search. I'm using the template feature that allows me to use Django templates to format the data presented to the search engine.

I added the pub_date field to the index to allow the matching search results to be ordered by the pub_date field. The 'issue__pub_date' syntax mirrors the Django QuerySet syntax and means extract the "pub_date" attribute of the Article's "issue" attribute (it joins Article to Publication and gets the Publication's published date).
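
The double-underscore lookup is easy to picture as a chain of attribute accesses. The following is a simplified illustration of the idea, not Haystack's actual implementation; resolve_attr and the stand-in objects are hypothetical.

```python
from datetime import datetime
from types import SimpleNamespace


def resolve_attr(obj, path):
    """Follow an 'a__b__c' style path through nested attributes."""
    for name in path.split('__'):
        obj = getattr(obj, name)
    return obj


# Stand-in objects; the real Article and its issue are Django model instances.
issue = SimpleNamespace(pub_date=datetime(2010, 6, 1))
article = SimpleNamespace(title='Example', issue=issue)

print(resolve_attr(article, 'issue__pub_date'))  # 2010-06-01 00:00:00
```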

Then the urls.py is modified to change the SearchQuerySet passed to the default Haystack search view to order by the ArticleIndex's pub_date attribute:

<snip>
from haystack.views import SearchView
from haystack.query import SearchQuerySet
# query results with most recent publication date first
sqs = SearchQuerySet().order_by('-pub_date')
urlpatterns = patterns('',
                       url(r'^search/',
                           SearchView(
                               load_all=False,
                               searchqueryset=sqs,
                               ),
                           name='haystack_search',
                           ),
<snip>

Pre-Rendering Result HTML

Since I have only a few thousand records I decided to follow the Haystack Best Practices for Not Hitting the Database. This solution trades space in the Whoosh index files for speed: the HTML that will be displayed when an article matches is generated at indexing time and stored alongside the data Whoosh uses to match articles to search keywords. The changes were pretty simple. In the ArticleIndex:

from haystack.sites import site
from haystack import indexes
from periodicals.models import Article

class ArticleIndex(indexes.SearchIndex):
    text = indexes.CharField(document=True, use_template=True)
    pub_date = indexes.DateTimeField(model_attr='issue__pub_date')
    # pregenerate the search result HTML for an Article
    # this avoids any database hits when results are processed
    # at the cost of storing all the data in the Haystack index
    result_text = indexes.CharField(indexed=False, use_template=True)

site.register(Article, ArticleIndex)

The use_template keyword requires you to create a Django template file that is used during index creation to build the HTML that will be displayed. The only peculiarity I found was figuring out where the template should live. On my system it was at templates/search/indexes/periodicals/article_result_text.txt. That path follows Haystack's convention of search/indexes/<app_label>/<model_name>_<field_name>.txt within a template directory, so periodicals/article_result_text is the app label followed by the model name and field name.

The final change is the template used to display the search results. In order to not hit the database the object list generated by the haystack SearchView is placed into the context used by the template and only the result_text attribute should be accessed:

{% if page.object_list %}
<div class="search-results-title">Results <b>{{page.start_index}}</b>  - <b>{{page.end_index}}</b> for <b>{{query}}</b></div>
    <div class="search-results-list">
    {% for result in page.object_list %}
      {{result.result_text|safe}}
    {% endfor %}
    <div class="pagination">
      <span class="step-links">
        {% if page.has_previous %}
            previous
        {% endif %}
        <span class="current">
            Page {{ page.number }} of {{ page.paginator.num_pages }}
        </span>
        {% if page.has_next %}
            next
        {% endif %}
      </span>
    </div>
</div>
{% else %}
<h2>No matching articles found.</h2>
{% endif %}

The actual result is placed in the template via {{result.result_text|safe}}. The safe filter is required so the stored HTML isn't escaped again; it was already escaped by Django when it was rendered into the SearchIndex.
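
To see why the filter matters, here is roughly what escaping does to markup that has already been rendered. The sketch uses the standard library's html.escape as a stand-in for Django's auto-escaping (the two differ in small details), and the stored string is made up for illustration.

```python
from html import escape

# Pre-rendered markup as it might be stored in the index (illustrative).
stored = '<div class="result"><b>Agility Handling</b></div>'

# Roughly what auto-escaping would emit without the safe filter:
# the tags become visible text instead of markup.
escaped = escape(stored)
print(escaped)
```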

So now my search results are in reverse chronological order and they render using only 3 database queries and at least 10x faster than before.

Filed under  //   django   haystack   search   whoosh  


Improving Google Ads and Google Search Descriptions

I was looking at the google search results for my Googility web site and noticed that the descriptions shown underneath the title often contained text from my navigation links instead of content from the body of the page:

[Image: Google_description]

I did some searching and found the Google Webmaster blog post about description meta tags. Since almost all of the pages on Googility are generated by fewer than a dozen Django templates I edited the templates and inserted meta tags and filled the description in with data from each database entry. This avoids boilerplate information that would be ignored by Google and improves the descriptions shown to Google searchers. Some of my pages have already been reindexed:

[Image: Google_description_after]
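
In a Django template this is just a meta tag filled from the object being displayed. The idea can be sketched in plain Python as below; meta_description is a hypothetical helper, and the 155 character cut-off is an assumed value, not a documented limit.

```python
from html import escape


def meta_description(text, max_len=155):
    """Build a description meta tag from free text.

    Whitespace is collapsed, the text is cut at max_len characters
    (an assumed limit) and HTML-escaped for use in an attribute.
    """
    summary = ' '.join(text.split())
    if len(summary) > max_len:
        summary = summary[:max_len].rstrip() + '...'
    return '<meta name="description" content="%s">' % escape(summary, quote=True)


print(meta_description('Agility  classes\nand trials in Ohio'))
```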

Yahoo and some other search sites honor a robots-nocontent class on any page elements they should ignore for their index. Unfortunately, Google doesn't follow this convention. So I might end up making that edit to the templates as well. Looking at my site's log files it appears the Yahoo spider is hitting my site more frequently than Google's and the Yahoo index is more up to date. Looking at my analytics reports, though, Google refers far more readers to my site than Yahoo...

I also noticed that the ads served on pages containing mostly links appeared to be using words from my navigation or other boilerplate instead of the few lines of valuable content. More searching to the rescue and I found this Google Adsense article on section targeting. Once again the dozen or so templates were easy to edit to add these HTML comment tags. Checking back a couple of days later showed improvements in the ads being generated for those pages. I'll keep an eye on my Adsense click rate to see if there is any increase in ad clicks.
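
For reference, section targeting works by wrapping content in paired HTML comments. The sketch below shows the markers as I recall them from the Adsense article, so check the current documentation before relying on them; ad_section is a hypothetical helper, and in practice the comments go straight into the templates.

```python
# Marker comments as recalled from the Adsense section targeting article.
AD_SECTION_START = '<!-- google_ad_section_start -->'
AD_SECTION_IGNORE = '<!-- google_ad_section_start(weight=ignore) -->'
AD_SECTION_END = '<!-- google_ad_section_end -->'


def ad_section(html, ignore=False):
    """Wrap an HTML fragment in section targeting comments.

    ignore=True marks the fragment (navigation, boilerplate) as content
    the ad matcher should de-emphasize.
    """
    start = AD_SECTION_IGNORE if ignore else AD_SECTION_START
    return '%s\n%s\n%s' % (start, html, AD_SECTION_END)


print(ad_section('<p>Article summary</p>'))
print(ad_section('<ul class="nav">...</ul>', ignore=True))
```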

So a couple of simple edits made noticeable improvements. Not bad for a couple of hours of investigation and implementation.

Filed under  //   adsense   django   google   search   web development  


Initial Release of django-stw

I have been using the free website thumbnail service from Shrink The Web on my dog agility search website Googility since I launched it. It is quick and easy to use and it adds a lot to the look of the pages.

I had created a simple Django template tag for inserting the little snippet of HTML needed by their service.

Recently they asked me to add support for their advanced features to my template tag. I used this opportunity to convert my template tag to a Django application. This mostly makes it a lot easier to install but it also let me bundle tests and an example template with the template tag.

I kept the existing shrinkthewebimage template tag and added a new tag called stwimage to enable the new features.

I'm hosting the example page included in the package here so you can see how the template tags work.

I've hosted the project source on GitHub and uploaded the initial release to the CheeseShop for easy installation.

 

Filed under  //   django   github   googility   pypi   python   shrink the web   web development  


Django Shrink The Web Template Tag Updated

I recently updated my Django template tag for simplifying the use of Shrink The Web images. They recently announced a CDN based distribution of images and they took the opportunity to modify their API.

The updated template tag is on django snippets.

The STW folks have asked me to extend my template tag with support for their PRO features. With luck I'll make that available sometime this weekend.

Filed under  //   django   python   shrink the web   web development  
