(mis)adventures in software development...

25 May 2013

HTML meta mischief in Pelican blog posts

Category Python

Possible uses for arbitrary content metadata in Pelican blog themes.

For some reason recently I got to thinking about HTML meta elements, and thinking about these kinds of things often leads to acts of anal retentive geekiness. Like noticing that few Pelican themes do much with HTML meta elements, and usually nothing with the description tag in particular. And it was the HTML meta description tag I was thinking about most.

I was thinking about whether it’s even worth thinking about, for it seems these days HTML meta elements are considered somewhat unimportant as far as SEO goes. Search engines may or may not use a meta description tag (or part thereof) as the summary snippet on search results pages, which may influence click-through rates, for those who care about such things.

I’m not sure I do, but it did get me thinking a bit about the way Pelican handles article summaries and other such things.

For some posts I prefer to provide my own summary, while for others I’m happy for Pelican to extract a certain number of words from the start.

In the same way, for some posts, I might be happy for search engines to come up with their own summaries, which are likely to be the first sentence or two. However, there are definitely some posts where not only would I like to have some say in the summary that’s displayed on a search engine result page, but I would also be willing to go to great lengths to make sure it isn’t anything from the first paragraph. Especially if I’ve decided to be deliberately long-winded or scatological in the first paragraph. (Of course, some might suggest the best course of action is not to be long-winded or scatological, but what would be the fun of that?)

So I decided I did want to have more control over these kinds of things, and therefore having the ability to optionally specify a meta description tag for blog posts would be a useful feature to have in my increasingly hacky custom theme.

A possibly viable option for some might be to just take the article summary and use that as the content of the meta description too. But I wanted more control than that: more often than not I would want the summary to differ from the meta description. So I took a different approach.

With Pelican it’s possible to add arbitrary metadata tags to a blog post, and the value of each tag will be available in templates as an attribute of the article object. This is great, because it means I can add all kinds of metadata tags that make sense to me (if no one else) and modify my theme’s templates to render that metadata as appropriate.

For example, let’s say we have the following blog post in reStructuredText format:

Mild Self Loathing
##################

:date: 2010-10-03 10:20
:tags: mindless drivel, incoherent whining
:category: self indulgent rambling
:slug: mild-self-loathing
:author: me
:description: This text will be put in the meta description tag.

Main text of the blog entry.

Note it contains an arbitrary metadata tag called “description”.

In the theme templates, the value of this description tag will be available as article.description.
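To make the mechanism a bit more concrete, here is a rough Python sketch of the idea: the fields at the top of the post are collected into key/value pairs, and each pair is then exposed as an attribute on the object handed to the template. This is not Pelican's actual code (Pelican reads reStructuredText through docutils); parse_fields and Article are hypothetical names used purely for illustration.

```python
import re

# Hypothetical stand-in for Pelican's reader: pull ":key: value" lines
# out of a post. Pelican itself relies on docutils for this.
FIELD = re.compile(r"^:(\w+):\s*(.*)$")


def parse_fields(text):
    """Return a dict of the ":key: value" fields found in text."""
    fields = {}
    for line in text.splitlines():
        match = FIELD.match(line.strip())
        if match:
            fields[match.group(1).lower()] = match.group(2).strip()
    return fields


class Article:
    """Illustrative object that exposes each metadata field as an attribute."""

    def __init__(self, metadata):
        for key, value in metadata.items():
            setattr(self, key, value)


post = """\
:date: 2010-10-03 10:20
:slug: mild-self-loathing
:description: This text will be put in the meta description tag.
"""

article = Article(parse_fields(post))
print(article.description)
```

A field that is not present in the post simply never becomes an attribute in this sketch, which is why templates that rely on optional metadata should test for it first.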

So in the theme’s base.html file we could add the following:

<meta name="description" content="{% block metadesc %}{% endblock %}" />

This would define an empty meta description by default. (Alternatively, we could have instead specified some default text for the description tag.)

We can then override this block in the theme’s article.html to define a meta description specific to the blog post:

{% block metadesc %}{{ article.description|e }}{% endblock %}

The value of article.description will be the value given in the blog post’s metadata. So in the above example, that blog post will include the following HTML meta element in the generated HTML:

<meta name="description" content="This text will be put in the meta description tag." />
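The |e filter matters because the description ends up inside a double-quoted attribute, so a stray quote or ampersand in the metadata would otherwise produce broken HTML. As a rough illustration of the effect, the sketch below uses html.escape from Python's standard library as a stand-in; Jinja's own filter may emit different but equivalent entities for quote characters, and render_meta_description is a hypothetical helper, not part of Pelican or Jinja.

```python
from html import escape


def render_meta_description(description):
    """Build a meta description tag, escaping characters that would break the attribute."""
    # quote=True also escapes double and single quotes, not just &, < and >.
    return '<meta name="description" content="{}" />'.format(
        escape(description, quote=True)
    )


print(render_meta_description('Tabs & spaces: the "great" debate'))
# prints: <meta name="description" content="Tabs &amp; spaces: the &quot;great&quot; debate" />
```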

The same thing can be done with Pelican pages, except in this case the object we’re working with in the template will be called “page”. So a similar block in the theme’s page.html file might look something like:

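{% block metadesc -%}
{% if page.description %}{{ page.description|e }}
{%- else %}A page in my blog called {{ page.title|e }}
{%- endif %}
{%- endblock %}

The above will use the description metadata tag if one is specified; otherwise it will create a generic one from the page title. The hyphens inside the tags (as in {%- else %}) are Jinja’s whitespace control: they strip the line breaks around the tags so that the content attribute stays on a single line.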

Of course, there are other more esoteric things that can be done with HTML meta elements. For instance, if you’d prefer tag pages not clutter up potential search engine results, you can suggest that search engines not index them by adding a robots meta tag to the tag archive page. In the head section of the theme’s base.html, add this line:

{% block meta_other %}{% endblock %}

Then in the theme’s tag.html file, override the meta_other block like this:

{% block meta_other %}<meta name="robots" content="noindex" />
{% endblock %}

Now, the tag archive pages will have a robots noindex directive in the head.

We can of course conditionally include or exclude content depending on the value of metadata. So, for example, we could do something like this in article.html to prevent draft posts from being indexed by search engines:

{% block meta_other %}
{% if article.status == 'draft' %}
<meta name="robots" content="noindex" />
{% endif %}
{% endblock %}

The above example uses the standard “status” metadata, but since we can add any arbitrary metadata we like, the same can be done with our own custom metadata tags. That makes it relatively straightforward to add custom conditional features to Pelican themes.
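As a sketch of that idea, suppose a post carries a made-up :noindex: yes field (this is not a standard Pelican field; the name is purely illustrative). The decision the template would make can be written out in Python as below. robots_meta is a hypothetical helper, and in a real theme the same test would live in a Jinja if block inside meta_other.

```python
from types import SimpleNamespace


def robots_meta(article):
    """Return a robots noindex tag for drafts or posts flagged with the made-up noindex field."""
    is_draft = getattr(article, "status", "published") == "draft"
    # Assumption: custom field values arrive as plain strings, so compare
    # the text rather than relying on truthiness ("no" would be truthy).
    flagged = str(getattr(article, "noindex", "")).strip().lower() in {"yes", "true", "1"}
    if is_draft or flagged:
        return '<meta name="robots" content="noindex" />'
    return ""


published = SimpleNamespace(title="Hello", status="published")
hidden = SimpleNamespace(title="Secret", status="published", noindex="yes")
draft = SimpleNamespace(title="WIP", status="draft")

for post in (published, hidden, draft):
    print(post.title, "->", robots_meta(post) or "(indexable)")
```

Comparing the text of the field, rather than just checking that it exists, means a post marked :noindex: no is still left indexable.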