Barefoot Development

Processing Remote Files with Attachment Fu

I've used Rick Olson's attachment_fu on a few projects now, and it's become one of the first plugins I install in a new Rails project. For my current task, though, I need not only to upload photos but also to import external images from a data service and run them through the usual cropping and scaling. I think it is ideal for all images, regardless of source, to go through the same model, where they are subjected to the same operations and validations. So that's what we'll do.

Currently, when you post a multipart form with a file input box named uploaded_data, attachment_fu will grab that Tempfile with a setter of the same name and do its magic. I have followed the same process with a new setter method. It can go right in the model that calls has_attachment:

# At the top of the model file. open-uri lets open() read from a URL;
# on Ruby 3.0 and later, call URI.open instead.
require 'open-uri'

# Takes input of a remote file via an absolute URL,
# reads it and passes it to attachment_fu for processing
def remote_data=(file_url)
  return nil if file_url.nil?
  open(file_url) do |data|
    # extract the filename and extension from the url
    # (index 5 of URI.split is the path component)
    temp_filename = URI.split(file_url)[5][/[^\/]+\Z/]
    # pass details to attachment_fu
    self.filename = temp_filename
    self.temp_data = data.read
    self.content_type = data.content_type
  end
end

This opens the remote file and places its contents in a Tempfile, just as if it had been uploaded. Keep in mind that while my input data is quite reliable, you may need to add a few checks to ensure you have a valid extension and MIME type. You can use the new method thusly:

photo = Photo.new
photo.remote_data = "http://www.somewhere.com/an_image.jpg"
photo.save
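
The extension and MIME type checks mentioned above might look something like the following minimal sketch. The two whitelist constants and the helpers filename_from_url and valid_remote_image? are hypothetical names, not part of attachment_fu, and the allowed types are only an assumption about what a photo model would accept.

```ruby
require 'uri'

# Hypothetical whitelists; adjust to whatever your model should accept.
ALLOWED_EXTENSIONS    = %w[.jpg .jpeg .png .gif].freeze
ALLOWED_CONTENT_TYPES = %w[image/jpeg image/png image/gif].freeze

# Extracts the last path segment of a URL, the same way remote_data= does.
# Returns nil when the path has no filename (for example "http://host/").
def filename_from_url(file_url)
  path = URI.split(file_url)[5]
  path && path[/[^\/]+\z/]
end

# Hypothetical helper: true only if both the file extension and the
# content type reported by the server are on the whitelists.
def valid_remote_image?(file_url, content_type)
  filename = filename_from_url(file_url)
  return false if filename.nil?

  ALLOWED_EXTENSIONS.include?(File.extname(filename).downcase) &&
    ALLOWED_CONTENT_TYPES.include?(content_type.to_s.downcase)
end
```

Inside remote_data=, a call such as valid_remote_image?(file_url, data.content_type) before assigning temp_data would let you skip anything unexpected. Note that the content type is only what the remote server reports, so this is a sanity check rather than proof that the file is an image.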

Bobby Uhlenbrock, Application Developer, Barefoot


A Fix for a WordPress Paging Problem

I recently discovered a problem in the next/previous paging links of a custom WordPress 2.2.1 site we created for Fractional Jets Focus. When viewing a list of articles by category, the previous posts link generated a 404 error.

After digging for too long through WordPress code and Google search results, I found that the problem was related to an apparent bug or conflict between the paging feature and custom permalinks.

Our custom permalink setting in Options -> Permalinks is this:

/%category%/%year%/%monthnum%/%postname%/

To view all the posts in a category for a given month, you use a URL such as /blog/2007/9/. The problem is that WordPress misinterpreted the URL generated by next_posts_link(), which was /blog/2007/9/page/2/, because of the permalink structure. The string "page" in this URL was interpreted as a post name instead of as the token for the page index.

There is probably some voodoo I could have done in the .htaccess file with mod_rewrite to fix the issue, but I feared breaking something else. So, I wrote the following plugin. It's so short that I'm publishing the whole thing inline.

<?php
/*
Plugin Name: Fix Paging in Category Listings
Plugin URI: http://www.thinkbarefoot.com
Description: Fixes a bug where next/previous links are broken in category by year/month listings
Version: 0.5
Author: Doug Smith
Author URI: http://www.thinkbarefoot.com

Copyright 2007 Doug Smith (email: dsmith@thinkbarefoot.com)

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

*/

/**
 * Function to fix problem where next/previous buttons are broken on list
 * of posts in a category when the custom permalink string is:
 * /%category%/%year%/%monthnum%/%postname%/
 * The problem is that with a url like this:
 *
 * /category/2007/10/page/2
 *
 * the 'page' looks like a post name, not the keyword "page"
 */
function remove_page_from_query_string($query_string)
{
    // isset() on both keys avoids undefined-index notices on other requests
    if (isset($query_string['name'], $query_string['page']) && $query_string['name'] == 'page') {
        unset($query_string['name']);
        // 'page' in the query_string looks like '/2', so split it out
        list($delim, $page_index) = explode('/', $query_string['page']);
        $query_string['paged'] = $page_index;
    }
    return $query_string;
}

add_filter('request', 'remove_page_from_query_string');

?>

This plugin simply checks to see whether the post name is 'page', and if there is a 'page' parameter too. If so, it removes the 'name', and assigns the page index to the magic 'paged' parameter. Paging restored.
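
To make the before and after concrete, here is a small Ruby model of the same filter logic, written in Ruby only for consistency with the other examples on this page (WordPress itself runs the PHP above). The method name remove_page_from_query_vars is hypothetical, and the category_name, year and monthnum values are assumed for illustration; only the 'name' and 'page' values come from the case described above.

```ruby
# Ruby model of the PHP filter above, for illustration only.
def remove_page_from_query_vars(query_vars)
  # Leave every other request untouched.
  return query_vars unless query_vars['name'] == 'page' && query_vars.key?('page')

  # Drop the bogus post name.
  fixed = query_vars.reject { |key, _value| key == 'name' }
  # 'page' arrives as "/2"; keep the part after the slash.
  fixed['paged'] = query_vars['page'].split('/', 2)[1]
  fixed
end

# Assumed query vars for a request to /blog/2007/9/page/2/
before = {
  'category_name' => 'blog',
  'year'          => '2007',
  'monthnum'      => '9',
  'name'          => 'page',
  'page'          => '/2'
}

after = remove_page_from_query_vars(before)
```

After the filter runs, the 'name' entry is gone and 'paged' holds "2", which is what WordPress needs to serve the second page of the listing.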

Doug Smith, Senior Developer, Barefoot

I'm Glad Rails Loves SQL

It's a good thing that the core team for Ruby on Rails has chosen to embrace SQL, and easily expose it to developers. Other object-relational mapping layers often try to hide SQL as 'evil'. However, there are times when nothing will substitute for being able to talk directly to the database, in its own language.

Yesterday was such a time. I have an application that displays articles divided into hierarchies of categories. Articles are rendered differently on the site depending on their category.

I needed to change a list of articles from one top-level category (let's call it food) so that articles under one of its sub-categories (call it expert reviews) appeared only once per reviewer. So, while articles from other sub-categories of food would all appear in the list, only the latest article from each expert reviewer would appear in that same list.

I wanted to handle this in the model using a custom finder so that controllers could just call something like category.published_article_list(options) and the details would be irrelevant.

OK, enough background. Here's the method I added to the Category model:

def published_article_list(options = {})
  if food?
    # Special query that returns only the latest article for each 'reviewer' category
    sql  = "( mid_category_id != #{REVIEW_CAT} OR articles.id IN ( "
    sql += "SELECT SUBSTRING( MAX( CONCAT( published_on, id ) ), 11 ) AS ra_id "
    sql += "FROM articles "
    sql += "WHERE mid_category_id = #{REVIEW_CAT} AND "
    sql += " #{Article.conditions_published} "
    sql += "GROUP BY category_id ) ) "
    options[:conditions] = sql
  end
  self.published_root_articles.find(:all, options)
end

That SQL is a little serious, so here's what it does in English. It starts by keeping only the articles whose mid-level category is not "review". It then adds back the articles we want from that category using a subquery.

The meat of the subquery is in this part of the SELECT clause: SUBSTRING( MAX( CONCAT( published_on, id ) ), 11 ) AS ra_id. I needed to get the ID of the latest article, grouped by category_id (the individual reviewer). So, this code concatenates the published_on date and the article ID, finds the max of those values within each group, then uses SUBSTRING to chop the date off, leaving only the ID of the latest article. The date is 10 characters long, so the ID starts at position 11. Take a list of articles with IDs and dates like this:

id   date        category_id
101  2007-10-01  1
102  2007-09-30  1
103  2007-10-02  2
104  2007-10-01  2

Sorted in descending order, the concatenated dates and IDs end up like this:

2007-10-02103
2007-10-01104
2007-10-01101
2007-09-30102

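Because MAX is applied within each category_id group, the subquery picks 2007-10-01101 for category 1 and 2007-10-02103 for category 2. After the date is stripped, it adds the IDs of articles 101 and 103 to the IN clause above.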
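
The same concatenate, max, and substring steps can be traced in plain Ruby on the sample rows above. This is only a sketch of what the subquery computes, not code from the application; the Article struct here and the method name latest_ids_per_category are hypothetical stand-ins.

```ruby
# Stand-in for the articles table rows shown above.
Article = Struct.new(:id, :published_on, :category_id)

articles = [
  Article.new(101, "2007-10-01", 1),
  Article.new(102, "2007-09-30", 1),
  Article.new(103, "2007-10-02", 2),
  Article.new(104, "2007-10-01", 2)
]

def latest_ids_per_category(articles)
  # GROUP BY category_id
  articles.group_by(&:category_id).map do |_category_id, group|
    # MAX( CONCAT( published_on, id ) ), compared as strings
    max_key = group.map { |a| "#{a.published_on}#{a.id}" }.max
    # SUBSTRING( ..., 11 ): SQL counts from 1, so drop the first 10 characters
    max_key[10..-1].to_i
  end
end

latest_ids = latest_ids_per_category(articles)
# latest_ids is [101, 103]
```

Because the comparison is a string comparison, two articles by the same reviewer on the same date are ordered by their IDs as text rather than as numbers, which only matters if those IDs have different numbers of digits.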

The only downside to this method is that the CONCAT function requires a full table scan, so indexes won't be used in the subquery. Since this page is cached, that wasn't a problem for this application, but if performance becomes an issue, further optimization could be done while still using this strategy.

Thanks for loving SQL, Rails!

Doug Smith, Senior Developer, Barefoot