RESTful Web Services in a WordPress Plugin?

UPDATE (2011-04-15):

Since I wrote this post I’ve learned a tremendous amount about WordPress plugin development; so much so that I can’t overstate how much more I know today than when I wrote this post years ago.  So, while the following post might be a novelty to read, I highly recommend that you don’t use this approach.

The only way that a RESTful approach to web services in WordPress would make sense to me is if a team of rockstar WordPress plugin developers were to create a fully fleshed-out extension to WordPress that offered a complete RESTful web service implementation including one that addressed all edge cases and security concerns; only then I would consider not defaulting to the non-RESTful approach WordPress uses for AJAX.  

In summary I recommend not trying to swim upstream today and instead use the approach provided by WordPress. Who knows, maybe in the future there will be a viable method of doing RESTful web services in WordPress.


So I’ve got a project where I need to have a Flash component built in Flex to call to a WordPress blog and get information about it’s latest post. Should be no problem right?  The easy way to do this would be to just write a "rest.php" file and brute-force all the setup that WordPress does but I thought it would be so much more valuable to implement this as a plug-in.

I figured that I’d just quickly learn how to build a WordPress plug-in and create one for exposing RESTful web services; after all with a year of programming Drupal modules WordPress’ plug-in API can’t be that hard, right?  Well turns out it wasn’t that easy and I think I have run into a design limitation with WordPress and I’m beginning to wish I’d just taken the brute-force approach and said to hell with writing a plug-in.

Although I am not 100% certain, and I hope someone can point out that I’m just doing something wrong, it seems sadly like I’m pushing the edges of the WordPress API and exposing where it’s design falls short. By the way, I’m working with WordPress v2.5 because why upgrade mid-project when god knows if WordPress will release another in the remaining days before this project is done and I’ll just have to do again before deployment?

Here’s the details.  I started writing a plug-in called "RESTful Services" with a goal of implementing URLs that behave in the following fashion; {format} could potentially be html, xhtml, json, xml, rss, atom, etc.:

http://example.com/services.{format}
Provide a list of RESTful services in specified {format}, defaults to html
http://example.com/services/{service}.{format}/{data}?{params
Provide a RESTful service in specified {format}, defaults to html, with optional provided data and parameters.

But before I got all those options working I just wanted to service a page from my RESTful Services plugin where Content-Type: text/plain. I found this page that professes to explain how to hook into the URL routing and after a few fits and starts I can came up with the following code for my plugin that would indeed response to my http://example.com/services URL:

wp-content/plugins/restful-services/restful-web-services.php:

1
2
3
4
5
6
7
8
9
10
11
12
13
add_action('init', 'restful_services_flush_rewrite_rules');
function restful_services_flush_rewrite_rules() {    
  global $wp_rewrite;
  $wp_rewrite->flush_rules();
}       
 
add_filter('generate_rewrite_rules', 'restful_services_add_rewrite_rules');
function restful_services_add_rewrite_rules( $wp_rewrite ) {      
  $new_rules = array(        
    'services' => 'wp-content/plugins/restful-services/rest.php',     
  );
  $wp_rewrite->rules = $new_rules + $wp_rewrite->rules;
}

The problem with the above was that it wouldn’t call "wp-content/plugins/restful-services/rest.php"; it would simply continued to call "index.php" and display the home page!  After literally hours and hours of debugging with my trusty PhpEd IDE & debugger I was able to find that the code on lines 737 & 738 of "wp-includes/query.php" told WordPress that my service was the home page! It is almost seems like the "generate_rewrite_rules" was implemented as "a good idea" yet no testing has ever been done on it because for the best I can tell it doesn’t work. (Note I’ve reformatted the code to multiple lines so that it is easier to read and does not extend past the right margin of my blog):

wp-includes/query.php:

if ( !(  $this->is_singular                           
      || $this->is_archive                           
      || $this->is_search                           
      || $this->is_feed                           
      || $this->is_trackback                           
      || $this->is_404                           
      || $this->is_admin                           
      || $this->is_comments_popup ) )   
       $this->is_home = true;

I could possibly hack it to get past this by setting one of those to "true", but none of them are really appropriate; there is nothing there quite like an "is_service" instance variable. Setting something like "this->is_singular" or "this->is_feed" might work but it could manifest incompatibility problems with other plugins or future versions of WordPress. Frankly it was rather disappointing to discover this because it tells me that WordPress has hard-coded all the potential scenarios and doesn’t really have a way around it. Seems to me there should really be a hook here and the type of pages should be allowed to be expanded by plugins rather than be hardcoded as only one of ’singular’, ‘archive’, ’search’, ‘feed’, … and ‘home.’

Anyway, where this manifests itself is "wp-includes/template-loader.php" file which I have included in it’s entirety below.  It is on lines 24 and 25 where the template loader loaded the home page because it’s not possible to specify otherwise:

wp-includes/template-loader.php:

2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
/**
 * Loads the correct template based on the visitor's url
 * @package WordPress
 */
if ( defined('WP_USE_THEMES') && constant('WP_USE_THEMES') ) {
  do_action('template_redirect');
  $is_home = is_home() ;
  if ( is_robots() ) {
    do_action('do_robots');
    return;
  } else if ( is_feed() ) {
    do_feed();
    return;
  } else if ( is_trackback() ) {
    include(ABSPATH . 'wp-trackback.php');
    return;
  } else if ( is_404() && $template = get_404_template() ) {
    include($template);
    return;
  } else if ( is_search() && $template = get_search_template() ) {
    include($template);
    return;
  } else if ( is_home() && $template = get_home_template() ) {
    include($template);
    return;
  } else if ( is_attachment() && $template = get_attachment_template() ) {
    remove_filter('the_content', 'prepend_attachment');
    include($template);
    return;
  } else if ( is_single() && $template = get_single_template() ) {
    include($template);
    return;
  } else if ( is_page() && $template = get_page_template() ) {
    include($template);
    return;
  } else if ( is_category() && $template = get_category_template()) {
    include($template);
    return;
  } else if ( is_tag() && $template = get_tag_template()) {
    include($template);
    return;
  } else if ( is_tax() && $template = get_taxonomy_template()) {
    include($template);
    return;
  } else if ( is_author() && $template = get_author_template() ) {
    include($template);
    return;
  } else if ( is_date() && $template = get_date_template() ) {
    include($template);
    return;
  } else if ( is_archive() && $template = get_archive_template() ) {
    include($template);
    return;
  } else if ( is_comments_popup() && $template = get_comments_popup_template() ) {
    include($template);
    return;
  } else if ( is_paged() && $template = get_paged_template() ) {
    include($template);
    return;
  } else if ( file_exists(TEMPLATEPATH . "/index.php") ) {
    include(TEMPLATEPATH . "/index.php");
    return;
  }
} else {
  // Process feeds and trackbacks even if not using themes.
  if ( is_robots() ) {
    do_action('do_robots');
    return;
  } else if ( is_feed() ) {
    do_feed();
    return;
  } else if ( is_trackback() ) {
    include(ABSPATH . 'wp-trackback.php');
    return;
  }
}

Still another problem in this puzzle is the $wp->send_headers() method shown being called here on line 293 of "wp-includes/classes.php":

wp-includes/classes.php:

290
291
292
293
294
295
296
297
298
function main($query_args = '') {   
  $this->init();   
  $this->parse_request($query_args);   
  $this->send_headers(); 
  $this->query_posts();  
  $this->handle_404();   
  $this->register_globals();   
  do_action_ref_array('wp', array(&$this));   
}

The problem with the $wp->send_headers(), also from "wp-includes/classes.php", is that it seems to have the option of either serving an HTML content type on line 183 and 185, or a content type based on a feed (the content types for the feeds are set in their respective "wp-includes/feed-*.php" files) but no custom content types as far as I can determine as there seems to be no way to override calling this function or the logic path contained within:

wp-includes/classes.php:

175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
function send_headers() {
  @header('X-Pingback: '. get_bloginfo('pingback_url'));
  if ( is_user_logged_in() )
    nocache_headers();
  if ( !empty($this->query_vars['error']) && '404' == $this->query_vars['error'] ) {
    status_header( 404 );
    if ( !is_user_logged_in() )
      nocache_headers();
    @header('Content-Type: ' . get_option('html_type') . '; charset=' . get_option('blog_charset'));
  } else if ( empty($this->query_vars['feed']) ) {
    @header('Content-Type: ' . get_option('html_type') . '; charset=' . get_option('blog_charset'));
  } else {
    // We're showing a feed, so WP is indeed the only thing that last changed
    if ( !empty($this->query_vars['withcomments'])
      || ( empty($this->query_vars['withoutcomments'])
        && ( !empty($this->query_vars['p'])
          || !empty($this->query_vars['name'])
          || !empty($this->query_vars['page_id'])
          || !empty($this->query_vars['pagename'])
          || !empty($this->query_vars['attachment'])
          || !empty($this->query_vars['attachment_id'])
        )
      )
    )
      $wp_last_modified = mysql2date('D, d M Y H:i:s', get_lastcommentmodified('GMT'), 0).' GMT';
    else
      $wp_last_modified = mysql2date('D, d M Y H:i:s', get_lastpostmodified('GMT'), 0).' GMT';
    $wp_etag = '"' . md5($wp_last_modified) . '"';
    @header("Last-Modified: $wp_last_modified");
    @header("ETag: $wp_etag");
 
    // Support for Conditional GET
    if (isset($_SERVER['HTTP_IF_NONE_MATCH']))
      $client_etag = stripslashes(stripslashes($_SERVER['HTTP_IF_NONE_MATCH']));
    else $client_etag = false;
 
    $client_last_modified = empty($_SERVER['HTTP_IF_MODIFIED_SINCE']) ? '' : trim($_SERVER['HTTP_IF_MODIFIED_SINCE']);
    // If string is empty, return 0. If not, attempt to parse into a timestamp
    $client_modified_timestamp = $client_last_modified ? strtotime($client_last_modified) : 0;
 
    // Make a timestamp for our most recent modification...
    $wp_modified_timestamp = strtotime($wp_last_modified);
 
    if ( ($client_last_modified && $client_etag) ?
         (($client_modified_timestamp >= $wp_modified_timestamp) && ($client_etag == $wp_etag)) :
         (($client_modified_timestamp >= $wp_modified_timestamp) || ($client_etag == $wp_etag)) ) {
      status_header( 304 );
      exit;
    }
  }
 
  do_action_ref_array('send_headers', array(&$this));
}

Still, I was able to come up with a solution although it is so very hackish.  My solution was to hook the "template_redirect" action on line 7 of "wp-includes/template-loader.php" (see code from that file above.) Though it seems to works thus far, my solution just feels wrong for the following reasons:

  1. It ignores the fact that WordPress continues to think that my web service URL is the home page,
  2. It first lets "$wp->send_headers()" set the content type before it overrides it,
  3. It uses an "exit" rather than a return to keep WordPress from serving up the home page template, and
  4. It doesn’t use the routing mechanism apparent built into WordPress (see "null" on line 30 of "wp-content/plugins/restful-services/restful-web-services.php" below, I assume it should have been the URL of the .php file I plan to execute but WordPress doesn’t see to use what I put there.)

The function "restful_web_services_exec_service()" is what is called to execute the appropriate web service:

wp-content/plugins/restful-services/restful-web-services.php:

2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
/*
Plugin Name: RESTful Web Services
Plugin URI: http://mikeschinkel.com/wordpress/restful-web-services/
Description: This plugin enables REST-based API web services for WordPress.
Author: Mike Schinkel
Version: 0.1
Author URI: http://mikeschinkel.com/
*/
 
define('RESTFUL_WEB_SERVICES_DIR', dirname(__FILE__));
define('RESTFUL_WEB_SERVICES_URL_PATTERN','services(/?.*)?');
$abspath = trim(str_replace('\\','/',ABSPATH),'/');
$rest_services_dir = str_replace('\\','/',RESTFUL_WEB_SERVICES_DIR);
$rest_services_path = trim(str_replace($abspath,'',$rest_services_dir),'/');
define('RESTFUL_WEB_SERVICES_PATH', $rest_services_path);
 
// NOTE, See: http://codex.wordpress.org/Custom_Queries#Permalinks_for_Custom_Archives
add_action('init', 'restful_web_services_flush_rewrite_rules');
add_filter('generate_rewrite_rules', 'restful_web_services_add_rewrite_rules');
add_action('template_redirect', 'restful_web_services_exec_service');
 
add_action('init', 'restful_web_services_flush_rewrite_rules');
function restful_web_services_flush_rewrite_rules() {
 global $wp_rewrite;
 $wp_rewrite->flush_rules();
}
 
add_filter('generate_rewrite_rules', 'restful_web_services_add_rewrite_rules');
function restful_web_services_add_rewrite_rules( $wp_rewrite ) {
  $new_rules = array(
    RESTFUL_WEB_SERVICES_URL_PATTERN => null,
  );
  $wp_rewrite->rules = $new_rules + $wp_rewrite->rules;
}
 
function restful_web_services_exec_service() {
  global $wp;
  if ($wp->matched_rule==RESTFUL_WEB_SERVICES_URL_PATTERN) {
    if ($wp->request == 'services') {
      header('Content-Type: text/plain');
      print 'TODO: Generate a list of RESTful Web Services here for this WordPress Blog.';
    } else {
      list($dummy,$service_name) = explode('/',$wp->request);
      if (file_exists($service_php = (RESTFUL_WEB_SERVICES_DIR . '/services/' . $service_name . '.php'))) {
        include_once $service_php;
      } else {
        header('Content-Type: text/plain');
        status_header(404);
        print '404 - Service not found.';
      }
    }
    exit;
  }
}

You’ll note that my function "restful_web_services_exec_service()" is very bare-bones at the moment serving only a plain text message "TODO:" for the path http://example.com/services, and assuming that any path http://example.com/services/{service} will execute a same-named .php file in the services subdirectory, i.e. for http://example.com/services/latest-post it will look for "wp-content/plugins/restful-web-services/services/lastest-post.php" and then delegate all the work to that .php file.

wp-content/plugins/restful-services/services/latest-vidclip.php:

2
3
4
5
6
7
8
9
10
11
12
13
14
15
/*
Filename: latest-post.php
Service Name: Latest Post
*/
global $wp_query;
$post = $wp_query->post;
$file = get_attached_file($post->ID);
if (empty($file)) {
  list($file)= get_enclosed($post->ID);
}
$charset = get_option('blog_charset');
header('Content-Type: text/xml; charset=' . get_option('blog_charset'), true);
$link = get_permalink($post->ID);
$html = &lt; &lt;<post></post><post id="{$post-&gt;ID}">      <video>$file</video>   <link />$link  </post>  POST; print $html;

Here is an example output returned by calling http://example.com/services/latest-post:

1
2
&lt; ?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt; 
<post id="2">      <video>http://videos example.org/video1.flv</video>   <link />http://example.org/sample-post-1/  </post>

While bare-bones, that will meet my needs for the moment since I really only have the need for one service that responds to the HTTP GET verb. However, I can see where I might end up fleshing this out and create lots of helper functions that would streamline the creation of web services for RESTful access to the entirety of the WordPress database. If I do so I’ll be happy to donate this plugin to the community.  In the mean time if you’d like to use this for your own use feel free but caveat emptor. However, if you’d like to use this code to create a plugin to contribute to the community, please contact me for collaboration.  Of course if you’d like to retain me to fully flesh it out for you, I’m always open to that too! :-)

Finally, if anyone who knows the WordPress API better than I do can tell me where I erred in my analysis, and I really hope I did, please let me know so I can architect this thing better. On the other hand, if I was spot-on in my analysis maybe this will help the WordPress team understand what needs to be done so they can empower WordPress to generate any arbitrary content type without having to resort to hacks or bypassing the WordPress index.php startup code. 

Long Time, No Blog

Yes I know, it’s a blogger’s cardinal sin to post about why he hasn’t posted in a while. But live with it.

The irony is I’ve had so much to blog about. The reason I haven’t is because a while back I finally gave up on dasBlog and decided I’d switch to WordPress before I blogged again. dasBlog makes so many things difficult that are either easy or trival on WordPress, such as commenting and monitoring spam. After years of putting up with dasBlog I just finally got fed up and decided I’d wait to switch to Wordpress. Sadly I’ve waited a long time, and it’s possible it may still be a while before I can move everything over.

Of course I could have tried upgrading dasBlog, but it’s so much harder to enhance dasBlog with it’s limited templating system that requires compiled .NET plugins vs. WordPress’ PHP scripting (reminiscient of classic ASP+VBScript, only better) that I was finally able to shed my programmer’s guilt for not learning how to write usable .NET plugins just as I was able to shed my guilt for never becoming proficient in x86 assembler back in the late 80’s.

I’ve got a huge backlog of posts that are anywhere from 10% to 99% complete, many of which will never see the light of day because they just won’t be appropriately timely enough by the time I’m ready to finish and post them. Ah well, story of my life; I can envision far more than I ever have time to complete.

Anyway, the reason for this post is to introduce the next post about a module I’m writing for Drupal. I’ve spend a lot of time recently with Drupal and am getting quite good at it, even if I do say so myself. I would have liked to have posted several Drupal related posts as a recursor but if I waited for that I doubt I’d ever manage to post about the module!

So without further adieu, on to the next post!

P.S. It may actually be a few days before I get that post finalized, but if it is not posted yes I am working diligently on it so just hold your breath… :-)

Learning about Adobe AIR in Atlanta…

I’m at the Fox Theatre in my hometown of Atlanta today checking out the Adobe AIR Bus Tour Summer 07. It’s nice to be at the first event nationwide. I’m attending at the behest of a friend who thinks it going to be the "next big thing." I’m skeptical. I fear yet another proprietary attempt to empower developers to craft unique custom web interfaces to provide desktop functionality as a layer over web technologies, and that’s not a compliment. These types of things, especially when looking at the black box nature of opaque Flash SWF files, do their best to ignore those things that make the web work, i.e. stateless URL-addressed resources. The reality of Adobe AIR remains to be seen… P.S. It would have been nice if Adobe had consulted me to ensure that this event was more convenient for me. I mean, I actually had to leave my home and cross the street to attend. Adobe, Please! ‘-)

On the Hunt for a New Programming Language

When it comes to programming on the modern-day GUI (post-DOS) platform, the vast majority of my coding has been, in order of experience, using T-SQL, VBScript in ASP, and about equal parts classic VB (v3.0 to v6.0) and VB.NET. As you can see from my order of experience, I’m really a database guy, and since the beginning of the web I’ve always viewed the web as somewhat of a database publishing environment (anyone remember the DOS product dbPublisher Pro from Digital Composition Systems?) What’s more the web allows a potentially infinite number of people to use a developer’s database publishing apps without any extra effort to distribute them. Finally, the web provides ability to capture evidence the apps were run, how often, and by how many people. Is it any wonder I have more of inclination to develop for the web as opposed to desktop applications? Back during the period from 1994 to 2006 when I ran VBxtras/Xtras.Net where we where a reseller of ActiveX controls and then later .NET components, I never really thought about the cost of add-on components. Almost anything I wanted to play with I can get an NFR (not-for-resale) copy just by sending an email or picking up the phone. Although I still have many of those relationships from a decade+  in the business, I hesitate to ask for NFRs these days except from my really close friends simply because this business I’m in today has nothing to do with benefiting those people. So numerous facts have me giving up on my prior five year assumption that I would someday learn VB.NET at an advanced level and have me instead actively considering alternatives:

  1. As I just stated, the fact I now have to pay for third party components and tools means I’m paying more attention to cost of acquisition,
  2. My recent favorable impressions of open-source developer tools and components, on par with some of the best tools ever sold by Xtras.Net,
  3. My increasing frustration with the Microsoft developer division’s process and release cycle,
  4. All best web applications seem to target L.A.M.P. such as Mediawiki, WordPress, vBulletin, Subversion, Trac,  Ruby On Rails, Django, etc. and all but one of them are free to use
  5. Completely preconfigured stacks (including O/S) that are becoming available for download as a VMware appliance,
  6. Recognizing that Ubuntu’s has an approach strategic enough to result in Microsoft being profiled in a revised edition of Clayton Christensen’s Innovator’s Dilemma as yet another example of why great companies loose their leadership position,
  7. And lastly my rising disgust for ASP.NET (and I promise I will blog about those specific soon…)

By the way, even though I dislike ASP.NET, I do still really like the .NET Framework and programming model. Oh and a note about the first point; whereas there is good open-source tools available for .NET, the operative word is "tools" not components. When you compare what’s available to freely use for .NET compared to what’s available for any of the "P"s (Perl, Python, and PHP), .NET just can’t compare, at least not in depth or breadth. Of course being commercial products the .NET third party components are more polished and of course have commercial support available. However, unless you are big company that needs to CYA and have a throat to choke, those are often dubious benefits especially when you consider the benefits of open-source (i.e. source code, and the ability to fix something and contribute it back so you’ll know it stays fixed!) Anyway, I could write for hours on the pros and cons for open source vs. commercial developer components and tools but that’s not the subject of this post. The subject is about which language I will focus the majority of my future attentions on learning and using, and I’d love to get your input before I decide. Here are the current contenders:

PHP
All the major web apps I mentioned above seem to be built using PHP and I’m currently running many of those apps, PHP is pretty similar to the ASP that I know so well, it’s web-specific, there is a huge support community, it runs on both Windows and Linux, and every Linux web host known to man seems to offer it preinstalled. However, there seems to be lots more crap PHP code examples littering websites than good PHP code examples making it harder to learn so it might be hard to seperate the wheat from the chafe, it is not easy to configure on Windows Servers (especially at a shared web host), and no one individual framework seems to have gotten the lion’s share of the market attention so picking one would be a crap shoot. Oh, and it uses those infernal semi-colons just like C#.
Ruby on Rails
Ruby and it’s framework Rails have gotten tons of attention and it seems all the cool kids are doing it, especially lots of the Web 2.0 startups, it is very database-centric, has very elegant URL mapping functionality, and it seems you can get web apps built really fast using it. And Ruby.NET is also on the horizon meaning I might be able keep my toe in .NET. However, the community comes across as just a little bit too religious and I’m generally alergic to that, AFAIK it doesn’t run on Windows, or at least not for shared hosting. Plus I’ve had people I respect tell me that Ruby doesn’t have nearly as many users as the "P" languages, that Rails it not nearly as mature as its purported to be, and that Rails makes simple thing simple but complex things extremely difficult. And the number of available web hosts that offer it is quite limited.
Python
Unlike PHP, it seems Python is well suited for both web and desktop apps, which might come in handy from time to time, and a shipping IronPython means that I definitely can keep my toe in .NET. The Django framework seems to be a little more mature and have a little less religion than RoR, and Django also has nice URL mapping functionality, albeit slightly less elegant than RoR. And it seems to run equally well on Linux and Windows. However, Django seems more document publishing-centric and less database-centric, there are very few web hosts that support DJango, and I’ve heard it is a real bitch to get working on a web host.
VB.NET+Castle/MonoRail(+Mono)
But then again, maybe I will stick with VB.NET. The Castle/Monorail project is supposed to be a lot like RoR, and I’d even have the option to use Mono on Linux. However, the third party tools are definitely wanting, most web hosts haven’t a clue what Mono is, and they coded Castle/MonRail in C#, so I’d always be dealing with semi-colons…
ASP+IIS+JScript
I could stick with ASP, which I still like, and learn JScript to replace VBScript, the latter of which just has too many limitations when compared with the other current options. This clearly also runs on Windows and any Windows web host will support it, and I already know Windows backwards and forwards. On the other hand, I’ll need to use ISAPI Rewrite for clean URLs, JScript on ASP it has no future and few code examples on the web, and what third party components and tools (to speak of…)?!?
ASP+IIS+VB.NET
I could also use develop VB.NET objects and call them from ASP; that’s what we last did at Xtras.Net (and I think that is what they are still doing, last I checked…) Of course, calling .NET objects as ActiveX controls just doesn’t feel right, and again there’s that third party component and tools problem…
PowerShell+IIS+???:
Of all the teams working on tools for developers over at Microsoft, the PowerShell team run by Jeffrey Snover is the only one that gets me excited anymore. And in an email from him (or was it a comment on my blog, I don’t remember exactly) he said that PowerShell can do web, and will be able to do it more easily in the future. On the other hand, it’s not here today, and what if webified PowerShell is just another way to do rubbish ASP.NET instead of what it should be, a url-based object-selector-and-invoker like Django or Rudy on Rails.  And what’s the chance it will ever run on Mono…?
Other:
Is there anything else do consider…?

At this point I should probably explain what I’m not considering, and why:

Java on Anything:
Although I was really impressed at a Sun Tech Days recently here in Atlanta , even the Sun people were all over dynamic languages with praise, like Jython and JRuby. And though I was impressed with NetBeans 5.5, all the other "enterprise" baggage like J2EE and Servlets and JSP Custom Tags gives me the feeling I’d be jumping out of the frying pan and into the fire.  Oh, and Java uses those infernal semi-colons too.
C# on Anything:
One word: semi-colons!  Sorry but if I’m going to go .NET, it’s going to be VB.NET (or IronPython). VB.NET is so much more natural to me than C#, and there are things you just can’t do in C# that you can do in VB.NET related to using "implements" on a method in an inherited class (I ran into that limitation of C# compared to VB.NET on a project several years ago where I was managing a pair of interns coding in C# and they hit a wall because of that limitation. I can dig it up if anyone cares, or better yet, can someone who knows the specifics explain it in comments?)
Perl on Apache:
Although my partner on Toolicious Ben Coffey who is a devoted disciple of Perl will cringe to hear this (yet again), I can’t quite get my head around Perl, and they tide, at least today, is away from Perl. Of course Ben claims that will all change with Perl 5.0, but to me that remains to be seen and I’d rather go with a bird in the hand (i.e. one with a lot more active current user base) than a bird in the bush.  But who knows, they say you should learn a new language every year; at any rate if he’s right maybe I’ll try and pick up Perl 5.0 in around 2012. :)

So there you have it: my potential choices and non-choices. Any thoughts I which I should choose?  Any and all input will be appreciated and considered seriously.

The Siren Song of SSI

I needed to get a small content website up and running for a project a friend of mine and I are working on, and we started discussing what to use; i.e. raw HTML, a web framework, a CMS, or something else. I have experience on ASP, IIS, and Windows Server using my own mini ASP-based framework but I’ve got very little experience on our chosen deployment platform and hence am not productive on any of the common platforms in use on Linux. So my friend, thinking I was unfamiliar with SSI suggested that I just use SSI with HTML, to which I replied:

Oh, I’ve done that in the past; I built up a pretty robust set of SSI templates, but it took me a while to get the feel of the language and make it all work. So I don’t want to reinvent the wheel.

To which he replied:

But SSI is beautifully simple. You write a couple lines for your header, say, then throw it in a file. Then you write a page containing whatever content you want, with a call to include that header file at the top of the document. Then…well, that’s it. It takes no time to "learn", requires no programming, and seems perfectly sufficient for what you want.

Sigh.

But as I was thinking about how to reply, I realized that my reply would make an interesting blog post. So here it is; I’m going to build a simple website and use SSI to eliminate all the inevitable duplication. Let’s see how it goes.

First thing is to create a header and a footer (please forgive the lack of DOCTYPE and of obvious things we’d add as I’m trying to make my examples easy to follow. And the omission of DOCTYPE and other specifics won’t affect my main points anyway):

header.inc:

<html><body>

footer.inc:

</body></html>

Next step is to create a template for all our web pages; we’ll start by creating the home page:

index.html:

<!-- include virtual="/header.inc" -->
The web page's HTML content would go here
<!-- include virtual="/footer.inc" -->

So far, so good.  Next let’s add a menu to header.inc that will be on all pages in the website. We’ll need to use CSS styling for the menu, so we’ll add a LINK element allowing us to bring in CSS:

header.inc:

<html>
<head>
<link rel="stylesheet" type="text/css" href="/style.css" mce_href="/style.css">
<body>
<ul id="menu">
<li><a href-"/">Home</a></li><li><a href-"/products/">Products</a></li>
<li><a href-"/downloads/">Downloads</a></li>
<li><a href-"/store/">Purchase</a></li>
<li><a href-"/faq/">FAQ</a></li>
<li><a href-"/about/">About</a></li>
<li><a href-"/contact/">Contact Us</a></li>
<ul>

Great! Now let’s start building out our website. Let’s add three, five, ten, twenty five web pages, and more. These SSI are pretty nice, no?

But wait. Someone mentions to us that none of our web pages have titles. Bummer; titles are really important for usability, and super important for search engine optimization. Oops.

So how are we going to fix this? Hmm, looks like we need to split header.inc into two parts and add a <title> element spanning the two.

header1.inc:

<html>
<head>
<link rel="stylesheet" type="text/css" href="/style.css" mce_href="/style.css">
<title>

header2.inc:

</title>
<body>
<ul id="menu">
<li><a href-"/">Home</a></li>
<li><a href-"/products/">Products</a></li>
<li><a href-"/downloads/">Downloads</a></li>
<li><a href-"/store/">Purchase</a></li>
<li><a href-"/faq/">FAQ</a></li>
<li><a href-"/about/">About</a></li>
<li><a href-"/contact/">Contact Us</a></li>
<ul>

Well that’s done, but now we need to go and fixup all those three, five, ten, or twenty five odd web pages, right? I guess it’s going to look something like this:

<!-- include virtual="/header1.inc" -->
Page title goes here
<!-- include virtual="/header2.inc" -->
The web page's HTML content would go here
<!-- include virtual="/footer.inc" -->

I guess that wasn’t too bad. But wait. It becomes clear some of our pages need to omit the menu. Hmm. I guess we need to split the menu out of header2.inc and into it’s own file.

header2.inc:

</title>
<body>

menu.inc:

<ul id="menu">
<li><a href-"/">Home</a></li>
<li><a href-"/products/">Products</a></li>
<li><a href-"/downloads/">Downloads</a></li>
<li><a href-"/store/">Purchase</a></li>
<li><a href-"/faq/">FAQ</a></li>
<li><a href-"/about/">About</a></li>
<li><a href-"/contact/">Contact Us</a></li>
<ul>

I guess that means NOW we need to revisit those three, five, ten, or twenty five odd web pages AGAIN, right? They should probably all look something like this:

<!-- include virtual="/header1.inc" -->
Page title goes here
<!-- include virtual="/header2.inc" -->
<!-- include virtual="/menu.inc" -->
The web page's HTML content would go here
<!-- include virtual="/footer.inc" -->

Sheesh! What’s with this SSI concept? I thought it was suppose to eliminate the need to change every web page file every time we needed to modify a site’s architecture. Why then do we have to keep making all these sweeping changes?

What’s more, those web pages are really hard to read, what with all the cryptic SSI syntax obscuring the logic in the page.

So I’ve shown two simple examples of where a web site rearchitecture requires refactoring of (almost) all of the web pages in a site when SSI is used naively. Yet I could go on. And on. And on. And on. The problem is that you can’t easily parameterize SSI files (easily) and then capture those parameters in pure HTML. And even if you could, you’d be programming, and you’d have to learn how to do it! We’re going full circle, you know?

Which brings me to a question: "Are Server-Side Includes Bad?" And the answer is: "Of course not, but you do need to know how to use Server-Side Includes properly, and they are really only beneficial when paired with a server-side scripting language[1]." I’ve actually used SSI on every web project I’ve every worked on, save the very first. But I have a rule of thumb when using SSI: I generally only use one SSI per web page file, and I include that SSI at the top of the web page file.  My single include file actually includes my library of scripting functions and is a mini-framework of sorts.

So that you can see a good way to use SSI, I’ll show a quick example. The majority of my web experience has been on programming ASP websites so I’ll use ASP and VBScript syntax. For those not familiar, ASP/VBScript is relatively similar to programming in PHP albeit PHP has moved far beyond the capabilities of ASP since Microsoft dropped ASP and went on to focus its efforts on that that abomination they call ASP.NET[2].

default.asp:

<%
   '--Filname: /default.asp
%>
<!-- include virtual="/sitedef.inc" -->
<%

With page
   .Title= "Page title goes here"
   .Show()
End With

Sub PageContent
%>
The web page's HTML content would go here
<%
End Sub
%>

For completion, I’ll so a tiny subset of a workable sitedef.inc as showing and explaining the entire thing would be way out of scope for this article:

sitedef.inc:

<%
   '--Filname: /sitedef.inc
   Option Explicit
%>
<!-- include virtual="/funclib1.inc" -->
<!-- include virtual="/funclib2.inc" -->
<!-- include virtual="/funclib3.inc" -->
<!-- include virtual="/and-so-on.inc" -->
<%
Dim pageSet page= New PageClass
Class PageClass
   ...
End Class
%>

A quick rundown of sitedef.inc shows the first line being a comment to document the file name for print-outs, etc.. Next is the directive Option Explicit that turns on error reporting for undeclared variables.

Then you can see several times the use of embedded SSI to bring in other files from my VBScript library of functionality. As a note, at first I thought that incluing everything even if it wasn’t needed would cause poor performance but I later realized everything was cached and there really were no performance problems at all. At least this is true on  ASP and IIS; I can’t yet speak for PHP or other languages on Linux and Apache.

Then we have the declaration of the "page" variable which you saw used in default.asp above, the creation of a new instance of the page variable, and the skeleton declaration of the "PageClass" class. Note that VBScript is case insensitive and won’t let you reuse symbols so the "page" variable and a class named just "Page" would have clashed hence the use of the suffix "Class" on "PageClass."

With sitedef.inc we can now create our three, five, ten, twenty five, or more web pages using the template shown for default.asp and (almost) never have to modify them when we refactor the code in our server-side includes. Much more maintainable than SSI and HTML alone. Which brings me back to my friend’s statement, a portion of which I repeat below:

But it takes no time to "learn", requires no programming, and seems perfectly sufficient for what you want.

If you are going to use SSI and you want it to be maintainable, it actually does require you learn server-side programming. Maybe we are only talking about three or five web pages for the project today, but we all know that things change quickly and before you know it, there will be fifty web pages or more.

And who wants to architect a website such that you have to rearchitecture as soon as it grows? Not me. :)

Footnotes

  1. When I say server-side scripting I’m using the term "scripting" liberally to refer to any server-side programming solution including platforms that use Java and C#.
  2. Please don’t misquote me; it’s not the .NET framework, .NET languages, and the common language runtime I dislike; it’s the ASP.NET web framework that I think is misguided.