I’ve recently been spending a lot of time pondering and pontificating on web architecture, and it occurs to me that Microsoft’s Internet Information Server (IIS), now in it’s sixth version, is still pathetically lacking in one key feature that I think it critical for properly architecting websites. And this key feature has been part of/available for Apache for a long time in the form of mod_rewrite.
What is this key feature? The ability to create dynamically defined virtual URLs that contain only directories - i.e. URLs that don’t require a file name with a specific extension. Sure, you can easily support dynamically defined virtual URLs using a custom HTTP Handler with IIS6 and ASP.NET; they will look something like this:
But it won’t support dynamically defined virtual URLs that look like this without some C++ derived ISAPI magic:
However, there is a reasonably inexpensive third party ISAPI filter known as ISAPI_Rewrite that provides this capability for IIS.
In addition, there’s another feature that is needed, and that’s the ability to cleanse HTML output before it’s sent to the browser. Why would you need that? For example, if you are using a content management system like DotNetNuke that has a mind of its own with respect to the URL formats it generates and you want use clean and concise URLs you need to be able to filter and transform the HTML after your CMS generates it but before IIS sends it on to the web surfer’s brower. There is evidently an inexpensive product named Speerio SkinWidgets DNN with a "widget" called PageSwiffer with which you can supposedly filter and transform outbound HTML, though I’ve not tested it.
I‘m sure both of these products are great, but dang it all Microsoft, this functionality really should have been baked into IIS a long time ago. How it is something that should have been this easy to add has been for so long overlooked?
Anywho, here’s hoping these features show up in IIS7.
- The cases where there are good reasons can be eliminated by changing architecture, and will probably result in a better overall design.
P.S. If you can provide website examples with URLs and or developer tools that use them, it will help me and my readers better understand the example.
UPDATE: If you are interested in URL design, check out my new project to promote the importance of URL design:
With all the talk of AJAX these days and with my concerns about poorly implemented AJAX-based sites and what they may mean for the web, I’m once again reminded of an opinion I’ve had for a long time: Well designed URL is one of the most valuable aspects of the web. Put more succinctly:
Well Designed URLs are Beautiful!
The following are my (current) set of rules for how to ensure beautiful URLs:
Well Designed URLs Point to Content that Does Not Change
Theoretically, each URL points to a unique view of specific content, or a specific "state" if you will. And I content that should include URLs that point to dynamically generated web pages.
Of course many URLs point to content that changes that with each view (such as advertisements) and/or that is modified based on the current login state of the person viewing the content. Both of these cases corrupt the purity of the ideal stateless URL, but in my pragmatic opinion they are okay as long as the core content for a given URL is static.
URLs that point to home pages and section home pages often change their content, but to me that is okay too. Web users generally don’t expect portal pages to have static content, but all other URL should point to content that doesn’t change.
Well Designed URLs Don’t Change
This should go without saying, but then again how often have you found exactly what you were wanting on a search engine or in a website’s "links" section only to get a 404 after you click the link? Sometimes this happens because the website owner went out of business, but usually its because of a careless website owner/developer who simply reorganized the website without considering all the dead links they are creating, and all the opportunities for traffic they lost for themselves.
Of course most application servers and most content management systems make it almost impossible to "do this right." For example, what’s with Microsoft’s IIS, now in version 6.0, that you can’t serve virtual URLs unless they have an extension (most commonly .aspx?!?) Sheesh!
Well Designed URLs Can Be Stored
When you see a web page you like, you should be able to bookmark its URL and return to view the same thing later. When you see a web page you want a friend to see, you should be able to cut & paste into an email where they can see in their browser exactly what you saw (this is especially helpful if what the see is a bug in your server-side scripting!) And when someone who is blogging or building out a website finds your related web page, they should be able to include your URL as a link in their web page, and it should work as a link for anyone that views their site. Plus, if a URL can be stored, it can be indexed by a Search Engine.
Well Designed URLs Only Use Parameters with Forms-Driven Queries
Many websites use parameters to data drive their website. In most cases, those URLs are just ugly. If I’m looking for Sweaters at Sears, I should click a link that points to www.sears.com/sweaters/, not www.sears.com/products?type=23.
Instead, URL parameters should only best used on pages that allow users to submit a query based entered into form fields. All other URLs should be composed and readable.
Well Designed URLs are Readable and Hierarchical
URLs can and should be part of a website’s user interface. Well designed URLs that are readable and Hierarchical provide wonderfully rich information to a website user about the structure and content of a website. Websites with non-readable and non-Hierarchical URLs don’t.
Well Designed URLs Mean Something
Besides being readable, a web page’s URL should mean something to the website viewer. Having "/honda/" in a URL helps the website user understand the site; having "/tabid23/" in a URL does not.
Of course who violates this rule the worst? Content management systems. And it seems the most more expensive the CMS, the worse it violates this rule (can you say "Vignette?")
Well Designed URLs are Readable in Print
When you see a web page you’d like to reference in print, you’d want it to be readable, not a collection of what appears to be random letters and numbers (i.e. not like a "GUID.") Imagine a reader trying to type in 38 apparently random letter and numbers; that’s simply a painful thought.
Well Designed Websites Have Atomic Leaf Nodes
How many sites have a URL to display a collection of items but no unique URLs for each specific item? Just as an atom is indivisible, so should be leaf-node web pages, each with its own specific and understandable URL.
Well Designed URLs Are Hackable
A website that has a web page for the relative URL "/cars/toyota/4runner" should also have a web page for "/cars/toyota/" and for "/cars/."
Well Designed URLs Can Be Guessed
Let’s say that a website user is on a really slow link. If you URLs are well designed, chances are they can guess at the URL for the page they want. Of if you, godforbid, have a broken link, maybe they can correct it.
Well Designed URLs Are Only As Long and As Short As Necessary
URLs should be short. Short URLs can more easily be posted into emails and not wrap, and short emails can be printed in advertisements, for example. However, URLs should be long enough to make them readable, not obscure their meaning, and retain the website’s heirarchy.
If you can’t make URLs short enough to be readable, retain meaning, and retain heirarchy, create alternate URLs for print, advertisement, etc.
If this is so bad, why is this done? For the application server developer’s convenience, and not for the user’s convenience; that’s for sure.
Well Designed Search Forms Always Use GET
Search engines result pages provide content too, and their URLs need to point to content that doesn’t change. Well of course they do change over time as they display what is current, but that’s appropriate for a search engine result page. If a search form using POST, its search engine result page URL is only useful when an action of a search form, and worthless in every other case.
For a perfect example of a search page that violates this rule, check out CycleTrader’s Search Page.
Well Designed URLs Have More Friends than Just Me
Of course I’m not the first to make the case for URLs, and I probably won’t be the last. Here are some of the better essays about quality URLs:
There are also a few tools that can help:
Well Designed URLs are an Asset
Some people rail against the URL and say it is an overly technical anachronism to which non-technical people should not be exposed. I completely disagree.
Just like so many other technical things that have become such a part of common culture over the years to have become all but invisible, such as radio station’s frequencies, checking account numbers, ATM passcodes, and highway speed limits, so too will URLs become so straightforward that they’ll soon not even be recognized as technical. Instead, they’ll be viewed as obvious and valuable because they fundamentally are.
So, in a nutshell:
Well Designed URLs are Beautiful!