Those of you who read my blog know that I strongly believe in the importance of URL design. For years it bothered me that we’ve seen so many URLs on the web that look like the following example of poor URL design from Jeffrey Veen‘s 2001 book The Art & Science of Web Design:
http://www.site.com/computers.dll?1345,1,,22,567,009a.html
Back in August of 2005 I finally got my thoughts together and wrote the post Well Designed Urls are Beautiful. Well, from anecdotal evidence (I don’t track my blog stats very closely) it appears that post has become my blog’s most popular post! The popularity of that post, combined with several other facts, inspired me to go ahead and launch a website with the following mission:
"To provide best practices for URL design, and to raise awareness of the importance of URL design, especially among providers of server software and web application development tools."
The "facts" I referenced above are:
- I continue to feel strongly about URL design yet many are still oblivious to the benefits,
- I still have a lot more to say on the topic, and
- It appears that good URL design is one of the many tenets of Web 2.0, partly because of AJAX, mashups, and REST-based APIs, meaning that it won’t be such an uphill battle!
The name of the website/wiki is WellDesignedUrls.org and for it I have the following goals:
- To create a list of "Principles" as best practices for good URL design,
- To cultivate how-to articles about implementing good URL designs on various platforms like ASP.NET, LAMP, and Ruby on Rails, servers like IIS and Apache, and web development tools like Visual Web Developer and Dreamweaver,
- To cultivate general how-to articles and resources for tools such as mod_rewrite and ISAPI Rewrite and others,
- To cultivate "solution sets" for mod_rewrite, ISAPI Rewrite, and others that can clean up the URLs of well-known open-source and commercial web applications,
- To grade web applications, websites, and web development tools by giving them a "report card" on how well or how poorly they follow best URL design practices,
- To document URL structure of major web applications and major websites,
- To recognize people who are "Champions for the URL Design cause" (those who’ve written articles and essays promoting good URL design), and
- To provide resources for further reading about good URL design.
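To give a flavor of the kind of how-to material I hope the wiki will cultivate, here is a minimal mod_rewrite sketch (the rule, script name, and parameter are hypothetical examples, not taken from any particular application) that maps a clean URL onto a query-string-based script:

```apache
# Hypothetical example: serve /products/blue-widget from product.php
# without exposing the query string to visitors or search engines.
RewriteEngine On
RewriteRule ^products/([a-zA-Z0-9-]+)/?$ /product.php?slug=$1 [L,QSA]
```

The visitor sees and bookmarks /products/blue-widget, while the application still receives its slug parameter as usual.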
The wiki is clearly new and thus a work in progress, so it will probably be a while before it realizes all the things I mention. However, as I have time and am able to recruit others to help, I think it will become an important advocate for good URL design and a great central resource for best practices. And if you’ve read this far, I’m hoping that you’ll consider either contributing when you feel you have something relevant, or at least start considering the value of URL design in your own web application development, and also point people in the wiki’s direction when applicable. Thanks in advance for the help!
P.S. I also plan to launch a WellDesignedUrl blog in the near future.
Subscribe to my RSS feed if you want to be notified when the blog goes live.
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnaspp/html/urlrewriting.asp
Great example, and a perfect implementation of URL rewriting. I did not write the article, but I extended the code to query a database instead of the web.config file. Then new pages can be tied to a content table in the database. Check out
http://www.collegecupcakes.com/Steph_253_Fun.aspx
to see it in action. No query strings are visible to the end user. Great for SEO and for emailing the links to people. If you need the code to keep the URLs in the database instead of a config file, email me at sultan@collegecupcakes.com.
All the articles on the site are coded with URL rewriting and dynamic metas. No more "index.aspx?ID=123". Plus, have you ever tried to email a link and then, when the recipient gets it, the link is broken because the mail program inserted line feeds? Having links like
http://www.collegecupcakes.com/Party_With_No_Hangover.aspx
solves this problem.
The underscores prevent the email program from truncating lines and moving part of the query string to the next line.
Email me if you have any questions about implementing the URL rewriting. I can email you the class that queries the database so you can create URLs on the fly.
Steve:
Err, nice web page. :)
Yeah, I’ve been aware of Scott Mitchell’s URL Rewriting article almost from the time it was first published. The real problem with ASP.NET URL rewriting, which Scott’s article does not address, is the fact that you can’t eliminate the .ASPX extension. IMO, eliminating the extension on (X)HTML documents (as opposed to images) is critical.
This is a *huge* oversight on Microsoft’s part, and it is typical of Microsoft to provide 80% of the solution but leave the admin/developer with a tremendous effort required to solve the remaining 20%. (I understand you can eliminate .ASPX extensions in ASP.NET with IIS6, but that causes you to lose the "default page" functionality, which is a non-starter for existing websites, plus the rewriter is more complicated.)
I understand it will be possible with IIS7, but the question there is how long it will be before most shared web hosts make it available. Still, it is a much-needed change by Microsoft.
As for the underscores, I hadn’t thought of that, but underscores create another problem. As Matt Cutts (GoogleGuy) says: "That’s why I would always choose dashes instead of underscores. To answer a common question, Google doesn’t algorithmically penalize for dashes in the url."
http://www.mattcutts.com/blog/dashes-vs-underscores/
Though in concept I hate to optimize for something like Google and would rather optimize for usability, the reality is that Google can bring significant traffic, and not optimizing for Google can mean there are far fewer users to optimize for.
Here are some other URLs that are worth reading on the subject:
* http://www.markcarey.com/googleguy-says/archives/discuss-underscores-are-not-word-seperators-in-google.html
* http://www.markcarey.com/googleguy-says/archives/googleguy-confirms-that-underscores-are-not-word-separators.html
* http://www.markcarey.com/googleguy-says/
* http://annevankesteren.nl/2004/08/uri-design
Probably the best solution would be to program your rewriter to accept both underscores and dashes in URLs all over your site, and then use an onclick handler that converts dashes to underscores before submitting the request:
<a href="/my-post-with-many-words-in-the-title.aspx" onclick="RequestWithUnderscores()">Click Here</a>
In my example, the onclick handler could be attached to every <a> element in an onload handler on the <body> tag. RequestWithUnderscores() would take the HREF from the just-clicked <a> element and do a Replace() before issuing the HTTP request. This would put the underscored version of the URL in the URL field of the browser so that people would see it and cut & paste it when they wanted to email the link (instead of the dashed version).
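Here is a rough sketch of that idea in JavaScript (the function name RequestWithUnderscores comes from my example above; everything else, including the helper name, is an illustrative assumption rather than tested production code):

```javascript
// Convert the dashed form of an href to the underscored form.
function toUnderscores(href) {
  return href.replace(/-/g, "_");
}

// Request the underscored version of the clicked link, so that is
// what appears in the browser's URL field for cut & paste.
function RequestWithUnderscores(link) {
  window.location.href = toUnderscores(link.getAttribute("href"));
  return false; // cancel the default (dashed) request
}

// Attach the handler to every <a> element when the page loads.
if (typeof window !== "undefined" && typeof document !== "undefined") {
  window.onload = function () {
    var anchors = document.getElementsByTagName("a");
    for (var i = 0; i < anchors.length; i++) {
      anchors[i].onclick = function () { return RequestWithUnderscores(this); };
    }
  };
}
```

Since the server’s rewriter accepts both forms, the dashed hrefs still work for anyone with JavaScript disabled.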
In addition to that, a further Google optimization would be to issue a 301 redirect on any URLs requested by GoogleBot to the dashed version so that Google doesn’t duplicate the resource in its index.
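For Apache users, that bot-only redirect could be sketched with mod_rewrite along these lines (the user-agent test and rule are illustrative assumptions on my part; note the rule converts one underscore per redirect, so a URL with several underscores produces a short chain of 301s that the bot follows to the fully dashed form):

```apache
# Hypothetical sketch: when Googlebot requests an underscored URL,
# 301-redirect it toward the dashed version. Each pass converts one
# underscore; the bot follows the chain until none remain.
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} Googlebot [NC]
RewriteRule ^(.*)_(.*)$ /$1-$2 [R=301,L]
```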
An even better long-term solution is to get Google to recognize underscores as word separators. :)
BTW, I’m *not* a Javascript guru, so if I got any of the Javascript wrong, please forgive and leave a comment explaining how to fix my errors.