Anti-phishing tactic helps the “Well Designed URL” cause

Today Joris Evers on CNET posted an article about the security developers for the four main web browsers discussing how to make surfing the Web safer. One of the tactics mentioned was Microsoft’s plan for IE7 to show the URL in the address bar of all browser windows to help users identify fraudulent sites. The trend of late has been for many websites to eliminate the address bar on their secondary windows to make their sites look slicker (see what happens when the bad marketing wonks get involved, and when techies become over-enamored with techniques like AJAX), so this move will shine the light more brightly on the lowly URL.

In the past I have blogged about Good URL design for websites and the related topics of wanting Mod_rewrite functionality for IIS and the tool ISAPI Rewrite, which brings mod_rewrite functionality to IIS, so it should be clear I’m passionate about the virtue of incorporating URL design into the overall design of a website. More specifically, my personal opinion is that URL design is one of the more important aspects of web design. And yes, at least one person in this world disagrees with me, but Mark Kamoski is wrong. :)

What’s cool about IE7 requiring the URL to be visible at all times, besides the obvious anti-phishing benefits, is that it will hopefully cause more website stakeholders (marketers, developers, etc.) to think more about the design of their websites’ URLs.

And that would be a good thing.

P.S. Actually, I’d love to see all Windows applications do what Windows Explorer does and support a URL of sorts (maybe call it an "LRL," as in Local Resource Locator?) Wouldn’t it be great to see apps like Word, Excel, QuickBooks, and even Visual Studio written as a series of state changes, where the URL/LRL could represent each uniquely-representable state in a user-readable format (with some obvious caveats)? Just imagine how that would empower the creation of solutions by composing applications… but I digress, as that is the topic for a future day’s blog post.

P.P.S. I almost don’t want to say this next thing, as it could obviate the need for exposing URLs to guard against phishing, but I’m too intellectually honest not to. I see a huge market opportunity for Verisign, with the support of browser and server vendors, to enhance their SSL certificates to include a "Phishing-Safe" seal of approval. Today website owners only need to pay for a certificate if they are collecting sensitive information, but in the future I could see it becoming a de facto requirement for any website with a login to need a "phishing-safe" certificate, raising the bar on lots of hobby forum sites, etc. But I once again digress… Oops, I should have read the whole article before pontificating here; it looks like they are discussing just such a concept.

Technologies are best when they are simple

What’s the next big thing? AJAX? Ruby on Rails? PC Virtualization? Open-Source Software? Data Security? Open Office File Formats? Windows Vista? Windows Live? Apple’s iWhatever? Yeah, all those things will get lots of hype, but the next big thing is something we’ve had access to all along:

Simplicity

Are my thoughts revolutionary? Nah, I’ve been reading about it at places like Information Week and the other usual suspects. Even Bill Gates at Microsoft gets it, through Ozzie at least (though execution will be the key.) But unlike all the things that get hyped, simplicity is a concept that is for real.

Let’s look at two of the best known examples:

  1. Simple Mail Transfer Protocol.
  2. Really Simple Syndication.

Over the years, the world’s Internet email infrastructure evolved from that simple little mail transfer protocol (spam and all!) And RSS exploded in very short order as a method to syndicate blog posts, while the many complex content syndication formats most of us have never even heard of went nowhere.

To most people the Internet came out of nowhere ten (10) years ago, yet it had been evolving for at least twenty (20) years prior. The Internet’s foundation protocol TCP/IP isn’t exactly simple, but once the simple HTTP protocol and HTML format were layered on top, Internet use exploded because implementing websites was simple (by comparison.)

But it’s not just simple technologies; it’s also simple-to-install and simple-to-use applications: ASCII text editors (i.e. Notepad), web browsers, email clients (with apps like Outlook Express), instant messenger clients, wikis, blogging apps, online forum apps, and QuickBooks (simple is relative; accounting is required yet QuickBooks doesn’t really require accounting expertise.)

And to many people this simplicity makes sense. Scott Cook (founder of Intuit) got it. The founders of the original instant messenger (ICQ) got it. Pierre Omidyar (founder of eBay) got it. Google gets it. The original author of PHP, Rasmus Lerdorf, gets it. And a lesser known group also gets it: the developers of Basecamp (although 37 Signals could also be the poster child for what happens when a group elevates a concept to an ideology and, like all ideologists, becomes blind and misinterprets the concept. But I digress…)

Okay, this is all obvious, and well, it’s simple. So what’s the big deal? People recognize that simple is important, but without a simple roadmap most simply don’t know how (pun intended.) I don’t know that I can provide that roadmap, but at least I can get you started.

First, just for grins, let’s look at some counter examples:

  • MS-Access – Have you ever tried to develop an app in MS-Access? Yeah, right. Access is pretty easy where it lets you point and click as a user, but once you hit its brick wall of end-user functionality, you’ve got to be an Access guru to do anything more with it.
  • VB.NET – Thank god for the My namespace in VB 2005, albeit five years late, but VB.NET is still too damn difficult to use productively without weeks of learning. Don’t get me wrong, I love the power of the VB.NET language, but it has very little transitionality.
  • ASP.NET – I know it’s blasphemy, but let’s be real: VIEWSTATE, __doPostBack(), Server Controls, @Register, @Import, WebForms, DataGrid, etc., etc. There’s so much complexity there, where does one start? It’s no wonder so many people still use ASP & VBScript.
  • Exchange Server – Oh my god! How complex a beast can you get? Most POP3/SMTP servers use files and directories; Exchange uses some bastardization of an Access/Jet database that corrupts whenever the power fluctuates. And have you ever tried implementing server events?
  • SharePoint – I can’t even figure out SharePoint as a user, let alone as a developer. What was Microsoft thinking?
  • Active Directory – Need I say more?!?

I’ve bashed on Microsoft thus far, but let me not give them all the credit:

  • XML, though itself simple, has been complicated with namespaces, which I’ve been studying for literally years but still can’t figure out how to use.
  • SOAP – Okay, Microsoft was heavily involved here. But why did they have to make web services so hard? I mean, what was wrong with HTTP POST?
  • J2EE – There’s a reason J2EE developers get paid the really big bucks.
  • Oracle – Have you ever tried to tune an Oracle database application?
  • Content Management Systems – Is there anything out there that can pass for simple? I’ve been using DotNetNuke on one of my sites for a while and I can tell you, it isn’t.

This brings me to my key point. Aside from being intuitively obvious, what’s so great about simple?

The Benefits of "simple" are, quite simply:

  • For the User: Productivity
  • For the Platform Provider: Rapid and Widespread Adoption

But, you say, all of my counter-examples have widespread adoption?

Do not underestimate the institutional will of large organizations to implement tremendously complex technology, because they can.

On the other hand, departmental users, users in small businesses, college students, home users and more can’t deal with complex technology. If it’s too difficult, they don’t or can’t use it. And there are many, many more of them than there are large organizations. What’s more, large organizations are effectively made up of these small groups and individuals. Simple technologies benefit all.

Microsoft, with its Windows monopoly, has been able to get away with complexity and the consequent low user productivity and low platform adoption for many of its products for a long time. But with the new challenges from Google, SalesForce, et al., they’d better get pragmatic religion, and they’d better get it fast.

And that roadmap to which I referred? To quote Albert Einstein:

As simple as possible, but not simpler

:-)

AJAX: It Shouldn’t Just Be All About the Developer!

After seeing Eric Pascarello’s thread at the ASP.NET forums entitled “AJAX - Is it Hype? Is it for you?” I decided to post a thread there referencing my post from July entitled “AJAX: A Panacea, or a Pending Train Wreck?” UkBtlog responded, so I’m blogging my reply below:

UkBtlog states:
I think there needs to be a big distinction between web applications (gmail, google maps, online banking) and web sites (msdn, wired, www.asp.net). AJAX is primarily useful for web applications. AJAX has limited use for web sites.

I’ll agree there is a big distinction in some ways, but that’s part of the point. Most web developers will be drawn to AJAX because it’s the “next cool thing” and you’ll see AJAX on practically every site, and in most cases very badly done. This is no different from when people went crazy with desktop publishing software using tons of fonts and colors in a document! But bad AJAX on the web will have a much more profound negative impact than ugly flyers posted on a light pole.

I’m not arguing that web developers can’t use AJAX well; I’m arguing that AJAX opens a Pandora’s box where most web developers will use AJAX and few will do it well.

UkBtlog states:
I don’t believe that AJAX is actually useful for things that search engines are best at searching. To use the example of MapQuest and Google Maps, I have never searched for London Bridge and had MapQuest come up as a result in a search engine. This although it uses query string parameters to be deterministic it isn’t something that is searchable. I can also use the robots.txt to stop search engines and google even tells me how.

Again, we mostly agree, but UkBtlog again missed one of my points: web developers will use AJAX badly on web sites that should be searchable.

UkBtlog states:
This is the same as restarting an application and isn’t really a suprise to developers. Sure this may not be the expected user experience, but perhaps the dom standards should give web application developers a way to stop the use of refresh if web apps are to mature.

It’s not a surprise for web developers, but it’s a huge surprise for web users! And a bad one at that!!! One of the reasons for the success of usability on the web has been the consistent and constrained UI. Web usability guru Jakob Nielsen writes in The Top Ten New Mistakes of Web Design (May 30, 1999):

Jakob Nielsen states:

1. Breaking or Slowing Down the Back Button

The Back button is the lifeline of the Web user and the second-most used navigation feature (after following hypertext links). Users happily know that they can try anything on the Web and always be saved by a click or two on Back to return them to familiar territory. Except, of course, for those sites that break Back by committing one of these design sins:

  • opening a new browser window (see mistake #2)
  • using an immediate redirect: every time the user clicks Back, the browser returns to a page that bounces the user forward to the undesired location
  • prevents caching such that the Back navigation requires a fresh trip to the server; all hypertext navigation should be sub-second and this goes double for backtracking

Further, from Why Frames Suck (Most of the Time) (Dec 1996):

Jakob Nielsen states:
13% of users are still using Netscape 2 which had one of the worst usability problems to be seen on the Web so far: the BACK button in the browser simply didn’t work with framed sites. The BACK feature is an absolutely essential safety net that gives users the confidence to navigate freely in the knowledge that they can always get back to firm ground. We have known from some of the earliest studies of user navigation behavior that BACK is the second-most used navigation feature in Web browsers (after the simple “click on a link to follow it” action). Thus, breaking the BACK button is no less than a usability catastrophe.

And I wouldn’t attack the sources for being old; usability is based on the way humans process information and that doesn’t change rapidly just because someone coined a new term and all the bleeding edge web developers have jumped on it!

Also, Jakob hasn’t written about AJAX yet, probably because he writes about his usability research results, not just his opinions (like you and me :), but I’m sure he will as soon as he has solid data to back him up.

UkBtlog states:
Developers of web application spend a lot of effort and tears trying to remove the problems of back buttons and refresh behaviour. The web browser shouldn’t be trying to hold state, although it happens to cache the previous page, as HTML is supposed to be stateless. So if I want to buy into implementing state into a website such as using cookies or hidden fields or fancy frameworks like ASP.NET I don’t want the browser interferring with my application state. If I am writing a web site for a company that informs the world on what they do, then I am not keeping any state and the back, forward and refresh buttons hold no fear. Your view is a little simplistic for the variance in reasons for developing for the web.

The criterion for expanding web functionality shouldn’t be making things easier for the developer!!! The web should be about making things easier for the user, and about empowering solutions that were previously unavailable. The stateless web, addressable via URL with a well defined content set (HTML, GIF, JPG, and now PNG and PDF, DOC, XLS, etc.), is the technology that has empowered so much economic expansion and functionality on the web. Making things easier for the developer while ignoring other goals will just drag us down into the new dark ages; the dark ages of the web.

UkBtlog states:
AJAX maybe abused as it is the new cool feature, but using framesets or domain rewriting can lead to similar problems. I can also use pointers without correctly releasing the memory, but I don’t.

And you’ll note most web developers have finally stopped using framesets (thank god!) I had some of the same issues with them. (I don’t know specifically what UkBtlog meant about domain rewriting.)

UkBtlog states:
Any of the technologies developers are presented with can be used in a way that is not best practice. If you do this with an information website such as Wired.Com or MSDN the information will not be found and your business will suffer. I only see this “misuse” of AJAX existing in proof of concept and misinformed developers.

This isn’t about what one developer such as UkBtlog does. It is about the health of the web. Many cities have ordinances that say you can’t have a grill on the balcony of an apartment or condo for the same reason. I can make sure I don’t burn down the building with *my* grill, but what about my neighbor? If he doesn’t use his grill safely, he’ll affect *me*. Albeit an extreme analogy, the same holds for AJAX, web developers, and the web.

UkBtlog states:
Perhaps we should encourage article writers to discuss best practice use of AJAX to help remove your fears.

Now with *that* I completely agree! So why do you think I am writing these blogs? :-)

But not only writers. We should also encourage tool vendors like Microsoft, with its Atlas framework, and anyone else making AJAX toolkits to really think through the issues and make sure their tools encourage best practices.

But what are those best practices? Thus far, I’ve only complained about AJAX. When I have time, hopefully soon, I’ll make some recommendations, as I see them, for minimizing the threat and maximizing the benefit of AJAX.

Well Designed URLs are Beautiful!

With all the talk of AJAX these days, and with my concerns about poorly implemented AJAX-based sites and what they may mean for the web, I’m once again reminded of an opinion I’ve had for a long time: a well designed URL is one of the most valuable aspects of the web. Put more succinctly:

Well Designed URLs are Beautiful!

The following is my (current) set of rules for how to ensure beautiful URLs:

Well Designed URLs Point to Content that Does Not Change

Theoretically, each URL points to a unique view of specific content, or a specific "state" if you will. And I contend that should include URLs that point to dynamically generated web pages.

Of course many URLs point to content that changes with each view (such as advertisements) and/or that is modified based on the current login state of the person viewing the content. Both of these cases corrupt the purity of the ideal stateless URL, but in my pragmatic opinion they are okay as long as the core content for a given URL is static.

URLs that point to home pages and section home pages often change their content, but to me that is okay too. Web users generally don’t expect portal pages to have static content, but all other URLs should point to content that doesn’t change.

Well Designed URLs Don’t Change

This should go without saying, but then again, how often have you found exactly what you were looking for on a search engine or in a website’s "links" section, only to get a 404 after you click the link? Sometimes this happens because the website owner went out of business, but usually it’s because of a careless website owner/developer who simply reorganized the website without considering all the dead links they were creating, and all the opportunities for traffic they lost for themselves.

Of course, most application servers and most content management systems make it almost impossible to "do this right." For example, what’s with Microsoft’s IIS, now in version 6.0, that you still can’t serve virtual URLs unless they have an extension (most commonly .aspx)?!? Sheesh!
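The usual workaround is a URL-rewriting layer in front of the application. As a rough sketch (not a recipe), here is what mod_rewrite-style rules might look like; ISAPI Rewrite brings a very similar syntax to IIS, and the paths, slugs, and parameter names below are purely hypothetical:

    RewriteEngine On

    # Serve an extensionless URL from the real .aspx page; this is an
    # internal rewrite, so the visitor never sees the extension.
    RewriteRule ^articles/([a-z0-9-]+)/?$ /articles.aspx?slug=$1 [L]

    # A page that moved during a redesign gets a permanent redirect
    # instead of being left to die as a 404.
    RewriteRule ^oldsection/widgets\.html$ /products/widgets/ [R=301,L]

Rules like these let the published URLs stay stable even when the implementation behind them changes completely.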

Well Designed URLs Can Be Stored

When you see a web page you like, you should be able to bookmark its URL and return to view the same thing later. When you see a web page you want a friend to see, you should be able to cut & paste its URL into an email where they can see in their browser exactly what you saw (this is especially helpful if what they see is a bug in your server-side scripting!) And when someone who is blogging or building out a website finds your related web page, they should be able to include your URL as a link in their web page, and it should work as a link for anyone who views their site. Plus, if a URL can be stored, it can be indexed by a search engine.

Well Designed URLs Only Use Parameters with Forms-Driven Queries

Many websites use URL parameters to data-drive their pages. In most cases, those URLs are just ugly. If I’m looking for sweaters at Sears, I should click a link that points to www.sears.com/sweaters/, not www.sears.com/products?type=23.

Instead, URL parameters are best used only on pages that let users submit a query entered into form fields. All other URLs should be composed and readable.
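To make the Sears example concrete, a single rewrite rule can present the readable URL to users, links, and search engines while quietly handing the request to the existing parameterized page. Again, this is only a sketch in mod_rewrite-style syntax, and the category-to-parameter mapping is hypothetical:

    # /sweaters/ is what the world sees; the parameterized page still
    # does the work behind the scenes.
    RewriteRule ^sweaters/?$ /products?type=23 [L]

    # Or cover every category with one rule by passing the name through.
    RewriteRule ^shop/([a-z-]+)/?$ /products?category=$1 [L]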

Well Designed URLs are Readable and Hierarchical

URLs can and should be part of a website’s user interface. Well designed URLs that are readable and hierarchical provide wonderfully rich information to a website user about the structure and content of a website. Websites with non-readable and non-hierarchical URLs don’t.

Well Designed URLs Mean Something

Besides being readable, a web page’s URL should mean something to the website viewer. Having "/honda/" in a URL helps the website user understand the site; having "/tabid23/" in a URL does not.

Of course, who violates this rule the worst? Content management systems. And it seems the more expensive the CMS, the worse it violates this rule (can you say "Vignette"?)

Well Designed URLs are Readable in Print

When you see a web page you’d like to reference in print, you want its URL to be readable, not a collection of what appears to be random letters and numbers (i.e. not like a "GUID.") Imagine a reader trying to type in 38 apparently random letters and numbers; that’s simply a painful thought.

Well Designed Websites Have Atomic Leaf Nodes

How many sites have a URL to display a collection of items but no unique URLs for each specific item? Just as an atom is indivisible, so should be leaf-node web pages, each with its own specific and understandable URL.

Well Designed URLs Are Hackable

A website that has a web page for the relative URL "/cars/toyota/4runner" should also have a web page for "/cars/toyota/" and for "/cars/."

Well Designed URLs Can Be Guessed

Let’s say a website user is on a really slow link. If your URLs are well designed, chances are they can guess at the URL for the page they want. Or if you, god forbid, have a broken link, maybe they can correct it.

Well Designed URLs Are Only As Long and As Short As Necessary

URLs should be short. Short URLs can more easily be pasted into emails without wrapping, and short URLs can be printed in advertisements, for example. However, URLs should be long enough to be readable, to not obscure their meaning, and to retain the website’s hierarchy.

If you can’t make URLs short enough to be readable, retain meaning, and retain hierarchy, create alternate URLs for print, advertisements, etc.

Well Designed Links Do Not Trigger JavaScript

How often do I find web pages with links that trigger JavaScript? Grrrrr!!!! (Can you say "__doPostBack()"? Yes, I feared you could.) What’s wrong with JavaScript links? Users often check the browser’s status bar to view the URL, which gives them a clue where the link will take them. Not with JavaScript. Plus, many users hold down the shift key to launch a new window. NOT WITH JAVASCRIPT!!! (Can you feel my anger? Well designed links do NOT trigger JavaScript.)

If this is so bad, why is this done? For the application server developer’s convenience, and not for the user’s convenience; that’s for sure.
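To make the contrast concrete, here is a sketch of the two kinds of links; the __doPostBack() arguments and the /sweaters/ path are hypothetical:

    <!-- Bad: the status bar shows nothing useful, shift-click can't open
         a new window, and search engines see no URL to follow. -->
    <a href="javascript:__doPostBack('ctlCatalog','23')">Sweaters</a>

    <!-- Good: a real URL that can be previewed in the status bar,
         bookmarked, opened in a new window, and crawled. -->
    <a href="/sweaters/">Sweaters</a>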

Well Designed Search Forms Always Use GET

Search engine result pages provide content too, and their URLs need to point to content that doesn’t change. Well, of course they do change over time as they display what is current, but that’s appropriate for a search engine result page. If a search form uses POST, its result page URL is only useful as the action of that search form, and worthless in every other case.

For a perfect example of a search page that violates this rule, check out CycleTrader’s Search Page.
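By contrast, here is a quick sketch of a search form done right; the /search path and the q parameter name are just placeholders:

    <!-- GET puts the query into the URL, e.g. /search?q=sweaters, so the
         result page can be bookmarked, emailed, linked to, and indexed. -->
    <form action="/search" method="get">
      <input type="text" name="q" />
      <input type="submit" value="Search" />
    </form>

    <!-- The same form with method="post" yields a result page whose URL
         says nothing, and refreshing it prompts the user to resubmit. -->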

Well Designed URLs Have More Friends than Just Me

Of course I’m not the first to make the case for URLs, and I probably won’t be the last. Here are some of the better essays about quality URLs:

There are also a few tools that can help:

Well Designed URLs are an Asset

Some people rail against the URL and say it is an overly technical anachronism to which non-technical people should not be exposed. I completely disagree.

Just like so many other technical things that have become such a part of common culture over the years as to be all but invisible, such as radio station frequencies, checking account numbers, ATM passcodes, and highway speed limits, URLs will become so straightforward that they’ll soon not even be recognized as technical. Instead, they’ll be viewed as obvious and valuable, because they fundamentally are.

So, in a nutshell:

Well Designed URLs are Beautiful!

 

AJAX: A Panacea, or a Pending Train Wreck?

As a former MapQuest user, I find Google Maps’ interactive interface built with AJAX (Asynchronous JavaScript and XML) really cool and lots of fun to use. And Google’s GMail, also built with AJAX, has a nice snappy UI that’s much better than Hotmail’s. And the more websites that adopt AJAX techniques, the less stodgy and more fun the web will be.

But even so, I still think this Next Big Thing called AJAX could well turn out to be a disaster for the web (disaster with a lower case "d," that is.)

Why? The simple answer is "state," or the lack thereof.

In the early days of the web, a Uniform Resource Locator pointed to a page on the web that stayed the same, at least until updated. Later, many URLs pointed to dynamic content, but most returned deterministic content; i.e. they returned the same HTML even if accessed multiple times. Web pages built with AJAX, however, have a potentially infinite number of states, many of which will never have a URL.

Why is this bad? Let me give some examples. Have you ever tried to email a friend a Google Map at the precise zoom level you are viewing? Fortunately Google makes it possible with the "Email" link, but how many other web developers will go to that level of effort?

As another example, try using GMail. Click around a bit. Now click the refresh button. Shazam! The page view you had disappears and you return to the Inbox. Now click the [Back] button. Do you go to the state where you were previously? No! You go back to the prior web page, whatever that was. You cannot go "back" to the prior state in an AJAX site because the browser doesn’t manage that state for you. Instead, the AJAX developer has to manage state and provide a "Back" link, or you won’t be able to go back. So what’s the likelihood that a developer who is behind schedule on a new website will go to that extra effort, or even know how?
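It can be done, but look at how much extra machinery even a minimal sketch requires. The idea is to record each state change in the URL fragment so Refresh, Back, and bookmarks still mean something; loadMessage() and the message IDs below are hypothetical stand-ins for a real mail application’s logic, and in practice browser quirks of the day push libraries into even uglier hidden-iframe tricks:

    // Give each AJAX state a URL of sorts by recording it in the fragment.
    function showMessage(id) {
      loadMessage(id);                       // hypothetical: fetch and render via XMLHttpRequest
      window.location.hash = "#msg-" + id;   // now Refresh and bookmarks can find this state
    }

    // On load (or refresh), restore whatever state the fragment names.
    window.onload = function () {
      var hash = window.location.hash;
      if (hash.indexOf("#msg-") === 0) {
        loadMessage(hash.substring(5));
      }
    };

    // Poll for fragment changes so the Back button walks through prior
    // states; polling was the common workaround before browsers offered
    // a hash-change event.
    var lastHash = window.location.hash;
    setInterval(function () {
      if (window.location.hash !== lastHash) {
        lastHash = window.location.hash;
        if (lastHash.indexOf("#msg-") === 0) {
          loadMessage(lastHash.substring(5));
        }
      }
    }, 250);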

Okay, these are just nits from someone resistant to change, right? Well, not exactly. The biggest problem with the lack of state is that programs, such as search engines (Google, for example), can’t process AJAX web pages programmatically. Anything hidden behind AJAX’s interactivity is also hidden from search engine view.

In the July 4th, 2005 issue of Information Week, Google’s own Bret Taylor claims that clicking on a blue URL and loading a page is "the old Web user interface." It’s ironic that the company that spawned interest in AJAX is the one that depends most on the programmatically indexed web pages that AJAX will minimize.