Archives for the year of: 2007

As I mentioned on Twitter, as of this week I’ve got a new job: managing the Jazz.net community site. I’ll continue to lead development of the framework-y pieces of the Jazz Platform web UI stack.

This is going to be a fun job since we’re trying something relatively new with the Jazz Project that we’re calling “Open Commercial Development”. On traditional commercial software projects, there’s a pretty high wall between the development team and everyone else. With the Jazz Platform and the products we’re building on top of it, we want to create a much higher bandwidth channel between the development team and people who use Jazz-based tools and people who build their own tools on top of the Jazz Platform.

So my job is to make it easier for the development team, users, and extenders to collaborate. This of course means good docs and good tools for collaboration, but more than anything, it means getting to know more folks who are interested in learning more about Jazz technology and influencing its direction.

So if you have questions or concerns about Jazz.net, feel free to drop me a line. My work email address is bhiggins@us.ibm.com and my cell phone number is 919-564-6862. Please don’t be shy 🙂

Update: Changed title; original one sounded too presumptuous (was “what development tools do you need besides Jazz?”)

Aaron Cohen of Rational asked an interesting question in the Jazz.net forum the other day. I’ve reprinted his question and my answer here verbatim, with permission from Aaron and the Jazz Project.

Question (Cohen):

Right now Jazz.net uses Wikis and newsgroups to supplement Jazz. If a team is deploying Jazz, what type of Wiki or other collaborative software should we use with it?

Answer (Higgins):

Aaron, it really depends on the needs of your team. With Jazz, our goal is not to reinvent every known software and/or collaboration tool as a Jazz component. Basically we look for opportunities where deep integration with other Jazz components could produce a much more productive experience.

Just implementing a Wiki on top of the Jazz repository is not interesting; the world doesn’t need another Wiki implementation. But it is interesting to observe how we tend to use our own Wiki and then explore how we could provide new functionality in Jazz to replace Wikis for these needs. For example, every Jazz component lead used to create an iteration plan page in the Wiki. Observing this, the Agile Planning team provided an “Overview” page for each iteration plan, where a team can write semi-structured markup as well as link forward to important work items. Now most component leads don’t feel the need to create a separate plan overview in the Wiki.

In the case of newsgroups, we needed a lightweight way to make announcements to a subset of the project, and newsgroups have been doing this for years so we used newsgroups. This continues to work really well, so… I don’t know why we’d ever change it.

But getting back to your question, I’d rephrase it as “What non-Jazz tools do I find my team using to support our software development?” And if you notice yourself using some non-Jazz tools for large chunks of your development, it’s worth asking “Could a Jazz-based component provide a richer, more productive, more integrated experience?” If the answer to the second question is “Yes”, then we encourage you to consider either building your own Jazz component to fulfill the need or to raise enhancement requests against us.

Thanks for your question and I hope this helps.

I’ll be in Beijing, China next week giving a couple of talks on the Jazz Platform and Rational Team Concert:

Rational Team Concert: Agile Team Development with Jazz

Inside the Jazz Technology Platform

  • When? Friday, 31 Aug 2007 (10AM – Noon, local time)
  • Where? IBM China Research Lab, Beijing
  • What? A technology-focused presentation on the Jazz Platform architecture and how to extend the Jazz Platform with custom components (charts courtesy of Scott Rich and Kai Maetzel)

I have a three year-old son named Alexander who’s one hundred percent boy. The other night I was getting him ready for his bath. As soon as I took off his last piece of clothing, he ran away, waving his arms madly and yelling “Look everybody, I’m a naked boy!”

He ran to where his mom was sitting, put his hands on his hips, stuck his little tummy out as far as he could, and shouted “Look Mama, I have fat belly!”

I laughed but quickly realized the joke was on me when he looked my way and with all seriousness finished his thought “…like Daddy”.

There’s a theory called ‘The Uncanny Valley’ regarding humans’ emotional response to human-like robots. From the Wikipedia entry:

The Uncanny Valley is a hypothesis about robotics concerning the emotional response of humans to robots and other non-human entities. It was introduced by Japanese roboticist Masahiro Mori in 1970 […]

Mori’s hypothesis states that as a robot is made more humanlike in its appearance and motion, the emotional response from a human being to the robot will become increasingly positive and empathic, until a point is reached beyond which the response quickly becomes strongly repulsive. However, as the appearance and motion continue to become less distinguishable from a human being’s, the emotional response becomes positive once more and approaches human-human empathy levels.

This area of repulsive response aroused by a robot with appearance and motion between a “barely-human” and “fully human” entity is called the Uncanny Valley. The name captures the idea that a robot which is “almost human” will seem overly “strange” to a human being and thus will fail to evoke the requisite empathetic response required for productive human-robot interaction.

While most of us don’t interact with human-like robots frequently enough to accept or reject this theory, many of us have seen movies like The Polar Express or Final Fantasy: The Spirits Within, which use realistic – as opposed to cartoonish – computer-generated human characters. Although the filmmakers take great care to make the characters’ expressions and movements replicate those of real human actors, many viewers find these almost-but-not-quite-human characters to be unsettling or even creepy.

The problem is that our minds have a model of how humans should behave, and the pseudo-humans – whether robots or computer-generated images – don’t quite fit this model, producing a sense of unease: we know that something’s not right, even if we can’t precisely articulate what’s wrong.

Why don’t we feel a similar sense of unease when we watch a cartoon like The Simpsons, where the characters are even further away from our concept of humanness? Because in the cartoon environment, we accept that the characters are not really human at all – they’re cartoon characters and are self-consistent within their animated environment. Conversely, it would be jarring if a real human entered the frame and interacted with the Simpsons, because eighteen years of Simpsons cartoons and eighty years of cartoons in general have conditioned us not to expect this [Footnote 1].

There’s a lesson here for software designers, and one that I’ve talked about recently – we must ensure that we design our applications to remain consistent with the environment in which our software runs. In more concrete terms: a Windows application should look and feel like a Windows application, a Mac application should look and feel like a Mac application, and a web application should look and feel like a web application.

Obvious, you say? I’d agree that software designers and developers generally observe this rule except in the midst of a technological paradigm shift. During periods of rapid innovation and exploration, it’s tempting and more acceptable to violate the expectations of a particular environment. I know this is a sweeping and abstract claim, so let me back it up with a few examples.

Does anyone remember Active Desktop? When Bill Gates realized that the web was a big deal, he directed all of Microsoft to web-enable its software products. Active Desktop was a feature that made the Windows desktop look like a web page and allowed users to initiate the default action on a file or folder via a hyperlink-like single-click rather than the traditional double-click. One of the problems with Active Desktop was that it broke all of users’ expectations about interacting with files and folders. Changing from the double-click to single-click model subtly changed other interactions, like drag and drop, select, and rename. The only reason I remember this feature is because so many non-technical friends at Penn State asked me to help them turn it off.

Another game-changing technology of the 1990s was the Java platform. Java’s attraction was that the language’s syntax looked and felt a lot like C and C++ (which many programmers knew) but it was (in theory) ‘write once, run anywhere’ – in other words, multiplatform. Although Java took hold on the server-side, it never took off on the desktop as many predicted it would. Why didn’t it take off on the desktop? My own experience with using Java GUI apps of the late 1990s was that they were slow and they looked and behaved weirdly vs. standard Windows (or Mac or Linux) applications. That’s because they weren’t true Windows/Mac/Linux apps. They were Java Swing apps which emulated Windows/Mac/Linux apps. Despite the herculean efforts of the Swing designers and implementers, they couldn’t escape the Uncanny Valley of emulated user interfaces.

Eclipse and SWT took a different approach to Java-based desktop apps [Footnote 2]. Rather than emulating native widgets, SWT favors direct delegation to the native desktop widgets [Footnote 3], resulting in applications that look like Windows/Mac/Linux applications rather than Java Swing applications. The downside of this design decision is that SWT widget developers must manually port a new widget to each supported desktop environment. This development-time and maintenance pain point only serves to emphasize how important the Eclipse/SWT designers judged native look and feel to be.
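
As a small illustration (this class is my own made-up example, not code from Eclipse), a few lines of SWT produce a genuinely native button on whatever platform you run it on; the delegation to the platform’s widgets happens inside the toolkit, so the same Java code yields a native look and feel on Windows, Mac, and Linux:

    import org.eclipse.swt.SWT;
    import org.eclipse.swt.widgets.Button;
    import org.eclipse.swt.widgets.Display;
    import org.eclipse.swt.widgets.Shell;

    public class NativeButtonExample {
        public static void main(String[] args) {
            Display display = new Display();              // connection to the native windowing system
            Shell shell = new Shell(display);             // a native top-level window
            Button button = new Button(shell, SWT.PUSH);  // delegates to the platform's native button widget
            button.setText("Native look and feel");
            button.pack();
            shell.pack();
            shell.open();
            while (!shell.isDisposed()) {                 // standard SWT event loop
                if (!display.readAndDispatch()) {
                    display.sleep();
                }
            }
            display.dispose();
        }
    }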

Just like Windows/Mac/Linux apps have a native look and feel, so too do browser-based applications. The native widgets of the web are the standard HTML elements – hyperlinks, tables, buttons, text inputs, select boxes, and colored spans and divs. We’ve had the tools to create richer web applications ever since pre-standards DOMs and Javascript 1.0, but it’s only been the combination of DOM (semi-)standardization, XHR de-facto standardization, emerging libraries, and exemplary next-gen apps like Google Suggest and Gmail that has led a non-trivial segment of the software community to attempt richer web UIs, which I believe we’re now lumping under the banner of ‘Ajax’ (or is it ‘RIA’?). Like the web and Java before it, the availability of Ajax technology is causing some developers to diverge from the native look and feel of the web in favor of a user interface style I call “desktop app in a web browser”. For an example of this style of Ajax app, take a few minutes and view this Flash demo of the Zimbra collaboration suite.

To me, Zimbra doesn’t in any way resemble my mental model of a web application; it resembles Microsoft Outlook [Footnote 4]. On the other hand, Gmail, which is also an Ajax-based email application, almost exactly matches my mental model of how a web application should look and feel (screenshots). Do I prefer the Gmail look and feel over the Zimbra look and feel? Yes. Why? Because over the past twelve years, my mind has developed a very specific model of how a web application should look and feel, and because Gmail aligns with this model, I can immediately use it and it feels natural to me. Gmail uses Ajax to accelerate common operations (e.g. email address auto-complete) and to enable data transfer sans jarring page refresh (e.g. refreshing Inbox contents), but its core look and feel remains very similar to that of a traditional web page. In my view, this is not a shortcoming; it’s a smart design decision.

So I’d recommend that if you’re considering or actively building Ajax/RIA applications, you should consider the Uncanny Valley of user interface design and recognize that when you build a “desktop in the web browser”-style application, you’re violating users’ unwritten expectations of how a web application should look and behave. This choice may have significant negative impact on learnability, pleasantness of use, and adoption. The fact that you can create web applications that resemble desktop applications does not imply that you should; it only means that you have one more option and subsequent set of trade-offs to consider when making design decisions.

[Footnote 1] Who Framed Roger Rabbit is a notable exception.

[Footnote 2] I work for the IBM group (Eclipse/Jazz) that created SWT, so I may be biased.

[Footnote 3] Though SWT favors delegation to native platform widgets, it sometimes uses emulated widgets if the particular platform doesn’t provide an acceptable native widget. This helps it get around the ‘least-common denominator’ problem of AWT.

[Footnote 4] I’m being a bit unfair to Zimbra here because there’s a scenario where its Outlook-like L&F really shines. If I were a CIO looking to migrate off of Exchange/Outlook to a cheaper multiplatform alternative, Zimbra would be very attractive: since Zimbra is functionally consistent with Outlook, I’d expect that Outlook users could transition to it fairly quickly.

Note to readers: For a while now, I’ve been looking for guidance on designing useful messages and message-based systems, but without much luck. To help others and also because I learn by writing, I’m going to use my blog to document some of the messaging lessons I’ve learned over the past couple of years. I hope this blog entry and future ones like it don’t seem overly pedantic; my only goal is to help clarify my own thoughts and perhaps help others looking for similar information on a topic with which I’ve personally struggled.

In this blog entry, I talk about the fundamentals of caching resource representations in HTTP-based distributed systems using the language of basic concepts while avoiding HTTP terminology which might sidetrack novice readers. This entry does assume some knowledge of HTTP (e.g. requests, responses, URIs), so if you find these concepts sidetracking you, I’d suggest you read the first couple of chapters of a book like HTTP: The Definitive Guide to familiarize yourself.

If you’re already familiar with HTTP caching (e.g. most likely anyone reading this via Planet Intertwingly), you may wish to skip this entry altogether, unless you’re curious about my take on the topic or are interested in looking for mistakes or misrepresentations. If you do find a problem, please add a comment and I’ll attempt to correct and/or clarify.

Intro

One of the benefits of developing distributed applications using the REST architectural style with the HTTP protocol is first-class support for caching documents (or ‘entity-bodies’ in HTTP terminology). If you’re simply serving files using a world-class web server like Apache HTTP Server, you get some degree of caching for free. But in dynamic web applications, you’re often generating dynamic documents (e.g. an XML document containing data from a row in a relational database) rather than simply serving files (where the resource and the representation are equivalent).

Unless you’re using an application framework that automatically generates caching information for HTTP responses based on the framework’s meta-data model, you’ll likely have to roll your own caching logic. This presents both a challenge and an opportunity. The challenge is that you must learn about the various HTTP caching options so that you can intelligently apply them to your particular data model; the opportunity is that you can often take advantage of your data model’s semantics to perform smarter caching logic than out-of-the-box file system caching.

In this entry I describe the basic rationale for caching and then discuss the basic caching options possible with the HTTP protocol. Note that I describe these caching options at a very high level, without getting into many implementation details; at this level the ‘HTTP caching options’ are really more like general caching patterns. Nevertheless, I describe them in the context and language of HTTP, since it’s both a ubiquitously deployed protocol and the one with which I’m most familiar.

Why Cache?

Caching may be one of the most boring topics in software, but if you’re working with distributed systems (like the web), smart cache design is absolutely vital to both system scalability and responsiveness, among other things. In brief, a cache is simply a local copy of data that resides elsewhere. A computing component (whether hardware or software) uses a data cache to avoid performing an expensive operation like fetching data over a network or executing a computationally-expensive algorithm. The trade-off is that your copy of the data may become out of sync with the original data source, or stale, in caching terminology. Whether or not staleness matters depends on the nature of the data and the needs of your application.

For example, if your web site displays the average daily temperature for Philadelphia over the past hundred years, you probably display a simple stored data element (e.g. “59 degrees F”) rather than performing this very expensive computation in realtime. Because it would take a long period of unusual weather to noticeably affect the result, it doesn’t really matter if your cached copy doesn’t consider very recent temperatures. At the other extreme, an automated teller machine (ATM) definitely should not use a cached copy of your checking account balance when determining whether you have enough money to make a withdrawal, since this might allow a malicious customer to make simultaneous withdrawals of his entire balance from multiple ATMs.

Generally speaking, the cacheability of a particular piece of data varies along two axes:

  • the volatility of the data
  • the potential negative impact of using stale data

HTTP Caching Options

Caching is a first-class concern of the REST architectural style and the HTTP protocol. Indeed, one of the main goals of HTTP/1.1 was to enhance the basic caching capabilities provided by HTTP/1.0 (see chapter 7 of Krishnamurthy and Rexford’s Web Protocols and Practice for an excellent discussion on the design goals of HTTP/1.1). At the risk of oversimplifying, for a given RESTful HTTP URI, you have three basic caching options:

  1. don’t use caching
  2. use validation-based caching
  3. use expiration-based caching

These options demonstrate the trade-offs between the need to avoid stale data and the performance benefits of using cached data. The no caching option means that a client will always fetch the most recent data available from an origin server. This is useful in cases where the data is extremely volatile and using stale data may have dire consequences. For example, anytime you view a list of current auctions on eBay (e.g. for 19th Century Unused US Stamps), you’ll notice many anti-caching directives in the HTTP response included to ensure that you always see the most recent state of the various auctions. The downside of no caching is that every request is guaranteed to incur some cost in terms of client-perceived latency, server resources (e.g. CPU, memory), and network bandwidth.
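
To make this concrete, here’s a rough sketch of what the ‘no caching’ option might look like on the server side, assuming a plain Java servlet (the AuctionListServlet class and its renderCurrentAuctions helper are made up for illustration; nothing here prescribes a particular server technology):

    import java.io.IOException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Hypothetical servlet for a highly volatile resource (e.g. a list of live auctions).
    public class AuctionListServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
            // Tell browsers and intermediaries not to store or reuse this response.
            resp.setHeader("Cache-Control", "no-cache, no-store, must-revalidate");
            resp.setHeader("Pragma", "no-cache");  // HTTP/1.0 fallback
            resp.setDateHeader("Expires", 0);      // a date in the past: already expired
            resp.setContentType("text/html");
            resp.getWriter().write(renderCurrentAuctions());
        }

        // Stand-in for whatever generates the dynamic document.
        private String renderCurrentAuctions() {
            return "<html><body>current auctions...</body></html>";
        }
    }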

Validation-based caching allows an HTTP response to include a logical ‘state identifier’ (such as an HTTP ETag or Last-Modified timestamp) which a client can then resend on subsequent requests for the same URI, potentially resulting in a short ‘not modified’ message from the server. Validation-based caching provides a useful trade-off between the need for fresh data and the goal to reduce consumption of network bandwidth and, to a lesser extent, server resources and client-perceived latency.

For example, imagine a web page that changes frequently but not on a regular schedule. This web page could use validation-based caching so that each time a client attempts to view the page, the request goes all the way back to the origin server but may result in either a full response (if the client either has an old version of the page or no cached version of the page) or a terse ‘not modified’ response (if the client has the most recent version of the page). All other things being equal, in the ‘not modified’ case the response will be smaller (since the server sent no document), the server will do less work (since it doesn’t have to stream the page bytes from disk or memory), and the client may observe a faster load time since the message is smaller and the user agent (e.g. the browser) may even have a cached rendering of the page. These are certainly superior non-functional characteristics to the ‘no caching’ case and we don’t have to worry about seeing stale data (assuming the client does the right thing). However, the server still did some work to determine that the client had the most recent resource, the client still experienced some latency waiting for the ‘not modified’ message, and we still used some network bandwidth to send the request and receive the (albeit short) response.
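
As a sketch of the validation-based option, again assuming a Java servlet (the PageServlet class and its helper methods are hypothetical), the server compares the client’s validator against the current one and sends either a terse 304 or the full document:

    import java.io.IOException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Hypothetical servlet for a page that changes frequently but unpredictably.
    public class PageServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
            // A 'state identifier' for the current version of the document, e.g. a content hash.
            String currentEtag = "\"" + computeStateIdentifier() + "\"";
            String clientEtag = req.getHeader("If-None-Match");

            if (currentEtag.equals(clientEtag)) {
                // The client already has the latest version: send a terse 'not modified' response.
                resp.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
                return;
            }

            // Otherwise send the full document along with its current validator.
            resp.setHeader("ETag", currentEtag);
            resp.setContentType("text/html");
            resp.getWriter().write(renderPage());
        }

        private String computeStateIdentifier() { return "abc123"; } // stand-in for a real hash
        private String renderPage() { return "<html><body>the page...</body></html>"; }
    }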

Expiration-based caching allows an origin server to associate an expiration timestamp with a particular document so that clients can simply assume that their cached copy is safe to use if it has not passed its expiration date. In other words, an origin server asserts that the document is ‘good’ or ‘good enough’ for a certain period of time. This sort of caching has fantastic performance characteristics but requires the designer to ensure either that:

  • the data won’t become stale before the expiration period ends, or
  • the impact of a client using stale data is negligible

An example of a resource that is well-suited for expiration-based caching is an image of a book cover on Amazon.com (e.g. the image of the cover of Steve Krug’s Don’t Make Me Think). While it’s possible that the book cover could change, it’s extremely unlikely, and since image files are relatively large, it would be wise for Amazon to set an expiration date so that clients load the image from their cache without even asking Amazon whether or not they have the most recent version. If somehow the cover of the book does change between when you cache your copy and when your cached copy expires, it’s not a big deal unless you base your purchasing decisions on book cover aesthetics.
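
A sketch of the expiration-based option, under the same servlet assumption (BookCoverServlet and loadCoverImageBytes are made up), simply declares the image ‘good’ for a week so clients and proxies can reuse it without asking again:

    import java.io.IOException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Hypothetical servlet for a stable, relatively large resource like a book cover image.
    public class BookCoverServlet extends HttpServlet {
        private static final long ONE_WEEK_SECONDS = 7L * 24 * 60 * 60;

        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
            // Any cache (browser or proxy) may reuse this response for up to a week.
            resp.setHeader("Cache-Control", "public, max-age=" + ONE_WEEK_SECONDS);
            resp.setDateHeader("Expires", System.currentTimeMillis() + ONE_WEEK_SECONDS * 1000);
            resp.setContentType("image/jpeg");
            resp.getOutputStream().write(loadCoverImageBytes());
        }

        private byte[] loadCoverImageBytes() { return new byte[0]; } // stand-in for the real image bytes
    }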

Another performance benefit of expiration-based caching is that even in the case where a client doesn’t have a valid cached copy of a document, it’s possible that a network intermediary (e.g. a proxy server) does. In this case a client requests a particular URI and before the request reaches the origin server, an intermediary determines that it has a still-valid cached copy of the document and returns its copy immediately rather than forwarding the request to the next intermediary or the origin server. It should be clear from these examples that expiration-based caching results in significantly less user-perceived latency and consumes significantly less network bandwidth and server resources. The trick is that you have to either guarantee no staleness or feel confident that the risks involved in a client processing stale data are justified by the performance benefits. Note that it’s generally not possible to take advantage of intermediary caching over an HTTPS connection.

Summary

In this entry I’ve explained the basic rationale for why we cache things in distributed systems and given an overview of the three basic caching options in REST/HTTP-based systems. This information represents a bare-bones set of fundamental caching concepts, but you must understand these concepts thoroughly before being able to make informed caching design choices vis-à-vis your data model.

In future entries, I’ll build upon these foundational concepts to discuss caching design strategies for various scenarios.

After some prodding from Pat Mueller and James Governor, I signed up for a Twitter account about a week ago. It’s surprisingly fun.

If for some reason you wish to follow my daily activities, you can do so via the following two links:

Don’t know what Twitter is? Ask Wikipedia.