On the same day that Liam was born, I received news that one of my two papers published at the ICSE 2000 conference has been given the International Conference on Software Engineering’s Most Influential Paper Award for its impact on software engineering research over the past decade. The paper, A case study of open source software development: the Apache server, is co-authored by Audris Mockus, myself, and James Herbsleb. The MIP is an important award within the academic world; my thanks to the award committee and congrats to Audris and Jim. I wish I could have been there in South Africa for the presentation. This year’s award is shared with a paper by Corbett et al. on Bandera.

Interestingly, my other paper in ICSE 2000 was the first conference paper about REST, co-authored with my adviser, Dick Taylor. That must have caused some debate within the awards committee. As I understand it, the MIP award is based on academic citations of the original paper and any follow-up publication in a journal. Since I encouraged people to read and cite my dissertation directly, rather than the ICSE paper’s summary or its corresponding journal version, I am not surprised that the REST paper is considered less influential. However, it does make we wonder what would have happened if I had never published my dissertation on the Web. Would that paper have been cited more, or would nobody know about REST? shrug. I like the way it turned out.

The next two International Conferences on Software Engineering will be held in Hawaii (ICSE 2011), with Dick as the general chair, and Zürich (ICSE 2012). That is some fine scheduling on the part of the conference organizers! Fortunately, I have a pretty good excuse to attend both.

Roy in Day t-shirt holding Liam in hospitalAfter years of planning and hoping and preparing and learning and worrying and just getting on with life, I became a Daddy in March. It came as a bit of a shock, in spite of the eight months of watching the ultrasounds and taking classes and helping Cheryl as the little pod grew. We had just moved to a bigger place, still had dozens of boxes left to unpack before the weekend’s baby shower, and I had only been asleep for a few hours when Cheryl woke me up with the news: Hospital, now!

Three weeks early. Twenty-two days early, to be exact. All the books say that the range of 38-42 weeks is “normal”, so he was only eight days ahead of the curve and (thank goodness) beyond the stage of preemie health concerns. 2600 grams (5.732 lbs.) of joy, and a healthy Mommy as well. Woohoo! Of course, that also meant we were tossed out of the hospital about 40 hours after birth, thanks to our wonderful US healthcare system.

Liam on his side with fist toward cameraThe staff and facilities at Hoag Hospital were excellent, but the whole experience was marred by the rush out of the hospital and then a corresponding rush back to the hospital three days later after a test for jaundice turned up in the critical range. We really weren’t prepared for that one; I am still peeved that the test wasn’t automatically scheduled for day 4 (instead of waiting for our pediatrician to see him on day 5). However, a night in the ICU tanning bed, with extra feeding to help evacuate the bilirubin, was enough to get him back to a safe zone and he was good to go home again.

Twenty-two days early doesn’t sound like much, but it is huge. Most of our friends went long for their first baby, so I had this schedule in the back of my mind of all the things that I was going to finish by April so that I could take a long, relaxing break into parenthood. Bzzt! The Anaheim IETF meeting was being held the following week, just twelve miles from my house, and my fellow HTTP standard editors had planned a whole week of editing httpbis at or near my place. Bzzt! We had delayed buying a bunch of baby things until after the shower. Bzzt! We had all these classes on what to expect in terms of sensing the arrival and onset of labor. Bzzt!

Liam sleeping on his dad's shoulderNone of those plans truly mattered, in the grand scheme of things, but it taught me a quick lesson about my limitations as a working Daddy. At least some of my planning worked out, such as saving my vacation time so that I could spend the better part of six weeks at home. He is almost at two months now and still has to eat every three hours. I usually take the night shift and catch up with email while he sleeps on my shoulder. This weekend I discovered that I can actually type this way, with Liam sliding down a bit to warm his legs on my laptop, though I have to watch out when his little feet brush over the multitouch trackpad.

I’ll be catching up on the backblog soon. Now, if I can just get him to sleep long enough to edit a specification …

BTW, Liam is his nickname.

One of the chores that I do for the Apache HTTP server project, every three months or so, is to slog through the IANA media type registry to see what new media types have been registered and add them to the mime.types configuration file. This is one of the few things I do that is almost all pain for little or no gain. It takes hours to do it right because IANA has gone out of their way to make the registry impossible to process automatically via simple scripts. I don’t even get the pleasure of “changing the world” in some meaningful way, since Apache doesn’t update mime.types automatically when installed to an existing configuration.

BTW, if you are responsible for an existing Apache installation, please copy the current mime.types configuration file and install it manually — your users will thank you later not gripe as much about unsupported media types.

IANA is a quaint off-shoot of the Internet Engineering Taskforce that, much like the IETF, is still stuck in the 1980s. One would think that, given a task like “maintain a registry of all media types” so that Internet software can communicate, would lead to something that is comprehensible by software. Instead, what IANA has provided is a collection of FTP directories containing a subset of private registry templates, each in the original (random) submitted format, and nine separate inconsistently-formated index.html files that actually contain the registered types.

The first thought that any Web developer has when they look at the registry is that it should be laid out as a resource space by type. That is, each directory under “media-types” would be a major type (e.g., application, text, etc.) and then each file within those directories would correspond to exactly one subtype (e.g., html, plain, csv, etc.). Such a design would be easy to process automatically and fits with the organization’s desire to serve everything via both FTP and HTTP. Sadly, that is not the case. Most of the private registrations have some sort of like-named file within the expected directory to contain its registration template, but the names do not always correspond exactly to the subtype and the contents are whatever random text was submitted (rather than some consistent format that could be extracted). What’s worse, however, is that the standardized types do not have any corresponding file; instead, the type’s entry in the index may have some sort of link to the RFC or external specification that defines that type.

grumble

The second thought of any Web developer would be “oh, I’ll just have to process the index files to extract the media type fields.” Good luck. The HTML is not well-formed (even by HTML standards). It uses arbitrarily-created tables to contain the actual registry information. There is no consistency across the files in terms of the number of table columns, nor any column headers to indicate what they mean. There is no mark-up to distinguish the registry cells from other whitespace-arranging layout cells. And the registered types are occasionally wrapped in inconsistently-targeted anchors for links to the aforementioned template files.

grumble GRUMBLE

Okay, so the really stubborn Web developers think that maybe a browser can grok this tag soup and generate the table in some reasonably consistent fashion, which can then be screen-scraped to get the relevant information. Nope. It doesn’t even render the same on different browsers. In any case, the index files don’t contain the relevant information: the most important information (aside from the type name) is the unique filename extension(s) that are supposed to be used for files of that type. For that information, we have to follow the link to the registry template file, or RFC containing one or more template files, and look for the optional form field for indicating extensions. Most of the time, the field is empty or just plain wrong (i.e., almost all XML-based formats suggest that the filename extension is .xml, in spite of the fact that the only reason to supply an extension is so that all files of that extension can be mapped to that specific type).

sigh

And, perhaps the most annoying thing of all: the index files are obviously being generated from some other data source that is not part of the public registry.

Normally, what I am left with is a semi-manual procedure. I keep a mirror of the registry files on my laptop and, each time I need to do an update, I pull down a new mirror and run a diff between the old and new index files. I then manually look-up the registry template for file extensions or, if that fails, do a web search for what the deployed software already does. I then do a larger Web search for documentation that various companies have published about their unregistered file types, since I’ve given up on the idea that companies like Adobe, Microsoft, and Sun will ever register their own types before deploying some half-baked experimental names that we are stuck with forever due to backwards-compatibility concerns.

Unfortunately, yesterday I messed up that normal procedure. I forgot that I had started to do the update a month ago by pulling down a new mirror, but hadn’t made the changes yet. So I blew away my last-update-point before doing the diff.

groan

After reliving all of the above steps, I ended up with a new semi-manual procedure:

wget -m ftp://ftp.iana.org/assignments/media-types/
cd ftp.iana.org/assignments/media-types
foreach typ (`find * -type d -print`)
   links -dump $typ/index.html | \
      perl -p -e "s|^\s+|$typ/|;" >> mtypes.txt
end
# manually edit mtypes.txt to remove the garbage lines
foreach typ (`cut -d ' ' mtypes.txt`)
   grep -q -i -F "$typ" mime.types || echo $typ
end

That gave me a list of new registered types that were not already present in mime.types. I still had to go through the list manually, add each type to its location within mime.types, and search for its corresponding file extension within the registry templates. As usual, most of the types either had no file extension (typical for types that are only expected to be used within message envelopes) or non-unique extensions that can’t be added to the configuration file because they would override some other (more common) type.

Please, IANA folks, fix your registries so that they can be read by automated processes. Do not tell me that I have to write an RFC to specify how you store the registry files. The existing mess was not determined by an RFC, so you are free to fix it without a new RFC. If you have software generating the current registry, then I will be more than happy to fix it for you if you provide me with the source code. At the very least, include a text/csv export of whatever database you are using to construct the awful index files within the current registry.

Why am I bothering with all this? Because media types are the only means we have for an HTTP sender to express the intent for processing a given message payload. While some people have claimed that recipients should sniff the data format for type information, the fact is that all data formats correspond to multiple media types. Sniffing a media type is therefore inherently impossible: at best, it can indicate when a data format does not match the indicated media type; at worst, it breaks correct configurations and creates security holes. In any case, sniffing cannot determine the sender’s intent.

The intent can only be expressed by sending the right Content-Type for a given resource. The resource owner needs to configure their resource correctly. Even though Apache provides at least five different ways to set the media type, most authors still rely on the installed file extension mappings for representations that are not dynamically-generated. Hence, most will rely on whatever mime.types file has been installed by their webmaster, even if it hasn’t been updated in ten years.

How old is your mime.types file?

Tim Bray’s article on RESTful Casuistry revisits an odd meme in the REST debates that I’ve been meaning to discredit for a while.

Some people think that REST suggests not to use POST for updates.  Search my dissertation and you won’t find any mention of CRUD or POST. The only mention of PUT is in regard to HTTP’s lack of write-back caching.  The main reason for my lack of specificity is because the methods defined by HTTP are part of the Web’s architecture definition, not the REST architectural style. Specific method definitions (aside from the retrieval:resource duality of GET) simply don’t matter to the REST architectural style, so it is difficult to have a style discussion about them. The only thing REST requires of methods is that they be uniformly defined for all resources (i.e., so that intermediaries don’t have to know the resource type in order to understand the meaning of the request). As long as the method is being used according to its own definition, REST doesn’t have much to say about it.

For example, it isn’t RESTful to use GET to perform unsafe operations because that would violate the definition of the GET method in HTTP, which would in turn mislead intermediaries and spiders.  It isn’t RESTful to use POST for information retrieval when that information corresponds to a potential resource, because that usage prevents safe reusability and the network-effect of having a URI. But why shouldn’t you use POST to perform an update? Hypertext can tell the client which method to use when the action being taken is unsafe. PUT is necessary when there is no hypertext telling the client what to do, but lacking hypertext isn’t particularly RESTful.

POST only becomes an issue when it is used in a situation for which some other method is ideally suited: e.g., retrieval of information that should be a representation of some resource (GET), complete replacement of a representation (PUT), or any of the other standardized methods that tell intermediaries something more valuable than “this may change something.” The other methods are more valuable to intermediaries because they say something about how failures can be automatically handled and how intermediate caches can optimize their behavior. POST does not have those characteristics, but that doesn’t mean we can live without it. POST serves many useful purposes in HTTP, including the general purpose of “this action isn’t worth standardizing.”

I think the anti-POST meme got started because of all the arguments against tunneling other protocols via HTTP’s POST (e.g., SOAP, RSS, IPP, etc.). Somewhere along the line people started equating the REST arguments of “don’t violate HTTP’s method definitions” and “always use GET for retrieval because that forces the resource to have a URI” with the paper tiger of “POST is bad.”

Please, let’s move on. We don’t need to use PUT for every state change in HTTP. REST has never said that we should.

What matters is that every important resource have a URI, therein allowing representations of that resource to be obtained using GET.  If the deployment state is an important resource, then I would expect it to have states for undeployed, deployment requested, deployed, and undeployment requested. The advantage of those states is that other clients looking at the resource at the same time would be properly informed, which is just good design for UI feedback. However, I doubt that Tim’s application would consider that an important resource on its own, since the deployment state in isolation (separate from the thing being deployed) is not a very interesting or reusable resource.

Personally, I would just use POST for that button. The API can compensate for the use of POST by responding with the statement that the client should refresh its representation of the larger resource state. In other words, I would return a 303 response that redirected back to the VM status, so that the client would know that the state has changed.

As you may have noted, my last post seems to have hit a nerve in various communities, particularly with those who are convinced that REST means HTTP (because, well, that’s what they think it means) and that any attempt by me to describe REST with precision is just another elitist philosophical effort that won’t apply to those practical web developers who are just trying to get their javascript to work on more than one browser.

Apparently, I use words with too many syllables when comparing design trade-offs for network-based applications. I use too many general concepts, like hypertext, to describe REST instead of sticking to a concrete example, like HTML. I am supposed to tell them what they need to do, not how to think of the problem space. A few people even complained that my dissertation is too hard to read. Imagine that!

My dissertation is written to a certain audience: experts in the fields of software engineering and network protocol design. These are folks with long industry careers or graduate degrees, usually Ph.D.s who have spent decades learning about their field, identifying an untrodden path to pursue advanced research, and eventually becoming so familiar with that path that they are (hopefully) able to learn something that nobody else in the world has revealed before. In the process, they become specialists, because it is only through specialization that a human being can become sufficiently knowledgeable to find what has yet to be known in a field as large as computing. It is only by becoming specialists that we can understand each other when we explain what we have learned, and thereby grow the field of knowledge over time.

James Burke described the problem of specialization in the final episode of his first series on BBC, Connections. If you are having a hard time following my work, then I strongly suggest you go find a copy of the old episodes somewhere and watch them, bearing in mind that it was first broadcast in 1978, when the folks who brought you the Web were at their most impressionable early age. Mr. Burke would appreciate that connection, I think. What he said deserves a bit of transcripting on my part:

The other, general thing to be said about how change comes about through innovation, and especially about the rate in which that change occurs, is that: the easier you can communicate, the faster change happens.

I mean, if you look back at the past, in that light, you’ll see that there was a great surge in invention in the European Middle Ages, as soon as they had reestablished safe communication between their cities, after the so-called dark ages. There was another one, in the sixteenth century, when these [books] gave scientists and engineers the opportunity to share their knowledge with each other, thanks to a German goldsmith called Johannes Gutenberg who’d invented printing back in the 1450s. And then, when that developed out there, telecommunications, oh a hundred-odd years ago, then things really started to move.

It was with that second surge, in the sixteenth century, that we moved into the era of specialization: people writing about technical subjects in a way that only other scientists would understand. And, as their knowledge grew, so did their need for specialist words to describe that knowledge. If there is a gulf today, between the man-in-the-street and the scientists and the technologists who change his world every day, that’s where it comes from.

It was inevitable. Everyday language was inadequate. I mean, you’re a doctor. How do you operate on somebody when the best description of his condition you have is “a funny feeling in the stomach?” The medical profession talks mumbo jumbo because it needs to be exact. Or would you rather be dead?

And that’s only a very obvious example. Trouble is, when I’m being cured of something, I don’t care if I don’t understand. But what happens when I do care? When, say, the people we vote for are making decisions that effect our lives deeply, `cause that is, after all, when we get our say, isn’t it? When we vote? But say the issue relates to a bit of science and technology we don’t understand? Like, how safe is a reactor somebody wants to build? Or, should we make supersonic airplanes? Then, in the absence of knowledge, what is there to appeal to except our emotions? And then the issue becomes “national prestige,” or “good for jobs,” or “defense of our way of life,” or something. And suddenly you’re not voting for the real issue at all.

[James Burke, “Yesterday, Tomorrow and You” (19:02-21:30), 1978]

Still timely after all these years, isn’t it?

As scientists go, I am a generalist: the topics that I care about range from international politics to physics, with most applications of computing somewhere in between. However, when I send out a message to API designers, I expect the audience to be reasonably competent in the field. I have to talk to them as a specialist because I want them to understand, as specialists themselves, exactly what I am trying to convey and not some second-order derivatives. Most of the terms that I use should already be familiar to them (and thus it is a waste of everyone’s time for me to define them). When there is a concern about a particular term, like hypertext, it can be resolved by pointing out the relevant definitions that I use as an expert in the field.

I don’t try to tell them exactly what to do because, quite frankly, I don’t have anywhere near enough knowledge of their specific context to make such a decision. What I can do is tell them what isn’t REST or that doesn’t fit my definitions, because that is something about which I am guaranteed to know more than anyone else on this planet. That’s what happens when you complete a dissertation on a topic.

So, when you find it hard to understand what I have written, please don’t think of it as talking above your head or just too philosophical to be worth your time. I am writing this way because I think the subject deserves a particular form of precision. Instead, take the time to look up the terms. Think of it as an opportunity to learn something new, not because I said so, but because it will do you some personal good to better understand the depth of our field. Not just the details of what I wrote, but the background knowledge implied by all the strange terms that I used to write it.

Others will try to decipher what I have written in ways that are more direct or applicable to some practical concern of today. I probably won’t, because I am too busy grappling with the next topic, preparing for a conference, writing another standard, traveling to some distant place, or just doing the little things that let me feel I have I earned my paycheck. I am in a permanent state of not enough time. Fortunately, there are more than enough people who are specialist enough to understand what I have written (even when they disagree with it) and care enough about the subject to explain it to others in more concrete terms, provide consulting if you really need it, or just hang out and metablog. That’s a good thing, because it helps refine my knowledge of the field as well.

We are communicating really, really fast these days. Don’t pretend that you can keep up with this field while waiting for others to explain it to you.

I am getting frustrated by the number of people calling any HTTP-based interface a REST API. Today’s example is the SocialSite REST API. That is RPC. It screams RPC. There is so much coupling on display that it should be given an X rating.

What needs to be done to make the REST architectural style clear on the notion that hypertext is a constraint? In other words, if the engine of application state (and hence the API) is not being driven by hypertext, then it cannot be RESTful and cannot be a REST API. Period. Is there some broken manual somewhere that needs to be fixed?

API designers, please note the following rules before calling your creation a REST API:

  • A REST API should not be dependent on any single communication protocol, though its successful mapping to a given protocol may be dependent on the availability of metadata, choice of methods, etc. In general, any protocol element that uses a URI for identification must allow any URI scheme to be used for the sake of that identification. [Failure here implies that identification is not separated from interaction.]
  • A REST API should not contain any changes to the communication protocols aside from filling-out or fixing the details of underspecified bits of standard protocols, such as HTTP’s PATCH method or Link header field. Workarounds for broken implementations (such as those browsers stupid enough to believe that HTML defines HTTP’s method set) should be defined separately, or at least in appendices, with an expectation that the workaround will eventually be obsolete. [Failure here implies that the resource interfaces are object-specific, not generic.]
  • A REST API should spend almost all of its descriptive effort in defining the media type(s) used for representing resources and driving application state, or in defining extended relation names and/or hypertext-enabled mark-up for existing standard media types. Any effort spent describing what methods to use on what URIs of interest should be entirely defined within the scope of the processing rules for a media type (and, in most cases, already defined by existing media types). [Failure here implies that out-of-band information is driving interaction instead of hypertext.]
  • A REST API must not define fixed resource names or hierarchies (an obvious coupling of client and server). Servers must have the freedom to control their own namespace. Instead, allow servers to instruct clients on how to construct appropriate URIs, such as is done in HTML forms and URI templates, by defining those instructions within media types and link relations. [Failure here implies that clients are assuming a resource structure due to out-of band information, such as a domain-specific standard, which is the data-oriented equivalent to RPC’s functional coupling].
  • A REST API should never have “typed” resources that are significant to the client. Specification authors may use resource types for describing server implementation behind the interface, but those types must be irrelevant and invisible to the client. The only types that are significant to a client are the current representation’s media type and standardized relation names. [ditto]
  • A REST API should be entered with no prior knowledge beyond the initial URI (bookmark) and set of standardized media types that are appropriate for the intended audience (i.e., expected to be understood by any client that might use the API). From that point on, all application state transitions must be driven by client selection of server-provided choices that are present in the received representations or implied by the user’s manipulation of those representations. The transitions may be determined (or limited by) the client’s knowledge of media types and resource communication mechanisms, both of which may be improved on-the-fly (e.g., code-on-demand). [Failure here implies that out-of-band information is driving interaction instead of hypertext.]

There are probably other rules that I am forgetting, but the above are the rules related to the hypertext constraint that are most often violated within so-called REST APIs. Please try to adhere to them or choose some other buzzword for your API.

About three weeks ago, a new “standard” for Content Management Interoperability Services (CMIS) was announced by EMC, IBM, and Microsoft with the usual fanfare of being the best thing since sliced bread and compliant with the latest buzzwords. One of those buzzwords, REST (as in Representational State Transfer), happens to be defined by my dissertation. I am getting tired of big companies making idiotic claims about REST and their so-called RESTful architectures. The only similarity between CMIS and REST is that they both have four-letter acronyms.

Note to our technology industry rags: A standard is an approved measure against which multiple independent organizations have agreed (by choice or by force) to have their products tested for compliance. A standards-effort is what we call a proposal when it is being actively worked on by some standards development organization. CMIS, on the other hand, is just a vendor proposal that is being submitted to OASIS. It only becomes a standards-effort once the OASIS members agree to host it, which shouldn’t be a problem given the pay-to-play nature of OASIS, and might become a standard if the final specification is approved.

CMIS is a thin veneer on RDBMS-based data repositories that provides a data model for document-like objects within filesystem-like folders, basic file versioning, and access via SQL queries and local object references. It is exactly the kind of document model one would expect within a legacy document management system that is backed by a large relational database and authored via Microsoft Office applications. No surprise, given the sponsors, and there are plenty of good reasons why folks would want to support such data models. For the interface, CMIS includes both a Web Services SOAP/WSDL protocol binding, tightly coupled to the data model, and a REST protocol binding, which also happens to be tightly coupled to the data model.

REST is an architectural style, not a protocol, and thus announcing it as a protocol binding is absurdly ignorant behavior for a group of technology companies. The RESTish protocol binding actually being proposed by CMIS is AtomPub, or at least it would be if not for the huge number of unnecessary protocol extensions that tunnel the Web Services interface through fake-Atom and fake-HTTP. The examples assume a single-script gateway that accepts methods in query strings with CMIS-* header fields to bind search scope, just like SOAP envelopes and bodies are used to tunnel object-specific protocols over HTTP. Are there any REST constraints that this binding doesn’t violate?

The SOAP protocol binding, in contrast, is more direct: half the number of pages and defined with WSDL and XSD. It is obvious that the SOAP binding was designed first and the AtomPub binding added for marketing reasons. I don’t think much of SOAP bindings, in general, but at least this one is consistent with the limited data model, the design of other SOAP-based services, and the goal of providing a control-oriented API for document management.

CMIS is a classic example of what happens when a control-oriented interface is slapped onto an HTTP-based protocol instead of redesigning the interface to be data-oriented. All of the lowest-common-denominator constraints of CMIS’ data model, which are necessary for the SOAP interface because its operations are object-specific, are completely unnecessary for an HTTP interface that is properly designed to be data-oriented. An HTTP interface doesn’t need to be limited to Atom feed formats for traversing folder hierarchies; hypertext is a lot more powerful than temporally-ordered query results. An HTTP interface doesn’t need to forbid the versioning of folders; hypertext can tell the client what operations are allowed on each folder. An HTTP interface doesn’t need a special query media type that (insanely) consists of a raw SQL statement embedded in XML sugar coating; any HTTP resource can be a stored procedure and any hypertext response can contain a list of results. An HTTP interface doesn’t need to traverse folders or query databases in order to access an object summary that points to a content stream that might then be downloaded; hypertext allows each object to be identified by a URI and manipulated independent of the discovery process.

CMIS is a Web Services interface for document management. It should be renamed WS-DMS and tossed on the same pile of other specs from that genre. WebDAV is a far more capable interface that has already been standardized to provide document-level write-access and versioning over HTTP. WebDAV isn’t very RESTful either, because it relies on folder operations instead of hypertext, but at least the WebDAV interface doesn’t interfere with the read-only side of HTTP, WebDAV is already supported by authoring tools with filesystem semantics, and Microsoft has already deployed hundreds of proprietary extensions to WebDAV within its Exchange and SharePoint server products. For that matter, CMIS would be a lot more interesting if it were designed as an extension to CIFS instead of HTTP or SOAP.

My bet is that the document repository vendors will continue to focus on making their own native HTTP interfaces more efficient, since that is how customers will evaluate their performance when integrated within heterogeneous architectures. At best, CMIS will become another method for data migration away from Web Services and legacy repositories, which may be justification enough for implementation. However, they should stop calling their AtomPub derivative a REST binding. Even if they manage to redesign the HTTP interface during the standards process, it will still only be an HTTP binding and only one part of an overall application architecture that could be called RESTful.

I need to address that teaser I included in my last post on paper tigers and hidden dragons:

People need to understand that general-purpose PubSub is not a solution to scalability problems — it simply moves the problem somewhere else, and usually to a place that is inversely supported by the economics.

I am talking here about the economics of sustainable systems, which I consider to be part of System Dynamics and Systems Engineering. Software Architecture and System Dynamics are deeply intertwined, just as Software Engineering and Systems Engineering are deeply intertwined, but they are not the same thing.

Fortunately, this discussion of economics has nothing to do with the financial economy or the proposed bailout on Wall Street. At least, I don’t think it does, though event-based integration and publish-subscribe middleware are most popular within financial institutions (and rightly so, since it matches their natural data stream of buy/sell events and news notifications that drive financial investing/speculation).

Economists have a well-known (but often misunderstood) notion of the economies of scale, which can be summarized as the effect of increased production on the average cost of production (see also Wikipedia and Investopedia). For most types of business, the average cost of producing some item is expected to go down as the business expands. This is a natural consequence of both the spreading of fixed costs over many more customers and the ability of larger distributors to demand more competitive pricing from their upstream suppliers (e.g., Walmart).

If we look at such a business in terms of its system dynamics, we should see “load” on the system increase at a much lower rate than the increase in customers. In other words, the additional customers should not, on average, create more cost than they generate in revenue once the business passes the initial fixed infrastructure cost break-even point. If they do create more cost, then the business is not sustainable at larger scales. For many businesses, that’s just fine — most of the personal services economy is based on models with a small window between “enough customers to survive” and “too many customers to handle without degrading service.”

The same is true of software architectures for network-based applications. As the number of consumers increase, the per-consumer cost of the overall system must decrease in order to sustain the system. Relative economies of scale were used by Nikunj Mehta in his dissertation to compare architectural choices: “A system is considered to scale economically if it responds to increased processing requirements with a sub-linear growth in the resources used for processing.” I like that as a quantifiable definition for the architectural property of scalability. However, Nikunj is focused on comparing the relative scalability of particular implementations within an architecture-driven modular framework, not the type of economics that I alluded to in my post.

It is important to think of system dynamics in terms of economics and not just system throughput, since the impact of growth in non-sustainable systems is often unrelated to the individual data transfers or protocols in use. In fact, some architectures that are more efficient from the standpoint of per-event network usage are also more susceptible to inverse economies of scale. PubSub is one such example.

PubSub (short for publish/subscribe) is a derivative of one of the most analyzed architectural styles in software: event-based integration (EBI). From a purely theoretical perspective, event-based integration is a natural fit for systems that monitor real-time activities, particularly when the number of event sources outnumber the recipients of event notifications (e.g, graphical user interfaces and process control systems). However, EBI does not scale well when the number of recipients greatly outnumbers the sources. Researchers have been studying Internet-scale event notification systems for over two decades. Several derivative styles have been proposed to help reduce the issue of scale, including PubSub, EventBus, and flood distribution. There are probably others that I have left out, but they usually boil down into the use of event brokers (intermediaries) and/or a shared data stream (multicast and flood).

PubSub improves EBI scalability by filtering events, either by publisher-driven topics or subscriber-chosen content, and only delivering notifications to the matching subscribers of those topics/content. In theory, if we are careful in choosing the topics such that they partition the events into relatively equal, non-overlapping subsets, and if the consumers of those events are only interested in subscribing to a small subset of those topics, then the corresponding reduction in notification traffic is substantial. Group chat, as in Jabber rooms, is one such example. For most applications, however, such a partitioning does not exist.

EventBus takes a slightly different tack on scalability by forcing traffic onto shared distribution streams, usually IP multicast or flood-distribution channels, and having all subscribers pick their own events off that stream. Unfortunately, multicast does not scale socially (a tragedy of the commons) and rarely succeeds across organizational domains, whereas flooding only succeeds when the signal/noise ratio is high.

Of course, there are many examples of successful EBI systems. USENET news, for example, combined both hierarchical PubSub and flood-distribution to great effect. Adam Rifkin and Rohit Khare did an extensive survey of Internet-scale event notification systems that still stands the test of time. And, as I said at the beginning, EBI is the most popular style of system architecture within the financial industry, where scalability is offset by relatively high subscription costs.

So, what’s the point of this rambling post? Ah, yes, now I remember: the inverse economics of general-purpose PubSub.

People are greedy. They tend to be event-gluttons, wishing to receive far more information than they actually intend to read, and rarely remember to unsubscribe from event streams. When they stop using a service, they tend to remove the software that reports the notifications locally, not the software or subscriptions that cause it to be delivered. If it were not for automatic bounce handling, Apache mailing lists would have melted down years ago. The only way to constrain such users is a tax on either subscriptions or deliveries.

PubSub systems have an imbalance of effort to subscribe versus effort to deliver. In other words, a single user request results in a potentially large number of server obligations. As such, a benevolent user can produce a disproportionate load on the publisher or broker that is distributing notifications. On the Internet, we don’t have the luxury of designing just for benevolent users, and thus in HTTP systems we call such requests a denial-of-service exploit. Any request architecture that allows a malevolent user to create load as a multiple of the effort required to make the request will result in denial of service attacks, since they succeed without overwhelming the attacker’s own connectivity.

The only solution to the inverse economics of PubSub that I know of is to match the costs to the revenue stream. In other words, charge consumers for the subscriptions they choose to receive, at a rate corresponding to the cost of delivering the subscribed stream at the subscribed rate and over the subscribed medium. That is exactly why there is no standard mechanism for notifications in HTTP: there is no scalable solution for subscribing to events without also standardizing the creation of user accounts and event filtering mechanisms, neither of which tend to be standardized because user management is an application of its own and event filtering is always domain-specific.

What does this mean for a system like Jabber? It means that each jabber server (the broker) needs to keep watch on the types of systems being supported via its infrastructure and adjust their subscription model accordingly. Applications like chat will work great, as will most applications wherein the event sources far outnumber the notification recipients. Applications with inverse economics need to pay their own way.

What does this mean for a system like Twitter? Well, that’s an interesting case because it incorporates both subscription-based following of events and RESTful interfaces for event logs. When a user only follows a small set of twitters, or the set of twitters being followed are mostly quiet, then they might be more economically supported by an EBI system instead of polling their twitter home stream. However, that won’t be true for people who want to follow many twitters or who are only interested in looking at twits in batch (basically, anyone who polls at a lower frequency than their incoming twit rate). In particular, malevolent followers (those robots that follow everyone, for whatever reason) are much more dangerous to an EBI system.

My guess is that Twitter will eventually move to an ad-supported revenue stream on their Web interfaces, since ad revenue increases at the same rate as RESTful applications need to scale (i.e., RESTful systems have a natural economy of scale that makes them sustainable even when their popularity isn’t anticipated). Likewise, Twitter’s event-based interfaces will move toward a more sustainable subscription model that charges for excessive following and premium services like real-time event delivery via XMPP or SMS. That kind of adjustment is necessary to rebalance the system dynamics.

One of the sessions that I attended at OSCON was “Beyond REST? Building Data Services with XMPP PubSub” by Evan Henshaw-Plath and Kellan Elliott-McCrea. I think you can guess why that made me curious, but it was interesting to see how much that curiosity was shared by the rest of the conference: the room filled up long before the scheduled start. They certainly gave a very entertaining talk and one that spilled over into the blogosphere in posts by Stephen O’Grady, Joshua Schachter, and Debasish Ghosh.

Unfortunately, the technical argument was a gigantic paper tiger, which is a shame given that there are plenty of situations in which event-based architectures are a better solution than REST-based architectures. I made a brief comment about notification design and how they seemed to be ignoring a good twenty years of research on Internet-scale event notification systems. People need to understand that general-purpose PubSub is not a solution to scalability problems — it simply moves the problem somewhere else, and usually to a place that is inversely supported by the economics. I’ll have to explain that in a later post, since this one is focused on the technical.

Here’s the tiger:

On July 21st, 2008, friendfeed crawled flickr 2.9 million times to get the latest photos of 45,754 users, of which 6,721 of that 45,754 potentially uploaded a photo.

Conclusion: Polling sucks.

If you’d like to learn more about their XMPP solution, the slides are available from the OSCON website. I do think there is a lot to be learned from using different interaction styles and true stream-oriented protocols (the kind that don’t care about lost packets), but this FriendFeed example is ridiculous. It took me less than 30 seconds to design a better solution using nothing more than HTTP, and that while sitting in the middle of a conference session. This is exactly what Dare means by: “If a service doesn’t scale it is more likely due to bad design than to technology choice.

They are comparing the efficiency of blind polling using HTTP crawls to a coordinated PubSub services setup with XMPP.  Spidering an entire site is obviously not going to be efficient if you are only interested in what has changed. One advantage it has is that you don’t need to cooperate with the information provider. However, for the specific example of FriendFeed polling Flickr, cooperation is easy (both companies gain immensely by cooperating) and essential (2.9 million requests per day will get you blocked from any site that doesn’t want cooperation).

The solution, which I mentioned briefly at the talk, is to provide a resource that reflects all of the changes on Flickr during a given time period. Anyone (not just FriendFeed) can perform a GET on that resource to find out what has changed. In fact, one such example is the SUP (Simple Update Protocol) just introduced by FriendFeed. However, I had something more efficient in mind.

Web architects must understand that resources are just consistent mappings from an identifier to some set of views on server-side state. If one view doesn’t suit your needs, then feel free to create a different resource that provides a better view (for any definition of “better”). These views need not have anything to do with how the information is stored on the server, or even what kind of state it ultimately reflects. It just needs to be understandable (and actionable) by the recipient.

In this case, we want to represent the last-updated state of all Flickr users in a way that minimizes the lag between event and notification (let’s just assume that one minute is “fast enough” to receive a change notification). The simplest way of doing that is to log state changes by userid in a sequence of journal-style resources named according to the server’s UTC clock minutes.  For example,

http://example.com/_2008/09/03/0000
http://example.com/_2008/09/03/0001
http://example.com/_2008/09/03/0002
...
http://example.com/_2008/09/03/2359

This URI pattern instantly drops the poll count from 2.9 million to 1440 (the number of minutes in a day) plus whatever pages are retrieved after we notice a user has changed their state. Alternatively, we could define a single append-only resource per day and use partial GET requests to retrieve only the bits since the last poll, but that tends to be harder on the server. Representations for the above resources can be generated by non-critical processes, cached, and even served from a separate distribution channel (like SUP).

What, then, should we include in the representation? Well, a simple list of relative URIs is good enough if the pattern is public, but that would be unwise for a site that features limited publication (obscured identifiers so that only people who have been given the URI can find the updated pictures). Likewise, the list might become unwieldy during event storms, when many users happen to publish at once. Of course, like any good CS problem, we can solve that with another layer of indirection.

Instead of a list of changed user ids or URIs, we can represent the state as a sparse bit array corresponding to all of Flickr’s users. I don’t know exactly how many users there are at Flickr, but let’s be generous and estimate it at one million. One million bits seems like a lot, but it is only 122kB in an uncompressed array. Considering that this array will only contain 1s when an update has occurred within the last minute, my guess is that it would average under 1kB per representation.

I can just imagine people reading “sparse bit array” and thinking that I must be talking about some optimal data structure that only chief scientists would think to use on the Web. I’m not. Any black-and-white GIF or PNG image is just a sparse bit array, and they have the nice side-effect of being easy to visualize. We can define our representation of 1 million Flickr users to be a 1000×1000 pixel black-and-white image and use existing tools for its generation (again, something that is easily done outside the critical path by separate programs observing the logs of changes within Flickr). I am quite certain that a site like Flickr can deliver 1kB images all day without impacting their scalability.

Finally, we need a way to map from the bits, each indicating that a user has changed something, to the much smaller set of users that FriendFeed knows about and wishes to receive notifications. If we can assume that the mapping is reasonably persistent (a month should be long enough), then we can define another resource to act as a mapping service. Such as,

http://example.com/_2008/09/03/users?{userid}

which takes as input a userid (someone that a friend already knows and wants to monitor for changes) and returns the coordinate within the sparse array (the pixel within the 1000×1000 image) that corresponds to that user. FriendFeed can store that accumulated set of “interesting users” as another image file, using it like an “AND mask” filter to find the interesting changes on Flickr.

Note that this is all just a quick thought experiment based on the general idea. In order to build such a thing right, I’d have to know the internals of Flickr and what kinds of information FriendFeed is looking to receive, and there are many potential variations on the representations that might better suit those needs (for example, periods could be overlapped using gray-scale instead of B&W). The implementation has many other potential uses as well, since the sequence of images provide an active visualization of Flickr health.

I should also note that the above is not yet fully RESTful, at least how I use the term. All I have done is described the service interfaces, which is no more than any RPC. In order to make it RESTful, I would need to add hypertext to introduce and define the service, describe how to perform the mapping using forms and/or link templates, and provide code to combine the visualizations in useful ways. I could even go further and define these relationships as a standard, much like Atom has standardized a normal set of HTTP relationships with expected semantics, but I have bigger fish to fry right now.

The point is that you don’t need to change technologies (or even interaction styles) to solve a problem of information transfer efficiency. Sometimes you just need to rethink the problem. There are many systems for which a different architecture is far more efficient, just as XMPP is far more efficient than HTTP for something like group chat. Large-scale collaborative monitoring is not one of them. An XMPP solution is far more applicable to peer-to-peer monitoring, where there is no central service that is interested in the entire state of a big site like Flickr, but even then we have to keep in mind that the economics of the crowd will dictate scalability, not the protocol used for information transfer.

Hmm, over five months since my last (published) post. I don’t know how folks like Tim and Sam keep writing every day. I’ll have to ask when I see them at ApacheCon US in New Orleans.

I fell off the blogon while trying to prepare my two presentations for ApacheCon Europe and have yet to catch up with my email since then. Seriously, I spend way too much time reading and deleting email, so if you’ve sent me something and haven’t received a response then please understand that it isn’t because I don’t like you (unless, of course, you sent it to several of my email addresses at once, in which case it was probably blocked as spam).

Whenever I did find time to blog, I found myself wanting to update the software first, fiddle with the comment settings, or try out various blogspam avoidance tricks. My personal blog runs on WordPress, both because it’s cool cutting-edge open source software and because my ISP won’t let me run anything written in Java (let alone Day’s commercial products). It also has the nice side-effect of forcing me to pay attention to everyday web content management issues, instead of just the more abstract (implementation independent) issues of Web protocols. In other words, it keeps me in the loop in terms of what people are looking for in public-facing web content management. And, if I want to see what it looks like through our own software, I can just wander over to Planet Day.

One area in which open source blogging software is way ahead of the curve is support for collaborative antispam solutions. I’ve been much happier with my blog since I installed the TypePad Antispam plugin. The social filtering of blog spam is even more effective than it has been for Gmail, especially given the setting to automatically discard comments that look like spam on old posts. I wonder how long it would take Michael Marth to write an Apache Sling plugin for comment filtering using the TypePad service?

Next Page »