Sat 22 Mar 2008
On software architecture
Posted by Roy T. Fielding under software architecture, web architecture
I ran across a spout yesterday about the uniform interface in REST. Actually, it is more of an attack on resource-oriented architecture (ROA) with the usual sideswipes at REST. Like most criticisms of my work, it got me thinking… not just about what was being criticized (in this case, the lack of REST constraint enforcement in HTTP), but how to fix the underlying problem that a lot of folks simply don’t understand the differences between software architecture and implementation, let alone between architectural styles and software architecture.
A software architecture is an abstraction of the run-time elements of a software system during some phase of its operation. A system may be composed of many levels of abstraction and many phases of operation, each with its own software architecture.
Let’s start with a simple (yet surprisingly complex) example. My blog is a network-based application — a specific grouping of functionality and behavior that allows me to accomplish a desired task using multiple computers that communicate via a network. That’s what application means in our industry: applying computing to accomplish a given task.
The implementation of my blog consists of, at the time of this writing, an installation of Apache HTTP Server 2.0.61 that is executing PHP 5.2.3 in order to run the scripts from WordPress 2.3.3, which use sockets to interact with another server running MySQL 5.0 in order to store, manipulate, and retrieve database entries that form the content of my blog when passed through various template scripts and delivered to any number of specific versions of Web-based clients, usually via HTTP, many of which apply stylesheets prior to rendering said content in a form that is (hopefully) readable by you. Phew! And that’s just the Web interface. WordPress has at least four other interfaces that are not Web-based, and each has its own set of network or server-side clients with their own specific versions, and the sum of all of these individual components makes up the implementation of what I call Untangled.
Note that some of this blog’s implementation (the clients used by other readers) is not under my control. The vast majority of it, in fact, regardless of whether we count in lines of code or software installs. If you don’t think the clients should be considered part of the implementation, then think again: all this effort would be wasted if the words can’t be read.
Within my blog implementation there are many software architectures. A huge number of architectures, in fact, at various levels of abstraction and component granularity. I could probably spend months trying to describe them all and would still miss a few valid abstractions. If we limit ourselves to just the network-based architectures (the ones where component interaction is limited to message exchange), then we might just have a chance to discuss them in a week. However, just one example should be enough to get the idea, and the Atom publishing mechanism within WordPress is ideally suited for our purpose.
Atom is a great example of how architectures are often nested within other architectures. Using typed links (hypertext), a couple of XML media types, and a subset of HTTP, Atom defines a range of expected behaviors and interactions for the purpose of authoring blog entries and syndicating feeds. The Atom implementation within my blog consists of the various Web browsers and Atom clients out on the Internet (which are thankfully pretty consistent at the moment) and a couple of scripts, utility functions, and links within the theme headers of my WordPress installation. Compare that to the Atom architecture within my blog, which consists of just the externally observable behavioral abstraction: clients that consume or produce the Atom formats, send AtomPub request messages, and receive AtomPub responses; a server that identifies and provides specific resources that accept HTTP requests, stores entries, and responds in accordance with the Atom protocols.
Note that when we talk about the implementation of my blog versus an architecture of my blog, we are still talking about the same software — the only difference between the two is the amount of extraneous detail being ignored. Of course, another advantage of the architecture view is that we can talk about the interactions independent of the specific implementations, and thus find common ground in which to standardize the interactions in the form of an application-level protocol. It is also easier to perceive systemic effects at the higher architectural levels (architectural properties, such as evolvability, that encompass many implementations over time).
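To make the distinction concrete, here is a minimal sketch in Python, assuming a toy entry store rather than anything WordPress or AtomPub actually defines. The names `EntryStore`, `DictStore`, `ListStore`, and `publish_and_read` are hypothetical and exist only for illustration. The abstract interface plays the role of the architecture; the two classes are different implementations that the architectural view cannot tell apart.

```python
from typing import Protocol


class EntryStore(Protocol):
    """Architectural view: only the externally observable behavior."""

    def create_entry(self, title: str, body: str) -> str: ...

    def get_entry(self, uri: str) -> dict: ...


class DictStore:
    """Hypothetical implementation 1: entries kept in a dict keyed by URI."""

    def __init__(self):
        self._entries = {}

    def create_entry(self, title, body):
        uri = f"/entries/{len(self._entries) + 1}"
        self._entries[uri] = {"title": title, "body": body}
        return uri

    def get_entry(self, uri):
        return dict(self._entries[uri])


class ListStore:
    """Hypothetical implementation 2: entries kept in an append-only list."""

    def __init__(self):
        self._rows = []

    def create_entry(self, title, body):
        self._rows.append((title, body))
        return f"/entries/{len(self._rows)}"

    def get_entry(self, uri):
        # The entry number is the last path segment of the URI.
        title, body = self._rows[int(uri.rsplit("/", 1)[1]) - 1]
        return {"title": title, "body": body}


def publish_and_read(store: EntryStore) -> dict:
    """Written against the abstraction only; ignores storage details."""
    uri = store.create_entry("On software architecture", "...")
    return store.get_entry(uri)


# Both implementations are indistinguishable at the architectural level.
assert publish_and_read(DictStore()) == publish_and_read(ListStore())
```

The storage details differ, but `publish_and_read` depends only on the interactions, which is the sense in which an architecture ignores extraneous implementation detail.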
So, where do software architectural styles fit within this scheme?
Representational State Transfer (REST) is a software architectural style, not a software architecture. REST is just one of many software architectural styles. Specifically, REST is a named set of constraints on component interaction that, when obeyed, cause the resulting architecture to have certain properties (preferably, desirable properties). Like software patterns, an architectural style packages a set of constraints under a convenient name and tells us about the properties that are induced by those constraints when we follow the style. There are some subtle differences between architectural patterns and architectural styles, mainly due to the different audiences/communities, but they are equivalent from the point of view of an architect of network-based software.
We can talk about architectural styles as an abstraction in general. We can also talk about the architectural styles found within specific architectures, such as the Web architecture as a whole, or within the Atom publishing protocols as standardized, or even the architecture observed by abstracting a specific implementation of WordPress. We can compare different architectures that perform the same function, along with the different styles found within those architectures, in terms of their architectural properties. Furthermore, we can talk about how a given implementation matches (or fails to match) an intended architecture that is supposed to be an example of a given style.
Discussion of software architecture therefore falls into the same tracks that we often hear when real-world architects talk about the architecture of buildings. They might have discussions about the Doric style, compare examples of that style found within various architectures, or just admire the Parthenon. Likewise, different styles that perform the same function can be compared, as can slight differences in the specific implementations that are used as examples of a given style.
Architecture is therefore an abstraction of implementation, and styles are the named patterns by which we can understand architectures and architectural design. Simple, right?
Then why is it that SOA advocates insist on comparing REST to specific implementations and then complain about how vague the style is compared to the implementation? ROA is not REST. ROA is supposed to be a kind of design method for RESTful services, apparently, but most folks who use the term are talking about REST without the hypertext constraint. In other words, not RESTful at all. REST without the hypertext constraint is like pipe-and-filter without the pipes: completely useless because it no longer induces any interesting properties. The RESTful Web Services book doesn’t help the situation by renaming the hypertext engine as connectedness. That does nothing but obscure its role as the driving force in RESTful applications.
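A rough sketch of what the hypertext constraint buys, assuming a toy in-memory "server" in place of real HTTP. Every URI, the `entries` link relation, and the function names below are hypothetical. One client has the server's URI layout baked in; the other knows only the entry point and the meaning of a link relation, and discovers the next URI from the representation it receives.

```python
def make_server(entries_uri):
    """Toy server: representations keyed by URI, each carrying typed links."""
    return {
        "/": {"links": {"entries": entries_uri}},
        entries_uri: {
            "items": ["On software architecture"],
            "links": {"home": "/"},
        },
    }


def fetch(server, uri):
    """Stand-in for an HTTP GET; raises KeyError for an unknown URI."""
    return server[uri]


def hardcoded_client(server):
    # Coupled to the server's URI layout: the path is baked into the client.
    return fetch(server, "/blog/entries")["items"]


def hypertext_client(server):
    # Knows only the entry point and the "entries" relation; the next URI
    # comes from the representation itself.
    home = fetch(server, "/")
    return fetch(server, home["links"]["entries"])["items"]


v1 = make_server("/blog/entries")
v2 = make_server("/archive/posts")  # the server reorganizes its URIs

assert hardcoded_client(v1) == hypertext_client(v1)
assert hypertext_client(v2) == ["On software architecture"]

try:
    hardcoded_client(v2)
except KeyError:
    print("hard-coded client broke when the URI layout changed")
```

The hypertext-driven client keeps working after the server changes its URIs, which is one small instance of the evolvability that disappears when the constraint is dropped.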
Linda is another example of oddly construed comparisons with REST, though in this case it is generally abused by REST advocates. Mark Baker was the first to notice the similarity between Linda’s uniform interface and the uniform interface constraints within REST. However, Linda is not a style — it is an architecture for coordination via a shared tuplespace (an example of the blackboard architectural style). It makes sense to compare REST to the blackboard style (they do have a lot in common, as styles go). Likewise, there are some limited comparisons that can be made between Linda and the Web architecture, but one must keep in mind that they serve completely different functions (Linda being a coordination language and the Web being a distributed hypermedia system). But to make any comparison at all between REST (a style) and Linda (an architecture designed to support a different style in order to accomplish an entirely different function) is absurd; just as absurd as trying to compare the Doric style to my condo’s garage door opener. Unlike Mark, some REST advocates have a tendency to lose track of when they are talking about REST versus when they are talking about Web architecture versus when they are talking about specific implementations that attempt to match the Web architecture.
In summary:
- Web implementation consists of the current universe of information identified by URIs and all of the specific versions of software currently operating within that information space (like Safari, Firefox, Apache httpd, WordPress, …).
- Web architecture consists of the protocols and data formats that define the syntax and semantics of interactions between Web components: the standards for URI, HTTP, HTML, XML, and many others. All of these standards are designed to optimize RESTful interaction, with varying degrees of success, but not to require such interaction because RESTful interaction is not the only way they are used.
- REST is an architectural style that, when followed, allows components to carry out their functions in a way that maximizes the most important architectural properties of a multi-organizational, network-based information system. In particular, it maximizes the growth of identified information within that system, which increases the utility of the system as a whole.
Web implementations are not equivalent to Web architecture and Web architecture is not equivalent to the REST style. REST constraints do not constrain Web architecture — they constrain RESTful architectures (including those found within the Web architecture) that voluntarily wish to be so constrained. HTTP/1.1 was designed to enable and improve RESTful architectures, just as REST was designed to reflect and explain all of the best things about Web architecture. That does not mean that HTTP/1.1 is constrained to a single style; it means those other styles are not part of the design (i.e., we don’t care if future changes to HTTP will cause them to break). Only some of the architectures found on the Web are RESTful, but that doesn’t change the fact that RESTful architectures do work better on the Web than any other known styles. They work better because REST induces the architectural properties that the Web needs most — reusability, anarchic scalability, evolvability, and synergistic growth — and thus the Web architecture has been updated over time to promote RESTful styles over all others, by design.
Connecting…
Roy Fielding: I won’t take credit for that idea, but I stand behind it. Perhaps I talk to different people than Roy does, but many of the people I do talk to don’t, um, connect when they hear the phrase Yet, when I poi…
Care to elaborate a bit for us mortal undergrads? I do understand that the “hypertext restriction” is another name for “hypertext as the engine of application state”, but how exactly do ROA or “RESTful Web Services” fail at it?
See my comment on Sam’s blog about why the engine is needed to prevent coupling. ROA focuses too much on resources and not enough (if any) on the engine of hypertext.