Mon 29 Sep 2008
No REST in CMIS
Posted by Roy T. Fielding under standards, web architecture
About three weeks ago, a new “standard” for Content Management Interoperability Services (CMIS) was announced by EMC, IBM, and Microsoft with the usual fanfare of being the best thing since sliced bread and compliant with the latest buzzwords. One of those buzzwords, REST (as in Representational State Transfer), happens to be defined by my dissertation. I am getting tired of big companies making idiotic claims about REST and their so-called RESTful architectures. The only similarity between CMIS and REST is that they both have four-letter acronyms.
Note to our technology industry rags: A standard is an approved measure against which multiple independent organizations have agreed (by choice or by force) to have their products tested for compliance. A standards-effort is what we call a proposal when it is being actively worked on by some standards development organization. CMIS, on the other hand, is just a vendor proposal that is being submitted to OASIS. It only becomes a standards-effort once the OASIS members agree to host it, which shouldn’t be a problem given the pay-to-play nature of OASIS, and might become a standard if the final specification is approved.
CMIS is a thin veneer on RDBMS-based data repositories that provides a data model for document-like objects within filesystem-like folders, basic file versioning, and access via SQL queries and local object references. It is exactly the kind of document model one would expect within a legacy document management system that is backed by a large relational database and authored via Microsoft Office applications. No surprise, given the sponsors, and there are plenty of good reasons why folks would want to support such data models. For the interface, CMIS includes both a Web Services SOAP/WSDL protocol binding, tightly coupled to the data model, and a REST protocol binding, which also happens to be tightly coupled to the data model.
REST is an architectural style, not a protocol, and thus announcing it as a protocol binding is absurdly ignorant behavior for a group of technology companies. The RESTish protocol binding actually being proposed by CMIS is AtomPub, or at least it would be if not for the huge number of unnecessary protocol extensions that tunnel the Web Services interface through fake-Atom and fake-HTTP. The examples assume a single-script gateway that accepts methods in query strings with CMIS-* header fields to bind search scope, just like SOAP envelopes and bodies are used to tunnel object-specific protocols over HTTP. Are there any REST constraints that this binding doesn’t violate?
The SOAP protocol binding, in contrast, is more direct: half the number of pages and defined with WSDL and XSD. It is obvious that the SOAP binding was designed first and the AtomPub binding added for marketing reasons. I don’t think much of SOAP bindings, in general, but at least this one is consistent with the limited data model, the design of other SOAP-based services, and the goal of providing a control-oriented API for document management.
CMIS is a classic example of what happens when a control-oriented interface is slapped onto an HTTP-based protocol instead of redesigning the interface to be data-oriented. All of the lowest-common-denominator constraints of CMIS’ data model, which are necessary for the SOAP interface because its operations are object-specific, are completely unnecessary for an HTTP interface that is properly designed to be data-oriented. An HTTP interface doesn’t need to be limited to Atom feed formats for traversing folder hierarchies; hypertext is a lot more powerful than temporally-ordered query results. An HTTP interface doesn’t need to forbid the versioning of folders; hypertext can tell the client what operations are allowed on each folder. An HTTP interface doesn’t need a special query media type that (insanely) consists of a raw SQL statement embedded in XML sugar coating; any HTTP resource can be a stored procedure and any hypertext response can contain a list of results. An HTTP interface doesn’t need to traverse folders or query databases in order to access an object summary that points to a content stream that might then be downloaded; hypertext allows each object to be identified by a URI and manipulated independent of the discovery process.
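As a rough illustration of "hypertext can tell the client what operations are allowed", here is a minimal Python sketch. Everything in it is an assumption made for the example: the JSON layout, the link relation names ("item", "version-history", "checkout"), and the example.org URIs are hypothetical and come from neither CMIS nor any registered media type. The point is only that the client decides what it can do by inspecting the representation it was given, not by consulting a fixed data model.

```python
import json

# Hypothetical folder representation; the structure and relation names are
# invented for this sketch and are not part of CMIS or any standard.
FOLDER = json.loads("""{
  "title": "Quarterly reports",
  "links": [
    {"rel": "item", "href": "http://example.org/docs/q1"},
    {"rel": "item", "href": "http://example.org/docs/q2"},
    {"rel": "version-history", "href": "http://example.org/folders/reports/versions"}
  ]
}""")


def links(representation, rel):
    """Return the target URIs of every link carrying the given relation."""
    return [link["href"]
            for link in representation.get("links", [])
            if link["rel"] == rel]


def can(representation, rel):
    """An operation is available only if the representation offers a link for it."""
    return bool(links(representation, rel))
```

In this sketch a server that versions folders simply includes a "version-history" link, and one that does not omits it; the client needs no rule forbidding folder versioning.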
CMIS is a Web Services interface for document management. It should be renamed WS-DMS and tossed on the same pile of other specs from that genre. WebDAV is a far more capable interface that has already been standardized to provide document-level write-access and versioning over HTTP. WebDAV isn’t very RESTful either, because it relies on folder operations instead of hypertext, but at least the WebDAV interface doesn’t interfere with the read-only side of HTTP, WebDAV is already supported by authoring tools with filesystem semantics, and Microsoft has already deployed hundreds of proprietary extensions to WebDAV within its Exchange and SharePoint server products. For that matter, CMIS would be a lot more interesting if it were designed as an extension to CIFS instead of HTTP or SOAP.
My bet is that the document repository vendors will continue to focus on making their own native HTTP interfaces more efficient, since that is how customers will evaluate their performance when integrated within heterogeneous architectures. At best, CMIS will become another method for data migration away from Web Services and legacy repositories, which may be justification enough for implementation. However, they should stop calling their AtomPub derivative a REST binding. Even if they manage to redesign the HTTP interface during the standards process, it will still only be an HTTP binding and only one part of an overall application architecture that could be called RESTful.
Roy,
I always enjoy your musings. I am glad you decided to post about CMIS. It is something I am very excited about and something David Nuescheler has been talking about for years now: the need for a protocol-based binding for repositories.
You raise a few points:
1. Not a standard
2. Methods in query strings
3. Header fields for search scope
4. Single-Script gateway (tunnelling)
5. Exposing query
6. Big vendors not listening
CMIS is a draft specification as you rightly point out. As it goes through the OASIS process, it may become a standard.
The next three items you raise concern where the RESTish AtomPub binding does not meet the principles in your dissertation. I believe you have not read the specification closely enough, or we need to rewrite the specification to make these points clearer.
The HTTP headers or Query strings fall into two categories:
A. Provide guidance on the representation, much like the Accept header. Examples are includeAllowableActions, filter, etc. These are all predominantly GET-related.
B. Provide guidance on the behavior of common HTTP verbs (PUT, POST) when the default behavior is not sufficient. An example is major=false on checkin. These exist predominantly because the group wanted to stay with the common HTTP verbs rather than leverage new or newer HTTP verbs.
Neither of these violates REST principles, though I am happy to discuss how to make it a better REST-ful proposal.
On the single-script gateway issue, the REST-ish AtomPub binding starts with an APP service doc and exposes collections. The collections can then be retrieved to view the contents in the repository as atom feeds. At that point documents or folders are exposed as Atom documents. They contain links to other resources such as allversions for the version history collection as feed, relationships for the relationships on an object, allowableactions in a well defined format, etc.
We also extended AtomPub collections to have meaning besides membership for unfiled and checkedout collections. When a document is added to those collections (becomes a member), it is either removed from all other collections or checkedout respectively.
So, we exposed a lot of resources and leveraged hypertext to allow a client to unravel/traverse the repository. All resources have a well defined type. The REST-ish AtomPub binding leverages XSD extensively to define the document formats.
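A minimal Python sketch of the first step of that traversal, reading an AtomPub service document and listing the collections it exposes. The XML namespaces are the ones defined by AtomPub and Atom; the sample document, its titles, and the example.org URIs are made up for illustration and are not taken from the CMIS draft.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the Atom Publishing Protocol and the Atom format.
NS = {
    "app": "http://www.w3.org/2007/app",
    "atom": "http://www.w3.org/2005/Atom",
}

# Hypothetical service document; collection names and URIs are illustrative.
SERVICE_DOC = """<service xmlns="http://www.w3.org/2007/app"
         xmlns:atom="http://www.w3.org/2005/Atom">
  <workspace>
    <atom:title>Example repository</atom:title>
    <collection href="http://example.org/repo/root">
      <atom:title>Root folder</atom:title>
    </collection>
    <collection href="http://example.org/repo/checkedout">
      <atom:title>Checked out</atom:title>
    </collection>
  </workspace>
</service>"""


def collections(service_xml):
    """Map each collection title to the URI the server chose for it."""
    root = ET.fromstring(service_xml)
    found = {}
    for coll in root.findall("app:workspace/app:collection", NS):
        title = coll.findtext("atom:title", default="", namespaces=NS)
        found[title] = coll.get("href")
    return found
```

The client treats every href as opaque: it follows whatever URI the document supplies and never constructs one from a pattern.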
On the query item (#5), how do you propose to expose search criteria? The current model is posting a well-defined doc format to a collection (query collection) and getting a collection (hypermedia document back with links) returned using the Content-Location header. That seems REST-ful to me.
On the vendors not listening, I welcome you to participate at OASIS and contribute in discussions on this topic. I, personally, am very interested in discussing this with you. I think this is the first large effort by vendors big and small to provide a large set of functionality in two modes: Service-based and Resource-based.
Hi Al,
Regarding (1), it is deliberately misleading to portray CMIS as a standard when it has only been proposed for submission to OASIS, as is clearly being done on the EMC product page, EMC whitepaper, and IBM product page. Even the joint press release starts out with the submission status and then devolves into a bunch of quotes about it being a standard.
Compare that to the more accurate portrayals of CMIS by Alfresco and Microsoft. IBM and EMC need to rein in their marketing folks.
Regarding (2) and (3), the REST constraints on identification of resources using identifiers and separation of action (method) from identification (URI) are not being followed when custom header fields are being used to tweak the scope of actions. Even if they were limited to content negotiation (selection of equivalent representations in different formats), these fields would still need to be listed in HTTP’s Vary header to be compliant with the cache requirements. Instead, CMIS should use different resources (with different URIs) to represent the different views of repository state.
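To make the two options concrete, here is a small Python sketch under stated assumptions: the example.org URI, the "filter" parameter, and the "CMIS-example" header name are hypothetical and are not taken from the CMIS draft. The first helper gives each view of a resource its own URI, so caches and links can tell the views apart. The second builds the Vary value a server would have to send if it selected a representation based on request header fields instead.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def view_uri(base_uri, **view_params):
    """Return a URI that identifies one particular view of a resource.

    Parameters are sorted so that the same view always yields the same URI.
    """
    scheme, netloc, path, query, fragment = urlsplit(base_uri)
    params = dict(parse_qsl(query))
    params.update(view_params)
    canonical_query = urlencode(sorted(params.items()))
    return urlunsplit((scheme, netloc, path, canonical_query, fragment))


def vary_header(selecting_fields):
    """Build a Vary value naming every request header that affected the response."""
    return ", ".join(selecting_fields)
```

With the first approach a cache needs nothing beyond the URI. With the second, every custom field must appear in Vary on every affected response, for example vary_header(["Accept", "CMIS-example"]).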
Regarding (4), the single-script gateway is more of a problem with the examples chosen in the specification than a limitation of the AtomPub binding. The examples should be described in hypertext and the focus of the API should be on how the extended link relations identify the available repository resources. It is not clear from the specification that any of those links are required, let alone expected within the response. That is the hypertext constraint. Almost all of the examples end with the first line of content.
Regarding (5), exposing search criteria is typically done using simple forms or URI templates. Defining a new media type might make sense for something like XForms or XQuery, but there isn’t anything particularly RESTful about sending a complete representation of the query in a standardized media type versus appending terms to a URI’s query component. The RESTful aspect is being instructed on what to do by the current representation, whether that be in the form of a link relationship that indicates the queryable resource or a form that instructs the client on how to compose and post a query to a provided URI. The only example I could find in CMIS is an RPC call on the main service URI using a POST of the cmisrequest content type. There is no indication anywhere that this example is being driven by hypertext, and thus the query isn’t RESTful because it relies on the client having hard-coded knowledge that the CMIS repository entry point will accept a POST of this format to indicate a query.
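A minimal Python sketch of a hypertext-driven query, assuming a hypothetical Atom entry that advertises a queryable resource through a link whose href contains a {terms} placeholder. The "search" relation name, the placeholder syntax, and the example.org URI are illustrative assumptions, and the substitution below is a naive stand-in for real URI template expansion.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical representation returned by the server; the link relation and
# the template are invented for this sketch.
ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Repository root</title>
  <link rel="search" href="http://example.org/repo/search?q={terms}"/>
</entry>"""


def find_link(xml_text, rel):
    """Return the href of the first Atom link with the given relation, if any."""
    root = ET.fromstring(xml_text)
    for link in root.iter(ATOM + "link"):
        if link.get("rel") == rel:
            return link.get("href")
    return None


def query_uri(xml_text, terms):
    """Compose a query URI from what the current representation offers."""
    template = find_link(xml_text, "search")
    if template is None:
        raise LookupError("representation offers no search link")
    return template.replace("{terms}", quote(terms, safe=""))
```

The client knows only the relation name and how to fill in a template. Where the query goes, and whether querying is possible at all, is decided by the representation it just received.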
All of those points are rather small compared to my overall complaint that it isn’t appropriate to define a “REST” binding to a specific data model’s limitations. The whole point of REST is to avoid coupling between the client applications and whatever implementation might be behind the abstract interface provided by the server. REST accomplishes that by eliminating the need to think in terms of resource types or specialized interfaces. Instead, the representations tell the application how and what it can do next. Any resource can potentially be viewed as a document or as a folder, depending on how one might want to look at the information. The trick is to define how such resources are related to one another in an implementation-independent manner that can be provided as the interface to any back-end, not just a back-end that corresponds to one data model.
AtomPub is certainly one RESTful way to interface with content if that content drives the application. The relationships could be represented by extensions in the form of links in the Atom format, and perhaps even an XForms or WebForms integration for the formation of queries. Such extensions should be vetted by the same organization that developed AtomPub, the Internet Engineering Task Force (IETF), not OASIS. However, unless you expect blogging clients and syndication feeds to be the primary application of CMIS, it would make a lot more sense to define the representations in a microformat of HTML, JSON, YAML, or whatever else best fits the data.
….Roy
So… are you going to similarly blast Flickr for not being ReST-ful? Check out the documentation on their “ReST” API:
http://www.flickr.com/services/api/request.rest.html
All requests take this form:
http://api.flickr.com/services/rest/?method=flickr.test.echo&name=value
An HTTP GET is wisely idempotent, so you have to use POST to add/edit/delete… but (shudder) they put the method in the URI. Heathens! Heretics! Flickr should be burned at the stake for daring to call that ReST!!! ;-)
Kidding aside, the CMIS ‘ReST’ interface does have a similar ‘bolt-on’ feel… but I doubt most people will care if it’s ‘impure.’ The goal is easier interoperability, which means ugly sacrifices. I’m just glad they made the wise choice to not require SOAP bindings and tons of WS-* layers.
ReST-ful or ReST-inspired: who cares? As long as it’s simpler…
Interesting point about CIFS… I was initially thinking CMIS would benefit from an ActiveMQ wrapper, or maybe a Session Initiation Protocol wrapper…
— bex
Roy,
You raise good points and they should be discussed at the TC. This is why we chose to come forward with the proposal in this space rather than continuing a private effort. We want broader review on the specification so it has the best possible chance of solving customers’ needs.
On (2) and (3), we have been discussing whether, when the HTTP headers are used to change the response (property filter, etc.), the server should return a Content-Location header with a URI that points to that representation. This came out of discussions with Julian Reschke earlier.
On the other points, the mimetypes and atom link relationships will be registered as the specification gets closer to being a standard.
The table starting on page 15 identifies the atom link relationships that are required depending on the resource.
I walked through the binding again with Sam Ruby yesterday, showing how a client can start with an APP Service document and access the information in the repository. He was satisfied. I would be happy to walk you through the same exercise, especially since we are in the same location.
Also, Alfresco has their prototype public so you can get more experience with what we are proposing.
I think you will be positively surprised.
bex –
Why do people think it is funny to portray REST as a religion? Architectural styles are just a means of communicating design decisions and comparing alternatives. There is no faith required. Do you think it would be funny if you asked a realtor to show you Victorian houses and they took you to a bunch of Ranch houses instead? Abusing design classifications makes it harder on everyone to communicate.
The Flickr API has been criticized by many people already. Flickr obviously don’t have a clue what REST means since they just use it as an alias for HTTP. Perhaps that is because the Wikipedia entry is also confused. I don’t know. What I do know is that I don’t have time to instruct every company on how to design software. Outside of Day and Apache, I only get involved in criticizing a design when it becomes a standards-effort, since that is when the design transitions from being self-imposed to being imposed on others.
Hmmm… I’m starting to think that you feel about ReST perversion the same way I feel about SOAP perversion…
I like SOAP. I hate most of the WS-* stack. I hate schemas. I hate tight XML data binding. I hate people who ignore the fact that the ‘S’ in ‘SOAP’ originally stood for ‘Simple.’
You are 100% correct about Wikipedia. I think its explanation of ReST is more of a problem than a solution. The implication that ReST is just CRUD over HTTP makes me cringe…
Bex,
Thanks for your comment. Even though the example files may use a particular url style, CMIS urls are opaque and can point to any different host/machine. The binding starts with the APP service document and from there the client can unravel/traverse the links. There are no hardcoded links or url patterns in CMIS as all urls are opaque. This was done to give the implementors the widest possible freedom. All resources are discoverable off of the APP service document. All actions that create new items, etc, return a Content-Location header of the new URI for that resource.
Currently CMIS does not have the flickr pattern:
http://api.flickr.com/services/rest/?method=flickr.test.echo&name=value
So, I think the TC needs to do a better job on explaining the rest-ful binding.
I’d like to re-echo the comment that we welcome participation, and I hope to see you and Roy in the TC.