Search Results for “post”.


Tim Bray’s article on RESTful Casuistry revisits an odd meme in the REST debates that I’ve been meaning to discredit for a while.

Some people think that REST suggests not to use POST for updates.  Search my dissertation and you won’t find any mention of CRUD or POST. The only mention of PUT is in regard to HTTP’s lack of write-back caching.  The main reason for my lack of specificity is because the methods defined by HTTP are part of the Web’s architecture definition, not the REST architectural style. Specific method definitions (aside from the retrieval:resource duality of GET) simply don’t matter to the REST architectural style, so it is difficult to have a style discussion about them. The only thing REST requires of methods is that they be uniformly defined for all resources (i.e., so that intermediaries don’t have to know the resource type in order to understand the meaning of the request). As long as the method is being used according to its own definition, REST doesn’t have much to say about it.

For example, it isn’t RESTful to use GET to perform unsafe operations because that would violate the definition of the GET method in HTTP, which would in turn mislead intermediaries and spiders.  It isn’t RESTful to use POST for information retrieval when that information corresponds to a potential resource, because that usage prevents safe reusability and the network-effect of having a URI. But why shouldn’t you use POST to perform an update? Hypertext can tell the client which method to use when the action being taken is unsafe. PUT is necessary when there is no hypertext telling the client what to do, but lacking hypertext isn’t particularly RESTful.

POST only becomes an issue when it is used in a situation for which some other method is ideally suited: e.g., retrieval of information that should be a representation of some resource (GET), complete replacement of a representation (PUT), or any of the other standardized methods that tell intermediaries something more valuable than “this may change something.” The other methods are more valuable to intermediaries because they say something about how failures can be automatically handled and how intermediate caches can optimize their behavior. POST does not have those characteristics, but that doesn’t mean we can live without it. POST serves many useful purposes in HTTP, including the general purpose of “this action isn’t worth standardizing.”

I think the anti-POST meme got started because of all the arguments against tunneling other protocols via HTTP’s POST (e.g., SOAP, RSS, IPP, etc.). Somewhere along the line people started equating the REST arguments of “don’t violate HTTP’s method definitions” and “always use GET for retrieval because that forces the resource to have a URI” with the paper tiger of “POST is bad.”

Please, let’s move on. We don’t need to use PUT for every state change in HTTP. REST has never said that we should.

What matters is that every important resource have a URI, therein allowing representations of that resource to be obtained using GET.  If the deployment state is an important resource, then I would expect it to have states for undeployed, deployment requested, deployed, and undeployment requested. The advantage of those states is that other clients looking at the resource at the same time would be properly informed, which is just good design for UI feedback. However, I doubt that Tim’s application would consider that an important resource on its own, since the deployment state in isolation (separate from the thing being deployed) is not a very interesting or reusable resource.

Personally, I would just use POST for that button. The API can compensate for the use of POST by responding with the statement that the client should refresh its representation of the larger resource state. In other words, I would return a 303 response that redirected back to the VM status, so that the client would know that the state has changed.

Who we are

This is the personal website of Roy T. Fielding, which consists mostly of Roy’s blog Untangled and a few archived talks. The contents are determined by Roy, not by his employer, and might reflect information or opinions that have gone out of date. Too bad. Deal with it.

What personal data we collect and why we collect it

Comments

When visitors leave comments on the site we collect the data shown in the comments form, and also the visitor’s IP address and browser user agent string to help spam detection.

An anonymized string created from your email address (also called a hash) may be provided to the Gravatar service to see if you are using it. The Gravatar service privacy policy is available here: https://automattic.com/privacy/. After approval of your comment, your profile picture is visible to the public in the context of your comment.

Media

If you upload images to the website, you should avoid uploading images with embedded location data (EXIF GPS) included. Visitors to the website can download and extract any location data from images on the website.

Contact forms

We don’t have any (see below).

Cookies

If you leave a comment on our site you may opt-in to saving your name, email address and website in cookies. These are for your convenience so that you do not have to fill in your details again when you leave another comment. These cookies will last for one year.

If you have an account and you log in to this site, we will set a temporary cookie to determine if your browser accepts cookies. This cookie contains no personal data and is discarded when you close your browser.

When you log in, we will also set up several cookies to save your login information and your screen display choices. Login cookies last for two days, and screen options cookies last for a year. If you select “Remember Me”, your login will persist for two weeks. If you log out of your account, the login cookies will be removed.

If you edit or publish an article, an additional cookie will be saved in your browser. This cookie includes no personal data and simply indicates the post ID of the article you just edited. It expires after 1 day.

Embedded content from other websites

Articles on this site may include embedded content (e.g. videos, images, articles, etc.). Embedded content from other websites behaves in the exact same way as if the visitor has visited the other website.

These websites may collect data about you, use cookies, embed additional third-party tracking, and monitor your interaction with that embedded content, including tracking your interaction with the embedded content if you have an account and are logged in to that website.

Analytics

We don’t currently use analytics for these pages, but they might be collected on embedded content (see above).

Who we share your data with

The hosting service upon which this site operates (Dreamhost) has direct access to any data saved within it, though does so as a service provider under contract. We are not aware of any others having access to the data aside from that published on the web pages.

How long we retain your data

If you leave a comment, the comment and its metadata are retained indefinitely. This is so we can recognize and approve any follow-up comments automatically instead of holding them in a moderation queue.

For users that register on our website (if any), we also store the personal information they provide in their user profile. All users can see, edit, or delete their personal information at any time (except they cannot change their username). Website administrators can also see and edit that information.

What rights you have over your data

If you have an account on this site, or have left comments, you can request to receive an exported file of the personal data we hold about you, including any data you have provided to us. You can also request that we erase any personal data we hold about you. This does not include any data we are obliged to keep for administrative, legal, or security purposes.

Where we send your data

Visitor comments may be checked through an automated spam detection service.

Contact information

Aside from blog comments, you may send additional requests or complaints to Roy, directly, to one of the addresses listed on this site’s root page. If you send to multiple addresses, all will be deleted automatically as spam. Please be aware that Roy receives thousands of emails a day, almost all of which are spam, thanks to the infinite archives of old RFCs.

As you may have noted, my last post seems to have hit a nerve in various communities, particularly with those who are convinced that REST means HTTP (because, well, that’s what they think it means) and that any attempt by me to describe REST with precision is just another elitist philosophical effort that won’t apply to those practical web developers who are just trying to get their javascript to work on more than one browser.

Apparently, I use words with too many syllables when comparing design trade-offs for network-based applications. I use too many general concepts, like hypertext, to describe REST instead of sticking to a concrete example, like HTML. I am supposed to tell them what they need to do, not how to think of the problem space. A few people even complained that my dissertation is too hard to read. Imagine that!

My dissertation is written to a certain audience: experts in the fields of software engineering and network protocol design. These are folks with long industry careers or graduate degrees, usually Ph.D.s who have spent decades learning about their field, identifying an untrodden path to pursue advanced research, and eventually becoming so familiar with that path that they are (hopefully) able to learn something that nobody else in the world has revealed before. In the process, they become specialists, because it is only through specialization that a human being can become sufficiently knowledgeable to find what has yet to be known in a field as large as computing. It is only by becoming specialists that we can understand each other when we explain what we have learned, and thereby grow the field of knowledge over time.

James Burke described the problem of specialization in the final episode of his first series on BBC, Connections. If you are having a hard time following my work, then I strongly suggest you go find a copy of the old episodes somewhere and watch them, bearing in mind that it was first broadcast in 1978, when the folks who brought you the Web were at their most impressionable early age. Mr. Burke would appreciate that connection, I think. What he said deserves a bit of transcripting on my part:

The other, general thing to be said about how change comes about through innovation, and especially about the rate in which that change occurs, is that: the easier you can communicate, the faster change happens.

I mean, if you look back at the past, in that light, you’ll see that there was a great surge in invention in the European Middle Ages, as soon as they had reestablished safe communication between their cities, after the so-called dark ages. There was another one, in the sixteenth century, when these [books] gave scientists and engineers the opportunity to share their knowledge with each other, thanks to a German goldsmith called Johannes Gutenberg who’d invented printing back in the 1450s. And then, when that developed out there, telecommunications, oh a hundred-odd years ago, then things really started to move.

It was with that second surge, in the sixteenth century, that we moved into the era of specialization: people writing about technical subjects in a way that only other scientists would understand. And, as their knowledge grew, so did their need for specialist words to describe that knowledge. If there is a gulf today, between the man-in-the-street and the scientists and the technologists who change his world every day, that’s where it comes from.

It was inevitable. Everyday language was inadequate. I mean, you’re a doctor. How do you operate on somebody when the best description of his condition you have is “a funny feeling in the stomach?” The medical profession talks mumbo jumbo because it needs to be exact. Or would you rather be dead?

And that’s only a very obvious example. Trouble is, when I’m being cured of something, I don’t care if I don’t understand. But what happens when I do care? When, say, the people we vote for are making decisions that effect our lives deeply, `cause that is, after all, when we get our say, isn’t it? When we vote? But say the issue relates to a bit of science and technology we don’t understand? Like, how safe is a reactor somebody wants to build? Or, should we make supersonic airplanes? Then, in the absence of knowledge, what is there to appeal to except our emotions? And then the issue becomes “national prestige,” or “good for jobs,” or “defense of our way of life,” or something. And suddenly you’re not voting for the real issue at all.

[James Burke, “Yesterday, Tomorrow and You” (19:02-21:30), 1978]

Still timely after all these years, isn’t it?

As scientists go, I am a generalist: the topics that I care about range from international politics to physics, with most applications of computing somewhere in between. However, when I send out a message to API designers, I expect the audience to be reasonably competent in the field. I have to talk to them as a specialist because I want them to understand, as specialists themselves, exactly what I am trying to convey and not some second-order derivatives. Most of the terms that I use should already be familiar to them (and thus it is a waste of everyone’s time for me to define them). When there is a concern about a particular term, like hypertext, it can be resolved by pointing out the relevant definitions that I use as an expert in the field.

I don’t try to tell them exactly what to do because, quite frankly, I don’t have anywhere near enough knowledge of their specific context to make such a decision. What I can do is tell them what isn’t REST or that doesn’t fit my definitions, because that is something about which I am guaranteed to know more than anyone else on this planet. That’s what happens when you complete a dissertation on a topic.

So, when you find it hard to understand what I have written, please don’t think of it as talking above your head or just too philosophical to be worth your time. I am writing this way because I think the subject deserves a particular form of precision. Instead, take the time to look up the terms. Think of it as an opportunity to learn something new, not because I said so, but because it will do you some personal good to better understand the depth of our field. Not just the details of what I wrote, but the background knowledge implied by all the strange terms that I used to write it.

Others will try to decipher what I have written in ways that are more direct or applicable to some practical concern of today. I probably won’t, because I am too busy grappling with the next topic, preparing for a conference, writing another standard, traveling to some distant place, or just doing the little things that let me feel I have I earned my paycheck. I am in a permanent state of not enough time. Fortunately, there are more than enough people who are specialist enough to understand what I have written (even when they disagree with it) and care enough about the subject to explain it to others in more concrete terms, provide consulting if you really need it, or just hang out and metablog. That’s a good thing, because it helps refine my knowledge of the field as well.

We are communicating really, really fast these days. Don’t pretend that you can keep up with this field while waiting for others to explain it to you.

I need to address that teaser I included in my last post on paper tigers and hidden dragons:

People need to understand that general-purpose PubSub is not a solution to scalability problems — it simply moves the problem somewhere else, and usually to a place that is inversely supported by the economics.

I am talking here about the economics of sustainable systems, which I consider to be part of System Dynamics and Systems Engineering. Software Architecture and System Dynamics are deeply intertwined, just as Software Engineering and Systems Engineering are deeply intertwined, but they are not the same thing.

Fortunately, this discussion of economics has nothing to do with the financial economy or the proposed bailout on Wall Street. At least, I don’t think it does, though event-based integration and publish-subscribe middleware are most popular within financial institutions (and rightly so, since it matches their natural data stream of buy/sell events and news notifications that drive financial investing/speculation).

Economists have a well-known (but often misunderstood) notion of the economies of scale, which can be summarized as the effect of increased production on the average cost of production (see also Wikipedia and Investopedia). For most types of business, the average cost of producing some item is expected to go down as the business expands. This is a natural consequence of both the spreading of fixed costs over many more customers and the ability of larger distributors to demand more competitive pricing from their upstream suppliers (e.g., Walmart).

If we look at such a business in terms of its system dynamics, we should see “load” on the system increase at a much lower rate than the increase in customers. In other words, the additional customers should not, on average, create more cost than they generate in revenue once the business passes the initial fixed infrastructure cost break-even point. If they do create more cost, then the business is not sustainable at larger scales. For many businesses, that’s just fine — most of the personal services economy is based on models with a small window between “enough customers to survive” and “too many customers to handle without degrading service.”

The same is true of software architectures for network-based applications. As the number of consumers increase, the per-consumer cost of the overall system must decrease in order to sustain the system. Relative economies of scale were used by Nikunj Mehta in his dissertation to compare architectural choices: “A system is considered to scale economically if it responds to increased processing requirements with a sub-linear growth in the resources used for processing.” I like that as a quantifiable definition for the architectural property of scalability. However, Nikunj is focused on comparing the relative scalability of particular implementations within an architecture-driven modular framework, not the type of economics that I alluded to in my post.

It is important to think of system dynamics in terms of economics and not just system throughput, since the impact of growth in non-sustainable systems is often unrelated to the individual data transfers or protocols in use. In fact, some architectures that are more efficient from the standpoint of per-event network usage are also more susceptible to inverse economies of scale. PubSub is one such example.

PubSub (short for publish/subscribe) is a derivative of one of the most analyzed architectural styles in software: event-based integration (EBI). From a purely theoretical perspective, event-based integration is a natural fit for systems that monitor real-time activities, particularly when the number of event sources outnumber the recipients of event notifications (e.g, graphical user interfaces and process control systems). However, EBI does not scale well when the number of recipients greatly outnumbers the sources. Researchers have been studying Internet-scale event notification systems for over two decades. Several derivative styles have been proposed to help reduce the issue of scale, including PubSub, EventBus, and flood distribution. There are probably others that I have left out, but they usually boil down into the use of event brokers (intermediaries) and/or a shared data stream (multicast and flood).

PubSub improves EBI scalability by filtering events, either by publisher-driven topics or subscriber-chosen content, and only delivering notifications to the matching subscribers of those topics/content. In theory, if we are careful in choosing the topics such that they partition the events into relatively equal, non-overlapping subsets, and if the consumers of those events are only interested in subscribing to a small subset of those topics, then the corresponding reduction in notification traffic is substantial. Group chat, as in Jabber rooms, is one such example. For most applications, however, such a partitioning does not exist.

EventBus takes a slightly different tack on scalability by forcing traffic onto shared distribution streams, usually IP multicast or flood-distribution channels, and having all subscribers pick their own events off that stream. Unfortunately, multicast does not scale socially (a tragedy of the commons) and rarely succeeds across organizational domains, whereas flooding only succeeds when the signal/noise ratio is high.

Of course, there are many examples of successful EBI systems. USENET news, for example, combined both hierarchical PubSub and flood-distribution to great effect. Adam Rifkin and Rohit Khare did an extensive survey of Internet-scale event notification systems that still stands the test of time. And, as I said at the beginning, EBI is the most popular style of system architecture within the financial industry, where scalability is offset by relatively high subscription costs.

So, what’s the point of this rambling post? Ah, yes, now I remember: the inverse economics of general-purpose PubSub.

People are greedy. They tend to be event-gluttons, wishing to receive far more information than they actually intend to read, and rarely remember to unsubscribe from event streams. When they stop using a service, they tend to remove the software that reports the notifications locally, not the software or subscriptions that cause it to be delivered. If it were not for automatic bounce handling, Apache mailing lists would have melted down years ago. The only way to constrain such users is a tax on either subscriptions or deliveries.

PubSub systems have an imbalance of effort to subscribe versus effort to deliver. In other words, a single user request results in a potentially large number of server obligations. As such, a benevolent user can produce a disproportionate load on the publisher or broker that is distributing notifications. On the Internet, we don’t have the luxury of designing just for benevolent users, and thus in HTTP systems we call such requests a denial-of-service exploit. Any request architecture that allows a malevolent user to create load as a multiple of the effort required to make the request will result in denial of service attacks, since they succeed without overwhelming the attacker’s own connectivity.

The only solution to the inverse economics of PubSub that I know of is to match the costs to the revenue stream. In other words, charge consumers for the subscriptions they choose to receive, at a rate corresponding to the cost of delivering the subscribed stream at the subscribed rate and over the subscribed medium. That is exactly why there is no standard mechanism for notifications in HTTP: there is no scalable solution for subscribing to events without also standardizing the creation of user accounts and event filtering mechanisms, neither of which tend to be standardized because user management is an application of its own and event filtering is always domain-specific.

What does this mean for a system like Jabber? It means that each jabber server (the broker) needs to keep watch on the types of systems being supported via its infrastructure and adjust their subscription model accordingly. Applications like chat will work great, as will most applications wherein the event sources far outnumber the notification recipients. Applications with inverse economics need to pay their own way.

What does this mean for a system like Twitter? Well, that’s an interesting case because it incorporates both subscription-based following of events and RESTful interfaces for event logs. When a user only follows a small set of twitters, or the set of twitters being followed are mostly quiet, then they might be more economically supported by an EBI system instead of polling their twitter home stream. However, that won’t be true for people who want to follow many twitters or who are only interested in looking at twits in batch (basically, anyone who polls at a lower frequency than their incoming twit rate). In particular, malevolent followers (those robots that follow everyone, for whatever reason) are much more dangerous to an EBI system.

My guess is that Twitter will eventually move to an ad-supported revenue stream on their Web interfaces, since ad revenue increases at the same rate as RESTful applications need to scale (i.e., RESTful systems have a natural economy of scale that makes them sustainable even when their popularity isn’t anticipated). Likewise, Twitter’s event-based interfaces will move toward a more sustainable subscription model that charges for excessive following and premium services like real-time event delivery via XMPP or SMS. That kind of adjustment is necessary to rebalance the system dynamics.

One of the sessions that I attended at OSCON was “Beyond REST? Building Data Services with XMPP PubSub” by Evan Henshaw-Plath and Kellan Elliott-McCrea. I think you can guess why that made me curious, but it was interesting to see how much that curiosity was shared by the rest of the conference: the room filled up long before the scheduled start. They certainly gave a very entertaining talk and one that spilled over into the blogosphere in posts by Stephen O’Grady, Joshua Schachter, and Debasish Ghosh.

Unfortunately, the technical argument was a gigantic paper tiger, which is a shame given that there are plenty of situations in which event-based architectures are a better solution than REST-based architectures. I made a brief comment about notification design and how they seemed to be ignoring a good twenty years of research on Internet-scale event notification systems. People need to understand that general-purpose PubSub is not a solution to scalability problems — it simply moves the problem somewhere else, and usually to a place that is inversely supported by the economics. I’ll have to explain that in a later post, since this one is focused on the technical.

Here’s the tiger:

On July 21st, 2008, friendfeed crawled flickr 2.9 million times to get the latest photos of 45,754 users, of which 6,721 of that 45,754 potentially uploaded a photo.

Conclusion: Polling sucks.

If you’d like to learn more about their XMPP solution, the slides are available from the OSCON website. I do think there is a lot to be learned from using different interaction styles and true stream-oriented protocols (the kind that don’t care about lost packets), but this FriendFeed example is ridiculous. It took me less than 30 seconds to design a better solution using nothing more than HTTP, and that while sitting in the middle of a conference session. This is exactly what Dare means by: “If a service doesn’t scale it is more likely due to bad design than to technology choice.

They are comparing the efficiency of blind polling using HTTP crawls to a coordinated PubSub services setup with XMPP.  Spidering an entire site is obviously not going to be efficient if you are only interested in what has changed. One advantage it has is that you don’t need to cooperate with the information provider. However, for the specific example of FriendFeed polling Flickr, cooperation is easy (both companies gain immensely by cooperating) and essential (2.9 million requests per day will get you blocked from any site that doesn’t want cooperation).

The solution, which I mentioned briefly at the talk, is to provide a resource that reflects all of the changes on Flickr during a given time period. Anyone (not just FriendFeed) can perform a GET on that resource to find out what has changed. In fact, one such example is the SUP (Simple Update Protocol) just introduced by FriendFeed. However, I had something more efficient in mind.

Web architects must understand that resources are just consistent mappings from an identifier to some set of views on server-side state. If one view doesn’t suit your needs, then feel free to create a different resource that provides a better view (for any definition of “better”). These views need not have anything to do with how the information is stored on the server, or even what kind of state it ultimately reflects. It just needs to be understandable (and actionable) by the recipient.

In this case, we want to represent the last-updated state of all Flickr users in a way that minimizes the lag between event and notification (let’s just assume that one minute is “fast enough” to receive a change notification). The simplest way of doing that is to log state changes by userid in a sequence of journal-style resources named according to the server’s UTC clock minutes.  For example,

http://example.com/_2008/09/03/0000
http://example.com/_2008/09/03/0001
http://example.com/_2008/09/03/0002
...
http://example.com/_2008/09/03/2359

This URI pattern instantly drops the poll count from 2.9 million to 1440 (the number of minutes in a day) plus whatever pages are retrieved after we notice a user has changed their state. Alternatively, we could define a single append-only resource per day and use partial GET requests to retrieve only the bits since the last poll, but that tends to be harder on the server. Representations for the above resources can be generated by non-critical processes, cached, and even served from a separate distribution channel (like SUP).

What, then, should we include in the representation? Well, a simple list of relative URIs is good enough if the pattern is public, but that would be unwise for a site that features limited publication (obscured identifiers so that only people who have been given the URI can find the updated pictures). Likewise, the list might become unwieldy during event storms, when many users happen to publish at once. Of course, like any good CS problem, we can solve that with another layer of indirection.

Instead of a list of changed user ids or URIs, we can represent the state as a sparse bit array corresponding to all of Flickr’s users. I don’t know exactly how many users there are at Flickr, but let’s be generous and estimate it at one million. One million bits seems like a lot, but it is only 122kB in an uncompressed array. Considering that this array will only contain 1s when an update has occurred within the last minute, my guess is that it would average under 1kB per representation.

I can just imagine people reading “sparse bit array” and thinking that I must be talking about some optimal data structure that only chief scientists would think to use on the Web. I’m not. Any black-and-white GIF or PNG image is just a sparse bit array, and they have the nice side-effect of being easy to visualize. We can define our representation of 1 million Flickr users to be a 1000×1000 pixel black-and-white image and use existing tools for its generation (again, something that is easily done outside the critical path by separate programs observing the logs of changes within Flickr). I am quite certain that a site like Flickr can deliver 1kB images all day without impacting their scalability.

Finally, we need a way to map from the bits, each indicating that a user has changed something, to the much smaller set of users that FriendFeed knows about and wishes to receive notifications. If we can assume that the mapping is reasonably persistent (a month should be long enough), then we can define another resource to act as a mapping service. Such as,

http://example.com/_2008/09/03/users?{userid}

which takes as input a userid (someone that a friend already knows and wants to monitor for changes) and returns the coordinate within the sparse array (the pixel within the 1000×1000 image) that corresponds to that user. FriendFeed can store that accumulated set of “interesting users” as another image file, using it like an “AND mask” filter to find the interesting changes on Flickr.

Note that this is all just a quick thought experiment based on the general idea. In order to build such a thing right, I’d have to know the internals of Flickr and what kinds of information FriendFeed is looking to receive, and there are many potential variations on the representations that might better suit those needs (for example, periods could be overlapped using gray-scale instead of B&W). The implementation has many other potential uses as well, since the sequence of images provide an active visualization of Flickr health.

I should also note that the above is not yet fully RESTful, at least how I use the term. All I have done is described the service interfaces, which is no more than any RPC. In order to make it RESTful, I would need to add hypertext to introduce and define the service, describe how to perform the mapping using forms and/or link templates, and provide code to combine the visualizations in useful ways. I could even go further and define these relationships as a standard, much like Atom has standardized a normal set of HTTP relationships with expected semantics, but I have bigger fish to fry right now.

The point is that you don’t need to change technologies (or even interaction styles) to solve a problem of information transfer efficiency. Sometimes you just need to rethink the problem. There are many systems for which a different architecture is far more efficient, just as XMPP is far more efficient than HTTP for something like group chat. Large-scale collaborative monitoring is not one of them. An XMPP solution is far more applicable to peer-to-peer monitoring, where there is no central service that is interested in the entire state of a big site like Flickr, but even then we have to keep in mind that the economics of the crowd will dictate scalability, not the protocol used for information transfer.

Hmm, over five months since my last (published) post. I don’t know how folks like Tim and Sam keep writing every day. I’ll have to ask when I see them at ApacheCon US in New Orleans.

I fell off the blogon while trying to prepare my two presentations for ApacheCon Europe and have yet to catch up with my email since then. Seriously, I spend way too much time reading and deleting email, so if you’ve sent me something and haven’t received a response then please understand that it isn’t because I don’t like you (unless, of course, you sent it to several of my email addresses at once, in which case it was probably blocked as spam).

Whenever I did find time to blog, I found myself wanting to update the software first, fiddle with the comment settings, or try out various blogspam avoidance tricks. My personal blog runs on WordPress, both because it’s cool cutting-edge open source software and because my ISP won’t let me run anything written in Java (let alone Day’s commercial products). It also has the nice side-effect of forcing me to pay attention to everyday web content management issues, instead of just the more abstract (implementation independent) issues of Web protocols. In other words, it keeps me in the loop in terms of what people are looking for in public-facing web content management. And, if I want to see what it looks like through our own software, I can just wander over to Planet Day.

One area in which open source blogging software is way ahead of the curve is support for collaborative antispam solutions. I’ve been much happier with my blog since I installed the TypePad Antispam plugin. The social filtering of blog spam is even more effective than it has been for Gmail, especially given the setting to automatically discard comments that look like spam on old posts. I wonder how long it would take Michael Marth to write an Apache Sling plugin for comment filtering using the TypePad service?

Noah Mendelsohn posted a pre-draft about some thoughts on resources, information resources and representations, in which he tries to make sense of the never-ending debate on httpRange-14 httpRedirections-57.

I’ve been vaguely following this discussion on the W3C TAG mailing list for some time and, against my better judgement, I feel a need to comment on the suggestions. Actually, that’s not quite right, since I really don’t have any polite comments to make on the suggestions, and I wouldn’t want to offend my friends on the TAG.

Instead, I’ll just ask a few questions about this one sentence:

A key requirement of the Semantic Web is that URIs be used
to identify resources unambiguously.

Why?

  • What makes that a key requirement of the Semantic Web?
  • What makes the TAG think it is feasible to satisfy that requirement?
  • How will the TAG know when the requirement is (or isn’t) satisfied?
  • If someone can show that this requirement is inherently unsatisfiable, doesn’t that imply the Semantic Web will never happen?

Or, is it not a requirement, and all these proposals are just an excuse to avoid designing a Semantic Web that would actually work within the same problem space of the current Web?

I don’t want the Web to constrain what people do: the Web
is not there to constrain society. It’s there to model society
in its completeness, in its entirety. [Tim Berners-Lee, 1994]

Have we forgotten why?

On the Web, millions of people mint URIs, and millions more use them in references. Millions of human beings, conversing over time, with an occasional URI thrown in to refer to a subject under discussion.

When was the last time you had an unambiguous discussion?

I have been posting things on the Internet for so long, in so many forms, that I never felt a need to join into the latest round we call blogging, particularly since I already spend most of my working days simply trying to get through the email. Lately, however, it seems that I spend more time answering people’s questions about REST than I should – many of them questions that I have answered a dozen times or more, but are simply lost in the thousands of email messages archived in too many places around the net for me to even think about tracking them down and forwarding. So, I write more email, and more archives pile up. I need to start organizing my own correspondence.

Why untangled? Because I’ve always liked this quote from Babylon 5:

“You’re a problem solver. You’re one of these people who would pick up a rope that’s gotten all tangled up and spend an entire day untangling it. Because it’s a challenge, because it defies your sense of order in the universe, and because you can.”

which is about as close a description of me as I could find, at least in the abstract. I am not quite as bad as Monk, but I love a good puzzle, and I can be exceedingly stubborn about finishing the things that I start.

Untangled is the personal blog of Roy T. Fielding, a Senior Principal Scientist at Adobe. This blog is on Roy’s personal site and does not represent the thoughts, intentions, plans, or strategies of Adobe; it reflects only Roy’s own personal opinion and the opinions of those who have posted comments.

Dr. Fielding is best known for his work in developing and defining the modern World Wide Web infrastructure. He is the primary architect of the Hypertext Transfer Protocol (HTTP/1.1), coauthor of the Internet standards for HTTP and Uniform Resource Identifiers (URI), and a founder of several open source software projects (including the Apache HTTP Server Project that produces the software running most Web servers).

Dr. Fielding received his Ph.D. degree in Information and Computer Science from the University of California, Irvine. His research interests include the World Wide Web, Web-based content management, software architecture for network-based applications, application-layer network protocols, collaborative software development methods, and global software engineering environments. His dissertation, Architectural Styles and the Design of Network-based Software Architectures, defines the REST architectural style as a model for the design principles behind the modern Web architecture.

Dr. Fielding has been honored with the 1999 ACM Software System Award for his work on the Apache HTTP server project, by ACM/SIGSOFT and IEEE TSEC as a coauthor of the 2010 ICSE Most Influential Paper, by MIT Technology Review as a member of the first TR100, by the O’Reilly Open Source 2000 with the Appaloosa Award for Vision, and was among the first elected members of the W3C Technical Architecture Group.

Dr. Fielding continues to serve as a member of The Apache Software Foundation and an external adviser for the University of California’s Institute for Software Research.

License

The content on this blog is licensed as if it were a public forum, with Roy as moderator. Authors of articles or comments retain their copyright and license Roy to publish that content with attribution. Roy has sole discretion as to what content gets published on this site. Inappropriate comments will be deleted. Roy may make editorial changes to posted comments for the sake of fixing embedded mark-up, updating links, or correcting obvious typos. Contributors may delete their own comments, or ask Roy to do so if needed.

Unless otherwise noted, code snippets within the blog are published under the same open source license as the product being discussed. If no open source license is apparent from the context, then permission is hereby given to redistribute the code under either the Apache License 2.0 or the new BSD license.

Roy’s articles are licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.