You work for me, Computer.

By Brandon Bloom

Fipp: Fast Idiomatic Pretty-Printer for Clojure

I decided to create a new pretty printer in the spirit of “Data All The Things!” And it’s fast too!

Fipp, the Fast Idiomatic Pretty-Printer for Clojure, is a pretty printer with linear runtime and bounded space requirements. Unlike clojure.pprint’s side-effectual API, Fipp is configured by pretty print documents that are similar to Hiccup HTML templates. It’s based on some fancy pants research pretty printers from the Haskell world.
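A minimal sketch of what usage looks like, assuming the fipp.edn entry point described in the README (the :width option bounds the rendered line width):

(require '[fipp.edn :as fipp])

;; Pretty-prints the data, breaking lines so they fit within 30 columns.
(fipp/pprint {:name "fipp" :tags [:pretty :printer :fast]} {:width 30})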

You can see my implementation and lots more notes at https://github.com/brandonbloom/fipp

Factjor: Concatenative Clojure

I’ve released another small Clojure project: Factjor

Head on over to GitHub for a tiny intro in the README. Lots more explanation and context will follow in a future post, after I release the project that motivated Factjor.

The Node.js REPL Is Broken

Not only is the Node.js REPL broken, but so are the stock REPLs for Python and Ruby too! In fact, for many of you, every single REPL you’ve ever used has been fundamentally flawed. Moreover, even if you used a fancy replacement REPL, such as UltraREPL, IPython, or Pry, you’d only be mitigating the problem: JavaScript, Python, and Ruby are all fundamentally flawed languages from a REPL-lover’s perspective.

Now, I realize that this is a bold and, thus far, unsubstantiated claim. Python and Ruby are revered for interactive, dynamic development and Firebug’s JavaScript console changed the game for web apps. So with so many happy REPL users, what are these fundamental flaws? Since it is the worst offender, let me pick on JavaScript for a moment…

Node’s stock REPL provides a fragile and awkward notion of an evaluation context. A fresh REPL is configured with an evaluation context as if by both with(context) and .call(context) to enable unqualified access to context variables, as well as introspection of the this context. This, however, is a convenient illusion. If you evaluate var x = 1; this, you’ll note that x is added to this. If you were to use var inside a with block, you wouldn’t get that same behavior. The top-level inside the REPL is somewhat magical.

CommonJS Modules provides the blueprint for Node’s module system. The key idea is that Plain Old JavaScript Objects can be used to describe modules and their exports. Simple, but perhaps too simple. Consider if you were working on some function f in module m. You boot up your trusty REPL, evaluate var f = require('m').f; and try out some corner cases. Uh oh, f has a bug, so you modify src/m.js and then you… promptly restart your REPL! The require function doesn’t offer any sort of {reload: true} option, and even if it did, you’d still be left with a reference f pointing to an old version of the function. That might not be a big problem if you have just one function, but what if you have several objects and functions in a larger system? The too-simple plain old JavaScript objects approach results in copying references and assigning those copies to local names. It is extremely difficult to reason about changes. Even if you knew to cheat by reaching inside of require.cache with a delete, you’d still be better off restarting your REPL to preserve your sanity. That’s a real shame if you had a bunch of interesting test cases lying around in memory, or if there is an expensive initialization computation you need to wait through each time.

Now that I’m done picking on JavaScript, I’ll point out that the situations in both Python and Ruby aren’t much better. Python doesn’t have a notion of a current module, although it does provide the builtin locals() function and lets you call code.interact(local={...}) to start a nested sub-REPL with something akin to Node’s magical top-level. Ruby fares marginally better by providing a reasonably well behaved implicit self, which can be shadowed in a nested sub-REPL via an argument to the confusingly named irb method. Python’s modules have a similar issue to plain old JavaScript objects with from m import f, and Ruby’s metaprogramming facilities add more than enough complexity to make REPL-coding hair-raising.

It doesn’t have to be this way. There are languages that are much better suited to iterative REPL development. If you’ve been following my blog, then this is where you’d expect me to praise Clojure. I’ll save that for a future post in which I’ll discuss what is really necessary for a successful REPL. And while Clojure’s REPL is far better than any other I’ve ever used, it has its own shortcomings to discuss as well.

Comments

Florian said…

Note that you’re lamenting for the most part that python, ruby and JS can’t do “hot-reload”.

This has nothing to do with an interactive prompt.

Hot reloading never really does work whatever you do (it doesn’t “just work” in smalltalk and any lisp dialect either). The first reason is that even if you can magically substitute all references to everything with the right version (which smalltalk and most lisp dialects can) it still leaves in-memory data corrupt (you haven’t written an in-memory migration script have you?).

The situation is somewhat worse for any language that allows references to everything (like python, ruby and JS) and is not side-effect free. In those environments hot-reloading becomes all but impossible, and although that’s a bit sad, the very way those languages work also makes them pleasant to use unless you want a hot reload.

Brandon Bloom said…

Being able to “hot-reload” a file is nice, but the failing is in the manner by which names are resolved. Aliasing imports into a local variable lacks a level of indirection; this precludes hot-swapping.

This has nothing to do with an interactive prompt.

These are design decisions that impact the usability of interactive prompts.

you haven’t written an in-memory migration script have you?

Sometimes I do migrate my in-memory data structures… I enjoy having that option.

devdazed said…

I agree that the Node REPL is by far the worst offender, but I think the largest point should have been that the REPL decides when an error makes it crash. For instance, if you run:

throw new Error('foo');

All is well and it prints a stack trace; however, if you run:

process.nextTick(function(){ throw new Error('foo') });

It will cause the entire REPL to crash, and thus you need to start from square one. It’s this type of inconsistent behavior that makes the Node REPL infuriating.

IMO, the REPL should never crash and if there is anything a user can do to make it crash (aside from sending a kill -9 command) then it is fundamentally flawed.

Another thing is that you can cause a JS OOM error by simply creating a tight loop that prints to the console.

Additionally, standard interrupt signals (such as Ctrl+C) won’t break such a tight loop, so your only choice is to kill the entire application.

Clojure’s Multimethod Dispatch as a Library

In keeping with the theme, this post motivates dispatch-map

Clojure’s multimethods embody a powerful polymorphic dispatch mechanism. With multimethods, you can define a function which is polymorphic with respect to a given hierarchy and arbitrary dispatch value. The hierarchy is actually a directed acyclic graph, but you can adjust method priority to address the diamond problem.

Unfortunately, Clojure’s dispatch mechanism is baked into the defmulti and defmethod forms. If you want multimethod-style dispatch, you need a Var to hold your multimethod. Similar to the circumstances which prompted the creation of backtick, I discovered that there was an opportunity to decomplect a part of the core library to extract a useful subcomponent. In this case, the data structure which describes dispatch rules was complected with the identity which points to those dispatch rules. As a result, it’s impossible to treat a multimethod as an immutable value.

So I created dispatch-map. A dispatch map is just like a regular Clojure map, but it comes coupled with a dispatch function and hierarchy. Looking up keys in a dispatch map has the same dispatch behavior and caching functionality as multimethods. However, unlike a multimethod, a dispatch-map is a true value. The standard map operators assoc, dissoc, etc., return a new dispatch-map, leaving the original unmodified. This enables you to manipulate dispatch maps like any other Clojure data structure, without the need for a named, mutable Var. Clojure’s multimethods and all related functions could trivially be reimplemented in terms of a dispatch-map and an atom (edit: and now are!).
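Here is a hedged sketch of the idea; the namespace and constructor shape are my assumptions (check the README for the real API), but it shows why value semantics matter:

(require '[dispatch-map.core :refer [dispatch-map]])

;; Dispatch on class; :default is assumed to behave like a multimethod's
;; default dispatch value.
(def handlers
  (dispatch-map class
    String   :string-handler
    Number   :number-handler
    :default :fallback-handler))

(get handlers "hello") ;=> :string-handler
(get handlers 42)      ;=> :number-handler (Long isa? Number)

;; assoc returns a new dispatch-map; handlers itself is never modified.
(def handlers+ (assoc handlers java.util.Date :date-handler))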

One use for a dispatch-map as a value is to store it within some other data structure, without breaking the value guarantees of the enclosing data structure. In my case, I’ve got a dispatch-map of data types to their GUI representations. In a hierarchical definition of a GUI, each node can have a :templates key which is a (dispatch-map class). If you want to render a (Person.) inside a list box, you can rely on automatic dispatch by type to find the appropriate template: Simply walk up the GUI hierarchy and look in the :templates map. Thanks to the powerful dispatch functionality, this will also work if you have an (Employee.) too!
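A sketch of that templates use case, using keywords and the global hierarchy in place of record classes (again, the exact dispatch-map API is an assumption on my part):

(derive ::employee ::person)

(def templates
  (dispatch-map :type
    ::person [:li "person row template"]))

;; The lookup key's :type is ::employee, which isa? ::person, so the
;; person template is found.
(get templates {:type ::employee :name "Ada"})
;=> [:li "person row template"]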

Comments

tgoossens said…

Doesn’t one restrict himself then? What to do when you have a new type that you need to get working with the existing code? You cannot add any new methods. It’s a neat idea and I’m sure you can use it for a lot of stuff. But I think – if my previous statement is correct – that the advantage of having it as a value doesn’t compensate for losing the ability to extend (which is basically what you want to do to solve the expression problem right?)

Brandon Bloom said…

You cannot add any new methods.

If that were true, you couldn’t add any new key value pairs to an immutable map…

the advantage of having it as a value doesn't compensate for losing the ability to extend

You’re not losing your ability to extend; you’re simply shifting its responsibility elsewhere.

Just put your dispatch-map in an atom, and you’ve basically got a multimethod. More accurately, a multimethod is a dispatch-map plus a Var. The goal of dispatch-map is to decomplect dispatch from state.

Consider, for example, if I had two separate multimethods (call them f and g) and wanted to add a method to both atomically. With a multimethod, you would have to (locking xx (defmethod f …) (defmethod g …)), when you could instead (swap! yy #(-> % (assoc-in [:x a] …) (assoc-in [:y a] …)))

tgoossens said…

Ok cool. I think that now I get what you were trying to achieve. Thanks for the explanation!

Paul Stadig said…

Conceptually, an atom and a dispatch-map could replace multimethods. It might be possible with multimethods, since they are so slow anyway. However, there could be performance issues.

I did some similar experiments trying to extract type based dispatch from protocol functions. I started with an atom and a dispatch table, and it turned out to be pretty slow.

Brandon Bloom said…

“Slow” is a relative term. Surely multimethods are slower than protocol dispatch. And protocol dispatch is surely slower than direct dispatch.

However, each yields an additional level of indirection for an additional abstraction need. Direct dispatch offers no abstraction. Protocol dispatch offers abstraction by type. Multimethods offer arbitrary dispatch with respect to a directed acyclic graph. In theory, you could also create a predicate-dispatch-map, which would allow dispatch with respect to an arbitrary decision tree.

With each level of abstraction, you gain a little more flexibility and dynamism at the cost of a little more performance. However, if you can statically determine a dispatch value, you can eliminate those performance costs at compile time. That’s likely not a worthwhile exercise, though, since dynamism is often the whole point of that level of abstraction.

In the case of both multimethods and my dispatch-maps, dispatch values are memoized. Subsequent calls given the same dispatch key do not require traversing the hierarchy to find a “best” match. The result is that the amortized cost of a lookup is the cost of the dispatch-fn plus the cost of one tree lookup. That’s quite reasonable in my mind.

Dataflow Solver (Qplan for Clojure)

This post describes the motivation for my next Clojure library: qplan

The overwhelming majority of graphical applications are built using ad-hoc event callbacks. Some portion of graphical applications, however, use a databinding model of varying expressivity. Often, these databinding models are simply shorthand for ad-hoc dependency graphs composed of event callbacks. Inevitably, the databinding system will be insufficiently expressive, necessitating manual callback wiring. Even with the most expressive databinding systems available in application frameworks these days, you end up with a shockingly fragile rat’s nest of control flow and execution order.

Quite a bit of activity has occurred in the research community surrounding more formal, more composable, and less fragile event handling. The buzzword du jour is “Functional Reactive Programming” (FRP) which, although quite varied by author, generally involves the concepts of continuous “signals”, discrete “events”, and some evaluation strategy. FRP was born of animation systems and is a fertile research area. More generally, FRP is a type of dataflow programming. The “reactive” moniker also applies to the area of push-sequences, such as .NET’s IObservable and associated Reactive Extensions (Rx).

As fun as it has been to study FRP, there has been little success building real applications using its techniques. In my opinion, this is because FRP encourages a continuous model of your application, but most graphical applications are much more discrete in nature. Multimedia systems, like the animation system that motivated the first FRP paper, and games have quite a few continuous use cases. Point-and-click software, however, generally has discrete controls, with discrete values, and discrete commands. Consider, for example, the undo-able discrete mutations in the most broadly deployed dataflow application on the planet: spreadsheets. Although capable of representing discrete events, FRP systems project the mental framework of water flowing through pipes. This mental framework clashes badly with the more familiar event callback model. Furthermore, even for truly continuous use cases, discretization is essential to the nature of computation; every game fundamentally distills down to an update and render loop.

I’m still contemplating whether or not it is possible to build a databinding system that is an order of magnitude more expressive and composable than the existing state of the art. However, I’m certain that the underlying architecture will be a discrete system which embraces a formal notion of time through an explicit notion of identity and value. Consider again the spreadsheet, where cells have an identity composed of their row and column labels. As you change the spreadsheet, the document updates the values in all cells synchronously in response to a discrete command. This approach can be extended to an arbitrary graph of identities connected by dataflow constraints.

Given a sprawling graph of identities, you need some evaluation system for assigning values to each identity. Unlike a spreadsheet, many applications demand sophisticated and potentially cyclical bindings. In the face of multi-input, multi-output, and multi-directional dataflow, there are many potential evaluation strategies yielding many different graph linearizations. You need a constraint solver suited to the task. Luckily, Texas A&M has studied this problem in collaboration with Adobe. Their approach is referred to as Property Models and addresses many of the issues with relying on solvers for user-facing constraint systems. The Property Models approach is much broader than simply databinding; it also considers predictability and effectively synthesizing commands, such as those recorded in Photoshop’s History pane.

At the core of the Property Model system is a variation of Vander Zanden’s Incremental Algorithm for Satisfying Hierarchies of Multiway Dataflow Constraints, as described by Freeman, et al. I’ve produced a Clojure implementation of this algorithm called qplan.

My intention is to use qplan as part of a larger approach to building graphical user interfaces. I expect that a complete system requires a variety of algorithms and solvers tuned for particular use cases. For example, Adobe’s Adam and Eve has both a Quickplan solver for property models and a linear constraint solver for layout. An even more complete system may have an FRP evaluation strategy for animations.

Templating: Clojure’s Backtick

This post describes the motivation for my new Clojure library: Backtick

There is no shortage of template languages. In fact, there are probably far too many of them. Template languages fall into two major categories that I’ll call procedural and structural. Procedural templating languages operate by emitting code as a side-effect. Structural templating languages operate on trees. For example, Ruby’s ERB is procedural, whereas HAML is structural.

The primary advantage of a text-based procedural template language is reach. ERB can be used to generate any type of text file, while HAML can only really be used to emit HTML. Reach is not without disadvantages, however. For one, it’s quite easy to write incorrect ERB templates which emit invalid syntax, such as missing a close tag. Additionally, code with side effects can be difficult to refactor because execution order is critical. You can trivially generate SQL statements with an ERB template, but you’ll quickly accumulate an incomprehensible mud ball, so you’re better off working with a system that treats queries as data and lets you manipulate them structurally. Unfortunately, you’ll need to implement a unique structural tool for each target data structure. If you’re generating a lot of HTML, then using HAML is a great choice. But if you’re only generating one HTML file among a sea of dozens of other arbitrary config file formats, then you really want ERB.

Generalizing, structural templating systems are really just plain old pure functions. You take some data, let’s say describing a blog post, and then you return some other data describing the rendered representation of that same post. This is a powerful realization because it enables you to utilize your language’s full set of utility functions for refactoring your templates. For an HTML example, see Noir’s partials and their use of Hiccup.
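A tiny illustration of that point, using Hiccup-style vectors; it is just data in, data out, so refactoring a template is ordinary function refactoring:

(defn post-template [{:keys [title body]}]
  [:article
   [:h1 title]
   [:div.body body]])

(post-template {:title "Backtick" :body "..."})
;=> [:article [:h1 "Backtick"] [:div.body "..."]]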

When you get into the wonderful world of metaprogramming, templating systems encounter a series of new problems. First, there is multi-level escape character hell. Anyone who has ever tried to write a shell script, which substitutes arguments into an Awk script, which generates a config file, knows what I’m talking about. Second, there is the issue of names. Unlike HTML or many simple config files, code generally has one or more notions of context, such as scoping or namespaces. You need to worry about problems like variable capture. Yikes!

Early lisps discovered a mechanism for minimizing the pain of escape character hell: quoting. Quoting is similar to escaping, in much the same way that structural templates relate to procedural ones. Escaping lets one combine any two languages, provided the language on top provides a uniform escaping policy. However, with each new layer of languages, there is a new layer of escaping rules and, if those escape sequences conflict, you’ll have to double-, or even triple-escape common metacharacters like apostrophes, backslashes, and dollar signs. There’s no protection against incorrectly escaped strings and refactoring is thwarted by the need to increment and decrement escaping levels. In contrast, quoting takes advantage of Lisp’s homoiconicity to simplify template indirection into four composable, primitive operators: quote, syntax-quote, unquote, and unquote-splicing.

If you’re already a Clojure syntax-quote pro, you may want to skip down to the next section.

Due to their high frequency of use, both Common Lisp and Clojure reserve some of their limited syntax for these primitives. Common Lisp uses the apostrophe ('), backtick (`), comma (,), and comma-at (,@) character sequences as shorthand for quote, syntax-quote, unquote, and unquote-splicing respectively. Because Clojure treats commas as whitespace, it substitutes tilde (~) and tilde-at (~@) sequences for the unquoting operators. I’ll use Clojure’s notation for my examples.

The quote operator is used to protect symbols from evaluation. If I have a symbol x with value 5, then the expression (quote x) or 'x does not evaluate to 5; it evaluates to the symbol x. Quoting is distributed recursively throughout a form. '(f x) will protect both f and x from evaluation. It’s this distributive property that allows quoting and unquoting to compose across multiple levels of templating.

You might guess that (unquote 'x) or ~'x would return 5, but that operation is performed by the eval function: (eval 'x) does return 5. Trying to unquote here will generate an error. This distinction is because eval is generally implemented as a function, not a macro or special form. It does not alter the interpretation of its inputs, so its behavior doesn’t play nice with the distributive property of quoting.

Unquoting comes into play once you introduce syntax-quoting. Syntax-quotes are demarcated by a backtick, which can be thought of as switching Lisp into template mode. Like regular quotes, syntax-quotes are distributive. Unlike regular quote, which leaves its input form alone, syntax-quote transforms its input in two important ways: First it enables unquoting. You can think of unquoting like template variable substitution. Second, it resolves symbols. We’ll come back to this second point later.

Let’s say that you want to generate the code (f 1 (g 2 3)) where the numbers are provided by variables x and y holding the number 1 and the vector [2 3] respectively. You can accomplish this with the expression (syntax-quote (f (unquote x) (g (unquote-splicing y)))). Well, sort of. Clojure does not provide direct access to syntax-quote as a symbol. You need to use the shorthand form: `(f ~x (g ~@y)). Luckily, that’s nicer to look at anyway.

Unquote substitutes in a single value, like a dollar-sign in a shell script. Unquote-splicing is a little bit more interesting. When a syntax quote is transforming its inputs, it looks for unquotes and evaluates them. The evaluated result is substituted into the parent expression. Regular unquotes are substituted directly, but splicing unquotes cause the parent expression to be rewritten as a concatenation. The expression `(a b ~@x c d) is transformed into an expression resembling (concat '(a b) x '(c d)). The splicing operation is a more powerful alternative to a traditional templating language’s looping constructs. It can also emulate a traditional templating language’s if statement. The expression `(f ~@(when b [x])) will return code that passes the x argument to f only when b is true.

As discussed previously, quoting and unquoting are distributive. This is what allows you to escape from escape character hell. The expression `(f (g x) ~(h y `z)) distributes the quoting behavior down to f, g, and x, but the unquoting operator decrements the quoting level, such that neither h nor y are quoted. The z is explicitly quoted, re-incrementing the quoting level. You can’t do that with escape sequences, because, like a procedural template language, there is no explicit notion of the inherent tree structure. It’s worth noting that most lisps don’t have an explicit quoting level, instead syntax-quote processes its inputs recursively; the quote level is maintained implicitly by the execution stack.

That’s not all you need to know about syntax-quote; there’s also symbol resolution. Now that you’ve escaped escape hell and you’re writing structural templates, you’re able to do bigger and better metaprogramming. If you’re writing Common Lisp, you’ll quickly run into the aforementioned variable capture problem: You’ll generate code that includes name collisions. Clojure’s syntax-quote differs from Common Lisp’s by providing two mechanisms to combat this problem. First is namespace resolution. Clojure will expand `inc into clojure.core/inc and `foobar into user/foobar. This protects you from free symbol capture. Secondly, Clojure provides automatic gensyms. Symbols ending with a # character inside a syntax-quote are replaced by an automatically generated name. So the expression `(x# x# y#) will expand into something like (x__1__auto__ x__1__auto__ y__2__auto__), which makes macro argument capture virtually impossible. As linked several times, Paul Graham’s On Lisp is the best source for understanding these issues more deeply.
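Here is a small REPL sketch of all of the above; the gensym numbers are whatever the reader happens to generate:

(def x 1)
(def y [2 3])

`(f ~x (g ~@y)) ;=> (user/f 1 (user/g 2 3))
`(inc ~(+ 1 2)) ;=> (clojure.core/inc 3)
`(x# x# y#)     ;=> (x__1__auto__ x__1__auto__ y__2__auto__), roughly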

[Experienced Clojurians can start reading from here]

OK, so now that you have an extremely powerful template language baked into Clojure’s syntax-quoting mechanism, you still have the problem of reach: What good is a template language if you can only generate Clojure code with it? What if you need to generate some HTML? Or what about that Awk script?

Unfortunately, the template language can only really yield Clojure forms. Luckily, Clojure forms are relatively rich with a variety of primitives and collection types, so much of the Clojure community uses them for configuration files and DSLs. If you’re generating input to these Clojure systems, you’re in good shape. You don’t have quite the same reach as “all tools that operate on a stream of characters”, but “any library in the Clojure ecosystem” is a pretty damn good start. If you’re going to create a system to structurally generate SQL queries, you’ll need to build some kind of query representation that can be compiled to a SQL string. In a non-homoiconic language, you’d eventually need to come up with a new structural template engine too. In Clojure, your query library will get that template engine for free!

However, there is still a problem. When you first encountered code duplication, you reached for a simple procedural text template. Now you have the escape sequence problem. Lisp mitigates the escape sequence problem with quoting. But once you have larger scale templates, you start running into the variable capture problem. So Clojure solves the variable capture problem with an enhanced syntax-quote. Well just as each prior solution yields a new problem, so too does Clojure’s enhanced syntax-quote.

The problem is that Clojure tightly couples the syntax-quote symbol resolution to Clojure’s namespace system. This isn’t usually a problem if you’re always using syntax-quotes to generate code that will ultimately be executed in your local Clojure environment. However, consider the case of generating code to be executed in a remote Clojure environment. Or what about generating Clojure forms that will be compiled to SQL procedures to be executed on your database server? In those two cases, it’s less likely that resolving symbols against the locally loaded namespaces is a useful behavior. Suddenly, Clojure’s powerful template language is unavailable to you! Either you have to piece together your own template system, just as a non-homoiconic language, or you need to play tricks by manually or procedurally defining vars. Ouch.

Enter Backtick.

Backtick is an absurdly simple library. It provides Clojure’s syntax-quote reader macro as a normal macro. You can create new template quoting macros by providing a custom symbol resolver. Check it out!
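A sketch of what that looks like; I believe the library exposes a template macro that behaves like syntax-quote minus the namespace resolution, but treat the details below as assumptions and confirm them against the README:

(require '[backtick :refer [template]])

(def x 1)
(def y [2 3])

;; Unquoting and splicing work as usual, but f and g are left unresolved,
;; so the form can be shipped to a remote environment or another compiler.
(template (f ~x (g ~@y))) ;=> (f 1 (g 2 3))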

Comments

drcode said…

Thanks- I need this.

Conrad Barski

Brandon Bloom said…

Cool! Let me know how it works out for you.

Unknown said…

I really needed this! Thank you for making this available.

ClojureScript Projects

I stumbled across Clojure a few times, but I’ve fallen in love with it sometime over the past year. I’ve always loved the idea of Lisp, but I never loved any particular Lisp until Clojure.

The name “CLoJure” comes from C#, Lisp, and Java. The best parts of each community clearly shine through. Bolstered by the Java community, Clojure combines all the expressiveness of a Lisp in a package that makes everything seem as easy as C#. That last point deserves elaboration: C# has managed to empower the average code monkey with lambdas, closures, generics, expression trees, and futures. Similarly, Clojure empowers the average senior developer with immutability, macros, flexible dispatch, software transactional memory, interactive development, and a whole lot more! Plus, it’s free and open source! Rich Hickey has an incredible talent for distilling good ideas down into understandable bite sized chunks, making agreeable tradeoffs, and then deftly guiding the community.

Lately, I’ve been spending a lot of my spare cycles working on ClojureScript. Hacking on the ClojureScript compiler and runtime has been more fun than I can ever recall in my programming career. It’s an incredible toolkit that I intend to utilize heavily in my next startup, so I feel like my hours and hours of fiddling with cljs.core is extremely worthwhile. I’ve already contributed 20+ patches!

As fun as hacking on ClojureScript is, it’s probably a better idea to hack on something that uses ClojureScript, rather than ClojureScript itself. Paul Graham described Hacker News as an application to sharpen Arc on. Paul is right in saying that you need the top down pressure of a real application to motivate the design and implementation of the language from the bottom up. But where Paul’s goal is brevity, Rich’s goal is simplicity. While Paul has seemingly effortlessly scaled Hacker News to a 1M+ daily pageview community, Rich has seemingly effortlessly reinvented the database.

ClojureScript hasn’t found its killer application yet. There is a lot of interest around the project, but only a handful of active contributors and production users. People like Chris Granger are pushing down on the top with ambitious applications like Light Table, but we need others. Chris is starting to make re-inventing the IDE look effortless, but his application prefers to be installed outside the browser, generally on Macs, so it isn’t clear to me if his vision remains squarely in the HTML/CSS/JavaScript-targeting, client/server sweet spot.

It has been over seven years since Gmail and Google Maps made us re-think what is possible in the browser. I sense that ClojureScript holds the key to the next generation of re-thinking what’s possible in client-side applications. As I ponder my next startup, I’m going to be on the lookout for just such an application. In the meantime, I’ll be open sourcing some of the technology I’m building as I explore that path.

Learn to Read the Source, Luke

Apparently, I’m Jeff Atwood’s guest blogger today: Learn to Read the Source, Luke

Since I wrote the original Hacker News comment, I figured I might as well reproduce it in full here.

I started working with Microsoft platforms professionally at age 15 or so. I worked for Microsoft as a software developer doing integration work on Visual Studio. More than ten years after I first wrote a line of Visual Basic, I wish I could never link against a closed library ever again.

Using software is different than building software. When you’re using most software for its primary function, it’s a well worn path. Others have encountered the problems and enough people have spoken up to prompt the core contributors to correct the issue. But when you’re building software, you’re doing something new. And there are so many ways to do it, you’ll encounter unused bits, rusty corners, and unfinished experimental code paths. You’ll encounter edge cases that have been known to be broken, but were worked around.

Sometimes, the documentation isn’t complete. Sometimes, it’s wrong. The source code never lies. For an experienced developer, reading the source can often be faster… especially if you’re already familiar with the package’s architecture. I’m in a medium-sized co-working space with several startups. A lot of the other CTOs and engineers come to our team for guidance and advice on occasion. When people report a problem with their stack, the first question I ask them is: “Well, did you read the source code?”

I encourage developers to git clone anything and everything they depend on. Initially, they are all afraid. “That project is too big, I’ll never find it!” or “I’m not smart enough to understand it” or “That code is so ugly! I can’t stand to look at it”. But you don’t have to search the whole thing, you just need to follow the trail. And if you can’t understand the platform below you, how can you understand your own software? And most of the time, what inexperienced developers consider beautiful is superficial, and what they consider ugly, is battle-hardened production-ready code from master hackers. Now, a year or two later, I’ve had a couple of developers come up to me and thank me for forcing them to sink or swim in other people’s code bases. They are better at their craft and they wonder how they ever got anything done without the source code in the past.

When you run a business, if your software has a bug, your customers don’t care if it is your fault or Linus’ or some random Rails developer’s. They care that your software is bugged. Everyone’s software becomes my software because all of their bugs are my bugs. When something goes wrong, you need to seek out what is broken, and you need to fix it. You fix it at the right spot in the stack to minimize risks, maintenance costs, and turnaround time. Sometimes, a quick workaround is best. Other times, you’ll need to recompile your compiler. Often, you can ask someone else to fix it upstream, but just as often, you’ll need to fix it yourself.

Closed-software shops have two choices: beg for generosity, or work around it. Open source shops with weaker developers tend to act the same as closed-software shops. Older shops tend to slowly build the muscles required to maintain their own forks and patches and whatnot. True hackers have come to terms with a simple fact: If it runs on my machine, it’s my software. I’m responsible for it. I must understand it. Building from source is the rule and not an exception. I must control my environment and I must control my dependencies.

After reproducing my comment, Jeff wrote:

Nobody reads other people’s code for fun. Hell, I don’t even like reading my own code. The idea that you’d settle down in a deep leather chair with your smoking jacket and a snifter of brandy for a fine evening of reading through someone else’s code is absurd.

I disagree.

REST: One Thousand Inconsequential Decisions

Hardly a week goes by without yet another Hacker News front page post by a blogger making a bold declaration about some important thing you absolutely must do if you want your API to be “RESTful”. With Rails’ recent PATCH announcement, REST is all over the front page again.

In my experience designing and consuming APIs of varying levels of RESTfulness, I’ve concluded that most of the decisions involved in designing a RESTful API are completely inconsequential. PUT vs POST vs PATCH? HATEOAS vs manual URL construction? JSON vs custom Mimetype? It’s all the same to an API consuming developer. It might matter to some kind of generic crawler, but I’m not building an API for crawlers, so I don’t care.

What follows is a series of rants about various parts of the grand REST debate. This is by no means comprehensive. My goal with this post isn’t to discourage you from writing a REST API. I simply want you to spend your time on the API decisions that matter, not the cosmetic, inconsequential ones.

HTTP Verbs & Status Codes

The Rails PATCH announcement states “both PUT and PATCH are routed to update”. So what’s the difference? Apparently PATCH is for partial updates, whereas PUT is for complete replacement of resources. Here’s the funny bit: Both of these are implemented via POST with a _method parameter for support of older browsers. The only two HTTP Verbs that actually matter are GET, for idempotent, cacheable requests; and POST, for side-effects. Even DELETE is insufficiently defined for any real application. What if you want to support both Archive and Delete like Gmail?

Which response codes should you use? Those which have specific meaning to HTTP clients. Specifically: those treated differently by browsers. Should you use 201 on create instead of 200? Who cares? Should you use 301 or 302 for redirects? In an API, it doesn’t matter. Pick one. Document it. For your actual web page, consult an SEO guide.

What about errors? If you don’t have permission to see a resource, is that a 403? Or do you hide the existence of the resource and return a 404 (like GitHub)? It doesn’t matter. Return a 4xx for expected caller errors and a 5xx for unexpected errors. Pick one. Document it.

The smart thing to do is to always return your resources within one level of nesting. That is, an item key in JSON or an item element in XML. Do this so that you can add an error key or other metadata without a breaking change. For example, you may want a timings key for perf debugging. For lists, use an items key because pretty soon, you’re going to want a paging key. Also, for errors, make sure you include a non-localized (and hence, Google-able) error string; a status code is insufficient.
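Concretely, here are the envelope shapes I mean, written as the Clojure data you would serialize to JSON (the key names are just the examples from above):

;; A single resource, nested one level so metadata can be added later:
{:item    {:id 5 :name "widget"}
 :timings {:db-ms 12 :render-ms 3}}

;; A list, leaving room for paging:
{:items  [{:id 5} {:id 6}]
 :paging {:next "/widgets?page=2"}}

;; An error, with a non-localized (Google-able) string, not just a code:
{:error {:status 403 :message "insufficient_permissions"}}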

Hating On HATEOAS

HATEOAS, an absurd mouthful of an acronym, is a complete waste of time outside of read-only semi-structured datasets. If you’re FreeBase, then go to town with HATEOAS conventions that are useful to you. However, for your typical webapp API, you simply cannot build a useful API client application without deep knowledge of the problem domain and the available API calls. You’re going to have API documentation and you’re going to have to read it.

By all means, feel free to use HTTP 201 and a Location header to return the location of a newly created resource. Or you could just return the resource as JSON with an href or id key. That’s yet another inconsequential decision. Might as well go with the path of least resistance: return interesting data in the body of the response (i.e. in the JSON) where it can be accessed without additional effort to parse the headers. As long as your response code is in the 200 range, no one is going to look at it further.

I’ve seen people advocate Link headers such as rel="newcomment". It’s weird to me because you still need to know which rel to look up. The same thing applies to returning a bunch of extra URLs as metadata in your JSON. You still need to know which key to look up for the post-new-comment URL. You might as well just append "/comments" and avoid the indirection. Link and rel are useful to generic tools that want to treat a wide range of resources the same. There aren’t enough common aspects between the domain models of a typical API to justify the spare brain cycles. Again, if you’re Wikipedia, your use case is different. You have generic data that you want to traverse generically. And again, that’s a mostly read-only use case.

Versioning

Some REST fans advocate the use of custom MIME types for API versioning. These folks believe that version numbers in URLs are ugly. The usual objection is that resources should be equatable by URL and that /v1/users/5 and /v2/users/5 are different resources. The problem here is that this resource is actually identified by just the number. You don’t need the whole URL and practical constraints are going to make it impossible to equate resources by URL. Consider the constraint of browsers’ Same Origin Access Policy. You need api.example.com to run your public API on its own IP address. And you need example.com/api/ to make calls from your web frontend without jumping through crazy IFrame interop hacks.

There are two types of API changes: 1) Backward Compatible, and 2) Breaking Changes. In the case of backwards compatible changes, you don’t have to, nor should you, change version numbers. Simply add the additional method or additional return data. Nothing to worry about. In the case of breaking changes, you want to break clients as loudly as possible. If you simply change the MIME type, most consumers of the API are totally going to ignore it. If you deploy your new API at api2.example.com, then there is no risk of breaking older clients and they have to make a conscious decision to upgrade.

In general, you’re probably going to want to protect yourself with API keys. API keys provide the best way to version your API: record the desired API version with the generated key. Just stick an api_version column on your api_keys table. Expose that column as a drop down box in your API key management UI. This solution lets you leave the “ugly” version number out of the URL, but still enforce breakage on breaking changes. The best part is that you can totally ignore API keys and versioning when you start: Simply default unregistered API consumers to use v1 when you do implement v2.

Last word on custom MIME types: One of the great promises of REST is the ability to treat generic things, generically. If your MIME type isn’t a standard JSON MIME type, then how can you expect your browser to render it nicely with syntax highlighting and code folding? Your data is structured as JSON or XML; if you want to version the schema, use the facilities provided by XSDs or JSON Schemas, etc. But you’re better off with documentation written by humans, for humans, than you are with formal schema.

REST: The Good Parts

There are a lot of good reasons why REST has gained popularity. For one thing, it isn’t SOAP.

The good ideas in REST are primarily predictable URLs, contractual idempotence, the preference of nouns over verbs, and curl as a client library. In practice, you can get contractual idempotence via GET vs POST. Predictable URLs are no different than predictable function names in all typical library code.

All this other stuff, the finer points of URIs and HTTP headers, are completely inconsequential to a developer programming against a well documented API.

Imported Comments

Xavier Noria

Note that the motivation for PATCH in Rails is to follow HTTP semantics, it is unrelated to REST.

Lars Gierth

It’s all the same to an API consuming developer. It might matter to some kind of generic crawler, but I’m not building an API for crawlers, so I don’t care.

That’s how you’re consuming APIs now! Imagine building a client that doesn’t break if the API changes its semantics.

The idea of versioning APIs is bad though, as it’s contradicting the idea of building sustainable web services.

Brandon Bloom

@Xavier: What benefit do I get from following the rules of HTTP semantics on this point? Semantics are only worth having if they make a distinction that is useful in practice. HTTP experimented with various verbs and it turns out that two are universally useful: GET and POST. The rest are ancillary.

@Lars: Your two points conflict. You should only make breaking changes if you’re changing your models meaningfully. When you make a breaking change, you use versioning strategies like changing the base URL. When your models change meaningfully, your clients will need to change to present the new view of the problem domain. If you’ve versioned correctly, old apps will work on the old API version and new ones will work on the new API version. It’s impossible for an old client to magically support new concepts.

Plugging a Hole in Microsoft’s Hiring Pipe: IE Frame

You’re probably familiar with Google Chrome Frame, Google’s clever workaround to combat Internet Explorer dependency in enterprises. You’re probably also aware that Microsoft is not a fan.

Microsoft desperately needs to implement an equivalent “Internet Explorer 10 Frame” mass deployed as a high-priority Windows Update. Failure to do so all but guarantees that Microsoft will become completely incapable of hiring a sufficient supply of talented graduates.

The key issue is that two of the major vectors for aspiring engineers are no longer dominated by Microsoft technology.

Javascript is the New Basic

Microsoft QBasic was my first experience with programming, but I got started a few years earlier than most of my peers. The students I graduated college with largely had their first foray into text editing via HTML. Browsers are installed on every computer on the market and web development skills are vital for nearly all professional developers (even those not doing full-time web development). This puts Microsoft in a really bad spot: every budding developer’s first impression of Microsoft is “IE SUCKS!!!!11!!one!”.

Microsoft’s Stranglehold on Gaming is Loosening

When I was employed by Microsoft, I took an informal survey of my recently hired peers. I worked in the gaming division, so my view was skewed: they almost all got into programming via PC gaming. If you loved games, you ran Windows. If you wanted to make games, you learned Microsoft Visual C++. However, times are changing. Apple’s iPhone is the biggest gaming platform on the planet. Your family computer is probably a Mac by now and your Xbox doesn’t come with a compiler. Why not learn Objective-C? Furthermore, canvas games are popping up everywhere and it won’t be long before WebGL is commonplace. Soon enough, a significant subset of aspiring game developers will only have to boot Windows to test something in Internet Explorer. sigh.