The endless quest for the right tools for large-scale software
Colin Steele writes about the challenges that he and his team are facing at Hotelicopter:
We’ve officially run headlong into one of Ruby on Rails’ deficiencies: programming in the large. We’re not interested in computer science-y solutions, only pragmatic ones.
I’m curious what the phrase “computer science-y solutions” excludes. I would normally interpret it to mean “we are looking for well tested solutions with wide deployment” but elsewhere he writes:
We’re currently investigating a spectrum of new technologies in the NoSQL realm, including Tokyo Tyrant, MongoDB, Amazon’s SDB, CouchDB, Voldemort, and more. Tis’ a dizzying mix, and things are popping in the space.
So clearly they are looking at some cutting-edge technologies. Colin links to an essay about programming in the large, which includes this:
Maintenance and locality are strong arguments in favor of immutability. The less aspects of an object can be changed, the less you have to worry about the execution history. If an object has a two-phase initialization sequence (e.g. this is so in C++ if you need a virtual function during initialization), you have to make sure that the objects are properly initialized; code that gets handed over such an object will have to check that it’s initialized (if only in an assert()). This all vanishes if the language makes sure that no object remains uninitialized, and doesn’t force a two-phase initialization on programmers like C++. (In C++, the “wrong” design decision was that objects mutate from the base type to the subtype when the various constructors are run. It’s this kind of far-reaching consequences (IOW non-locality) that makes language design an art.) If you take immutability to its extremes, nothing can ever be changed. If you wish to change the world, you write a function that returns a list of changes and let the run-time system inspect that list and execute the proper actions. If you have an interactive program, you emit a list that has a function pointer at its end; the function gets fed the next input and is expected to generation another action list.
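To make the last idea in that passage concrete, here is a minimal Java sketch of a function that returns a list of changes instead of performing them. The names (Action, Print, Planner, Interpreter) are my own inventions for illustration and do not come from the essay, and the “run-time system” is reduced to a loop that appends to a StringBuilder.

```java
import java.util.List;

// Hypothetical names throughout; none of them come from the essay.
interface Action {}

// An immutable description of one side effect: "print this text".
final class Print implements Action {
    final String text;
    Print(String text) { this.text = text; }
}

final class Planner {
    // Pure function: it only describes what should happen and changes nothing.
    // List.of (Java 9+) returns an unmodifiable list.
    static List<Action> planGreeting(String name) {
        return List.of(new Print("Hello, " + name), new Print("Goodbye, " + name));
    }
}

final class Interpreter {
    // The only place where anything is actually changed.
    static void run(List<Action> actions, StringBuilder out) {
        for (Action a : actions) {
            if (a instanceof Print) {
                out.append(((Print) a).text).append('\n');
            }
        }
    }
}
```

Because planGreeting touches nothing outside itself, it can be tested by inspecting the list it returns; only the interpreter needs to know about the outside world.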
In the last few years, many programmers have pointed to mutability as one of the main problems they face when they build larger systems. Chas Emerick does a good job of highlighting the problem in his post “All my methods take 316 arguments, and I like it that way”:
316 arguments to a method (which I don’t think is actually possible in the jvm, but bear with me)? “That’s absurd!”, you’d say. The problem, of course, is that the 3-arg doSomething actually has far more arguments than its signature implies:
The behaviour of every function in a mutable, imperative environment is dependent upon the state of all of the other (variables|attributes|bindings|whatever) in your program at the time the function is invoked.
So, if you have 313 other variables in your program, that 3-arg doSomething is functionally (ha!) operating over 316 arguments.
Would you ever intentionally write a method signature that takes 316 arguments? Would you use any library that contained such a function signature? No? Then why are you using tools that force such craziness upon you?
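A small Java sketch of the “hidden arguments” Chas describes. The Pricing class and its fields are hypothetical, made up for this illustration; amounts are in whole cents to keep the arithmetic exact.

```java
// Hypothetical example; the class and field names are illustrations only.
final class Pricing {
    // Mutable shared state: an invisible extra "argument" to totalImplicit.
    static long taxPercent = 10;

    // The signature shows one argument, but the result also depends on taxPercent.
    static long totalImplicit(long cents) {
        return cents + cents * taxPercent / 100;
    }

    // Every input the result depends on is visible in the signature.
    static long totalExplicit(long cents, long taxPercent) {
        return cents + cents * taxPercent / 100;
    }
}
```

Calling totalImplicit(1000) twice can give two different answers if some other code reassigns Pricing.taxPercent in between, while totalExplicit(1000, 10) gives the same answer every time.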
Chas says “The languages are ready” and he links to some of the major functional languages: Erlang, Clojure, F#, and Fantom. In comments, his readers add in their favorite functional languages: OCaml, Haskell, etc.
One of his readers challenges the idea that functional languages are safer than imperative languages by offering this:
Regarding functions changing state, what about things like this in clojure, Isn’t it like global variables in imperative lang?
(def state (ref #{}))
(defn function that updates state)
(defn another function that updates state)
Chas responds:
You bet. Clojure is not a purely functional programming language, so you can have as much shared state as you want – but the language is going to make you work for those bits of shared state, so you have to “pay” for them. Conversely, imperative languages like Java et al. make you work to achieve immutability, and provide nothing in the way of enabling persistent data structures, etc.
The point is that defaults matter, a lot.
I like the word “default” in this context. In his 2001 book, Effective Java, Joshua Bloch advised programmers to favor immutability. When writing a big system in Java, you have to work to make your objects immutable. In a language like Clojure, the default is just the opposite – you have to work to make parts of your system mutable.
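Here is a rough sketch of the work Java asks for, assuming a made-up Itinerary class (the name and fields are mine, not Bloch’s or Colin’s). Each commented line is something the programmer has to remember to do by hand.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical value class, written by hand to be immutable.
final class Itinerary {                      // final: no subclass can add mutable state
    private final String guest;              // private final fields, assigned exactly once
    private final List<String> stops;

    Itinerary(String guest, List<String> stops) {
        this.guest = guest;
        // Defensive copy, so later changes to the caller's list cannot leak in.
        this.stops = Collections.unmodifiableList(new ArrayList<>(stops));
    }

    String guest() { return guest; }
    List<String> stops() { return stops; }   // callers get a read-only view

    // "Changing" an itinerary returns a new one; the original is untouched.
    Itinerary withStop(String stop) {
        List<String> next = new ArrayList<>(stops);
        next.add(stop);
        return new Itinerary(guest, next);
    }
}
```

Forget the final keyword or the defensive copy and the class quietly becomes mutable again, which is the sense in which the default works against you.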
I have very little experience with functional programming. I am just learning Clojure now (a Lisp redone for the JVM), so I cannot say what benefits it brings. I’m looking forward to learning more in this area. It’ll be interesting to see where functional programming comes to be regarded as a “best practice”. Certainly, it will be interesting to see if CTOs start using these languages at startups, or whether they will be regarded as “computer science-y solutions”.
I’ve somewhat more experience with the web app frameworks that have emerged since 2004. I’m interested in what Colin wrote here:
I suspect as we muddle along we’ll develop a component-level (service level if you prefer) version of the Law of Demeter, which will drive us to make the right decisions for decoupling. I’m not too worried about that. However, we definitely have issues with reuse. Currently the Ruby on Rails state of the art solution for reuse is the gem. Which, let’s face it, is a pathetic solution.
Some of the frameworks seem to encourage bad habits. I’ve already written of Symfony’s weaknesses in Symfony versus The Law Of Demeter: does Symfony promote bad habits?.
When Ruby on Rails first emerged it was targeting web apps, not web services. Rails has a lot of imitators: Groovy/Grails, PHP/Symfony, etc. These all help create web sites, but not necessarily web services. I suspect a new generation of frameworks will be needed to make this kind of work easier:
The place we’re aiming for is a highly decoupled (and scalable), cohesive set of services, joined through REST APIs and/or fully reused common business models.
In their book RESTful Web Services, the authors Leonard Richardson and Sam Ruby talk about the “human web” and the “programmable web”. This is from page 2:
The Web you use is full of data: book information, opinions, prices, arrival times, messages, photographs, and miscellaneous junk. It’s full of services: search engines, online stores, weblogs, wikis, calculators, and games. Rather than installing all this data and all these programs on your own computer, you install one program – a web browser – and access the data and services through it.
The programmable web is just the same. The main difference is that instead of arranging its data in attractive HTML pages with banner ads and cute pastel logos, the programmable web usually serves stark, brutal XML documents. The programmable web is not necessarily for human consumption. Its data is intended as input to a software program that does something amazing.
Originally, frameworks like Rails were created to help speed the production of sites for the human web. They have evolved since then, Rails in particular. In fact, Richardson and Ruby use Rails for many of the examples they offer in the book about how to correctly build a RESTful web service. And yet, the scaffolding systems in these frameworks still tend to automate the production of CRUD web pages, rather than PGPD (POST, GET, PUT, DELETE) services. (I do not know the state-of-the-art with Rails, so someone can tell me if I’m wrong about its scaffolding.)
Richardson and Ruby suggest that every module (resource) in a RESTful web service should expose just 4 actions:
POST
GET
PUT
DELETE
These are HTTP verbs, and they correspond roughly to the standard CRUD actions. POST creates a new resource whose URI the server chooses, and PUT updates an existing resource (or creates one at a URI the client already knows):
Create
Read
Update
Delete
I suspect we need a new generation of frameworks, or at least new scaffolding systems for the existing frameworks, that automate the setup of PGPD services. That seems like the next obvious step forward.
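To give a feel for what such scaffolding might generate, here is a minimal Java sketch of a single resource that answers the four verbs. It is an assumption-laden illustration: HotelResource, its handle method, and the status strings are all hypothetical, no real framework or HTTP server is involved, and it follows the conventional HTTP reading in which POST creates a resource at a server-chosen URI and PUT creates or replaces one at a URI the client supplies.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory "hotels" resource; not tied to any framework.
final class HotelResource {
    private final Map<Integer, String> hotels = new LinkedHashMap<>();
    private int nextId = 1;

    // Dispatches on the HTTP verb and returns "status [body]" as a plain string.
    String handle(String verb, Integer id, String body) {
        if (id == null && !verb.equals("POST")) {
            return "400";                    // every verb except POST needs an id
        }
        switch (verb) {
            case "POST":                     // create; the server picks the new id
                hotels.put(nextId, body);
                return "201 /hotels/" + nextId++;
            case "GET":                      // read
                return hotels.containsKey(id) ? "200 " + hotels.get(id) : "404";
            case "PUT":                      // create or replace at a known id
                boolean existed = hotels.put(id, body) != null;
                return (existed ? "200 " : "201 ") + body;
            case "DELETE":                   // delete
                return hotels.remove(id) != null ? "204" : "404";
            default:
                return "405";                // verb not supported by this resource
        }
    }
}
```

A scaffolding generator would only need the resource name and its fields to stamp out a class like this one, in the same way that today’s generators stamp out CRUD pages.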