Archive for December, 2009

What a RESTful architecture for NewsGator should look like

Thursday, December 31st, 2009

Dare Obasanjo offers an interesting look at how NewsGator’s RSS service could be re-designed so as to be more RESTful.

The endless quest for the right tools for large scale software

Thursday, December 31st, 2009

Colin Steele writes about the challenges that he and his team are facing at Hotelicopter:

We’ve officially run headlong into one of Ruby on Rails’ deficiencies: programming in the large. We’re not interested in computer science-y solutions, only pragmatic ones.

I’m curious what is excluded by the phrase “computer science-y solutions”? I would normally interpret that to mean “we are looking for well tested solutions with wide deployment” but elsewhere he writes:

We’re currently investigating a spectrum of new technologies in the NoSQL realm, including Tokyo Tyrant, MongoDB, Amazon’s SDB, CouchDB, Voldemort, and more. Tis’ a dizzying mix, and things are popping in the space.

So clearly they are looking at some cutting edge technologies. Colin links to an essay about Programming in the large which includes this:

Maintenance and locality are strong arguments in favor of immutability. The less aspects of an object can be changed, the less you have to worry about the execution history. If an object has a two-phase initialization sequence (e.g. this is so in C++ if you need a virtual function during initialization), you have to make sure that the objects are properly initialized; code that gets handed over such an object will have to check that it’s initialized (if only in an assert()). This all vanishes if the language makes sure that no object remains uninitialized, and doesn’t force a two-phase initialization on programmers like C++. (In C++, the “wrong” design decision was that objects mutate from the base type to the subtype when the various constructors are run. It’s this kind of far-reaching consequences (IOW non-locality) that makes language design an art.) If you take immutability to its extremes, nothing can ever be changed. If you wish to change the world, you write a function that returns a list of changes and let the run-time system inspect that list and execute the proper actions. If you have an interactive program, you emit a list that has a function pointer at its end; the function gets fed the next input and is expected to generation another action list.

In the last few years, many programmers have pointed to mutability as one of the main problems they face when they build larger systems. Chas Emerick does a good job of highlighting the problem in his post “All my methods take 316 arguments, and I like it that way”:

316 arguments to a method (which I don’t think is actually possible in the jvm, but bear with me)? “That’s absurd!”, you’d say. The problem, of course, is that the 3-arg doSomething actually has far more arguments than its signature implies:

The behaviour of every function in a mutable, imperative environment is dependent upon the state of all of the other (variables|attributes|bindings|whatever) in your program at the time the function is invoked.

So, if you have 313 other variables in your program, that 3-arg doSomething is functionally (ha!) operating over 316 arguments.

Would you ever intentionally write a method signature that takes 316 arguments? Would you use any library that contained such a function signature? No? Then why are you using tools that force such craziness upon you?
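To make the hidden-argument point concrete, here is a minimal sketch in Python. The names (shipping_fee, total_with_hidden_state, total_explicit) are hypothetical and do not come from Chas's post; the only claim is that a function reading mutable outside state can return different answers for the same call.

```python
# Hypothetical example: none of these names come from the quoted post.

shipping_fee = 5  # module-level mutable state: an invisible extra "argument"


def total_with_hidden_state(amount):
    """Looks like a 1-arg function, but the result also depends on shipping_fee."""
    return amount + shipping_fee


def total_explicit(amount, fee):
    """Every input is in the signature, so equal arguments give equal results."""
    return amount + fee


first = total_with_hidden_state(100)
shipping_fee = 20  # some other part of the program changes the global
second = total_with_hidden_state(100)  # same call, different answer

explicit_a = total_explicit(100, 5)
explicit_b = total_explicit(100, 5)  # always matches explicit_a
```

With one global the signature is off by one argument; with 313 of them, it is off by 313.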

Chas says “The languages are ready” and he links to some of the major functional languages: Erlang, Clojure, F#, and Fantom. In comments, his readers add in their favorite functional languages: OCaml, Haskell, etc.

One of his readers challenges the idea that functional languages are safer than imperative languages by offering this:

Regarding functions changing state, what about things like this in clojure, Isn’t it like global variables in imperative lang?

(def state (ref #{}))

(defn function that updates state)

(defn another function that updates state)

Chas responds:

You bet. Clojure is not a purely functional programming language, so you can have as much shared state as you want – but the language is going to make you work for those bits of shared state, so you have to “pay” for them. Conversely, imperative languages like Java et al. make you work to achieve immutability, and provide nothing in the way of enabling persistent data structures, etc.

The point is that defaults matter, a lot.

I like the word “default” in this context. In his 2001 book, Effective Java, Joshua Bloch wrote “Favor immutable objects over mutable.” When writing a big system in Java, you work to make your system immutable. In a language like Clojure, the default is just the opposite – you work to make parts of your system mutable.
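Here is a small sketch of what "working for immutability" looks like in a language that is mutable by default. It uses Python rather than Java, and the Booking class is a made-up example; the point is only that immutability has to be requested explicitly (here with frozen=True), and that "changing" a value then means building a new one.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)  # immutability is opt-in: we have to ask for it
class Booking:
    """Hypothetical value object used only for illustration."""
    hotel: str
    nights: int


original = Booking("Hotel A", 2)

# "Changing" an immutable value means creating a new one; the old one is untouched.
extended = replace(original, nights=3)

try:
    original.nights = 5  # attempted in-place mutation
    mutated = True
except FrozenInstanceError:
    mutated = False  # the frozen dataclass rejects the assignment
```

Code that receives a Booking never has to ask what happened to it earlier, which is the locality argument made in the essay quoted above.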

I have very little experience with functional programming. I am just learning Clojure now (Lisp redone for the JVM). I cannot say what benefits it brings. I’m looking forward to learning more in this area. It’ll be interesting to see where functional programming comes to be regarded as a “best practice”. Certainly, it will be interesting to see if CTOs start using these languages at startups, or whether they will be regarded as “computer science-y solutions”.

I’ve somewhat more experience with the web app frameworks that have emerged since 2004. I’m interested in what Colin wrote here:

I suspect as we muddle along we’ll develop a component-level (service level if you prefer) version of the Law of Demeter, which will drive us to make the right decisions for decoupling. I’m not too worried about that. However, we definitely have issues with reuse. Currently the Ruby on Rails state of the art solution for reuse is the gem. Which, let’s face it, is a pathetic solution.

Some of the frameworks seem to encourage bad habits. I’ve already written of Symfony’s weaknesses in “Symfony versus The Law Of Demeter: does Symfony promote bad habits?”

When Ruby on Rails first emerged it was targeting web apps, not web services. Rails has a lot of imitators: Groovy/Grails, PHP/Symfony, etc. These all help create web sites, but not necessarily web services. I suspect a new generation of frameworks will be needed to make this kind of work easier:

The place we’re aiming for is a highly decoupled (and scalable), cohesive set of services, joined through REST APIs and/or fully reused common business models.

In their book RESTful Web Services, the authors Leonard Richardson and Sam Ruby talk about “the human web” and the “programmable web”. This is from page 2:

The Web you use is full of data: book information, opinions, prices, arrival times, messages, photographs, and miscellaneous junk. It’s full of services: search engines, online stores, weblogs, wikis, calculators, and games. Rather than installing all this data and all these programs on your own computer, you install one program – a web browser – and access the data and services through it.

The programmable web is just the same. The main difference is that instead of arranging its data in attractive HTML pages with banner ads and cute pastel logos, the programmable web usually serves stark, brutal XML documents. The programmable web is not necessarily for human consumption. Its data is intended as input to a software program that does something amazing.

Originally, frameworks like Rails were created to help speed the production of sites for the human web. They have evolved since then, Rails in particular. In fact, Richardson and Ruby use Rails for many of the examples in the book that show how to correctly build a RESTful web service. And yet, the scaffolding systems in these frameworks still tend to automate the production of CRUD web pages, rather than PGPD (POST, GET, PUT, DELETE) services. (I do not know the state-of-the-art with Rails, so someone can tell me if I’m wrong about its scaffolding.)

Richardson and Ruby suggest that every module (resource) in a RESTful web service should expose just 4 actions:

POST
GET
PUT
DELETE

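These are the HTTP verbs, and they roughly correspond to the standard CRUD actions. The mapping is not exact: POST creates a new resource under a parent (the server picks its URI), while PUT creates or replaces the resource at a URI the client already knows, so updates usually fall to PUT:

Create
Read
Create/Update
Delete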

I suspect we need a new generation of frameworks, or at least new scaffolding systems for the existing frameworks, that automate the setup of PGPD services. That seems like the next obvious step forward.
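As a rough idea of what such scaffolding might generate, here is a minimal sketch in Python. It is not taken from Rails or from the Richardson and Ruby book: NoteResource and dispatch are hypothetical names, the storage is an in-memory dict, and the handlers return plain (status, payload) tuples instead of real HTTP responses. It assumes POST creates a new item with a server-chosen id and PUT stores a representation at an id the client names.

```python
class NoteResource:
    """Hypothetical in-memory resource with one handler per HTTP verb."""

    def __init__(self):
        self._items = {}
        self._next_id = 1

    def post(self, body):
        """Create a new item; the server chooses the id."""
        item_id = self._next_id
        self._next_id += 1
        self._items[item_id] = body
        return 201, item_id

    def get(self, item_id):
        """Read an item, or report that it does not exist."""
        if item_id not in self._items:
            return 404, None
        return 200, self._items[item_id]

    def put(self, item_id, body):
        """Store body at an id the client names, creating or replacing it."""
        created = item_id not in self._items
        self._items[item_id] = body
        # Keep server-chosen ids from colliding with client-chosen ones.
        self._next_id = max(self._next_id, item_id + 1)
        return (201 if created else 200), item_id

    def delete(self, item_id):
        """Remove an item if it exists."""
        if item_id not in self._items:
            return 404, None
        del self._items[item_id]
        return 204, None


def dispatch(resource, verb, *args):
    """Route a verb name to the matching handler; anything else is 405."""
    handlers = {
        "POST": resource.post,
        "GET": resource.get,
        "PUT": resource.put,
        "DELETE": resource.delete,
    }
    handler = handlers.get(verb.upper())
    if handler is None:
        return 405, None
    return handler(*args)


notes = NoteResource()
created = dispatch(notes, "POST", "hello")
fetched = dispatch(notes, "GET", 1)
replaced = dispatch(notes, "PUT", 1, "goodbye")
removed = dispatch(notes, "DELETE", 1)
missing = dispatch(notes, "GET", 1)
unsupported = dispatch(notes, "PATCH", 1)
```

The appeal of scaffolding at this level is that every resource gets the same four entry points, so a client that has learned one resource has more or less learned them all.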

The decline of east coast tech

Thursday, December 31st, 2009

Adam Healey writes Charlottesville Needs More Nerds:

For some reason, there are just not a lot of startups being created here. On this mashup by fourio, web 2.0 start-ups are mapped globally. There are none, until now, in Charlottesville. Why is that? Simple. Charlottesville needs more nerds. UVA’s graduate engineering school is ranked 37th nationally. Ouch. There’s the problem right there.

Colin Steele echoes the concern:

As CTO, I seem to be getting the recurring question, “Can you (hotelicopter) find the tech-savvy talent you need in Charlottesville?” It’s a valid question. Long gone are the days of Kesmai, EA, Mr. Goodbucks, and the beloved Value America. These days, we have influx of spooks, a smattering of biotech companies, and in the IT/Internet world… a whole buncha nothin’.

To me, the issue seems related to the decline of New York during this last decade. I mean, hell, if even New York was in decline, then what chance did Charlottesville have? The simple fact was that a lot of the tech industry was consolidating into Silicon Valley (or moving offshore). But if it is true that New York is set for a rebound, then perhaps other east coast locales will also see their fortunes improve.

Working 90 hour weeks, and playing video games

Thursday, December 31st, 2009

When you are in the office 90 hours a week, frequent work breaks are important.

Working 90 hours a work week requires frequent, and highly effective, work breaks. In the center of Macintosh work area in Bandley 3 we had a ping pong table, a nice stereo system, and a Defender video game machine. We found that competitive play gave us a jolt of adrenaline, and a refreshed mind-set when we resumed work. We also learned a lot about our coworkers and how they excel during competition. While playing Defender one day I got some great insight into how Burrell accelerates his own learning process.

…One day Burrell started doing something radical. Andy came by my cube and said “You’ve got to come see what Burrell’s doing with Defender.” “How can you innovate with a video game?” I wondered. I’d seen Burrell and Andy innovate on all kinds of things, but I couldn’t image how he could somehow step outside the box of a video game – the machine controlled the flow and dictated the goals. How could you gain some control in that environment?

Some people argue that 90 hour work weeks are inefficient because eventually people get tired. That would depend on what “efficient” means. It is possible that there is a law of diminishing returns that sets in once people have worked more than 20 or 30 hours in a week. But so what? If you are only 25% as efficient, after 80 hours, as you were during your first hour, then you are still getting work done. In such a scenario you are only achieving 15 minutes of work for every hour that you work, but you are still moving forward. And there are a lot of projects where that is what is needed – every possible minute that can be put in on the project. No matter how far down you go on the declining tail of diminishing returns, you are still getting some additional output for every hour worked.
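The arithmetic above can be sketched in a few lines of Python. The shape of the decline is purely an assumption made for illustration (a straight line from 100% efficiency in hour 1 down to 25% in hour 80); nothing here is a measurement, and the function names are made up.

```python
def hourly_efficiency(hour, total_hours=80, floor=0.25):
    """Assumed linear decline from 1.0 in hour 1 to `floor` in the final hour."""
    if total_hours <= 1:
        return 1.0
    step = (1.0 - floor) / (total_hours - 1)
    return 1.0 - step * (hour - 1)


def cumulative_output(hours_worked, total_hours=80, floor=0.25):
    """Total work done, measured in full-efficiency hours."""
    return sum(
        hourly_efficiency(h, total_hours, floor)
        for h in range(1, hours_worked + 1)
    )


# 25% efficiency in the 80th hour is about 15 minutes of real work.
minutes_in_last_hour = 60 * hourly_efficiency(80)

after_40 = cumulative_output(40)
after_80 = cumulative_output(80)
extra_from_second_40 = after_80 - after_40  # still positive
```

Under this assumed curve the 80 hours add up to roughly 50 full-efficiency hours, and the second 40 hours contribute less than the first 40 but still contribute something, which is all the argument needs.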

But I would also argue with those who say there is a law of diminishing returns. I’d say, just as often, there is a law of increasing returns that applies to situations where tough problems must be solved. How often do you think Einstein thought about relativity during 1904/1905? Maybe 100 hours a week? 120? Every waking minute? I think the important phrase is “time on task”. It is crucial for big breakthroughs. There is a huge difference between playing video games at work, knowing you are going back to work, and playing video games at home, knowing you’ll be chilling out for the rest of the evening. Context is everything. In one situation your brain keeps working on the problem in the background; in the other, your brain drifts away from the problem. Going home means shifting gears and thinking about other things. Staying at work means staying basically focused on your task, even if you take frequent recreational breaks.

A web app written in Lisp

Thursday, December 31st, 2009

A web app written in Lisp. Mostly interesting for the sheer novelty.

Why is nature able to program highly dependent systems?

Thursday, December 31st, 2009

Much of the literature of computer science is devoted to the issues of arranging the state of the system in such a way that it cannot be accidentally changed, or changed by two processes that need the state of the system to move in opposite directions. Programming has many catch-phrases to express these ideas:

information hiding

decoupling

small pieces, loosely joined

Apparently humans have trouble maintaining software that is written in a highly coupled way. And yet, our bodies appear to be highly coupled systems – failure of any one major part can lead to death for the whole. There are many global variables, such as body temperature, which affect the context in which all other variables operate (for instance, enzyme efficiency depends on body temperature). This leaves me curious – apparently nature has figured out how to build highly-coupled systems, systems which then last for 70 or 80 years (better than most computer systems can hope for). How is this done? Meta-programming? Processes that write macros that give rise to processes which can write macros? I suspect a close study of the ways cells program their activities will eventually lead to new strategies of programming software.

Friendship is a checksum

Thursday, December 31st, 2009

Colin Steele, on the difficulties of communicating what you mean:

Meaning is heavy stuff. And, as it turns out, difficult to transmit using language. Why? Because language (words) sits one level of indirection away from the models themselves. Follow me, here.

We have the thing itself. We’ll stick with “apple”. Out there, somewhere, is the apple we’re talking about. Then, one level removed from that, are the sensory experiences associated with that apple. (Level of indirection: 1). Aggregated in your head is that sensory input, memories, associations, etc — the mental model of that apple. (Indirection now 2.) Then there is the word I slap onto that model – “apple”. Note that the word represents the model, but isn’t the model itself. (Indirection: 3.) Phew. We’re three hops from the damn piece of fruit, and just getting rolling…

Now I try to talk about the apple. I use the word apple, and a bunch of other words that are intended to connect the dots – make the same associations for your version of “apple” as I have attached to mine. Again, add a level of indirection. (4.) You aggregate into your “apple” model. (5.) You slap the word “apple” on that model. (6.)

Yep. That’s lossy. Your copy of the “apple” model that I’ve tried to convey to you sits about 6 hops away from the thing itself. Maybe more, I’m not sure. And there are no checksums.

That nicely sums up the way that meaning gets lost as it moves from one person to another. Colin then suggests that storytelling is the most effective tool to protect against meaning-loss, and the most compressed version of a story is a metaphor:

Then, invent a defining, central story. A system metaphor captures the essence of the system in the most effective communication tool we have – the story.

This suggests to me the importance of having relationships that last a while. Friendship (or at least prolonged exposure to someone) acts as a kind of checksum. What is a checksum after all, but a pattern whose presence we’ve learned to expect? With old friends, we look for that pattern, we check what they are saying right now against the pattern we’ve seen in the past. If you try to tell me about an apple, and if we’ve known each other awhile, then I’ve all our previous conversations to refer back to. I can recall the conversations we’ve had about pears, grapes and oranges. Possibly I might remember that you prefer tart fruit to sweet, so I know your apple is probably more of a macoun and less of a mutsu – our previous conversations help me narrow down what kind of apple you mean.
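For readers who have not met the literal version, here is a minimal sketch of a checksum in Python, using the standard library's zlib.crc32. The messages and helper names are made up; the idea is just that the sender attaches a small number computed from the message, and the receiver recomputes it to see whether what arrived matches what was sent.

```python
import zlib


def with_checksum(message):
    """Encode a message and compute the CRC-32 value that travels with it."""
    data = message.encode("utf-8")
    return data, zlib.crc32(data)


def verify(data, checksum):
    """Recompute the CRC-32 on the receiving side and compare."""
    return zlib.crc32(data) == checksum


sent, check = with_checksum("a tart apple")

arrived_intact = verify(sent, check)

# One byte gets garbled in transit: "apple" becomes "ample".
garbled = b"a tart ample"
arrived_garbled = verify(garbled, check)
```

The checksum says nothing about what the message means; it only flags that the pattern is not the one expected, which is roughly what a long acquaintance lets you do with a friend's words.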

This has an interesting implication for work, especially web development. It suggests areas where hiring freelance workers is perhaps inappropriate. To the extent that hiring freelance workers suggests loose, impermanent relationships, then freelancers are a bad choice for startups. Since long relationships make it slightly easier to communicate meaning, and since startups must wrestle constantly with defining their aims and goals, then startups are probably the kind of environment that should prefer long term workers instead of freelancers.

One other thing that this brings up: Colin is partly talking about the illusion of agreement.

Whenever this issue comes up, I’m reminded of 37signals’ philosophy of Getting Real. In terms of web work, getting real means that you should work with images, rather than functional specs:

Functional specifications documents lead to an illusion of agreement. A bunch of people agreeing on paragraphs of text is not real agreement. Everyone is reading the same thing, but they’re often thinking something different. This inevitably comes out in the future when it’s too late. “Wait, that’s not what I had in mind…” “Huh? That’s not how we described it.” “Yes it was and we all agreed on it — you even signed off on it.” You know the drill.

Functional specifications document are “yes documents.” They’re political. They’re all about getting to “yes” and we think the goal up front should be getting to “no.” Functional specs lead to scope creep from the very start. There’s very little cost in saying “yeah, ok, let’s add that” to a Word document.

As much as possible, you want to make your product real, as fast as possible, so that people in your company can argue over the actual thing, rather than their different conceptions of what that thing should be. If you have the time to build a simple prototype, you should do so. It is true, of course, that in a literal sense “there are no checksums” when two people are trying to communicate complex ideas to one another about what direction a software project should take. But there are ways to create checks, to ground the conversation as close to reality as possible. Images of how the software should look on screen are one good check, and simple prototypes of the interface are another. Two people can go very far astray when they argue over words in a document, but when they are looking at an image of the interface, the image helps limit the number of places they can misunderstand each other.

High functioning schizophrenia

Tuesday, December 29th, 2009

Interesting story about a woman who has built a fantastic career, despite having schizophrenia:

The first frank episode of psychosis happened when I was around 16, and I suddenly started walking home from school in the middle of the day. I began to feel the houses were getting weird; they were sending me messages: “You are special. You are especially bad. Now walk. Cries and whispers.” There were also some warning signs in college but I didn’t really “officially” break down until graduate school at Oxford.

….Subjectively, the best comparison I can make is to a waking nightmare. You have all the terror and confusion and the bizarre images and thoughts that you have in a nightmare. And then with the nightmare you sit bolt upright in bed in utter terror. Only with a nightmare you then wake up, while with psychosis you can’t just open your eyes and make it all go away.

When I was 16, I had maybe a dozen incidents where I woke up with what felt like a bad nightmare that would not stop, despite the fact that I was now awake. Most times I was able to fix the problem by going back to sleep and waking up maybe 15 minutes later. What I felt was similar to what she describes. It was a really awful sensation, the worst I’ve ever known. Reading her words, I have to wonder if I wasn’t skating along the edge of something serious.

No longer dreaming of a perfect future

Tuesday, December 29th, 2009

Nowadays, people do not dream of a better tomorrow; they simply hope it won’t be much worse than the present:

The difference, says Blom, is that the beginning of this century has not yielded any hope for the future. Blom utters a depressing sentence: “We don’t want a future, we want a present that doesn’t end.” It isn’t as if this present were so attractive, he says — it’s just that people are worried that things could get even worse.

In a few days, the first decade of the 21st century and of the third millennium will come to an end. It was a decade that began, not with a smooth transition into a new era but with a bang. It was a decade filled with crisis years: the 9/11 crisis, the climate crisis, the financial crisis and the crisis of democracy. Taken together, they represent a general crisis for the West. Things could hardly have gone any worse over the course of decade.

…The internationally most successful film of the decade was “Lord of the Rings: The Return of the King.” Harry Potter was the most successful literary character. Both are children’s stories that are also enjoyed by adults. We are withdrawing into an infantile world, in which attractive heroes conquer evil. The modern fairy tale is our response to a harsh world.

In the reality of the first decade, evil did not come from monsters but from our neighbors, who had no bad intentions. Our neighbors’ stock market investments stoked the financial crisis, their SUVs contributed to the climate crisis, their abstention from voting to the crisis of democracy. And now their viruses are transmitting swine flu. With the exception of terrorists, the villains of the decade were innocents.

The Lost Decade

Tuesday, December 29th, 2009

It is astonishing that, despite all the bright hopes of the Internet, the last 10 years have been one of the worst decades on record for the economy of the United States.

Paul Krugman writes:

It was a decade with basically zero job creation. O.K., the headline employment number for December 2009 will be slightly higher than that for December 1999, but only slightly. And private-sector employment has actually declined — the first decade on record in which that happened.

It was a decade with zero economic gains for the typical family. Actually, even at the height of the alleged “Bush boom,” in 2007, median household income adjusted for inflation was lower than it had been in 1999. And you know what happened next.

It was a decade of zero gains for homeowners, even if they bought early: right now housing prices, adjusted for inflation, are roughly back to where they were at the beginning of the decade. And for those who bought in the decade’s middle years — when all the serious people ridiculed warnings that housing prices made no sense, that we were in the middle of a gigantic bubble — well, I feel your pain. Almost a quarter of all mortgages in America, and 45 percent of mortgages in Florida, are underwater, with owners owing more than their houses are worth.

Last and least for most Americans — but a big deal for retirement accounts, not to mention the talking heads on financial TV — it was a decade of zero gains for stocks, even without taking inflation into account. Remember the excitement when the Dow first topped 10,000, and best-selling books like “Dow 36,000” predicted that the good times would just keep rolling? Well, that was back in 1999. Last week the market closed at 10,520.

I could almost write, as W. H. Auden did in “September 1, 1939”:

I sit in one of the dives
On Fifty-second Street
Uncertain and afraid
As the clever hopes expire
Of a low dishonest decade:
Waves of anger and fear
Circulate over the bright
And darkened lands of the earth,
Obsessing our private lives

The burden is on the victors: many great technologies have been lost

Monday, December 28th, 2009

Innovation is a ragged, ugly process. Some advances entail consolidation around second-best technologies. It has often been the case that two technology firms fight it out for some market segment. The company in first place will be the one with the best marketing. The company in second place will be the one with the best technology (think Microsoft versus Apple circa 1990).

In the context of experiments in Lisp, done a long time ago, this post makes a good point along those lines:

Anyway, I *know* what it is to look at functionality and duplicate it elsewhere. It CAN be done. I am not saying it can’t. What I’m saying is that it has not been done, and it’s a crying shame. Few people even know there ever WAS a lisp machine, and those who do are mostly not rich enough personally to invest the time to duplicate what was there. Many people spent a big chunk of their lives investing in this dream and it didn’t pan out quite as we wish. Ok. Sometimes other events win out–not always even for the right reasons. Or at least for the reasons you wish. But don’t add insult to injury to say that the losers in battles such as these had nothing to offer.

Common Lisp beat out Interlisp, and maybe for good reasons but it doesn’t mean Interlisp had nothing to offer–some very good ideas got lost in the shuffle and I don’t pretend that Common Lisp just obviously had a better way. Java is going to beat out Smalltalk perhaps, but that doesn’t mean Java is better than Smalltalk. We owe it to the losers in these little skirmishes to make sure that, if nothing else, the good ideas are not lost along with the framework. And we do not accomplish that by defining that there was nothing lost. That’s both callous to those who worked hard on these other things and short-sighted to the future, which might one day care about the things that got lost.

…You can say the burden is on us old-timers to tell you what’s missing or we shouldn’t be whining. But I don’t see it that way. I see the burden is on the victors, who have the resources and who claim their way is better, to show us that they won for good reason. We did our part for the cause. We may or may not continue to try to do things to assure the ideas aren’t lost.

I spend a lot of my time trying to make sure old ideas make it onto the books and don’t get lost. But I’m just one person. It takes more than one person. And the task does not begin by dismissing the need to do the job.

Morphine and the economic stimulus

Monday, December 28th, 2009

When my dad was dying of cancer I sat by his bed for most of each day. I ran errands for him, like getting him coffee when he wanted a coffee. One thing I learned was that it was important for him to stay on pain killers at all times (not actually morphine, I’m just using that in the title since it is easy to recognize). When he was on pain killers, he could momentarily pretend that he wasn’t really that sick. When the pain killers wore off, he was in agonizing pain. So I became diligent about reminding the nurses to be prompt about delivering the next dose of pain killers. The important thing was that my dad should get the next dose of pain killers before the last dose wore off.

This memory influences the way I think about the economy. Everyone I know, including myself, is currently walking around saying, “Hey, this recession isn’t really that bad. I thought it was going to be a lot worse.” But it is noteworthy that the government’s economic stimulus wears off in 6 months.

Positive mentions of WP Questions

Saturday, December 26th, 2009

I do not know who Michael Soriano is, but I am very pleased to see him mention WP Questions in the comments on Frugal Themes.

Darren Hoyt and I do not have much of a budget for marketing, so we are relying on positive word-of-mouth to spread awareness about our new site.

What if we could go back and start over with the computer revolution?

Saturday, December 26th, 2009

What kind of architectures would computers have if we could go back and start over?

The foundations of the computing systems we use are built of ossified crud, and this is a genuine crime against the human mind. How much effort (of highly ingenious people, at that) is wasted, simply because one cannot press a Halt switch and display/modify the source code of everything currently running (or otherwise present) on a machine? How many creative people – ones who might otherwise bring the future to life – are employed as what amounts to human compilers? Neither programmers nor users are able to purchase a modern computer which behaves sanely – at any price. We have allowed what could have once become the most unbridled creative endeavor known to man short of pure mathematics to become a largely janitorial trade; what could have been the greatest amplification of human intellect in all of history – comparable only to the advent of written language – is now confined to imitating and trivially improving on the major technological breakthroughs of the 19th century – the telegraph, telephone, phonograph, and typewriter.

Brokenness and dysfunction of a magnitude largely unknown for centuries in more traditional engineering trades has become the norm in computer programming. Dijkstra believed that this state of affairs is the result of allowing people who fall short of top-notch in conventional mathematical ability into the profession. I disagree entirely. Electronics was once a field which demanded mathematical competence on the level of a world-class experimental physicist. Fortunately, a handful of brilliant minds gave us some very effective abstractions for simplifying electrical work, enabling those who had not devoted their lives to the study of physics to conceive ground-breaking electronic inventions. Nothing of the kind has happened in computing. Most of what passes for widely-applicable abstractions in the field serves only to hamstring language expressiveness and thus to straightjacket cube farm laborers into galley-slave fungibility, rather than to empower the mind by compartmentalizing detail. (OOP is the most obvious example of such treachery.) As for invention, almost everyone has forgotten what genuine creativity in software development even looks like. Witness, for instance, the widespread belief that Linux exemplifies anything original.

Customer service

Saturday, December 26th, 2009

Avedon Carol on customers who sass cashiers:

We are now at the point where merely expressing annoyance at anyone in authority makes you a terrorist. And when I say “authority”, they don’t have to have much authority. As with the case of the shopper who said something sassy to a cashier and got tased by – who was it, cops, or just security? – the issue isn’t whether you’re involved in any kind of crime or violence, it’s whether you know your place. You are just acting as a private citizen, and you have to recognize that as such, those who are acting as servants of those in power have greater authority than you. As a mere American citizen, you are not the equal of anyone who is acting for Big Property. And, while it’s true that the cops might beat you up whether you sass them or not, they pretty much will subject you to violence if you suggest, in any way, that they may possibly be overstepping, making a mountain out of a molehill, or simply incorrect in their assumptions. This gets amped up well beyond the Kafkaesque when there is any possibility of employing use of the utterly vague and horrifically overbroad term “terrorism”. In these cases, you don’t even have to express that you are annoyed or upset at them to earn a conviction – you merely have to show some sign that you are concerned with anything other than showing them their due deference.

Creating a hit is largely a matter of random chance

Friday, December 25th, 2009

I’ve spent the last 10 years working as a computer programmer, but now that Darren Hoyt and I are trying to launch WP Questions I find myself reading a lot more about marketing.

I very much like this bit in Fast Company, where Duncan Watts argues that which songs emerge as hits is a largely random process:

Watts wanted to find out whether the success of a hot trend was reproducible. For example, we know that Madonna became a breakout star in 1983. But if you rewound the world back to 1982, would Madonna break out again? To find out, Watts built a world populated with real live music fans picking real music, then hit rewind, over and over again. Working with two colleagues, Watts designed an online music-downloading service. They filled it with 48 songs by new, unknown, and unsigned bands. Then they recruited roughly 14,000 people to log in. Some were asked to rank the songs based on their own personal preference, without regard to what other people thought. They were picking songs purely on each song’s merit. But the other participants were put into eight groups that had “social influence”: Each could see how other members of the group were ranking the songs.

Watts predicted that word of mouth would take over. And sure enough, that’s what happened. In the merit group, the songs were ranked mostly equitably, with a small handful of songs drifting slightly lower or higher in popularity. But in the social worlds, as participants reacted to one another’s opinions, huge waves took shape. A small, elite bunch of songs became enormously popular, rising above the pack, while another cluster fell into relative obscurity.

But here’s the thing: In each of the eight social worlds, the top songs–and the bottom ones–were completely different. For example, the song “Lockdown,” by 52metro, was the No. 1 song in one world, yet finished 40 out of 48 in another. Nor did there seem to be any compelling correlation between merit and success. In fact, Watts explains, only about half of a song’s success seemed to be due to merit. “In general, the ‘best’ songs never do very badly, and the ‘worst’ songs never do extremely well, but almost any other result is possible,” he says. Why? Because the first band to snag a few thumbs-ups in the social world tended overwhelmingly to get many more. Yet who received those crucial first votes seemed to be mostly a matter of luck.

Word of mouth and social contagion made big hits bigger. But they also made success more unpredictable. (And it’s worth noting, no one in the social worlds had any more influence than anyone else.) So yes, Watts figures, if you rewound the world to 1982, Madonna would likely remain a total unknown–and someone else would have slipped into her steel-tipped corset. “You cannot predict in advance whether a band gets this huge cascade of popularity, because the social network is liable to throw up almost any result,” he marvels.

Predictably, the music industry received the analysis–“Experimental Study of Inequality and Unpredictability in an Artificial Cultural Market,” published in Science in 2006–with a cocked eyebrow. When Watts presented his findings to executives at a major record label last spring, the younger among them were reasonably receptive. They’re accustomed to the unpredictability of hit-making online, so they can grasp the terrifying randomness of success.

But the older execs?

Watts laughs. “They were all like, ‘I think it’s bullshit. I’m still going to go with my gut,’” he recalls. “And I’m like, Okay, good luck to you. You’re going to need it.”

He is going over ground that Clay Shirky examined in 2003, in such essays as “The FCC, Weblogs, and Inequality”:

Yesterday, the FCC adjusted the restrictions on media ownership, allowing newspapers to own TV stations, and raising the ownership limitations on broadcast TV networks by 10%, to 45% from 35%. It’s not clear whether the effects of the ruling will be catastrophic or relatively unimportant, and there are smart people on both sides of that question. It is also unclear what effect the internet had on the FCC’s ruling, or what role it will play now.

What is clear, however, is a lesson from the weblog world: inequality is a natural component of media. For people arguing about an ideal media landscape, the tradeoffs are clear: Diverse. Free. Equal. Pick two.

He talked about the issue at even greater length in Power Laws, Weblogs, and Inequality:

Freedom of Choice Makes Stars Inevitable

To see how freedom of choice could create such unequal distributions, consider a hypothetical population of a thousand people, each picking their 10 favorite blogs. One way to model such a system is simply to assume that each person has an equal chance of liking each blog. This distribution would be basically flat – most blogs will have the same number of people listing it as a favorite. A few blogs will be more popular than average and a few less, of course, but that will be statistical noise. The bulk of the blogs will be of average popularity, and the highs and lows will not be too far different from this average. In this model, neither the quality of the writing nor other people’s choices have any effect; there are no shared tastes, no preferred genres, no effects from marketing or recommendations from friends.

But people’s choices do affect one another. If we assume that any blog chosen by one user is more likely, by even a fractional amount, to be chosen by another user, the system changes dramatically. Alice, the first user, chooses her blogs unaffected by anyone else, but Bob has a slightly higher chance of liking Alice’s blogs than the others. When Bob is done, any blog that both he and Alice like has a higher chance of being picked by Carmen, and so on, with a small number of blogs becoming increasingly likely to be chosen in the future because they were chosen in the past.

Think of this positive feedback as a preference premium. The system assumes that later users come into an environment shaped by earlier users; the thousand-and-first user will not be selecting blogs at random, but will rather be affected, even if unconsciously, by the preference premiums built up in the system previously.

Note that this model is absolutely mute as to why one blog might be preferred over another. Perhaps some writing is simply better than average (a preference for quality), perhaps people want the recommendations of others (a preference for marketing), perhaps there is value in reading the same blogs as your friends (a preference for “solidarity goods”, things best enjoyed by a group). It could be all three, or some other effect entirely, and it could be different for different readers and different writers. What matters is that any tendency towards agreement in diverse and free systems, however small and for whatever reason, can create power law distributions.

Because it arises naturally, changing this distribution would mean forcing hundreds of thousands of bloggers to link to certain blogs and to de-link others, which would require both global oversight and the application of force. Reversing the star system would mean destroying the village in order to save it.
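Shirky’s thought experiment is easy to try out. Below is a minimal sketch in Python, under assumptions of my own rather than anything Shirky specifies: 200 blogs, every blog starting with a weight of 1, and a hypothetical `premium` of 0.5 added to a blog’s weight for each earlier reader who chose it. The function names are made up for illustration. Setting `premium` to 0 gives the flat model from the first paragraph of the quote.

```python
import random


def simulate_picks(num_users=1000, num_blogs=200, picks=10, premium=0.0, seed=1):
    """Return per-blog favorite counts, sorted with the most popular first.

    Every blog starts with weight 1. Each earlier reader who chose a blog
    adds `premium` to its weight (an assumed form of the "preference premium").
    premium=0.0 is the flat model, where choices are independent.
    """
    rng = random.Random(seed)
    counts = [0] * num_blogs
    blogs = range(num_blogs)
    for _ in range(num_users):
        # Later readers see an environment shaped by earlier readers.
        weights = [1.0 + premium * c for c in counts]
        chosen = set()
        # Keep drawing until this reader has `picks` distinct favorites.
        while len(chosen) < picks:
            chosen.add(rng.choices(blogs, weights=weights)[0])
        for blog in chosen:
            counts[blog] += 1
    return sorted(counts, reverse=True)


def top_share(counts, fraction=0.1):
    """Share of all picks held by the top `fraction` of blogs.

    Expects `counts` sorted in descending order, as simulate_picks returns.
    """
    top_n = max(1, int(len(counts) * fraction))
    return sum(counts[:top_n]) / sum(counts)


if __name__ == "__main__":
    flat = simulate_picks(premium=0.0)
    skewed = simulate_picks(premium=0.5)
    print("flat model:    top blog", flat[0], "picks, top 10% share", round(top_share(flat), 3))
    print("premium model: top blog", skewed[0], "picks, top 10% share", round(top_share(skewed), 3))
```

The sketch says nothing about why a blog is preferred, which matches Shirky’s point: the only ingredient is that earlier choices nudge later ones. Whether the resulting curve is strictly a power law depends on the parameters, so treat this as an illustration of the feedback effect, not a proof.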

Laws such as Sarbanes-Oxley, with punitive measures for business officers, will probably stifle innovation

Thursday, December 24th, 2009

[What follows is a comment I posted over at Hacker News.]

My impression is that there was a stretch when some combination of the public mood and the government’s emphasis conspired to encourage small startups. The 1980s and 1990s were clearly good in this respect. The mood of the last decade has been increasingly punitive. Sarbanes-Oxley is the clearest example of this. What once would have been treated as a civil matter is now treated as a criminal matter. Entrepreneurs are now faced with jail time instead of lawsuits. This can only have a chilling effect on innovation. I think it is urgent that everyone who cares about entrepreneurial culture in America make the argument that innovation in business depends in part on tolerance, and that, in practical terms, this means most matters of conflict should be treated as civil rather than criminal cases.

A comparison might be made to the evolution of bankruptcy law. Before the mid 1800s, most Western countries treated bankruptcy as a criminal matter, rather than a civil one. The liberalization of bankruptcy law was one of the factors that allowed our modern economies to gain the dynamic nature they now enjoy. The public’s mood changed during the 1800s as it became more obvious that many times entrepreneurs failed with their first venture. They needed a second chance, when they were often more successful. John Bayer, who created what became Bayer aspirin, is an outstanding example of this – at first he tried to build a liquor business, but it failed. His father-in-law was suffering arthritis, and therefore drinking large amounts of willow bark tea – the only known source of acetylsalicylic acid. John Bayer then put the willow bark tea through the distillery equipment he’d bought for his liquor business – and thus aspirin was created. The point is, he needed a second chance to become successful. Many entrepreneurs are in this category.

Since this is Hacker News, I would guess that most of us know someone who has tried to do a startup, and failed on their first attempt. Many of us also know entrepreneurs who tried again, and met with greater success on successive tries. Tolerance of failure is the first pre-requisite of a dynamic economy.

Moreover, if you have any friends who have attempted to launch a startup, ask yourself under what circumstances you think your friends should go to jail.

I posted a similar comment some months ago, and I mentioned how many lives might be saved by the next wave of medically-focused startups. Someone responded:

“When you cross the line into experimenting with medical treatments, you’re not gambling with other people’s money, you’re gambling with lives. You can’t just equate it to any other kind of start up, it has to be held to a higher standard.”

I want to repeat, many, many industries can lead to people’s deaths. There is nothing unique about medical innovation. If you build a new kind of jet engine, which gets through testing but which then is responsible for a spectacular crash, then your product has killed a few hundred people. And yet, unless there was fraud in the documentation of the tests, there have not been criminal cases in the past. Right from its creation, decades ago, the FAA has taken a strong line against criminal prosecution – the feeling has always been that criminal prosecutions would stifle the free flow of information, and the only way to save lives over the long-term is through the free flow of information.

Many other fields can cause people to die – industrial automation, the transport and disposal of toxic chemicals, the construction of buildings (which could then fail and kill people). All industries are in need of innovation all of the time, yet innovation brings with it risk, including the risk of death. How much innovation will we get if we make these matters criminal?

I should emphasize, just in case people forget, that fraud has always been criminal. It has been criminal for centuries. So the move to criminalize more aspects of business is not a move to make fraud criminal. If you think that the Sarbanes-Oxley Act made fraud criminal, then you are mistaken. Fraud has always been criminal.

Sarbanes-Oxley is representative of the new trend. The overall goal was to encourage greater accuracy in the reporting of a company’s financial health. This goal could have been reached through a variety of methods, including both the carrot (rewards) and the stick (punishments). Rewards could have included tax breaks for meeting some additional level of compliance. Punishments could have included fines levied against companies that failed to meet a higher level of compliance. These approaches would not have raised the risk of jail time for CEOs. Instead, Sarbanes-Oxley decided to go with the heaviest kind of punishment of all – to treat infractions as criminal offenses, potentially meriting jail time.

This punitive attitude is going to have a chilling effect on the amount of innovation we can expect in any field.

Best viewed in Netscape 3.0

Wednesday, December 23rd, 2009

Wow. It’s been a while since I’ve seen a “Best viewed in Netscape 3.0” warning.

Best viewed in Netscape 3.0

Get 50 people to say “hi” for $50, using Amazon’s Mechanical Turk

Tuesday, December 22nd, 2009

50 people say “hi”, for $1 each. Project launched via Amazon’s Mechanical Turk.

How heavily commented should your code be?

Tuesday, December 22nd, 2009

Veteran programmers hate the code that beginners write, and beginners hate the code that veterans write. I can relate to this, personally.

Hopefully the scene I’ve painted so far helps you understand why sometimes you look at code and you just hate it immediately. If you’re a n00b, you’ll look at experienced code and say it’s impenetrable, undisciplined crap written by someone who never learned the essentials of modern software engineering. If you’re a veteran, you’ll look at n00b code and say it’s over-commented, ornamental fluff that an intern could have written in a single night of heavy drinking.

The sticking point is compression-tolerance. As you write code through your career, especially if it’s code spanning very different languages and problem domains, your tolerance for code compression increases. It’s no different from the progression from reading children’s books with giant text to increasingly complex novels with smaller text and bigger words. (This progression eventually leads to Finnegan’s Wake, if you’re curious.)

The question is, what do you do when the two groups (vets and n00bs) need to share code?

I’ve heard (and even made) the argument that you should write for the lowest common denominator of programmers. If you write code that newer programmers can’t understand, then you’re hurting everyone’s productivity and chances for success, or so the argument goes.

However, I can now finally also see things from the veteran point of view. A programmer with a high tolerance for compression is actually hindered by a screenful of storytelling. Why? Because in order to understand a code base you need to be able to pack as much of it as possible into your head. If it’s a complicated algorithm, a veteran programmer wants to see the whole thing on the screen, which means reducing the number of blank lines and inline comments – especially comments that simply reiterate what the code is doing. This is exactly the opposite of what a n00b programmer wants. n00bs want to focus on one statement or expression at a time, moving all the code around it out of view so they can concentrate, fer cryin’ out loud.

My own take, right now, is that on projects where a lot of programmers have to co-operate, there should be a fair amount of commenting, but it should be about goals, not the mechanics of the code. That is, it should be about why, not how. A simple trick is to start your comment by writing “The problem I’m trying to solve here is…”. That leads to comments like:

“The problem I’m trying to solve here is the fact that this email code is called from multiple modules, therefore it needs to be centrally located.”

“The problem I’m trying to solve here is the fact that the user may not be logged in at this point, so we cannot assume we have access to the user object.”

Issues of “how” can usually be figured out by looking at the code. Issues of “why” are sometimes not obvious.
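Here is a small sketch of what I mean, in Python. The function and the shape of the `user` dictionary are hypothetical, invented only to show the two styles of comment side by side.

```python
from typing import Optional


def greeting_for(user: Optional[dict]) -> str:
    # The problem I'm trying to solve here is that the user may not be
    # logged in at this point, so we cannot assume we have a user object.
    if user is None:
        return "Hello, guest"

    # A "how" comment would just restate the next line ("build a string
    # from the name"), which the code already says on its own.
    return f"Hello, {user['name']}"


if __name__ == "__main__":
    print(greeting_for(None))
    print(greeting_for({"name": "Ada"}))
```

The first comment survives a rewrite of the function body, because the reason for the check does not change when the mechanics do. The second kind goes stale the moment someone edits the line beneath it.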

The essay goes on to argue that static typing is just another kind of useless comment and, even more so, that strict database schemas are a form of useless comment too:

I’ve been in surprisingly many situations at different companies where I had a fringe team that was being held up by data modelers who were overly-concerned about data integrity when the real business need was flexibility, which is sort of the opposite of strong data modeling. When you need flexible storage, name/value pairs can get you a long, long, LONG way. (I have a whole blog planned on this topic, in fact. It’s one of my favorite vapor-blogs at the moment.)

…I think that by far the biggest reason that C++ and Java are the predominant industry languages today, as opposed to dynamic languages like Perl/Python/Ruby or academic languages like Modula-3/SML/Haskell, is that C++ and Java cater to both secure and insecure programmers.

…And Haskell, OCaml and their ilk are part of a 45-year-old static-typing movement within academia to try to force people to model everything. Programmers hate that. These languages will never, ever enjoy any substantial commercial success, for the exact same reason the Semantic Web is a failure. You can’t force people to provide metadata for everything they do. They’ll hate you.

…Java has been overrun by metadata-addicted n00bs. You can’t go to a bookstore or visit a forum or (at some companies) even go to the bathroom without hearing from them. You can’t actually model everything; it’s formally impossible and pragmatically a dead-end. But they try. And they tell their peers (just like our metadata-addicted logical data modelers) that you have to model everything or you’re a Bad Citizen.

In the same way that an individual can suffer a mental illness, groups of people can be overcome with mass hysterias that cause them to act as if they were insane (think about Germany in 1933: the whole country lost its mind). The Java community has suffered from this repeatedly. I do not know why, but the type of people who crave order instead of cleverness are drawn to Java. The extreme cases are a lot like that woman who was so obsessed with the idea of bacteria on her hands that she had to keep washing, over and over again. The extreme Java types are that way about code: they are so obsessed with its messy reality that they feel the need to keep washing and washing it, hoping enough layers of strictness will finally create something clean. For instance, there was the craving for annotations:

Some of those several thousand words were devoted to JUnit 4, which has comically (almost tragically) locked on, n00b-style, to the idea that Java 5 annotations, being another form of metadata, are the answer to mankind’s centuries of struggle. They’ve moved all their code out of the method bodies and into the annotations sections. It’s truly the most absurd overuse of metadata I’ve ever seen. But there isn’t space to cover it here; I encourage you to go goggle at it.

There are die-hard Java folks out there who are practically gasping to inject the opinion, right here, that “rapid development” is a byproduct of static typing, via IDEs that can traverse the model.

Why, then, was Struts considered by its own developers to be a failure of rapid development? The answer, my dear die-hard Java fans, is that a sufficiently large model can outweigh its own benefits. Even an IDE can’t make things go faster when you have ten thousand classes in your system. Development slows because you’re being buried in metadata! Sure, the IDE can help you navigate around it, but once you’ve created an ocean, even the best boats in the world take a long time to move around it.

Men falling behind women in college education

Tuesday, December 22nd, 2009

Another look at the degree to which women are pulling ahead of men with regard to college education:

College admissions directors curious about the experience of touching a third rail can review what happened when the president of the University of Alberta suggested that Canadian males, including white males, needed a helping hand.

She got fried … by her own students.

Last month, President Indira Samarasekera pointed to the preponderance of women in higher education in Canada (three female undergraduates for every two males) and suggested that perhaps males could need some extra attention. “We’ll wake up in 20 years and we will not have the benefit of enough male talent,” said Samarasekera, a metallurgical engineer originally from Sri Lanka. “I’m going to be an advocate for young white men, because I can be,” she added, pointing to her Nixon-to-China status as a minority woman advocating for men.

A fair number of her students were not happy. Within 24 hours the campus was awash with posters poking fun at the notion of women taking over higher education. “Women are attacking campus,” read one. “Only white men can save our university! Stop the femimenace.”

Humorous, perhaps, but here’s why this is not funny to college officials in the United States: currently, the University of Alberta grants no admissions preferences to men – unlike scores, perhaps even hundreds, of colleges in the United States that for years have been turning down women for less qualified men. The preferences many colleges give to men are far less formal and less debated than those that help minority applicants, or women applying to some programs. But many, many admissions offices routinely look at male applicants’ test scores and grades with lower expectations than they have when viewing those of female applicants.

A new post up at Symfony Nerds

Tuesday, December 22nd, 2009

I’ve a new post up at Symfony Nerds. This looks at the evil of utility classes full of static methods, a vice that I’m guilty of on every project. I examine possible ways to refactor the code to get a healthier overall architecture.

The surprising thing about New York City in 2009 is how safe it is

Monday, December 21st, 2009

I grew up in New Jersey so I have some memories of New York City during the 1980s. I lived in New York City (or near it) in 1995 and 1996. And now, in 2009, I’m living there again. The change is dramatic, almost surreal. The shocking thing about New York City in 2009 is how incredibly safe it is.

The modern crime wave began in 1964. The crime rate rose till 1993. It has declined dramatically since then. Every part of America has seen a decrease in crime, though no large city has seen as dramatic a decline as New York. In 1993 the annual number of murders in New York City peaked at about 2,500. Last year it was around 500. The 80% fall in murders suggests a major demographic transition.

The FBI just released a report today saying that 2009 will be another year of declining crime. The murder rate, for the nation, has fallen further. Theft is down. Theft of automobiles is down in a big way. But check out this headline: New York City Violent Crime Down 8%, Outpacing U.S., FBI Says

New York City “remains the safest big city in America,” Mayor Michael Bloomberg said today, citing FBI statistics for the first half of 2009 showing an 8 percent decrease in violent crime that outpaced a national decline.

The percentage change in murder, rape, robbery and assault was almost twice the 4.4 percent drop nationwide in the same period, according to the Federal Bureau of Investigation report. Property crime in the city fell 6 percent from a year earlier.

During the stretch from 1965 to 1993 New York City led the nation upward into carnage. Now it leads the nation back down to sanity.

I was recently living at the corner of 14th Street and Avenue A. In 1995, this was a sketchy neighborhood. Even New York veterans warned me to be careful at night: take a cab home from any bar, don’t walk. Nowadays I walk around and I see moms with young children, teenagers walking alone or talking with friends and laughing, people sauntering along the streets. There is no sense of danger anywhere. At 2 AM, most nights, the streets are busy with people going from one bar to another. The restaurant scene is thriving. The graffiti is mostly gone. The streets are cleaned up.

I recall visiting Williamsburg in 1995. I had a friend living out there. I recall walking down desolate streets where every building was a bombed out ruin, the windows covered with plywood. I recall a tough looking crew watching me as I crossed into their territory, walking down the sidewalk. I remember being scared and trying not to show it.

That was then. I went to a party in Greenpoint (in Brooklyn) last month. I left the party at 3 AM. I walked back to Williamsburg. It has become the hipster capital of the nation. Even at 4 AM, the streets were buzzing with young people in fashionable outfits. I stopped and got pizza. The whole area has become surprisingly affluent. I’ve the impression a lot of young people, if they have some money, come here right after they graduate from college.

I’ve a few friends who lament that the “real” New York is dead. It’s been replaced by this suburban mall that just happens to look like Manhattan. I get their point. To some extent, I regret some of the things that have been lost – New York no longer feels as avant-garde. A certain amount of danger is needed to energize a truly daring art scene. Still, much has been gained. In almost all of the southern half of Manhattan, you can walk alone, late at night, and never have any cause to fear. For someone with memories of what New York City used to be like, the current epidemic of safety is hard to understand, almost hard to believe.

There is some kind of grand reversal going on. The suburbs are getting (relatively) more dangerous, certain cities are less so. Shows like Breaking Bad highlight the change. When my parents were young, in the 1950s and 1960s, they traveled the country and they briefly considered settling down and raising a family in Arizona. It was a Sunshine State, with a booming economy. The sun was always shining and a new kind of society was being built along the new highways, an open affluent society where every family had its own house and its own plot of land, out in the suburbs. That was how it looked in the early 1960s. Nowadays, Arizona has been devastated by the collapse of the housing boom. Its economy is in tatters. It is a major conduit for drugs. The border area is dangerous. Drug cartels bribe customs officials so as to get more illegal goods into America. Our popular culture reflects some of the changes. Whereas in the 60s crime (on TV) was something that only happened in the big cities, nowadays shows like Breaking Bad highlight the amount of crime happening in the suburbs.

The hard part is figuring out why this happened, or how it happened. My mom spent the 1970s getting her master’s degree in urban policy. She struggled to understand the crime wave that was then unfolding. I don’t think anyone of that generation of theorists was able to come up with a solid answer that explained the crime wave. And I do not think any current theorists have a solid idea about what caused the outbreak of peace.

The New York City tech revival

Monday, December 21st, 2009

Chris Dixon notes a revival of the startup scene in New York City:

But the question that has puzzled me is: why did New York City lag behind the West Coast this decade so much more than last decade? Especially since the internet in the 2000’s has been more than ever about consumers, media, and advertising – traditional New York City strengths?

I think the only explanation is that the finance bubble of 2003-2008 was a giant talent suck on the East Coast. The people I knew graduating out of top engineering or business programs on the East Coast were all trying to work at hedge funds or big banks or else felt like fish out of water and moved west. Money was flowing so freely in the finance world that there was no way the risk/reward trade off of startups could compete. Eventually it just became downright idiosyncratic to be a startup person on the East Coast. The Larry and Sergey of the East Coast were probably inventing high frequency trading algorithms at Goldman Sachs.

But this is why New York City now seems poised for a technology startup boom. The finance bubble has burst and the industry will hopefully return to its historical norm, about half its bubble size. The traditional advertising and media businesses are in disarray. The people who work in them will no doubt find new applications for their talents.

There is also a nice ecosystem developing in New York City. Union Square Ventures is one of the best VC’s in the country, with early stage investments in companies like Twitter and Etsy (that were followed on by top West Coast VCs at significant markups). Bessemer is an old firm that has managed to stay relevant with investments in Yelp, Skype, and LinkedIn among others. There is also a new wave of scrappy Boston firms spending a lot of time in New York City – specifically Spark, General Catalyst, Flybridge, and Bain Ventures. First Round Capital out of Philadelphia is extremely active in early stage investing in New York. There are a bunch of veteran entrepreneurs actively investing in and mentoring seed stage startups. Google has a big office here and many people seem to be leaving to go start companies.

The New York City start-up scene is warming up

Monday, December 21st, 2009

A fascinating look at some of the startups based in New York City.

One thing I’ve noticed over the past year is that NYC’s version of Silicon Valley will be Soho, which has been primarily associated with the fashion industry. The combination of the falling price of leases stemming from the 2008 financial collapse, and the dropping rent (all the bankers moved out of Manhattan); there have been dozens of creative startups opening up office in Soho. I’ve listed the ones I know in the list below.

1. 20×200 sells art for everyone at ridiculously affordable prices (Soho).

2. Aviary makes creation accessible to artists of all genres.

3. Behance organizes the creative world to make their ideas happen (Soho).

4. Betaworks is an internet media company.

5. Blip.tv is the next generation television network (Soho).

6. By/Association is a private service for new introductions to remarkable people (Soho).

7. Bug Labs is a modular, open source system for building devices.

8. Boxee is the best way to enjoy entertainment from the Internet and computer on your TV.

9. Carbonmade helps you build and manage an online portfolio website (Soho).

10. ChallengePost is a marketplace for challenges.

11. Clickable is an online solution that makes creating and managing online advertising simple and effective.

12. College Humor is the best humor site on the internet.

13. Designer Pages is a free social application for finding products in architecture and interior design.

14. Drop.io allows simple real-time sharing, collaboration, and presentation.

15. Etsy is the world’s most vibrant handmade marketplace.

16. Foursquare gives you and your friends new ways of exploring the city (Soho).

17. gdgt is the new consumer electronics site by the guys behind Engadget and Gizmodo.

18. Harvest allows simple online time tracking, timesheet, and reporting (Soho).

19. Hello Health helps doctors communicate, document, and transact with their patients in person and online.

20. Hot Potato allows you to find events, join the crowd, and share the experience.

21. Hunch helps you make decisions and gets smarter the more you use it.

22. Kickstarter is a funding platform for artists, designers, filmmakers, musicians, journalists, investors, and explorers.

23. Livestream is the most powerful live broadcast platform on the internet.

24. Meetup helps groups of people with shared interests plan meetings and form offline clubs in local communities around the world.

25. OMGPOP is the #1 place to play free multiplayer games with your friends.

26. Parachutes aims to reinvent how people teach and learn.

27. Quirky is a social product development company.

28. SeamlessWeb is the fastest, easiest, and smartest way to order food delivery online.

29. Squarespace is a fully hosted, completely managed environment for creating and maintaining a website, blog or portfolio (Soho).

30. Tumblr is the easiest way to blog.

31. Vimeo is a respectful community of creative people who are passionate about sharing the videos they make.

PUBLISHING/EMAIL COMPANIES
New York City has always been the epicenter of the publishing and advertising industries. And that hasn’t changed with this list of innovative companies changing the publishing and email businesses.

1. Daily Candy is a handpicked selection of all that’s fun, fashionable, food related, and culturally stimulating in the city you’re fixated on.

2. Flavorpill is a daily guide to quality cultural events in New York City, Los Angeles, San Francisco, Chicago, Miami and London.

3. Gawker is an online media company (Soho).

4. Gilt Groupe offers luxury designers and fashion brands at prices up to 70% off retail.

5. Huffington Post offers syndicated columnists, blogs and news stories with moderated comments.

6. One King’s Lane offers exclusive sales on designer home accessories.

7. Tasting Table is a free daily email about the best of eating and drinking culture.

8. TBD is a free email newsletter that delivers one world-changing idea and one collective action to improve our future.

9. Thrillist’s daily emails sift through the crap to find the newest and best the Nation is hiding (Soho).

10. Urbandaddy brings you the single thing you need to know every day about your city.

11. Very Short List is a collection of distinct, free, daily e-mails that each recommend one must-see gem a day.

A denormalisation dictionary

Sunday, December 20th, 2009

A fascinating bit of database optimization:

Having run two crowdsourcing projects I can tell you this: the single most important piece of code you will write is the code that gives someone something new to review. Both of our projects had big “start reviewing” buttons. Both were broken in different ways.

The first time round, the mistakes were around scalability. I used a SQL “ORDER BY RAND()” statement to return the next page to review. I knew this was an inefficient operation, but I assumed that it wouldn’t matter since the button would only be clicked occasionally.

Something like 90% of our database load turned out to be caused by that one SQL statement, and it only got worse as we loaded more pages into the system. This caused multiple site slowdowns and crashes until we threw together a cron job that pushed 1,000 unreviewed page IDs into memcached and made the button pick one of those at random.

This solved the performance problem, but meant that our user activity wasn’t nearly as well targeted. For optimum efficiency you really want everyone to be looking at a different page—and a random distribution is almost certainly the easiest way to achieve that.

The second time round I turned to my new favourite in-memory data structure server, redis, and its SRANDMEMBER command (a feature I requested a while ago with this exact kind of project in mind). The system maintains a redis set of all IDs that needed to be reviewed for an assignment to be complete, and a separate set of IDs of all pages that had been reviewed. It then uses redis set difference (the SDIFFSTORE command) to create a set of unreviewed pages for the current assignment and then SRANDMEMBER to pick one of those pages.

This is where the bug crept in. Redis was just being used as an optimisation—the single point of truth for whether a page had been reviewed or not stayed as MySQL. I wrote a couple of Django management commands to repopulate the denormalised Redis sets should we need to manually modify the database. Unfortunately I missed some—the sets that tracked what pages were available in each document. The assignment generation code used an intersection of these sets to create the overall set of documents for that assignment. When we deleted some pages that had accidentally been imported twice I failed to update those sets.

This meant the “next page” button would occasionally turn up a page that didn’t exist. I had some very poorly considered fallback logic for that—if the random page didn’t exist, the system would return the first page in that assignment instead. Unfortunately, this meant that when the assignment was down to the last four non-existent pages every single user was directed to the same page—which subsequently attracted well over a thousand individual reviews.

Next time, I’m going to try and make the “next” button completely bullet proof! I’m also going to maintain a “denormalisation dictionary” documenting every denormalisation in the system in detail—such a thing would have saved me several hours of confused debugging.
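To make the lesson concrete, here is a minimal sketch in Python of a "next page" picker that refuses to fall back to a fixed page. It is not the author's code. Plain Python sets stand in for the Redis sets, the set subtraction mirrors what SDIFFSTORE computes, and random.choice plays the role of SRANDMEMBER. The names pick_next_page, required_ids, reviewed_ids and existing_ids are my own assumptions, and existing_ids stands in for a check against the source of truth (MySQL in the story above).

```python
import random


def pick_next_page(required_ids, reviewed_ids, existing_ids, rng=random):
    """Return a random unreviewed page ID that still exists, or None.

    All three arguments are iterables of page IDs. The names are
    hypothetical; they are not taken from the original project.
    """
    # What SDIFFSTORE would compute in Redis: required minus reviewed.
    unreviewed = set(required_ids) - set(reviewed_ids)

    # Guard against stale denormalised data: keep only IDs that the
    # source of truth still knows about.
    candidates = unreviewed & set(existing_ids)

    if not candidates:
        # Nothing valid is left. Report that to the caller instead of
        # sending every user to the same fallback page.
        return None

    # What SRANDMEMBER would do in Redis: pick one member at random.
    return rng.choice(sorted(candidates))
```

The design choice is the early return of None: when the denormalised sets disagree with the database, the caller can show an "all done" message rather than quietly funnelling everyone to one page.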

Performance tests: Java versus C

Saturday, December 19th, 2009

This looks to be a truly exhaustive discussion of how software written in Java performs compared with software written in C. I am surprised that Java is as competitive as it is. This in particular caught my eye:

It could also be noticed that JRuby is faster than native Ruby.

I think this says a lot about how much time and energy and money has been spent optimizing the JVM:

Places where C/C++ proponents claim C beats Java, but it doesn’t appear (to me) to do so:

* Most ‘plain’ modest sized programs. These will be programs requiring no more than the “usual” compiler optimizations and not so tightly constrained by machine size or startup time. Examples might be things from simple compute-bound loops (string hash, compress) up to IDEs & editors (and most visual tools); DB cache/drivers, etc.

[SS: For the example where Java beats C/C++ a number of times visit www.caucho.org and their OSS PHP engine Quercus. You can check the numbers yourself using my http://code.google.com/p/wikimark/ For the example of super-fast memory DB in Java visit http://www.h2database.com it beats MySQL both in performance and footprint :) Of course it's specifically tuned for in-memory use. And Java is just an example of so called Managed Runtime (Microsoft term). If we are talking about Java performance we are mostly considering JITed code performance. But JIT can be effectively applied to C/C++ as well, see http://llvm.org Apple Mac OpenGL implementation is based on LLVM and all OpenCL implementations too. So anyone running their games on Mac or planning to use OpenCL will use the JIT. ]

The traveler: where did I get that bruise?

Saturday, December 19th, 2009

“Good Lord! Where’d I get THAT?”

I love this line from Sarah Lacy, which matches my own experience when traveling off the beaten path:

I will take four days out to roundly smack-down anyone who says Nashville is a better city, and now that the dates have changed from Dec. 31-Jan. 3 to Dec. 27-30, it even overlaps nicely with my 34th birthday.

Thirty-effing-four. I’m telling you I look every bit of it after this year of travel. I also have an interesting collection of scars, bites, bruises and scrapes from life in all these far flung places. A lot of mornings of yelling “Good Lord! Where’d I get THAT?” in various hotel showers. But I wouldn’t trade it. Seeing the world and meeting thousands of its entrepreneurs over the last year have permanently changed my worldview and changed me as a person. It’s a good kind of pain. I mean—assuming I do actually get this book done by next August. Not totally sure I won’t yet wind up kidnapped, in jail or in an insane asylum.

She puts it very well. Not that I’ve traveled like she is doing now, but I recall various adventures hitch-hiking, jumping into various cars and trucks, and then jumping out, and later being surprised at all the odd bruises I picked up.

Colossus, designed and built in rural Argentina, is a machine for harvesting olives

Saturday, December 19th, 2009

Sarah Lacy writes about an interesting startup that is building farm machinery in rural Argentina:

The idea was born back in the late 1990s. Argentina offered a tax benefit to encourage the planting of some 70,000 hectares of olive trees in poor areas of the country. Argentina had less than 20,000 hectares before the change. The catch was these groves had to be high density, a minimum of 300 trees per hectare. The incentives have worked well enough that Argentina’s Ministry of Economy and Production estimates that the country could be a top ten producer of the world’s olive oil supply within the next decade.

Olive groves take about three years to mature and Bonadeo—a self-proclaimed “soybean man” and long-time farmer—noticed a problem before a lot of other people: Who was going to harvest all these olives? Harvesting olives is expensive and time-consuming and has to be done in a 70-day window. There just wasn’t the labor in Argentina, especially given the high-density plots. It would take 800 people to harvest 1,200 hectares. “That’s more like a military operation than agriculture,” Mourelle says.

So began years of trial and error building the Colossus, a huge machine that, crassly put, looks like it’s having its way with an olive tree. The machine straddles a row of trees and rubber tentacles gently swat off the olives at rapid speed. The arms can move in and out to hug the canopy of the tree—all controlled by a joystick in the air-conditioned, comfortable cab. The company is doing roughly $4 million a year in revenues and sells the machines in six countries. The Colossus increases productivity ten-fold and cuts harvesting costs by a third once the cost of the machine is paid back.

It was a humble beginning. Bonadeo barely had a working prototype and no customers. There’s no such thing as venture capital in Argentine farm country. Without money, he couldn’t build more machines. Bonadeo used to befriend olive farm managers to find out when the owners would be in town. He and his team would crowd into a van and tow the Colossus over for cold calls. Sometimes he was laughed at, sometimes the owner wouldn’t be there after all. “There’s no way you guys can build this business from here,” potential buyers said, even when they saw the machine working. It was disheartening.

The only reason the first Colossus was sold was luck. Two farms were close to signing, but not quite ready to commit to the pricey $500,000 sticker price. So the smaller one called up the larger one and offered to split it with him and share the machine. Simply out of Argentine machismo the owner of the larger farm decided he wasn’t going halfsies on any farm equipment, called Bonadeo into his office and said he had five minutes to make a sale.

“What’d you say?” I asked.

“Hamana…hamana…hamana…” he joked.

It didn’t matter what he said, the man bought one anyway. Soon after that an Australian company placed an order for three machines. Three! “Not bad, fat boy,” Bonadeo said to himself. MaqTec was in business.

It will be fascinating to see if this company survives. It has everything going against it – distance from centers of innovation, lack of capital, limited domestic markets, a nation with a historically broken political system, a culture that until recently valued conservative social traditions over innovation, etc. 100 years ago Argentina was the 12th wealthiest nation on Earth, per capita. Its decline was due to several factors, though the biggest of all was probably the lack of education of its citizens, which fed a series of social pathologies that led to a broken political system and then dictatorship. I would love to see Argentina revive. I don’t doubt that innovators like Bonadeo will play a crucial role in any revival that happens. But, wow, talk about an uphill fight.

On a different topic, I’m glad to see TechCrunch escape from the suffocating provincialism of most American business news. It is appropriate that a weblog devoted to cutting edge subjects should show leadership in recognizing the amount of innovation going on in the developing countries.

I am concerned about that provincialism. It is an ominous sign of where America’s thinking is. With the economy in a coma, and competent economists suggesting the coma will last another 5 years, it is clear a lot of the most important innovation of the next decade will be happening elsewhere. So why is Sarah Lacy such a relatively rare figure? Why aren’t there 100 Sarah Lacys, or 1,000 Sarah Lacys? Why aren’t more writers going overseas to tell the American public what is going on over there?

Odd interface decisions in the Thesis framework of DIY Themes

Saturday, December 19th, 2009

I was just looking at the video on DIY Themes that explains their Thesis framework. I was surprised by the part of the video where he describes how to move the nav bar from above the title to below it. Moving the nav bar involves adding 2 lines of code to their custom_functions.php file, as you can see in this screen shot:

[Screenshot: DIY Themes Thesis code for nav menu placement]

Doesn’t it seem bizarre that you need to type these 2 lines of code?

remove_action('thesis_hook_before_header', 'thesis_nav_menu');
add_action('thesis_hook_after_header', 'thesis_nav_menu');

I’m curious what theory of user interaction justifies these 2 lines? When you want a particular element to be exposed to control by some end-user, there are 2 ways to go that make sense:

1.) Expose the element in the simplest way, allowing for the most straightforward manipulation.

2.) Control the element from your code, so it can be manipulated from a GUI.

Since we are talking about a block of HTML, the simplest way to expose it would be to leave it as plain HTML in the template. A user could then open the file and cut-n-paste the block of HTML to another part of their template. The user could use a text editor, or perhaps GUI design software such as Dreamweaver. Or you (the programmer creating the code/theme/template) can control the element from your code, and create a GUI that allows the user to drag the element around to wherever they want it. But why would you ever do what DIY Themes does here – control the element from code, and ask your users to edit the code?

In the 1970s, personal computers were usually driven by commands typed at the command line. In 1984, Apple made a big splash when it came out with the Macintosh, the first commercially successful personal computer with a modern GUI. Since that time it’s been recognized that GUIs can do a lot to make software easier for end-users to use.

I know a lot of designers who are comfortable with HTML and CSS but not necessarily with PHP code – the risk of some maddening, hard to track down parse error is too high.

I have the impression that over the last 2 years frameworks such as Thesis have become popular in the WordPress community, and the kind of code tricks that you see above have become common. This is counter-intuitive: I would have expected the WordPress community to dislike this design style. WordPress originally became popular with designers because it offered a fairly simple way of editing templates.

I’ve been working with PHP for 10 years. I’m comfortable with the code. But I hate template systems where design elements are controlled from the PHP code. I prefer simple, literal templates full of plain HTML, with only a few PHP commands embedded in the templates. I hate controlling the placement of anything from PHP code. I want designers to be able to open my templates and re-design everything, without ever having to edit any PHP code.

I’m surprised that DIY Themes is taking the approach that it is taking.

Much of the effort in the Thesis framework seems aimed at wrapping a GUI around some of the technical details of running WordPress. I wonder why they didn’t wrap a GUI around these 2 lines?
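To show what I mean by wrapping a GUI around those 2 lines, here is a rough sketch. It is written in Python rather than PHP, and it is only a toy model: the Hooks class imitates the general idea of add_action and remove_action but is not WordPress's or Thesis's real API. The function apply_nav_position and the setting values "before_header" and "after_header" are names I made up for illustration. The point is that a saved setting from an options screen could drive the same two calls the user is currently asked to type.

```python
from collections import defaultdict


class Hooks:
    """Toy hook registry, loosely modelled on add_action/remove_action."""

    def __init__(self):
        self._actions = defaultdict(list)

    def add_action(self, hook, callback):
        # Register the callback once per hook.
        if callback not in self._actions[hook]:
            self._actions[hook].append(callback)

    def remove_action(self, hook, callback):
        # Removing a callback that is not registered is a no-op.
        if callback in self._actions[hook]:
            self._actions[hook].remove(callback)

    def do_action(self, hook):
        # Run every callback on the hook and collect its output.
        return [callback() for callback in self._actions[hook]]


def nav_menu():
    # Placeholder markup standing in for the real nav menu.
    return "<ul class='nav'></ul>"


def apply_nav_position(hooks, position):
    """Translate a saved GUI setting (hypothetical) into hook calls."""
    slots = ("before_header", "after_header")
    if position not in slots:
        raise ValueError("unknown nav position: %r" % (position,))
    # Same effect as the two hand-typed lines: remove, then add.
    for slot in slots:
        hooks.remove_action("thesis_hook_" + slot, nav_menu)
    hooks.add_action("thesis_hook_" + position, nav_menu)


def render_page(hooks):
    # Build the page top to bottom, firing hooks around the header.
    parts = hooks.do_action("thesis_hook_before_header")
    parts.append("<h1>Site title</h1>")
    parts += hooks.do_action("thesis_hook_after_header")
    return parts
```

With something like this behind a drop-down or a drag handle, a designer would pick "below the title" in the admin screen and never open custom_functions.php at all.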