This week offered a fun juxtaposition of ideas.
It reminded me of one of my struggles during the research that produced my debugger, Theseus. I wanted to see how people used debuggers while programming, so I invited six people to the lab, gave them programming challenges, and took notes.
While it was possible to draw general conclusions about some aspects of how people code, I almost didn't want to. Every single one of those people had such different approaches to using debuggers that all I really wanted to do was study more programmers.
If the distribution were bimodal (e.g. there are people who use debuggers and people who don't), then even with N=6 people, I would have likely seen two people who were alike. But although my memory of the interviews is fuzzy, I distinctly remember being hesitant to put any of the people I watched into the same group.
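The intuition here is a birthday-problem calculation. As an illustrative sketch (assuming people fall independently and uniformly into some number of equally common "styles", which is a simplification), the chance that six people include at least two of the same style is a certainty with two styles and still very likely even with ten:

```python
from math import prod

def p_two_alike(n: int, k: int) -> float:
    """Probability that, among n people drawn independently and
    uniformly from k equally common 'styles', at least two share
    a style (a birthday-problem calculation)."""
    if n > k:
        return 1.0  # pigeonhole: more people than styles
    p_all_distinct = prod((k - i) / k for i in range(n))
    return 1.0 - p_all_distinct

# With only 2 styles ("uses debuggers" / "doesn't"), 6 people
# are guaranteed to overlap:
print(p_two_alike(6, 2))   # 1.0

# Even with 10 equally common styles, an overlap among 6 people
# is very likely:
print(round(p_two_alike(6, 10), 4))  # 0.8488
```

So seeing no two people alike in a sample of six is weak evidence that the space of approaches has many more than a handful of modes.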
So when it comes time to evaluate a software language, library, or tool, I have to resign myself to several facts:
Very few people (if anybody) are going to have the same experience with it as I do. My personal opinion is worthless—although I like to think that my professional opinion is less so. ;)
Very few people are going to have the same experience with it as each other. If a single dimension such as "fondness for type expressiveness" were enough to summarize a programmer, then we could locate the average person on that spectrum and judge how far the library is from that average to determine its worthiness. But neither programmers nor software is like that. You couldn't say all people who like elaborate type systems, or all languages with elaborate type systems, are basically the same.
Some dedicated researchers decided on 14 dimensions for classifying languages and whatnot in a way that would give an idea of their usability. Even if you think 14 seems small, the number explodes when you consider that it's contextual based on the current task. (Fun challenge: evaluate your favorite languages using these heuristics. It turns out they all suck at something!)
Take averages with a grain of salt. When research finds that people spend X% of their programming time writing code, Y% of their time debugging, etc., be slow to generalize. Because within that population there are people who solve every programming challenge in their head and wonder why we research debuggers at all.
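To make that concrete, here is a toy sketch (the numbers are invented for illustration) of how an average can describe nobody in a bimodal population:

```python
from statistics import mean

# Hypothetical share of programming time spent debugging for ten
# programmers: half debug heavily, half solve everything in their
# head. These numbers are made up for illustration.
debug_share = [0.55, 0.60, 0.58, 0.65, 0.62,  # heavy debugger users
               0.02, 0.05, 0.00, 0.03, 0.04]  # "in their head" group

avg = mean(debug_share)
print(f"average debugging share: {avg:.0%}")  # 31%

# Yet no individual is anywhere near that average:
print(all(abs(x - avg) > 0.20 for x in debug_share))  # True
```

A study of this population could truthfully report "programmers spend about 31% of their time debugging" while describing exactly zero of its participants.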
So how can a team decide between using Django and Meteor for a project? Between Go and Rust? Promises and callbacks?
I dunno. Poll all the stakeholders?