o beautiful code
building well-crafted systems


The motivation for this post is the disconnect I’m seeing between the marketplace and my own experience with hiring developers. It seems an adage these days that hiring great developers is extremely difficult. My experience is that there is enough talent out there and that employers, large and small, are particularly mediocre at hiring. The market is certainly competitive, but I’ve discovered that a well-crafted process, attention to a few key details at each stage, and a healthy respect for the psychology of relationship-building quickly catapult you above the rest.

I can say this with confidence, because my experience with hiring has been tested in cash-poor-equity-rich, cash-rich-equity-soso, and cash-soso-equity-soso situations. The Googles of the world can always outspend you, but you can win by being scrappier and more clever. In this post, I’ll share the details of my process. My diabolic interest in doing so is that I have a soft spot for small companies, where a single great developer can have outsized influence on the success of the business.

Your Network

The single best way to hire a great developer is to leverage your network. It’s common knowledge. Unfortunately (for us), great developers typically find themselves in great situations. When I’m in hiring mode, I regularly find that the people I want to pull in are happily employed. I’ll send an email into my network asking for referrals and no leads will come back. More often, folks will remind me that they too are looking for awesome developers. At some point, we need a strategy to explore outside of our network. This post is about the art and science of tapping into a pool you don’t know (yet).

The Job Post

When you post for a position, you are starting a conversation. Measured in that frame, most companies are terrible conversationalists. But it’s important to get it right because first impressions count.

I needed to ramp up my Dev team on RESTful systems. Existing content on this topic was either too terse or too verbose.

So I created this deck as a necessary and sufficient tutorial on REST. The goal was for my Devs to walk away with enough of an understanding to be (and want to be) dangerous.

Here it is…REST in 18 slides (ok, 21 slides if you include Cover, References, and Thank You):

This is more text-heavy than I prefer, but I needed an excuse to try out SlideShare so I opted for a presentation format.

Best viewed in full-screen.

Here is the link to download the slides as a PDF: Download PDF

RserveCLI2, a .net client for Rserve

RserveCLI is a .net/cli client for Rserve, created by Oliver M. Haynold. Oliver has done a great job with this project.

I forked this project to add features, fix bugs, and do some restructuring. I thought it was a significant enough departure to create a new project.

To that end, I’m hosting RserveCLI2 on GitHub.

Contributions appreciated!

Nixed Amazon Linux

I’m using EC2; I just can’t use Amazon Linux. Let me explain…

Solve for Dev/Prod Parity

My company is committed to the principle of dev/prod parity to minimize deployment failures. That means our development environment is as close as possible to our production environment. Unfortunately, Amazon Linux is not super accessible. It’s not published such that it can be installed on desktops, laptops, or local virtualization environments like VirtualBox. It seems there are ways to hack it, but it isn’t an out-of-the-box sort of thing.

So, there are two options for selecting an OS to run in the cloud:

  1. Use Amazon Linux. Develop, test and deploy in EC2 (always-on instance).
  2. Use a more accessible distro (e.g. Ubuntu which has local and cloud images). Develop and test locally and deploy in EC2.

#1 is costly where micro or small instances can’t do the job. Also, I’m uncomfortable being perpetually on-the-clock (until the day cloud computing becomes a dirt-cheap commodity).

So that leaves #2!

Rabbit Hole

I’m a compulsive note-taker (using OneNote, which I think tops all other products in the category). I like to condense information, in plain English, and codify insights so I can refer to them later when I need a refresher.

Here are my notes from Rich Hickey’s 2011 talk “Simple Made Easy” (augmented with bits from his 2012 RailsConf talk). This is extraordinary! It’s near the top of my list of software engineering resources (across all formats - talks, books, etc.). If you haven’t seen the talk, go watch it and I promise you will be so much wiser for it.

It was a small bit of effort to find the deck, so I’m hosting it for your convenience. My notes are a synthesis of the talk, the deck, and some of my own material where I felt it aided in understanding.

The Simple

  • A simple piece of software means that there is no interleaving. It doesn’t combine things. It has focus. Something is simple when it addresses one role, one task, one concept, or one dimension.
  • Simple doesn’t mean there is only one of them. It’s not about an interface with one operation or a class with one instance. It’s not about cardinality.
  • By contrast, a complex system is braided or folded together. There are interleaving roles, tasks, concepts, or dimensions.
  • Identifying a system as simple or complex is thus objective. If there are twists, it’s complex. If there is no interleaving, it’s simple.

A Warning About warning()

Avoid R’s warning feature. This is particularly important if you use R in production, that is, when you regularly run R scripts as part of your business process. It is also important if you author R packages. Don’t issue warnings in your own code, and treat warnings in third-party code as errors that must be explicitly suppressed. I’ll discuss a strategy to implement this in a little bit, but first let’s discuss the warning feature and the justification for this advice.

The Warning

A warning cautions users without halting the execution of a function. It basically says “although I can and will give you an answer, there might be a problem with your inputs. Thus, the computation could be flawed.” For example, the correlation function issues a warning when an input vector has a standard deviation of 0. Rather than raising an error and halting execution, the function issues a warning and returns NA.

> cor( c( 1 , 1 ), c( 2 , 3 ) )
[1] NA
Warning message:
In cor(c(1, 1), c(2, 3)) : the standard deviation is zero
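
To make the advice concrete, here is a minimal sketch of one way to treat warnings as errors unless they are suppressed on purpose. It is not necessarily the strategy this post goes on to describe. The helper name strict() is hypothetical; withCallingHandlers(), tryCatch(), and suppressWarnings() are base R.

```r
# Hypothetical helper (not from the original post): evaluate an expression
# and promote any warning it raises to an error.
strict <- function(expr) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      stop("warning promoted to error: ", conditionMessage(w), call. = FALSE)
    }
  )
}

# A clean call passes through untouched.
ok <- strict(cor(c(1, 2, 3), c(2, 4, 7)))

# The zero-standard-deviation case now halts instead of quietly returning NA.
# tryCatch() is used here only to capture the error message for inspection.
failed <- tryCatch(
  strict(cor(c(1, 1), c(2, 3))),
  error = function(e) conditionMessage(e)
)

# Where NA really is an acceptable answer, say so explicitly at the call site.
quiet <- suppressWarnings(cor(c(1, 1), c(2, 3)))

print(ok)
print(failed)
print(quiet)
```

A blunter alternative is options(warn = 2), which turns every warning in the session into an error. The wrapper above keeps the decision local to the calls you choose to guard.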

We’re moving from internally hosted SVN to Git/GitHub for the typical reasons (resounding hallelujah). Mercurial was a contender. We surveyed the available hosts and chose Git because of GitHub. It’s an old adage at play here - betting on a superior toolset is just as important as betting on a technology. The benefits of Git and Mercurial largely overlap since they are distributed. Had a GitHub-like site existed for Mercurial then likely we would have sided with Mercurial for its simplicity. It’s possible to use Mercurial to push/pull from a Git repository through the Hg-Git plugin, but this scenario will always be at risk as the respective technologies evolve. Also, intentionally taking on an additional dependency this early in the game seems awkward and unwise when it can be avoided.

The host is not the only tool in the chain, but for our purposes it’s the most impactful variable. Both Mercurial and Git have decent shell extensions with TortoiseHg and TortoiseGit. These are enhanced by replacing the diff tool with Beyond Compare. So the non-command-line use case is off the table. I consider the command-line tools to be a part of the core technology, but regardless hg.exe and git.exe are straightforward for the usual tasks.


I Read The F***ing Manual And It Sucks

from xkcd

R Source on GitHub

  • I added R source code v0.49 to v2.15.0 to a GitHub repository: r-source
  • Each release is tagged by version number.
  • This is an easy and accessible way to browse R source and diff against prior versions. I couldn’t find a suitable alternative.
  • A tarball via Apache directory listing via CRAN mirror via the ____ (insert sarcastic adjective) R-project website is not a suitable alternative.
  • I will keep the repository updated as new versions of R are released.

R, I Love You

It is easier to critique than it is to create. I write this post with much gratitude for R, the R community and particularly R-Core who are paid $0 to bring us R. I’d like to offer an idea and I’m wondering if people are interested in rallying around it.

Julia, I’m in a committed relationship

You might have caught the post titled “Julia, I Love You”. It’s the top article on R-bloggers. Perhaps you had the same reaction I did. I read the material, repeated “wow” a few times, and slipped into a contemplative space. Am I betting on an outdated technology? I slapped myself (figuratively) and snapped back to reality. I vaguely remember that when Revolution Analytics released side-by-side performance figures, there was pushback about an apples-to-oranges comparison. Some tests had to be reworked (or am I just making that up?). People have added comments to the Julia post with performance fixes to the R code used to benchmark against Julia. And in the end, languages come and go but R has withstood the test of time.

I use R | Julia because. . .

Why do people use R? In my (informal, anecdotal, not rigorous, no medals of honor conferred) survey, the reasons people use R are:

How R Searches and Finds Stuff

Rabbit Hole

How to push oneself down the rabbit hole of environments, namespaces, exports, imports, frames, enclosures, parents, and function evaluation?


There are a few reasons to bother reading this post:

  1. Rabbit hole avoidance
    You have avoided the above-mentioned topics thus far, but now it’s time to dive in. Unfortunately you speak English, unlike the R help manuals which speak “Hairy C” (imagine a somewhat hairy native C coder from the 80s who’s really smart but grunts a lot…not the best communicator).

  2. R is acting a fool
    Your function used to work, now it spits an error. Absolutely nothing about this particular function has changed. You vaguely remember installing a new package, but what does that matter? Unfortunately my friend, it does matter.

  3. R is finding the wrong thing
    You attached the matlab package and called sum() on a numeric matrix. The result is a vector of column sums, not a length 1 numeric. This messes up everything. What were you thinking trying to make R act like Matlab? Matlab is for losers (and rich people).

  4. You want R to find something else
    You like a package’s plotting function. If you could intercept one call within the function and use your own calculation, it would be perfect. This seems like black magic to you, but something is strange about maintaining a full copy of the function just to apply your little tweak. Welcome to the dark arts.

  5. Package authoring
    You have authored a package. How does your kid play with the other kids in the playground?
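
Reason 3 can be reproduced without installing anything. The sketch below uses a hand-built environment as a stand-in for a package that supplies its own sum(). The names fake_pkg and "fake_matlab" are hypothetical, and the column-sum behavior is an assumption taken from the description above. attach(), detach(), and the base:: prefix are base R.

```r
# Hypothetical stand-in for a package that exports its own sum().
# Assumption: like the example above, it returns column sums for a matrix.
fake_pkg <- new.env()
fake_pkg$sum <- function(x, ...) colSums(x)

m <- matrix(1:6, nrow = 2)

# Put the stand-in on the search path, ahead of base.
attach(fake_pkg, name = "fake_matlab", warn.conflicts = FALSE)

masked_result <- sum(m)        # resolved to the attached stand-in first
base_result   <- base::sum(m)  # the namespace prefix bypasses the search path

# Remove the stand-in; a plain sum() call resolves to base again.
detach("fake_matlab", character.only = TRUE)
restored_result <- sum(m)

print(masked_result)
print(base_result)
print(restored_result)
```

Calling search() before and after attach() shows where the stand-in sits relative to package:base, which is why the unqualified call finds it first.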