What does Team Data do at Basecamp?

Basecamp’s “team data” recently doubled in size with Justin White joining us full-time as a programmer. We’ve been in the data business at Basecamp for over six years, but the occasion of hiring a second team member caused me to reflect on why Team Data exists, what it does, and how we work.

A simple objective: make Basecamp better

We’re basically interested in three things on Team Data:

  1. Make Basecamp-the-product better to help our customers achieve their goals.
  2. Make Basecamp-the-company a great place to work.
  3. Make Basecamp-the-business successful.

These are really the same fundamental objectives every team at Basecamp is working towards; each team just has its own specific angle on them. The support team focuses on how its interactions with customers can achieve these goals, the design and development teams focus on how the functionality of Basecamp itself can achieve them, and so on.

On the data team, we primarily attempt to use quantitative information to achieve these goals. Our approach isn’t necessarily the best approach for every problem, but it’s the angle we take on things. If we can’t address a specific question or problem with some sort of number, we probably aren’t the best team to answer it, and we’ll gladly defer to the perspective others can bring to bear.

What we do

Pretty much everything we do on Team Data falls into one of two categories:

  1. We answer questions about Basecamp using data and make recommendations. The questions we tackle span a wide range: from specific questions about how a feature is used, to understanding how a change we made impacted signups, to open questions about how we can improve some aspect of business performance.
  2. We build infrastructure and tools to a) support our ability to answer the questions above, and b) help others at Basecamp accomplish their work more effectively.

We occasionally do things that don’t fall into either of those categories, but the core of what we do falls into either analysis or infrastructure.

A sampling of our work over the past few months includes:

  • Analyzing the performance of a new marketing site and account setup process.
  • Improving the internal dashboard app that powers many of our internal tools by removing thousands of lines of dead code and upgrading to a modern version of Rails.
  • Helping design, implement, and analyze a dozen A/B tests.
  • Migrating our data infrastructure from on-premises hardware to cloud-based services.
  • Analyzing the sequencing of notifications sent by Basecamp and recommending ways to adjust timing.

Things we believe about ourselves

Every team at every company has a set of beliefs about how it works, whether or not the team is aware of them, acknowledges them, or codifies them. Here on Team Data, there are a few tenets we try to embody and have taken the time to write down:

  1. We are scientists. Wherever possible, we apply the scientific method to solving problems, whether through analysis or engineering.
  2. We are objective. There’s no agenda on Team Data other than seeking the truth; we report the facts whether we like them or not.
  3. We try for simple. We don’t use a machine learning model when a heuristic will do, and we don’t write complicated programs when a simple `awk` one-liner will work.
  4. We are rigorous. When a problem demands a nuanced understanding or when data needs to be high quality, we stick to those requirements. We’d rather over-explain a complicated situation than over-simplify it.
  5. We are technology and tool agnostic. Ruby, Go, Scala, R, Python — whatever the best tool for the job is. When possible, we use open-source or third-party tools, but we’ll build what’s needed that isn’t otherwise available.
  6. We collaborate, engaging in peer review of analysis and code.

We don’t hit all of these points every day, but they’re the aspiration we’re working towards.

How we work

Unlike the core product teams at Basecamp, we don’t explicitly work in six-week cycles, and we each tend to have multiple projects under way at any given time. Many of our projects last a couple of days or weeks, and some stretch over six months or a year. We might do some instrumentation today and then back-burner it for 30 days while we wait for data to collect, or a thorny problem might wait until we figure out how to solve it.

Generally, Justin spends about 80% of his time working on infrastructure and the remainder on analysis, and I spend about 80% of my time on analysis and the remainder on infrastructure. This is mostly about specialization — Justin is a far better programmer than I am, and I have more experience and background with analytics than he has. We don’t hit this split exactly, but it’s our general goal.

We get lots of specific requests from others at Basecamp: questions they’d like answered, tools that would help them do their work, etc., and we also have a long list of bigger projects that we’d like to achieve. We explicitly reserve 20% of our time to devote to responding directly to requests, and we both try to set aside Fridays to do just that.

Anyone can add a request to a todolist in our primary Basecamp project, and we’ll triage it, figure out who is best equipped to fulfill it, and try to answer it. Some requests get fulfilled in 20 minutes; we have other requests that have been around for months. That’s ok — we embrace the constraint of not having unlimited time, and we admit that we can’t answer every question that comes up.

Outside of requests, we collaborate with and lean on lots of other teams at Basecamp. We build some of the tooling that the operations team uses for monitoring and operating our applications, and they provide the baseline infrastructure we build our data systems on. We collaborate with developers and designers to figure out what data or analysis is helpful as they design and evaluate new features. We work closely with people working on improving basecamp.com and the onboarding experience through A/B testing, providing advice on experimental design, analysis, etc.
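
As a rough illustration of what “analysis” can mean for an A/B test, here is a minimal sketch of a two-proportion z-test on signup conversion rates. This is not a description of Basecamp’s actual tooling or method: the function name `two_proportion_z_test` and every number below are hypothetical, and the sketch assumes a simple two-variant test with independent visitors and samples large enough for the normal approximation.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / n_a: conversions and visitors for variant A (hypothetical data)
    conv_b / n_b: conversions and visitors for variant B (hypothetical data)
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up example: 100 of 1,000 visitors sign up on A, 150 of 1,000 on B.
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference is unlikely to be noise, but it says nothing about whether the test was designed well, which is why advice on experimental design matters as much as the arithmetic.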

One of the most visible things our team does is put out a chart-of-the-day: some piece of what we’re working on, shared daily with the whole company.

Like the rest of Basecamp, we don’t do daily stand-ups or formal status meetings. Justin and I hop on a Google Hangout once a week to review results, help each other get unstuck on problems, and (since Justin is still relatively new to Team Data) walk through one piece of how our data infrastructure works and discuss areas for improvement. Other than that, all of our collaboration happens via Basecamp itself, through pings, messages, comments, etc.

Sound like fun?

Here’s the shameless plug: If you read the above and it sounds like your cup of tea, and you’re a student or aspiring data analyst, I hope you’ll consider joining us this summer as an intern. You’ll work mostly on the analysis side of things: you’ll take requests off our main request list and projects from our backlog, structure the question into something that can be answered quantitatively, figure out what data you need to answer that question, figure out how to get the data, perform analysis, write up results, and make recommendations.