Respecting privacy at Basecamp

I spend a lot of time as a data scientist thinking about how to use data responsibly, particularly when it comes to privacy. There’s tremendous value to be found by analyzing data, but the only way the data science field will continue to have data to analyze is if we are responsible in how we use it.

As a company, Basecamp strives to have the respect for user privacy that we’d like in every service we personally use.

I could talk about the things that we do relating to privacy:

We have a plain English privacy policy, and expect to be GDPR compliant by the deadline.
We use encryption for all communications between Basecamp and your browser, and we encrypt our backend services as much as is practical.
When you cancel, we delete your account and all your data.
We purge log data and database backups after 30 days.

But I think our privacy philosophy is better defined by the things that we don’t do.

We don’t access customer accounts unless they ask.

The only time we’ll ever put ourselves into a position to see a customer’s account is if they grant explicit permission to do so as part of a support ticket. We log and audit all such access.

We don’t look at customer identities.

Many companies, especially startups, review every signup manually and reach out to interesting looking customers. I get lots of these emails, and every one leaves me unsettled.

Tons of companies will also use the fact that you signed up as permission to identify you as a customer for marketing purposes. Over the years, I’ve had to ask no fewer than a dozen companies to remove Basecamp from their marketing material.

I find both of these practices to be distasteful. There’s no reason I, or anyone else here, needs to know the names of people who are signing up for Basecamp. It’s unnecessary.

We don’t share customer data.

There are a few aspects to this, but our basic premise is that it’s your data, and not ours, so we shouldn’t be sharing it.

We get lots of people writing us from big companies asking “does anyone else at Acme use Basecamp?” or people asking “can you tell me any companies in our industry that use Basecamp?” Just like we don’t look at identities ourselves, we also don’t disclose them to people who ask.

We’ll only provide customer data to law enforcement agencies in response to court orders. Unless specifically prohibited from doing so, we’ll always inform the customer of the request.

It should go without saying, but we don’t sell customer lists or any other data to anyone.

We don’t look at identifiable usage data.

To make Basecamp better, we do analyze usage patterns, and we have instrumentation to enable us to do that. This inherently requires us to look, in some form, at what people are doing when they’re using Basecamp.

Where we draw the line is that we never look at identifiable usage data. Any data that we use for analysis is stripped of all customer provided content (titles, message or comment bodies, file names, etc.), leaving only metadata, and it’s blinded to remove identifiable information like user IDs, IP addresses, etc. We try to do these things in such a way that it’s impossible for anyone analyzing data to even accidentally have access to anything identifiable.
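To make the idea concrete, here is a minimal sketch of what stripping and blinding an event could look like. It is an illustration only, not a description of our actual pipeline: the field names, the `scrub_event` and `blind` helpers, and the choice of a keyed hash for blinding are all assumptions made for the example.

```python
import hashlib
import hmac

# Hypothetical field names; the real event schema is not described here.
CONTENT_FIELDS = {"title", "body", "file_name"}   # customer provided content
DROPPED_IDENTIFIERS = {"ip_address"}              # removed outright
BLINDED_IDENTIFIERS = {"user_id"}                 # replaced with a keyed hash


def blind(value, secret):
    """Return a keyed SHA-256 digest of an identifier.

    The same input and secret always give the same token, so events from
    one user can still be grouped, but the token cannot be mapped back to
    the original ID without the secret.
    """
    return hmac.new(secret, str(value).encode("utf-8"), hashlib.sha256).hexdigest()


def scrub_event(event, secret):
    """Return a copy of the event that is safe to hand to an analyst."""
    scrubbed = {}
    for key, value in event.items():
        if key in CONTENT_FIELDS or key in DROPPED_IDENTIFIERS:
            continue  # never leaves the source system
        if key in BLINDED_IDENTIFIERS:
            scrubbed[key] = blind(value, secret)
        else:
            scrubbed[key] = value  # plain metadata such as action or timestamp
    return scrubbed
```

In a setup like this, the secret would have to live somewhere the people doing analysis cannot reach. Otherwise the blinding could simply be recomputed for a known user ID.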

This choice to never look at any identifiable data (or even be able to) does place minor constraints on the analyses we can perform, but so what? There’s plenty of value left in what we can do. My job might be a little bit harder, but I’m happy to spend the extra effort to be respectful of customers’ privacy.

We don’t send customer data to third party services.

As much as possible, we avoid the use of third party services that require any customer data to pass through them. There are many cases of such tools capturing too much, and we can’t control what happens with data once it reaches them.

There are a few cases where we do use third party services, which I’m happy to disclose:

We use Amazon Web Services and Google Cloud Platform to host some parts of our applications. In those cases, we use available encryption options to prevent the platform provider from having access to the underlying customer data.
We use third party analytics tools (currently Google Analytics and Clicky) on public facing websites only. They capture IP addresses, etc., but are not put in any place where they could capture user provided content.
We use a third party helpdesk tool for answering support cases (HelpScout). This means that HelpScout has any data that gets sent in a support ticket.
We use third party tools for sending some emails (MailChimp and Customer.io), which have access to customer email addresses and metadata required to know when to send an email. We don’t send any customer provided data to either service.
We use third party CDNs (Akamai and CloudFront) for serving static assets. Those services have access to IP addresses, etc.

We don’t want you to feel creeped out.

At the end of the day, this is the bottom line. We don’t want to do anything that feels creepy or that we wouldn’t want done with our data.

We know that you’re putting your trust in us when you use Basecamp, and we want to do everything we can to honor and live up to that trust.