Skip to the end for the TLDR
Motivation
So you want to do some machine learning on the side? No matter who you are, you don’t need permission or approval to do it. If this sounds like something you’d do, consider this series your treasure map. I’ve pursued dozens of side projects over the years and they almost always suffer a quiet death as people lose interest or friction slows progress. My goal is to distill what I’ve learned over the last ten years so you and your friends can get hands-on with the fun stuff quickly. Without the right setup, it’s like wanting to make cupcakes but having to constantly remodel the kitchen. This is my guide to setting up your machine learning kitchen right the first time.
This is the first of five articles about how to do machine learning with friends. Why machine learning with friends? Because it’s more motivating and fun to work on interesting problems with people you like. Before you ask anyone to do anything, first identify the area where you want to work. You could be fired up about climate, personal health, or fantasy sports. Whatever it is, if you think it’s awesome and there might be a machine learning angle to it, start there. Don’t convince yourself you’re into computer vision or natural language processing because you see it in the blogs and feel left out. The outcomes should be exciting and relevant to you, not the strategic initiatives of the big tech companies. The final article in this series has ideas for fun projects given different capabilities and experience levels.
Once you know what’s got you pumped up to get down and dirty, channel your inner Blackbeard and go recruit a merry band of pirates who share your enthusiasm. The group energy will keep everyone motivated when you get stuck on a thorny problem or life gets busy. A little diversity in skillsets is also a great way to get exposed to new ways of thinking or doing things. Frankly, this is both fun and good for your career. After assembling these marvelous misfits, it’s time to set up for a fruitful journey that hopefully doesn’t reduce you to scurvy-ridden cannibals lost at sea (been there, 1 star review, do not recommend).
Creating an organization
With your group aligned and rallied, first agree on a leader. By no means should this person be the sole decision-maker on anything, but someone has to put down a credit card and manage the administrative stuff. Of course, you can share costs and responsibilities, but put someone behind the wheel and make sure everyone feels good about it. With a leader in place, it’s time to create a formal organization. You might be tempted to cobble together free services or just work through GitHub, but I don’t recommend it. Doing this properly only costs a few bucks a month and it’s worth it in the long run.
You’ll establish your organization by signing up for a collaboration suite and using it to manage team membership. You can do this without a custom domain, but for less than $10/year you might as well give your team a name and lock in a domain for it. Mustachedmachinelearners.com is available as of the time of this writing. If you don’t already have a registrar you like, I use Namecheap. Each user will then have an email address like steve@mustachedmachinelearners.com that they can use to log in to everything else.
My Pick: Google Workspace
I primarily use Google Workspace for my organizations. It’s not the cheapest option but the starter plan is only $6/user/month, and it will grow with you if you have bigger ambitions. It’s also very widely used so chances are most people won’t need any time to get used to it. Google also has very good documentation for setting up your workspace with a custom domain. When I did this for the first time five years ago, it took over an hour with some troubleshooting. I set up a new workspace recently and it took less than 10 minutes. Google isn’t perfect but the combination of low cost and general ease of use makes it pretty compelling.
The starter plan will cover your email, calendar, video meetings (including screen sharing), messenger, docs, spreadsheets, presentations, storage, etc. Additionally, users can use Google to sign in to many other platforms without creating a new account.
Other Options
If you don’t want to use Google, there are other options. I haven’t used anything else extensively, but I want to be clear that Google isn’t the only choice. Most of the competitors are more expensive and designed for full-fledged businesses. However, Zoho Workplace seems to be the low-cost competitor to Google Workspace. The standard plan is $3/user/month (billed annually) and it appears to hold up against Google on features with a few minor differences. Again, I haven’t used this myself but it’s worth exploring.
Communications
Once your organization is set up, you need to pick a primary communication channel and stick to it. What you choose isn’t all that important but making a choice is. Spreading communications across channels like email, texts, and a private messenger will fracture your team and stop momentum cold. You can still text your friends about everything else and it’s also okay to send a few memes through the group channel, but keep the project talk centralized. This makes it easy to find important threads and stay focused on current projects, especially when you do other things with your teammates like get drunk and argue about who was the best Power Ranger.
My Pick: Slack
Slack (free)
I really like Slack for several reasons. A lot of people use it already and the free version is more than adequate for a small team. There’s no limit on users, but the free tier only keeps your most recent 10k messages. Aside from that, it’s a great chat tool. You can create channels for specific projects or issues, connect to third-party apps like Google Drive, and users can belong to multiple workspaces. Personally, I belong to several workspaces and Slack makes it easy to switch between them, as opposed to suffering through never-ending emails in a single inbox. Slack also has nice desktop and mobile apps that make it easy to receive (or silence) notifications across devices.
Other Options
Discord (free)
Alternatively, Discord is very similar to Slack and probably your better option if you’ve used it before for gaming or school. I primarily use Slack for business but Discord feels more fun. If you decide to upgrade to the paid version, they are similarly priced but there’s a key difference: Slack gets upgraded for the entire workspace and is paid by the admin while Discord allows individual users to upgrade and pay for themselves. If you haven’t used either, this comparison post is helpful.
If you don’t want to sign up for another service that will tempt you to upgrade and pay, I recommend using the default messenger that comes with Google Workspace or Zoho Workplace. The messenger is just one of many products these platforms support, though, so the experience tends to fall short of Slack or Discord, which focus entirely on messaging. If you’d rather not bother, the native messenger will do fine. Again, the important thing is to pick one and stick with it.
Tracking and Documentation
If your group is more than 2-3 people, I highly recommend putting some structure around task tracking and documentation. Imagine you’re working on fantasy football predictions. As a group, you need a few web scrapers, some preprocessing code, a model training script, and backtesting routines. Everything partially depends on something else but not entirely. If one person works on each thing, it can get pretty hairy keeping track of who is doing what, when it’s expected to be done, and what done looks like. Even with a small group, the complexity can be discouraging but with the right setup, you can work through it smoothly.
The goal here is visibility and simplicity. For tasks, you want to be able to create, assign, and track them along with any dependencies or blockers. This keeps everyone in the loop by default rather than constantly asking each other what’s going on. Think about it like a dynamic group to-do list. Once you complete a few projects and have a codebase established, you’ll quickly want to be able to document things and easily search for what you need. This is where well-designed software is super useful.
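If the “dynamic group to-do list” idea feels abstract, here is a minimal Python sketch of what a tracker is doing under the hood for the fantasy football example. Everything in it is an assumption for illustration: the task names, the owners, and the simple chain of blockers are made up, and real tools like the ones below do far more than this.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class Task:
    name: str
    owner: str  # hypothetical teammate names
    done: bool = False
    blocked_by: list[str] = field(default_factory=list)


# Hypothetical task list: each piece waits on the one before it.
tasks = {
    "web_scrapers": Task("web_scrapers", "steve"),
    "preprocessing": Task("preprocessing", "ana", blocked_by=["web_scrapers"]),
    "model_training": Task("model_training", "raj", blocked_by=["preprocessing"]),
    "backtesting": Task("backtesting", "kim", blocked_by=["model_training"]),
}


def ready_tasks(tasks: dict[str, Task]) -> list[str]:
    """Unfinished tasks whose blockers are all done."""
    return sorted(
        name
        for name, task in tasks.items()
        if not task.done and all(tasks[b].done for b in task.blocked_by)
    )


def suggested_order(tasks: dict[str, Task]) -> list[str]:
    """One valid order in which the tasks could be completed."""
    graph = {name: set(task.blocked_by) for name, task in tasks.items()}
    return list(TopologicalSorter(graph).static_order())
```

Calling `ready_tasks(tasks)` tells the group what can be worked on right now, and `suggested_order(tasks)` gives one sensible sequence for the whole project. A good tracking tool gives you the same visibility without anyone having to ask.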
My Picks: Jira and Confluence
Atlassian offers a suite of software development and collaboration tools which includes Jira and Confluence. The good news is both are free for up to 10 users ($7.50/user/month after 10). Jira is software for issue and project tracking. In my experience, it’s pretty commonly used so some people will have experience with it. If it’s new to everyone, you might start with Trello. Atlassian also owns Trello and it’s free for up to 10 users too ($5.50/user/month after 10). It’s very similar to Jira but a bit simpler to get started with. Both tools are designed around the Agile development framework. If this sounds like a foreign language to you but you want to know more, give the article below a read.
What is agile? Kanban vs Scrum
Confluence is a collaborative workspace for sharing knowledge. That sounds like fluffy marketing and it kind of is, but if you use Confluence right, it’s a fantastic tool. Ultimately, it’s just a common space to create pages and keep them organized. It uses markdown but you can also add plug-ins like draw.io to embed editable architecture or flow diagrams on a page. It doesn’t support live collaborative editing like Google Docs, but I don’t mind that given everything else I can do with it that I can’t do in Google Docs. It’s free for up to 10 users too ($5.50/user/month after 10). My best advice with Confluence is to follow a few best practices and avoid the documentation swamp that forms when people carelessly dump random things everywhere. I could write an entire series on this but I’ll spare you and just offer a link instead.
Other Options
There are plenty of other options available for tracking and documentation but I love the model Atlassian created. They know the value of their suite goes up exponentially when you use it effectively with more than 10 users so they price it accordingly. They also connect everything seamlessly so you can do things like link to Jira tickets from Confluence pages, tag people so they receive automatic updates, and manage user permissions across all products. With that said, if this is too many new things to learn or you just can’t stand Atlassian products for some reason, my recommended alternative is to use shared docs and spreadsheets in your organization suite (Google or Zoho). You’ll have to add a custom layer of organization to prevent the swamp from forming but it can be done. You may eventually see the value of tools like Jira and Confluence, or you’ll stick to the minimalist approach and not be bothered.
TLDR
Too long; didn’t read? I got you. Here’s what to set up and how much it will cost per user. By the way, I’m assuming most of you won’t have more than 10 people in your crew.
Buy a domain from something like namecheap.com for $10 or less per year.
Sign up for the Google Workspace starter plan for $6/user/month to manage the organization and provide a catch-all for collaboration if you don’t opt for other tools.
Set up a free workspace in Slack and add users through the Google accounts you just created. The basic version is free for unlimited users. The pro version for $9/user/month is worth it for video calls with screen sharing if you do a lot of remote pair programming.
Set up an account with Atlassian and add your team through their Google accounts then activate Jira and Confluence for task tracking and documentation. Both are free if you have 10 or fewer users then Jira is $7.50/user/month and Confluence is $5.50/user/month after that.
If you have 10 or fewer users, you’re only out $6/month for each of them with this setup. If you opt for Slack Pro and your team grows beyond 10, you’ll pay about $28/month for each but then you’re probably on your way to a legit startup.
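If you want to plug in your own crew size, the arithmetic above fits in a few lines of Python. This is just a sketch that hard-codes the prices quoted in this article, so check current pricing before you budget. The function names are mine, and I’m assuming the Atlassian charges apply to every user once the team passes 10.

```python
# Per-user monthly prices (USD) as quoted in this article; they may have changed.
GOOGLE_WORKSPACE_STARTER = 6.00
SLACK_PRO = 9.00
JIRA_PAID = 7.50
CONFLUENCE_PAID = 5.50
ATLASSIAN_FREE_USER_LIMIT = 10
DOMAIN_PER_YEAR = 10.00  # upper bound quoted for the domain


def monthly_cost_per_user(team_size: int, slack_pro: bool = False) -> float:
    """Per-user monthly cost of the setup, excluding the domain."""
    cost = GOOGLE_WORKSPACE_STARTER
    if slack_pro:
        cost += SLACK_PRO
    # Assumption: Jira and Confluence are billed for all users past the free limit.
    if team_size > ATLASSIAN_FREE_USER_LIMIT:
        cost += JIRA_PAID + CONFLUENCE_PAID
    return cost


def monthly_cost_for_team(team_size: int, slack_pro: bool = False) -> float:
    """Total monthly cost for the whole team, with the domain spread over 12 months."""
    per_user = monthly_cost_per_user(team_size, slack_pro)
    return team_size * per_user + DOMAIN_PER_YEAR / 12
```

For a crew of five on the free Slack tier, `monthly_cost_per_user(5)` gives the $6 figure above, and `monthly_cost_per_user(12, slack_pro=True)` gives the $28 figure for a bigger team on Slack Pro.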
Next week, we’ll get into standardizing operating systems and setting up flexible computing. Stay salty until then, my machine learning friends.