5 “Must-Haves” for Your Startup’s Repo Structure

Sarah Smith
6 min read · Mar 1, 2020

Photo by NESA by Makers on Unsplash

This is my highly opinionated take on what new startup projects should have in their repository structure. Getting this right from the first day of hacking, or catching it later when you have a chance to do some catch-up, can save you big time when trouble comes knocking.

Each one of these must-haves comes from hard-won experience. You’ll have different takes on them, so I don’t mandate anything here.

Obviously it also changes with the language or tech stack, and with web vs mobile vs whatever other area you’re working in. Please read this along with resources like these ideas from Stack Overflow (where it’s pointed out that what is productive for you and your team is best) and this suggested structure from the Go-Lang folks.

For team dynamics I am a huge believer in always baking processes and good culture into your tooling, so that you don’t have to require lazy programmers to follow them: it just works. Good repository structure and artifacts can help.

Number five: scripts

Get things done by putting them into scripts as often as possible. Even if a script doesn’t always run out of the box, and you sometimes need to cut-and-paste it into a terminal or cherry-pick lines from it, putting those tools into source control here is a great way to boost team productivity. Get your team to share their knowledge by doing this as well.

Avoid littering your top-level project directory with build scripts by putting them in a hierarchy below the scripts directory. Typically scripts here require that you first change into this directory before running them, which is another good way to keep artifacts from running the scripts out of the source tree. If those artifacts overwrite or get mixed in with source you can have horrible problems down the track.
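Here is a minimal sketch of that "run from the scripts directory" habit, baked into the script itself so nobody has to remember it. The helper names `enter_script_dir` and `output_dir`, and the `out` folder name, are my own inventions for illustration and not part of any standard tool.

```python
import os
from pathlib import Path


def enter_script_dir(script_file):
    """Change into the directory that holds the script and return that path."""
    script_dir = Path(script_file).resolve().parent
    os.chdir(script_dir)
    return script_dir


def output_dir(name="out"):
    """Create (if needed) an output folder relative to the current directory.

    The folder name "out" is an assumption; use whatever your team ignores
    in source control.
    """
    out = Path.cwd() / name
    out.mkdir(exist_ok=True)
    return out
```

At the top of a hypothetical scripts/build.py you would call `enter_script_dir(__file__)` and then `output_dir()`, so anything the script writes lands under scripts/out rather than among your source files.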

Scripts that get run by your CI system should go in here too. Avoid as far as possible storing scripts in your CI server’s configuration. Note that Travis, Circle and other build and CI systems that use a repo-side driver file typically keep those files in their own custom folder, named something other than scripts. This number five point covers those too.

Number four: examples

Also known as demos or tutorials, this directory is a must-have not just for external projects but also for private repositories, where it can be vital for onboarding new team members.

Runnable, worked examples, especially if you have your CI system check that they at least build, are a fantastic way to make your code modules user friendly and get teams productive with them faster.
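As a rough sketch of that "at least build" check, assuming your examples are Python files: for an interpreted language the nearest thing to a build is confirming each file compiles without syntax errors. The function name `broken_examples` is hypothetical, and for a compiled stack you would call your real build tool instead.

```python
from pathlib import Path


def broken_examples(examples_dir):
    """Return the example files under examples_dir that fail to compile."""
    failures = []
    for path in sorted(Path(examples_dir).rglob("*.py")):
        source = path.read_text(encoding="utf-8")
        try:
            # compile() only parses the file; it does not run the example
            # and does not write any .pyc files into the tree.
            compile(source, str(path), "exec")
        except SyntaxError:
            failures.append(path)
    return failures
```

A CI step could call `broken_examples("examples")` and fail the build if the list is non-empty. This only catches syntax errors, so actually running the examples is still the stronger check.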

If your team goes to a conference, or presents at a meetup or client premises, whatever they produced for that should go in here. Don’t lose the work they did for that one-off session, and keep it live and building against HEAD so you can use it again.

Number three: docs

In the Python world your rst files for Sphinx go in here, along with the templates and whatever else you need to build your doc tree.

Or it can be just a bunch of README.md files that source in graphics.

Whatever they are, they should be plain text files (that is, not binary files like MS Word or PDF), and if they’re buildable they should be built as part of your CI and CD process.

Does your team do PRs? Pretty much every one they submit should involve a change touching files in here, otherwise what front-facing value did they add?
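If you want tooling to nudge people on this, a small check like the following sketch could run against the list of files a PR changes. I am assuming you get that list from something like `git diff --name-only`, with forward-slash paths relative to the repo root. The function name `touches_docs` and its defaults are illustrative only.

```python
from pathlib import PurePosixPath


def touches_docs(changed_files, doc_roots=("docs",), doc_files=("README.md",)):
    """Return True if any changed path is a doc file or sits under a doc root."""
    for name in changed_files:
        path = PurePosixPath(name)
        # Only the first path component counts, so src/docs_helper.py
        # is not mistaken for documentation.
        if path.parts and path.parts[0] in doc_roots:
            return True
        if str(path) in doc_files:
            return True
    return False
```

I would make this a warning rather than a hard failure, since some PRs (a pure refactor, say) really do have nothing to document.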

Number two: README.md

This must-have file should let newbies to the project know how to build it, including pulling down all dependencies and setting up all the tools required on their local machine.

That may mean linking to Dockerfiles or whatever is needed to get a base image up, for web projects; or installing toolchains and IDEs if it’s a binary project.

This is again a touchstone for PRs that change anything about the project’s build process.

README.md is a number one quality flag for me for the health of a repo. If there are signs it’s out of date, or it doesn’t have everything that folks new to the project need to build, it’s a canary in the coal mine that lack of ownership and pride in work is creeping in.

Watch for engineers linking out of the top-level README.md file to their newly added functional areas. If they aren’t, then it’s a sign of “works on my machine” thinking.

Number one: 3rd-party

Along with this directory are the very important files LICENSE.txt and potentially license.header (which new source code files should include).
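A header is easy to forget, so here is a sketch of a check you could wire into CI. It assumes license.header holds the exact comment block as it should appear at the top of each source file, and that your sources are Python files. Both are assumptions for the sake of the example, and `files_missing_header` is a made-up name.

```python
from pathlib import Path


def files_missing_header(src_dir, header_file, suffix=".py"):
    """Return source files under src_dir that do not start with the header."""
    header = Path(header_file).read_text(encoding="utf-8").strip()
    missing = []
    for path in sorted(Path(src_dir).rglob(f"*{suffix}")):
        text = path.read_text(encoding="utf-8")
        # Leading blank lines are tolerated; a shebang line before the
        # header is not handled by this simple sketch.
        if not text.lstrip().startswith(header):
            missing.append(path)
    return missing
```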

If your startup gets acquired or investors want to do due diligence, one thing that will happen is an army of suits with scary briefcases will walk in and go through your company’s assets with a fine-tooth comb. And if you’re a software startup, the number one asset is your software.

You say your company is valued at $X and most of that value is the software you wrote, right? It’s awesome code!

But what if that code is actually owned by someone else? What if the scary briefcase folks go back with a report that your code is actually inextricably intertwined with third-party code in a way that means you can’t answer basic questions about which stuff you & your team wrote?

This is why you need to structure your repository to have a third-party directory at the top level, with a sub-directory in there for each project whose code you are using.

All code that is outside of the third-party directory must be original code written by you and your team.

Corollary: If anyone on your team submits a PR that puts code not written by them into a directory outside of third-party, in my book that is grounds for a performance review. Or it’s a screw-up by you as CTO if you didn’t communicate to them how serious this is.

Each project under third-party should include the actual text of the license used, and a README with details of how to get the code, where it is from and so on.
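This rule is simple enough to enforce mechanically. The sketch below walks a third-party directory and reports any project missing a license file or a README. The accepted file names are my guess at common conventions, not a standard, so adjust them to what your team actually uses.

```python
from pathlib import Path

# Assumed naming conventions; extend these sets to match your repo.
LICENSE_NAMES = {"LICENSE", "LICENSE.txt", "LICENSE.md", "COPYING"}
README_NAMES = {"README", "README.md", "README.txt"}


def third_party_problems(third_party_dir):
    """Map each project folder name to the list of things it is missing."""
    problems = {}
    projects = sorted(p for p in Path(third_party_dir).iterdir() if p.is_dir())
    for project in projects:
        names = {child.name for child in project.iterdir() if child.is_file()}
        missing = []
        if not names & LICENSE_NAMES:
            missing.append("license")
        if not names & README_NAMES:
            missing.append("readme")
        if missing:
            problems[project.name] = missing
    return problems
```

An empty result means every project folder carries both files. It says nothing about whether the license text is the right one, which still needs a human eye.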

Sometimes authors of open source projects change their license, so you go back to the project and find it’s no longer available under the permissive license you relied on. If you have a copy of the original license committed into your source control, you have proof that you have a valid license to use it.

Artifactory, Maven and other Dependency tools

You should avoid large files in your repository at all costs. They’ll grind your CI to a crawl, and create dependency hell. Look for a solution like Artifactory or Git LFS (Large File Storage) instead. So you’re doing this, right?
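To catch large files before they get committed, something like this sketch could run in CI or as a pre-commit step. The 5 MB default is an arbitrary threshold I picked for illustration, and `large_files` is a hypothetical helper rather than a feature of Git or Artifactory.

```python
from pathlib import Path


def large_files(root, limit_bytes=5 * 1024 * 1024):
    """Return files under root bigger than limit_bytes, skipping .git."""
    root = Path(root)
    found = []
    for path in sorted(root.rglob("*")):
        # Git's own object store is not part of your working files.
        if ".git" in path.relative_to(root).parts:
            continue
        if path.is_file() and path.stat().st_size > limit_bytes:
            found.append(path)
    return found
```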

And if you’re working in the web world with JavaScript then you’ll be using Node and npm, so doesn’t this get you out of having to worry about third-party?

Well, to a degree. It’s the same with Carthage for Swift and Objective-C, and Maven for Java. You’ll have a file that documents your dependencies, so you can avoid having to write them up yourself.

I would strongly counsel using a third-party directory though, for the licensing reasons above. Snippets of code taken from open source projects should definitely be documented here if they’re more than 1–2 lines.

Likewise, even if your repository is private, the chance that it will somehow get shared or taken off the premises is always there. Having the license inside the repository is key.

Good luck with your Startup!

I hope these tips help.

There are lots more things I put in my repositories that I didn’t have room to list here.

Let me know in the comments if you have any of your own “must have” items. Thanks for reading.

Sarah Smith is a writer & app developer.