> Moving to a monorepo didn't change much, and what minor changes it made have been positive.
I'm not sure that this statement in the summary jibes with this statement from the next section:
> In the previous, separate repository world, this would've been four separate pull requests in four separate repositories, and with comments linking them together for posterity.
>
> Now, it is a single one. Easy to review, easy to merge, easy to revert.
IMO, this is a huge quality-of-life improvement and prevents a lot of mistakes that come from not having the right revision synced down across different repos. This alone is a HUGE improvement: a dev can no longer accidentally end up with one repo on a feature branch while forgetting to pull another repo to the matching branch, and then hit weird issues because of that basic hassle.
When I've encountered this, we've had to use yet another repo just to keep the scripts that managed it. But that was also problematic at times, because either each developer's setup had to be identical on their local file system (for the scripts to work), or we each had to create a config file pointing to where every repo lived (roughly the kind of script sketched below).
This also impacts tracking down bugs and regression analysis; both are much easier to manage in a monorepo setup, because you can get everything at the same revision instead of juggling the synchronization of multiple repos just to figure out where something broke.
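Concretely, the script in question looked something like this: a hypothetical sketch (the config file name, its layout, and the repo names are all invented for illustration) that reads a per-developer config mapping each repo to its local path and checks them all out at the same ref, which is exactly the bookkeeping a monorepo makes unnecessary.

```python
#!/usr/bin/env python3
"""Hypothetical multi-repo sync script: put every repo on the same ref.

A sketch of the kind of glue tooling described above, not a real project.
The config path and format are invented for illustration.
"""
import json
import subprocess
import sys
from pathlib import Path

# Per-developer config, e.g. ~/.repos.json:
#   {"api": "~/src/api", "web": "~/src/web", "shared": "~/src/shared"}
CONFIG = Path.home() / ".repos.json"


def sync_all(ref: str) -> None:
    repos = json.loads(CONFIG.read_text())
    for name, path in repos.items():
        repo_dir = Path(path).expanduser()
        print(f"--- {name}: {repo_dir} -> {ref}")
        # Fetch and check out the same branch/tag in every repo, so nobody
        # ends up with one repo on the feature branch and another on main.
        subprocess.run(["git", "-C", str(repo_dir), "fetch", "--all", "--tags"], check=True)
        subprocess.run(["git", "-C", str(repo_dir), "checkout", ref], check=True)
        # Fast-forward branches to the remote; failures (e.g. detached HEAD
        # when ref is a tag) are deliberately ignored.
        subprocess.run(["git", "-C", str(repo_dir), "pull", "--ff-only"], check=False)


if __name__ == "__main__":
    sync_all(sys.argv[1] if len(sys.argv) > 1 else "main")
```

The fragile part is that the config has to exist and be correct on every machine; in a monorepo, a single `git checkout <ref>` does all of this.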
Every monorepo I've ever met (n=3) has some kind of radioactive DMZ that everybody is afraid to touch, because it's not clear who owns it, but it is clear from its quality that you don't want to be the last person who touched it, because then maybe somebody will think that you own it. It's usually called "core" or somesuch.
Separate repos for each team mean that when two teams own components that need to interact, they have to expose a "public" interface to the other team--which is the kind of disciplined engineering work we should be striving for. The monorepo alternative is that you solve it in the DMZ, where it feels less like engineering and more like some kind of multiparty political endeavor, with PR reviewers of dubious stakeholder status using the exercise to further agendas that are unrelated to the feature except that it somehow proves them right about whatever architectural point is recently contentious.
Plus, it's always harder to remove something from the DMZ than to add it, so it's always growing, and there's this sort of gravitational attractor which eventually starts warping time such that PRs take longer to merge the closer they are to it.
Better to just do the "hard" work of maintaining versioned interfaces with documented compatibility (backed by tests). You can always decide to collapse your codebase into a black hole later--but once you start on that path you may never escape.
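To make "versioned interfaces with documented compatibility (backed by tests)" concrete, here's a minimal sketch--the module, names, and version scheme are all hypothetical--where the owning team publishes a small, explicitly versioned surface and the consuming team pins it with a contract test (runnable under pytest), so breakage shows up as a failing test rather than a DMZ negotiation.

```python
"""Sketch of a versioned public interface between two teams (hypothetical names)."""
from dataclasses import dataclass
from typing import Protocol

# The owning team commits to this contract and bumps the major version
# whenever it changes incompatibly; everything behind it stays private.
BILLING_API_VERSION = (2, 1)


@dataclass(frozen=True)
class Invoice:
    customer_id: str
    amount_cents: int


class BillingApi(Protocol):
    """The only surface other teams are allowed to depend on."""

    def create_invoice(self, customer_id: str, amount_cents: int) -> Invoice: ...


# --- the consuming team's compatibility test ---------------------------------
def test_billing_api_is_a_version_we_support() -> None:
    # Fail loudly if the provider ships a major version this client wasn't
    # built for, instead of discovering it at runtime inside shared "core" code.
    major, _minor = BILLING_API_VERSION
    assert major == 2, "BillingApi v3+ is incompatible; update this client first"
```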
Repository boundaries are affected far more by the social structure of your organisation than by anything technical.
Do you want hard boundaries between teams — clear responsibilities with formal ceremony across boundaries, but at the expense of living with inflexibility?
Do you want fluidity in engineering, with no fixed silos and a flat org structure that encourages anyone to take on whatever is important to the business right now, but with the much bigger overhead of needing strong people leaders capable of herding the chaos?
I’m sure there are dozens of other examples of org structures and how they are reflected in code layout, repo layout, shared directories, dropboxes, chat channels, email groups, and so on.
I love monorepos, but I'm not sure Git is the right tool beyond a certain scale. Where I work, a simple `git status` takes seconds due to the size of the repo. There have been various attempts to solve Git performance, but so far nothing comes close to what I experienced at Google.
The Git team should really invest in tooling for very large repos. Our repo is around 10M files and 100M lines of code, and no amount of hacks on top of Git (caching, sparse checkout, and so on; sketched below) really solves the core problem.
Meta and Google have really solved this problem internally, but there is no real open-source solution out there that works for everyone.
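For what it's worth, those hacks are roughly the following: partial clone, cone-mode sparse checkout, the builtin filesystem monitor, and background maintenance. This is a sketch assuming a reasonably recent Git (2.37+ for the builtin fsmonitor); the URL and directory names are placeholders, and it's wrapped in Python only so the commands sit in one runnable place. These make `git status` and fetches tolerable, but they don't change Git's fundamental model the way Meta's and Google's internal systems do.

```python
"""Common mitigations for very large Git repos: a sketch, not a fix.

Assumes Git >= 2.37 (builtin fsmonitor); repo URL and directories are placeholders.
"""
import subprocess


def run(*args: str) -> None:
    print("+", " ".join(args))
    subprocess.run(args, check=True)


def clone_big_repo(url: str, path: str, *dirs: str) -> None:
    # Partial clone + sparse: fetch blobs lazily, start with only top-level files.
    run("git", "clone", "--filter=blob:none", "--sparse", url, path)
    # Cone-mode sparse checkout: materialize only the directories you work in.
    run("git", "-C", path, "sparse-checkout", "set", "--cone", *dirs)
    # Builtin filesystem monitor + untracked cache: speeds up `git status`.
    run("git", "-C", path, "config", "core.fsmonitor", "true")
    run("git", "-C", path, "config", "core.untrackedCache", "true")
    # Background maintenance: commit-graph writes, prefetch, incremental repack.
    run("git", "-C", path, "maintenance", "start")


if __name__ == "__main__":
    clone_big_repo("https://example.com/huge.git", "huge",
                   "services/payments", "libs/common")
```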
Without indicating my personal feelings on monorepo vs polyrepo, or expressing any thoughts about the experience shared here, I would like to point out that open-source projects have different and sometimes conflicting needs compared to proprietary closed-source projects. The best solution for one is sometimes the extreme opposite for the other.
In particular, many build pipelines involving private sources or artifacts become drastically more complicated than those of their publicly available counterparts.