I wish the mods hadn't changed the title; I chose the original title to focus more on what's cool about it.
Anyway, Harvey is distributed in the same way that Plan 9 is/was distributed: the services are meant to be run across a network. On a properly set up network, you'll have an authentication server which does only authentication, because it holds the keys to the kingdom. Similarly, you'll have a file server which ONLY serves files, and speaks to the auth server to manage access. Then you'll have a CPU server which lets remote users connect and do stuff; it mounts its root filesystem from the file server and authenticates users with the auth server. You can also have terminal machines, which typically netboot, get their root from the file server, authenticate with the auth server, and basically act as a personal workstation for whoever sits down; when you're done working, you just save your work and reboot the terminal.
Of course it doesn't have to be run like that and many people don't, because they don't want to run 4+ systems. You can combine auth, FS, and CPU services onto one machine and just connect from a Windows, Linux, or Mac machine using 'drawterm' (think of it as an ssh equivalent).
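To give an idea of what that looks like in practice, connecting from another OS is roughly a one-liner. The hostnames here are placeholders and I'm going from memory on the classic drawterm flags:

    drawterm -a auth.example.com -c cpu.example.com -u glenda

It authenticates against the auth server and drops you into a session on the CPU server, with your local keyboard, mouse, and screen exported to it.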
Honest question, not trying to be dismissive: this architecture sounds old to me, as in, things were built like that in the 80s or earlier, but things evolved past that. Is that so? If so, what makes those decisions newly relevant?
The architecture does indeed come from the late 80s/early 90s, but I think it's more relevant today than ever. Separation of services is, in my opinion, essential to security. By putting the authentication service off on its own machine, you restrict the attacks that can be made on it; the auth server only talks over a highly restricted protocol. On a standalone Unix system, users log in to the same machine that stores the passwords. They're only a privilege escalation exploit away from getting the hashes of everyone's passwords, and these days privilege escalations are a dime a dozen.
When this scheme was designed, it was frankly a little bit nutty. The CPU, auth, and file servers would be VAX or Sun systems, costing tens of thousands of dollars, and the terminals would be either slightly cheaper Suns or IBM PC-compatibles costing thousands of dollars themselves. Today, you could probably cobble together a passable network for a few hundred dollars, assuming you use cheap Atom boards for everything except the CPU server (which is meant to be a beefy compute box, but let's be honest, nothing in Plan 9 uses a lot of cycles). This makes the architecture more sensible than ever.
It seems a bit older than that. Many of the design goals for Harvey were already present in the early MULTICS systems of the 70's. In some ways I view Plan9 as Bell Labs' attempt to bring the great components of MULTICS and UNIX together.
I don't mean to be argumentative. MULTICS was in use by Honeywell Aerospace into the early 90's, simulating physics models.
There was a centralized computer, programmed via punch cards, which you could dial into to submit your specific simulation and later return to for the results.
Good point. Multics was used just as much, or as little, as Plan 9 was. We might wish our predecessors had made the jump to distributed, securable operating systems, but there was never a time at which it made sense to do so.
You're absolutely correct, and like I said a lot of people run their Plan 9 systems with all the services on one box, which kills a lot of the security advantages.
However, if you compare setting up a Plan 9 auth server to setting up a Kerberos server... well, basically anything to do with Kerberos makes me long for death. The Plan 9 auth system is one of the best things they made and I highly recommend checking out the paper: https://css.csail.mit.edu/6.858/2013/readings/plan9auth.pdf
If you emulate a dozen machines on one physical machine, then a single exploit can traverse them all. If you pack a dozen "single board computers" in a case and give each a single function, then entire classes of attack are ruled out.
> On a standalone Unix system, users log in to the same machine that stores the passwords. They're only a privilege escalation exploit away from getting the hashes of everyone's password
But what machines are, in practice, multiuser today in that way? My work computer has only my account, and sometimes a temporary one if a technician needs to log in to troubleshoot something. For home use I don't think it makes sense. And for a prod network, again users aren't logging into the machines directly.
Here is an excerpt from the about page. I don't know enough to argue for or against the architecture, but this part caught my eye.
"those who believed in Harvey were thought to be insane, but eventually found not to be."
So they recognize that many will think it is an insane idea but are playing a long game to prove that to be incorrect.
================================
About the Name
So, why “Harvey”? Mostly because we liked it, but there are at least 3 other good reasons based on the origins of Plan 9 operating system:
Harvey is a movie, Plan 9 is a movie too.
Harvey is a rabbit, Glenda is a bunny.
Most importantly: in the movie, those who believed in Harvey were thought to be insane, but eventually found not to be.
>This architecture sounds old to me, as in things were built like that in the 80s or earlier but evolved past.
Actually, most modern OSes in use are based on even older concepts.
Plan 9's concepts as described above have slowly crept into Linux, but not fully. So the architecture above was only experimental in the 80s/90s and is still nowhere to be found in the mainstream today.
So your concern is like saying "we've had macros and closures since the 70s" in a world that still uses Java/C#/etc.
The idea does indeed come from the 80's, but things didn't evolve past it. The architecture of our current OSes was settled in the 60's and early 70's, and we never moved away from it.
It is not clear why such things were never adopted. There were many licensing problems along the way, and a Microsoft monopoly pushing things toward even older architectures than the competition's. There were also failed modern projects, but there is little evidence to decide whether it is a hard problem, whether greedy people forced everybody onto a worse path, or whether it is something people simply do not want.
One nice thing about Plan9 is that there is no concept of "root".
Plan9 "fixes" many of the flaws/deficiencies of the original UNIX, such as that one.
What's cool about this Plan9 project compared to the original Plan9 is that one does not need to use the Plan9 compiler to compile it. One can use gcc, clang, or icc.
The key phrase in my question is "things evolved past that". Age doesn't matter, but if things are done in a different way now, it's for a reason, not merely for novelty.
That would make it an improvement over whatever else you are using, which is most likely running on an architecture designed in the 60's and first implemented in the 70's.
> you'll have an authentication server which does only authentication
> you'll have a file server which ONLY serves files
These sound like disadvantages. Decentralization would be better. I'd like to share the storage of all my servers and not have one auth server as a single point of failure. I imagine you can set up a redundant auth server at the cost of more hardware, but why not decentralize? This seems a lot like the old way of doing things.
And indeed, one of them may be local to the computer, serving local drives, and another could serve the network resources. Heck, it could even allow file servers with different security settings (e.g. the USB drives might be mounted in a hostile file server space).
I mean, it sounds a bit like 'software-defined computing', if you'll excuse the terrible metaphor; a bit like SDN, which abstracted the physical networking layer.
Does Harvey abstract the hardware layer? So it could theoretically scale to a huge number of machines that look like one giant powerful one?
Wouldn't the speed of operation be limited by the speed of the network, though?
Anyway, sorry if the questions sound silly, I don't know much about this stuff.
The biggest mistake, IMO, was not clarifying what a distributed OS is supposed to be. Especially nowadays, with all the cloud hype, I could think of at least three different meanings right off the bat. I clicked all the links on the landing page, erratically browsed the wiki, and didn't find anything, not even an external (Wikipedia) link, which got me quite annoyed at that point.
Plan9 was the 'next version' of Unix, made by the people who originally made Unix. It was a small (tiny!) network packet routing kernel (routing 9p, a layer above IP) that was meant to be fully distributed and networked.
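If it helps to make "a layer above IP" concrete, here's a rough sketch (in Go, purely illustrative; the hostname is a placeholder) of the version handshake that opens every 9P2000 conversation. It's just a small length-prefixed binary message sent over an ordinary TCP connection:

    package main

    import (
        "encoding/binary"
        "fmt"
        "net"
    )

    func main() {
        // Hypothetical server address; 564 is the conventional 9P/Styx port.
        conn, err := net.Dial("tcp", "fs.example.com:564")
        if err != nil {
            panic(err)
        }
        defer conn.Close()

        // 9P2000 Tversion message: size[4] type[1] tag[2] msize[4] version[s]
        // (integers are little-endian; a string is a 2-byte length plus bytes).
        version := "9P2000"
        msg := make([]byte, 4+1+2+4+2+len(version))
        binary.LittleEndian.PutUint32(msg[0:4], uint32(len(msg))) // total message size
        msg[4] = 100                                              // type: Tversion
        binary.LittleEndian.PutUint16(msg[5:7], 0xFFFF)           // tag: NOTAG
        binary.LittleEndian.PutUint32(msg[7:11], 8192)            // msize we propose
        binary.LittleEndian.PutUint16(msg[11:13], uint16(len(version)))
        copy(msg[13:], version)

        if _, err := conn.Write(msg); err != nil {
            panic(err)
        }

        // Read the Rversion reply (type 101) and print what came back.
        reply := make([]byte, 8192)
        n, err := conn.Read(reply)
        if err != nil {
            panic(err)
        }
        fmt.Printf("got %d bytes, message type %d\n", n, reply[4])
    }

Files, devices, and most other resources in Plan 9 are reached by exchanging messages in that same format, which is what makes the distribution story so uniform.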