Hi

This email is rather long and contains mostly technical background and 
opinions about the stated cons of the buildmaster setup. Feel free to 
skip it if you're not interested in the workings of my current setup or 
why I find it sensible.

On 19.06.2016 06:13, Alexander von Gluck IV wrote:
> I worked on getting the haikuporter build-master installed in a public VM tonight and
> ran in to quite a few concerns about it.

Seriously? I mean, that is exactly what I've set up and what has been 
running for the last couple of months. I understood the goal was to put it 
on Haiku infrastructure and make it official, not to build it up in 
private again.

> The number one issue I saw was the requirement
> that each Haiku build-slaves be accessible via IP + SSH

Seriously? It uses SSH. The only dependency here is a TCP connection. 
TCP can be tunnelled and forwarded at will with whatever tool fits. I 
personally use reverse SSH tunnels to hook up my builders. But you can 
use stunnel or just plain netcat as the connection will be encrypted 
anyway. I find SSL generally to be more of a hassle to set up than SSH, 
so I just use that.
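
A quick way to confirm that this one requirement is met is to connect to
the port and look for the identification line that an SSH server sends
first. The following is a minimal Python sketch and not part of
HaikuPorter; both function names are made up for illustration, and
localhost:8124 in the usage comment is simply the forwarded port from the
configuration at the end of this email.

```python
import socket


def parse_ssh_banner(data):
    """Return the first line starting with "SSH-" in data, or None."""
    for raw_line in data.split(b"\n"):
        line = raw_line.strip().decode("ascii", "replace")
        if line.startswith("SSH-"):
            return line
    return None


def read_ssh_banner(host, port, timeout=5.0):
    """Connect to host:port and return the SSH identification line, or None."""
    data = b""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            while len(data) < 1024:
                chunk = conn.recv(256)
                if not chunk:
                    break
                data += chunk
                # Only look at complete lines (everything before the last newline).
                complete = data.rsplit(b"\n", 1)[0] if b"\n" in data else b""
                banner = parse_ssh_banner(complete)
                if banner:
                    return banner
    except OSError:
        # Refused, unreachable or timed out: nothing usable on that port.
        return None
    return None


# Usage (on the server, with the tunnel up):
#     print(read_ssh_banner("localhost", 8124))
```

If this returns a line starting with "SSH-", the buildmaster has all the
connectivity it needs, no matter how the port got there.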

> Given these machines generally run behind user NAT's, and is "single-shot" I think the
> haikuporter build-master might be *too* simplistic.

The SSH connections are closed when the buildrun is through, yes. That 
doesn't mean that the SSH servers are suddenly going to disappear and 
need to be set up again for the next buildrun. You make it sound like 
there's some kind of manual work involved here, which there isn't.

Regarding reverse SSH tunnelling:
> I can't find any documentation, arguments, or code on this... it seems this should be
> the default behavior for the outlines reasons. (Since we need one complete haikuporter
> build-master per architecture, how would this even work?)

I'm really getting a strange feeling here. You make it sound like I'm 
hiding some dark secret. But this really is plain basic networking. You 
need to make a TCP port available where an SSH server can be reached. 
You can do that through a static IP and a port forward through your NAT 
on your home router. You can also make that port available by forwarding 
it through a reverse SSH tunnel or similar setup. This is not something 
I just invented; it has been in use for decades.

The general approach was to not reinvent stuff. HaikuPorter needs a way 
to remotely execute commands on the builder. A remote shell like SSH 
fits that bill, so I used that (through paramiko) to leverage the fact 
that we ship an SSH server with every Haiku nightly ready to use.
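
For illustration, remote execution through paramiko boils down to roughly
the following. This is a minimal sketch and not the code in
BuildMaster.py: connection_kwargs and run_remote are hypothetical
helpers, the dictionary keys are taken from the "ssh" section of the
builder configuration in [3], and the host key file is assumed to be in
OpenSSH known_hosts format.

```python
def connection_kwargs(ssh_section):
    """Map the "ssh" section of a builder configuration to connect() arguments."""
    return {
        "hostname": ssh_section["host"],
        "port": int(ssh_section["port"]),
        "username": ssh_section["user"],
        "key_filename": ssh_section["privateKeyFile"],
    }


def run_remote(ssh_section, command):
    """Run one command on the builder and return (exit_status, output)."""
    # Third-party dependency; imported here so the helper above works without it.
    import paramiko

    client = paramiko.SSHClient()
    # Assumption: hostKeyFile is a known_hosts style file. Unknown hosts are rejected.
    client.load_host_keys(ssh_section["hostKeyFile"])
    client.set_missing_host_key_policy(paramiko.RejectPolicy())
    client.connect(**connection_kwargs(ssh_section))
    try:
        _stdin, stdout, _stderr = client.exec_command(command)
        output = stdout.read().decode("utf-8", "replace")
        return stdout.channel.recv_exit_status(), output
    finally:
        client.close()
```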

Of course you can build remote execution using your own protocol and 
wrap that within SSL to secure it and implement yet another way to do 
the same thing. I just didn't find it useful to build such a protocol 
directly into HaikuPorter as that IMHO would just be bloat that would 
need to be maintained.

> If anyone wants to try and document it let me know and i'll give you access to a
> buildmaster *and* a remote buildslave.

To make this very obvious, I'm going to insert the full text of 
everything I'm discussing here at the end of this email.

For the reverse SSH tunnel I'm using a script [1] with a corresponding 
configuration [2]. This configuration forwards port 22 of my builder 
(where the local SSH server listens) to port 8124 on the server, which is 
the port the buildmaster is configured to connect to (see the full builder 
configuration in [3]). It does that with a weak but fast cipher because 
it is going to only provide a tunnel for another, more strongly 
encrypted SSH connection. The builder configuration was created with the 
createbuilder.sh script, which I've committed to the HaikuPorter 
repository under the buildmaster directory and which automates the 
configuration. The only thing I had to do to set this builder up was to 
name the HaikuPorter and HaikuPorts directory, host, port and user for 
the SSH connection (using localhost:8124 for the forwarded port) and 
finally add the automatically generated SSH public key to the 
authorized_keys of the desired user on my builder.
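
To show how little input that is, here is a minimal Python sketch that
assembles a configuration shaped like [3] from those few answers. It is
not what createbuilder.sh actually does: builder_config is a hypothetical
helper, and the derived packages paths, the "keydir" name and the -j
default are assumptions read off the example in [3].

```python
import json


def builder_config(name, host, port, user, ports_path, haikuporter_path,
                   jobs=2, key_dir="keydir"):
    """Build a builder configuration dictionary shaped like the example in [3]."""
    return {
        "name": name,
        "ssh": {
            "host": host,
            # The example file stores the port as a string, so do the same here.
            "port": str(port),
            "user": user,
            "privateKeyFile": f"{key_dir}/{name}.key",
            "hostKeyFile": f"{key_dir}/{name}.hostkey",
        },
        "portstree": {
            "path": ports_path,
            # Assumed defaults: packages live below the ports tree.
            "packagesPath": f"{ports_path}/packages",
            "packagesCachePath": f"{ports_path}/packages/.cache",
        },
        "haikuporter": {
            "path": haikuporter_path,
            "args": f"-j{jobs}",
        },
    }


def dump_config(config):
    """Serialize a configuration as indented JSON text."""
    return json.dumps(config, indent=4)
```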

ssh_tunnel.sh is symlinked into my ~/config/settings/boot/launch so 
that it starts automatically on boot. For obvious reasons I am going to 
leave out the private key used by that configuration. The tunnel user on 
the server is set up with git-shell to prevent normal shell access. The 
authorized_keys file [4] further limits the possible actions to pretty 
much just port forwarding, which is all that is needed here.

Why didn't I document that as the official way to set up a builder? 
Because I find this to be an implementation detail that is entirely up 
to the operator of the server and builder. How the connection is 
established does not matter to HaikuPorter and the choice of 
infrastructure to make it happen is a matter of various factors 
including security and trust concerns, available tools and personal 
preference. The described setup can be used for pretty much any port 
forwarding need and is not in any way specific to HaikuPorter (and 
wasn't written for this use case either, it stems from the setup I've 
implemented at work to do most of our remote support).

Another person would maybe prefer not to create a user on the server at 
all and configure stunnel to do SSL tunnels instead. This would work 
just as well.

> I think we're all assuming haikuporter build-master is a lot more magic than it actually
> is. Some great work has been put into it, but I want to make sure there is consensus
> that haikuporter build-master is the way to go.

Why are you assuming that everyone's assuming magic here? It's pretty 
far away from magic or being a black box. It's a single source file 
within HaikuPorter with ~800 lines of code [5]. How much magic can there be?

This is of course not by accident. Indeed the whole point was for it to 
do as little as possible by leveraging existing tools: the dependency 
logic and recipe handling inside HaikuPorter itself, SSH for remote 
command execution, SFTP for file transfers and a plain web server for 
serving out status (apache httpd in my setup).

I generally find that if something seems like magic one just doesn't yet 
fully understand how it works.

> https://github.com/haikuports/haikuporter
> haikuporter build-master mode  (mmlr)
>    Pro
>     - Python which has good community knowledge
>     - Fully leverages haikuporter internal logic for dependencies
>     - Builds repos
>    Cons
>     - SSH's out to slaves and requires user to open ssh port per slave. (and static ip)

See above.

>     - Requires haikuporter + haikuports on master and each slave (does haikuports have to be in sync?)

Yes, obviously HaikuPorter is needed because in this setup both the 
buildmaster logic and the builder are implemented in it. The builder 
uses HaikuPorter in all of these setups, so I don't see why it's listed 
as a con for this setup.

HaikuPorts is obviously also needed (in all of the setups as well). 
HaikuPorts needs to be in sync on the buildmaster and the builder. The 
buildmaster ensures that automatically.

>     - Difficult slave configuration + lots of directory settings per slave

The createbuilder.sh script asks you a couple of questions that you have 
to answer. All questions that can have a sensible default have one. I 
don't exactly see how this is classified as difficult. It even automates 
creation of all the necessary SSH keys and queries the remote host key 
for you. The only directories that you have to specify are where on the 
builder HaikuPorter and the HaikuPorts tree can be found. All the other 
directories have defaults that you can just accept and it will work fine.

>     - Doesn't know about architectures of buildslaves (one entire environment for each arch)

I don't understand? All setups will need builders for the different 
architectures. Conveniently the fully host independent chroot in this 
setup will allow you to run builds for different versions/branches on 
the same builder (as long as that one is reasonably compatible) as no 
system packages are used at all. So overall builder count should be 
reduced compared to the other setups.

If you mean there's one *entire environment for each arch*, then yes 
that is true. It consists of a HaikuPorts checkout and a builder 
configuration.

>     - *Basic* html report of each single-shot run.

I give up on this point. I've explained my reasoning for a JSON output 
numerous times. Maybe just think of it as serving out the "database" 
that the other approaches also have?

>     - Single shot for one package (or a bunch? --do-bootstrap seems broken here) and deps

You can do a buildrun for a single package, many packages, the packages 
that were affected by changes to recipes and referenced files, whatever 
you want. It is just a buildrun, what you put inside is decided by how 
you start it.

I've taken great care to make sure that this can run as a git hook or by 
comparing different git revisions by implementing the functionality to 
derive a set of affected recipes from a set of changed files. This 
includes things easily missed by a simpler approach, like a 
referenced license or additional file or patches.
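
The idea can be sketched in a few lines of Python. This is an
illustration under stated assumptions, not HaikuPorter's implementation:
affected_recipes and changed_files_between are hypothetical names, and
the mapping from each recipe to the files it references (patches,
licenses, additional files) is assumed to be given.

```python
import subprocess


def changed_files_between(repo_dir, old_rev, new_rev):
    """List paths that differ between two git revisions of a checkout."""
    result = subprocess.run(
        ["git", "diff", "--name-only", old_rev, new_rev],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def affected_recipes(changed_files, references):
    """Return the recipes touched directly or through a referenced file.

    references maps a recipe path to the set of paths that recipe refers to.
    """
    changed = set(changed_files)
    affected = set()
    for recipe, referenced in references.items():
        # A recipe counts as affected if it changed itself or if any
        # patch, license or additional file it references changed.
        if recipe in changed or changed.intersection(referenced):
            affected.add(recipe)
    return sorted(affected)
```

A change that only touches a patch or a license file still selects the
recipe that uses it, which is exactly the case a check limited to
*.recipe files would miss.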

The buildmaster/buildmaster.sh frontend script automates most of the 
common tasks (including updating to a new revision and building 
everything affected by the changed files). For reference I'm inserting 
the full text of my magic updateloop script in [6]. That's all there is 
to it for continuous building of changed/new recipes. The script is run 
in the HaikuPorts checkout on the server and takes everything it needs 
from there. Setting it up for a different branch means: just checking 
out that branch.

I don't understand the remark about --do-bootstrap. None of the setups 
are meant for bootstrapping. This is for continuous automated package 
building and publishing.

>     - Lots of requirements on build-master system (package, package_repo, haiku repo for licenses)

You actually listed all of them, so "lots" might be a bit of an 
exaggeration. The package and package_repo tools as well as any 
build_host libraries needed are a byproduct of building a standard Haiku 
image. You can also just build the two tools individually if you don't 
want to wait for a whole image to build.

That the licenses aren't duplicated as part of HaikuPorts is a bug in 
our setup IMHO. Relying on the presence of license files in the Haiku 
package without explicitly declaring that or bringing the license with 
your recipe is a bad practice IMO.

>     - Poor documentation (I've written whats out there now)

I've tried to outline the concepts a couple of times in my emails. In my 
job I am partly a sysadmin for various servers and set up a lot of 
machines and services, so obviously the tools used here don't seem 
strange to me at all. I understand that this doesn't necessarily apply 
to other people. However I would expect some sort of sysadmin background 
from a person running official Haiku servers and services as well.

> I'm all about microservices, but my main concern is this whole thing sounds
> like it is going to be held together via 30 cron jobs, 10 scripts in /usr/local/bin,
> and a few old men to log in and manually fix stuff every other day.

I wouldn't really say 30 is old and don't see why one would need to tend 
to an automated system every other day, but the rest sounds about right 
(maybe lower numbers, say 1 cron job or git hook and 3 or 4 scripts 
chained together). The difference seems to be in the interpretation of 
whether this is a good or a bad thing. I find shell scripts, at least 
reasonably structured ones, pretty obvious. Large do-it-all servers on 
the other hand can quickly drift in the "blackbox/magic" direction.

In my opinion this is still just a very modular and flexible setup that 
can easily be hooked into, just as I personally would expect from such a 
system.

Regards,
Michael

--

[1] - ssh_tunnel.sh

#!/bin/bash

cd "$(dirname "$0")"

exec 1>> ssh_tunnel.log 2>&1

function log {
         echo "$(date +%Y%m%d_%H%M) $1"
}

log "Starting SSH tunnel loop from $0 in $(pwd)"

while true
do
         log "Starting SSH process"
         ssh -nNT -F ssh_tunnel.config ssh_tunnel
         log "SSH process quit with status $?"
         sleep 5
done

[2] - ssh_tunnel.config

Host ssh_tunnel
         HostName                hpkg.mlotz.ch
         User                    tunnel
         BatchMode               yes
         ConnectionAttempts      3
         ConnectTimeout          15
         ExitOnForwardFailure    yes
         IdentityFile            ssh_tunnel.key
         IdentitiesOnly          yes
         UserKnownHostsFile      ssh_tunnel.hostkey
         LogLevel                VERBOSE
         Protocol                2
         RemoteForward           8124 localhost:22
         ServerAliveInterval     15
         ServerAliveCountMax     3
         Cipher                  arcfour

[3] - mmlr_htpc_x86_gcc2.json
{
         "name": "mmlr_htpc_x86_gcc2",
         "ssh": {
                 "host": "localhost",
                 "port": "8124",
                 "user": "mmlr",
                 "privateKeyFile": "keydir/mmlr_htpc_x86_gcc2.key",
                 "hostKeyFile": "keydir/mmlr_htpc_x86_gcc2.hostkey"
         },
         "portstree": {
                 "path": "/Media/Source/builder/x86_gcc2",
                 "packagesPath": "/Media/Source/builder/x86_gcc2/packages",
                 "packagesCachePath": "/Media/Source/builder/x86_gcc2/packages/.cache"
         },
         "haikuporter": {
                 "path": "/Media/Source/builder/haikuporter/haikuporter",
                 "args": "-j2"
         }
}

[4] - authorized_keys of the tunnel user on the server
no-pty,no-X11-forwarding,permitopen=":1",command="/bin/echo tunnelonly" ssh-rsa AAAA...publickey...== mmlr_htpc_x86_gcc2

[5] - https://github.com/haikuports/haikuporter/blob/master/HaikuPorter/BuildMaster.py

[6] - updateloop.sh
#!/bin/sh
while true
do
         date -u
         ~/haikuporter/buildmaster/buildmaster.sh update \
                 && ~/haikuporter/buildmaster/createrepo.sh
         sleep 180
done >> update.log 2>&1