[OOTB-hive] Thoughts on Honeycomb that need your input

Tue Jul 12 20:43:43 BST 2016

Hi All

tl;dr. Sorry you didn't like my puppetry. Docker's nice like that
but you're missing something. If Honeycomb is artifacts then we need
an artifact server.  KVM plus Docker + artifact server + Jenkins +
an instance or two of Alfresco running would make a party on our
server hardware.

Wow, what a thread. I really don't want to pick it all to pieces but
I'll say a couple of things.

Firstly, I'm sorry that I have not had a chance to get into this
debate sooner. I've been very ill for some weeks, but I am mostly
recovered now.

Originally I put together a proof of concept of a puppet build.
Mistakenly I put this under the OOTB GitHub group, which I shouldn't really
have done. Anyway, if I recall the timeline correctly, that's when
Andreas mentioned he didn't like Puppet for that job and preferred
Docker. At that point I started work on my own puppet-alfresco project
and waited for some other suggestions to appear in the distro group that
I could get behind. There was not a single other suggestion. If I recall
correctly I didn't even announce working on my own project nor announce
any progress of it to the group. At some point Daren joined in my
efforts and we started thinking about it as a form of easily getting
Alfresco into the hands of the target market we both know, that of small
businesses.

Eventually I suppose we started thinking about it between ourselves as
Honeycomb. I think at some point we asked for contributors to it in that
context; obviously none appeared. But neither was there any further
overt pessimism for the project; we supposed that perhaps the lack of a
"1.0" version number was putting people off trying it. Thus in the weeks
approaching last year's June global hackathon we put a lot of effort
into putting together something we could call "production ready" in the
sense that you could run the installer on a bare metal box and end up
with a decent Alfresco install: firewalled, with a reverse proxy, an
installed MySQL server, an installed mail server (in a limited sense...
you'd better
not be on an IP which is listed in an RBL!) and SSL certificate
installation. And so we dared to call it "Honeycomb 1.0". And you'd
better believe I was seriously proud of the efforts of Daren and myself.

But I'm seriously not wedded to it. Here's what I learned about puppet
for this job along the way. It's hard to understand how puppet does
things, and so it's hard to pick up new developers. Even when you know
it, it's pretty hard to get it right. The development lifecycle is really
slow because sometimes you need to clear right down and start again to
be sure that the resources are being applied correctly. But by $DEITY,
it really makes a rock solid install, I'd say orders of magnitude better
than shell scripts. You can interrupt the installation at any point
(with the notable exception of while in the install phase of the
underlying package manager, typically that breaks with the same
frequency as it does if you interrupt the package manager while running
it directly) and when you re-run it it will recover, no errors from
attempting to reapply users, packages, whatever.
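
As a rough illustration of that re-run behaviour, here is a toy Python
sketch of idempotent, convergent resource application. It is not Puppet
and does not use Puppet's API; the function name, the resource kinds and
the package names are all made up for the example. The point is only
that each step checks the current state first, so an interrupted run can
simply be run again.

```python
# Toy model of idempotent resource application (NOT Puppet's API;
# every name here is hypothetical and only for illustration).

def apply(desired, system):
    """Bring `system` in line with `desired`; return the changes made."""
    changes = []
    for kind, names in desired.items():
        present = system.setdefault(kind, set())
        for name in sorted(names):
            if name not in present:   # act only when the state differs
                present.add(name)
                changes.append((kind, name))
    return changes


desired = {"user": {"alfresco"}, "package": {"nginx", "mysql-server"}}

# Simulate an interrupted first run: only the user had been created.
system = {"user": {"alfresco"}}

first = apply(desired, system)    # finishes the remaining work
second = apply(desired, system)   # nothing left to do, and no errors
```

A shell script that blindly runs "create user, install package" would
typically fail or duplicate work on the second pass; here the second
pass finds nothing to change.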

While developing puppet-alfresco I started using Foreman, which is kind
of an open source version of Puppet Enterprise (plus a lot more) and
noticing it used puppet for its own installer, I felt at least partially
vindicated. However, without mindshare and hence more developers it was
clear that this was not really going to be a viable approach for an
Alfresco installer, especially given the rate of change of the
underlying OS distros and also Alfresco themselves.

So hooray, Daren and I held the poisoned chalice of "de facto Honeycomb"
for a bit. I'll be glad to put it down, it's getting a bit heavy.

As to docker-alfresco, I think Daren threw a bit of a curve ball in the
thread above by mentioning it. It's not supposed to be Honeycomb, it
never was. It arose because I was baffled that the only thing that
anyone could think to do with Alfresco in conjunction with Docker was to
stuff everything into one container. By this time $DAYJOB was in devops
in a company that was struggling with scale and desperately trying to
work out how to break their monolithic apps into bits, not exactly
microservices, but at least be able to scale the different parts of it
separately. I wanted that for Alfresco and for me the docker-alfresco
project is that, I wanted the freedom to be able to fire up N different
hosts and move bits of the Alfresco monolith between different hosts, to
reduce IO contention, memory contention etc. as appropriate, and to be
able to choose configurations based on predicted load patterns. Ideally
the dream part of that would be to migrate the pieces at runtime
transparently, but even being able to choose on which hosts to place
each part of the monolith at startup would be great, and Daren in
particular did some cool experiments about what could go where, based
across 3 hosts if I recall. Anyway it's not Honeycomb and it never was.
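
To make the "choose at startup which host each part goes on" idea a bit
more concrete, here is a small Python sketch using a first-fit
decreasing strategy on memory. This is an assumption for illustration,
not how docker-alfresco or Daren's experiments actually did it; the
component names, host names and GB figures are invented.

```python
# First-fit decreasing placement by memory. All names and sizes below
# are hypothetical; this is not taken from docker-alfresco.

def place(components, hosts):
    """Assign each component to the first host with enough free memory,
    handling the largest components first."""
    free = dict(hosts)
    placement = {}
    ordered = sorted(components.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, need in ordered:
        for host in free:
            if free[host] >= need:
                free[host] -= need
                placement[name] = host
                break
        else:
            raise ValueError(f"no host has {need} GB free for {name}")
    return placement


components = {"repo": 4, "share": 2, "search": 3, "database": 2}  # GB
hosts = {"host1": 4, "host2": 4, "host3": 4}                      # GB

placement = place(components, hosts)
```

A real placement would also weigh IO contention and predicted load
patterns, which this sketch ignores entirely.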

Also, seriously, I get that Docker makes a great lightweight VM
environment but if all you do is use it like that then you're missing
the best bits.

As to grua, yeah at the time docker-compose couldn't do container
startup ordering and dependencies, and if you start repo up before the
database it's just going to bring pain. But now docker-compose can do
that, so really, grua isn't needed for that anymore and docker-alfresco
could be refactored very easily to just use docker-compose, which would
probably help in scaling efforts with docker-swarm. Again, to use Docker
as I believe it is intended to be used, you definitely need *some* kind
of composition.
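
The startup-ordering problem itself is just a dependency sort. A minimal
Python sketch, assuming three hypothetical services where the repo needs
the database and share needs the repo (it models ordering only, not
waiting until the database is actually ready to accept connections):

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# service -> services that must be started before it
# (service names are hypothetical, for illustration only)
depends_on = {
    "database": set(),
    "repo": {"database"},
    "share": {"repo"},
}

# static_order() yields each service after everything it depends on
start_order = list(TopologicalSorter(depends_on).static_order())
```

Whatever tool does the composition, this is the ordering it has to
derive from the declared dependencies.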

I mean I can't be the only one that in my mind's eye has community
Alfresco having some kind of clustering in the not too distant future,
can I? The only one that believes the community version could be a
properly scalable engine that could handle challenges that might today
be considered "enterprise"? Meh, maybe I am, but I hope not. Today's
enterprise challenges are the ones that tomorrow we will need in the
community landscape, sure as night follows day.

Jeff's right, the original goal of Honeycomb was to be a collection of
addons which we considered should be installed in every Bee Alfresco.
Problem was we never got the infrastructure together. If we could have
had an easy to use infrastructure this would all have turned out very
differently. Even in the puppet-alfresco build Daren and I constantly
asked "Where should we get this WAR from?", "This plugin is only in
source, should we compile it during the installer?", "What happens if
there's a zombie apocalypse?" (well no that last one was Jeff at Beecon
but we were definitely thinking along the same lines).

What we need is our own artifact repository, and a CI process that can
build the artifacts we need, including the share and repo WARs as well
as all the plugins we want, including wherever possible running a test
suite (perhaps forking addon projects where necessary to add unit
tests where practical), preferably including integration testing
against a full install of "Honeycomb Alfresco" with all our plugins
installed (UI testing as well as potentially webscript testing where
we have exposed them).

To that end, Daren and I have again been beavering away in our own
direction. We just can't help ourselves ;-) . Recently I hired a server
at Hetzner, the same provider the current infrastructure is supposed to
be on. It's a tadge smaller than what we have I think at 24GB RAM
(incidentally that now costs around 35 EUR/month so we might consider a new
larger server for ourselves since we actually have no infrastructure on
ours yet) and I think that my experiments have yielded a pretty good way
to move forward with infrastructure.

I've used Ubuntu at every layer, no reason, I just like it and this
server is just for me. Also all my puppet manifests for my personal
servers are based around Ubuntu. Perhaps CentOS would be a better choice
for OOTBee as they change the underlying distro much less often.

On the base server I've used qemu-kvm. Jeff mentioned OpenVZ, which did
come from a conversation with me, since I've been using it at home for
some time. However, it's a sort of pre-Docker containerization, not full
VMs, and it's getting a bit long in the tooth, and hardly likely to get
more love with Docker in the space. KVM does give full VM separation,
and it's free as in speech as well as free as in beer. As a free as in
beer and speech project ourselves, I really believe we should commit to
that as an "ideology", for want of a better word. If needed you can run
Windows on it too, which you can't with OpenVZ.

Usability-wise, well as far as native apps go, pretty much only Linux
is directly supported, with a very full-featured app called
'virt-manager' ( https://virt-manager.org/ ). Apparently it's possible to
get it working on a Mac too but I couldn't. Instead, I found the next
best thing, a great web interface ( https://www.webvirtmgr.net/ -
seems to have a bum SSL cert right now) that gives you full control
over the VMs including being able to launch VNC console sessions from
the web browser.

Key parts of the internal architecture - I've allocated about half of
the RAM to Docker (although KVM lets you scale RAM up and down
dynamically for individual VMs) and about 4GB to an OOTBee Alfresco
instance (of puppet-alfresco, but could be any install we like), 2GB for
a MySQL server (not expecting heavy loads, also could be PostgreSQL if
we like). A few GB more (not really scaled it yet, and not settled on
which artifact server) for Jenkins CI and either Nexus or Artifactory.
Daren's been playing with Nexus and I have tried Artifactory. The pro
version of Artifactory has loads of nice features but we will be using
the free version, I'm not sure it brings anything more to the table than
Nexus does but a little bit more experimentation will tell.
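
For a quick sanity check of that budget, the arithmetic can be written
out in a few lines of Python. The figures are the approximate ones from
the paragraph above ("about half" is taken as exactly half); whatever is
left over has to cover Jenkins, the artifact server and the host itself.

```python
TOTAL_GB = 24                  # RAM on the hired server

budget_gb = {
    "docker": TOTAL_GB // 2,   # "about half of the RAM", assumed exact
    "alfresco": 4,             # OOTBee Alfresco instance
    "mysql": 2,                # database server
}

# Left for Jenkins CI, Nexus or Artifactory, and host overhead
remaining_gb = TOTAL_GB - sum(budget_gb.values())
```

That leaves about 6 GB, which fits with "a few Gig more" for CI and the
artifact server, though not with much room to spare.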

Jenkins can use either Docker or KVM to provide build slaves on demand
when necessary so that seems like quite a win. In case you're not
familiar with Jenkins or CI in general, it will watch a source tree,
e.g. github, for changes and automatically pull them and try to build,
test and install the resultant artifact into the artifact server, or
die trying.
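
The "build, test and install, or die trying" behaviour boils down to
running steps in order and stopping at the first failure. A minimal
Python sketch of that control flow follows; it is not Jenkins code or
its API, and the step names and the artifact name are hypothetical.

```python
# Fail-fast pipeline sketch. Not Jenkins; all names are hypothetical.

def run_pipeline(steps):
    """Run (name, callable) steps in order; stop at the first failure.
    Returns (ok, log)."""
    log = []
    for name, step in steps:
        try:
            step()
        except Exception as exc:
            log.append(f"{name}: FAILED ({exc})")
            return False, log
        log.append(f"{name}: ok")
    return True, log


def failing_tests():
    raise RuntimeError("1 test failed")


published = []   # stands in for the artifact server

steps = [
    ("pull", lambda: None),
    ("build", lambda: None),
    ("test", failing_tests),
    ("publish", lambda: published.append("share.war")),
]

ok, log = run_pipeline(steps)
```

Because the test step fails, the publish step never runs, so a broken
build does not reach the artifact server.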

In use, almost nothing is exposed externally, except for a VPN port, a
putative reverse proxy to access the internal Alfresco instances, and
also whatever we decide to expose as ports on our single IP address (at
least the artifact server will need to be represented, if not via the
reverse proxy). If you are invited to work on infrastructure, you would
be issued keys to connect to our VPN, which would be either OpenVPN or
tinc, and then you would have an address in, for example, 10.11.12.0/24
(or whatever we choose) to gain access to the internal systems.
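
For anyone not used to CIDR notation, Python's standard ipaddress module
shows what a range like that gives us. The 10.11.12.0/24 network is the
example from above; the helper name on_vpn and the sample addresses are
just for illustration.

```python
import ipaddress

vpn = ipaddress.ip_network("10.11.12.0/24")   # example range only

usable = vpn.num_addresses - 2    # minus network and broadcast addresses
first_host = next(vpn.hosts())    # first assignable address


def on_vpn(addr):
    """True if `addr` falls inside the VPN range (hypothetical helper)."""
    return ipaddress.ip_address(addr) in vpn


inside = on_vpn("10.11.12.34")
outside = on_vpn("10.11.13.1")
```

So a /24 leaves room for 254 machines and people, which should be plenty
for the internal systems plus anyone invited to work on them.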

That then is my recommendation for how we proceed with the
infrastructure architecture. I welcome your comments with anticipation.
Now I'm supposed to be $GERUND_EXPLETIVE convalescing so I had better
get back to that ;-)

Cheers
Martin

--
  martin at bettercode.com

On Tue, 12 Jul 2016, at 06:04 AM, Andreas Steffan wrote:
> Thanks Richard!
> I hope things around Alfresco containerization will start to
> settle soon.
> Would be very awesome if people working with them could build on top
> of each other.
> Regards
>  Andreas
> Andreas Steffan
> Achter Billing 14
>  22399 Hamburg
>  Germany
> skype: deas0815
>  M: +49 160 4694826
>  T: +49 40 23943542
>  F: +49 40 23943542
> http://www.contentreich.de
> Contentreich : Alfresco ECM, Clojure, Groovy and WordPress - for fun
> and for money
> On 11.07.2016 at 11:24 PM, "Richard Esplin"
> <richard.esplin at alfresco.com> wrote:
>> Short answer is that there is a lot of experimentation going on,
>> but no consensus has emerged about best practices or recommended
>> trade-offs.
>>  * Lots of engineering teams are distributing Docker images to assist
>>    with development. Each team is building their own in slightly
>>    different ways.
>>  * The solutions engineers and testing teams are using Power Bundle
>>    to spin up consistent environments.
>>  * We are using Alfresco SPK with Chef and Vagrant to spin up and
>>    maintain our Amazon instances.
>>  * We know that the installer has a lot of problems, but we haven't
>>    yet found a better replacement.
>>  * Our internal production instances are virtualized machine images.
>>
>>  On Thursday, June 23, 2016 5:06:16 PM BST Andreas Steffan wrote:
>>  <snip>
>>  > People at Alfresco have containers on their radar, and I am sure
>>  > we will
>>  > see serious efforts coming from them. I'd appreciate to hear what
>>  > they
>>  > are up to in this regard - short and medium term - to keep in line
>>  > with
>>  > them.
>>  <snip>
> _________________________________________________
> OOTB-hive mailing list
> OOTB-hive at xtreamlab.net
> http://www.xtreamlab.net/mailman/listinfo/ootb-hive
