Rant: Wishing we had someone who worried about the “whole release” when decisions are being made

Sorry – this is a rant. The rule in rants is that you don’t have to be logical – you are just mad. So I apologize in advance… I don’t intend for this to be about anyone in particular but there are some example things that I call out – just because I call something out – it does not mean that it is a bad idea – and certainly does not mean that I am upset at a person who is just trying to do what they think is right. I am ranting about governance and process and approach – and community culture – not any single individual.
Back in 2004 and 2005 we had a “chief architect” – that was either me or Glenn – or most likely a combination of both.
The major role of the chief architect was to worry about the entire scope of the Sakai release and not look at just their little corner of the code base and be “selfish”. Having global conventions and rules that are enforced across the code base makes everyone’s life better. And to really *have* global rules – you need policing activity and judges and you need to tell some people at times to “stop that!”.
Now a few people in the community over time looked at this police/judge (chief architect) role as a great privilege and a glorious source of ego building power. Actually this role is one of the crappiest roles in any project if it is done well. Everyone is pissed at you all the time, and sometimes a person who is told “no” goes and finds a few friends and comes at you with a mob when they get really mad. Sometimes the chief architect is wrong because they needed to be educated, and they have to publicly admit it and reverse their position. You cannot survive the chief architect position and keep your ego – it is beaten out of you if you do the job right.
----Sub-Rant----
The key in any open source project is that leadership is action – not position. People who come from organizations where leadership is not valued and management is valued more than leadership react to titles and positions differently. People who are sadly stuck in and/or grew up in manager-oriented organizations get all verklempt about titles. Those who have seen leadership truly modeled could hardly care less about the title and focus on what a person *does* – in organizations that value leadership – management activity is seen as another task that is equal in importance to (and often less rewarding than) non-management work. Management generally is the act of “taking care” of the non-management people – and since this is tiresome – it is often well-compensated. But in truly leadership-centered organizations – having a title of “manager” or “director” does not automatically make you better than other people in the organization – you must earn your rank in the pecking order based on what you *do* and how well you do it – not the fact that you have a fancy title.
So, these people from unenlightened organizations see the chief architect role as something that has a lot of power and little responsibility – and since they have never experienced or understood real leadership – they treat that person like all of the useless troll managers they have met in their life and either whine and complain or kiss butt / play politics to get what they want.
----End-Of-Sub-Rant----
Back to the original rant.. So the chief architect role is an unloved and unappreciated role – they make decisions and tell people “no”. But here is the key – they also take responsibility for the *whole code base* and the whole release process and after release process and the production issues and and and – if something drops through the cracks – this person picks it up – or at least tries to catch things and warn folks – or get the right folks involved.
We don’t have a chief architect any more as best I can tell. I am certainly not it – and I am glad – it was a role that kept me nervous all the time and I was glad that while I had that responsibility I was paid full time with unlimited travel budget to do nothing other than guide the Sakai community – it is so engrossing – that it does not blend well with other tasks.
It is not clear exactly why we don’t have a chief architect role any more – the two biggest contributing factors are (a) management types wanted control of the core decision making processes – so that decisions would be made manager-style (lots of meetings, lots of politics, lots of quid-pro-quo, lots of compromise, and none of those loud technical people involved in the process) and (b) we just got too big for one person to know and watch it all. The reality is that it is a little of both.
However we got here – we no longer have one-stop shopping in terms of when things break – who do we ask and/or blame. We make technical decisions Murder on the Orient Express style. The deed is done – and there is no one to blame because everyone naturally takes such “fractional” responsibility.
The way this works out in Sakai is that someone proposes some broad new approach in a meeting or on the list. A few people give a +1 affirmation or grunt assent in a meeting – the person walks out of the interaction claiming they have a “mandate” and proceeds to take us down this new direction.
Often the proposals are called “improve our XX approach to be in line with industry best practices” – and then this gives folks a carte blanche to change our overall direction in some small way. So they go off and change our overall direction – and if challenged – they say something like “the folks in the meeting at XXX seemed to think this was OK”.
Usually I am the one opposing things – and usually I am the lone opposition voice (or at least the only one who has the guts to try to say “no” out loud). And so it looks like the “people advocating moving towards best practice” (apple pie, patriotism, world peace) versus the old cranky guy.
I am vastly hampered in my attempts to be critical about dangerous changes in direction because the people I am arguing with are all smarter than me technically – and I know that. The top 20 people in Sakai are so damn smart it is not funny – they are all way better developers than me. But they are also often young and in a hurry and do not have experience with the long-term pain of seemingly simple decisions.
As an example of long-term pain, in late 2003 – as chief architect I decided that JavaServer Faces was a good UI technology. It turned out to be ABSOLUTE CRAP! The effect of my decision was to cause great harm to this community, and I estimate the wasted effort from that single decision to be about 2-3 million dollars across Sakai – it doomed several projects to lose years of progress – particularly Samigo – I feel horrible about the decision to rewrite Samigo in JSF. And the new chat is in JSF – AARGH. I live with this decision and publicly admit my shame on this over and over.
The key lesson is not that somehow I am stupid and all the other people in Sakai are smart. The key is that sooner or later a mistake will be made and that a bit of caution is a good thing – and most importantly – the pain of a decision is often not evident for months or years after the decision was made.
We are making decisions now of the same magnitude pretty regularly – and we won’t know whether they are right or wrong for a while. I have vehemently opposed both of the following – and folks just treat me as the “old cranky guy”.
– The use of Wicket in the main SVN and Sakai 2.6 is absolute crap – it comes up with ugly warning messages about Wicket configuration and this is a test version…. Aargh.
– The independent versioning of every little piece of Sakai – I call these “vanity versions” – is absolute crap
Perhaps “absolute crap” is too strong a term here – but at least let’s *delay* these decisions and explore them in experiments with production systems before committing the whole community and whole release to go in that direction.
But NOOOO – someone got three “+1” votes and one “-1” vote from Chuck – and we all discount Chuck’s vote to be “-0.001” because he is just an old cranky guy who says “no” to everything.
I am an old cranky guy because I know that we are making these decisions too quickly and with no real consideration of the long-term consequences three years from now. We have no chief architect and we have no structure that moves us toward being conservative in the overall direction of the product. Our governance structure lets major shifts in direction happen on a 3-1 vote out of 100 developers.
Another thing that annoys me – is that we *do* have smart conservative folks in the community – other “cranky people” who are generally nervous about too much change – too fast. But they are also *verrry quiet* people.
In a few years – the code will be a swamp. Making a release will be like finding a needle in a haystack.
Now – all is not so dark. Here are a few really brilliant points of light in what we are doing:
– The Kernel 1 work is brilliant – it took over a year to make sure that this was a good idea – Ian gave us many opportunities to look at his idea and he patiently did it over and over again each time someone raised a question or issue. He did his work as a branch and let it become nice and boring before we switched to his new practice. And the switch was awesome and pretty much flawless. Even with this much preparation there were a few things that needed thinking through after the switch – but with Ian as our leader – we could work through decisions while letting Ian keep the big picture intact.
– Take a look at this JIRA – http://jira.sakaiproject.org/jira/browse/SAK-11875 – Look at the reasoned conversation – look at the depth of thinking – look at the attention to detail – and most importantly look at how many really bright people brought their thinking to the problem and how deeply they thought about the problem. We are blessed to have 10-15 people who are smart enough to see the whole code base and work collaboratively to narrow down a really hard problem and produce a superb solution to the problem. So we have 10-15 people who have the talent to be architects.
But we still have no chief architect – we still do not have someone who feels pain when the release breaks – like in the Orient Express – when the release tanks – no one feels responsible for more than a small part of the release – and the cause of the failure is likely to be someone else. The math is simple – if you own 5% of something and the whole thing breaks – there is a 95% chance that it is not your fault. Except of course for me – I screw up more than most so even though I touch about 5% of the code – if something screws up – there is a 10% chance that it was my fault :).
So we have these people that make long term decisions – fight fiercely over my lone opposition and win – drop in their change and then run for the hills leaving the community to do the long term support of their new decision – and the sad bit is that 2-3 years later when the overall impact of the decision comes home to roost – the people who screwed it up will likely be gone and so there won’t even be the satisfaction of “I told you so”.
This is all OK – open source communities have people that come and go and new folks pick up from old folks – this works except when there are major direction shifts or major changes in community commitment that slip in under these “+3” votes.
Enough whining Chuck! What is the solution?
A year ago – I yearned to go back to the Chief Architect Model. I did not want (and do not want) to be the chief architect – I just wanted there to be one. I wanted to have the feeling that someone was watching over the work that we all do – and would swoop in and save us from ourselves. It is nice to know someone has your back. We kind of have that for the Kernel in Ian – this is cool – I want to help and cooperate with Ian and I hope we all support Ian and help him – he volunteered for a really painful job. Don’t look at this as a “power grab” – look at it as a “life sentence” that he voluntarily accepted – be thankful and appreciative – send Ian a “thank you note” once in a while – buy him a beer when you have a chance.
What do we do in the rest of the SVN where most of the current crap is happening?
First, we clearly have enough talent to do this together and if we focus on the “one person” approach we leave a lot of talent on the table unused and slow things down to a crawl – and we also suffer from the fact that no matter how smart one person is – they have blind spots and they have personality that creeps into their decisions and pre-conceived notions that creep into their decisions over time – so many eyes on the problem of the size of the main SVN are a good thing. And it cannot be Ian – he is used up on the Kernel now.
Here are my recommendations:
– Bringing new technology into the core – like Wicket – needs to be done with at least one semester of production experience with the software at more than one site. If there are folks who think Wicket is the “next best bright shiny object” – let them prove it and let them vote with their production instead of the very inexpensive “+1” in an E-Mail. Wicket should not go into 2.6 – let a few adventurous sites run it and see if Wicket has memory leaks or the internationalization sucks or it fails on IE7 (or whatever). The vote that matters is the “who ran this in production” vote – not the vacuous “+1” votes on the dev list. People need to *Stop* thinking about this as an ego issue. I am writing lots of code that is new stuff in Sakai – doing this in trunk of the main release is scary for me – my IMS tools will be run at UM for a whole semester before I would even make the slightest *peep* about them in the main release – this makes my life *better* not worse. It means that when they go into 2.7 – I can sleep at night knowing that I won’t be the one who crashed Rutgers production because I pushed my stuff into the main release before it was production hardened.
– When people want to do something that will make the “release process better” – like this crazy idea that we need 40 versions of things in the main SVN instead of one – the same caution applies. Maybe this *is* a great idea – if it is a great idea – let’s try it in one place and get through a release and see if the process *IS* better. And then see if we can actually make a release with the “glorious” new ideas in place. If we do it only a little bit and it turns out to be a horrible failure then we back out the little bit. If we convert the whole SVN to the new approach and find out half way through QA that it was a bad idea then we have a massive back-conversion. And folks – good ideas have staying power. If it is a good idea – it will happen in the next release – folks don’t forget about it – if the simple use of the new approach shows the glorious way forward – people will rush to the new approach. And maybe these folks who are so “smart” about how to change how we do the release – perhaps they ought to try to *do* a release with their proposed ideas before they just check it into trunk. There is a long way between Sakai compiling in a developer environment and Sakai having 100% squeaky clean release artifacts.
We can make these decisions collectively if we just do things more cautiously and folks who advocate new large-scale directions are willing to be patient. Frankly the last thing we need is someone who wants to make some global change *right now* because they want to get it done and then move back to some other job. This is a major red flag for me – it means that when their great idea takes a crap – they will be busy on something else. Making folks wait also tests their own conviction that their idea is really good.
OK – so now my coffee cup this morning is empty so it is time to wrap up the rant.
All I am saying is give peace a chance. Actually no – this is what I am saying:
– Quit thinking that +3 is a mandate and sufficient justification for major change – the justification for major change is *proof* that it is a good idea.
– Take a conservative approach to new directions and new technologies – make them stand the test of time for 6 months before we dive forward – it is hard to go back
– Let’s make a culture that makes it OK to oppose something. While I am the guy who publicly does a “-1” on stuff – I do it so often that my credibility is shot – at least it is shot enough that my -1 alone means pretty much nothing when the proposer knows they are smarter than me (a common occurrence).
We have a lot of people with a long history of experience watching decisions happen and watching things hurt later because of those decisions (and yes most of the mistakes were mine) – but even though it is different people making the decisions – the mistakes are and will be the same – the core reason I made mistakes was that I was in a hurry and was not willing to try something out in the small before it became our “direction”.
Let’s learn the “red flags”.
– A major rewrite that surprises the community right before code freeze
– Something that claims that it will *improve the release process* from someone who has never done a release
– A major new technology that one tool writer “discovered” and decided to use
– Anything where someone seems to be “in a hurry” or trying to slip in under the deadline
Change is essential – if we stop moving we start dying. All the decisions that have come back to haunt me over the past five years were “yes” decisions. It is easy to say “yes” – it is harder to say “no”. Just saying “no” is also a bad approach – no can be wrong as well.
We can recover far more gracefully from a wrong “no” decision than a wrong “yes” decision. If a “no” decision turns out to be wrong – we just turn it into a “yes” later. We hold on to the status quo a little longer – for those of us in this for the long term – we can be patient.
A bad “no” decision means we endured a “stable but sub-optimal” situation for a little while longer. A bad “yes” decision can cost us millions of dollars in wasted time and effort.
If a developer or team does not like being told “no” and wants to make their own decisions – or thinks that Sakai is too conservative about changes to its release process or new technologies – and prefers a “more agile” approach – then just move your code to contrib. You have 100% freedom in contrib – no one will tell you what, when or how to do something. People can put your code in production and if it craps out – then it is *your* credibility that drops and not Sakai’s credibility.
What I am looking for that will tell me that our process is healthy is when we as the community learn to say “no” to something once in a while. Or at least we as a community make a strong statement “Please slow down” – to *something*. Once we as a community can learn to say “no” once in a while – maybe we can get to the point where our collective wisdom is better than a single individual making long-term decisions.
At this point my assessment is that we are worse off in terms of making long term decisions about the product than we were in the 2.3 era. From the 2.4 release and onward – we switched from the “chief architect” model to the “collective wisdom” model. We cannot and should not go “back” because one person cannot hold this all in their mind – but we do need to go “forward” to a point where we are truly reflecting on choices and making hard calls – and firmly telling folks “no” or “slow down” in certain situations.
Until we do that – our code releases will get mushier and mushier and the maintenance branches will become more and more important. This is an indication that our process for managing trunk is not working as well – and we fix things when we have a branch-manager (chief architect for the branch).
I would like to see us making better decisions in trunk – and making the branches – truly maintenance.
End of rant.
One of my favourite sayings is that “A project is doomed if the smartest developer is the chief architect”. Having someone as chief architect who is smart but not the smartest sets up a natural balance-of-power and leads to well reasoned decisions.