Should Sakai Evolve Towards MySQL 8.0 or Maria DB 10?

With the recent release of Sakai 21.0 (Thanks to all involved). We need to do some thinking about our direction forward w.r.t. how we approach our MySQL support in the Sakai community.

We have been supporting MySQL and Oracle since 2002 (19 years). In the old days, MySQL was the 100% open alternative for open source applications. But then in 2010, Oracle gained control of MySQL by buying Sun Micro-systems. I wrote a pretty fierce blog post about the Oracle purchase of Sun and my concerns for the future. At that point in time MariaDB was forked from the last non-Oracle version of MySQL. Some in the Sakai Community experimented with MariaDB from time to time – in the early 2010’s if things went wonky we ran back to MySQL.

Oracle has released MySQL 5.6 and 5.7 and MariaDB kept pace – even to the point of binary interoperability. We along with the rest of the open source world found this equivalence comforting and felt it was OK to just keep using MySQL and let the “other folks” use MariaDB.

But support for MySQL 5.6 just expired (February 2021) and support for MySQL 5.7 expires in October 2023.

The version of MySQL after 5.7 is MySQL 8.0. This is the first version that is heavily influenced by Oracle and it feels to me like MySQL 8 and MariaDB may begin to diverge. Hibernate (as of 5.3) supports the org.hibernate.dialect.MariaDBDialect separately from its MySQL dialects. Hibernate also supports the org.hibernate.dialect.MySQL8Dialect.

The reasonably high likelyhood is that Sakai will be able to simultaneously support the various versions of MySQL and MariaDB without too much effort. About 25% of the SQL in Sakai is hand-constructed and 75% of the SQL is constructed by Hibernate. Our hand-constructed SQL does not make use of intricate features of MySQL so it is likely to keep working in MySQL 8 and MariaDB 10. And you can pick your Hibernate dialect and support all the databases.

So the question is less about *not* supporting MySQL 8 that it is about deciding what our “preferred” / “first choice” database will be. For this preferred MySQL compatible database we can encourage developers to test using the preferred database and then run our nightly servers using that preferred database.

For example for Sakai 21 on our nightly page, we have eight MySQL 5.7 servers and one Oracle server.

For Sakai 22, we could keep things the way they are or switch to any the following:

* Run eight MariaDB 10 servers, 1 MySQL 5.7 server, and 1 Oracle Server
* Run eight MySQL 5.7 servers, 1 MariaDB server, 1 MySQL 8 server, and 1 Oracle Server
* Run eight MariaDB 10 servers, 1 MySQL 5.7 server, 1 MySQL 8 server, and 1 Oracle Server
* Run eight MySQL 8 servers, 1 MariaDB server, 1 MySQL 5.7 server, and 1 Oracle Server

You get the picture. By leaning towards either MySQL 8 or MariaDB 10 – we are signaling our “preference” / “first choice”.

We of course would fix problems that emerged in any of those databases. We would not want to break Sakai and MySQL 8 just to do something cool in MariaDB 10 or vice versa. We need to write conservative SQL and accept fixes if some hand-written MySQL breaks 5.7, 8.0 or MariaDB 10.

The question is where the majority of us are going to go so we can stick together and protect each other’s flanks by using a common approach wherever practical.

We can delay this decision until we release Sakai 22 – but the decision is easier and safer now. We get a year to experiment with the different databases and then based on what we learn this year – we can make a more informed decision next year. Also by next year we might get some signals from the market and other open source projects as to where they will be going w.r.t. the “MySQL 8 – to be or not to be” question.

Once we chat a bit about this on the lists, I will do a survey to poll the community.

My Opinion

At a minimum, I do *not* want to go “all in” with MySQL 8 – I want to move toward MariaDB if a year of experiments reveals no issues.

I would like a trunk master running on MySQL 5.7, MySQL 8.0, MariaDB 10, and Oracle. I will run my Smoker process against all four every night, to probe for really bad regressions. If we could somehow make them have identical sites and data – I could compare them click for click with smoker.

Earle Nietzel’s Opinion (from email)

I do think it’s time to start more formally adding MariaDB to the mix, however I think our database setup on nightly should reflect that which is being used by our users meaning:

For example:
80% of Sakai installs use MySQL (where 20% of installs make up 10% Oracle and 10% MariaDB – hypothetical numbers). For the other DB variants that make up the minority lets simply have a version of master representing that DB.

So an infrastructure might look like this:

master – 1 mysql / 1 mariadb / 1 oracle
experimental master – 1 mysql
21.x – 1 mysql
20.x -1 mysql
19.x – 1 mysql

Also the SQL ratio according to tables is: ~300 total tables of which 75% are Hibernate managed and the other 25% represent SqlService and Spring jdbc template.

Another thing to be thinking about with all these instances of Sakai is whether or not we have the resources to test on them.

I am not in favor of expanding instances if we don’t have resources to even test on them. This is the reason I suggest having only master instances for all of the databases we support and other instances are the db that is used most by users of Sakai.

Matthew Jones’ Opinion (from email)

I don’t know if that accurately represents the community or not. I feel many use AWS Aurora, another fork that is not compatible with MySQL 8, and only goes as far as 5.7, similar to MariaDB. So do we need an Aurora instance too? Probably not as much as if we had MariaDB.

I like the idea of going MariaDB first as it’s the most open, 5.7 compatible and it’s also the easiest for localhost. Though MySQL 8 isn’t too big of a problem to continue supporting as long as we don’t do much MySQL 8 specific stuff, which we likely won’t do. If we’re running MariaDB, I don’t see any point to leaving a separate MySQL 5.7 around, especially since it’s going to be EOL in 2023.

I think the hard decision about supporting multiple databases is if we want to take advantage of any special features of a database, we can’t while still supporting the others. For instance MariaDB supports a number of storage engines that aren’t available on the other databases and features like system versioned tables, that could be “nice to have” if we were able to use them. We’re basically stuck forever at the lowest common denominator, which is going to be MySQL 5.7 in this case.