Debian Jessie vs CDH upgrade plan
- Upgrade whole cluster to CDH 5.10 as is.
- Get new Hadoop nodes (T152713), install those as Debian Jessie with CDH 5.10
- Incrementally reinstall current cluster nodes as Debian Jessie.
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | None | T157807 Reinstall Analytics Hadoop Cluster with Debian Jessie
Resolved | | Ottomata | T152714 CDH 5.10 upgrade
In Analytics Ops meeting today, we decided we should upgrade to CDH 5.10 now that it is out, even though it doesn't have Spark 2.x like we had hoped.
This is also good timing with the order of new Hadoop nodes, and it would be good to get this done before we replace stat1002/stat1003, so we can install those as Jessie too.
Previous CDH 5.5 upgrade task: T119646
etherpad process for that upgrade: https://etherpad.wikimedia.org/p/analytics-cdh5.5
etherpad for this one: https://etherpad.wikimedia.org/p/analytics-cdh5.10
Testing steps include loading data in labs, upgrading, and testing refinery jobs before starting the cluster migration.
Did the upgrade in labs today:
Change 336906 had a related patch set uploaded (by Ottomata):
Set hue allowed_hosts=* to work around bug http://community.cloudera.com/t5/Web-UI-Hue-Beeswax/New-Cloudera-installation-Hue-Bad-Request-400/td-p/50344/page/5
Ah ha! Hue did break because of a change. Had to do: https://gerrit.wikimedia.org/r/#/c/336906/1/templates/hue/hue.ini.erb
So, with that, everything looks good! Time to schedule...
So, we briefly talked about doing this on a weekend... but I don't really have a free weekend day until March 4. I suppose this can wait that long. Thoughts?
I think it can wait. The advantage of doing it on a weekend would be less hassle for ourselves and users, but if you prefer to do it on a weekday, that would be fine too.
Current plan: do this on Tuesday, February 28. I will send out an announcement and schedule downtime.
After the upgrade, manually remove the cdh5.5* packages from apt thirdparty. Use the grep-dctrl line from reprepro updates to figure out what to remove:
grep-dctrl -e -S '^zookeeper$|^hadoop$|^hadoop-0.20-mapreduce$|^bigtop-jsvc$|^bigtop-utils$|^sqoop$|^hbase$|^pig$|^pig-udf-datafu$|^hive$|^oozie$|^hue$|^bigtop-tomcat$|^spark$|^avro-libs$|^parquet$|^parquet-format$|^spark-core$|^spark-history-server$|^spark-master$|^spark-python$|^spark-worker$|^mahout$|^kite$|^solr$|^sentry$|^impala$|^impala-catalog$|^impala-server$|^impala-shell$|^impa
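The command above is cut off in this log, so the full package list is not reproduced here. As a rough sketch of how such a filter is put together: each source package name is anchored with `^` and `$`, and the names are joined with `|` for use with `-e` (extended regex) and `-S` (match on source package). The helper name `build_pattern`, the three-package subset, and the Packages path are assumptions for illustration, not part of the actual reprepro config.

```shell
#!/bin/bash
# Hypothetical helper: join package names into an anchored alternation,
# e.g. "hadoop hive" -> '^hadoop$|^hive$'.
build_pattern() {
    local pattern="" name
    for name in "$@"; do
        # Prepend the separator only when the pattern is already non-empty.
        pattern="${pattern:+$pattern|}^${name}\$"
    done
    printf '%s\n' "$pattern"
}

# Illustrative subset only; the real list is the one in the reprepro updates filter.
pattern=$(build_pattern zookeeper hadoop hive)
echo "$pattern"

# The pattern could then be run against a Packages index to list matches
# (placeholder path, not the real repository layout):
# grep-dctrl -e -S "$pattern" -s Package,Version /path/to/Packages
```

Anchoring each name keeps `hadoop` from also matching `hadoop-0.20-mapreduce`, which is why the original filter lists every source package separately.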
Change 336906 merged by Ottomata:
Set hue allowed_hosts=* to work around bug http://community.cloudera.com/t5/Web-UI-Hue-Beeswax/New-Cloudera-installation-Hue-Bad-Request-400/td-p/50344/page/5