Thomas Telaak

Have you ever thought about automating the deployment of Hadoop clusters to the cloud? – We have!

For a new project we needed to deploy a new HDP 2.3 (Hortonworks Data Platform) environment to Microsoft Azure, so we decided to work with Cloudbreak.
Cloudbreak delivers an easy-to-use web UI that allows you to create an HDP cluster based on an Ambari blueprint and deploy it to the major cloud providers (Microsoft, Amazon, Google, OpenStack).
It also provides an auto-scaling feature that automatically sizes your cluster based on usage or time metrics, so you can use your cluster very efficiently.
The company behind Cloudbreak, SequenceIQ (recently acquired by Hortonworks), hosts an open Cloudbreak Deployer portal for everyone.
So for quick testing scenarios you can use that, too. Link: https://accounts.sequenceiq.com/

We set up our own Cloudbreak server in Azure.

Installation Process

So let’s have a look at the installation process step by step:

You’ll need an X.509 certificate later on for the Azure deployment, and you can reuse it for the Cloudbreak host. So I used an existing Linux VM to create a certificate with OpenSSL:

openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
  -keyout my_azure_private.key -out my_azure_cert.pem

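If you want to confirm the command produced a usable certificate before uploading anything, here is a sketch of a non-interactive variant plus a quick sanity check (the -subj value is just a placeholder to skip the interactive prompts):

```shell
# Generate key + self-signed cert non-interactively (-subj avoids the
# interactive prompts; the CN value is a placeholder) ...
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
  -keyout my_azure_private.key -out my_azure_cert.pem \
  -subj "/CN=cloudbreak-demo"
# ... then verify that the certificate parses and show its validity window.
openssl x509 -in my_azure_cert.pem -noout -subject -dates
```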
I deployed a CentOS 7.1 machine from the Azure Marketplace as the host for our Cloudbreak server.

For SSH authentication I used the X.509 certificate I had just created.

After deployment you can connect to the machine over SSH.

putty user@cloudbreak_host_url -i generated_cert.ppk

If you use PuTTY, you need to convert the private key using PuTTYgen, since PuTTY will not work with the generated PEM key (Conversions -> Import key -> Save private key).

Now we need to install Docker as it is required for Cloudbreak.

First we update yum and install Docker:

sudo yum update

curl -sSL https://get.docker.com/ | sh

After installation start the docker service:

sudo service docker start

(I had trouble starting the service and got it working after installing docker-selinux:

sudo yum install docker-selinux)

Verify that Docker is running and enable it to start at boot:

sudo docker run hello-world

sudo chkconfig docker on

If not already installed, install wget and unzip:

sudo yum install wget unzip

 

Now create a directory for your Cloudbreak installation and change into it. Then download the Cloudbreak Deployer into this directory and extract the package:

mkdir cloudbreak
cd cloudbreak

wget http://public-repo-1.hortonworks.com/HDP/cloudbreak/cloudbreak-deployer_1.0.0_Linux_x86_64.tgz

tar xvf cloudbreak-deployer_1.0.0_Linux_x86_64.tgz

 

Next copy cbd to /usr/local/bin and run cbd init from the cloudbreak directory:

sudo cp cbd /usr/local/bin

cbd init

After initialization you can create a profile with your public server address and a custom user:

echo export PUBLIC_IP=cloudbreakservername.cloudapp.net > Profile

echo export UAA_DEFAULT_USER_EMAIL=your-email-address >> Profile

echo export UAA_DEFAULT_USER_PW=your-password >> Profile
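The Profile built by the echo lines above is just a plain shell fragment that cbd sources on startup. Written out in one go it might look like this (the hostname and credentials are placeholders):

```shell
# Write the Cloudbreak Deployer Profile in one step (placeholder values).
cat > Profile <<'EOF'
export PUBLIC_IP=cloudbreakservername.cloudapp.net
export UAA_DEFAULT_USER_EMAIL=admin@example.com
export UAA_DEFAULT_USER_PW=SecretPassword
EOF
```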

To finish the installation, generate the Cloudbreak configuration and start the server:

cbd generate

cbd start

 

Work with Cloudbreak

 

After you have installed the Cloudbreak Deployer, visit the URL (http://yourcloudbreakurl:3000) and log in with the credentials you provided in the Profile.

Now you need to create credentials for your cloud provider.

To do that, use the certificate you generated during installation. Click on “create credentials”, open the created .pem file, and copy its contents into “SSH Certificate”.
After successful creation you can download a certificate which is used to allow Cloudbreak to manage your Azure subscription:

 

Manage credentials in CloudBreak

The downloaded certificate can be uploaded to Azure: Settings -> Manage Certificates.
(The certificate name is derived from the Cloudbreak user that exported it.)

The next step is to create a blueprint. The blueprint defines which services Ambari will install on which nodes of our cluster.
There are three predefined blueprints. We used HDP-small-default (copy the JSON and edit it in a text editor) and added Spark to it.
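As a sketch of what the Spark addition looks like: in an Ambari blueprint each host group lists its components, so adding Spark means appending the Spark components to the appropriate host group. This is a trimmed, hypothetical fragment, not a full blueprint; the component names follow the HDP 2.3 stack and the host group name is illustrative:

```json
{
  "name": "host_group_master_1",
  "components": [
    { "name": "NAMENODE" },
    { "name": "SPARK_JOBHISTORYSERVER" },
    { "name": "SPARK_CLIENT" }
  ]
}
```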
Now create the resources you want to use in your cluster.
For example, we created a datanode resource to use for all datanodes.

 

Manage resources in CloudBreak

After you have created all needed resources, you can optionally create a custom network and security group.
We used the default ones.
Now select the created credential and create a cluster:

 

Create the cluster

Set a cluster name, choose the created blueprint, and map the created resources to the cluster’s host groups.

To change the default Ambari username and password you can use the advanced options.

Create Cluster

Now “Create and Start” the cluster.
(Make sure the time on the Cloudbreak host is correct.)
You will see the current state of the creation process in the event history.
After successful creation you will see the Ambari IP and the current cluster state.
The cluster can now be managed from the Cloudbreak portal. You can also activate auto-scaling policies to make your cluster elastic.
To connect to the cluster, create an SSH connection to the client node.
As the cluster runs in Docker, you have to connect to the ambari-agent Docker container.

Change user:

sudo su -

Show the Docker containers:

docker ps

(note the container ID of the Ambari agent)

Log in to the Ambari agent container:

docker exec -it ContainerID bash

Have Fun!

Troubleshooting

If Cloudbreak thinks your cluster is in a bad state, you may have to help yourself by updating the Cloudbreak database:

docker exec -it cbreak_cbdb_1 bash

psql -U postgres

> update stack set status='AVAILABLE' where name like 'cluster-name';

> update cluster set status='AVAILABLE' where name like 'cluster-name';