Never Ending Security

It starts all here

Tag Archives: Mesosphere Cluster

How to Configure Mesosphere Cluster on Linux


An Introduction to Mesosphere

Mesosphere is a software solution that expands upon the cluster management capabilities of Apache Mesos with additional components to provide a new and novel way to manage server infrastructures. By combining several components with Mesos, such as Marathon and Chronos, Mesosphere enables a way to easily scale applications by abstracting away many of the challenges associated with scaling.

Mesosphere provides features such as application scheduling, scaling, fault-tolerance, and self-healing. It also provides application service discovery, port unification, and end-point elasticity.

To give a better idea of how Mesosphere provides the aforementioned features, we will briefly explain what each key component of Mesosphere does, starting with Apache Mesos, and show how each is used in the context of Mesosphere.

A Basic Overview of Apache Mesos

Apache Mesos is an open source cluster manager that simplifies running applications on a scalable cluster of servers, and is the heart of the Mesosphere system.

Mesos offers many of the features that you would expect from a cluster manager, such as:

  • Scalability to over 10,000 nodes
  • Resource isolation for tasks through Linux Containers
  • Efficient CPU and memory-aware resource scheduling
  • Highly-available master through Apache ZooKeeper
  • Web UI for monitoring cluster state

Mesos Architecture

Mesos has an architecture that is composed of master and slave daemons, and frameworks. Here is a quick breakdown of these components, and some relevant terms:

  • Master daemon: runs on a master node and manages slave daemons
  • Slave daemon: runs on a master node and runs tasks that belong to frameworks
  • Framework: also known as a Mesos application, is composed of a scheduler, which registers with the master to receive resource offers, and one or more executors, which launches tasks on slaves. Examples of Mesos frameworks include Marathon, Chronos, and Hadoop
  • Offer: a list of a slave node’s available CPU and memory resources. All slave nodes send offers to the master, and the master provides offers to registered frameworks
  • Task: a unit of work that is scheduled by a framework, and is executed on a slave node. A task can be anything from a bash command or script, to an SQL query, to a Hadoop job
  • Apache ZooKeeper: software that is used to coordinate the master nodes


Note: “ZK” represents ZooKeeper in this diagram.

This architecture allows Mesos to share the cluster’s resources amongst applications with a high level of granularity. The amount of resources offered to a particular framework is based on the policy set on the master, and the framework scheduler decides which of the offers to use. Once the framework scheduler decides which offers it wants to use, it tells Mesos which tasks should be executed, and Mesos launches the tasks on the appropriate slaves. After tasks are completed, and the consumed resources are freed, the resource offer cycle repeats so more tasks can be scheduled.

High Availability

High availability of Mesos masters in a cluster is enabled through the use of Apache ZooKeeper to replicate the masters to form a quorum. ZooKeeper also coordinates master leader election and handles leader detection amongst Mesos components, including slaves and frameworks.

At least three master nodes are required for a highly-available configuration–a three master setup allows quorum to be maintained in the event that a single master fails–but five master nodes are recommended for a resilient production environment, allowing quorum to be maintained with two master nodes offline.

For more about Apache Mesos, visit its official documentation page.

A Basic Overview of Marathon

Marathon is a framework for Mesos that is designed to launch long-running applications, and, in Mesosphere, serves as a replacement for a traditional init system. It has many features that simplify running applications in a clustered environment, such as high-availability, node constraints, application health checks, an API for scriptability and service discovery, and an easy to use web user interface. It adds its scaling and self-healing capabilities to the Mesosphere feature set.

Marathon can be used to start other Mesos frameworks, and it can also launch any process that can be started in the regular shell. As it is designed for long-running applications, it will ensure that applications it has launched will continue running, even if the slave node(s) they are running on fails.

For more about Marathon, visit its GitHub page.

A Basic Overview of Chronos

Chronos is a framework for Mesos that was originally developed by Airbnb as a replacement for cron. As such, it is a fully-featured, distributed, and fault-tolerant scheduler for Mesos, which eases the orchestration of jobs, which are collections of tasks. It includes an API that allows for scripting of scheduling jobs, and a web UI for ease of use.

In Mesosphere, Chronos compliments Marathon as it provides another way to run applications, according to a schedule or other conditions, such as the completion of another job. It is also capable of scheduling jobs on multiple Mesos slave nodes, and provides statistics about job failures and successes.

For more about Chronos, visit its GitHub page.

A Basic Overview of HAProxy

HAProxy is a popular open source load balancer and reverse proxying solution. It can be used in Mesosphere to route network traffic from known hosts, typically Mesos masters, to the actual services that are running on Mesos slave nodes. The service discovery capabilities of Mesos can be used to dynamically configure HAProxy to route incoming traffic to the proper backend slave nodes.

For more about the general capabilities of HAProxy, check out our Introduction to HAProxy.


Mesosphere is a system that combines a number of components to effectively manage server clustering and highly available deployments on top of an existing operating system layer. Unlike systems like CoreOS, Mesosphere is not a specialized operating system and is instead a set of packages.

In this guide, we will go over how to configure a highly available cluster in Mesosphere. This configuration will set us up with failover in case any of our master nodes go down as well as a pool of slave servers to handle the tasks that are scheduled.

We will be using Ubuntu 14.04 servers for this guide.

Prerequisites and Goals

Before you complete this guide, it is strongly recommended that you review our introduction to Mesosphere. This is a good way to get familiar with the components that the system is comprised of and assist you in identifying what each unit is responsible for.

During this tutorial, we will be using six Ubuntu servers. This fulfills the Apache Mesos recommendation of having at least three masters for a production environment. It also provides a pool of three worker or slave servers, which will be assigned work when tasks are sent to the cluster.

The six servers we will be using will use zookeeper to keep track of the current leader of the master servers. The Mesos layer, built on top of this, will provide distributed synchronization and resource handling. It is responsible for managing the cluster. Marathon, the cluster’s distributed init system, is used to schedule tasks and hand work to the slave servers.

For the sake of this guide, we will be assuming that our machines have the following configuration:

Hostname Function IP Address
master1 Mesos master
master2 Mesos master
master3 Mesos master
slave1 Mesos slave
slave2 Mesos slave
slave3 Mesos slave

Each of these machines should have Ubuntu 14.04 installed. You will want to complete the basic configuration items listed in our Ubuntu 14.04 initial server setup guide.

When you are finished with the above steps, continue on with this guide.

Install Mesosphere on the Servers

The first step to getting your cluster up and running is to install the software. Fortunately, the Mesosphere project maintains an Ubuntu repository with up-to-date packages that are easy to install.

Add the Mesosphere Repositories to your Hosts

On all of the hosts (masters and slaves), complete the following steps.

First, add the Mesosphere repository to your sources list. This process involves downloading the Mesosphere project’s key from the Ubuntu keyserver and then crafting the correct URL for our Ubuntu release. The project provides a convenient way of doing this:

sudo apt-key adv --keyserver --recv E56151BF DISTRO=$(lsb_release -is | tr '[:upper:]' '[:lower:]') CODENAME=$(lsb_release -cs) echo "deb${DISTRO} ${CODENAME} main" | sudo tee /etc/apt/sources.list.d/mesosphere.list

Install the Necessary Components

After you have the Mesosphere repository added to your system, you must update your local package cache to gain access to the new component:

sudo apt-get -y update

Next, you need to install the necessary packages. The components you need will depend on the role of the host.

For your master hosts, you need the mesosphere meta package. This includes the zookeeper,mesos,marathon, and chronos applications:

sudo apt-get install mesosphere

For your slave hosts, you only need the mesos package, which also pulls in zookeeper as a dependency:

sudo apt-get install mesos

Set up the Zookeeper Connection Info for Mesos

The first thing we are going to do is configure our zookeeper connection info. This is the underlying layer that allows all of our hosts to connect to the correct master servers, so it makes sense to start here.

Our master servers will be the only members of our zookeeper cluster, but all of our servers will need some configuration to be able to communicate using the protocol. The file that defines this is/etc/mesos/zk.

On all of your hosts, complete the following step. Open the file with root privileges:

sudo nano /etc/mesos/zk

Inside, you will find the connection URL is set by default to access a local instance. It will look like this:


We need to modify this to point to our three master servers. This is done by replacing localhost with the IP address of our first Mesos master server. We can then add a comma after the port specification and replicate the format to add our second and third masters to the list.

For our guide, our masters have IP addresses of,, and Using these values, our file will look like this:


The line must start with zk:// and end with /mesos. In between, your master servers’ IP addresses andzookeeper ports (2181 by default) are specified.

Save and close the file when you are finished.

Use this identical entry in each of your masters and slaves. This will help each individual server connect to the correct master servers to communicate with the cluster.

Configure the Master Servers’ Zookeeper Configuration

On your master servers, we will need to do some additional zookeeper configuration.

The first step is to define a unique ID number, from 1 to 255, for each of your master servers. This is kept in the/etc/zookeeper/conf/myid file. Open it now:

sudo nano /etc/zookeeper/conf/myid

Delete all of the info in this file and replace it with a single number, from 1 to 255. Each of your master servers must have a unique number. For the sake of simplicity, it is easiest to start at 1 and work your way up. We will be using 1, 2, and 3 for our guide.

Our first server will just have this in the file:


Save and close the file when you are finished. Do this on each of your master servers.

Next, we need to modify our zookeeper configuration file to map our zookeeper IDs to actual hosts. This will ensure that the service can correctly resolve each host from the ID system that it uses.

Open the zookeeper configuration file now:

sudo nano /etc/zookeeper/conf/zoo.cfg

Within this file, you need to map each ID to a host. The host specification will include two ports, the first for communicating with the leader, and the second for handling elections when a new leader is required. Thezookeeper servers are identified by “server” followed by a dot and their ID number.

For our guide, we will be using the default ports for each function and our IDs are 1-3. Our file will look like this:


Add these same mappings in each of your master servers’ configuration files. Save and close each file when you are finished.

With that, our zookeeper configuration is complete. We can begin focusing on Mesos and Marathon.

Configure Mesos on the Master Servers

Next, we will configure Mesos on the three master servers. These steps should be taken on each of your master servers.

Modify the Quorum to Reflect your Cluster Size

First, we need to adjust the quorum necessary to make decisions. This will determine the number of hosts necessary for the cluster to be in a functioning state.

The quorum should be set so that over 50 percent of the master members must be present to make decisions. However, we also want to build in some fault tolerance so that if all of our masters are not present, the cluster can still function.

We have three masters, so the only setting that satisfies both of these requirements is a quorum of two. Since the initial configuration assumes a single server setup, the quorum is currently set to one.

Open the quorum configuration file:

sudo nano /etc/mesos-master/quorum

Change the value to “2”:


Save and close the file. Repeat this on each of your master servers.

Configure the Hostname and IP Address

Next, we’ll specify the hostname and IP address for each of our master servers. We will be using the IP address for the hostname so that our instances will not have trouble resolving correctly.

For our master servers, the IP address needs to be placed in these files:

  • /etc/mesos-master/ip
  • /etc/mesos-master/hostname

First, add each master node’s individual IP address in the /etc/mesos-master/ip file. Remember to change this for each server to match the appropriate value:

echo | sudo tee /etc/mesos-master/ip

Now, we can copy this value to the hostname file:

sudo cp /etc/mesos-master/ip /etc/mesos-master/hostname

Do this on each of your master servers.

Configure Marathon on the Master Servers

Now that Mesos is configured, we can configure Marathon, Mesosphere’s clustered init system implementation.

Marathon will run on each of our master hosts, but only the leading master server will be able to actually schedule jobs. The other Marathon instances will transparently proxy requests to the master server.

First, we need to set the hostname again for each server’s Marathon instance. Again, we will use the IP address, which we already have in a file. We can copy that to the file location we need.

However, the Marathon configuration directory structure we need is not created automatically. We will have to create the directory and then we can copy the file over:

sudo mkdir -p /etc/marathon/conf
sudo cp /etc/mesos-master/hostname /etc/marathon/conf

Next, we need to define the list of zookeeper masters that Marathon will connect to for information and scheduling. This is the same zookeeper connection string that we’ve been using for Mesos, so we can just copy the file. We need to place it in a file called master:

sudo cp /etc/mesos/zk /etc/marathon/conf/master

This will allow our Marathon service to connect to the Mesos cluster. However, we also want Marathon to store its own state information in zookeeper. For this, we will use the other zookeeper connection file as a base, and just modify the endpoint.

First, copy the file to the Marathon zookeeper location:

sudo cp /etc/marathon/conf/master /etc/marathon/conf/zk

Next, open the file in your editor:

sudo nano /etc/marathon/conf/zk

The only portion we need to modify in this file is the endpoint. We will change it from /mesos to/marathon:


This is all we need to do for our Marathon configuration.

Configure Service Init Rules and Restart Services

Next, we will restart our master servers’ services to use the settings that we have been configuring.

First, we will need to make sure that our master servers are only running the Mesos master process, and not running the slave process. We can stop any currently running slave processes (this might fail, but that’s okay since this is just to ensure the process is stopped). We can also ensure that the server doesn’t start the slave process at boot by creating an override file:

sudo stop mesos-slave
echo manual | sudo tee /etc/init/mesos-slave.override

Now, all we need to do is restart zookeeper, which will set up our master elections. We can then start our Mesos master and Marathon processes:

sudo restart zookeeper
sudo start mesos-master
sudo start marathon

To get a peak at what you have just set up, visit one of your master servers in your web browser at port5050:

You should see the main Mesos interface. You may be told you are being redirected to the active master depending on whether you connected to the elected leader or not. Either way, the screen will look similar to this:


This is a view of your cluster currently. There is not much to see because there are no slave nodes available and no tasks started.

We have also configured Marathon, Mesosphere’s long-running task controller. This will be available at port 8080on any of your masters:


We will briefly go over how to use these interfaces once we get our slaves set up.

Configure the Slave Servers

Now that we have our master servers configured, we can begin configuring our slave servers.

We have already configured our slaves with our masters servers’ zookeeper connection information. The slaves themselves do not run their own zookeeper instances.

We can stop any zookeeper process currently running on our slave nodes and create an override file so that it will not automatically start when the server reboots:

sudo stop zookeeper
echo manual | sudo tee /etc/init/zookeeper.override

Next, we want to create another override file to make sure the Mesos master process doesn’t start on our slave servers. We will also ensure that it is stopped currently (this command may fail if the process is already stopped. This is not a problem):

echo manual | sudo tee /etc/init/mesos-master.override
sudo stop mesos-master

Next, we need to set the IP address and hostname, just as we did for our master servers. This involves putting each node’s IP address into a file, this time under the /etc/mesos-slave directory. We will use this as the hostname as well, for easy access to services through the web interface:

echo | sudo tee /etc/mesos-slave/ip
sudo cp /etc/mesos-slave/ip /etc/mesos-slave/hostname

Again, use each slave server’s individual IP address for the first command. This will ensure that it is being bound to the correct interface.

Now, we have all of the pieces in place to start our Mesos slaves. We just need to turn on the service:

sudo start mesos-slave

Do this on each of your slave machines.

To see whether your slaves are successfully registering themselves in your cluster, go back to your one of your master servers at port 5050:

You should see the number of active slaves at “3” now in the interface:


You can also see that the available resources in the interface has been updated to reflect the pooled resources of your slave machines:


To get additional information about each of your slave machines, you can click on the “Slaves” link at the top of the interface. This will give you an overview of each machine’s resource contribution, as well as links to a page for each slave:


Starting Services on Mesos and Marathon

Marathon is Mesosphere’s utility for scheduling long-running tasks. It is easy to think of Marathon as the init system for a Mesosphere cluster because it handles starting and stopping services, scheduling tasks, and making sure applications come back up if they go down.

You can add services and tasks to Marathon in a few different ways. We will only be covering basic services. Docker containers will be handled in a future guide.

Starting a Service through the Web Interface

The most straight forward way of getting a service running quickly on the cluster is to add an application through the Marathon web interface.

First, visit the Marathon web interface on one of your the master servers. Remember, the Marathon interface is on port 8080:

From here, you can click on the “New App” button in the upper-right corner. This will pop up an overlay where you can add information about your new application:


Fill in the fields with the requirements for your app. The only fields that are mandatory are:

  • ID: A unique ID selected by the user to identify a process. This can be whatever you’d like, but must be unique.
  • Command: This is the actual command that will be run by Marathon. This is the process that will be monitored and restarted if it fails.

Using this information, you can set up a simple service that just prints “hello” and sleeps for 10 seconds. We will call this “hello”:


When you return to the interface, the service will go from “Deploying” to “Running”:


Every 10 seconds or so, the “Tasks/Instances” reading will go from “1/1″ to “0/1″ as the sleep amount passes and the service stops. Marathon then automatically restarts the task again. We can see this process more clearly in the Mesos web interface at port 5050:

Here, you can see the process finishing and being restarted:


If you click on “Sandbox” and then “stdout” on any of the tasks, you can see the “hello” output being produced:

Mesos output

Starting a Service through the API

We can also submit services through Marathon’s API. This involves passing in a JSON object containing all of the fields that the overlay contained.

This is a relatively simple process. Again, the only required fields are id for the process identifier andcmd which contains the actual command to run.

So we could create a JSON file called hello.json with this information:

nano hello.json

Inside, the bare minimum specification would look like this:

    "id": "hello2",
    "cmd": "echo hello; sleep 10"

This service will work just fine. However, if we truly want to emulate the service we created in the web UI, we have to add some additional fields. These were defaulted in the web UI and we can replicate them here:

    "id": "hello2",
    "cmd": "echo hello; sleep 10",
    "mem": 16,
    "cpus": 0.1,
    "instances": 1,
    "disk": 0.0,
    "ports": [0]

Save and close the JSON file when you are finished.

Next, we can submit it using the Marathon API. The target is one of our master’s Marathon service at port8080and the endpoint is /v2/apps. The data payload is our JSON file, which we can read into curlby using the -d flag with the @ flag to indicate a file.

The command to submit will look like this:

curl -i -H 'Content-Type: application/json' -d@hello2.json

If we look at the Marathon interface, we can see that it was successfully added. It seems to have the exact same properties as our first service:


The new service can be monitored and accessed in exactly the same way as the first service.

At this point, you should have a production-ready Mesosphere cluster up and running. We have only covered the basic configuration at this point, but you should be able to see the possibilities of leveraging the Mesosphere system.