VMware vSphere Internals Q/A

Estimated Reading Time: 4 minutes

Welcome, readers! This is the second series of VMware vSphere Q/A, focused on ESXi internals and related features. I hope it will help you with interview preparation:

Que:1>. What is hyperthreading in terms of VMware terminology?

Ans: Hyperthreading technology allows a single physical processor core to behave like two logical processors. The processor can run two independent applications at the same time. To avoid confusion between logical and physical processors, Intel refers to a physical processor as a socket, and the discussion in this chapter uses that terminology as well.
On processors with Intel Hyper-Threading technology, each core can have two logical processors which share most of the core’s resources, such as memory caches and functional units. Such logical processors
are usually called threads.
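For a quick check from the ESXi shell (assuming ESXi 5.x or later with shell access), the esxcli hardware namespace reports the socket, core and logical processor counts along with the hyperthreading flags:

~ # esxcli hardware cpu global get

Compare the CPU Cores and CPU Threads values; with hyperthreading supported and enabled, the thread count is twice the core count.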

Que:2> Does hyperthreading double the performance?
Ans: While hyperthreading does not double the performance of a system, it can increase performance by better utilizing idle resources leading to greater throughput for certain important workload types. An
application running on one logical processor of a busy core can expect slightly more than half of the throughput that it obtains while running alone on a non-hyperthreaded processor. Hyperthreading performance improvements are highly application-dependent, and some applications might see performance degradation with hyperthreading because many processor resources (such as the cache) are shared between logical processors.

Que:3> What processors support Hyperthreading?
Ans: Many processors do not support hyperthreading and as a result have only one thread per core. For such processors, the number of cores also matches the number of logical processors. The following
processors support hyperthreading and have two threads per core.
– Processors based on the Intel Xeon 5500 processor microarchitecture.
– Intel Pentium 4 (HT-enabled)
– Intel Pentium EE 840 (HT-enabled)

Que:4> What does CPU affinity feature do?
Ans: Using CPU affinity, you can assign a virtual machine to a specific processor. This allows you to restrict the assignment of virtual machines to a specific available processor in multiprocessor systems.

How to configure it?
1. In the vSphere Client inventory panel, select a virtual machine and select Edit Settings.
2. Select the Resources tab and select Advanced CPU.
3. Click the Run on processor(s) button.
4. Select the processors where you want the virtual machine to run and click OK.
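Behind the scenes this maps to a scheduler setting in the virtual machine’s configuration. As a rough illustration (an example value, not something you normally need to edit by hand), the affinity appears in the VM’s .vmx file along these lines:

sched.cpu.affinity = "0,1"

Here "0,1" pins the VM’s vCPUs to logical CPUs 0 and 1, while a value of "all" removes the restriction.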

Que:5> What is VMware Memory Ballooning?
Virtual memory ballooning allows a physical host to recapture unused memory on its guest virtual machines and allocate the resource where needed.
Virtual memory ballooning is a computer memory reclamation technique used by a hypervisor to allow the physical host system to retrieve unused memory from certain guest virtual machines (VMs) and share it
with others. Memory ballooning allows the total amount of RAM required by guest VMs to exceed the amount of physical RAM available on the host. When the host system runs low on physical RAM resources, memory ballooning allocates it selectively to VMs.

If a VM only uses a portion of the memory that it was allocated, the ballooning technique makes it available for the host to use. For example, if all the VMs on a host are allocated 8 GB of memory, some
of the VMs will only use half the allotted share. Meanwhile, one VM might need 12 GB of memory for an intensive process. Memory ballooning allows the host to borrow that unused memory and allocate it to the VMs with higher memory demand.

The guest operating system runs inside the VM, which is allocated a portion of memory. Therefore, the guest OS is unaware of the total memory available. Memory ballooning makes the guest operating system
aware of the host’s memory shortage.

Virtualization providers such as VMware enable memory ballooning. VMware memory ballooning, Microsoft Hyper-V dynamic memory, and the open source KVM balloon process are similar in concept. The host uses
balloon drivers running on the VMs to determine how much memory it can take back from an under-utilizing VM. Balloon drivers must be installed on any VM that participates in the memory ballooning
technique.

Balloon drivers get the target balloon size from the hypervisor and then inflate by allocating the proper number of guest physical pages within the VM. This process is known as inflating the balloon; the process of releasing the available pages is known as deflating the balloon.

VM memory ballooning can create performance problems. When a balloon driver inflates to the point where the VM no longer has enough memory to run its processes within itself, it starts using another VM memory technique known as memory swapping. This will slow down the VM, depending upon the amount of memory to recoup and/or the quality of the storage IOPS delivered to it.

Que:6> How to check VMware ballooning?
To check for ballooning, you can either open esxtop on the host or use the vCenter performance charts (the memory “Balloon” counter).
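For example, in esxtop you can press m for the memory view; the MCTLSZ (MB) column shows how much memory the balloon driver has currently reclaimed from each VM, and MCTL? shows whether the balloon driver is active (column names may vary slightly between versions). Inside a Linux guest with VMware Tools installed you can also query the balloon directly:

vmware-toolbox-cmd stat balloon

A non-zero value means the host is actively reclaiming memory from that guest.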

Que 7> What is the limitation of Vmware ballooning?

The balloon driver can inflate up to a maximum of 65% of the VM’s configured memory. For instance, on a VM with 1000 MB of memory, the balloon can inflate to 650 MB; with full inflation the hypervisor reclaims 650 MB of memory from that VM. The way to avoid ballooning is not to uninstall the balloon driver but to create a “Memory Reservation” for the virtual machine. The downside of full inflation is that you risk the VM doing guest OS swapping to its page file. Just remember that page file swapping is better than hypervisor swapping: hypervisor swapping happens without the guest operating system being aware of it, whereas with page file swapping it is the OS that decides which pages to swap to disk.

Understanding Docker Container Architecture

Estimated Reading Time: 3 minutes

What is Docker?

Docker is a lightweight containerization technology that has gained widespread popularity in recent years.

What does Docker use?

It uses a host of the Linux kernel’s features such as namespaces, cgroups, AppArmor profiles, and so on, to sandbox processes into configurable virtual environments.

What does Docker container look like?

A Docker container can be correlated to an instance of a VM. It runs sandboxed processes that share the same kernel as the host. The term container comes from the concept of shipping containers. The idea is that you can ship containers from your development environment to the deployment environment and the applications running in the containers will behave the same way no matter where you run them. The following image shows the layers of AUFS.

[Image: AUFS layers]

What does a Docker image look like?

A Docker image is made up of filesystems layered over each other.

[Image: Docker image layers]

At the base is a boot filesystem, bootfs, which resembles the typical Linux/Unix boot filesystem. A Docker user will probably never interact with the boot filesystem. Indeed, when a container has booted, it is moved into memory, and the boot filesystem is unmounted to free up the RAM used by the initrd disk image. So far this looks pretty much like a typical Linux virtualization stack.

Indeed, Docker next layers a root filesystem, rootfs, on top of the boot filesystem. This rootfs can be one or more operating systems (e.g., a Debian or Ubuntu filesystem).

In a more traditional Linux boot, the root filesystem is mounted read-only and then switched to read-write after boot and an integrity check is conducted. In the Docker world, however, the root filesystem stays in read-only mode, and Docker takes advantage of a union mount to add more read-only filesystems onto the root filesystem.

A union mount is a mount that allows several filesystems to be mounted at one time but appear to be one filesystem. The union mount overlays the filesystems on top of one another so that the resulting filesystem may contain files and subdirectories from any or all of the underlying filesystems. Docker calls each of these filesystems images.

Images can be layered on top of one another. The image below is called the parent image and you can traverse each layer until you reach the bottom of the image stack where the final image is called the base image.
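You can inspect this layering on any image you have pulled (assuming a host where Docker is installed, as shown in the next post):

docker history ubuntu:14.04

Each row of the output is one read-only layer, listed from the top of the stack down to the base image; the IDs and sizes will of course differ on your system.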

Finally, when a container is launched from an image, Docker mounts a read-write filesystem on top of any layers below. This is where whatever processes we want our Docker container to run will execute. This sounds confusing, so perhaps it is best represented by a diagram.

When Docker first starts a container, the initial read-write layer is empty. As changes occur, they are applied to this layer; for example, if you want to change a file, then that file will be copied from the read-only layer below into the read-write layer. The read-only version of the file will still exist but is now hidden underneath the copy.

This pattern is traditionally called “copy on write” and is one of the features that makes Docker so powerful. Each image layer is read-only; it never changes. When a container is created, Docker builds from the stack of images and then adds the read-write layer on top.

That layer, combined with the knowledge of the image layers below it and some configuration data, forms the container. Containers can be changed, they have state, and they can be started and stopped. This, together with the image-layering framework, allows us to quickly build images and run containers with our applications and services.
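A simple way to watch copy-on-write in action is the docker diff command, which lists what a container has added (A), changed (C) or deleted (D) in its read-write layer. A rough sketch, using a throwaway container name of my own choosing:

docker run -i -t --name cow-test ubuntu:14.04 /bin/bash
root@<id>:/# echo hello > /tmp/test.txt
root@<id>:/# exit
docker diff cow-test

The diff will report /tmp/test.txt as added in the top read-write layer, while the underlying read-only image layers remain untouched.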

Introduction to Docker Containers

Estimated Reading Time: 3 minutes

What is Docker?

Docker is an open source tool based on Linux container technology (LXC). LXC is an OS-level virtualization method for running multiple isolated Linux operating systems, or containers, on a single host. LXC does this by using kernel-level namespaces, which help to isolate containers from the host.

Docker is designed to change how you think about workload/application deployments. It helps you easily create lightweight, self-sufficient, portable application containers that can be shared, modified and easily deployed to different infrastructures such as cloud/compute servers or bare-metal servers. Docker mainly provides a comprehensive abstraction layer that allows developers to ‘containerize’ or ‘package’ any application and have it run on any infrastructure. Docker is based on container virtualization, which is not new, but there is no better tool than Docker for managing kernel-level technologies such as LXC, cgroups and copy-on-write file systems. It helps us manage these complicated kernel-layer technologies through tools and APIs.

Is Docker secure?

Definitely. The user namespace separates the users of the container and the host, ensuring that the container’s root user does not have root privileges on the host OS. Likewise, the process namespace ensures that processes are displayed and managed only inside the container, not on the host, and the network namespace gives each container its own network devices and IP addresses.

How is containerization different from Virtualization?

Containers virtualize at the OS level, whereas both Type-1 and Type-2 hypervisor-based solutions virtualize at the hardware level. Both are forms of virtualization: in the case of VMs, a hypervisor (whether Type-1 or Type-2) slices up the hardware, while containers make available protected portions of the OS; they effectively virtualize the OS. If you run multiple containers on the same host, no container will know that it is sharing resources, because each container gets its own abstraction of the OS. Docker takes the help of namespaces to provide these isolated regions known as containers; each container runs in its own allocated namespace and does not have access outside of it. Technologies such as cgroups, union file systems and container formats are also used for different purposes throughout containerization.
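A quick way to see this OS-level isolation for yourself, once Docker is installed as described in the next section (a minimal sketch, assuming the ubuntu:14.04 image has been pulled):

docker run --rm ubuntu:14.04 ps aux

Inside the container’s PID namespace only the container’s own processes are visible, typically just the ps command itself running as PID 1; the host’s process list stays hidden.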

How to start with Docker?

1. Checking the system information:

[root@localhost ~]# cat /etc/redhat-release
CentOS Linux release 7.0.1406 (Core)
[root@localhost ~]#

2. On CentOS 7, installing Docker is straightforward:

[root@localhost ~]# yum -y install docker docker-registry

3. Starting Docker at boot time:

[root@localhost ~]# systemctl enable docker.service
ln -s ‘/usr/lib/systemd/system/docker.service’ ‘/etc/systemd/system/multi-user.target.wants/docker.service’
[root@localhost ~]#

4. Starting the docker service:

[root@localhost ~]# systemctl start docker.service
[root@localhost ~]#

5. Verify the Docker status:

[root@localhost ~]# systemctl status docker.service
docker.service -- Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled)
Active: active (running) since Thu 2015-03-05 06:03:19 EST; 32s ago
Docs: http://docs.docker.com
Main PID: 19739 (docker)
CGroup: /system.slice/docker.service
+-19739 /usr/bin/docker -d --selinux-enabled -H fd://

6. Let’s pull some Docker images from Docker Hub:

[root@localhost ~]# docker pull centos
Pulling repository centos

……
[root@localhost ~]#docker pull ubuntu
Pulling repository ubuntu
2d24f826cb16: Download complete
511136ea3c5a: Download complete
fa4fd76b09ce: Download complete
1c8294cc5160: Download complete
117ee323aaa9: Download complete
Status: Downloaded newer image for ubuntu:latest

7. Verify that the Docker images were pulled successfully:

docker images
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
centos 7 88f9454e60dd 39 hours ago 223.9 MB
centos centos7 88f9454e60dd 39 hours ago 223.9 MB
centos latest 88f9454e60dd 39 hours ago 223.9 MB
ubuntu 14.04 2d24f826cb16 13 days ago 192.7 MB
ubuntu latest 2d24f826cb16 13 days ago 192.7 MB
ubuntu trusty 2d24f826cb16 13 days ago 192.7 MB
ubuntu trusty-20150218.1 2d24f826cb16 13 days ago 192.7 MB
ubuntu 14.04.2 2d24f826cb16 13 days ago 192.7 MB

Running a CentOS Docker Container

docker run -i -t centos /bin/bash
[root@f93b7ef64ba4 /]# cat /etc/issue
\S
Kernel \r on an \m

[root@f93b7ef64ba4 /]# cat /etc/redhat-release
CentOS Linux release 7.0.1406 (Core)

You are now using a bash shell inside a CentOS Docker container.

Let’s log in to the Ubuntu container:

[root@localhost ~]# docker run -i -t ubuntu /bin/bash
root@6566e477a430:/# lsb_release -d
Description: Ubuntu 14.04.2 LTS

Let’s install git in the Ubuntu container as shown below:

apt-get install git

The container now has Git installed. Type ‘exit’ to quit the bash shell.

Next, we are going to create a golden image from this container, so that the next time we need another Git container, we don’t need to install it again.
Run the following command and note the ‘CONTAINER ID’ of the container. In my case, the ID was ‘3de5614dd69c’:

[root@localhost ~] # docker ps -a

The ID shown in the listing is used to identify the container you are using, and you can use this ID to tell Docker to create an image.

Run the command below to make an image of the previously created container. The syntax is docker commit <CONTAINER ID> <name>.

I have used the previous container ID, which we got in the earlier step:

[root@localhost ~] # docker commit 3de5614dd69c ajeetraina/lamp-image

That’s it. You can verify that the new image already has Git installed.
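For example, a quick check using the image name committed above:

docker run -i -t ajeetraina/lamp-image git --version

If the commit worked, the container started from the new image already has the git binary available, without running apt-get install again.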

How to delete all docker containers?

docker rm $(docker ps -aq)
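Along the same lines, if you also want to clean up all downloaded images afterwards (careful: this removes every image on the host), a commonly used one-liner is:

docker rmi $(docker images -q)

Containers must be removed first with the docker rm command above; otherwise Docker will refuse to delete images that are still in use by existing containers.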

Tips:

There is a difference between docker ps -all and docker ps --all. Try it out! With a single dash, -all is effectively interpreted as the bundled short options -a and -l, so only the latest created container is shown, as the second listing below demonstrates.

-l, --latest=false     Show only the latest created container, including non-running ones.

[root@localhost dell]# docker ps --all
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS                     PORTS               NAMES
e98d2cef809e        ubuntu:14.04        "/bin/bash"         8 days ago          Exited (0) 8 days ago                          MyUbuntu1
f3804f721c1e        ubuntu:14.04        "/bin/bash"         8 days ago          Exited (0) 46 hours ago                        MyUbuntu
920e86fe624b        ubuntu:14.04        "ps"                8 days ago          Exited (0) 8 days ago                          cranky_feynman
1fa28a405c03        centos:7            "/bin/bash"         3 weeks ago         Exited (0) 10 days ago                         dreamy_wilson
6566e477a430        ubuntu:14.04        "/bin/bash"         3 weeks ago         Exited (127) 3 weeks ago                       insane_torvalds
f93b7ef64ba4        centos:7            "/bin/bash"         3 weeks ago         Exited (0) 10 days ago                         elegant_fermi
c44787c9f28a        centos:7            "/bin/bash"         3 weeks ago         Exited (127) 3 weeks ago                       cocky_pare
[root@localhost dell]# docker ps -all
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS                  PORTS               NAMES
e98d2cef809e        ubuntu:14.04        "/bin/bash"         8 days ago          Exited (0) 8 days ago                       MyUbuntu1
[root@localhost dell]#

I hope it has been fun test-driving Docker for the first time. In future posts, I plan to cover other aspects of the Docker platform.

VMware vSphere High Availability Q/A

Estimated Reading Time: 19 minutes

Looking for VMware vSphere High Availability interview questions? I have attempted to gather important interview questions which you might find useful for your preparation. Here we go:

1.  How will you define VMware HA?

As per the VMware definition, VMware® High Availability (HA) provides easy-to-use, cost-effective high availability for applications running in virtual machines. In the event of server failure, affected virtual machines are automatically restarted on other production servers with spare capacity.

The High Availability (HA) feature in vSphere 4.1 allows a group of ESX/ESXi hosts in a cluster to identify individual host failures and thereby provide for higher availability of hosted VMs. HA will restart VMs which were running on a failed host; it is a high-availability solution, not a zero-downtime solution such as application clustering or VMware Fault Tolerance. There will be a period of time when VMs are offline following a physical host failure; this is important to understand, and you should ensure that your customers and management are aware of this. HA is a complex topic, but setting it up and using it are fairly straightforward.

2. List out the key features of VMware HA.

Ans: Key features of VMware HA include:

  • Proactive monitoring of all physical servers and virtual machines
  • Automatic detection of server failure
  • Rapid restart of virtual machines affected by server failure
  • Optimal placement of virtual machines after server failure
  • Scalable availability up to 32 nodes across multiple servers

2. We have a lot of features like vMotion, DRS, SMP etc., so why do we need HA?

We need this because we need our services running without interruption. Suppose, for some reason, one of the ESX servers in the cluster suddenly goes down. What happens to the virtual machines which are running on that particular server? Do they continue to run, or do they go down? They go down too. But with the help of VMware HA, those VMs can be restarted immediately on the other ESX servers in the same cluster. You will still see a downtime of 5–10 minutes, because a server crash is an unexpected event.

3.  Does HA use vMotion?

No. VMware HA doesn’t use vMotion. In fact, the VMs are stopped and restarted on another ESX host.

4.What architecture changes were seen between ESXi 4.1 and 5.0?

vSphere 5.0 comes with a new HA architecture. HA has been rewritten from the ground up to shed some of the constraints that were enforced by AAM. HA in 5.0, also referred to as FDM (Fault Domain Manager), introduces less complexity and higher resiliency. From a UI perspective not a lot has changed, but a lot has changed under the covers: there is no more primary/secondary node concept, but a master/slave concept with an automated election process.

5. Can you brief what difference did you find between ESXi 4.1 and 5.0?

A comparison of vSphere 4.1 HA and vSphere 5.0 HA:

– Agent: In vSphere 4.1 the HA agent is called the Automated Availability Manager (AAM); in vSphere 5.0 it is called the Fault Domain Manager (FDM).

– Node roles: When we configure HA on a vSphere 4.1 cluster, the first 5 hosts are designated as primary nodes; out of these 5, one node acts as the “Master Primary” and handles restarts of VMs in the event of a host failure, and all remaining hosts join as secondary nodes. When we configure HA on a vSphere 5.0 cluster, the first node is elected as master and all other nodes are configured as slaves. The master node is elected based on the number of datastores it is connected to; if all hosts in the cluster are connected to the same number of datastores, the host’s managed ID is taken into consideration, and the host with the highest managed ID is elected as master.

– Heartbeats and elections: In vSphere 4.1, primary nodes maintain information about cluster settings and secondary node states. All nodes exchange heartbeats every second to know the health status of the other nodes: primary nodes send their heartbeats to all other primary and secondary nodes, while secondary nodes send their heartbeats to primaries only. In case of a primary failure, another primary node takes over the responsibility for restarts; if all primaries go down at the same time, no restarts are initiated, in other words at least one primary is required to initiate restarts. Election of a primary happens only in the following scenarios: when a host is disconnected, when a host is entered into maintenance mode, when a host is not responding, and when the cluster is reconfigured for HA. In vSphere 5.0, all hosts exchange their heartbeats with each other to know about their health states, and the host isolation response has been enhanced by introducing datastore heartbeating: every host creates a hostname-hb file on the configured datastores and keeps it updated at a specific interval, with two datastores selected for this purpose. If we want to know who is the master and who are the slaves, we just need to go to vCenter and click on Cluster Status in the HA area of the cluster summary page.

6.Is HA dependent on Vmware vCenter Server?
Ans:  Yes. But only during the initial installation and configuration.

7. Does HA work without vCenter Server?

Ans: Yes. HA works through a master and slave relationship between the hosts in the cluster.

8. Does HA works with DRS?

Ans. In vSphere 4.1, HA can work with and utilize Distributed Resource Scheduler (DRS) if it is also enabled on the cluster so it is important to understand what DRS is…though a full description of DRS is outside the scope of this article.  DRS continuously monitors the resource usage of hosts within a cluster and can suggest or automatically migrate (vMotion) a VM from one host to another to balance out the resource usage across the cluster as a whole and prevent any single host from becoming over-utilized.  HA is based on Legato’s Automated Availability Manager, and as such you will see some HA-related files and logs on an ESX host labeled with “AAM”.  HA requires vCenter for initial configuration, but unlike DRS it does not require vCenter to function after it is up and running.

In short, using VMware HA with Distributed Resource Scheduler (DRS) combines automatic failover with load balancing. This combination can result in faster rebalancing of virtual machines after VMware HA has moved virtual machines to different hosts. When VMware HA performs failover and restarts virtual machines on different hosts, its first priority is the immediate availability of all virtual machines. After the virtual machines have been restarted, those hosts on which they were powered on might be heavily loaded, while other hosts are comparatively lightly loaded. VMware HA uses the virtual machine’s CPU and memory reservation to determine if a host has enough spare capacity to accommodate the virtual machine.

9. How does primary and secondary nodes work under HA?

Ans:  HA will elect up to five hosts to become primary HA nodes, all other nodes in a cluster are secondary nodes up to a maximum of 32 total.  (Note: Host and Node are used interchangeably)

By default, the first 5 nodes to join the HA cluster will be the primary nodes.  If a primary node fails, is removed from the cluster, is placed in Maintenance Mode, or an administrator initiates the “Reconfigure for HA” command, HA will initiate the re-election process to randomly elect five primary nodes.  The purpose of a primary node is to maintain node state data, which is sent by all nodes every 10 seconds by default in vSphere 4.1.

 One primary node must be online at all times for HA to function, as such it is recommended to have primary nodes physically separated across multiple racks or enclosures if possible to ensure at least one remains online in the event that a rack or enclosure goes down.  With a limit of five primary nodes, the maximum allowable host failures for a single HA cluster is four.  One of the five primary nodes will automatically be designated as the active primary (also called Failover Coordinator), and it will be responsible for keeping track of restart attempts and deciding where to restart VMs.

10. Is it possible to determine which nodes are currently the primary nodes from the ESX console?

Ans. It is possible to find it by launching the AAM CLI using the following syntax:

/aam-installation/opt/vmware/aam/bin # ./Cli
From the AAM CLI, enter the ln command:
AAM> ln
From the AAM CLI you can also promote and demote primary nodes manually using the promoteNode and demoteNode commands, respectively, though this is not generally recommended.

11.Any idea how HA 4.1 determines a host has failed?
Ans:  This happens in two ways, a host can determine that it is isolated from all other hosts and initiate its configured isolation response, and other nodes can determine that one host is failed and attempt to restart the VMs hosted on the failed host elsewhere.  By default, all nodes send heartbeats to other nodes every second across the management network.  Primary nodes send heartbeats to all other nodes, and secondary nodes send heartbeats to primary nodes only.

12. Can you explain in details what is Isolation Response?

The isolation response setting determines what action a host will take when it determines that it is isolated from all other nodes in the HA cluster. When configuring HA for a cluster, you have three options for the isolation response: Power Off, Leave Powered On, and Shutdown.  The options are pretty self explanatory, the main thing to know is that the power off setting is equivalent to pulling the power on a physical server, it is not a clean shutdown. In vSphere 4.1, the default isolation response is shutdown.

When a host determines that it is no longer receiving heartbeats from any other hosts, it will attempt to ping its isolation address which by default is the default gateway of the management network.  If this fails, the isolation response is triggered.  Additional isolation addresses can be configured using the advanced setting das.isolationaddressX, where X is a number starting with 2 and incrementing upwards for each additional address.  This is useful to detect a situation where the management network may have failed while the VM networks are still operational.  The isolation detection timeline is 16 seconds, with an additional second added for each additional isolation address.  The timeline breaks down as follows; failure occurs at 0 seconds, at 13 seconds without receiving a heartbeat the isolation address is pinged, if this fails, at 14 seconds the isolation response is triggered by the host.  At 15 seconds the host is declared failed by other hosts in the cluster, and finally at 16 seconds with no heartbeats received the failover coordinator attempts to restart the failed host’s VMs on other nodes.  Should the initial restart fail, HA will attempt to restart the VM 5 more times before abandoning the restart attempt.

There is some planning to be done when configuring the isolation response.  If you use the default isolation address and isolation response settings (management default gateway and shutdown, respectively), it is possible for the management network of the host to become disconnected while the VM networks are still online. In this situation, the isolation response would be triggered and your VMs would be shutdown even though they are still online and functioning normally.  Alternatively, setting the isolation response to leave powered on while suffering a complete network failure on a node will prevent your VMs from being restarted on a functioning host, effectively taking them offline until an administrator intervenes.

13. What is HA Admission Control?

vCenter Server uses admission control to ensure that sufficient resources are available in a cluster to provide failover protection and to ensure that virtual machine resource reservations are respected. Three types of admission control are available.

1. Host

2. Resource Pool

3. VMware HA

Host ensures that a host has sufficient resources to satisfy the reservations of all virtual machines running on it.

Resource Pool Ensures that a resource pool has sufficient resources to satisfy the reservations, shares, and limits of all virtual machines associated with it.

VMware HA Ensures that sufficient resources in the cluster are reserved for virtual machine recovery in the event of host failure.

Admission control imposes constraints on resource usage and any action that would violate these constraints is not permitted. Examples of actions that could be disallowed include the following:

– Powering on a virtual machine.

– Migrating a virtual machine onto a host or into a cluster or resource pool.

– Increasing the CPU or memory reservation of a virtual machine.

Of the three types of admission control, only VMware HA admission control can be disabled. However, without it there is no assurance that all virtual machines in the cluster can be restarted after a host failure. VMware recommends that you do not disable admission control, but you might need to do so temporarily, for the following reasons:

– If you need to violate the failover constraints when there are not enough resources to support them (for example, if you are placing hosts in standby mode to test them for use with DPM).

– If an automated process needs to take actions that might temporarily violate the failover constraints (for example, as part of an upgrade directed by VMware Update Manager).

– If you need to perform testing or maintenance operations.

14.Is it possible to configure VMware HA to tolerate a specified number of host failures?

Ans: Yes.

You can configure VMware HA to tolerate a specified number of host failures. With the Host Failures Cluster Tolerates admission control policy, VMware HA ensures that a specified number of hosts can fail and sufficient resources remain in the cluster to fail over all the virtual machines from those hosts. With the Host Failures Cluster Tolerates policy, VMware HA performs admission control in the following way:

1 Calculates the slot size. A slot is a logical representation of memory and CPU resources. By default, it is sized to satisfy the requirements for any powered-on virtual machine in the cluster.

2 Determines how many slots each host in the cluster can hold.

3 Determines the Current Failover Capacity of the cluster. This is the number of hosts that can fail and still leave enough slots to satisfy all of the powered-on virtual machines.

4 Determines whether the Current Failover Capacity is less than the Configured Failover Capacity (provided by the user). If it is, admission control disallows the operation.

The maximum Configured Failover Capacity that you can set is four. Each cluster has up to five primary hosts and if all fail simultaneously, failover of all virtual machines might not be successful.

15. How is slot size calculated?

Slot size calculation: slot size is comprised of two components, CPU and memory.

1. VMware HA calculates the CPU component by obtaining the CPU reservation of each powered-on virtual machine and selecting the largest value. If you have not specified a CPU reservation for a virtual machine, it is assigned a default value of 256 MHz. (You can change this value by using the das.vmcpuminmhz advanced attribute.)

2. VMware HA calculates the memory component by obtaining the memory reservation, plus memory overhead, of each powered-on virtual machine and selecting the largest value. There is no default value for the memory reservation. If your cluster contains any virtual machines that have much larger reservations than the others, they will distort slot size calculation. To avoid this, you can specify an upper bound for the CPU or memory component of the slot size by using the das.slotcpuinmhz or das.slotmeminmb advanced attributes, respectively.
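A purely illustrative example with made-up numbers: if the largest CPU reservation among powered-on VMs is 500 MHz and the largest memory reservation plus overhead is 2048 MB, the slot size is 500 MHz / 2048 MB. A host with 8000 MHz and 32768 MB available for VMs can then hold min(8000/500, 32768/2048) = min(16, 16) = 16 slots. In a 4-host cluster with 20 powered-on VMs (20 slots required), losing 2 hosts still leaves 32 slots, but losing 3 leaves only 16, so the Current Failover Capacity is 2.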

16. What are pre-requites for HA to work?

1. Shared storage for the VMs running in the HA cluster
2. Essentials Plus, Standard, Advanced, Enterprise or Enterprise Plus licensing
3. A VMware HA-enabled cluster
4. Management network redundancy, to avoid frequent isolation responses in case of temporary network issues (preferred, not a requirement)

17. What is maximum number of primary HA hosts in vSphere 4.1?

Maximum number of primary HA host is 5. VMware HA cluster chooses the first 5 hosts that joins the cluster as primary nodes and all others hosts are automatically selected as secondary nodes.

18. What is AAM in HA?

AAM is Legato’s Automated Availability Manager. Up to and including vSphere 4.1, VMware HA was actually built around Legato’s Automated Availability Manager (AAM) software to work with VMs. VMware’s vCenter agent (vpxa) interfaces with the VMware HA agent, which acts as an intermediary to the AAM software. From vSphere 5.0 onwards, HA uses an agent called “FDM” (Fault Domain Manager).

20. How to see the list of Primary nodes in HA cluster?

View the log file named “aam_config_util_listnodes.log” under /var/log/vmware/aam using the below command

cat /var/log/vmware/aam/aam_config_util_listnodes.log

21. What is the command to restart /Start/Stop HA agent in the ESX host?

service vmwareaam restart

service vmwareaam stop

service vmwareaam start

22. Where are the HA-related logs located, in case of troubleshooting?

/var/log/vmware/aam

23. What are the basic troubleshooting steps in case the HA agent install fails on hosts in an HA cluster?

The steps below are taken from the blog post “Troubleshooting HA”:

1. Check for network issues.

2. Check that DNS is configured properly.

3. Check the VMware HA agent status on the ESX host by using the command below:

service vmwareaam status

4. Check that the networks are properly configured and named exactly as on the other hosts in the cluster; otherwise, you will get errors while installing or reconfiguring the HA agent.

5. Check that the HA-related ports are open in the firewall to allow the communication:

Incoming port: TCP/UDP 8042-8045
Outgoing port: TCP/UDP 2050-2250

6. First, try to restart/stop/start the VMware HA agent on the affected host using the commands below. In addition, you can also try to restart vpxa and the management agent on the host.

service vmwareaam restart

service vmwareaam stop

service vmwareaam start

7. Right-click the affected host and click on “Reconfigure for VMware HA” to re-install the HA agent on that particular host.

8. Remove the affected host from the cluster. Removing an ESX host from the cluster will not be allowed until that host is put into maintenance mode.

9. As an alternative to the previous step, go to the cluster settings and uncheck VMware HA to turn off HA in that cluster, then re-enable VMware HA to get the agent reinstalled.

10. For further troubleshooting, review the HA logs under the /var/log/vmware/aam directory.

24. What is the maximum number of hosts per HA cluster?

Maximum number of hosts in the HA cluster is 32

25. What is Host Isolation?

VMware HA has a mechanism to detect that a host is isolated from the rest of the hosts in the cluster. When an ESX host loses its ability to exchange heartbeats via the management network with the other hosts in the HA cluster, that ESX host is considered isolated.

26. How Host Isolation is detected?

In an HA cluster, ESX hosts use heartbeats to communicate with the other hosts in the cluster. By default, a heartbeat is sent every second.

If an ESX host in the cluster does not receive a heartbeat for 13 seconds from any other host in the cluster, the host considers itself isolated and pings the configured isolation address (the default gateway by default). If the ping fails, VMware HA executes the host isolation response.

27. What are the different types isolation response available in HA?

Power off – All the VMs are powered off when HA detects that network isolation has occurred.

Shut down – All VMs running on that host are shut down with the help of VMware Tools when HA detects that network isolation has occurred. If the shutdown via VMware Tools does not complete within 5 minutes, a power-off operation is executed for the VM. This behavior can be changed with the help of the HA advanced options; please refer to http://www.vmwarearena.com/2012/07/vmware-ha-advanced-options.html

Leave powered on – The VMs remain powered on and their state is unchanged when HA detects that network isolation has occurred.

27. How to add additional isolation address for redundancy?

By default, VMware HA pings the default gateway as the isolation address when it stops receiving heartbeats. We can add additional addresses in case we are using redundant service consoles that belong to different subnets. For example, we can add the default gateway of SC1 as the first value and the gateway of SC2 as an additional one, using the steps below:

1. Right Click your HA cluster

2. Goto to advanced options of HA

3. Add the line “das.isolationaddress1 = 192.168.0.1”

4. Add the line “das.isolationaddress2 = 192.168.1.1” as the additional isolation address

To know more, refer to http://www.vmwarearena.com/2012/07/vmware-ha-advanced-options.html

28. What is HA Admission control?

As per “VMware Availability Guide”,

vCenter Server uses admission control to ensure that sufficient resources are available in a cluster to provide failover protection and to ensure that virtual machine resource reservations are respected.

29. What are the 2 types of settings available for admission control?


Enable: Do not power on VMs that violate availability constraints

Disable: Power on VMs that violate availability constraints

30. What are the different types of Admission control policy available with VMware HA?

There are 3 different types of Admission control policy available.

Host Failures Cluster Tolerates
Percentage of cluster resources reserved as failover spare capacity
Specify a failover host

31. How does the Host Failures Cluster Tolerates admission control policy work?

Select the maximum number of host failures that you can afford to tolerate and still guarantee failover. Up to vSphere 4.1, the minimum is 1 and the maximum is 4.

In the Host Failures Cluster Tolerates admission control policy, we can define the specific number of hosts that can fail in the cluster, and it ensures that sufficient resources remain to fail over all the virtual machines from those failed hosts to the other hosts in the cluster. VMware High Availability (HA) uses a mechanism called slots to calculate both the available and required resources in the cluster for failing over virtual machines from a failed host to other hosts in the cluster.

32. What is SLOT?

As per VMWare’s Definition,

“A slot is a logical representation of the memory and CPU resources that satisfy the requirements for any powered-on virtual machine in the cluster.”

If you have configured reservations at the VM level, they influence the HA slot calculation: the highest memory reservation and the highest CPU reservation among the VMs in your cluster determine the slot size for the cluster.

33. How are the HA slots calculated?

Refer http://www.vmwarearena.com/2012/07/ha-slots-calculation.html.

34. How to Check the HA Slot information from vSphere Client?

Click on the cluster’s Summary tab and click on “Advanced Runtime Info” to see the detailed HA slot information.

35. What is the use of the Host Monitoring status in an HA cluster?

Let’s take an example: you are performing network maintenance activity on the switches which connect one of the ESX hosts in your HA cluster.

What will happen if the switch connected to that ESX host in the HA cluster goes down?

The host will not receive heartbeats, and the ping to the isolation address will also fail, so the host will consider itself isolated and HA will initiate restarts of its virtual machines on other hosts in the cluster. Why would you want this unwanted situation during a scheduled maintenance window?

To avoid the above situation when performing a scheduled activity which may cause an ESX host to become isolated, clear the “Enable Host Monitoring” check box until you are done with the network maintenance activity.

36. How to manually define the HA slot size?

By default, the HA slot size is determined by the highest CPU and memory reservations among the virtual machines. If no reservation is specified at the VM level, a default slot size of 256 MHz for CPU and 0 MB plus memory overhead for RAM is used. We can control the HA slot size manually by using the following values.

There are 4 options we can configure at HA advanced options related to slot size

das.slotMemInMB – Maximum bound value for the HA memory slot size
das.slotCpuInMHz – Maximum bound value for the HA CPU slot size
das.vmMemoryMinMB – Minimum bound value for the HA memory slot size
das.vmCpuMinMHz – Minimum bound value for the HA CPU slot size

For More HA related Advanced options, Please refer the link: http://www.vmwarearena.com/2012/07/vmware-ha-advanced-options.html

37. How does the “Percentage of cluster resources reserved as failover spare capacity” admission control policy work?

In the “Percentage of cluster resources reserved as failover spare capacity” admission control policy, we can define a specific percentage of the total cluster resources to be reserved for failover. In contrast to the “Host Failures Cluster Tolerates” admission control policy, it does not use slots. Instead, this policy performs the calculation in the following way:

1. It calculates the total resource requirements of all powered-on virtual machines in the cluster, and also calculates the total resources available on the hosts for virtual machines.
2. It calculates the current CPU and memory failover capacity for the cluster.
3. If the current CPU or memory failover capacity of the cluster is less than the configured failover capacity (for example, 25%),
4. admission control will not allow powering on a virtual machine that violates the availability constraints.

38. How does the “Specify a failover host” admission control policy work?

In the “Specify a failover host” admission control policy, we can define a specific host as a dedicated failover host. When a host failure is detected, HA attempts to restart the virtual machines on the specified failover host. In this approach, the dedicated failover host sits idle, not actively participating in DRS load balancing; DRS will not migrate virtual machines to, or place powered-on virtual machines on, the defined failover host.

39. What is VM Monitoring status?

HA normally monitors ESX hosts and restarts the virtual machines from a failed or isolated host on the other hosts in the cluster, but what if you also need HA to monitor for virtual machine failures? That is what the VM Monitoring feature in the HA settings is for. VM Monitoring restarts a virtual machine if its VMware Tools heartbeat is not received within the time specified by the monitoring sensitivity.

How to setup Salt Halite on CentOS 6.5

Estimated Reading Time: 2 minutes

Setting up Salt Halite is not a straightforward process. It is a web UI for the Salt master where you can easily run commands and manage minions. I assume that you already have a master node set up with a couple of minions (at least one).


Let’s start with the steps to accomplish it:

Starting with the build process:

  1. cd /var/www
  2. git clone https://github.com/saltstack/halite
  3. cd halite/halite
  4. ./genindex.py -C

Installing salt-api:

     5. yum install salt-api

Finally, update the master configuration.

Add the following content at the end of the file /etc/salt/master as shown:

rest_cherrypy:
  host: 0.0.0.0
  port: 8080
  debug: true
  disable_ssl: True
  static: /var/www/halite/halite
  app: /var/www/halite/halite/index.html

external_auth:
  pam:
    salt:
      - .*
      - '@runner'
      - '@wheel'

Here I have disabled SSL and configured external_auth to use PAM authentication with a local user login. Once this is in place, complete the following steps:

6. /etc/init.d/salt-master restart

Adding a user

7. #useradd salt
8. #echo salt | passwd --stdin salt

After creating the user, test the authentication:

9. #salt -a pam \* test.ping

Enter the username and password; if the minions return their test.ping results, the PAM authentication and login are working.

Start salt-api

          10. #salt-api -d
          11. #cd /var/www/halite/halite
          12. #python server_bottle.py -d -C -l debug -s cherrypy

Then open http://<ip>:8080/app; you can log in with the username salt and password salt.
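If the page does not load, a quick check from the server itself (assuming net-tools is installed) confirms whether the CherryPy server is actually listening:

#netstat -tlnp | grep 8080

If nothing shows up on port 8080, recheck the rest_cherrypy section of /etc/salt/master and make sure the salt-api process is running.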

See you in a further post.

SaltStack on CentOS 6.5

Estimated Reading Time: 11 minutes

SaltStack is an extremely fast and scalable systems and configuration management software for predictive orchestration, cloud and data center automation, server provisioning, application deployment and much more. Today we are going to quickstart with SaltStack to see how effective it is.


Let’s take a quick deep dive into the SaltStack environment setup:

Machine Details:

Machine         IP Address      Hostname
Salt Master     208.64.250.8    208.64.250.8.uscolo.com
Salt Minion 1   208.64.250.6    SVM61
Salt Minion 2   208.64.250.7    SVM71

Setting up Salt Master:

  1. Let’s see what OS is running on the system:
#cat /etc/issue
CentOS release 6.5 (Final)
Kernel \r on an \m

  2. Download the EPEL repo as a pre-requisite:
#wget http://ftp.riken.jp/Linux/fedora/epel/6/i386/epel-release-6-8.noarch.rpm
--2015-01-31 15:19:07--  http://ftp.riken.jp/Linux/fedora/epel/6/i386/epel-release-6-8.noarch.rpm

Resolving ftp.riken.jp… 134.160.38.1

Connecting to ftp.riken.jp|134.160.38.1|:80… connected.

HTTP request sent, awaiting response… 200 OK

Length: 14540 (14K) [text/plain]

Saving to: “epel-release-6-8.noarch.rpm”

100%[======================================>] 14,540      54.6K/s   in 0.3s

2015-01-31 15:19:08 (54.6 KB/s) – “epel-release-6-8.noarch.rpm” saved [14540/14540]

  3. Install the EPEL repo as shown below:
#yum install epel-release-6-8.noarch.rpm
Loaded plugins: fastestmirror, refresh-packagekit, security

base                                                     | 3.7 kB     00:00

base/primary_db                                          | 4.6 MB     00:00

extras                                                   | 3.4 kB     00:00

extras/primary_db                                        |  30 kB     00:00

updates                                                  | 3.4 kB     00:00

updates/primary_db                                       | 2.1 MB     00:00

Setting up Install Process

Examining epel-release-6-8.noarch.rpm: epel-release-6-8.noarch

Marking epel-release-6-8.noarch.rpm to be installed

Resolving Dependencies

–> Running transaction check

—> Package epel-release.noarch 0:6-8 will be installed

  4. Install the salt-master related packages on the master node. DO NOT INSTALL THE MINION ON THE MASTER NODE.
[root@208 ~]# yum install salt-master
Loaded plugins: fastestmirror, refresh-packagekit, security

Determining fastest mirrors

epel/metalink                                            |  13 kB     00:00

* base: centos.mirror.lstn.net

* epel: mirror.prgmr.com

* extras: mirror.hmc.edu

* updates: ftp.osuosl.org

epel                                                     | 4.4 kB     00:00

epel/primary_db                                          | 6.3 MB     00:00

Setting up Install Process

Resolving Dependencies

–> Running transaction check

—> Package salt-master.noarch 0:2014.7.0-3.el6 will be installed

–> Processing Dependency: salt = 2014.7.0-3.el6 for package: salt-master-2014.7.0-3.el6.noarch

–> Running transaction check

—> Package salt.noarch 0:2014.7.0-3.el6 will be installed

–> Processing Dependency: sshpass for package: salt-2014.7.0-3.el6.noarch

–> Processing Dependency: python-zmq for package: salt-2014.7.0-3.el6.noarch

–> Processing Dependency: python-requests for package: salt-2014.7.0-3.el6.noarch

–> Processing Dependency: python-msgpack for package: salt-2014.7.0-3.el6.noarch

–> Processing Dependency: python-jinja2 for package: salt-2014.7.0-3.el6.noarch

–> Processing Dependency: m2crypto for package: salt-2014.7.0-3.el6.noarch

–> Processing Dependency: PyYAML for package: salt-2014.7.0-3.el6.noarch

–> Running transaction check

—> Package PyYAML.x86_64 0:3.10-3.1.el6 will be installed

–> Processing Dependency: libyaml-0.so.2()(64bit) for package: PyYAML-3.10-3.1.el6.x86_64

—> Package m2crypto.x86_64 0:0.20.2-9.el6 will be installed

—> Package python-jinja2.x86_64 0:2.2.1-2.el6_5 will be installed

–> Processing Dependency: python-babel >= 0.8 for package: python-jinja2-2.2.1-2.el6_5.x86_64

—> Package python-msgpack.x86_64 0:0.1.13-3.el6 will be installed

—> Package python-requests.noarch 0:1.1.0-4.el6.centos will be installed

–> Processing Dependency: python-urllib3 for package: python-requests-1.1.0-4.el6.centos.noarch

–> Processing Dependency: python-ordereddict for package: python-requests-1.1.0-4.el6.centos.noarch

–> Processing Dependency: python-chardet for package: python-requests-1.1.0-4.el6.centos.noarch

—> Package python-zmq.x86_64 0:14.3.1-1.el6 will be installed

–> Processing Dependency: libzmq.so.3()(64bit) for package: python-zmq-14.3.1-1.el6.x86_64

—> Package sshpass.x86_64 0:1.05-1.el6 will be installed

–> Running transaction check

—> Package libyaml.x86_64 0:0.1.3-4.el6_6 will be installed

—> Package python-babel.noarch 0:0.9.4-5.1.el6 will be installed

—> Package python-chardet.noarch 0:2.0.1-1.el6.centos will be installed

—> Package python-ordereddict.noarch 0:1.1-2.el6.centos will be installed

—> Package python-urllib3.noarch 0:1.5-7.el6.centos will be installed

–> Processing Dependency: python-six for package: python-urllib3-1.5-7.el6.centos.noarch

–> Processing Dependency: python-backports-ssl_match_hostname for package: python-urllib3-1.5-7.el6.centos.noarch

—> Package zeromq3.x86_64 0:3.2.4-1.el6 will be installed

–> Processing Dependency: libpgm-5.1.so.0()(64bit) for package: zeromq3-3.2.4-1.el6.x86_64

–> Running transaction check

—> Package openpgm.x86_64 0:5.1.118-3.el6 will be installed

—> Package python-backports-ssl_match_hostname.noarch 0:3.4.0.2-4.el6.centos will be installed

–> Processing Dependency: python-backports for package: python-backports-ssl_match_hostname-3.4.0.2-4.el6.centos.noarch

—> Package python-six.noarch 0:1.7.3-1.el6.centos will be installed

–> Running transaction check

—> Package python-backports.x86_64 0:1.0-3.el6.centos will be installed

–> Finished Dependency Resolution

Dependencies Resolved

================================================================================

Package                             Arch   Version               Repository

Size

================================================================================

Installing:

salt-master                         noarch 2014.7.0-3.el6        epel     33 k

Installing for dependencies:

PyYAML                              x86_64 3.10-3.1.el6          updates 157 k

libyaml                             x86_64 0.1.3-4.el6_6         updates  52 k

m2crypto                            x86_64 0.20.2-9.el6          base    471 k

openpgm                             x86_64 5.1.118-3.el6         epel    165 k

python-babel                        noarch 0.9.4-5.1.el6         base    1.4 M

python-backports                    x86_64 1.0-3.el6.centos      extras  5.3 k

python-backports-ssl_match_hostname noarch 3.4.0.2-4.el6.centos  extras   13 k

python-chardet                      noarch 2.0.1-1.el6.centos    extras  225 k

python-jinja2                       x86_64 2.2.1-2.el6_5         base    466 k

python-msgpack                      x86_64 0.1.13-3.el6          epel     29 k

python-ordereddict                  noarch 1.1-2.el6.centos      extras  7.7 k

python-requests                     noarch 1.1.0-4.el6.centos    extras   71 k

python-six                          noarch 1.7.3-1.el6.centos    extras   27 k

python-urllib3                      noarch 1.5-7.el6.centos      extras   41 k

python-zmq                          x86_64 14.3.1-1.el6          epel    467 k

salt                                noarch 2014.7.0-3.el6        epel    3.7 M

sshpass                             x86_64 1.05-1.el6            epel     19 k

zeromq3                             x86_64 3.2.4-1.el6           epel    334 k

Transaction Summary

================================================================================

Install      19 Package(s)

Total download size: 7.7 M

Installed size: 29 M

Is this ok [y/N]: y

Downloading Packages:

(1/19): PyYAML-3.10-3.1.el6.x86_64.rpm                   | 157 kB     00:00

(2/19): libyaml-0.1.3-4.el6_6.x86_64.rpm                 |  52 kB     00:00

(3/19): m2crypto-0.20.2-9.el6.x86_64.rpm                 | 471 kB     00:00

(4/19): openpgm-5.1.118-3.el6.x86_64.rpm                 | 165 kB     00:00

(5/19): python-babel-0.9.4-5.1.el6.noarch.rpm            | 1.4 MB     00:00

(6/19): python-backports-1.0-3.el6.centos.x86_64.rpm     | 5.3 kB     00:00

(7/19): python-backports-ssl_match_hostname-3.4.0.2-4.el |  13 kB     00:00

(8/19): python-chardet-2.0.1-1.el6.centos.noarch.rpm     | 225 kB     00:00

(9/19): python-jinja2-2.2.1-2.el6_5.x86_64.rpm           | 466 kB     00:00

(10/19): python-msgpack-0.1.13-3.el6.x86_64.rpm          |  29 kB     00:00

(11/19): python-ordereddict-1.1-2.el6.centos.noarch.rpm  | 7.7 kB     00:00

(12/19): python-requests-1.1.0-4.el6.centos.noarch.rpm   |  71 kB     00:00

(13/19): python-six-1.7.3-1.el6.centos.noarch.rpm        |  27 kB     00:00

(14/19): python-urllib3-1.5-7.el6.centos.noarch.rpm      |  41 kB     00:00

(15/19): python-zmq-14.3.1-1.el6.x86_64.rpm              | 467 kB     00:00

(16/19): salt-2014.7.0-3.el6.noarch.rpm                  | 3.7 MB     00:00

(17/19): salt-master-2014.7.0-3.el6.noarch.rpm           |  33 kB     00:00

(18/19): sshpass-1.05-1.el6.x86_64.rpm                   |  19 kB     00:00

(19/19): zeromq3-3.2.4-1.el6.x86_64.rpm                  | 334 kB     00:00

——————————————————————————–

Total                                           4.3 MB/s | 7.7 MB     00:01

warning: rpmts_HdrFromFdno: Header V3 RSA/SHA256 Signature, key ID 0608b895: NOKEY

Retrieving key from file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-6

Importing GPG key 0x0608B895:

Userid : EPEL (6) <epel@fedoraproject.org>

Package: epel-release-6-8.noarch (@/epel-release-6-8.noarch)

From   : /etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-6

Is this ok [y/N]: y

warning: rpmts_HdrFromFdno: Header V3 RSA/SHA256 Signature, key ID c105b9de: NOKEY

Retrieving key from file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-6

Importing GPG key 0xC105B9DE:

Userid : CentOS-6 Key (CentOS 6 Official Signing Key) <centos-6-key@centos.org>

Package: centos-release-6-5.el6.centos.11.1.x86_64 (@anaconda-CentOS-201311272149.x86_64/6.5)

From   : /etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-6

Is this ok [y/N]: y

Running rpm_check_debug

Running Transaction Test

Transaction Test Succeeded

Running Transaction

Installing : python-ordereddict-1.1-2.el6.centos.noarch                  1/19

Installing : python-six-1.7.3-1.el6.centos.noarch                        2/19

Installing : sshpass-1.05-1.el6.x86_64                                   3/19

Installing : python-backports-1.0-3.el6.centos.x86_64                    4/19

Installing : python-backports-ssl_match_hostname-3.4.0.2-4.el6.centos    5/19

Installing : python-urllib3-1.5-7.el6.centos.noarch                      6/19

Installing : m2crypto-0.20.2-9.el6.x86_64                                7/19

Installing : libyaml-0.1.3-4.el6_6.x86_64                                8/19

Installing : PyYAML-3.10-3.1.el6.x86_64                                  9/19

Installing : python-chardet-2.0.1-1.el6.centos.noarch                   10/19

Installing : python-requests-1.1.0-4.el6.centos.noarch                  11/19

Installing : python-babel-0.9.4-5.1.el6.noarch                          12/19

Installing : python-jinja2-2.2.1-2.el6_5.x86_64                         13/19

Installing : python-msgpack-0.1.13-3.el6.x86_64                         14/19

Installing : openpgm-5.1.118-3.el6.x86_64                               15/19

Installing : zeromq3-3.2.4-1.el6.x86_64                                 16/19

Installing : python-zmq-14.3.1-1.el6.x86_64                             17/19

Installing : salt-2014.7.0-3.el6.noarch                                 18/19

Installing : salt-master-2014.7.0-3.el6.noarch                          19/19

Verifying  : openpgm-5.1.118-3.el6.x86_64                                1/19

Verifying  : python-msgpack-0.1.13-3.el6.x86_64                          2/19

Verifying  : python-babel-0.9.4-5.1.el6.noarch                           3/19

Verifying  : python-chardet-2.0.1-1.el6.centos.noarch                    4/19

Verifying  : python-backports-ssl_match_hostname-3.4.0.2-4.el6.centos    5/19

Verifying  : PyYAML-3.10-3.1.el6.x86_64                                  6/19

Verifying  : libyaml-0.1.3-4.el6_6.x86_64                                7/19

Verifying  : python-ordereddict-1.1-2.el6.centos.noarch                  8/19

Verifying  : python-urllib3-1.5-7.el6.centos.noarch                      9/19

Verifying  : m2crypto-0.20.2-9.el6.x86_64                               10/19

Verifying  : salt-2014.7.0-3.el6.noarch                                 11/19

Verifying  : python-zmq-14.3.1-1.el6.x86_64                             12/19

Verifying  : python-jinja2-2.2.1-2.el6_5.x86_64                         13/19

Verifying  : salt-master-2014.7.0-3.el6.noarch                          14/19

Verifying  : python-backports-1.0-3.el6.centos.x86_64                   15/19

Verifying  : zeromq3-3.2.4-1.el6.x86_64                                 16/19

Verifying  : python-requests-1.1.0-4.el6.centos.noarch                  17/19

Verifying  : sshpass-1.05-1.el6.x86_64                                  18/19

Verifying  : python-six-1.7.3-1.el6.centos.noarch                       19/19

Installed:

salt-master.noarch 0:2014.7.0-3.el6

Dependency Installed:

PyYAML.x86_64 0:3.10-3.1.el6

libyaml.x86_64 0:0.1.3-4.el6_6

m2crypto.x86_64 0:0.20.2-9.el6

openpgm.x86_64 0:5.1.118-3.el6

python-babel.noarch 0:0.9.4-5.1.el6

python-backports.x86_64 0:1.0-3.el6.centos

python-backports-ssl_match_hostname.noarch 0:3.4.0.2-4.el6.centos

python-chardet.noarch 0:2.0.1-1.el6.centos

python-jinja2.x86_64 0:2.2.1-2.el6_5

python-msgpack.x86_64 0:0.1.13-3.el6

python-ordereddict.noarch 0:1.1-2.el6.centos

python-requests.noarch 0:1.1.0-4.el6.centos

python-six.noarch 0:1.7.3-1.el6.centos

python-urllib3.noarch 0:1.5-7.el6.centos

python-zmq.x86_64 0:14.3.1-1.el6

salt.noarch 0:2014.7.0-3.el6

sshpass.x86_64 0:1.05-1.el6

zeromq3.x86_64 0:3.2.4-1.el6

Complete!

[root@208 ~]# yum install salt-ssh

Loaded plugins: fastestmirror, refresh-packagekit, security

Loading mirror speeds from cached hostfile

* base: centos.mirror.lstn.net

* epel: mirror.prgmr.com

* extras: mirror.hmc.edu

* updates: ftp.osuosl.org

Setting up Install Process

Resolving Dependencies

--> Running transaction check

---> Package salt-ssh.noarch 0:2014.7.0-3.el6 will be installed

--> Finished Dependency Resolution

Dependencies Resolved

================================================================================

Package           Arch            Version                  Repository     Size

================================================================================

Installing:

salt-ssh          noarch          2014.7.0-3.el6           epel           12 k

Transaction Summary

================================================================================

Install       1 Package(s)

Total download size: 12 k

Installed size: 2.8 k

Is this ok [y/N]: y

Downloading Packages:

salt-ssh-2014.7.0-3.el6.noarch.rpm                       |  12 kB     00:00

Running rpm_check_debug

Running Transaction Test

Transaction Test Succeeded

Running Transaction

Installing : salt-ssh-2014.7.0-3.el6.noarch                               1/1

Verifying  : salt-ssh-2014.7.0-3.el6.noarch                               1/1

Installed:

salt-ssh.noarch 0:2014.7.0-3.el6

Complete!

[root@208 ~]# yum install salt-api

Loaded plugins: fastestmirror, refresh-packagekit, security

Loading mirror speeds from cached hostfile

* base: centos.mirror.lstn.net

* epel: mirror.prgmr.com

* extras: mirror.hmc.edu

* updates: ftp.osuosl.org

Setting up Install Process

Resolving Dependencies

--> Running transaction check

---> Package salt-api.noarch 0:2014.7.0-3.el6 will be installed

--> Finished Dependency Resolution

Dependencies Resolved

================================================================================

Package           Arch            Version                  Repository     Size

================================================================================

Installing:

salt-api          noarch          2014.7.0-3.el6           epel           12 k

Transaction Summary

================================================================================

Install       1 Package(s)

Total download size: 12 k

Installed size: 4.1 k

Is this ok [y/N]: y

Downloading Packages:

salt-api-2014.7.0-3.el6.noarch.rpm                       |  12 kB     00:00

Running rpm_check_debug

Running Transaction Test

Transaction Test Succeeded

Running Transaction

Installing : salt-api-2014.7.0-3.el6.noarch                               1/1

Verifying  : salt-api-2014.7.0-3.el6.noarch                               1/1

Installed:

salt-api.noarch 0:2014.7.0-3.el6

Complete!

Configuring SALT MASTER FILE:

# egrep -v "^#|^$" /etc/salt/master
interface: 208.64.250.8

publish_port: 4505

user: root

ret_port: 4506

pidfile: /var/run/salt-master.pid

pki_dir: /etc/salt/pki/master

sock_dir: /var/run/salt/master

minion_data_cache: True

autosign_file: /etc/salt/autosign.conf

  1. Now restart the salt-master service:

#service salt-master restart

CONFIGURING SALT-MINION (Client Node)

  1. Assume that a different machine running CentOS 6.5 is available.
  2. Follow the same prerequisite steps you followed for the master, except that instead of the salt-master package you install salt-minion through yum.
  3. Configure the /etc/salt/minion file as shown below:

master: 208.64.250.8

master_port: 4506

  4. Restart the salt-minion service:

# service salt-minion restart
Stopping salt-minion daemon:                               [FAILED]

Starting salt-minion daemon:                               [  OK  ]

  5. Run the following commands to configure the authentication keys between the master and the client:
[root@208 ~]# salt-key -L

Accepted Keys:

Unaccepted Keys:

Rejected Keys:

[root@208 ~]# salt-key -A

The key glob '*' does not match any unaccepted keys.

[root@208 ~]# service iptables stop

iptables: Setting chains to policy ACCEPT: filter          [  OK  ]

iptables: Flushing firewall rules:                         [  OK  ]

iptables: Unloading modules:                               [  OK  ]

[root@208 ~]# salt-key -L

Accepted Keys:

Unaccepted Keys:

SVM61

Rejected Keys:

[root@208 ~]# salt-key -A

The following keys are going to be accepted:

Unaccepted Keys:

SVM61

Proceed? [n/Y] Y

Key for minion SVM61 accepted.

[root@208 ~]#

Verifying master and minion functionality:

Run the following command on the Salt master:

salt '*' test.ping -v
Executing job with jid 20150131181518540377

-------------------------------------------

SVM61:

True

[root@208 salt]# salt '*' test.ping

SVM61:

True

[root@208 salt]# salt '*' disk.usage

SVM61:

----------

/:

----------

1K-blocks:

8780808

available:

6021132

capacity:

28%

filesystem:

/dev/mapper/vg_svm1-lv_root

used:

2313624

/boot:

----------

1K-blocks:

495844

available:

436779

capacity:

8%

filesystem:

/dev/sda1

used:

33465

/dev/shm:

----------

1K-blocks:

251000

available:

251000

capacity:

0%

filesystem:

tmpfs

used:

0

Troubleshooting Tips:

  1. If you face any issue related to keys, the first thing to check is the minion log, which you can tail at /var/log/salt/minion.
  2. If you encounter the following error message:

The master may need to be updated if it is a version of Salt lower than 2014.7.0, or, if you are confident that you are connecting to a valid Salt Master, then remove the master public key and restart the Salt Minion. The master public key can be found at: /etc/salt/pki/minion/minion_master.pub

 

Fix: remove the key on both the minion and the master, and then restart the minion service. You can remove all keys from the master with salt-key --delete-all and then start over.

Preparing the First Salt Formula:

Salt formulas (state files) are simple YAML text files and by default reside on the Salt master.

You can put all your Salt formulas under the /srv/salt directory.

Example: let's see how you can install Subversion on a remote minion from the Salt master.

Add the following text in subversion.sls:

cat /srv/salt/subversion.sls
subversion:
  pkg:
    - installed

[root@208 salt]#

What does the above code mean?

The first line is called the ID Declaration; essentially the “label” for this stanza. subversion will be used for our package name. The name you use here must match up with the actual package name used by your package manager.  (In reality, the ID Declaration can be any arbitrary text and you can specify the actual package name below, but we’ll do it this way right now for simplicity’s sake).

The second line is called the State Declaration. This refers to the specific Salt State that we’re going to make use of. In this example we’re using the “pkg” state.

Now run the following command to install subversion on the minion machine in a single shot:

salt 'SVM61' state.sls subversion
SVM61:

----------

ID: subversion

Function: pkg.installed

Result: True

Comment: The following packages were installed/updated: subversion.

Started: 18:51:48.459666

Duration: 52120.684 ms

Changes:

----------

apr:

----------

new:

1.3.9-5.el6_2

old:

apr-util:

----------

new:

1.3.9-3.el6_0.1

old:

neon:

----------

new:

0.29.3-3.el6_4

old:

pakchois:

----------

new:

0.4-3.2.el6

old:

perl-URI:

----------

new:

1.40-2.el6

old:

subversion:

----------

new:

1.6.11-10.el6_5

old:

Summary

------------

Succeeded: 1 (changed=1)

Failed:    0

------------

Total states run:     1

[root@208 ~]#

Did you see that? Subversion was installed successfully. Verify it on the minion machine:

[root@SVM61 ~]# rpm -qa subversion
subversion-1.6.11-10.el6_5.x86_64

[root@SVM61 ~]#

Setting up Minion 2:

  1. Follow the same steps you followed for the first minion (SVM61).
  2. Install the salt-minion package (and NOT salt-master).
  3. Configure the following entries in /etc/salt/minion:

[root@SVM71 ~]# egrep -v "^#|^$" /etc/salt/minion
master: 208.64.250.8

master_port: 4506

user: root

pidfile: /var/run/salt-minion.pid

pki_dir: /etc/salt/pki/minion

id: SVM71

[root@SVM71 ~]#

  4. Restart the salt-minion service.
  5. Once it restarts, you will see the following output on the master. Accept the key and you are ready to run the ping test.

[root@208 ~]# salt-key -L
Accepted Keys:

SVM61

Unaccepted Keys:

SVM71

Rejected Keys:

[root@208 ~]# salt-key -A

The following keys are going to be accepted:

Unaccepted Keys:

SVM71

Proceed? [n/Y] Y

Key for minion SVM71 accepted.

[root@208 ~]# salt-key -L

Accepted Keys:

SVM61

SVM71

Unaccepted Keys:

Rejected Keys:

[root@208 ~]#

  6. Let's run the ping test against the second minion:

salt SVM71 test.ping
SVM71:

True

[root@208 ~]# salt '*' test.ping

SVM71:

True

SVM61:

True

[root@208 ~]#

 

With that, our two minions and one master are fully configured.

100 Hadoop Interview Questions

Estimated Reading Time: 28 minutes

Are you preparing for a Hadoop interview? Have you spent the last several hours collecting Hadoop MapReduce questions? Are you doing last-minute preparation on HDFS fundamentals?

If yes, you are at the right place. This is a collection of Hadoop MapReduce Q&A derived from various relevant websites. The idea is simple: bring all Hadoop-related questions under a single umbrella.

Let's start:
1. What is a JobTracker in Hadoop? How many instances of JobTracker run on a Hadoop Cluster?

JobTracker is the daemon service for submitting and tracking MapReduce jobs in Hadoop. There is only one JobTracker process running on any Hadoop cluster. The JobTracker runs in its own JVM process, and in a typical production cluster it runs on a separate machine. Each slave node is configured with the JobTracker node location. The JobTracker is a single point of failure for the Hadoop MapReduce service: if it goes down, all running jobs are halted. The JobTracker performs the following actions (from the Hadoop wiki):

  • Client applications submit jobs to the Job tracker.
  • The JobTracker talks to the NameNode to determine the location of the data
  • The JobTracker locates TaskTracker nodes with available slots at or near the data
  • The JobTracker submits the work to the chosen TaskTracker nodes.
  • The TaskTracker nodes are monitored. If they do not submit heartbeat signals often enough, they are deemed to have failed and the work is scheduled on a different TaskTracker.
  • A TaskTracker will notify the JobTracker when a task fails. The JobTracker decides what to do then: it may resubmit the job elsewhere, it may mark that specific record as something to avoid, and it may even blacklist the TaskTracker as unreliable.
  • When the work is completed, the JobTracker updates its status.
  • Client applications can poll the JobTracker for information.

2. How JobTracker schedules a task?

The TaskTrackers send out heartbeat messages to the JobTracker, usually every few minutes, to reassure the JobTracker that it is still alive. These message also inform the JobTracker of the number of available slots, so the JobTracker can stay up to date with where in the cluster work can be delegated. When the JobTracker tries to find somewhere to schedule a task within the MapReduce operations, it first looks for an empty slot on the same server that hosts the DataNode containing the data, and if not, it looks for an empty slot on a machine in the same rack.

3. What is a Task Tracker in Hadoop? How many instances of TaskTracker run on a Hadoop Cluster

A TaskTracker is a slave node daemon in the cluster that accepts tasks (Map, Reduce and Shuffle operations) from a JobTracker. There is only one TaskTracker process running on any Hadoop slave node, and it runs in its own JVM process. Every TaskTracker is configured with a set of slots; these indicate the number of tasks that it can accept. The TaskTracker starts a separate JVM process to do the actual work (called a Task Instance); this ensures that a process failure does not take down the TaskTracker. The TaskTracker monitors these task instances, capturing the output and exit codes. When the task instances finish, successfully or not, the TaskTracker notifies the JobTracker. The TaskTrackers also send heartbeat messages to the JobTracker, usually every few minutes, to reassure the JobTracker that they are still alive. These messages also inform the JobTracker of the number of available slots, so the JobTracker can stay up to date with where in the cluster work can be delegated.

4. What is a Task instance in Hadoop? Where does it run?

Task instances are the actual MapReduce tasks which run on each slave node. The TaskTracker starts a separate JVM process to do the actual work (called a Task Instance); this ensures that a process failure does not take down the TaskTracker. Each Task Instance runs in its own JVM process. There can be multiple task instance processes running on a slave node, based on the number of slots configured on the TaskTracker. By default a new task instance JVM process is spawned for each task.

5. How many Daemon processes run on a Hadoop system?

Hadoop is comprised of five separate daemons, and each of these daemons runs in its own JVM. The following three daemons run on master nodes:

  • NameNode – stores and maintains the metadata for HDFS.
  • Secondary NameNode – performs housekeeping functions for the NameNode.
  • JobTracker – manages MapReduce jobs and distributes individual tasks to machines running the TaskTracker.

The following two daemons run on each slave node:

  • DataNode – stores actual HDFS data blocks.
  • TaskTracker – responsible for instantiating and monitoring individual Map and Reduce tasks.

6. What is configuration of a typical slave node on Hadoop cluster? How many JVMs run on a slave node?

  • Single instance of a Task Tracker is run on each Slave node. Task tracker is run as a separate JVM process.
  • Single instance of a DataNode daemon is run on each Slave node. DataNode daemon is run as a separate JVM process.
  • One or Multiple instances of Task Instance is run on each slave node. Each task instance is run as a separate JVM process. The number of Task instances can be controlled by configuration. Typically a high end machine is configured to run more task instances.

7. What is the difference between HDFS and NAS ?

The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware. It has many similarities with existing distributed file systems. However, the differences from other distributed file systems are significant. Following are differences between HDFS and NAS

  • In HDFS, data blocks are distributed across the local drives of all machines in a cluster, whereas in NAS data is stored on dedicated hardware.
  • HDFS is designed to work with the MapReduce system, since computation is moved to the data. NAS is not suitable for MapReduce since data is stored separately from the computation.
  • HDFS runs on a cluster of machines and provides redundancy using a replication protocol, whereas NAS is provided by a single machine and therefore does not provide data redundancy.

8. How NameNode Handles data node failures?

The NameNode periodically receives a Heartbeat and a Blockreport from each of the DataNodes in the cluster. Receipt of a Heartbeat implies that the DataNode is functioning properly. A Blockreport contains a list of all blocks on a DataNode. When the NameNode notices that it has not received a heartbeat message from a DataNode after a certain amount of time, the DataNode is marked as dead. Since its blocks are now under-replicated, the system begins replicating the blocks that were stored on the dead DataNode. The NameNode orchestrates the replication of data blocks from one DataNode to another. The replication data transfer happens directly between DataNodes and the data never passes through the NameNode.

9. Does MapReduce programming model provide a way for reducers to communicate with each other? In a MapReduce job can a reducer communicate with another reducer?

No, the MapReduce programming model does not allow reducers to communicate with each other. Reducers run in isolation.

10. Can I set the number of reducers to zero?

Yes, setting the number of reducers to zero is a valid configuration in Hadoop. When you set the number of reducers to zero, no reducers are executed and the output of each mapper is stored in a separate file on HDFS. [This is different from the case when the number of reducers is set to a value greater than zero, where the mappers' output (intermediate data) is written to the local file system (NOT HDFS) of each mapper slave node.]
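
For illustration, a minimal map-only driver sketch (assuming the org.apache.hadoop.mapreduce API; the identity Mapper is used so the block is self-contained, and the input/output paths come from the command line):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MapOnlyJob {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "map-only example");
        job.setJarByClass(MapOnlyJob.class);
        job.setMapperClass(Mapper.class);      // identity mapper, just passes records through
        job.setNumReduceTasks(0);              // zero reducers: mapper output is written straight to HDFS
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}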

11. Where is the Mapper output (intermediate key-value data) stored?

The mapper output (intermediate data) is stored on the local file system (NOT HDFS) of each individual mapper node. This is typically a temporary directory location which can be set up in the configuration by the Hadoop administrator. The intermediate data is cleaned up after the Hadoop job completes.

12. What are combiners? When should I use a combiner in my MapReduce Job?

Combiners are used to increase the efficiency of a MapReduce program. They aggregate intermediate map output locally on individual mapper nodes, which can reduce the amount of data that needs to be transferred to the reducers. You can use your reducer code as a combiner if the operation performed is commutative and associative. The execution of the combiner is not guaranteed: Hadoop may or may not execute a combiner, and if required it may execute it more than once. Therefore your MapReduce jobs should not depend on the combiner's execution.
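
For example, a word-count style sum reducer is safe to register as a combiner because integer addition is commutative and associative (a sketch; the class name is illustrative and the driver lines are shown as comments):

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text word, Iterable<IntWritable> counts, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable count : counts) {
            sum += count.get();             // summing is commutative and associative
        }
        context.write(word, new IntWritable(sum));
    }
}
// In the driver, the same class can be registered twice; the job must still
// produce correct results even if the combiner never runs:
//   job.setCombinerClass(WordCountReducer.class);
//   job.setReducerClass(WordCountReducer.class);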

13. What is Writable & WritableComparable interface?

  • org.apache.hadoop.io.Writable is a Java interface. Any key or value type in the Hadoop MapReduce framework implements this interface. Implementations typically implement a static read(DataInput) method which constructs a new instance, calls readFields(DataInput) and returns the instance.
  • org.apache.hadoop.io.WritableComparable is a Java interface. Any type which is to be used as a key in the Hadoop MapReduce framework should implement this interface. WritableComparable objects can be compared to each other using Comparators. A minimal example follows.
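
A minimal sketch of a custom key type implementing WritableComparable (the class name and fields are illustrative, not part of Hadoop):

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.WritableComparable;

// Hypothetical composite key: (userId, timestamp).
public class EventKey implements WritableComparable<EventKey> {
    private long userId;
    private long timestamp;

    public EventKey() { }                                     // no-arg constructor required by the framework

    @Override
    public void write(DataOutput out) throws IOException {    // serialize the fields
        out.writeLong(userId);
        out.writeLong(timestamp);
    }

    @Override
    public void readFields(DataInput in) throws IOException { // deserialize in the same order
        userId = in.readLong();
        timestamp = in.readLong();
    }

    @Override
    public int compareTo(EventKey other) {                    // defines how keys are sorted
        int cmp = Long.compare(userId, other.userId);
        return (cmp != 0) ? cmp : Long.compare(timestamp, other.timestamp);
    }

    @Override
    public int hashCode() {                                   // used by the default partitioner
        return Long.hashCode(userId) * 31 + Long.hashCode(timestamp);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof EventKey)) return false;
        EventKey k = (EventKey) o;
        return userId == k.userId && timestamp == k.timestamp;
    }
}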

14. What is the Hadoop MapReduce API contract for a key and value Class?

  • The Key must implement the org.apache.hadoop.io.WritableComparable interface.
  • The value must implement the org.apache.hadoop.io.Writable interface.

15. What is a IdentityMapper and IdentityReducer in MapReduce ?

  • org.apache.hadoop.mapred.lib.IdentityMapper implements the identity function, mapping inputs directly to outputs. If the MapReduce programmer does not set the Mapper class using JobConf.setMapperClass, then IdentityMapper.class is used as the default value.
  • org.apache.hadoop.mapred.lib.IdentityReducer performs no reduction, writing all input values directly to the output. If the MapReduce programmer does not set the Reducer class using JobConf.setReducerClass, then IdentityReducer.class is used as the default value.

16. What is the meaning of speculative execution in Hadoop? Why is it important?

Speculative execution is a way of coping with variation in individual machine performance. In large clusters where hundreds or thousands of machines are involved, there may be machines which are not performing as fast as others. This may delay a whole job because of a single slow machine. To avoid this, speculative execution in Hadoop can run multiple copies of the same map or reduce task on different slave nodes. The results from the first node to finish are used.

17. When is the reducers are started in a MapReduce job?

In a MapReduce job, reducers do not start executing the reduce method until all map tasks have completed. Reducers start copying intermediate key-value pairs from the mappers as soon as they are available. The programmer-defined reduce method is called only after all the mappers have finished.

18. If reducers do not start before all mappers finish then why does the progress on MapReduce job shows something like Map(50%) Reduce(10%)? Why reducers progress percentage is displayed when mapper is not finished yet?

Reducers start copying intermediate key-value pairs from the mappers as soon as they are available. The progress calculation also takes into account the data transfer done by the reduce process, so reduce progress starts showing up as soon as any intermediate key-value pair from a mapper is available to be transferred to a reducer. Though the reducer progress is updated, the programmer-defined reduce method is still called only after all the mappers have finished.

19. What is HDFS ? How it is different from traditional file systems?

HDFS, the Hadoop Distributed File System, is responsible for storing huge data on the cluster. This is a distributed file system designed to run on commodity hardware. It has many similarities with existing distributed file systems. However, the differences from other distributed file systems are significant.

  • HDFS is highly fault-tolerant and is designed to be deployed on low-cost hardware.
  • HDFS provides high throughput access to application data and is suitable for applications that have large data sets.
  • HDFS is designed to support very large files. Applications that are compatible with HDFS are those that deal with large data sets. These applications write their data only once but they read it one or more times and require these reads to be satisfied at streaming speeds. HDFS supports write-once-read-many semantics on files.

20. What is HDFS Block size? How is it different from traditional file system block size?

In HDFS data is split into blocks and distributed across multiple nodes in the cluster. Each block is typically 64 MB or 128 MB in size, and each block is replicated multiple times (the default is to replicate each block three times). Replicas are stored on different nodes. HDFS utilizes the local file system to store each HDFS block as a separate file. HDFS block size cannot be directly compared with the traditional file system block size: it is orders of magnitude larger (megabytes rather than kilobytes), which reduces metadata overhead and suits large, streaming reads.

21. What is a NameNode? How many instances of NameNode run on a Hadoop Cluster?

The NameNode is the centerpiece of an HDFS file system. It keeps the directory tree of all files in the file system and tracks where across the cluster the file data is kept; it does not store the data of these files itself. There is only one NameNode process running on any Hadoop cluster. The NameNode runs in its own JVM process, and in a typical production cluster it runs on a separate machine. The NameNode is a single point of failure for the HDFS cluster: when the NameNode goes down, the file system goes offline. Client applications talk to the NameNode whenever they wish to locate a file, or when they want to add/copy/move/delete a file. The NameNode responds to successful requests by returning a list of relevant DataNode servers where the data lives.

22. What is a DataNode? How many instances of DataNode run on a Hadoop Cluster?

A DataNode stores data in the Hadoop file system HDFS. There is only one DataNode process running on any Hadoop slave node, and it runs in its own JVM process. On startup, a DataNode connects to the NameNode. DataNode instances can talk to each other, mostly when replicating data.

23. How the Client communicates with HDFS?

The client communicates with HDFS using the Hadoop HDFS API. Client applications talk to the NameNode whenever they wish to locate a file, or when they want to add/copy/move/delete a file on HDFS. The NameNode responds to successful requests by returning a list of relevant DataNode servers where the data lives. Client applications can then talk directly to a DataNode, once the NameNode has provided the location of the data.

24. How the HDFS Blocks are replicated?

HDFS is designed to reliably store very large files across machines in a large cluster. It stores each file as a sequence of blocks; all blocks in a file except the last block are the same size. The blocks of a file are replicated for fault tolerance. The block size and replication factor are configurable per file: an application can specify the number of replicas of a file, and the replication factor can be specified at file creation time and changed later. Files in HDFS are write-once and have strictly one writer at any time. The NameNode makes all decisions regarding replication of blocks. HDFS uses a rack-aware replica placement policy: in the default configuration there are three copies of each data block, with two copies stored on DataNodes in one rack and the third copy in a different rack.
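
For instance, the per-file replication factor can be changed after creation through the FileSystem API (the path below is illustrative; the `hadoop fs -setrep` shell command does the same thing):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SetReplicationExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();   // picks up core-site.xml / hdfs-site.xml from the classpath
        FileSystem fs = FileSystem.get(conf);
        // Ask the NameNode to keep 3 replicas of this (hypothetical) file.
        fs.setReplication(new Path("/data/events.log"), (short) 3);
        fs.close();
    }
}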

25. Explain what is Speculative Execution?

In Hadoop, during speculative execution a certain number of duplicate tasks are launched: multiple copies of the same map or reduce task can be executed on different slave nodes. In simple words, if a particular node is taking a long time to complete a task, Hadoop will create a duplicate of that task on another node. The copy that finishes first is retained and the slower copies are killed.

26. What are the basic parameters of a Mapper?

The basic parameters of a Mapper are

  • LongWritable and Text
  • Text and IntWritable

27. Explain the function of the MapReduce partitioner.

The function of the MapReduce partitioner is to make sure that all the values of a single key go to the same reducer, which eventually helps distribute the map output evenly over the reducers.

28. Explain the difference between an InputSplit and an HDFS Block.

Logical division of data is known as Split while physical division of data is known as HDFS Block.

29. Explain what happens in TextInputFormat.

In TextInputFormat, each line in the text file is a record. The value is the content of the line, while the key is the byte offset of the line. For instance: key: LongWritable, value: Text.

30. Mention the main configuration parameters that the user needs to specify to run a MapReduce job.

The user of the MapReduce framework needs to specify the following (a minimal driver sketch follows the list):

  • Job’s input locations in the distributed file system
  • Job’s output location in the distributed file system
  • Input format
  • Output format
  • Class containing the map function
  • Class containing the reduce function
  • JAR file containing the mapper, reducer and driver classes
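
A minimal driver sketch covering the items above, assuming the org.apache.hadoop.mapreduce API; WordCountMapper and WordCountReducer are hypothetical classes (variants of them are sketched under questions 12 and 69):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCountDriver.class);          // JAR containing mapper, reducer and driver

        job.setInputFormatClass(TextInputFormat.class);    // input format
        job.setOutputFormatClass(TextOutputFormat.class);  // output format

        job.setMapperClass(WordCountMapper.class);         // class containing the map function (hypothetical)
        job.setReducerClass(WordCountReducer.class);       // class containing the reduce function (hypothetical)

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));    // job input location on HDFS
        FileOutputFormat.setOutputPath(job, new Path(args[1]));  // job output location on HDFS

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}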

31. Explain what WebDAV is in Hadoop.

WebDAV is a set of extensions to HTTP that supports editing and updating files. On most operating systems WebDAV shares can be mounted as filesystems, so it is possible to access HDFS as a standard filesystem by exposing HDFS over WebDAV.

32. Explain what Sqoop is in Hadoop.

Sqoop is a tool used to transfer data between relational database management systems (RDBMS) and Hadoop HDFS. Using Sqoop, data can be imported from an RDBMS such as MySQL or Oracle into HDFS, and data can be exported from HDFS back to an RDBMS.

33. Explain what SequenceFileInputFormat is.

SequenceFileInputFormat is used for reading sequence files. A sequence file is a specific compressed binary file format which is optimized for passing data between the output of one MapReduce job and the input of another MapReduce job.

34. Explain what conf.setMapperClass does.

conf.setMapperClass sets the mapper class for the job; the mapper is responsible for reading the input data and generating key-value pairs.

35. What happens when a DataNode fails?

When a DataNode fails:

  • The JobTracker and NameNode detect the failure.
  • All tasks on the failed node are re-scheduled.
  • The NameNode replicates the user's data to another node.

36. What is the Distributed Cache in the MapReduce framework?

The distributed cache is an important feature provided by the MapReduce framework. It can cache files (text, archives, jars) which applications can use to improve performance. The application provides the details of the file to be cached to the JobConf object.

37. Can we change the file cached by DistributedCache?

No. DistributedCache tracks cached files by timestamp; a cached file should not be changed during job execution.

38. Can we deploy the JobTracker on a node other than the NameNode?

Yes; in production it is highly recommended. For self-development and learning you may set it up according to your needs.

39. How many modes does Hadoop support? What are the differences between them?

1. Standalone (local) mode: runs in a single Java virtual machine and does not use the distributed file system. It is of little use other than running and debugging MapReduce programs.
2. Pseudo-distributed mode: all daemons run on a single machine.
3. Fully distributed mode: daemons run across a cluster of machines; enterprises use this mode for development and production.

40. Name the most common InputFormats defined in Hadoop. Which one is the default?

The most common InputFormats defined in Hadoop are:

– TextInputFormat

– KeyValueInputFormat

– SequenceFileInputFormat

TextInputFormat is the Hadoop default.

41. What is the difference between the TextInputFormat and KeyValueInputFormat classes?

TextInputFormat: reads lines of text files and provides the byte offset of the line as the key and the actual line as the value to the mapper.

KeyValueInputFormat: reads text files and parses lines into key, value pairs. Everything up to the first tab character is sent as the key to the mapper and the remainder of the line is sent as the value.

42. What is InputSplit in Hadoop?

When a Hadoop job is run, it splits the input files into chunks and assigns each split to a mapper to process. This is called an InputSplit.

43. How is the splitting of file invoked in Hadoop framework?  

It is invoked by the Hadoop framework, which calls the getSplits() method of the InputFormat class (such as FileInputFormat) configured by the user.

44. Consider this case scenario: in an M/R system, the HDFS block size is 64 MB,

– the input format is FileInputFormat,

– and we have 3 files of size 64 KB, 65 MB and 127 MB.

How many input splits will be made by the Hadoop framework?

Hadoop will make 5 splits as follows:

– 1 split for the 64 KB file

– 2 splits for the 65 MB file

– 2 splits for the 127 MB file

45. What is the purpose of RecordReader in Hadoop?

The InputSplit has defined a slice of work, but does not describe how to access it. The RecordReader class actually loads the data from its source and converts it into (key, value) pairs suitable for reading by the Mapper. The RecordReader instance is defined by the InputFormat.

46. After the Map phase finishes, the Hadoop framework does “Partitioning, Shuffle and sort”. Explain what happens in this phase?

Partitioning: It is the process of determining which reducer instance will receive which intermediate keys and values. Each mapper must determine for all of its output (key, value) pairs which reducer will receive them. It is necessary that for any key, regardless of which mapper instance generated it, the destination partition is the same.

Shuffle: After the first map tasks have completed, the nodes may still be performing several more map tasks each. But they also begin exchanging the intermediate outputs from the map tasks to where they are required by the reducers. This process of moving map outputs to the reducers is known as shuffling.

Sort: Each reduce task is responsible for reducing the values associated with several intermediate keys. The set of intermediate keys on a single node is automatically sorted by Hadoop before they are presented to the Reducer.

47. If no custom partitioner is defined in Hadoop then how is data partitioned before it is sent to the reducer?

 The default partitioner computes a hash value for the key and assigns the partition based on this result.
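
This is essentially what Hadoop's default HashPartitioner does; a sketch of the same logic (the class name is illustrative):

import org.apache.hadoop.mapreduce.Partitioner;

// Equivalent of the default HashPartitioner: bucket records by the key's hash code.
public class HashLikePartitioner<K, V> extends Partitioner<K, V> {
    @Override
    public int getPartition(K key, V value, int numReduceTasks) {
        // Mask off the sign bit so the result is non-negative, then take the modulus.
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }
}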

48. What is a Combiner?

The Combiner is a ‘mini-reduce’ process which operates only on data generated by a mapper. The Combiner will receive as input all data emitted by the Mapper instances on a given node. The output from the Combiner is then sent to the Reducers, instead of the output from the Mappers.

49. What is the relationship between Jobs and Tasks in Hadoop?

 One job is broken down into one or many tasks in Hadoop.

50. Hadoop achieves parallelism by dividing the tasks across many nodes, so it is possible for a few slow nodes to rate-limit the rest of the program and slow it down. What mechanism does Hadoop provide to combat this?

 Speculative Execution.

51. How does speculative execution work in Hadoop?  

The JobTracker makes different TaskTrackers process the same input. When tasks complete, they announce this fact to the JobTracker. Whichever copy of a task finishes first becomes the definitive copy. If other copies were executing speculatively, Hadoop tells the TaskTrackers to abandon those tasks and discard their outputs. The Reducers then receive their inputs from whichever Mapper completed successfully first.

52. Using the command line in Linux, how will you:

– see all jobs running in the Hadoop cluster?

– kill a job?

hadoop job -list

hadoop job -kill jobID

53. What is Hadoop Streaming?  

Streaming is a generic API that allows programs written in virtually any language to be used as Hadoop Mapper and Reducer implementations.

54. What characteristic of the Streaming API makes it flexible enough to run MapReduce jobs in languages like Perl, Ruby, Awk, etc.?

Hadoop Streaming allows you to use arbitrary programs for the Mapper and Reducer phases of a MapReduce job by having both Mappers and Reducers receive their input on stdin and emit output (key, value) pairs on stdout.

55. What is the benefit of the distributed cache? Why can't we just keep the file in HDFS and have the application read it?

Because the distributed cache is much faster: it copies the file to all task trackers at the start of the job. If a TaskTracker then runs 10 or 100 mappers or reducers, they all use the same local copy of the cached file. On the other hand, if you make the MapReduce job read the file from HDFS, every mapper accesses it from HDFS, so a TaskTracker running 100 map tasks would read the file 100 times from HDFS. HDFS is also not very efficient when used this way.

56. What mechanism does the Hadoop framework provide to synchronise changes made in the Distributed Cache during the runtime of the application?

This is a tricky question. There is no such mechanism. Distributed Cache by design is read only during the time of Job execution.

57. Have you ever used Counters in Hadoop? Give us an example scenario.

Anybody who claims to have worked on a Hadoop project is expected to use counters.

58. Is it possible to have Hadoop job output in multiple directories? If yes, how?

Yes, by using the MultipleOutputs class, as sketched below.
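
A hedged sketch of how MultipleOutputs can route some records to a separate sub-directory of the job output (the class, output name and condition are illustrative):

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

// Hypothetical reducer that sends some records to a separate "errors" output.
public class SplitOutputReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private MultipleOutputs<Text, IntWritable> mos;

    @Override
    protected void setup(Context context) {
        mos = new MultipleOutputs<>(context);
    }

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) sum += v.get();
        if (sum < 0) {
            // Written under <output dir>/errors/part-r-xxxxx
            mos.write("errors", key, new IntWritable(sum), "errors/part");
        } else {
            context.write(key, new IntWritable(sum));
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        mos.close();
    }
}
// Driver side (hypothetical): register the named output before submitting the job.
//   MultipleOutputs.addNamedOutput(job, "errors", TextOutputFormat.class,
//                                  Text.class, IntWritable.class);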

59. What will a Hadoop job do if you try to run it with an output directory that is already present? Will it

– Overwrite it

– Warn you and continue

– Throw an exception and exit

The Hadoop job will throw an exception and exit.

60. How can you set an arbitrary number of mappers to be created for a job in Hadoop?

You cannot set it directly. The number of map tasks is determined by the number of input splits; you can only influence it indirectly (for example through the InputFormat and split size settings).

61. How can you set an arbitrary number of Reducers to be created for a job in Hadoop?

You can either do it programmatically by using the setNumReduceTasks method of the JobConf class, or set it as a configuration setting.
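
A tiny sketch using the older org.apache.hadoop.mapred API mentioned in the answer (the class name is illustrative; in Hadoop 1.x the equivalent configuration property is mapred.reduce.tasks):

import org.apache.hadoop.mapred.JobConf;

public class ReducerCountExample {
    public static JobConf configure() {
        JobConf conf = new JobConf(ReducerCountExample.class);
        conf.setNumReduceTasks(10);   // programmatic equivalent of setting mapred.reduce.tasks=10
        return conf;
    }
}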

62. How will you write a custom partitioner for a Hadoop job?

To have Hadoop use a custom partitioner you will have to do at minimum the following three things (a sketch follows):

– Create a new class that extends the Partitioner class

– Override the getPartition method

– In the wrapper that runs the MapReduce job, either add the custom partitioner to the job programmatically using the setPartitionerClass method, or add it to the job as a configuration setting (if your wrapper reads from a config file or Oozie)
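
A short sketch of those three steps (the keying scheme and class name are illustrative; assumes the org.apache.hadoop.mapreduce API):

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Steps 1 and 2: extend Partitioner and override getPartition.
// Hypothetical scheme: route keys by their first character so that keys
// starting with the same letter always land on the same reducer.
public class FirstLetterPartitioner extends Partitioner<Text, IntWritable> {
    @Override
    public int getPartition(Text key, IntWritable value, int numReduceTasks) {
        if (numReduceTasks == 0 || key.getLength() == 0) {
            return 0;
        }
        return (key.charAt(0) & Integer.MAX_VALUE) % numReduceTasks;
    }
}
// Step 3 (in the driver): job.setPartitionerClass(FirstLetterPartitioner.class);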

63. How did you debug your Hadoop code?  

There can be several ways of doing this, but the most common ways are:

– By using counters.

– The web interface provided by Hadoop framework.

64. Did you ever build a production process in Hadoop? If yes, what was the process when your Hadoop job failed for any reason?

It is an open-ended question, but candidates who have written a production job should talk about some type of alert mechanism, such as an email being sent or their monitoring system raising an alert. Since Hadoop works on unstructured data, it is very important to have a good alerting system for errors, because unexpected data can very easily break the job.

65. What are the four modules that make up the Apache Hadoop framework?

  • Hadoop Common, which contains the common utilities and libraries necessary for Hadoop’s other modules.
  • Hadoop YARN, the framework’s platform for resource-management
  • Hadoop Distributed File System, or HDFS, which stores information on commodity machines
  • Hadoop MapReduce, a programming model used to process  large-scale sets of data

66. What does the mapred.job.tracker command do?

mapred.job.tracker is a configuration property rather than a command: it specifies the host and port at which the JobTracker runs, which is how clients and TaskTrackers locate the JobTracker process.

67. What is "jps"?

jps is a command used to check whether the TaskTracker, JobTracker, DataNode, and NameNode daemons are running.

68. What are the port numbers for job tracker, task tracker, and Namenode?

The default web UI port for the JobTracker is 50030, for the TaskTracker it is 50060, and for the NameNode it is 50070.

69. What are the parameters of mappers and reducers?

The four parameters for mappers are:

  • LongWritable (input key)
  • Text (input value)
  • Text (intermediate output key)
  • IntWritable (intermediate output value)

The four parameters for reducers are:

  • Text (intermediate output)
  • IntWritable (intermediate output)
  • Text (final output)
  • IntWritable (final output)
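
A word-count style sketch showing where those four types appear in code (assuming the org.apache.hadoop.mapreduce API; the class name is hypothetical):

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Mapper<LongWritable, Text, Text, IntWritable>
//        input key, input value, intermediate key, intermediate value
public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        StringTokenizer tokens = new StringTokenizer(line.toString());
        while (tokens.hasMoreTokens()) {
            word.set(tokens.nextToken());
            context.write(word, ONE);        // emit (word, 1)
        }
    }
}
// The matching reducer signature would be:
//   Reducer<Text, IntWritable, Text, IntWritable>
//   (intermediate key, intermediate value, final key, final value)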

70. Is it possible to rename the output file, and if so, how?

Yes, it is possible to rename the output file by utilizing a multi-format output class.

71. True or false: Each mapper must generate the same number of key/value pairs as its input had.

The answer is:

False. Mapper may generate any number of key/value pairs (including zero).

72. True or false: Mappers output key/value must be of the same type as its input.

The answer is:

False. Mapper may produce key/value pairs of any type.

73. True or false: Reducer is applied to all values associated with the same key.

The answer is:

True. Reducer is applied to all values associated with the same key.

74. True or false: Reducers input key/value pairs are sorted by the key.

The answer is:

True. The reducer's input key/value pairs are sorted by the key.

75. True or false: Each reducer must generate the same number of key/value pairs as its input had.

The answer is:

False. Reducer may generate any number of key/value pairs (including zero).

76. True or false: Reducers output key/value pair must be of the same type as its input.

The answer is:

False. The statement is false in Hadoop and true in Google’s implementation.

77. What happens in case of hardware/software failure?  

The answer is:

MapReduce framework must be able to recover from both hardware (disk failures, RAM errors) and software (bugs, unexpected exceptions) errors. Both are common and expected.

78. Is it possible to start reducers while some mappers are still running? Why?

The answer is:

No. The reducer's input is grouped by key, and the last mapper could theoretically produce a key already consumed by a running reducer.

79. Define a straggler.

A straggler is a map or reduce task that takes an unusually long time to complete.

80. What does the partitioner do?

The partitioner divides the key/value pairs produced by map tasks between reducers.
81. Decide if the statement is true or false: Each combiner runs exactly once. 

The answer is:

False. The framework decides whether the combiner runs zero, one, or multiple times.

82. Explain mapper lifecycle.

The answer is:

The initialization method is called before any other method. It has no parameters and no output.

The map method is called separately for each key/value pair. It processes input key/value pairs and emits intermediate key/value pairs.

The close method runs after all input key/value pairs have been processed. The method should close all open resources. It may also emit key/value pairs.

83. Explain reducer lifecycle.

The answer is:

The initialization method is called before any other method. It has no parameters and no output.

The reduce method is called separately for each key/[values list] pair. It processes intermediate key/value pairs and emits final key/value pairs. Its input is a key and an iterator over all intermediate values associated with that key.

The close method runs after all input key/value pairs have been processed. The method should close all open resources. It may also emit key/value pairs.

84. Local Aggregation

What is local aggregation and why is it used?

The answer is:

Either a combiner or the mapper itself combines key/value pairs with the same key together. They may also do some additional preprocessing of the combined values. Only key/value pairs produced by the same mapper are combined.

Key/Value pairs created by map tasks are transferred between nodes during shuffle and sort phase. Local aggregation reduces amount of data to be transferred.

If the distribution of values over keys is skewed, data preprocessing in combiner helps to eliminate reduce stragglers.

85. What is in-mapper combining? State its advantages and disadvantages over writing a custom combiner.

The answer is:

Local aggregation (combining of key/value pairs) done inside the mapper.

The map method does not emit key/value pairs; it only updates an internal data structure. The close method combines and preprocesses all the stored data and emits the final key/value pairs. The internal data structure is initialized in the init method.

Advantages:

  • It will run exactly once. Combiner may run multiple times or not at all.
  • We are sure it will run during map phase. Combiner may run either after map phase or before reduce phase. The latter case provides no reduction in transferred data.
  • In-mapper combining is typically more effective. Combiner does not reduce amount of data produced by mappers, it only groups generated data together. That causes unnecessary object creation, destruction, serialization and deserialization.

Disadvantages:

  • Scalability bottleneck: the technique depends on having enough memory to store all partial results. We have to flush partial results regularly to avoid this. Using a combiner produces no such scalability bottleneck.
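
A minimal in-mapper combining sketch for word count (assuming the org.apache.hadoop.mapreduce API, where setup() and cleanup() play the roles of the init and close methods described above; the class name is illustrative):

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class InMapperCombiningMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private Map<String, Integer> counts;

    @Override
    protected void setup(Context context) {
        counts = new HashMap<>();                 // internal data structure, initialized once
    }

    @Override
    protected void map(LongWritable offset, Text line, Context context) {
        StringTokenizer tokens = new StringTokenizer(line.toString());
        while (tokens.hasMoreTokens()) {
            // No emit here: only update the in-memory partial counts.
            counts.merge(tokens.nextToken(), 1, Integer::sum);
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        // Emit the combined partial counts once, after all input has been processed.
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            context.write(new Text(e.getKey()), new IntWritable(e.getValue()));
        }
    }
}
// Caveat (the scalability bottleneck above): for very large key spaces the map
// should be flushed to the context periodically instead of only in cleanup().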

86. Pairs and Stripes

Explain the Pairs design pattern on a co-occurrence example. Include advantages/disadvantages against the Stripes approach, possible optimizations and their efficacy.

The answer is:

The mapper generates keys composed from pairs of words that occurred together; the value contains the number 1. The framework groups key/value pairs with the same word pair together and the reducer simply sums the values for each incoming key.

Each final pair encodes a cell in co-occurrence matrix. Local aggregation, e.g. combiner or in-mapper combining, can be used.

Advantages:

  • Simple values, less serialization/deserialization overhead.
  • Simpler memory management. No scalability bottleneck (only if in-mapper optimization would be used).

Disadvantages:

  • Huge amount of intermediate key/value pairs. Shuffle and sort phase is slower.
  • Local aggregation is less effective – too many distinct keys.

87. Explain the Stripes design pattern on a co-occurrence example. Include advantages/disadvantages against the Pairs approach, possible optimizations and their efficacy.

The answer is:

Mapper generates a distinct key from each encountered word. Associated value contains a map of all co-occurred words as map keys and number of co-occurrences as map values. Framework groups same words together and reducer merges value maps.

Each final pair encodes a row in co-occurrence matrix. Combiner or in-mapper combining can be used.

Advantages:

  • Small amount of intermediate key/value pairs. Shuffle and sort phase is faster.
  • Intermediate keys are smaller.
  • Effective local aggregation – smaller number of distinct keys.

Disadvantages:

  • Complex values, more serialization/deserialization overhead.
  • More complex memory management. As value maps may grow too big, the approach has potential for scalability bottleneck.

88. Explain scalability bottleneck caused by stripes approach.

The answer is:

Stripes solution keeps a map of co-occurred words in memory. As the amount of co-occurred words is unlimited, the map size is unlimited too. Huge map does not fit into the memory and causes paging or out of memory errors.  

89. Computing Relative Frequencies

Relative frequencies of co-occurrences problem:

Input: text documents
key: document id
value: text document

Output: key/value pairs where
key: pair(word1, word2)
value: #co-occurrences(word1, word2)/#co-occurrences(word1, any word)

Fix the following solution to the relative frequencies of co-occurrences problem:

01 class MAPPER
02   method INITIALIZE
03     H = new hash map   
04
05   method MAP(docid a, doc d)
06     for all term w in doc d do
07       for all term u in neighbors(w) do
08         H(w) = H(w) + 1
09         emit(pair(u, w), count 1)
10
11   method CLOSE
12     for all term w in H
13       emit(pair(w, *), H(w))   
14
15 class REDUCER
16   variable total_occurrences = 0
17
18   method REDUCE(pair (p, u), counts[c1, c2, ..., cn])
19     s = 0
20     for all c in counts[c1, c2, ..., cn] do
21       s = s + c
22
23     if u = *
24       total_occurrences = s
25     else
26       emit(pair p, s/total_occurrences)
27
28 class SORTING_COMPARATOR
29   method compare(key (p1, u1), key (p2, u2))
30     if p1 = p2 AND u1 = *
31       return key1 is lower
32
33     if p1 = p2 AND u2 = *
34       return key2 is lower
35
36     return compare(p1, p2)
The answer is:

The partitioner is missing: the framework could send the key/value pairs with totals to a different reducer than the key/value pairs with the word pairs.

1 class PARTITIONING_COMPARATOR
2   method compare(key (p1, u1), key (p2, u2))
3     if p1 = p2
4       return keys are equal
5
6     return keys are different

90. Describe the order inversion design pattern.

The answer is:

Order inversion is used if the algorithm requires two passes through mapper generated key/value pairs with the same key. The first pass generates some overall statistic which is then applied to data during the second pass. The reducer would need to buffer data in the memory just to be able to pass twice through them.

First pass result is calculated by mappers and stored in some internal data structure. The mapper emits the result in closing method, after all usual intermediate key/value pairs.

The pattern requires custom partitioning and sort. First pass result must come to the reducer before usual key/value pairs. Of course, it must come to the same reducer.  

91. Secondary Sorting

Describe value-to-key design pattern. 

The answer is:

The Hadoop implementation does not provide sorting for the grouped values in the reducer's input. Value-to-key conversion is used as a workaround.

Part of the value is added to the key. A custom sort then sorts primarily by the key and secondarily by the added value. A custom partitioner must move all data with the same original key to the same reducer.

92. Relational Joins

Describe a reduce-side join between tables with a one-to-one relationship.

The answer is:

The mapper produces key/value pairs with the join id as the key and the row as the value. Corresponding rows from both tables are grouped together by the framework during the shuffle and sort phase.

The reduce method in the reducer obtains the join id and two values, each representing a row from one table. The reducer joins the data.

93. Describe reduce side join between tables with one-to-many relationship.

The answer is:

We assume that the join key is the primary key in a table called S. The second table is called T. In other words, the table S is on the 'one' side of the relationship and the table T is on the 'many' side of the relationship.

We have to implement mapper, custom sorter, partitioner and reducer.

The mapper produces a key composed from the join id and a table flag. The partitioner splits the data in such a way that all key/value pairs with the same join id go to the same reducer. The custom sort puts the key/value pair generated from the table S right before the key/value pair with the same join id from the table T.

Reducers input looks like this:
((JoinId1, s)-> row)
((JoinId1, t)-> [rows])
((JoinId2, s)-> row)
((JoinId2, t)-> [rows])
...
((JoinIdn, s), row)
((JoinIdn, t), [rows])

The reducer joins all rows from s pair with all rows from following t pair.

94. Describe reduce side join between tables with many-to-many relationship.

The answer is:

We assume that data are stored in tables called S and T. The table S is smaller. We have to implement mapper, custom sorter, partitioner and reducer.

The mapper produces a key composed from the join id and a table flag. The partitioner splits the data in such a way that all key/value pairs with the same join id go to the same reducer. The custom sort puts the key/value pairs generated from the table S right before all key/value pairs with data from the table T.

Reducers input looks like this:
((JoinId1, s)-> [rows])
((JoinId1, t)-> [rows])
((JoinId2, s)-> [rows])
((JoinId2, t)-> [rows])
...
((JoinIdn, s), [rows])
((JoinIdn, t), [rows])

The reducer buffers all rows with the same JoinId from the table S into the memory and joins them with following T table rows.

All data from the smaller table must fit into memory, so the algorithm has a scalability bottleneck problem.

95. Describe map side join between two database tables.

The answer is:

Map side join works only if following assumptions hold:

  • both datasets are sorted by the join key,
  • both datasets are partitioned the same way.

The mapper maps over the larger dataset and reads the corresponding part of the smaller dataset inside the mapper. As the smaller set is partitioned the same way as the bigger one, only one map task accesses the same data. As the data are sorted by the join key, we can perform a merge join in O(n).

96. Describe a memory-backed join.

The answer is:

The smaller dataset is loaded into memory in every mapper. Mappers loop over the larger dataset and join it with the data in memory. If the smaller set is too big to fit into memory, the dataset is loaded into memcached or some other caching solution.

97. Which one is faster? Map side join or reduce side join?

The answer is:

Map side join is faster.

98. What is the difference between Hadoop 1.0 and Hadoop 2.0?

[Image: hadoop-distribution (Hadoop 1.0 vs. Hadoop 2.0 comparison chart)]

In short, Hadoop 2.0 introduces YARN, which splits the JobTracker's resource management and job scheduling responsibilities into a ResourceManager and per-application ApplicationMasters, and it adds HDFS improvements such as NameNode high availability and federation.

99. What is Difference between Secondary namenode, Checkpoint namenode & backupnode ?

A. Before we understand the difference between the NameNode siblings, we need to understand the NameNode itself.

The Secondary NameNode is deprecated and is now known as the Checkpoint node; Hadoop versions after 2.0 support the Checkpoint node.

However, the Secondary NameNode and the Backup node are not the same. The Backup node performs the same checkpointing operation, and does one thing more than the Secondary/Checkpoint NameNode: it maintains an up-to-date copy of the FSImage in memory (RAM). It is always synchronized with the NameNode, so there is no need to copy the FSImage and edit log from the NameNode.

Because the Backup node keeps the up-to-date namespace in RAM, the Backup node's RAM should be the same size as the NameNode's.

100. What is a IdentityMapper and IdentityReducer in MapReduce ?

org.apache.hadoop.mapred.lib.IdentityMapper implements the identity function, mapping inputs directly to outputs. If the MapReduce programmer does not set the Mapper class using JobConf.setMapperClass, then IdentityMapper.class is used as the default value.
org.apache.hadoop.mapred.lib.IdentityReducer performs no reduction, writing all input values directly to the output. If the MapReduce programmer does not set the Reducer class using JobConf.setReducerClass, then IdentityReducer.class is used as the default value.

Stay tuned for more Hadoop interview questions in Part II.

Learn Puppet With Me – Day 4

Estimated Reading Time: 2 minutes

In today's session, we are going to quickstart writing a basic Puppet module. A very simple example is creating a directory on remote Linux machines, so let's go with that.

Aim – How to create a directory through Puppet?

Steps:

1. Create a Puppet module called ajeet-environment:

[root@puppetmaster modules]# puppet module generate ajeet-environment

ajeet-environment

  2. Under /etc/puppetlabs/puppet/modules/ajeet-environment/manifests, ensure that these files are present:

[root@puppetmaster manifests]# pwd

/etc/puppetlabs/puppet/modules/ajeet-environment/manifests

[root@puppetmaster manifests]# ls -la

total 16

drwxr-xr-x. 2 root root 4096 Jan 20 00:50 .

drwxr-xr-x. 5 root root 4096 Jan 20 00:43 ..

-rw-r--r--. 1 root root  289 Jan 20 00:50 createafile.pp

-rw-r--r--. 1 root root 1015 Jan 20 00:38 init.pp

  3. The contents should look like this:

[root@puppetmaster manifests]# ls
createafile.pp  init.pp
[root@puppetmaster manifests]# cat createafile.pp

class createafile {

  # create a directory
  file { "/etc/sites-conf":
    ensure => "directory",
  }

  # a fuller example, including permissions and ownership
  file { "/var/log/admins-app-log":
    ensure => "directory",
    owner  => "root",
    group  => "wheel",
    mode   => 750,
  }

}

File: init.pp

class ajeet-environment {include createafile}

File: site.pp

node 'puppetagent1.cse.com' {
  include ajeet-environment
}

Machine: Puppetagent1.cse.com

[Image: puppetagent-1 (puppet agent run output on puppetagent1.cse.com)]

Hence the directory gets created on the puppetagent.

Learn Puppet with Me – Day 3

Estimated Reading Time: 12 minutes

Puppet is an open source framework and toolset for managing the configuration of computer systems. Puppet can be used to manage configuration on UNIX (including OSX) and Linux platforms, and recently Microsoft Windows platforms as well. Puppet is often used to manage a host throughout its lifecycle: from initial build and installation, to upgrades, maintenance, and finally to end-of-life, when you move services elsewhere. Puppet is designed to continuously interact with your hosts, unlike provisioning tools which build your hosts and leave them unmanaged.

Puppet has a simple operating model that is easy to understand and implement. The model is made up of three components:

• Deployment

• Configuration Language and Resource Abstraction Layer

• Transactional Layer

Puppet is usually deployed in a simple client-server model (Figure 1-2). The server is called a “Puppet master”, the Puppet client software is called an agent and the host itself is defined as a node.

The Puppet master runs as a daemon on a host and contains the configuration required for your environment. The Puppet agents connect to the Puppet master via an encrypted and authenticated connection using standard SSL, and retrieve or “pull” any configuration to be applied.

Importantly, if the Puppet agent has no configuration available or already has the required configuration then Puppet will do nothing. This means that Puppet will only make changes to your environment if they are required. The whole process is called a configuration run.

Each agent can run Puppet as a daemon, via a mechanism such as cron, or the connection can be manually triggered. The usual practice is to run Puppet as a daemon and have it periodically check with the master to confirm that its configuration is up-to-date or to retrieve any new configuration. However, many people find being able to trigger Puppet via a mechanism such as cron, or manually, better suits their needs. By default, the Puppet agent will check the master for new or changed configuration once every 30 minutes. You can configure this period to suit your environment.
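Incidentally, the check-in period is controlled by the runinterval setting (in seconds) in the agent's puppet.conf. Here is a minimal sketch, assuming you wanted an hourly run instead of the 30-minute default; the value shown is purely illustrative:

[agent]

# check in with the Puppet master every hour (3600 seconds) instead of the default 1800
runinterval = 3600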

Configuration Language and Resource Abstraction Layer

Puppet uses a declarative language to define your configuration items, which Puppet calls “resources.” This declarative nature creates an important distinction between Puppet and many other configuration tools. A declarative language makes statements about the state of your configuration – for example, it declares that a package should be installed or a service should be started.

Most configuration tools, such as a shell or Perl script, are imperative or procedural. They describe HOW things should be done rather than the desired end state – for example, most custom scripts used to manage configuration would be considered imperative. This means Puppet users just declare what the state of their hosts should be: what packages should be installed, what services should be running, etc. With Puppet, the system administrator doesn’t care HOW this state is achieved – that’s Puppet’s problem. Instead, we abstract our host’s configuration into resources.

Configuration Language

What does this declarative language mean in real terms? Let’s look at a simple example. We have an environment with Red Hat Enterprise Linux, Ubuntu, and Solaris hosts and we want to install the vim application on all our hosts. To do this manually, we’d need to write a script that does the following:

• Connects to the required hosts (including handling passwords or keys)

• Checks to see if vim is installed

• If not, uses the appropriate command for each platform to install vim, for example on Red Hat the yum command and on Ubuntu the apt-get command

• Potentially reports the results of this action to ensure completion and success

Puppet approaches this process quite differently. In Puppet, we define a configuration resource for the vim package. Each resource is made up of a type (what sort of resource is being managed: packages, services, or cron jobs), a title (the name of the resource), and a series of attributes (values that specify the state of the resource – for example, whether a service is started or stopped).

Example:

A Puppet Resource

package { "vim":
  ensure => present,
}

Resource Abstraction Layer

With our resource created, Puppet takes care of the details of how to manage that resource when our agents connect. Puppet handles the “how” by knowing how different platforms and operating systems manage certain types of resources. Each type has a number of “providers.” A provider contains the “how” of managing packages using a particular package management tool. For the package type, for example, there are more than 20 providers covering a variety of tools including yum, aptitude, pkgadd, ports, and emerge.
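If you need to, you can also pin a resource to a particular provider explicitly. The snippet below is only a hypothetical illustration (the package name and provider choice are examples, not something our configuration requires):

# hypothetical example: force the yum provider for this package resource
package { "vim-enhanced":
  ensure   => present,
  provider => yum,
}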

When an agent connects, Puppet uses a tool called “Facter” to return information about that agent, including what operating system it is running. Puppet then chooses the appropriate package provider for that operating system and uses that provider to check if the vim package is installed. For example, on Red Hat it would execute yum, on Ubuntu it would execute aptitude, and on Solaris it would use the pkg command. If the package is not installed, then Puppet will install it. If the package is already installed, Puppet does nothing. Puppet will then report its success or failure in applying the resource back to the Puppet master.

INTRODUCING FACTER AND FACTS

Facter is a system inventory tool which returns “facts” about each agent, such as its hostname, IP address, operating system and version, and other configuration items. These facts are gathered when the agent runs. The facts are then sent to the Puppet master and automatically made available as variables within Puppet. You can see the facts available on your clients by running the facter binary from the command line. Each fact is returned as a key => value pair. For example:

operatingsystem => Ubuntu

ipaddress => 10.0.0.10
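Because facts arrive in Puppet as ordinary variables, you can branch on them directly in your manifests. Here is a minimal, hypothetical sketch using the operatingsystem fact (the notify messages are illustrative only):

# hypothetical example: branch on the operatingsystem fact reported by Facter
if $operatingsystem == "Ubuntu" {
  notify { "os-check":
    message => "This agent runs Ubuntu, so apt-based providers will be used",
  }
} else {
  notify { "os-check":
    message => "This agent runs ${operatingsystem}",
  }
}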

Transactional Layer

Puppet’s transactional layer is its engine. A Puppet transaction encompasses the process of configuring each host including:

• Interpret and compile your configuration

• Communicate the compiled configuration to the agent

• Apply the configuration on the agent

• Report the results of that application to the master

The first step Puppet takes is to analyze your configuration and calculate how to apply it to your agent. To do this, Puppet creates a graph showing all resources, their relationships to each other and to each agent. This allows Puppet to work out in what order, based on relationships you create, to apply each resource to your host. This model is one of Puppet’s most powerful features. Puppet then takes the resources and compiles them into a “catalog” for each agent. The catalog is sent to the host and applied by the Puppet agent. The results of this application are then sent back to the master in the form of a report.

The transaction layer allows configurations to be created and applied repeatedly on the host. Puppet calls this idempotent, meaning multiple applications of the same operation will yield the same results. Puppet configuration can therefore be safely run multiple times with the same outcome on your host, ensuring your configuration stays consistent.

3. Understanding Puppet Components

If you look at the /etc/puppet directory, you will find the various underlying components of Puppet.

[root@puppet-server puppet]# ls -la

total 28

drwxr-xr-x. 4 puppet puppet 4096 Aug 23 11:41 .

drwxr-xr-x. 79 root root 4096 Aug 25 04:01 ..

-rwxr-xr-x. 1 puppet puppet 2552 Aug 21 17:49 auth.conf

-rwxr-xr-x. 1 puppet puppet 381 Jun 20 18:24 fileserver.conf

drwxr-xr-x. 4 puppet puppet 4096 Aug 25 06:50 manifests

drwxr-xr-x. 11 puppet puppet 4096 Aug 25 06:49 modules

-rwxr-xr-x. 1 puppet puppet 1059 Aug 23 11:41 puppet.conf

Let’s talk about what a manifest is.

“Manifest” is Puppet’s term for files containing configuration information. Manifest files have a suffix of .pp. This directory and file are often already created when the Puppet packages are installed. If they haven’t already been created, then create the manifests directory now:

# mkdir /etc/puppet/manifests

Under manifests, there are important files such as nodes.pp, site.pp and template.pp, plus a few classes and definitions. We are going to cover those here as well.

Puppet manifests are made up of a number of major components:

• Resources – Individual configuration items

• Files – Physical files you can serve out to your agents

• Templates – Template files that you can use to populate files

• Nodes – Specifies the configuration of each agent

• Classes – Collections of resources

• Definitions – Composite collections of resources

These components are wrapped in a configuration language that includes variables, conditionals, arrays and other features. Later in this post we’ll introduce you to the basics of the Puppet language and its elements, and in the next post we’ll extend your knowledge of the language by walking through an implementation of a multi-agent site managed with Puppet.
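As a small taste of those language features, here is a hypothetical fragment (the variable name, package names and message are illustrative only) showing a variable, an array and a conditional together:

# hypothetical example: an array variable expanded into one package resource
# per element, plus a conditional on a Facter fact
$web_packages = [ "httpd", "mod_ssl" ]

package { $web_packages:
  ensure => present,
}

if $operatingsystem == "CentOS" {
  notice("Managing web packages with yum on ${operatingsystem}")
}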

The site.pp file

The site.pp file tells Puppet where and what configuration to load for our clients. We’re going to store this file in a directory called manifests under the /etc/puppet directory.

Please Note: Puppet will not start without the site.pp file being present.

Our first step in creating our first agent configuration is defining and extending the site.pp file. See an example of this file in Listing 1-3.

The site.pp File

import 'nodes.pp'

$puppetserver = 'puppet.example.com'

The import directive tells Puppet to load a file called nodes.pp. This directive is used to include any Puppet configuration we want to load.

When Puppet starts, it will now load the nodes.pp file and process the contents. In this case, this file will contain the node definitions we create for each agent we connect. You can also import multiple files like so:

import 'nodes/*'

import 'classes/*'

The import statement will load all files with a suffix of .pp in the directories nodes and classes.

The $puppetserver statement sets a variable. In Puppet, configuration statements starting with a dollar sign are variables used to specify values that you can use in Puppet configuration.

In Listing 1-3, we’ve created a variable that contains the fully qualified domain name of our Puppet master, enclosed in quotes.
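Once set, the variable can be interpolated into resource attributes elsewhere in your manifests. The snippet below is a hypothetical illustration (there is no motd module in our configuration); we will see the same pattern used for real in the sudo module later:

# hypothetical example: interpolate $puppetserver into a file resource's source URL
file { "/etc/motd":
  source => "puppet://$puppetserver/modules/motd/etc/motd",
}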

Agent Configuration

Let’s add our first agent definition to the nodes.pp file we’ve just asked Puppet to import. In Puppet manifests, agents are defined using node statements.

# touch /etc/puppet/manifests/nodes.pp

You can see the node definition we’re going to add in Listing 1-4.

Listing 1-4. Our Node Configuration

node 'node1.example.com' {
  include sudo
}

Next, we specify an include directive in our node definition. The include directive specifies a collection of configuration that we want to apply to our host. There are two types of collections we can include in a node:

• Classes – a basic collection of resources

• Modules – an advanced, portable collection of resources that can include classes, definitions, and other supporting configuration

You can include multiple collections by using multiple include directives or by separating each collection with commas:

include sudo

include sshd

include vim, syslog-ng

Creating our first module

The next step is to create the sudo module. A module is a collection of manifests, resources, files, templates, classes, and definitions. A single module would contain everything required to configure a particular application. For example, it could contain all the resources (specified in manifest files), files and associated configuration to configure Apache or the sudo command on a host.

Each module needs a specific directory structure and a file called init.pp. This structure allows Puppet to automatically load modules. To perform this automatic loading, Puppet checks a series of directories called the module path. The module path is configured with the modulepath configuration option in the [main] section of the puppet.conf file. By default, Puppet looks for modules in the /etc/puppet/modules and /var/lib/puppet/modules directories, but you can add additional locations if required:

[main]

modulepath = /etc/puppet/modules:/var/lib/puppet/modules:/opt/modules

The automatic loading of modules means, unlike our nodes.pp file, modules don’t need to be loaded into Puppet using the import directive.

Module Structure

Let’s start by creating a module directory and file structure in Listing 1-5. We’re going to create this structure under the directory /etc/puppet/modules. We will name the module sudo. Modules (and classes) must be normal words containing only letters, numbers, underscores and dashes.

Listing 1-5. Module Structure

# mkdir -p /etc/puppet/modules/sudo/{files,templates,manifests}

# touch /etc/puppet/modules/sudo/manifests/init.pp

The manifests directory will hold our init.pp file and any other configuration. The init.pp file is the core of your module and every module must have one. The files directory will hold any files we wish to serve as part of our module. The templates directory will contain any templates that our module might use.

The init.pp file

Now let’s look inside our sudo module, starting with the init.pp file, which we can see in Listing 1-6.

Listing 1-6. The sudo module’s init.pp file

class sudo {

  package { sudo:
    ensure => present,
  }

  if $operatingsystem == "Ubuntu" {
    package { "sudo-ldap":
      ensure  => present,
      require => Package["sudo"],
    }
  }

  file { "/etc/sudoers":
    owner   => "root",
    group   => "root",
    mode    => 0440,
    source  => "puppet://$puppetserver/modules/sudo/etc/sudoers",
    require => Package["sudo"],
  }

}

Our sudo module’s init.pp file contains a single class, also called sudo. There are three resources in the class: two packages and a file resource. The first package resource ensures that the sudo package is installed, ensure => present. The second package resource uses Puppet’s if/else syntax to set a condition on the installation of the sudo-ldap package.

The next portion of our source value tells Puppet where to look for the file. This is the equivalent of the path to a network file share. The first portion of this share is modules,which tells us that the file is stored in a module. Next we specify the name of the module the file is contained in, in this case sudo.

Finally, we specify the path inside that module to find the file.

All files in modules are stored under the files directory; this is considered the “root” of the module’s file “share.” In our case, we would create the directory etc under the files directory and create the sudoers file in this directory.

puppet$ mkdir -p /etc/puppet/modules/sudo/files/etc

puppet$ cp /etc/sudoers /etc/puppet/modules/sudo/files/etc/sudoers

Applying Our First Configuration

We’ve created our first Puppet module! Let’s step through what will happen when we connect an agent that includes this module.

1. It will install the sudo package.

2. If it’s an Ubuntu host, then it will also install the sudo-ldap package

3. Lastly, it will download the sudoers file and install it into /etc/sudoers.

Now let’s see this in action and include our new module on the agent we’ve created, node1.example.com.

Remember, we created a node statement for our host in Listing 1-4:

node 'node1.example.com' {
  include sudo
}

When the agent connects it will now include the sudo module. To do this we run the Puppet agent again, as shown in Listing 1-7.

Listing 1-7. Applying Our First Configuration

puppet# puppet agent --server=puppet.example.com --no-daemonize --verbose --onetime

notice: Starting Puppet client version 2.6.1

info: Caching catalog for node1.example.com

info: Applying configuration version '1272631279'

notice: //sudo/Package[sudo]/ensure: created

notice: //sudo/File[/etc/sudoers]/checksum: checksum changed '{md5}9f95a522f5265b7e7945ff65369acdd2' to '{md5}d657d8d55ecdf88a2d11da73ac5662a4'

info: Filebucket[/var/lib/puppet/clientbucket]: Adding /etc/sudoers(d657d8d55ecdf88a2d11da73ac5662a4)

info: //sudo/File[/etc/sudoers]: Filebucketed /etc/sudoers to puppet with sum d657d8d55ecdf88a2d11da73ac5662a4

notice: //sudo/File[/etc/sudoers]/content: content changed '{md5}d657d8d55ecdf88a2d11da73ac5662a4'

In our next section, we are going to talk more on Puppet Modules.

How to install Nagios on Linux?

Estimated Reading Time: 8 minutes

Last week I thought of setting up Nagios on my Linux box. I installed a fresh copy of RHEL on my VirtualBox and everything went fine. I thought of putting the complete setup on my blog, and here it is: “A Complete Monitoring Tool for your Linux Box”.


Here is my Machine Configuration:

[root@irc ~]# cat /etc/redhat-release

Red Hat Enterprise Linux Server release 5.3 (Tikanga)

[root@irc ~]#

[root@irc ~]# uname -arn

Linux irc.chatserver.com 2.6.18-128.el5 #1 SMP Wed Dec 17 11:41:38 EST 2008 x86_64 x86_64 x86_64 GNU/Linux

[root@irc ~]#

1) Create Account Information

Become the root user.

su -l

Create a new nagios user account and give it a password.

/usr/sbin/useradd -m nagios

passwd nagios

Create a new nagcmd group for allowing external commands to be submitted through the web interface. Add both the nagios user and the apache user to the group.

/usr/sbin/groupadd nagcmd

/usr/sbin/usermod -a -G nagcmd nagios

/usr/sbin/usermod -a -G nagcmd apache

2) Download Nagios and the Plugins

Create a directory for storing the downloads.

mkdir ~/downloads

cd ~/downloads

Download the source code tarballs of both Nagios and the Nagios plugins (visit http://www.nagios.org/download/ for links to the latest versions). These directions were tested with Nagios 3.2.0 and Nagios Plugins 1.4.11.

wget http://prdownloads.sourceforge.net/sourceforge/nagios/nagios-3.2.0.tar.gz

wget http://prdownloads.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.11.tar.gz

3) Compile and Install Nagios

Extract the Nagios source code tarball.

cd ~/downloads

tar xzf nagios-3.2.0.tar.gz

cd nagios-3.2.0

Run the Nagios configure script, passing the name of the group you created earlier like so:

./configure --with-command-group=nagcmd

Compile the Nagios source code.

make all

Install binaries, init script, sample config files and set permissions on the external command directory.

make install

make install-init

make install-config

make install-commandmode

Don’t start Nagios yet – there’s still more that needs to be done…

4) Customize Configuration

Sample configuration files have now been installed in the /usr/local/nagios/etc directory. These sample files should work fine for getting started with Nagios. You’ll need to make just one change before you proceed…

Edit the /usr/local/nagios/etc/objects/contacts.cfg config file with your favorite editor and change the email address associated with the nagiosadmin contact definition to the address you’d like to use for receiving alerts.

vi /usr/local/nagios/etc/objects/contacts.cfg

5) Configure the Web Interface

Install the Nagios web config file in the Apache conf.d directory.

make install-webconf

Create a nagiosadmin account for logging into the Nagios web interface. Remember the password you assign to this account – you’ll need it later.

htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin

Restart Apache to make the new settings take effect.

service httpd restart

Note: Consider implementing the enhanced CGI security measures described in the Nagios documentation to ensure that your web authentication credentials are not compromised.

6) Compile and Install the Nagios Plugins

Extract the Nagios plugins source code tarball.

cd ~/downloads

tar xzf nagios-plugins-1.4.11.tar.gz

cd nagios-plugins-1.4.11

Compile and install the plugins.

./configure --with-nagios-user=nagios --with-nagios-group=nagios

make

make install

7) Start Nagios

Add Nagios to the list of system services and have it automatically start when the system boots.

chkconfig --add nagios

chkconfig nagios on

Verify the sample Nagios configuration files.

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

If there are no errors, start Nagios.

service nagios start

8) Modify SELinux Settings

Red Hat Enterprise Linux (like Fedora) ships with SELinux (Security Enhanced Linux) installed and in Enforcing mode by default. This can result in “Internal Server Error” messages when you attempt to access the Nagios CGIs.

See if SELinux is in Enforcing mode.

getenforce

Put SELinux into Permissive mode.

setenforce 0

To make this change permanent, you’ll have to modify the settings in /etc/selinux/config (set SELINUX=permissive) and reboot.

Instead of disabling SELinux or setting it to permissive mode, you can use the following command to run the CGIs under SELinux enforcing/targeted mode:

chcon -R -t httpd_sys_content_t /usr/local/nagios/sbin/

chcon -R -t httpd_sys_content_t /usr/local/nagios/share/

For information on running the Nagios CGIs under Enforcing mode with a targeted policy, visit the Nagios Support Portal or Nagios Community Wiki.

9) Login to the Web Interface

You should now be able to access the Nagios web interface at the URL below. You’ll be prompted for the username (nagiosadmin) and password you specified earlier.

http://localhost/nagios/

Click on the “Service Detail” navbar link to see details of what’s being monitored on your local machine. It will take a few minutes for Nagios to check all the services associated with your machine, as the checks are spread out over time.

10) Other Modifications

Make sure your machine’s firewall rules are configured to allow access to the web server if you want to access the Nagios interface remotely.

Configuring email notifications is out of the scope of this documentation. While Nagios is currently configured to send you email notifications, your system may not yet have a mail program properly installed or configured. Refer to your system documentation, search the web, or look to the Nagios Support Portal or Nagios Community Wiki for specific instructions on configuring your system to send email messages to external addresses. More information on notifications can be found in the Nagios documentation.

11) You’re Done

Congratulations! You successfully installed Nagios. Your journey into monitoring is just beginning.

Example:

Say your Nagios server is 10.14.236.140 and you need to monitor the Linux machine with IP 10.14.236.70. You would proceed like this:

[root@irc objects]# pwd

/usr/local/nagios/etc/objects

[root@irc objects]#

[root@irc objects]# ls

commands.cfg localhost.cfg printer.cfg switch.cfg timeperiods.cfg

contacts.cfg localhost.cfg.orig remotehost.cfg templates.cfg windows.cfg

[root@irc objects]#

The configuration file should look like this:

# HOST DEFINITION

#

###############################################################################

###############################################################################

# Define a host for the local machine

define host{

use linux-server ; Name of host template to use

; This host definition will inherit all variables that are defined

; in (or inherited by) the linux-server host template definition.

host_name localhost

alias localhost

address 127.0.0.1

}

define host{

use linux-server ; Name of host template to use

; This host definition will inherit all variables that are defined

; in (or inherited by) the linux-server host template definition.

host_name ideath.logic.com

alias ideath

address 10.14.236.70

}

###############################################################################

###############################################################################

#

# HOST GROUP DEFINITION

#

###############################################################################

###############################################################################

# Define an optional hostgroup for Linux machines

define hostgroup{

hostgroup_name linux-server ; The name of the hostgroup

alias Linux Servers ; Long name of the group

members localhost ; Comma separated list of hosts that belong to this group

}

###############################################################################

###############################################################################

#

# SERVICE DEFINITIONS

#

###############################################################################

###############################################################################

# Define a service to "ping" the local machine

define service{

use local-service ; Name of service template to use

host_name localhost

service_description PING

check_command check_ping!100.0,20%!500.0,60%

}

define service{

use local-service ; Name of service template to use

host_name ideath.logic.com

service_description PING

check_command check_ping!100.0,20%!500.0,60%

}

# Define a service to check the disk space of the root partition

# on the local machine. Warning if < 20% free, critical if

# < 10% free space on partition.

define service{

use local-service ; Name of service template to use

host_name localhost

service_description Root Partition

check_command check_local_disk!20%!10%!/

}

define service{

use local-service ; Name of service template to use

host_name ideath.logic.com

service_description Root Partition

check_command check_local_disk!20%!10%!/

}

# Define a service to check the number of currently logged in

# users on the local machine. Warning if > 20 users, critical

# if > 50 users.

define service{

use local-service ; Name of service template to use

host_name localhost

service_description Current Users

check_command check_local_users!20!50

}

define service{

use local-service ; Name of service template to use

host_name ideath.logic.com

service_description Current Users

check_command check_local_users!20!50

}

# Define a service to check the number of currently running procs

# on the local machine. Warning if > 250 processes, critical if

# > 400 processes.

define service{

use local-service ; Name of service template to use

host_name localhost

service_description Total Processes

check_command check_local_procs!250!400!RSZDT

}

define service{

use local-service ; Name of service template to use

host_name ideath.logic.com

service_description Total Processes

check_command check_local_procs!250!400!RSZDT

}

# Define a service to check the load on the local machine.

define service{

use local-service ; Name of service template to use

host_name localhost

service_description Current Load

check_command check_local_load!5.0,4.0,3.0!10.0,6.0,4.0

}

define service{

use local-service ; Name of service template to use

host_name ideath.logic.com

service_description Current Load

check_command check_local_load!5.0,4.0,3.0!10.0,6.0,4.0

}

# Define a service to check the swap usage of the local machine.

# Critical if less than 10% of swap is free, warning if less than 20% is free

define service{

use local-service ; Name of service template to use

host_name localhost

service_description Swap Usage

check_command check_local_swap!20!10

}

define service{

use local-service ; Name of service template to use

host_name ideath.logic.com

service_description Swap Usage

check_command check_local_swap!20!10

}

# Define a service to check SSH on the local machine.

# Disable notifications for this service by default, as not all users may have SSH enabled.

define service{

use local-service ; Name of service template to use

host_name localhost

service_description SSH

check_command check_ssh

notifications_enabled 0

}

define service{

use local-service ; Name of service template to use

host_name ideath.logic.com

service_description SSH

check_command check_ssh

check_period 24x7

notifications_enabled 0

is_volatile 0

max_check_attempts 4

normal_check_interval 5

retry_check_interval 1

contact_groups admins

notification_options w,c,u,r

notification_interval 960

notification_period 24x7

}

# Define a service to check HTTP on the local machine.

# Disable notifications for this service by default, as not all users may have HTTP enabled.

define service{

use local-service ; Name of service template to use

host_name localhost

service_description HTTP

check_command check_http

notifications_enabled 0

}

define service{

use local-service ; Name of service template to use

host_name ideath.logic.com

service_description HTTP

check_command check_http

notifications_enabled 0

is_volatile 0

max_check_attempts 4

normal_check_interval 5

retry_check_interval 1

contact_groups admins

notification_options w,c,u,r

notification_interval 960

notification_period 24x7

}

ideath.logic.com is the hostname of 10.14.236.70.

Do make an entry in /etc/hosts (for example, a line like "10.14.236.70 ideath.logic.com") if the Nagios server is unable to resolve the hostname, or else check your DNS.