Skip to content

rocketlawyer/cloudera-centos

Repository files navigation

Cloudera Playbook

An Ansible Playbook that installs the Cloudera stack on RHEL/CentOS

Running the playbook

$ vi ~/.ansible.cfg

[defaults]
# disable key check if host is not initially in 'known_hosts'
host_key_checking = False

[ssh_connection]
# if True, make ansible use scp if the connection type is ssh (default is sftp)
scp_if_ssh = True
$ vi ~/ansible_hosts

[scm_server]
<host>        license_file=/path/to/cloudera_license.txt

[db_server]
<host>

[krb5_server]
<host>        default_realm=<REALM>

[utility_servers:children]
scm_server
db_server
krb5_server

[gateway_servers]
<host>        host_template=HostTemplate-Gateway role_ref_names=HDFS-HTTPFS-1

[master_servers]
<host>        host_template=HostTemplate-Master1
<host>        host_template=HostTemplate-Master2
<host>        host_template=HostTemplate-Master3

[worker_servers]
<host>
<host>
<host>

[worker_servers:vars]
host_template=HostTemplate-Workers

[cdh_servers:children]
utility_servers
gateway_servers
master_servers
worker_servers
  • Run playbook
$ ansible-playbook -i hosts site.yml
    

Ansible communicates with the hosts defined in the inventory over SSH. It assumes you’re using SSH keys to authenticate so your public SSH key should exist in authorized_keys on those hosts. Your user will need sudo privileges to install the required packages.

By default Ansible will connect to the remote hosts using the current user (as SSH would). To override the remote user name you can specify the --user option in the command, or add the following variables to the inventory:

[all:vars]
ansible_user=pythian

Overriding CDH service/role configuration

The playbook uses Cloudera Manager Templates to provision a cluster. As part of the template import process Cloudera Manager applies Autoconfiguration rules that set properties such as memory and CPU allocations for various roles.

If the cluster has different hardware or operational requirements then you can override these properties in group_vars/cdh_servers. For example:

cdh_services:
  - type: hdfs        
    datanode_java_heapsize: 10737418240

These properties get added as variables to the rendered template's instantiator block and can be referenced from the service configs. For example roles/cdh/templates/hdfs.j2:

"roleType": "DATANODE",
"configs": [{
  "name": "datanode_java_heapsize",
  "variable": "DATANODE_JAVA_HEAPSIZE"
}

Backup Restore

Default location of backup is at /mnt/fuse_stor/testfile.txt. Variable can be found at

backup_file: testfile.txt

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages