Storage

Configure DC/OS Local Volume

yum -y install gdisk  
yum -y install util-linux  
gdisk /dev/sdc                                            # interactively create the new partition /dev/sdc1 (Linux LVM type)  
vgextend vgVAR /dev/sdc1                                  # add the new partition to the vgVAR volume group  
lvextend --extents +100%FREE /dev/vgVAR/lvVAR /dev/sdc1   # grow lvVAR with all free extents from /dev/sdc1  
xfs_growfs /dev/mapper/vgVAR-lvVAR                        # grow the XFS filesystem to match the new LV size
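
A quick way to verify the extension took effect, using standard LVM and filesystem tools (a convenience check, assuming lvVAR is mounted at /var):

lsblk /dev/sdc                # the new sdc1 partition should be listed under /dev/sdc  
vgs vgVAR && lvs vgVAR        # volume group / logical volume sizes after the extension  
df -h /var                    # filesystem size as seen by the OS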

/var/lib/dcos/mesos-resources -- the current Mesos Agent resource state  
/var/lib/mesos/slave/meta/slaves/latest -- Mesos Agent checkpoint state

mkdir -p /dcos/volume0 && mkdir -p /var/local/ImageSilo  
dd if=/dev/zero of=/var/local/ImageSilo/volume0.img bs=1M count=200000    # ~200 GB backing file  
losetup /dev/loop0 /var/local/ImageSilo/volume0.img  
mkfs -t xfs /dev/loop0  
losetup -d /dev/loop0  
systemctl stop dcos-mesos-slave.service  
rm -f /var/lib/dcos/mesos-resources                       # clear the agent resource state so the new volume is picked up  
rm -f /var/lib/mesos/slave/meta/slaves/latest             # clear the agent checkpoint state  
echo "/var/local/ImageSilo/volume0.img /dcos/volume0 auto loop 0 2" | sudo tee -a /etc/fstab  
mount /dcos/volume0  
shutdown -r now
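
After the reboot, it is worth confirming the volume is mounted and the agent came back up; the last check assumes DC/OS regenerates /var/lib/dcos/mesos-resources on agent start and lists the new mount volume there:

df -h /dcos/volume0                            # ~200 GB XFS filesystem backed by volume0.img  
losetup -a                                     # the loop device for volume0.img should be attached again  
systemctl status dcos-mesos-slave.service      # agent running again  
cat /var/lib/dcos/mesos-resources              # regenerated resource state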

Destroy HDFS Service

docker run mesosphere/janitor /janitor.py -r hdfs-role -p hdfs-principal -z hdfs  
(run on the leader node) curl -d 'frameworkId=149b8e15-6168-42fb-a2d8-785f4159eaa9-1649' -X POST http://192.168.2.210:5050/master/teardown  
docker run mesosphere/janitor /janitor.py -z dcos-service-hdfs  
docker run mesosphere/janitor /janitor.py -z hadoop-ha
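
The frameworkId passed to /master/teardown can be looked up from the Mesos master state endpoint instead of being hard-coded; a small sketch with curl and jq (the name filter is an assumption about how the framework registered, e.g. "hdfs" here or "kafka" for the Kafka teardown below):

curl -s http://192.168.2.210:5050/master/state | jq -r '.frameworks[] | select(.name == "hdfs") | .id'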

Destroy Kafka Service

(run on the leader node) curl -d 'frameworkId=4e99a619-30d4-4d10-a640-78d45a3aab38-0067' -X POST http://192.168.2.210:5050/master/teardown
docker run mesosphere/janitor /janitor.py -z dcos-service-kafka

HDFS resiliency model for an HA deployment

A quick summary of the HDFS resiliency model for an HA deployment like this one:

  • The two NameNodes form an active/standby pair. In the event of a machine restart of the active, the system detects the failure and the standby takes over as the new active. Once the machine completes its restart, the NameNode process runs again, and it becomes the new standby. There is no downtime unless both NameNodes are down simultaneously. The data on the host (e.g. the fsimage metadata file) is typically maintained across restarts. If this is not the case in your environment, then you'll need additional recovery steps to re-establish the standby, such as running the

    hdfs namenode -bootstrapStandby

    command.

  • The 3 JournalNodes form a quorum. In the event of a machine restart, the NameNode can continue writing its edit log transactions to the remaining 2 JournalNodes. Once the machine completes its restart, the JournalNode process runs again, catches up on the transactions it missed, and the NameNode resumes writing to all 3. There is no downtime unless 2 or more JournalNodes are down simultaneously. If data (e.g. the edits files) is not maintained across restarts, then the restarted JournalNode will catch up by copying from a running JournalNode.

  • DataNodes are mostly disposable. In the event of a machine restart, clients will be rerouted to other running DataNodes for their reads and writes (assuming the typical replication factor of 3). Once the machine completes its restart, the DataNode process runs again, and it can start serving read/write traffic from clients again. There is no downtime unless a mass simultaneous failure event (extremely unlikely, and probably correlated with bigger data center problems) takes down all of the DataNodes hosting replicas of a particular block at the same time. If data (the block file directory) is not maintained across restarts, then after a restart it will look like a whole new DataNode coming online. If this causes cluster imbalance, it can be remedied by running the HDFS Balancer (see the example commands after this list).
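
A few stock HDFS commands are handy for confirming the behavior above after a restart. This is a minimal sketch: nn1/nn2 are assumed HA service IDs and must match dfs.ha.namenodes.hdfs in the generated hdfs-site.xml, and the balancer threshold is just an example value.

bin/hdfs haadmin -getServiceState nn1          # prints "active" or "standby"  
bin/hdfs haadmin -getServiceState nn2  
bin/hdfs dfsadmin -report                      # live/dead DataNodes and per-node usage  
bin/hdfs balancer -threshold 10                # rebalance to within 10% of the cluster average utilization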

dcos hdfs config target

{
  "service": {
    "role": "hdfs-role",
    "principal": "hdfs-principal",
    "user": "nobody",
    "name": "hdfs",
    "secret": "",
    "checkpoint": true
  },
  "executor": {
    "hdfs_url": "http://192.168.2.240:8800/hdfs/assets/hadoop-2.6.0-cdh5.7.1-dcos.tar.gz",
    "executor_url": "http://192.168.2.240:8800/hdfs/assets/0.9.2-2.6.0/executor.zip",
    "jre_url": "http://192.168.2.240:8800/hdfs/assets/jre-8u91-linux-x64.tar.gz",
    "hdfs_version": "2.5.0",
    "command": "./executor/bin/hdfs-executor executor/conf/executor.yml",
    "java_home": "./jre1.8.0_91",
    "hdfs_home": "./hadoop-2.6.0-cdh5.7.1",
    "cpus": 0.5,
    "memory_mb": 1024,
    "heap_mb": 768,
    "disk_mb": 1024
  },
  "hdfs": {
    "service_name": "hdfs",
    "name_node_bind_host": "0.0.0.0",
    "name_node_rpc_port": 9001,
    "name_node_http_port": 9002,
    "journal_nodes": 3,
    "journal_node_address": "0.0.0.0",
    "journal_node_rpc_port": 8485,
    "journal_node_http_port": 8480,
    "data_node_address": "0.0.0.0",
    "data_node_rpc_port": 9003,
    "data_node_http_port": 9004,
    "data_node_ipc_port": 9005,
    "volume_directory": "volume",
    "domain_socket_directory": "",
    "zookeeper_quorum": "master.mesos:2181",
    "permissions_enabled": false,
    "data_node_bandwidth_per_second": 41943040,
    "name_node_threshold_percentage": 0.9,
    "name_node_heartbeat_recheck_interval": 60000,
    "data_node_handler_count": 10,
    "name_node_handler_count": 20,
    "compress_image": true,
    "image_compression_codec": "org.apache.hadoop.io.compress.SnappyCodec",
    "name_node_invalidate_work_percentage": 0.95,
    "name_node_replication_work_multiplier": 4,
    "client_read_short_circuit": false,
    "client_read_short_circuit_streams": 1000,
    "client_read_short_circuit_cache_expiry_ms": 1000
  },
  "core": {
    "default_name": "hdfs://hdfs",
    "hue_hosts": "*",
    "hue_groups": "*",
    "root_hosts": "*",
    "root_groups": "*",
    "http_fs_hosts": "*",
    "http_fs_groups": "*"
  },
  "curator": "master.mesos:2181",
  "nameNode": {
    "cpus": 0.5,
    "memory_mb": 4096,
    "heap_mb": 2048,
    "disk_mb": 10240,
    "disk_type": "ROOT"
  },
  "journalNode": {
    "cpus": 0.5,
    "memory_mb": 2048,
    "heap_mb": 2048,
    "disk_mb": 10240,
    "disk_type": "ROOT"
  },
  "dataNode": {
    "cpus": 0.5,
    "memory_mb": 4096,
    "heap_mb": 2048,
    "disk_mb": 102400,
    "disk_type": "ROOT"
  },
  "dataNodesCount": 5
}
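
Individual values can be pulled out of the target configuration with jq, e.g. the NameNode RPC port used by the client commands further down (assuming jq is installed and the CLI prints the raw JSON shown above):

dcos hdfs config target | jq '.hdfs.name_node_rpc_port'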

dcos hdfs plan show

{
  "phases": [
    {
      "id": "902bd1ed-9970-4d53-a48a-399910cbd7be",
      "name": "Reconciliation",
      "blocks": [
        {
          "id": "e5c3f8f4-9ffb-423f-b485-b1033b4c5fb7",
          "status": "Complete",
          "name": "Reconciliation",
          "message": "Reconciliation complete",
          "has_decision_point": false
        }
      ],
      "status": "Complete"
    },
    {
      "id": "4bb3231a-5419-4b32-a5b5-01ecbba4f242",
      "name": "Quorum Journal",
      "blocks": [
        {
          "id": "12c7d935-7790-43a5-a8a7-5071881e632f",
          "status": "Complete",
          "name": "journalnode-0",
          "message": "journalnode-0 in target state.",
          "has_decision_point": false
        },
        {
          "id": "6da34299-6d47-4824-afcd-bc1b488c3e9a",
          "status": "Complete",
          "name": "journalnode-1",
          "message": "journalnode-1 in target state.",
          "has_decision_point": false
        },
        {
          "id": "1e68c2be-cc6b-4224-87f1-8eb9b0b9b681",
          "status": "Complete",
          "name": "journalnode-2",
          "message": "journalnode-2 in target state.",
          "has_decision_point": false
        }
      ],
      "status": "Complete"
    },
    {
      "id": "f5351d29-9824-4e65-8edb-58c95dc8cfd6",
      "name": "Name Service",
      "blocks": [
        {
          "id": "09c41be5-7156-4fa8-aa41-f205bdf8b2d8",
          "status": "Complete",
          "name": "namenode-0",
          "message": "namenode-0 in target state.",
          "has_decision_point": false
        },
        {
          "id": "9f7f865e-2104-4488-a1fc-2cbaec4ef441",
          "status": "Complete",
          "name": "namenode-1",
          "message": "namenode-1 in target state.",
          "has_decision_point": false
        }
      ],
      "status": "Complete"
    },
    {
      "id": "9ed50be2-8c4a-4bb4-96b5-8e05111046be",
      "name": "Distributed Storage",
      "blocks": [
        {
          "id": "f3c9d505-84e1-4ac9-bd73-3bf2ea6a7e57",
          "status": "Complete",
          "name": "datanode-0",
          "message": "datanode-0 in target state.",
          "has_decision_point": false
        },
        {
          "id": "9d58748e-484a-4ffa-9598-5bf283e4a23e",
          "status": "Complete",
          "name": "datanode-1",
          "message": "datanode-1 in target state.",
          "has_decision_point": false
        },
        {
          "id": "437860f3-f865-4ce7-8c1b-ae88cb675904",
          "status": "Complete",
          "name": "datanode-2",
          "message": "datanode-2 in target state.",
          "has_decision_point": false
        },
        {
          "id": "81687ff3-ae52-49aa-9602-482f4508e531",
          "status": "Complete",
          "name": "datanode-3",
          "message": "datanode-3 in target state.",
          "has_decision_point": false
        },
        {
          "id": "e6e07056-3432-4422-b233-288c4f4b1d92",
          "status": "Complete",
          "name": "datanode-4",
          "message": "datanode-4 in target state.",
          "has_decision_point": false
        }
      ],
      "status": "Complete"
    }
  ],
  "errors": [],
  "status": "Complete"
}
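
The same approach works on the plan output, e.g. to list any blocks that are not yet Complete while a deployment or configuration update is rolling out (again assuming jq):

dcos hdfs plan show | jq '.phases[].blocks[] | select(.status != "Complete") | {name, status, message}'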

HDFS commands

bin/hadoop fs -copyToLocal hdfs://name-0-node.hdfs.mesos:9001/parquet /var/hdfs_backup
bin/hadoop fs -ls hdfs://name-0-node.hdfs.mesos:9001/
bin/hadoop fs -copyFromLocal -p /var/hdfs_backup/* hdfs://name-0-node.hdfs.mesos:9001/
bin/hadoop fs -getmerge hdfs://name-0-node.hdfs.mesos:9001/csv/DPM_Loss.csv /tmp/DPM_Loss.csv
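
Because core.default_name is hdfs://hdfs, the same commands can also target the logical nameservice instead of a specific NameNode, so they keep working across a failover; this assumes the client resolves the nameservice from the cluster's core-site.xml/hdfs-site.xml:

bin/hadoop fs -ls hdfs://hdfs/  
bin/hadoop fs -copyToLocal hdfs://hdfs/parquet /var/hdfs_backup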
