Tom White - Hadoop: The Definitive Guide, 4th Edition (O'Reilly, 2015)
You can start the balancer with:

% start-balancer.sh

The -threshold argument specifies the threshold percentage that defines what it means for the cluster to be balanced. The flag is optional; if omitted, the threshold is 10%. At any one time, only one balancer may be running on the cluster.

The balancer runs until the cluster is balanced, it cannot move any more blocks, or it loses contact with the namenode. It produces a logfile in the standard log directory, where it writes a line for every iteration of redistribution that it carries out. Here is the output from a short run on a small cluster (slightly reformatted to fit the page):

Time Stamp               Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved
Mar 18, 2009 5:23:42 PM  0           0 KB                 219.21 MB           150.29 MB
Mar 18, 2009 5:27:14 PM  1           195.24 MB            22.45 MB            150.29 MB
The cluster is balanced. Exiting...
Balancing took 6.072933333333333 minutes

The balancer is designed to run in the background without unduly taxing the cluster or interfering with other clients using the cluster. It limits the bandwidth that it uses to copy a block from one node to another. The default is a modest 1 MB/s, but this can be changed by setting the dfs.datanode.balance.bandwidthPerSec property in hdfs-site.xml, specified in bytes.
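As an illustration of both knobs, the sketch below requests a tighter 5% threshold and raises the copy bandwidth to 10 MB/s (10,485,760 bytes); the values are examples, not recommendations:

% start-balancer.sh -threshold 5

<!-- hdfs-site.xml: raise the balancer copy bandwidth (value in bytes per second) -->
<property>
  <name>dfs.datanode.balance.bandwidthPerSec</name>
  <value>10485760</value>
</property>

A higher bandwidth makes rebalancing finish sooner, at the cost of more competition with regular client traffic.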
Monitoring

Monitoring is an important part of system administration. In this section, we look at the monitoring facilities in Hadoop and how they can hook into external monitoring systems.

The purpose of monitoring is to detect when the cluster is not providing the expected level of service. The master daemons are the most important to monitor: the namenodes (primary and secondary) and the resource manager. Failure of datanodes and node managers is to be expected, particularly on larger clusters, so you should provide extra capacity so that the cluster can tolerate having a small percentage of dead nodes at any time.

In addition to the facilities described next, some administrators run test jobs on a periodic basis as a test of the cluster's health.

Logging

All Hadoop daemons produce logfiles that can be very useful for finding out what is happening in the system. "System logfiles" on page 295 explains how to configure these files.

Setting log levels

When debugging a problem, it is very convenient to be able to change the log level temporarily for a particular component in the system.

Hadoop daemons have a web page for changing the log level for any log4j log name, which can be found at /logLevel in the daemon's web UI. By convention, log names in Hadoop correspond to the names of the classes doing the logging, although there are exceptions to this rule, so you should consult the source code to find log names.

It's also possible to enable logging for all packages that start with a given prefix. For example, to enable debug logging for all classes related to the resource manager, we would visit its web UI at http://resource-manager-host:8088/logLevel and set the log name org.apache.hadoop.yarn.server.resourcemanager to level DEBUG.

The same thing can be achieved from the command line as follows:

% hadoop daemonlog -setlevel resource-manager-host:8088 \
  org.apache.hadoop.yarn.server.resourcemanager DEBUG

Log levels changed in this way are reset when the daemon restarts, which is usually what you want. However, to make a persistent change to a log level, you can simply change the log4j.properties file in the configuration directory. In this case, the line to add is:

log4j.logger.org.apache.hadoop.yarn.server.resourcemanager=DEBUG
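The daemonlog tool can also read the current level back, which is a quick way to confirm that a change has taken effect; the example below reuses the same host and log name as above:

% hadoop daemonlog -getlevel resource-manager-host:8088 \
  org.apache.hadoop.yarn.server.resourcemanager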
Getting stack traces

Hadoop daemons expose a web page (/stacks in the web UI) that produces a thread dump for all running threads in the daemon's JVM. For example, you can get a thread dump for a resource manager from http://resource-manager-host:8088/stacks.
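Because the thread dump is served as plain text over HTTP, it is easy to capture from a script when investigating a hung or slow daemon. A minimal example, assuming curl is available on the administration machine:

% curl http://resource-manager-host:8088/stacks > rm-stacks-$(date +%Y%m%d%H%M%S).txt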
Metrics and JMX

The Hadoop daemons collect information about events and measurements that are collectively known as metrics. For example, datanodes collect the following metrics (and many more): the number of bytes written, the number of blocks replicated, and the number of read requests from clients (both local and remote).

The metrics system in Hadoop 2 and later is sometimes referred to as metrics2 to distinguish it from the older (now deprecated) metrics system in earlier versions of Hadoop.

Metrics belong to a context; "dfs," "mapred," "yarn," and "rpc" are examples of different contexts. Hadoop daemons usually collect metrics under several contexts. For example, datanodes collect metrics for the "dfs" and "rpc" contexts.

How Do Metrics Differ from Counters?

The main difference is their scope: metrics are collected by Hadoop daemons, whereas counters (see "Counters" on page 247) are collected for MapReduce tasks and aggregated for the whole job. They have different audiences, too: broadly speaking, metrics are for administrators, and counters are for MapReduce users.

The way they are collected and aggregated is also different. Counters are a MapReduce feature, and the MapReduce system ensures that counter values are propagated from the task JVMs where they are produced back to the application master, and finally back to the client running the MapReduce job. (Counters are propagated via RPC heartbeats; see "Progress and Status Updates" on page 190.) Both the task process and the application master perform aggregation.

The collection mechanism for metrics is decoupled from the component that receives the updates, and there are various pluggable outputs, including local files, Ganglia, and JMX. The daemon collecting the metrics performs aggregation on them before they are sent to the output.
All Hadoop metrics are published to JMX (Java Management Extensions), so you can use standard JMX tools like JConsole (which comes with the JDK) to view them. For remote monitoring, you must set the JMX system property com.sun.management.jmxremote.port (and others for security) to allow access. To do this for the namenode, say, you would set the following in hadoop-env.sh:

HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote.port=8004"

You can also view JMX metrics (in JSON format) gathered by a particular Hadoop daemon by connecting to its /jmx web page. This is handy for debugging. For example, you can view namenode metrics at http://namenode-host:50070/jmx.
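Since /jmx returns JSON, it is also convenient for ad hoc scripting. The servlet accepts a qry parameter for selecting a single MBean; the MBean name shown here is an example and may vary between Hadoop versions:

% curl http://namenode-host:50070/jmx
% curl 'http://namenode-host:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystemState'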
Hadoop comes with a number of metrics sinks for publishing metrics to external systems, such as local files or the Ganglia monitoring system. Sinks are configured in the hadoop-metrics2.properties file; see that file for sample configuration settings.
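To give a flavor of the syntax, the sketch below writes namenode and datanode metrics to local files every 10 seconds; the shipped hadoop-metrics2.properties contains similar (commented-out) examples, and the sink instance name "file" and the filenames are arbitrary choices:

# hadoop-metrics2.properties (sketch)
*.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink
*.period=10
namenode.sink.file.filename=namenode-metrics.out
datanode.sink.file.filename=datanode-metrics.out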
Maintenance

Routine Administration Procedures

Metadata backups

If the namenode's persistent metadata is lost or damaged, the entire filesystem is rendered unusable, so it is critical that backups are made of these files. You should keep multiple copies of different ages (one hour, one day, one week, and one month, say) to protect against corruption, either in the copies themselves or in the live files running on the namenode.

A straightforward way to make backups is to use the dfsadmin command to download a copy of the namenode's most recent fsimage:

% hdfs dfsadmin -fetchImage fsimage.backup

You can write a script to run this command from an offsite location to store archive copies of the fsimage. The script should additionally test the integrity of the copy. This can be done by starting a local namenode daemon and verifying that it has successfully read the fsimage and edits files into memory (by scanning the namenode log for the appropriate success message, for example).2

2. Hadoop comes with an Offline Image Viewer and an Offline Edits Viewer, which can be used to check the integrity of the fsimage and edits files. Note that both viewers support older formats of these files, so you can use them to diagnose problems in these files generated by previous releases of Hadoop. Type hdfs oiv and hdfs oev to invoke these tools.
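Such a script might look like the following sketch, run nightly from cron on a machine outside the cluster. The backup location, retention scheme, and the optional Offline Image Viewer check are illustrative assumptions, and -fetchImage is assumed to be given a local directory to write into:

#!/usr/bin/env bash
# Sketch: fetch the most recent fsimage into a dated backup directory.
set -e
backup_dir=/var/backups/hdfs-metadata/$(date +%Y%m%d)   # illustrative path
mkdir -p "$backup_dir"
hdfs dfsadmin -fetchImage "$backup_dir"
# Optional sanity check with the Offline Image Viewer (see footnote 2):
# hdfs oiv -p XML -i "$backup_dir"/fsimage* -o "$backup_dir/fsimage.xml"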
Data backups

Although HDFS is designed to store data reliably, data loss can occur, just like in any storage system; thus, a backup strategy is essential. With the large data volumes that Hadoop can store, deciding what data to back up and where to store it is a challenge. The key here is to prioritize your data. The highest priority is the data that cannot be regenerated and that is critical to the business; however, data that is either straightforward to regenerate or essentially disposable because it is of limited business value is the lowest priority, and you may choose not to make backups of this low-priority data.

Do not make the mistake of thinking that HDFS replication is a substitute for making backups. Bugs in HDFS can cause replicas to be lost, and so can hardware failures. Although Hadoop is expressly designed so that hardware failure is very unlikely to result in data loss, the possibility can never be completely ruled out, particularly when combined with software bugs or human error.

When it comes to backups, think of HDFS in the same way as you would RAID. Although the data will survive the loss of an individual RAID disk, it may not survive if the RAID controller fails or is buggy (perhaps overwriting some data), or the entire array is damaged.

It's common to have a policy for user directories in HDFS. For example, they may have space quotas and be backed up nightly. Whatever the policy, make sure your users know what it is, so they know what to expect.

The distcp tool is ideal for making backups to other HDFS clusters (preferably running on a different version of the software, to guard against loss due to bugs in HDFS) or other Hadoop filesystems (such as S3) because it can copy files in parallel. Alternatively, you can employ an entirely different storage system for backups, using one of the methods for exporting data from HDFS described in "Hadoop Filesystems" on page 53.
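For example, a nightly synchronization of a high-priority directory tree to a second cluster might be driven by a command along these lines; the hostnames and paths are placeholders, and -update with -delete makes the target mirror the source:

% hadoop distcp -update -delete hdfs://namenode1/user/important \
    hdfs://namenode2/backups/user/important

Between clusters running different major HDFS versions, one common approach is to address the source over the webhdfs:// scheme instead of hdfs://.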
HDFS allows administrators and users to take snapshots of the filesystem. A snapshot is a read-only copy of a filesystem subtree at a given point in time. Snapshots are very efficient since they do not copy data; they simply record each file's metadata and block list, which is sufficient to reconstruct the filesystem contents at the time the snapshot was taken.

Snapshots are not a replacement for data backups, but they are a useful tool for point-in-time data recovery for files that were mistakenly deleted by users. You might have a policy of taking periodic snapshots and keeping them for a specific period of time according to age. For example, you might keep hourly snapshots for the previous day and daily snapshots for the previous month.
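As a brief illustration (the directory and snapshot names are placeholders), an administrator first enables snapshots on a directory, after which snapshots can be created, listed, and deleted:

% hdfs dfsadmin -allowSnapshot /user/alice
% hdfs dfs -createSnapshot /user/alice backup-20150301
% hdfs dfs -ls /user/alice/.snapshot
% hdfs dfs -deleteSnapshot /user/alice backup-20150301

A file deleted by mistake can then be copied back out of the read-only .snapshot directory.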
Filesystem check (fsck)

It is advisable to run HDFS's fsck tool regularly (i.e., daily) on the whole filesystem to proactively look for missing or corrupt blocks. See "Filesystem check (fsck)" on page 326.

Filesystem balancer

Run the balancer tool (see "Balancer" on page 329) regularly to keep the filesystem datanodes evenly balanced.

Commissioning and Decommissioning Nodes

As an administrator of a Hadoop cluster, you will need to add or remove nodes from time to time.