Description

This template monitors DataNode-related metrics. It applies to devices that run the HDFS application.

Prerequisites

  • Java must be installed on the device.
  • The Gateway must be up and running.
  • The device must be reachable from the Gateway.
  • The device must be in a managed state.

Metric Parameters

Frequency
  • The interval at which metric data is probed and collected from the target device or resource.
  • Defined in minutes (min).
Warning Threshold
  • If the metric value satisfies the condition defined with the Warning Threshold value, a notification is sent to the user.
Critical Threshold
  • If the metric value satisfies the condition defined with the Critical Threshold value, a notification is sent to the user.
Alert
  • Can be set to Yes or No. If set to Yes, an alert message is sent to the user.
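
The interplay of operator, threshold, and repeat count can be sketched as follows. This is a minimal illustration and not the monitor's actual logic: the operator names, the function names, and the reading of repeat count as "number of consecutive polls that satisfy the condition" are all assumptions made for the example.

```python
import operator

# Hypothetical operator labels; the template may name them differently.
OPERATORS = {
    "GREATER_THAN": operator.gt,
    "GREATER_THAN_EQUAL": operator.ge,
    "LESS_THAN": operator.lt,
    "LESS_THAN_EQUAL": operator.le,
    "EQUAL": operator.eq,
    "NOT_EQUAL": operator.ne,
}


def breach_streak(samples, op_name, threshold):
    """Count the most recent consecutive samples that satisfy the condition."""
    compare = OPERATORS[op_name]
    streak = 0
    for value in reversed(samples):
        if not compare(value, threshold):
            break
        streak += 1
    return streak


def evaluate(samples, warning, critical):
    """Return "critical", "warning", or "ok" for a list of polled values.

    `warning` and `critical` are (operator, threshold, repeat_count) tuples.
    Assumption: critical is checked first and takes precedence over warning.
    """
    rules = (("critical", critical), ("warning", warning))
    for level, (op_name, threshold, repeat_count) in rules:
        if breach_streak(samples, op_name, threshold) >= repeat_count:
            return level
    return "ok"
```

For example, with polled values of 50 GB, 9 GB, and 8 GB remaining, a warning rule of "less than 10 GB, repeat count 2" fires in this sketch, while a critical rule of "less than 5 GB, repeat count 2" does not.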

Metrics

hdfs.datanode.dfs.remaining

Metric Details

Applicable for: Device
Description: The remaining disk space.
Category: Application
Collector Type: Gateway
Monitor Name: HDFS DataNode Monitor
Unit: Bytes

Possible Inputs

Metric                   Input Value   Range of Values
Frequency                2             1 – 1440 (mins)
Filter
Warning Operator
Warning Threshold
Warning Repeat Count
Critical Operator
Critical Threshold
Critical Repeat Count
Alert                    No            Yes/No
Graph (Yes/No)           Yes           Yes/No

Sample Output

No graph

hdfs.datanode.dfs.capacity

Metric Details

Applicable for: Device
Description: Total configured HDFS storage capacity.
Category: Application
Collector Type: Gateway
Monitor Name: HDFS DataNode Monitor
Unit: Bytes

Possible Inputs

Metric                   Input Value   Range of Values
Frequency                2             1 – 1440 (mins)
Filter
Warning Operator
Warning Threshold
Warning Repeat Count
Critical Operator
Critical Threshold
Critical Repeat Count
Alert                    No            Yes/No
Graph (Yes/No)           Yes           Yes/No

Sample Output

No graph

hdfs.datanode.dfs.used

Metric Details

Applicable for: Device
Description: Total HDFS storage used.
Category: Application
Collector Type: Gateway
Monitor Name: HDFS DataNode Monitor
Unit: Bytes

Possible Inputs

Metric                   Input Value   Range of Values
Frequency                2             1 – 1440 (mins)
Filter
Warning Operator
Warning Threshold
Warning Repeat Count
Critical Operator
Critical Threshold
Critical Repeat Count
Alert                    No            Yes/No
Graph (Yes/No)           Yes           Yes/No

Sample Output

No graph
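
Because hdfs.datanode.dfs.capacity, hdfs.datanode.dfs.used, and hdfs.datanode.dfs.remaining are all reported in bytes, percentage views are often easier to set thresholds against. The sketch below shows one way to derive them. The function name and the returned keys are hypothetical, and treating the leftover space as non-DFS or reserved usage is an assumption about how the DataNode accounts for capacity.

```python
def dfs_usage(capacity_bytes, used_bytes, remaining_bytes):
    """Summarise DataNode storage from the three byte-valued metrics."""
    if capacity_bytes <= 0:
        raise ValueError("capacity must be positive")
    return {
        "used_pct": 100.0 * used_bytes / capacity_bytes,
        "remaining_pct": 100.0 * remaining_bytes / capacity_bytes,
        # Assumption: space that is neither DFS-used nor remaining is
        # taken up by non-DFS data or reserved space.
        "other_bytes": capacity_bytes - used_bytes - remaining_bytes,
    }
```

With a capacity of 1000 bytes, 250 bytes used, and 600 bytes remaining, this gives 25% used, 60% remaining, and 150 bytes unaccounted for by HDFS.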

hdfs.datanode.cache.capacity

Metric Details

Applicable for: Device
Description: The capacity of the HDFS cache on this DataNode.
Category: Application
Collector Type: Gateway
Monitor Name: HDFS DataNode Monitor
Unit: Bytes

Possible Inputs

Metric                   Input Value   Range of Values
Frequency                2             1 – 1440 (mins)
Filter
Warning Operator
Warning Threshold
Warning Repeat Count
Critical Operator
Critical Threshold
Critical Repeat Count
Alert                    No            Yes/No
Graph (Yes/No)           Yes           Yes/No

Sample Output

No graph

hdfs.datanode.cache.used

Metric Details

Applicable for: Device
Description: The total cache used.
Category: Application
Collector Type: Gateway
Monitor Name: HDFS DataNode Monitor
Unit: Bytes

Possible Inputs

Metric                   Input Value   Range of Values
Frequency                2             1 – 1440 (mins)
Filter
Warning Operator
Warning Threshold
Warning Repeat Count
Critical Operator
Critical Threshold
Critical Repeat Count
Alert                    No            Yes/No
Graph (Yes/No)           Yes           Yes/No

Sample Output

No graph

hdfs.datanode.last.volume.failure.date

Metric Details

Applicable for: Device
Description: The date and time of the last volume failure, in milliseconds since the epoch.
Category: Application
Collector Type: Gateway
Monitor Name: HDFS DataNode Monitor
Unit: ms

Possible Inputs

Metric                   Input Value   Range of Values
Frequency                2             1 – 1440 (mins)
Filter
Warning Operator
Warning Threshold
Warning Repeat Count
Critical Operator
Critical Threshold
Critical Repeat Count
Alert                    No            Yes/No
Graph (Yes/No)           Yes           Yes/No

Sample Output

No graph
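
Since hdfs.datanode.last.volume.failure.date is reported as milliseconds since the epoch, the raw value usually needs converting before it is readable. A minimal sketch of that conversion follows. The function name is hypothetical, and treating a value of 0 as "no failure recorded" is an assumption rather than something this template states.

```python
from datetime import datetime, timezone


def last_volume_failure(ms_since_epoch):
    """Convert the metric value to an aware UTC datetime.

    Returns None when no failure appears to have been recorded.
    """
    # Assumption: 0 (or a negative value) means no volume failure so far.
    if ms_since_epoch <= 0:
        return None
    # The metric is in milliseconds; fromtimestamp expects seconds.
    return datetime.fromtimestamp(ms_since_epoch / 1000, tz=timezone.utc)
```

Returning a timezone-aware value avoids ambiguity when the Gateway and the DataNode are in different time zones.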

hdfs.datanode.estimated.capacity.lost.total

Metric Details

Applicable for: Device
Description: The estimated capacity lost, in bytes.
Category: Application
Collector Type: Gateway
Monitor Name: HDFS DataNode Monitor
Unit: Bytes

Possible Inputs

Metric                   Input Value   Range of Values
Frequency                2             1 – 1440 (mins)
Filter
Warning Operator
Warning Threshold
Warning Repeat Count
Critical Operator
Critical Threshold
Critical Repeat Count
Alert                    No            Yes/No
Graph (Yes/No)           Yes           Yes/No

Sample Output

No graph

hdfs.datanode.num.blocks.cached

Metric Details

Applicable for: Device
Description: The number of blocks cached.
Category: Application
Collector Type: Gateway
Monitor Name: HDFS DataNode Monitor
Unit: count

Possible Inputs

Metric                   Input Value   Range of Values
Frequency                2             1 – 1440 (mins)
Filter
Warning Operator
Warning Threshold
Warning Repeat Count
Critical Operator
Critical Threshold
Critical Repeat Count
Alert                    No            Yes/No
Graph (Yes/No)           Yes           Yes/No

Sample Output

No graph

hdfs.datanode.num.blocks.failed.to.cache

Metric Details

Applicable for: Device
Description: The total number of blocks the DataNode failed to cache.
Category: Application
Collector Type: Gateway
Monitor Name: HDFS DataNode Monitor
Unit: count

Possible Inputs

Metric                   Input Value   Range of Values
Frequency                2             1 – 1440 (mins)
Filter
Warning Operator
Warning Threshold
Warning Repeat Count
Critical Operator
Critical Threshold
Critical Repeat Count
Alert                    No            Yes/No
Graph (Yes/No)           Yes           Yes/No

Sample Output

No graph

hdfs.datanode.num.blocks.failed.to.uncache

Metric Details

Applicable for: Device
Description: The total number of blocks the DataNode failed to uncache.
Category: Application
Collector Type: Gateway
Monitor Name: HDFS DataNode Monitor
Unit: count

Possible Inputs

Metric                   Input Value   Range of Values
Frequency                2             1 – 1440 (mins)
Filter
Warning Operator
Warning Threshold
Warning Repeat Count
Critical Operator
Critical Threshold
Critical Repeat Count
Alert                    No            Yes/No
Graph (Yes/No)           Yes           Yes/No

Sample Output

No graph

hdfs.datanode.num.failed.volumes

Metric Details

Applicable for: Device
Description: Total number of failed volumes.
Category: Application
Collector Type: Gateway
Monitor Name: HDFS DataNode Monitor
Unit: count

Possible Inputs

Metric                   Input Value   Range of Values
Frequency                2             1 – 1440 (mins)
Filter
Warning Operator
Warning Threshold
Warning Repeat Count
Critical Operator
Critical Threshold
Critical Repeat Count
Alert                    No            Yes/No
Graph (Yes/No)           Yes           Yes/No

Sample Output

No graph