Description

[Using SMA-MIB.mib] This template monitors Voltaire InfiniBand Subnet Management Agent (SMA) system module attributes such as sysModuleState, sysModuleTempValue, sysModuleTempState, sysModuleRate, and sysModulePowerConsumption, as well as the state of a remote action. Tested on MellanoxVoltaire-4036E-Infiniband [SysObjID: 1.3.6.1.4.1.5206.1.24].

Prerequisites

SNMP must be enabled on the end device, the device must support the SMA-MIB OIDs, and SNMP credentials must be attached to the device in the portal.
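
Before assigning the template, it can help to confirm that the device answers SNMP queries. The sketch below only builds a net-snmp `snmpget` command line for the standard sysObjectID OID (1.3.6.1.2.1.1.2.0). The host address and community string are placeholders, SNMP v2c is an assumption, and the helper name `build_snmpget_cmd` is hypothetical.

```python
import shlex

SYS_OBJECT_ID_OID = "1.3.6.1.2.1.1.2.0"  # standard MIB-2 sysObjectID.0


def build_snmpget_cmd(host, community, oid=SYS_OBJECT_ID_OID):
    """Return an argv list for net-snmp's snmpget (SNMP v2c assumed)."""
    # -On asks net-snmp to print OIDs numerically instead of by MIB name.
    return ["snmpget", "-v2c", "-c", community, "-On", host, oid]


# Placeholder host and community string; replace with the device's real values.
cmd = build_snmpget_cmd("192.0.2.10", "public")
print(shlex.join(cmd))
```

Running the printed command from a host that can reach the device should return the device's system object ID. The model this template was tested on reports 1.3.6.1.4.1.5206.1.24.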

How to Apply: This template uses all-instance selection. It does not ask the user to select any instance(s) when it is assigned to a device.

Metric Parameters

Parameter | Description
Frequency | The interval at which the target device/resource is probed and metric data is collected. Frequency is defined in minutes (min).
Warning Threshold | If the metric value satisfies the condition defined along with the Warning Threshold value, a notification is sent to the user.
Critical Threshold | If the metric value satisfies the condition defined along with the Critical Threshold value, a notification is sent to the user.
Alert | The alert value can be set to either Yes or No. If it is Yes, an alert message is sent to the user.

    Metrics

    vol.infiniband.sma.remote.state

    Metric Details

    Applicable for | Device
    SNMP OID | 1.3.6.1.4.1.5206.2.1.0, 1.3.6.1.4.1.5206.2.4.0, 1.3.6.1.4.1.5206.2.6.0, 1.3.6.1.4.1.5206.2.7.0, 1.3.6.1.4.1.5206.2.8.0
    Expression | remoteState
    Description | Queries the state of a remote action. [OIDs: 1.3.6.1.4.1.5206.2.7.0, 1.3.6.1.4.1.5206.2.8.0, 1.3.6.1.4.1.5206.2.9.0]
    Category | SNMP monitors
    Collector Type | Gateway
    Monitor Name | Voltaire InfiniBand SMA Remote Action State
    Unit |

    Possible Inputs

    Metric | Input Value | Range of Values
    Frequency | 5 | 1 – 1440 (mins)
    Filter | |
    Warning Operator | |
    Warning Threshold | |
    Warning Repeat Count | |
    Critical Operator | NOT_EQUAL | Ends with, ==, !=, >=, <=, >, <, In Range, Out of range, Equals, Not equals, Equals Ignore Case, Not Equals Ignore Case, Contains, Not contains, Regex match, Regex no match, In string list, Not in string list, In List, Not in list, Starts with
    Critical Threshold | 1 | [{"1":"success"},{"2":"ftpExecutionFailed"},{"3":"linuxCmdFailed"},{"4":"invalidRemotePath"},{"5":"invalidFileName"},{"6":"localRepositoryFull"},{"7":"localFileDoesNotExist"},{"8":"invalidFileContents"},{"9":"successToOverwriteFile"},{"10":"unknownError"},{"11":"errNoRemotePathInput"},{"12":"errNoFileNameInput"},{"13":"errcorruptedRepostoryFile"},{"14":"errPlatforTypeNotSupported"},{"15":"errSmbNotInActiveMode"},{"16":"errFailtoSyncSmb"},{"17":"errFailtoUpgradeSystem"},{"18":"errFailtoExportLogs"},{"19":"ftpUpgradeSoftwareInProgress"}]
    Critical Repeat Count | 1 | 1-12
    Alert | Yes | Yes/No
    Graph (Yes/No) | Yes | Yes/No
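
The value map in the Critical Threshold row is a JSON list of single-entry objects, and the default rule (NOT_EQUAL, threshold 1) treats anything other than `success` as critical. The following is a minimal sketch of decoding that map and applying the rule. The map is abbreviated here, and the function names `load_state_map` and `is_critical_not_equal` are hypothetical, not part of the product.

```python
import json

# Abbreviated copy of the value map from the Critical Threshold row.
REMOTE_STATE_MAP_JSON = (
    '[{"1":"success"},{"2":"ftpExecutionFailed"},'
    '{"3":"linuxCmdFailed"},{"19":"ftpUpgradeSoftwareInProgress"}]'
)


def load_state_map(raw):
    """Flatten a list of single-entry objects into {int_code: label}."""
    state_map = {}
    for entry in json.loads(raw):
        for code, label in entry.items():
            state_map[int(code)] = label
    return state_map


def is_critical_not_equal(value, threshold):
    """NOT_EQUAL operator: critical when the polled value differs from the threshold."""
    return value != threshold


states = load_state_map(REMOTE_STATE_MAP_JSON)
for polled in (1, 3):
    label = states.get(polled, "unmapped")
    status = "critical" if is_critical_not_equal(polled, 1) else "ok"
    print(polled, label, status)
# prints:
# 1 success ok
# 3 linuxCmdFailed critical
```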

    Sample Output

    No graph

    vol.infiniband.sma.sys.module.state

    Metric Details

    Applicable for | Device
    SNMP OID | 1.3.6.1.4.1.5206.3.29.1.6, 1.3.6.1.4.1.5206.3.29.1.1, 1.3.6.1.4.1.5206.3.29.1.2
    Expression | sysModuleState
    Description | State of the module. Possible values "1=>notPresent, 2=>ok, 3=>fault, 4=>dcFault, 5=>acFault, 6=>unknown, 7=>io-fault". [OID: 1.3.6.1.4.1.5206.3.29.1.6]
    Category | SNMP monitors
    Collector Type | Gateway
    Monitor Name | Voltaire InfiniBand SMA System Module Monitors
    Unit |

    Possible Inputs

    Metric | Input Value | Range of Values
    Frequency | 5 | 1 – 1440 (mins)
    Filter | NULL | Not Applicable
    Warning Operator | |
    Warning Threshold | |
    Warning Repeat Count | |
    Critical Operator | NOT_IN_LIST | Ends with, ==, !=, >=, <=, >, <, In Range, Out of range, Equals, Not equals, Equals Ignore Case, Not Equals Ignore Case, Contains, Not contains, Regex match, Regex no match, In string list, Not in string list, In List, Not in list, Starts with
    Critical Threshold | 1,2 | [{"1":"notPresent"},{"2":"ok"},{"3":"fault"},{"4":"dcFault"},{"5":"acFault"},{"6":"unknown"},{"7":"io-fault"}]
    Critical Repeat Count | 1 | 1-12
    Alert | Yes | Yes/No
    Graph (Yes/No) | Yes | Yes/No
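
Because the template applies to all instances, the sysModuleState column is evaluated once per module row. The sketch below illustrates the default NOT_IN_LIST rule (threshold 1,2) across several rows. It assumes the common SNMP table convention in which an instance OID is the column OID followed by the row index, and it assumes a single-integer index. The sample rows and the helper names are made up for illustration.

```python
SYS_MODULE_STATE_COLUMN = "1.3.6.1.4.1.5206.3.29.1.6"  # sysModuleState column OID
ALLOWED_STATES = {1, 2}  # Critical Threshold "1,2": notPresent and ok


def instance_oid(column_oid, row_index):
    """Build an instance OID as column OID + row index (assumed single-integer index)."""
    return f"{column_oid}.{row_index}"


def critical_instances(polled):
    """NOT_IN_LIST: return the instance OIDs whose state is outside ALLOWED_STATES."""
    return [oid for oid, state in polled.items() if state not in ALLOWED_STATES]


# Hypothetical poll result: three module rows with made-up states.
sample = {
    instance_oid(SYS_MODULE_STATE_COLUMN, 1): 2,  # ok
    instance_oid(SYS_MODULE_STATE_COLUMN, 2): 3,  # fault
    instance_oid(SYS_MODULE_STATE_COLUMN, 3): 1,  # notPresent
}
print(critical_instances(sample))
# prints: ['1.3.6.1.4.1.5206.3.29.1.6.2']
```

With these defaults, a module reported as notPresent does not raise a critical alert, while fault, dcFault, acFault, unknown, and io-fault all do.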

    Sample Output

    No graph

    vol.infiniband.sma.sys.module.temp.value

    Metric Details

    Applicable for | Device
    SNMP OID | 1.3.6.1.4.1.5206.3.29.1.7
    Expression | NULL
    Description | A module holds multiple heat sensors. This metric holds the maximum temperature measured across the module. [OID: 1.3.6.1.4.1.5206.3.29.1.7]
    Category | SNMP monitors
    Collector Type | Gateway
    Monitor Name | Voltaire InfiniBand SMA System Module Monitors
    Unit | C

    Possible Inputs

    Metric | Input Value | Range of Values
    Frequency | 5 | 1 – 1440 (mins)
    Filter | NULL | Not Applicable
    Warning Operator | |
    Warning Threshold | |
    Warning Repeat Count | |
    Critical Operator | |
    Critical Threshold | |
    Critical Repeat Count | |
    Alert | No | Yes/No
    Graph (Yes/No) | Yes | Yes/No

    Sample Output

    No graph

    vol.infiniband.sma.sys.module.temp.state

    Metric Details

    Applicable for | Device
    SNMP OID | 1.3.6.1.4.1.5206.3.29.1.8
    Expression | NULL
    Description | State of module according to temperature. Possible values are "1: alarm, 2: warning, 3: normal, 4: sensorFault, 5: notAvalible". [OID: 1.3.6.1.4.1.5206.3.29.1.8]
    Category | SNMP monitors
    Collector Type | Gateway
    Monitor Name | Voltaire InfiniBand SMA System Module Monitors
    Unit |

    Possible Inputs

    Metric | Input Value | Range of Values
    Frequency | 5 | 1 – 1440 (mins)
    Filter | NULL | Not Applicable
    Warning Operator | EQUAL | Ends with, ==, !=, >=, <=, >, <, In Range, Out of range, Equals, Not equals, Equals Ignore Case, Not Equals Ignore Case, Contains, Not contains, Regex match, Regex no match, In string list, Not in string list, In List, Not in list, Starts with
    Warning Threshold | 2 | [{"1":"alarm"},{"2":"warning"},{"3":"normal"},{"4":"sensorFault"},{"5":"notAvalible"}]
    Warning Repeat Count | 1 | 1-12
    Critical Operator | IN_LIST | Ends with, ==, !=, >=, <=, >, <, In Range, Out of range, Equals, Not equals, Equals Ignore Case, Not Equals Ignore Case, Contains, Not contains, Regex match, Regex no match, In string list, Not in string list, In List, Not in list, Starts with
    Critical Threshold | 1,4 | [{"1":"alarm"},{"2":"warning"},{"3":"normal"},{"4":"sensorFault"},{"5":"notAvalible"}]
    Critical Repeat Count | 1 | 1-12
    Alert | Yes | Yes/No
    Graph (Yes/No) | Yes | Yes/No
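
This metric is the only one in the template with both a warning and a critical rule by default. The following sketch shows how the two defaults (EQUAL 2 for warning, IN_LIST 1,4 for critical) could map each polled state to a severity. Checking the critical rule first is an assumption about precedence, and `classify_temp_state` is a hypothetical name.

```python
# State labels as listed in the metric description (spelling kept from the MIB).
TEMP_STATES = {1: "alarm", 2: "warning", 3: "normal", 4: "sensorFault", 5: "notAvalible"}
WARNING_VALUE = 2         # Warning Operator EQUAL, threshold 2
CRITICAL_VALUES = {1, 4}  # Critical Operator IN_LIST, threshold 1,4


def classify_temp_state(value):
    """Return 'critical', 'warning', or 'ok' for a polled sysModuleTempState value."""
    if value in CRITICAL_VALUES:  # critical is checked first (assumed precedence)
        return "critical"
    if value == WARNING_VALUE:
        return "warning"
    return "ok"


for value in sorted(TEMP_STATES):
    print(value, TEMP_STATES[value], classify_temp_state(value))
# prints:
# 1 alarm critical
# 2 warning warning
# 3 normal ok
# 4 sensorFault critical
# 5 notAvalible ok
```

Note that with these defaults state 5 (notAvalible) raises neither a warning nor a critical alert.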

    Sample Output

    No graph

    vol.infiniband.sma.sys.module.power.usage

    Metric Details

    Applicable for | Device
    SNMP OID | 1.3.6.1.4.1.5206.3.29.1.12
    Expression | NULL
    Description | The system DC power consumption in watts. [OID: 1.3.6.1.4.1.5206.3.29.1.12]
    Category | SNMP monitors
    Collector Type | Gateway
    Monitor Name | Voltaire InfiniBand SMA System Module Monitors
    Unit | W

    Possible Inputs

    Metric | Input Value | Range of Values
    Frequency | 5 | 1 – 1440 (mins)
    Filter | NULL | Not Applicable
    Warning Operator | |
    Warning Threshold | |
    Warning Repeat Count | |
    Critical Operator | |
    Critical Threshold | |
    Critical Repeat Count | |
    Alert | No | Yes/No
    Graph (Yes/No) | Yes | Yes/No

    Sample Output

    No graph

    vol.infiniband.sma.sys.module.fan.rate

    Metric Details

    Applicable for | Device
    SNMP OID | 1.3.6.1.4.1.5206.3.29.1.11
    Expression | NULL
    Description | The fan rate of the module. [OID: 1.3.6.1.4.1.5206.3.29.1.11]
    Category | SNMP monitors
    Collector Type | Gateway
    Monitor Name | Voltaire InfiniBand SMA System Module Monitors
    Unit |

    Possible Inputs

    Metric | Input Value | Range of Values
    Frequency | 5 | 1 – 1440 (mins)
    Filter | NULL | Not Applicable
    Warning Operator | |
    Warning Threshold | |
    Warning Repeat Count | |
    Critical Operator | |
    Critical Threshold | |
    Critical Repeat Count | |
    Alert | No | Yes/No
    Graph (Yes/No) | Yes | Yes/No

    Sample Output

    No graph