Introduction

Amazon Elasticsearch Service is a fully managed service that is easy to deploy, easy to secure, and cost-effective at scale.

Features include:

  • Support for the tools that build, monitor, and troubleshoot your applications at the scale that you need.
  • Support for open source Elasticsearch APIs, managed Kibana, integration with Logstash and other AWS services, and built-in alerting and SQL querying.
  • Pay only for what you use, with no upfront costs or usage requirements. For example, you can get the ELK stack that you need, without the operational overhead.

Setup

To set up the OpsRamp AWS integration and discover the AWS service, go to AWS Integration Discovery Profile and select Elastic Search Service.

Metrics

OpsRamp Metric | Metric Display Name | Unit | Aggregation Type | Description
aws_es_Nodes | Nodes | Count | Average | Number of nodes in the Amazon ES cluster.
aws_es_SearchableDocuments | SearchableDocuments | Count | Average | Total number of searchable documents across all indices in the cluster.
aws_es_DeletedDocuments | DeletedDocuments | Count | Average | Total number of deleted documents across all indices in the cluster.
aws_es_CPUUtilization | CPUUtilization.es | Percent | Average | Maximum percentage of CPU resources used for data nodes in the cluster.
aws_es_FreeStorageSpace | FreeStorageSpace.es | Megabytes | Minimum | Free space, in megabytes, for all data nodes in the cluster.
aws_es_ClusterUsedSpace | ClusterUsedSpace | Megabytes | Minimum | Total used space, in megabytes, for a cluster.
aws_es_ClusterIndexWritesBlocked | ClusterIndexWritesBlocked | Count | Maximum | Indicates whether the cluster is accepting or blocking incoming write requests.
aws_es_JVMMemoryPressure | JVMMemoryPressure | Percent | Maximum | Maximum percentage of the Java heap used for all data nodes in the cluster.
aws_es_AutomatedSnapshotFailure | AutomatedSnapshotFailure | Count | Maximum | Number of failed automated snapshots for the cluster.
aws_es_CPUCreditBalance | CPUCreditBalance.es | Count | Minimum | Remaining CPU credits available for data nodes in the cluster.
aws_es_KibanaHealthyNodes | KibanaHealthyNodes | Count | Minimum | Health check for Kibana.
aws_es_MasterCPUUtilization | MasterCPUUtilization | Percent | Average | Maximum percentage of CPU resources used by the dedicated master nodes.
aws_es_MasterJVMMemoryPressure | MasterJVMMemoryPressure | Percent | Maximum | Maximum percentage of the Java heap used for all dedicated master nodes in the cluster.
aws_es_MasterCPUCreditBalance | MasterCPUCreditBalance | Count | Minimum | Remaining CPU credits available for dedicated master nodes in the cluster.
aws_es_MasterReachableFromNode | MasterReachableFromNode | Count | Minimum | Health check for MasterNotDiscovered exceptions. Value of 1 indicates normal behavior.
aws_es_ClusterStatus_green_es | ClusterStatus.green.es | Count | Maximum | Indicates that all index shards are allocated to nodes in the cluster.
aws_es_ClusterStatus_yellow_es | ClusterStatus.yellow.es | Count | Maximum | Indicates that the primary shards for all indices are allocated to nodes in a cluster, but the replica shards for at least one index are not.
aws_es_ClusterStatus_red_es | ClusterStatus.red.es | Count | Maximum | Indicates that the primary and replica shards of at least one index are not allocated to nodes in a cluster.
aws_es_2xx | 2xx | Count | Sum | Number of requests to the domain that resulted in an HTTP 2xx response code.
aws_es_3xx | 3xx | Count | Sum | Number of requests to the domain that resulted in an HTTP 3xx response code.
aws_es_4xx | 4xx | Count | Sum | Number of requests to the domain that resulted in an HTTP 4xx response code.
aws_es_5xx | 5xx | Count | Sum | Number of requests to the domain that resulted in an HTTP 5xx response code.
aws_es_AlertingDegraded | AlertingDegraded | Count | Maximum | Value of 1 means that either the alerting index is red or one or more nodes are not on schedule. Value of 0 indicates normal behavior.
aws_es_AlertingIndexExists | AlertingIndexExists | Count | Maximum | Value of 1 means the .opendistro-alerting-config index exists. Value of 0 means it does not. Until you use the alerting feature for the first time, this value remains 0.
aws_es_AlertingIndexStatus_green | AlertingIndexStatus.green | Count | Maximum | Health of the index. Value of 1 means green. Value of 0 means that the index either does not exist or is not green.
aws_es_AlertingIndexStatus_red | AlertingIndexStatus.red | Count | Maximum | Health of the index. Value of 1 means red. Value of 0 means that the index either does not exist or is not red.
aws_es_AlertingIndexStatus_yellow | AlertingIndexStatus.yellow | Count | Maximum | Health of the index. Value of 1 means yellow. Value of 0 means that the index either does not exist or is not yellow.
aws_es_AlertingNodesNotOnSchedule | AlertingNodesNotOnSchedule | Count | Maximum | Value of 1 means some jobs are not running on schedule. Value of 0 means that all alerting jobs are running on schedule (or that no alerting jobs exist). Check the Amazon ES console or make a _nodes/stats request to see if any nodes show high resource usage.
aws_es_AlertingNodesOnSchedule | AlertingNodesOnSchedule | Count | Maximum | Value of 1 means that all alerting jobs are running on schedule (or that no alerting jobs exist). Value of 0 means some jobs are not running on schedule.
aws_es_SQLUnhealthy | SQLUnhealthy | Count | Maximum | Value of 1 indicates that, in response to certain requests, the SQL plugin is returning 5xx response codes or passing invalid query DSL to Elasticsearch. Other requests should continue to succeed. Value of 0 indicates no recent failures. If a sustained value of 1 is displayed, troubleshoot the requests that clients are making to the plugin.
aws_es_SQLRequestCount | SQLRequestCount | Count | Sum | Number of requests to the Open Distro SQL API.
aws_es_AlertingScheduledJobEnabled | AlertingScheduledJobEnabled | Count | Maximum | Value of 1 means that the opendistro.scheduled_jobs.enabled cluster setting is true. Value of 0 means it is false and scheduled jobs are disabled.
aws_es_SQLFailedRequestCountBySysErr | SQLFailedRequestCountBySysErr | Count | Sum | Number of requests to the Open Distro SQL API that failed due to a server problem or feature limitation. For example, a request might return HTTP status code 503 due to a VerificationException.
aws_es_SQLFailedRequestCountByCusErr | SQLFailedRequestCountByCusErr | Count | Sum | Number of requests to the Open Distro SQL API that failed due to a client issue. For example, a request might return HTTP status code 400 due to an IndexNotFoundException.
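
The three ClusterStatus metrics above are 0/1 flags, so a consumer usually has to combine them into a single state. The following is a minimal sketch of one way to do that, not part of OpsRamp or AWS tooling. The function name cluster_status, the samples mapping (latest value per metric, however you obtain it), the red-before-yellow-before-green check order, and the "unknown" fallback are all assumptions made for illustration.

```python
from typing import Mapping

# OpsRamp metric names as listed in the table above, ordered from most to
# least severe. The ordering itself is an assumption made for this sketch.
STATUS_METRICS = {
    "red": "aws_es_ClusterStatus_red_es",
    "yellow": "aws_es_ClusterStatus_yellow_es",
    "green": "aws_es_ClusterStatus_green_es",
}


def cluster_status(samples: Mapping[str, float]) -> str:
    """Collapse the three 0/1 status metrics into a single status string.

    `samples` is a hypothetical mapping of metric name to its latest value.
    Returns "red", "yellow", "green", or "unknown" if no flag is set.
    """
    # dicts preserve insertion order, so the most severe state is checked first.
    for status, metric in STATUS_METRICS.items():
        if samples.get(metric, 0) >= 1:
            return status
    return "unknown"
```

For example, cluster_status({"aws_es_ClusterStatus_yellow_es": 1, "aws_es_ClusterStatus_green_es": 0}) returns "yellow", and an empty mapping returns "unknown".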

Event support

CloudTrail event support

  • Supported
  • Configurable in OpsRamp AWS Integration Discovery Profile.

CloudWatch alarm support

  • Not Supported

External reference