Introduction

Cloud Dataflow is a fully managed service for transforming and enriching data in stream (real-time) and batch (historical) modes with equal reliability and expressiveness, without complex workarounds or compromises. Its serverless approach to resource provisioning and management gives you access to virtually limitless capacity for your largest data processing workloads, while you pay only for what you use.

Cloud Dataflow unlocks transformational use cases across industries, including:

  • Clickstream, point-of-sale, and segmentation analysis in retail.
  • Fraud detection in financial services.
  • Personalized user experience in gaming.
  • IoT analytics in manufacturing, healthcare, and logistics.

Setup

To set up the OpsRamp Google integration and discover the Google service, go to Google Integration Discovery Profile and select GOOGLE/Dataflow Job.

Metrics

| OpsRamp Metric | Metric Display Name | Unit | Aggregation Type | Description |
|---|---|---|---|---|
| google_dataflow_job_current_num_vcpus | Current number of vCPUs in use | Count | Average | The number of vCPUs currently being used by this Dataflow job. This is the current number of workers times the number of vCPUs per worker. |
| google_dataflow_job_data_watermark_age | Data watermark age | Seconds | Average | The age (time since event timestamp) of the most recent item of data that has been fully processed by the pipeline. |
| google_dataflow_job_elapsed_time | Elapsed time | Seconds | Average | Duration that the current run of this pipeline has been in the Running state so far, in seconds. When a run completes, this stays at the duration of that run until the next run starts. |
| google_dataflow_job_element_count | Element count | Count | Average | Number of elements added to the PCollection so far. |
| google_dataflow_job_error_count | Error count | Count | Average | Number of errors that have occurred so far. |
| google_dataflow_job_estimated_byte_count | Estimated byte count | Bytes | Average | An estimated number of bytes added to the PCollection so far. Dataflow calculates the average encoded size of elements in a PCollection and multiplies it by the number of elements. |
| google_dataflow_job_is_failed | Failed | Count | Average | Whether this job has failed. |
| google_dataflow_job_status | Status | String | Average | Current state of this pipeline (for example: RUNNING, DONE, CANCELLED, FAILED, ...). Not reported while the pipeline is not running. |
| google_dataflow_job_system_lag | System lag | Seconds | Average | The current maximum duration that an item of data has been awaiting processing, in seconds. |
| google_dataflow_job_total_memory_usage_time | Total memory usage time | GB.seconds | Average | The total GB seconds of memory allocated to this Dataflow job. |
| google_dataflow_job_total_pd_usage_time | Total PD usage time | GB.seconds | Average | The total GB seconds for all persistent disk used by all workers associated with this Dataflow job. |
| google_dataflow_job_total_vcpu_time | Total vCPU time | Seconds | Average | The total vCPU seconds used by this Dataflow job. |
| google_dataflow_job_billable_shuffle_data_processed | Job Billable Shuffle Data Processed | Bytes | Average | The billable bytes of shuffle data processed by this Dataflow job. Sampled every 60 seconds. |
| google_dataflow_job_total_shuffle_data_processed | Job Total Shuffle Data Processed | Bytes | Average | The total bytes of shuffle data processed by this Dataflow job. |
| google_dataflow_job_total_streaming_data_processed | Job Total Streaming Data Processed | Bytes | Average | The total bytes of streaming data processed by this Dataflow job. |
| google_dataflow_job_user_counter | Job User Counter | Count | Average | A user-defined counter metric. |
| google_dataflow_job_current_shuffle_slots | Current shuffle slots in use | Count | Average | The current shuffle slots used by this Dataflow job. Sampled every 60 seconds. After sampling, data is not visible for up to 180 seconds. |
| google_dataflow_job_elements_produced_count | Elements Produced | Count | Average | The number of elements produced by each PTransform. Sampled every 60 seconds. After sampling, data is not visible for up to 180 seconds. |
| google_dataflow_job_estimated_bytes_produced_count | Estimated Bytes Produced | Count | Average | The estimated total byte size of elements produced by each PTransform. Sampled every 60 seconds. After sampling, data is not visible for up to 180 seconds. |
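
Several of the descriptions above define a metric as simple arithmetic. The following Python sketch works through three of them (vCPUs in use, estimated byte count, and data watermark age) to make the definitions concrete. The function names and sample values are hypothetical and exist only for illustration. In particular, how Dataflow samples element sizes to obtain the average is not described here, so the list of sampled sizes is an assumption.

```python
from datetime import datetime, timezone


def current_num_vcpus(num_workers: int, vcpus_per_worker: int) -> int:
    """Current number of workers times the number of vCPUs per worker."""
    return num_workers * vcpus_per_worker


def estimated_byte_count(sampled_sizes: list[int], element_count: int) -> float:
    """Average encoded element size multiplied by the number of elements.

    `sampled_sizes` is a hypothetical sample of encoded element sizes in bytes.
    """
    if not sampled_sizes:
        return 0.0
    average_size = sum(sampled_sizes) / len(sampled_sizes)
    return average_size * element_count


def data_watermark_age_seconds(event_time: datetime, now: datetime) -> float:
    """Time elapsed since the event timestamp of the most recent fully processed item."""
    return (now - event_time).total_seconds()


if __name__ == "__main__":
    # 5 workers with 4 vCPUs each
    print(current_num_vcpus(5, 4))

    # Three sampled elements averaging 120 bytes, 1000 elements in total
    print(estimated_byte_count([100, 120, 140], 1000))

    # Most recent fully processed item carries an event timestamp 30 seconds old
    event_time = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    now = datetime(2024, 1, 1, 12, 0, 30, tzinfo=timezone.utc)
    print(data_watermark_age_seconds(event_time, now))
```

Because the estimated byte count is an average multiplied by a count, it is an approximation and can differ from the true number of bytes when element sizes vary widely.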

Event support

  • Supported
  • Configurable in OpsRamp Google Integration Discovery Profile.

External reference