Introduction
Cloud Dataflow is a fully managed service for transforming and enriching data in stream (real-time) and batch (historical) modes with equal reliability and expressiveness, with no complex workarounds or compromises needed. With its serverless approach to resource provisioning and management, you have access to virtually limitless capacity to solve your biggest data processing challenges, while paying only for what you use.
Cloud Dataflow unlocks transformational use cases across industries, including:
- Clickstream, point-of-sale, and segmentation analysis in retail.
- Fraud detection in financial services.
- Personalized user experience in gaming.
- IoT analytics in manufacturing, healthcare, and logistics.
Setup
To set up the OpsRamp Google integration and discover the Google service, go to Google Integration Discovery Profile and select GOOGLE/Dataflow Job.
Metrics
OpsRamp Metric | Metric Display Name | Unit | Aggregation Type | Description |
---|---|---|---|---|
google_dataflow_job_current_num_vcpus | Current number of vCPUs in use | Count | Average | The number of vCPUs currently being used by this Dataflow job. This is the current number of workers times the number of vCPUs per worker. |
google_dataflow_job_data_watermark_age | Data watermark age | Seconds | Average | The age (time since event timestamp) of the most recent item of data that has been fully processed by the pipeline. |
google_dataflow_job_elapsed_time | Elapsed time | Seconds | Average | Duration that the current run of this pipeline has been in the Running state so far, in seconds. When a run completes, this stays at the duration of that run until the next run starts. |
google_dataflow_job_element_count | Element count | Count | Average | Number of elements added to the PCollection so far. |
google_dataflow_job_error_count | Error count | Count | Average | Number of errors that happened so far. |
google_dataflow_job_estimated_byte_count | Estimated byte count | Bytes | Average | An estimated number of bytes added to the PCollection so far. Dataflow calculates the average encoded size of elements in a PCollection and multiplies it by the number of elements. |
google_dataflow_job_is_failed | Failed | Count | Average | Whether this job has failed. |
google_dataflow_job_status | Status | String | Average | Current state of this pipeline (for example: RUNNING, DONE, CANCELLED, FAILED, ...). Not reported while the pipeline is not running. |
google_dataflow_job_system_lag | System lag | Seconds | Average | The current maximum duration that an item of data has been awaiting processing, in seconds. |
google_dataflow_job_total_memory_usage_time | Total memory usage time | GB.seconds | Average | The total GB seconds of memory allocated to this Dataflow job. |
google_dataflow_job_total_pd_usage_time | Total PD usage time | GB.seconds | Average | The total GB seconds for all persistent disk used by all workers associated with this Dataflow job. |
google_dataflow_job_total_vcpu_time | Total vCPU time | Seconds | Average | The total vCPU seconds used by this Dataflow job. |
google_dataflow_job_billable_shuffle_data_processed | Job Billable Shuffle Data Processed | Bytes | Average | The billable bytes of shuffle data processed by this Dataflow job. Sampled every 60 seconds. |
google_dataflow_job_total_shuffle_data_processed | Job Total Shuffle Data Processed | Bytes | Average | The total bytes of shuffle data processed by this Dataflow job. |
google_dataflow_job_total_streaming_data_processed | Job Total Streaming Data Processed | Bytes | Average | The total bytes of streaming data processed by this Dataflow job. |
google_dataflow_job_user_counter | Job User Counter | Count | Average | A user-defined counter metric. |
google_dataflow_job_current_shuffle_slots | Current shuffle slots in use | Count | Average | The current shuffle slots used by this Dataflow job. Sampled every 60 seconds. After sampling, data is not visible for up to 180 seconds. |
google_dataflow_job_elements_produced_count | Elements Produced | Count | Average | The number of elements produced by each PTransform. Sampled every 60 seconds. After sampling, data is not visible for up to 180 seconds. |
google_dataflow_job_estimated_bytes_produced_count | Estimated Bytes Produced | Count | Average | The estimated total byte size of elements produced by each PTransform. Sampled every 60 seconds. After sampling, data is not visible for up to 180 seconds. |
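
Two of the descriptions above define a metric as a product of other quantities, and several state a sampling interval and visibility delay. The following Python sketch works through that arithmetic. It is an illustration only: the `JobSnapshot` class, its field names, and the helper functions are hypothetical and are not part of OpsRamp or Dataflow; the input values are made up.

```python
from dataclasses import dataclass


@dataclass
class JobSnapshot:
    """Hypothetical point-in-time view of a Dataflow job (illustrative fields)."""
    num_workers: int
    vcpus_per_worker: int
    element_count: int
    avg_encoded_element_bytes: float


def current_num_vcpus(job: JobSnapshot) -> int:
    # Per the table: current number of workers times vCPUs per worker.
    return job.num_workers * job.vcpus_per_worker


def estimated_byte_count(job: JobSnapshot) -> float:
    # Per the table: average encoded element size times number of elements.
    return job.avg_encoded_element_bytes * job.element_count


def latest_visible_time(sample_time_s: float, max_delay_s: float = 180.0) -> float:
    # Worst case: a sample may not be visible until 180 seconds after it is taken.
    return sample_time_s + max_delay_s


# Made-up example values.
snapshot = JobSnapshot(
    num_workers=5,
    vcpus_per_worker=4,
    element_count=1_000,
    avg_encoded_element_bytes=256.0,
)

print(current_num_vcpus(snapshot))     # 20
print(estimated_byte_count(snapshot))  # 256000.0
print(latest_visible_time(60.0))       # 240.0
```

Because the byte count is derived from an average element size, it should be read as an estimate rather than an exact total. For metrics with the 180-second visibility delay, the most recent data points may be missing when a chart or alert is evaluated.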
Event support
- Supported
- Configurable in OpsRamp Google Integration Discovery Profile.