Details for Data Flow
Logging details for Data Flow Spark diagnostic logs.
Resources
- applications
Log Categories
API value (ID) | Console (Display Name) | Description |
---|---|---|
all | Diagnostic | Includes all logs generated by the Apache Spark framework (driver and executors). |
Availability
Data Flow logging is available in all commercial realm regions.
Comments
Spark diagnostic logs can be enabled only at the Data Flow application level, and can't be overridden.
If you enable logging for a Data Flow application, Spark diagnostic logs are streamed for any new Data Flow run submission. Already accepted or in-progress runs aren't updated.
Contents of a Data Flow Spark Diagnostic Log
Property | Description |
---|---|
specversion | Oracle Cloud Infrastructure Logging schema version of the log. |
type | Log category, following the com.oraclecloud.{service}.{resource-type}.{category} convention. One of: com.oraclecloud.dataflow.run.driver, com.oraclecloud.dataflow.run.executor |
source | Name of the resource that generated the message. |
subject | A specific subresource that generated the event. |
id | An identifier for this ingested log batch, unique within the source. |
time | Time the log output was generated, in RFC 3339 timestamp format. |
oracle.logid | OCID of the Oracle Cloud Infrastructure Logging log object. |
oracle.loggroupid | OCID of the Oracle Cloud Infrastructure Logging log group. |
oracle.compartmentid | OCID of the compartment the Oracle Cloud Infrastructure Logging log group is in. |
oracle.tenantid | OCID of the tenant. |
oracle.ingestedtime | Time the log line was ingested by Oracle Cloud Infrastructure Logging, in RFC 3339 timestamp format. |
data[i].id | Unique identifier for this log event. |
data[i].time | Time this specific log entry was generated, in RFC 3339 timestamp format. |
data[i].data | Non-empty data representing a log event. |
data.data[i].level | The log level of the event. |
data.data[i].message | A message describing the event details. |
data.data[i].opcRequestId | A unique Oracle-assigned request ID generated when the Data Flow run was submitted and included in the createRun response. |
data.data[i].runId | The OCID of the Data Flow run whose resource (a Spark driver or executor) generated this message. |
data.data[i].thread | The name of the thread that generated the logging event. |
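The nested structure above (a batch envelope, a list of events, and per-event data) can be walked with a short parser. The following is a minimal sketch using the standard library; the sample payload is illustrative, with field names taken from the table, not a real log:

```python
import json

# Illustrative Data Flow Spark diagnostic log batch; field names follow
# the table above, values are placeholders.
batch = json.loads("""
{
  "specversion": "1.0",
  "type": "com.oraclecloud.dataflow.run.driver",
  "source": "Sample CSV Processing App",
  "subject": "spark-driver",
  "id": "batch-1",
  "time": "2023-06-23T20:20:02.245Z",
  "data": [
    {
      "id": "event-1",
      "time": "2023-06-23T20:20:02.245Z",
      "data": {
        "level": "INFO",
        "message": "Execution complete.",
        "opcRequestId": "req-1",
        "runId": "ocid1.dataflowrun.oc1..example",
        "thread": "main"
      }
    }
  ]
}
""")

# The last segment of the type identifies whether the driver or an
# executor produced the batch ("driver" or "executor").
category = batch["type"].rsplit(".", 1)[-1]

for event in batch["data"]:
    entry = event["data"]
    print(f'{event["time"]} [{entry["level"]}] {category}: {entry["message"]}')
```

A real consumer would receive these batches from an Oracle Cloud Infrastructure Logging search or a Service Connector rather than a string literal.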
Example Data Flow Spark Diagnostic Log
```json
{
  "datetime": 1687551602245,
  "logContent": {
    "data": {
      "logLevel": "INFO",
      "message": "Execution complete.",
      "opcRequestId": "<unique_ID>",
      "runId": "ocid1.dataflowrun.oc1.ca-toronto-1.<unique_ID>",
      "thread": "shaded.dataflow.oracle.dfcs.spark.wrapper.DataflowWrapper"
    },
    "id": "<unique_ID>",
    "oracle": {
      "compartmentid": "ocid1.tenancy.oc1..<unique_ID>",
      "ingestedtime": "2023-06-23T20:20:06.974Z",
      "loggroupid": "ocid1.loggroup.oc1.ca-toronto-1.<unique_ID>",
      "logid": "ocid1.log.oc1.ca-toronto-1.<unique_ID>",
      "tenantid": "ocid1.tenancy.oc1..<unique_ID>"
    },
    "source": "Sample CSV Processing App",
    "specversion": "1.0",
    "subject": "spark-driver",
    "time": "2023-06-23T20:20:02.245Z",
    "type": "com.oraclecloud.dataflow.run.driver"
  },
  "regionId": "ca-toronto-1"
}
```
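A consumer reading records of this shape (for example, from a log search result) can separate driver messages from executor messages by the type suffix. The sketch below reuses the example record, with its placeholder `<unique_ID>` values kept verbatim:

```python
import json

# Filter driver-side messages from a streamed Data Flow diagnostic log
# record; the shape matches the example record above.
record = json.loads("""
{
  "datetime": 1687551602245,
  "logContent": {
    "data": {
      "logLevel": "INFO",
      "message": "Execution complete.",
      "opcRequestId": "<unique_ID>",
      "runId": "ocid1.dataflowrun.oc1.ca-toronto-1.<unique_ID>",
      "thread": "shaded.dataflow.oracle.dfcs.spark.wrapper.DataflowWrapper"
    },
    "source": "Sample CSV Processing App",
    "subject": "spark-driver",
    "time": "2023-06-23T20:20:02.245Z",
    "type": "com.oraclecloud.dataflow.run.driver"
  },
  "regionId": "ca-toronto-1"
}
""")

content = record["logContent"]
if content["type"].endswith(".driver"):
    # runId links the message back to the Data Flow run that produced it.
    run_id = content["data"]["runId"]
    print(f'{content["time"]} run={run_id}: {content["data"]["message"]}')
```

Note that in this streamed form the level appears under the `logLevel` key inside `logContent.data`.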
Using the CLI
See Enable Oracle Cloud Infrastructure Logging Spark Diagnostic Logs for an example command to enable Data Flow Spark diagnostic logging.
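Spark diagnostic logs are enabled by creating a service log in Oracle Cloud Infrastructure Logging whose source is the Data Flow application. As a rough sketch only, the source configuration passed to the CLI or SDK looks something like the following; the exact field names follow the generic service-log schema and are assumptions here, and the application OCID is a placeholder:

```python
import json

# Hypothetical service-log source configuration for enabling Spark
# diagnostic logs on a Data Flow application.  Field names are
# assumptions based on the generic OCI service-log schema; consult the
# linked CLI example for the authoritative command.
configuration = {
    "source": {
        "sourceType": "OCISERVICE",  # a service-emitted log
        "service": "dataflow",
        "resource": "ocid1.dataflowapplication.oc1..<unique_ID>",
        "category": "all",           # the "Diagnostic" category above
    }
}

print(json.dumps(configuration, indent=2))
```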