The number of on-disk bytes spilled by this task
Time taken on the executor to deserialize this task
Time the executor spends actually running the task (including fetching shuffle data)
Name of the host the task runs on
If this task reads from a HadoopRDD or from persisted data, metrics on how much data was read are stored here.
Amount of time the JVM spent in garbage collection while executing this task
The number of in-memory bytes spilled by this task
If this task writes data externally (e.g. to a distributed filesystem), metrics on how much data was written are stored here.
Amount of time spent serializing the task result
The number of bytes this task transmitted back to the driver as the TaskResult
If this task writes to shuffle output, metrics on the written shuffle data are collected here.
Storage statuses of any blocks that have been updated as a result of this task.
:: DeveloperApi :: Metrics tracked during the execution of a task.
This class houses metrics for both in-progress and completed tasks. In executors, both the task thread and the heartbeat thread write to the TaskMetrics. The heartbeat thread reads it to send in-progress metrics, and the task thread reads it to send metrics along with the completed task.
So, when adding new fields, keep in mind that the whole object can be serialized at any time and shipped to consumers of the SparkListener interface.
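The constraint above (fields that are updated while a task runs, yet must be safe to serialize at any moment) can be illustrated with a minimal sketch. This is not Spark's implementation: `MetricsSketch`, its two fields, and the `serializeSnapshot` and `deserialize` helpers are hypothetical names chosen for illustration, and plain Java serialization with `synchronized` methods is only one assumed way to keep a snapshot consistent.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a metrics holder; NOT Spark's TaskMetrics class.
final class MetricsSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    // Every field is a plain serializable value, so the whole object
    // can be written out without any special handling.
    private long executorRunTime;
    private long memoryBytesSpilled;

    // Writers take the object's lock so a snapshot never sees a partial update.
    public synchronized void setExecutorRunTime(long millis) {
        executorRunTime = millis;
    }

    public synchronized void incMemoryBytesSpilled(long bytes) {
        memoryBytesSpilled += bytes;
    }

    public synchronized long getExecutorRunTime() {
        return executorRunTime;
    }

    public synchronized long getMemoryBytesSpilled() {
        return memoryBytesSpilled;
    }

    // Serializes the current state while holding the same lock as the writers.
    // A reporting thread could call this at any time during the task.
    public synchronized byte[] serializeSnapshot() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(this);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Rebuilds an independent copy, as a remote consumer would.
    public static MetricsSketch deserialize(byte[] bytes) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (MetricsSketch) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        MetricsSketch metrics = new MetricsSketch();
        metrics.setExecutorRunTime(120L);
        metrics.incMemoryBytesSpilled(4096L);

        // Snapshot taken mid-task; later updates do not affect the copy.
        MetricsSketch copy = deserialize(metrics.serializeSnapshot());
        metrics.setExecutorRunTime(500L);
        System.out.println(copy.getExecutorRunTime());
    }
}
```

In this sketch a new field is safe to add only if its type is itself serializable and its writers go through the same lock; a field holding, say, an open stream or a thread handle would make `serializeSnapshot` fail at an unpredictable point during the task.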