pyspark.sql.DataFrame.hint
- DataFrame.hint(name, *parameters)
Specifies some hint on the current DataFrame.
New in version 2.2.0.
Changed in version 3.4.0: Supports Spark Connect.
- Parameters
- name : str
A name of the hint.
- parameters : str, list, float or int
Optional parameters.
- Returns
DataFrame
Hinted DataFrame
Examples
>>> from pyspark.sql import Row
>>> df = spark.createDataFrame([(2, "Alice"), (5, "Bob")], schema=["age", "name"])
>>> df2 = spark.createDataFrame([Row(height=80, name="Tom"), Row(height=85, name="Bob")])
>>> df.join(df2, "name").explain()
== Physical Plan ==
...
... +- SortMergeJoin ...
...
Explicitly trigger the broadcast hash join by providing the hint on df2.
>>> df.join(df2.hint("broadcast"), "name").explain()
== Physical Plan ==
...
... +- BroadcastHashJoin ...
...
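Because the hint name is passed as a plain string, a misspelled name is easy to miss; Spark generally ignores a hint it does not recognize rather than failing the query (an assumption worth checking against your Spark version). The sketch below shows one possible guard: it checks the name against the join-strategy hints listed in the Spark SQL documentation for Spark 3.0 and later before calling DataFrame.hint. The names with_join_hint and JOIN_STRATEGY_HINTS are hypothetical helpers introduced for illustration and are not part of PySpark.

```python
# Join-strategy hint names listed in the Spark SQL docs (Spark 3.0+).
# Spark treats hint names case-insensitively, so they are stored lowercased.
JOIN_STRATEGY_HINTS = frozenset({
    "broadcast", "broadcastjoin", "mapjoin",   # broadcast hash join
    "merge", "shuffle_merge", "mergejoin",     # sort-merge join
    "shuffle_hash",                            # shuffle hash join
    "shuffle_replicate_nl",                    # shuffle-and-replicate nested loop join
})


def with_join_hint(df, strategy):
    """Return ``df.hint(strategy)`` after validating the hint name.

    Hypothetical helper, not part of PySpark. ``df`` is any object that
    exposes a ``hint(name, *parameters)`` method, such as a DataFrame.
    """
    normalized = strategy.strip().lower()
    if normalized not in JOIN_STRATEGY_HINTS:
        raise ValueError(f"Unknown join strategy hint: {strategy!r}")
    return df.hint(normalized)
```

With the DataFrames from the examples above, df.join(with_join_hint(df2, "broadcast"), "name") builds the same hinted join as calling df2.hint("broadcast") directly, while a typo such as "brodcast" raises ValueError instead of passing through unnoticed.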