public interface HadoopFsRelationProvider
HadoopFsRelationProvider), a user defined schema, and an optional list of partition columns, this interface is used to pass in the parameters specified by a user.
Users may specify the fully qualified class name of a given data source. When that class is
not found, Spark SQL will append the class name DefaultSource to the path, allowing for
less verbose invocation. For example, 'org.apache.spark.sql.json' would resolve to the
data source 'org.apache.spark.sql.json.DefaultSource'.
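The lookup described above can be sketched roughly as follows. This is a minimal illustration of the "try the name as given, then append DefaultSource" rule, not Spark SQL's actual implementation: the class DataSourceNameResolver and its methods candidates and resolve are hypothetical names introduced here, and the choice of class loader is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Spark SQL) illustrating the fallback rule.
public class DataSourceNameResolver {
    // Suffix appended when the user-supplied name is not itself a class.
    public static final String DEFAULT_SUFFIX = ".DefaultSource";

    // Class names to try, in order: the name as given, then name + ".DefaultSource".
    public static List<String> candidates(String provider) {
        List<String> names = new ArrayList<>();
        names.add(provider);
        names.add(provider + DEFAULT_SUFFIX);
        return names;
    }

    // Returns the first candidate that can be loaded, or throws if none can.
    public static Class<?> resolve(String provider) throws ClassNotFoundException {
        // Assumption: use the context class loader, falling back to this class's loader.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = DataSourceNameResolver.class.getClassLoader();
        }
        for (String name : candidates(provider)) {
            try {
                // 'false' avoids running static initializers during the lookup.
                return Class.forName(name, false, loader);
            } catch (ClassNotFoundException e) {
                // Not found under this name; fall through to the next candidate.
            }
        }
        throw new ClassNotFoundException("Failed to find data source: " + provider);
    }

    public static void main(String[] args) throws Exception {
        // Shows the two names that would be tried for the example in the text.
        System.out.println(candidates("org.apache.spark.sql.json"));
        // A name that is already a loadable class resolves to itself.
        System.out.println(resolve("java.lang.String").getName());
    }
}
```

With this rule, a user can pass the package name 'org.apache.spark.sql.json' instead of the full class name, provided the package contains a class named DefaultSource.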
A new instance of this class will be instantiated each time a DDL call is made.
The difference between a RelationProvider and a HadoopFsRelationProvider is
that users need to provide a schema and a (possibly empty) list of partition columns when
using a HadoopFsRelationProvider. A relation provider can inherit from both RelationProvider
and HadoopFsRelationProvider if it can support schema inference, user-specified
schemas, and accessing partitioned relations.
Modifier and Type | Method and Description |
---|---|
HadoopFsRelation | createRelation(SQLContext sqlContext, java.lang.String[] paths, scala.Option<StructType> dataSchema, scala.Option<StructType> partitionColumns, scala.collection.immutable.Map<java.lang.String,java.lang.String> parameters) Returns a new base relation with the given parameters, a user defined schema, and a list of partition columns. |
HadoopFsRelation createRelation(SQLContext sqlContext, java.lang.String[] paths, scala.Option<StructType> dataSchema, scala.Option<StructType> partitionColumns, scala.collection.immutable.Map<java.lang.String,java.lang.String> parameters)
dataSchema - Schema of data columns (i.e., columns that are not partition columns).
sqlContext - (undocumented)
paths - (undocumented)
partitionColumns - (undocumented)
parameters - (undocumented)