public final class Bucketizer extends Model<Bucketizer>
Bucketizer maps a column of continuous features to a column of feature buckets.

Constructor and Description |
---|
Bucketizer() |
Bucketizer(String uid) |

Modifier and Type | Method and Description |
---|---|
static double | binarySearchForBuckets(double[] splits, double feature) - Performs a binary search over the buckets to place each data point. |
static boolean | checkSplits(double[] splits) - Requires splits to be of length >= 3 and in strictly increasing order. |
double[] | getSplits() |
Bucketizer | setInputCol(String value) |
Bucketizer | setOutputCol(String value) |
Bucketizer | setSplits(double[] value) |
DoubleArrayParam | splits() - Parameter for mapping continuous features into buckets. |
DataFrame | transform(DataFrame dataset) - Transforms the input dataset. |
StructType | transformSchema(StructType schema) - :: DeveloperApi :: Derives the output schema from the input schema. |
String | uid() |

Methods inherited from class Transformer: transform, transform, transform
Methods inherited from class Object: equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Params: clear, copyValues, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, paramMap, params, set, set, set, setDefault, setDefault, setDefault, shouldOwn, validateParams
Methods inherited from interface Logging: initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning
public static boolean checkSplits(double[] splits)
Requires splits to be of length >= 3 and in strictly increasing order.
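The rule stated for checkSplits (at least three split points, strictly increasing) can be illustrated with a small standalone sketch. The class `SplitsCheckSketch` and the method `isValidSplits` are hypothetical names chosen for this example. This is not Spark's implementation, and the handling of `null` and `NaN` here is an assumption rather than documented behavior.

```java
public class SplitsCheckSketch {
    // Hypothetical helper, not Spark's code: it only mirrors the documented rule
    // "length >= 3 and strictly increasing".
    public static boolean isValidSplits(double[] splits) {
        // Assumption: a null array is treated as invalid instead of throwing.
        if (splits == null || splits.length < 3) {
            return false;
        }
        for (int i = 1; i < splits.length; i++) {
            // Strictly increasing: equal neighbours are rejected.
            // A NaN on either side makes the comparison false, so it is rejected too.
            if (!(splits[i - 1] < splits[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidSplits(new double[] {0.0, 0.5, 1.0}));
        System.out.println(isValidSplits(new double[] {0.0, 1.0}));
        System.out.println(isValidSplits(new double[] {0.0, 0.5, 0.5}));
    }
}
```

Three split points are the minimum that yields two buckets, which is presumably why shorter arrays are rejected.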
public static double binarySearchForBuckets(double[] splits, double feature)
Parameters:
splits - (undocumented)
feature - (undocumented)
Throws:
SparkException - if a feature is < splits.head or > splits.last
public String uid()
public DoubleArrayParam splits()
public double[] getSplits()
public Bucketizer setSplits(double[] value)
public Bucketizer setInputCol(String value)
public Bucketizer setOutputCol(String value)
public DataFrame transform(DataFrame dataset)
Description copied from class: Transformer
Transforms the input dataset.
Specified by:
transform in class Transformer
Parameters:
dataset - (undocumented)
public StructType transformSchema(StructType schema)
Description copied from class: PipelineStage
Derives the output schema from the input schema.
Specified by:
transformSchema in class PipelineStage
Parameters:
schema - (undocumented)
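To make the bucket placement described for binarySearchForBuckets more concrete, here is a minimal standalone sketch built on `java.util.Arrays.binarySearch`. The names `BucketLookupSketch` and `findBucket` are hypothetical. The sketch assumes that bucket i covers the half-open range [splits[i], splits[i + 1]) and that the last bucket also includes the final split value; the page above does not spell out these boundary rules. It also throws `IllegalArgumentException` where the documented method throws `SparkException`, so that it runs without Spark on the classpath.

```java
import java.util.Arrays;

public class BucketLookupSketch {
    // Hypothetical stand-in for the documented lookup; not Spark's source.
    // Assumes splits is already valid (length >= 3, strictly increasing).
    public static double findBucket(double[] splits, double feature) {
        int last = splits.length - 1;
        if (feature == splits[last]) {
            // Assumption: the upper bound belongs to the last bucket.
            return last - 1;
        }
        int idx = Arrays.binarySearch(splits, feature);
        if (idx >= 0) {
            // The feature sits exactly on a split point, which opens bucket idx.
            return idx;
        }
        // For a missing key, binarySearch returns -(insertionPoint) - 1.
        int insertionPoint = -idx - 1;
        if (insertionPoint == 0 || insertionPoint == splits.length) {
            // feature < splits[0] or feature > splits[last]: out of range.
            throw new IllegalArgumentException(
                "feature " + feature + " is outside the splits range");
        }
        return insertionPoint - 1;
    }

    public static void main(String[] args) {
        double[] splits = {0.0, 10.0, 20.0, 30.0};
        System.out.println(findBucket(splits, 5.0));
        System.out.println(findBucket(splits, 10.0));
        System.out.println(findBucket(splits, 30.0));
    }
}
```

The result is returned as a double to match the documented signature, even though it always holds a whole-number bucket index.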