public final class RandomForestClassifier extends ProbabilisticClassifier<Vector,RandomForestClassifier,RandomForestClassificationModel>
Random Forest learning algorithm for classification.
It supports both binary and multiclass labels, as well as both continuous and categorical features.
Constructor and Description |
---|
RandomForestClassifier() |
RandomForestClassifier(java.lang.String uid) |
Modifier and Type | Method and Description |
---|---|
RandomForestClassifier | copy(ParamMap extra): Creates a copy of this instance with the same UID and some extra params. |
Param<java.lang.String> | featuresCol(): Param for features column name. |
java.lang.String | getFeaturesCol() |
java.lang.String | getLabelCol() |
java.lang.String | getPredictionCol() |
java.lang.String | getRawPredictionCol() |
Param<java.lang.String> | labelCol(): Param for label column name. |
Param<java.lang.String> | predictionCol(): Param for prediction column name. |
Param<java.lang.String> | rawPredictionCol(): Param for raw prediction (a.k.a. confidence) column name. |
RandomForestClassifier | setCacheNodeIds(boolean value) |
RandomForestClassifier | setCheckpointInterval(int value) |
RandomForestClassifier | setFeatureSubsetStrategy(java.lang.String value) |
RandomForestClassifier | setImpurity(java.lang.String value) |
RandomForestClassifier | setMaxBins(int value) |
RandomForestClassifier | setMaxDepth(int value) |
RandomForestClassifier | setMaxMemoryInMB(int value) |
RandomForestClassifier | setMinInfoGain(double value) |
RandomForestClassifier | setMinInstancesPerNode(int value) |
RandomForestClassifier | setNumTrees(int value) |
RandomForestClassifier | setSeed(long value) |
RandomForestClassifier | setSubsamplingRate(double value) |
static java.lang.String[] | supportedFeatureSubsetStrategies(): Accessor for supported featureSubsetStrategy settings: auto, all, onethird, sqrt, log2 |
static java.lang.String[] | supportedImpurities(): Accessor for supported impurity settings: entropy, gini |
protected RandomForestClassificationModel | train(DataFrame dataset): Train a model using the given dataset and parameters. |
java.lang.String | uid(): An immutable unique ID for the object and its derivatives. |
StructType | validateAndTransformSchema(StructType schema, boolean fitting, DataType featuresDataType): Validates and transforms the input schema with the provided param map. |
Methods inherited from class ProbabilisticClassifier: setProbabilityCol, setThresholds
Methods inherited from class Classifier: setRawPredictionCol
Methods inherited from class Predictor: extractLabeledPoints, fit, setFeaturesCol, setLabelCol, setPredictionCol, transformSchema
Methods inherited from class PipelineStage: transformSchema
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Params: clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn, validateParams
Methods inherited from interface Identifiable: toString
Methods inherited from interface Logging: initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning
public RandomForestClassifier(java.lang.String uid)
public RandomForestClassifier()
public static final java.lang.String[] supportedImpurities()
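The two impurity settings named for this accessor, entropy and gini, are standard node-impurity measures computed from per-class label counts. The following is a minimal standalone sketch of the textbook formulas, not Spark's implementation; the class name `ImpuritySketch` and its method names are hypothetical, and the choice of base-2 logarithm for entropy is an assumption.

```java
// Standalone illustration of the two impurity measures; not part of Spark.
final class ImpuritySketch {

    /** Gini impurity: 1 - sum(p_i^2), where p_i is the proportion of class i. */
    static double gini(double[] classCounts) {
        double total = 0.0;
        for (double c : classCounts) {
            total += c;
        }
        if (total == 0.0) {
            return 0.0; // an empty node is treated as pure
        }
        double sumOfSquares = 0.0;
        for (double c : classCounts) {
            double p = c / total;
            sumOfSquares += p * p;
        }
        return 1.0 - sumOfSquares;
    }

    /** Entropy in bits: -sum(p_i * log2(p_i)), skipping classes with no instances. */
    static double entropy(double[] classCounts) {
        double total = 0.0;
        for (double c : classCounts) {
            total += c;
        }
        if (total == 0.0) {
            return 0.0; // an empty node is treated as pure
        }
        double h = 0.0;
        for (double c : classCounts) {
            if (c > 0.0) {
                double p = c / total;
                h -= p * (Math.log(p) / Math.log(2.0));
            }
        }
        return h;
    }

    public static void main(String[] args) {
        double[] balanced = {5, 5}; // two classes, evenly mixed
        double[] pure = {10, 0};    // a single class only
        System.out.println("gini(balanced)    = " + gini(balanced));
        System.out.println("entropy(balanced) = " + entropy(balanced));
        System.out.println("gini(pure)        = " + gini(pure));
        System.out.println("entropy(pure)     = " + entropy(pure));
    }
}
```

Both measures are zero for a node containing a single class and largest when classes are evenly mixed, which is why either can be used to score candidate splits.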
public static final java.lang.String[] supportedFeatureSubsetStrategies()
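The strategy names listed for this accessor (auto, all, onethird, sqrt, log2) describe how many features are considered at each split. The sketch below shows one plausible mapping from a strategy name to a feature count. It is an illustration only: the class `FeatureSubsetSketch`, the rounding rules, and the behavior of `auto` (all features for a single tree, otherwise `sqrt`) are assumptions, not statements taken from this page.

```java
// Standalone illustration of feature-subset strategies; not part of Spark.
final class FeatureSubsetSketch {

    /**
     * Returns how many features would be sampled per split under the given strategy.
     * Rounding up and the meaning of "auto" are assumptions made for this sketch.
     */
    static int featuresPerNode(String strategy, int numFeatures, int numTrees) {
        if (numFeatures < 1) {
            throw new IllegalArgumentException("numFeatures must be >= 1");
        }
        switch (strategy) {
            case "all":
                return numFeatures;
            case "sqrt":
                return (int) Math.ceil(Math.sqrt(numFeatures));
            case "log2":
                // Guard against log2(1) = 0 so at least one feature is used.
                return Math.max(1, (int) Math.ceil(Math.log(numFeatures) / Math.log(2.0)));
            case "onethird":
                return (int) Math.ceil(numFeatures / 3.0);
            case "auto":
                // Assumed rule: a single tree sees every feature, a forest uses sqrt.
                return numTrees == 1
                        ? numFeatures
                        : featuresPerNode("sqrt", numFeatures, numTrees);
            default:
                throw new IllegalArgumentException("Unsupported strategy: " + strategy);
        }
    }

    public static void main(String[] args) {
        String[] strategies = {"auto", "all", "onethird", "sqrt", "log2"};
        for (String s : strategies) {
            System.out.println(s + " -> " + featuresPerNode(s, 100, 10));
        }
    }
}
```

Smaller subsets decorrelate the trees at the cost of weaker individual splits, so the strategy is usually tuned together with the number of trees.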
public java.lang.String uid()
Description copied from interface: Identifiable
An immutable unique ID for the object and its derivatives.
public RandomForestClassifier setMaxDepth(int value)
public RandomForestClassifier setMaxBins(int value)
public RandomForestClassifier setMinInstancesPerNode(int value)
public RandomForestClassifier setMinInfoGain(double value)
public RandomForestClassifier setMaxMemoryInMB(int value)
public RandomForestClassifier setCacheNodeIds(boolean value)
public RandomForestClassifier setCheckpointInterval(int value)
public RandomForestClassifier setImpurity(java.lang.String value)
public RandomForestClassifier setSubsamplingRate(double value)
public RandomForestClassifier setSeed(long value)
public RandomForestClassifier setNumTrees(int value)
public RandomForestClassifier setFeatureSubsetStrategy(java.lang.String value)
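Every setter above returns `RandomForestClassifier`, which allows calls to be chained. The sketch below imitates that style with a hypothetical stand-in class so it can run without Spark on the classpath; `ForestSettingsSketch`, its getters, its validation rules, and its initial field values are all placeholders invented for illustration and say nothing about the real defaults.

```java
// Hypothetical stand-in that mirrors the chained-setter style; not part of Spark.
final class ForestSettingsSketch {
    private int numTrees = 1;         // placeholder initial value
    private int maxDepth = 1;         // placeholder initial value
    private String impurity = "gini"; // one of the two supported impurity names
    private long seed = 0L;           // placeholder initial value

    ForestSettingsSketch setNumTrees(int value) {
        if (value < 1) {
            throw new IllegalArgumentException("numTrees must be >= 1");
        }
        this.numTrees = value;
        return this; // returning this is what makes chaining possible
    }

    ForestSettingsSketch setMaxDepth(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("maxDepth must be >= 0");
        }
        this.maxDepth = value;
        return this;
    }

    ForestSettingsSketch setImpurity(String value) {
        if (!"gini".equals(value) && !"entropy".equals(value)) {
            throw new IllegalArgumentException("Unsupported impurity: " + value);
        }
        this.impurity = value;
        return this;
    }

    ForestSettingsSketch setSeed(long value) {
        this.seed = value;
        return this;
    }

    int getNumTrees() { return numTrees; }
    int getMaxDepth() { return maxDepth; }
    String getImpurity() { return impurity; }
    long getSeed() { return seed; }

    public static void main(String[] args) {
        ForestSettingsSketch settings = new ForestSettingsSketch()
                .setNumTrees(50)
                .setMaxDepth(8)
                .setImpurity("entropy")
                .setSeed(42L);
        System.out.println(settings.getNumTrees() + " trees, depth " + settings.getMaxDepth()
                + ", impurity " + settings.getImpurity() + ", seed " + settings.getSeed());
    }
}
```

With the real class, the same chaining pattern would be applied to a `RandomForestClassifier` instance before calling `fit`.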
protected RandomForestClassificationModel train(DataFrame dataset)
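Description copied from class: Predictor
Train a model using the given dataset and parameters.
Developers can implement this instead of fit() to avoid dealing with schema validation
and copying parameters into the model.
Specified by: train in class Predictor<Vector,RandomForestClassifier,RandomForestClassificationModel>
Parameters:
dataset - Training dataset
public RandomForestClassifier copy(ParamMap extra)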
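Description copied from interface: Params
Creates a copy of this instance with the same UID and some extra params.
Specified by: copy in interface Params
Specified by: copy in class Predictor<Vector,RandomForestClassifier,RandomForestClassificationModel>
Parameters:
extra - (undocumented)
See Also: defaultCopy()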
public Param<java.lang.String> rawPredictionCol()
public java.lang.String getRawPredictionCol()
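For intuition about how a raw prediction relates to the final prediction, the sketch below assumes the raw prediction is a per-class score vector built by tallying one vote per tree, and that the predicted label is the index of the largest score. This is an assumption made for illustration, not a description of what this class writes to the raw prediction column; `VoteSketch` and its methods are hypothetical.

```java
// Standalone illustration of vote tallying and argmax; not part of Spark.
final class VoteSketch {

    /** Counts one vote per tree for the class that tree predicted. */
    static double[] tallyVotes(int[] treePredictions, int numClasses) {
        double[] votes = new double[numClasses];
        for (int predictedClass : treePredictions) {
            if (predictedClass < 0 || predictedClass >= numClasses) {
                throw new IllegalArgumentException("Class out of range: " + predictedClass);
            }
            votes[predictedClass] += 1.0;
        }
        return votes;
    }

    /** Returns the index of the largest score; ties go to the lowest index. */
    static int argmax(double[] scores) {
        if (scores.length == 0) {
            throw new IllegalArgumentException("scores must not be empty");
        }
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] treePredictions = {0, 2, 2, 1, 2}; // labels predicted by five trees
        double[] raw = tallyVotes(treePredictions, 3);
        System.out.println("raw scores = " + java.util.Arrays.toString(raw));
        System.out.println("prediction = " + argmax(raw));
    }
}
```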
public StructType validateAndTransformSchema(StructType schema, boolean fitting, DataType featuresDataType)
Parameters:
schema - input schema
fitting - whether this is in fitting
featuresDataType - SQL DataType for FeaturesType. E.g., VectorUDT for vector features.
public Param<java.lang.String> labelCol()
public java.lang.String getLabelCol()
public Param<java.lang.String> featuresCol()
public java.lang.String getFeaturesCol()
public Param<java.lang.String> predictionCol()
public java.lang.String getPredictionCol()