public class VectorIndexerModel extends Model<VectorIndexerModel>
This maintains vector sparsity.
param: numFeatures Number of features, i.e., the length of the Vectors which this transforms.
param: categoryMaps Feature value index. Keys are categorical feature indices (column indices). Values are maps from original feature values to 0-based category indices. If a feature is not in this map, it is treated as continuous.
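The role of categoryMaps can be illustrated with a small plain-Java sketch. It is not Spark's implementation: the class name `CategoryMapSketch`, the method `reindex`, and the use of a dense `double[]` in place of a Spark Vector are assumptions made for illustration, as is the choice to throw on a value missing from a feature's map. Only the map shape (feature index to a map from original value to 0-based category index) comes from the description above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class CategoryMapSketch {

    // Hypothetical helper: re-index one dense feature vector.
    // Features listed in categoryMaps are replaced by their 0-based category index;
    // features absent from categoryMaps are treated as continuous and copied unchanged.
    public static double[] reindex(double[] features,
                                   Map<Integer, Map<Double, Integer>> categoryMaps) {
        double[] out = features.clone();
        for (Map.Entry<Integer, Map<Double, Integer>> entry : categoryMaps.entrySet()) {
            int featureIndex = entry.getKey();
            Integer categoryIndex = entry.getValue().get(features[featureIndex]);
            if (categoryIndex == null) {
                // Assumption of this sketch: a value with no mapping is an error.
                throw new IllegalArgumentException(
                    "No category for value " + features[featureIndex]
                        + " of feature " + featureIndex);
            }
            out[featureIndex] = categoryIndex;
        }
        return out;
    }

    public static void main(String[] args) {
        // Feature 0 is categorical with original values -1.0, 0.0 and 5.0.
        Map<Double, Integer> feature0 = new HashMap<>();
        feature0.put(-1.0, 0);
        feature0.put(0.0, 1);
        feature0.put(5.0, 2);

        Map<Integer, Map<Double, Integer>> categoryMaps = new HashMap<>();
        categoryMaps.put(0, feature0);

        // Feature 1 has no entry, so it stays continuous.
        double[] result = reindex(new double[] {5.0, 3.7}, categoryMaps);
        System.out.println(Arrays.toString(result)); // prints [2.0, 3.7]
    }
}
```

In the example, feature 0 is rewritten from 5.0 to its category index 2, while feature 1 passes through untouched because it has no entry in the map.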
| Modifier and Type | Method and Description |
|---|---|
| `scala.collection.immutable.Map<Object,scala.collection.immutable.Map<Object,Object>>` | `categoryMaps()` |
| `int` | `getMaxCategories()` |
| `java.util.Map<Integer,java.util.Map<Double,Integer>>` | `javaCategoryMaps()` Java-friendly version of `categoryMaps` |
| `IntParam` | `maxCategories()` Threshold for the number of values a categorical feature can take. |
| `int` | `numFeatures()` |
| `VectorIndexerModel` | `setInputCol(String value)` |
| `VectorIndexerModel` | `setOutputCol(String value)` |
| `DataFrame` | `transform(DataFrame dataset)` Transforms the input dataset. |
| `StructType` | `transformSchema(StructType schema)` :: DeveloperApi :: |
| `String` | `uid()` |
Methods inherited from class Transformer: transform, transform, transform
Methods inherited from class Object: equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Params: clear, copy, copyValues, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, paramMap, params, set, set, set, setDefault, setDefault, setDefault, shouldOwn, validateParams
Methods inherited from interface Logging: initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning
public String uid()
public int numFeatures()
public scala.collection.immutable.Map<Object,scala.collection.immutable.Map<Object,Object>> categoryMaps()
public java.util.Map<Integer,java.util.Map<Double,Integer>> javaCategoryMaps()
Java-friendly version of categoryMaps
public VectorIndexerModel setInputCol(String value)
public VectorIndexerModel setOutputCol(String value)
public DataFrame transform(DataFrame dataset)
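Description copied from class Transformer: Transforms the input dataset.
Specified by: transform in class Transformer
Parameters: dataset - (undocumented)
public StructType transformSchema(StructType schema)
Description copied from class PipelineStage: :: DeveloperApi :: Derives the output schema from the input schema.
Specified by: transformSchema in class PipelineStage
Parameters: schema - (undocumented)
public IntParam maxCategories()
Threshold for the number of values a categorical feature can take. (default = 20)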
public int getMaxCategories()
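As a rough illustration of what the maxCategories threshold means, the plain-Java sketch below classifies a single feature column by counting its distinct values. The class `MaxCategoriesSketch` and the method `isCategorical` are hypothetical names, not part of the Spark API. The sketch assumes that a feature with more than maxCategories distinct values is treated as continuous, and it leaves out the construction of the category maps themselves.

```java
import java.util.Set;
import java.util.TreeSet;

public class MaxCategoriesSketch {

    // Hypothetical helper: returns true when the column has at most
    // maxCategories distinct values, i.e. it could be indexed as categorical.
    public static boolean isCategorical(double[] columnValues, int maxCategories) {
        Set<Double> distinct = new TreeSet<>();
        for (double value : columnValues) {
            distinct.add(value);
            if (distinct.size() > maxCategories) {
                // Too many distinct values: treat the feature as continuous.
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        double[] fewValues = {0.0, 1.0, 1.0, 2.0, 0.0};    // 3 distinct values
        double[] manyValues = {0.1, 0.2, 0.3, 0.4, 0.5};   // 5 distinct values

        System.out.println(isCategorical(fewValues, 4));   // prints true
        System.out.println(isCategorical(manyValues, 4));  // prints false
    }
}
```

With the default threshold of 20, a column would need at least 21 distinct values before this check classified it as continuous.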