public class PowerIterationClustering extends Object implements PowerIterationClusteringParams, DefaultParamsWritable
This class is not yet an Estimator/Transformer; use the assignClusters method to run the
PowerIterationClustering algorithm.
| Constructor and Description |
|---|
| `PowerIterationClustering()` |
| Modifier and Type | Method | Description |
|---|---|---|
| `Dataset<Row>` | `assignClusters(Dataset<?> dataset)` | Runs the PIC algorithm and returns a cluster assignment for each input vertex. |
| `PowerIterationClustering` | `copy(ParamMap extra)` | Creates a copy of this instance with the same UID and some extra params. |
| `Param<String>` | `dstCol()` | Name of the input column for destination vertex IDs. |
| `Param<String>` | `initMode()` | Param for the initialization algorithm. |
| `IntParam` | `k()` | The number of clusters to create (k). |
| `static PowerIterationClustering` | `load(String path)` | |
| `IntParam` | `maxIter()` | Param for maximum number of iterations (>= 0). |
| `Param<?>[]` | `params()` | Returns all params sorted by their names. |
| `static MLReader<T>` | `read()` | |
| `PowerIterationClustering` | `setDstCol(String value)` | |
| `PowerIterationClustering` | `setInitMode(String value)` | |
| `PowerIterationClustering` | `setK(int value)` | |
| `PowerIterationClustering` | `setMaxIter(int value)` | |
| `PowerIterationClustering` | `setSrcCol(String value)` | |
| `PowerIterationClustering` | `setWeightCol(String value)` | |
| `Param<String>` | `srcCol()` | Param for the name of the input column for source vertex IDs. |
| `String` | `uid()` | An immutable unique ID for the object and its derivatives. |
| `Param<String>` | `weightCol()` | Param for weight column name. |
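Methods inherited from class Object: equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface PowerIterationClusteringParams: getDstCol, getInitMode, getK, getSrcCol
Methods inherited from interface HasMaxIter: getMaxIter
Methods inherited from interface HasWeightCol: getWeightCol
Methods inherited from interface Params: clear, copyValues, defaultCopy, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, set, set, set, setDefault, setDefault, shouldOwn
Methods inherited from interface Identifiable: toString
Methods inherited from interface DefaultParamsWritable: write
Methods inherited from interface MLWritable: save
public static PowerIterationClustering load(String path)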
public static MLReader<T> read()
public final IntParam k()
The number of clusters to create (k).
Specified by: k in interface PowerIterationClusteringParams
public final Param<String> initMode()
Param for the initialization algorithm.
Specified by: initMode in interface PowerIterationClusteringParams
public Param<String> srcCol()
Param for the name of the input column for source vertex IDs.
Specified by: srcCol in interface PowerIterationClusteringParams
public Param<String> dstCol()
Name of the input column for destination vertex IDs.
Specified by: dstCol in interface PowerIterationClusteringParams
public final Param<String> weightCol()
Param for weight column name.
Specified by: weightCol in interface HasWeightCol
public final IntParam maxIter()
Param for maximum number of iterations (>= 0).
Specified by: maxIter in interface HasMaxIter
public Param<?>[] params()
Returns all params sorted by their names.
public String uid()
An immutable unique ID for the object and its derivatives.
Specified by: uid in interface Identifiable
public PowerIterationClustering setK(int value)
public PowerIterationClustering setInitMode(String value)
public PowerIterationClustering setMaxIter(int value)
public PowerIterationClustering setSrcCol(String value)
public PowerIterationClustering setDstCol(String value)
public PowerIterationClustering setWeightCol(String value)
public Dataset<Row> assignClusters(Dataset<?> dataset)
dataset - A dataset with columns src, dst, weight representing the affinity matrix,
which is the matrix A in the PIC paper. Suppose the src column value is i,
the dst column value is j, and the weight column value is the similarity s_ij,
which must be nonnegative. The matrix is symmetric, so
s_ij = s_ji. For any (i, j) with nonzero similarity, the input should contain
either (i, j, s_ij) or (j, i, s_ji). Rows with i = j are
ignored, because we assume s_ij = 0.0 in that case.
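The input contract above can be illustrated without Spark. The following is a minimal sketch in plain Java, assuming the similarities start out as a dense, square, symmetric matrix. The class `AffinityTriplets`, its nested `Edge` type, and the method `fromSymmetricMatrix` are hypothetical names introduced for this illustration and are not part of the Spark API. It emits one (src, dst, weight) row per unordered pair with nonzero similarity, skips the diagonal, and rejects negative or asymmetric entries.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Spark): dense similarity matrix -> (src, dst, weight) rows. */
final class AffinityTriplets {

    /** One input row: source vertex id, destination vertex id, similarity. */
    static final class Edge {
        final long src;
        final long dst;
        final double weight;

        Edge(long src, long dst, double weight) {
            this.src = src;
            this.dst = dst;
            this.weight = weight;
        }

        @Override
        public String toString() {
            return "(" + src + ", " + dst + ", " + weight + ")";
        }
    }

    /** Assumes s is square; validates nonnegativity and symmetry of the off-diagonal entries. */
    static List<Edge> fromSymmetricMatrix(double[][] s) {
        List<Edge> edges = new ArrayList<>();
        for (int i = 0; i < s.length; i++) {
            // Starting at j = i + 1 skips the diagonal (i = j) and emits each
            // unordered pair once, as (i, j, s_ij).
            for (int j = i + 1; j < s.length; j++) {
                double w = s[i][j];
                if (w < 0.0) {
                    throw new IllegalArgumentException(
                        "negative similarity at (" + i + ", " + j + ")");
                }
                if (w != s[j][i]) {
                    throw new IllegalArgumentException(
                        "matrix is not symmetric at (" + i + ", " + j + ")");
                }
                if (w != 0.0) {
                    edges.add(new Edge(i, j, w));
                }
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        double[][] s = {
            {0.0, 1.0, 0.0},
            {1.0, 0.0, 0.5},
            {0.0, 0.5, 0.0}
        };
        for (Edge e : fromSymmetricMatrix(s)) {
            System.out.println(e);
        }
    }
}
```

Rows produced this way would still need to be loaded into a Dataset whose column names match the srcCol, dstCol, and weightCol params before being passed to assignClusters.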
public PowerIterationClustering copy(ParamMap extra)
Creates a copy of this instance with the same UID and some extra params. See defaultCopy().