org.apache.spark.mllib.evaluation

BinaryClassificationMetrics

class BinaryClassificationMetrics extends Logging

:: Experimental :: Evaluator for binary classification.

Annotations
@Experimental()
Linear Supertypes
Logging, AnyRef, Any

Instance Constructors

  1. new BinaryClassificationMetrics(scoreAndLabels: RDD[(Double, Double)])

    Defaults numBins to 0.

  2. new BinaryClassificationMetrics(scoreAndLabels: RDD[(Double, Double)], numBins: Int)

    scoreAndLabels

    an RDD of (score, label) pairs.

    numBins

    If greater than 0, the curves (ROC curve, PR curve) computed internally are down-sampled to this many "bins". If 0, no down-sampling occurs. Down-sampling is useful because a curve contains one point for each distinct score in the input, which can be as large as the input itself: millions of points or more, when thousands may be entirely sufficient to summarize the curve. After down-sampling, the curves are made of approximately numBins points. Points are made from bins of equal numbers of consecutive points. The size of each bin is floor(scoreAndLabels.count() / numBins), so the resulting number of bins may not exactly equal numBins. The last bin in each partition may be smaller as a result, meaning there may be an extra sample at partition boundaries.
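
The bin-size arithmetic described for numBins can be sketched in plain Scala. This is only an illustration of the stated formula, not the library's code: the names BinSizeSketch, binSize and resultingBins are hypothetical, all points are assumed to sit in a single partition (so the per-partition extra sample is ignored), and the lower bound of 1 on the bin size is an added assumption.

```scala
object BinSizeSketch {
  // Size of each bin as stated in the description: floor(count / numBins).
  // Integer division floors for non-negative operands.
  // Assumption (not from the docs): the size is kept at 1 or more.
  def binSize(count: Long, numBins: Int): Long = {
    require(numBins > 0, "down-sampling only applies when numBins > 0")
    math.max(1L, count / numBins)
  }

  // Number of bins obtained when `count` consecutive points are cut into
  // groups of `binSize`; a smaller trailing group counts as one more bin.
  def resultingBins(count: Long, numBins: Int): Long = {
    val size = binSize(count, numBins)
    (count + size - 1) / size
  }

  def main(args: Array[String]): Unit = {
    // 10 points, 4 requested bins: size = floor(10 / 4) = 2, giving 5 bins.
    println(resultingBins(10L, 4))
    // 1,000,000 points, 1000 requested bins: divides evenly, giving 1000 bins.
    println(resultingBins(1000000L, 1000))
  }
}
```

The first call shows why the number of bins may not exactly equal numBins: the floored bin size of 2 leaves five groups rather than four.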

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. def areaUnderPR(): Double

    Computes the area under the precision-recall curve.

  7. def areaUnderROC(): Double

    Computes the area under the receiver operating characteristic (ROC) curve.

  8. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  9. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  10. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  11. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  12. def fMeasureByThreshold(): RDD[(Double, Double)]

    Returns the (threshold, F-Measure) curve with beta = 1.0.

  13. def fMeasureByThreshold(beta: Double): RDD[(Double, Double)]

    Returns the (threshold, F-Measure) curve.

    beta

    the beta factor in F-Measure computation.

    returns

    an RDD of (threshold, F-Measure) pairs.

    See also

    http://en.wikipedia.org/wiki/F1_score

  14. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  15. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  16. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  17. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  18. def isTraceEnabled(): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  19. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  20. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  21. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  22. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  23. def logError(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  24. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  25. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  26. def logName: String

    Attributes
    protected
    Definition Classes
    Logging
  27. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  28. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  29. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  30. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  31. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  32. final def notify(): Unit

    Definition Classes
    AnyRef
  33. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  34. val numBins: Int

    If greater than 0, the curves (ROC curve, PR curve) computed internally are down-sampled to this many "bins". If 0, no down-sampling occurs. Down-sampling is useful because a curve contains one point for each distinct score in the input, which can be as large as the input itself: millions of points or more, when thousands may be entirely sufficient to summarize the curve. After down-sampling, the curves are made of approximately numBins points. Points are made from bins of equal numbers of consecutive points. The size of each bin is floor(scoreAndLabels.count() / numBins), so the resulting number of bins may not exactly equal numBins. The last bin in each partition may be smaller as a result, meaning there may be an extra sample at partition boundaries.

  35. def pr(): RDD[(Double, Double)]

    Returns the precision-recall curve, which is an RDD of (recall, precision), NOT (precision, recall), with (0.0, 1.0) prepended to it.

    See also

    http://en.wikipedia.org/wiki/Precision_and_recall

  36. def precisionByThreshold(): RDD[(Double, Double)]

    Returns the (threshold, precision) curve.

  37. def recallByThreshold(): RDD[(Double, Double)]

    Returns the (threshold, recall) curve.

  38. def roc(): RDD[(Double, Double)]

    Returns the receiver operating characteristic (ROC) curve, which is an RDD of (false positive rate, true positive rate) with (0.0, 0.0) prepended and (1.0, 1.0) appended to it.

    See also

    http://en.wikipedia.org/wiki/Receiver_operating_characteristic

  39. val scoreAndLabels: RDD[(Double, Double)]

    an RDD of (score, label) pairs.

  40. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  41. def thresholds(): RDD[Double]

    Returns thresholds in descending order.

  42. def toString(): String

    Definition Classes
    AnyRef → Any
  43. def unpersist(): Unit

    Unpersist intermediate RDDs used in the computation.

  44. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  45. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  46. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
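
The shapes of the curves listed above can be illustrated on a small in-memory collection. The sketch below uses plain Scala collections rather than RDDs and is not the library's implementation. It rests on several assumptions that this page does not state: labels are 1.0 for positive and 0.0 for negative, a score at or above the threshold is predicted positive, the areas are computed with the trapezoidal rule, and every name in it (ThresholdCurvesSketch, Counts, byThreshold, trapezoidArea) is hypothetical.

```scala
object ThresholdCurvesSketch {
  // Confusion counts when every score >= threshold is predicted positive.
  final case class Counts(tp: Int, fp: Int, positives: Int, negatives: Int) {
    def precision: Double = if (tp + fp == 0) 1.0 else tp.toDouble / (tp + fp)
    def recall: Double = if (positives == 0) 0.0 else tp.toDouble / positives
    def falsePositiveRate: Double = if (negatives == 0) 0.0 else fp.toDouble / negatives
    // F-Measure with factor beta: (1 + beta^2) * P * R / (beta^2 * P + R).
    def fMeasure(beta: Double): Double = {
      val b2 = beta * beta
      val denom = b2 * precision + recall
      if (denom == 0.0) 0.0 else (1.0 + b2) * precision * recall / denom
    }
  }

  // One (threshold, Counts) pair per distinct score, thresholds descending.
  def byThreshold(scoreAndLabels: Seq[(Double, Double)]): Seq[(Double, Counts)] = {
    val positives = scoreAndLabels.count(_._2 > 0.5)
    val negatives = scoreAndLabels.size - positives
    val thresholds = scoreAndLabels.map(_._1).distinct.sortWith(_ > _)
    thresholds.map { t =>
      val predicted = scoreAndLabels.filter(_._1 >= t)
      val tp = predicted.count(_._2 > 0.5)
      (t, Counts(tp, predicted.size - tp, positives, negatives))
    }
  }

  // (false positive rate, true positive rate), with (0.0, 0.0) prepended
  // and (1.0, 1.0) appended, matching the description of roc().
  def roc(data: Seq[(Double, Double)]): Seq[(Double, Double)] =
    ((0.0, 0.0) +: byThreshold(data).map { case (_, c) => (c.falsePositiveRate, c.recall) }) :+ ((1.0, 1.0))

  // (recall, precision), NOT (precision, recall), with (0.0, 1.0) prepended.
  def pr(data: Seq[(Double, Double)]): Seq[(Double, Double)] =
    (0.0, 1.0) +: byThreshold(data).map { case (_, c) => (c.recall, c.precision) }

  // Trapezoidal area under a curve given as (x, y) points in x order.
  def trapezoidArea(curve: Seq[(Double, Double)]): Double =
    curve.zip(curve.drop(1)).map { case ((x1, y1), (x2, y2)) =>
      (x2 - x1) * (y1 + y2) / 2.0
    }.sum

  def main(args: Array[String]): Unit = {
    // Made-up (score, label) pairs, for illustration only.
    val data = Seq((0.9, 1.0), (0.8, 1.0), (0.7, 0.0), (0.6, 1.0), (0.4, 0.0), (0.2, 0.0))
    byThreshold(data).foreach { case (t, c) =>
      println(s"threshold=$t precision=${c.precision} recall=${c.recall} f1=${c.fMeasure(1.0)}")
    }
    // For this data the trapezoidal ROC area is 8/9, roughly 0.889.
    println(trapezoidArea(roc(data)))
    println(trapezoidArea(pr(data)))
  }
}
```

On the sample data, 8 of the 9 (positive, negative) pairs are ranked correctly, which agrees with the trapezoidal ROC area of 8/9. With the real class, the corresponding values would come from roc(), pr(), fMeasureByThreshold(beta) and the area methods on an RDD of (score, label) pairs.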
