public class InputFormatInfo
extends Object
| Constructor and Description |
|---|
| InputFormatInfo(org.apache.hadoop.conf.Configuration configuration, Class<?> inputFormatClazz, String path) |
| Modifier and Type | Method and Description |
|---|---|
| static scala.collection.immutable.Map<String,scala.collection.immutable.Set<SplitInfo>> | computePreferredLocations(scala.collection.Seq<InputFormatInfo> formats) Computes the preferred locations based on input(s) and returns a location-to-block map. |
| org.apache.hadoop.conf.Configuration | configuration() |
| boolean | equals(Object other) |
| int | hashCode() |
| Class<?> | inputFormatClazz() |
| boolean | mapredInputFormat() |
| boolean | mapreduceInputFormat() |
| String | path() |
| String | toString() |
public InputFormatInfo(org.apache.hadoop.conf.Configuration configuration,
Class<?> inputFormatClazz,
String path)
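A minimal construction sketch in Scala; the default-constructed Hadoop configuration, the HDFS path, and the choice of TextInputFormat are illustrative assumptions, not values from this page:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
import org.apache.spark.scheduler.InputFormatInfo

val hadoopConf = new Configuration()

// Hypothetical input path; substitute a path that exists on your cluster.
val info = new InputFormatInfo(hadoopConf, classOf[TextInputFormat], "hdfs:///data/events")
```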
public static scala.collection.immutable.Map<String,scala.collection.immutable.Set<SplitInfo>> computePreferredLocations(scala.collection.Seq<InputFormatInfo> formats)
Computes the preferred locations based on input(s) and returns a location-to-block map. Typical use of this method for allocation would follow an algorithm like this:

a) For each host, count the number of splits hosted on that host.
b) Decrement by the number of containers currently allocated on that host.
c) Compute rack info for each host and update the rack -> count map based on (b).
d) Allocate nodes based on (c).
e) On the allocation result, ensure that we don't allocate "too many" jobs on a single node (even if data locality on that node is very high): this prevents the job from becoming fragile if a single host (or a small set of hosts) goes down.

Repeat from (a) until the required number of nodes is allocated. If a node dies, follow the same procedure.
Parameters:
formats - (undocumented)
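A hedged usage sketch of the algorithm above: building a small Seq of InputFormatInfo instances and asking for the host-to-splits map. The paths and the TextInputFormat class are hypothetical stand-ins:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
import org.apache.spark.scheduler.{InputFormatInfo, SplitInfo}

val hadoopConf = new Configuration()

// Hypothetical inputs; each path contributes its splits to the computation.
val formats = Seq(
  new InputFormatInfo(hadoopConf, classOf[TextInputFormat], "hdfs:///data/events"),
  new InputFormatInfo(hadoopConf, classOf[TextInputFormat], "hdfs:///data/users"))

// host -> set of splits whose data lives on that host; a resource manager
// can use the per-host counts to decide how many containers to request where.
val preferred: Map[String, Set[SplitInfo]] =
  InputFormatInfo.computePreferredLocations(formats)

preferred.foreach { case (host, splits) =>
  println(s"$host hosts ${splits.size} preferred splits")
}
```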
public org.apache.hadoop.conf.Configuration configuration()

public Class<?> inputFormatClazz()
public String path()
public boolean mapreduceInputFormat()
public boolean mapredInputFormat()
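These two flags report which Hadoop API the supplied input format class belongs to: the legacy org.apache.hadoop.mapred API or the newer org.apache.hadoop.mapreduce API. A sketch of the expected values under that reading (the paths are hypothetical and the expectations are inferred, not verified output):

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.mapred.{TextInputFormat => OldTextInputFormat}
import org.apache.hadoop.mapreduce.lib.input.{TextInputFormat => NewTextInputFormat}
import org.apache.spark.scheduler.InputFormatInfo

val conf = new Configuration()

// Class from the legacy "mapred" API; path is hypothetical.
val oldApi = new InputFormatInfo(conf, classOf[OldTextInputFormat], "hdfs:///data/events")
// Expected: oldApi.mapredInputFormat == true, oldApi.mapreduceInputFormat == false

// Class from the newer "mapreduce" API.
val newApi = new InputFormatInfo(conf, classOf[NewTextInputFormat], "hdfs:///data/events")
// Expected: newApi.mapreduceInputFormat == true, newApi.mapredInputFormat == false
```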
public String toString()
Overrides:
toString in class Object

public int hashCode()
Overrides:
hashCode in class Object

public boolean equals(Object other)
Overrides:
equals in class Object