public abstract class FileInputSplit<OT> extends LocatableInputSplit<OT>

Modifier and Type | Class | Description |
---|---|---|
static class | FileInputSplit.InputSplitOpenThread | Obtains a DataInputStream in a thread that is not interrupted. |

Modifier and Type | Field | Description |
---|---|---|
protected boolean | enumerateNestedFiles | The flag to specify whether recursive traversal of the input directory structure is enabled. |
protected static float | MAX_SPLIT_SIZE_DISCREPANCY | The fraction by which the last split may be larger than the others. |
protected long | minSplitSize | The minimal split size, set by the configure() method. |
protected int | numSplits | The desired number of splits, as set by the configure() method. |
protected long | openTimeout | Time to wait when opening a file. |
protected long | splitLength | Length of the current split. |
protected long | splitStart | Start point of the current split. |
protected FSDataInputStream | stream | The input data stream. |

Constructor | Description |
---|---|
FileInputSplit(int num, Path file, long start, long length, java.lang.String[] hosts) | Constructs a split with host information. |
FileInputSplit(int num, Path file, java.lang.String[] hosts) | |

Modifier and Type | Method | Description |
---|---|---|
void | close() | Method that marks the end of the life-cycle of an input split. |
void | configure(Config parameters) | Configures the split with the given parameters. |
boolean | equals(java.lang.Object obj) | |
long | getLength() | Returns the number of bytes in the file to process. |
long | getMinSplitSize() | |
int | getNumSplits() | |
Path | getPath() | Returns the path of the file containing this split's data. |
long | getStart() | Returns the position of the first byte in the file to process. |
int | hashCode() | |
boolean | isEnumerateNestedFiles() | |
void | open() | Opens an input stream to the file defined in the input format. |
void | open(Config cfg) | |
void | setEnumerateNestedFiles(boolean enumerateNestedFiles) | |
void | setMinSplitSize(long minSplitSize) | |
void | setNumSplits(int numSplits) | |
java.lang.String | toString() | |
Methods inherited from class LocatableInputSplit: getHostnames, getSplitNumber
Methods inherited from class java.lang.Object: clone, finalize, getClass, notify, notifyAll, wait, wait, wait
nextRecord, reachedEnd
protected int numSplits
protected boolean enumerateNestedFiles
protected static final float MAX_SPLIT_SIZE_DISCREPANCY
protected long minSplitSize
protected long splitStart
protected long splitLength
protected FSDataInputStream stream
protected long openTimeout
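This page does not show how numSplits, minSplitSize and MAX_SPLIT_SIZE_DISCREPANCY interact. One plausible reading of the field descriptions is sketched below: the file is cut into equally sized splits, and the tail is merged into the last split as long as that split stays within the allowed discrepancy. The class SplitPlanner, the method plan, and the value 1.1f are assumptions made for illustration; they are not taken from this API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; the library's actual split computation is not shown on this page. */
final class SplitPlanner {

    /** Assumed value: the last split may be up to 10% larger than the others. */
    static final float MAX_SPLIT_SIZE_DISCREPANCY = 1.1f;

    /** Returns {start, length} pairs that together cover the byte range [0, fileSize). */
    static List<long[]> plan(long fileSize, int numSplits, long minSplitSize) {
        if (fileSize < 0 || numSplits < 1 || minSplitSize < 1) {
            throw new IllegalArgumentException("fileSize >= 0, numSplits >= 1, minSplitSize >= 1 required");
        }
        List<long[]> splits = new ArrayList<>();
        if (fileSize == 0) {
            return splits; // nothing to read
        }

        // Desired size per split (rounded up), but never below the configured minimum.
        long desired = (fileSize + numSplits - 1) / numSplits;
        long splitSize = Math.max(minSplitSize, desired);

        // The last split may absorb a small remainder instead of creating a tiny extra split.
        long maxBytesForLastSplit = (long) (splitSize * MAX_SPLIT_SIZE_DISCREPANCY);

        long start = 0;
        long remaining = fileSize;
        while (remaining > maxBytesForLastSplit) {
            splits.add(new long[] {start, splitSize});
            start += splitSize;
            remaining -= splitSize;
        }
        splits.add(new long[] {start, remaining}); // last split takes whatever is left
        return splits;
    }
}
```

With these assumptions, a 630-byte file with minSplitSize 300 yields two splits of 300 and 330 bytes rather than three splits of 300, 300 and 30 bytes.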
public FileInputSplit(int num, Path file, long start, long length, java.lang.String[] hosts)
Parameters:
num - the number of this input split
file - the file name
start - the position of the first byte in the file to process
length - the number of bytes in the file to process (-1 is flag for "read whole file")
hosts - the list of hosts containing the block, possibly null
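The length parameter doubles as a flag, so a reader has to resolve -1 against the actual file size before reading. A minimal sketch of that resolution follows. SplitRange and resolveLength are hypothetical names introduced for illustration, and both the "from start to end of file" interpretation of the flag and the clamping to the file size are assumptions, not documented behavior.

```java
/** Hypothetical helper, not part of FileInputSplit: resolves the byte range of a split. */
final class SplitRange {

    /** Flag value from the constructor documentation: -1 means "read whole file". */
    static final long READ_WHOLE_FILE = -1L;

    final long start;
    final long length;

    SplitRange(long start, long length) {
        if (start < 0) {
            throw new IllegalArgumentException("start must be >= 0");
        }
        if (length < 0 && length != READ_WHOLE_FILE) {
            throw new IllegalArgumentException("length must be >= 0 or -1");
        }
        this.start = start;
        this.length = length;
    }

    /**
     * Number of bytes to read for a file of the given size.
     * Assumption: the flag means "from start to the end of the file",
     * and an explicit length is clamped so it never runs past the end.
     */
    long resolveLength(long fileSize) {
        long remaining = Math.max(0L, fileSize - start);
        return length == READ_WHOLE_FILE ? remaining : Math.min(length, remaining);
    }
}
```

For example, `new SplitRange(0, -1).resolveLength(1000)` resolves to the full 1000 bytes, while `new SplitRange(100, 50).resolveLength(1000)` stays at 50.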
public FileInputSplit(int num, Path file, java.lang.String[] hosts)
public boolean isEnumerateNestedFiles()
public void setEnumerateNestedFiles(boolean enumerateNestedFiles)
public long getMinSplitSize()
public void setMinSplitSize(long minSplitSize)
public int getNumSplits()
public void setNumSplits(int numSplits)
public Path getPath()
public long getStart()
public long getLength()
public int hashCode()
Overrides: hashCode in class LocatableInputSplit<OT>
public boolean equals(java.lang.Object obj)
Overrides: equals in class LocatableInputSplit<OT>
public java.lang.String toString()
Overrides: toString in class LocatableInputSplit<OT>
public void configure(Config parameters)
Description copied from: InputSplit
Configures the split with the given parameters.
Parameters:
parameters - the parameters
public void close() throws java.io.IOException
Description copied from: InputSplit
Method that marks the end of the life-cycle of an input split. When this method is called, the input format is guaranteed to be opened.
Throws:
java.io.IOException - Thrown, if the input could not be closed properly.
public void open() throws java.io.IOException
Opens an input stream to the file defined in the input format. The stream is actually opened in an asynchronous thread to make sure any interruptions to the thread working on the input format do not reach the file system.
Throws:
java.io.IOException - Thrown, if the split could not be opened due to an I/O problem.
public void open(Config cfg) throws java.io.IOException
Throws:
java.io.IOException
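The description of open() says the stream is opened in a separate thread so that interruptions of the calling thread do not reach the file system, and openTimeout bounds the wait. The sketch below illustrates that pattern with plain JDK classes. It is not the implementation of FileInputSplit.InputSplitOpenThread: the class TimedOpener, its open method, and the choice to keep waiting after an interrupt (restoring the interrupt flag afterwards) are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: opens a stream in a helper thread and waits with a timeout. */
final class TimedOpener extends Thread {

    private final Callable<InputStream> opener;
    private volatile InputStream stream;
    private volatile Throwable error;

    private TimedOpener(Callable<InputStream> opener) {
        super("split-opener");
        this.opener = opener;
        setDaemon(true); // a stuck open call must not keep the JVM alive
    }

    @Override
    public void run() {
        try {
            stream = opener.call(); // the (possibly blocking) file system call happens here
        } catch (Throwable t) {
            error = t;
        }
    }

    /** Runs the opener in a helper thread and waits at most timeoutMillis for the result. */
    static InputStream open(Callable<InputStream> opener, long timeoutMillis) throws IOException {
        TimedOpener helper = new TimedOpener(opener);
        helper.start();

        boolean interrupted = false;
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        long remainingNanos;
        while (helper.isAlive() && (remainingNanos = deadline - System.nanoTime()) > 0) {
            try {
                // join(0) would wait forever, so wait for at least one millisecond.
                helper.join(Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remainingNanos)));
            } catch (InterruptedException e) {
                // Only the waiting thread was interrupted; the helper thread is left alone.
                interrupted = true;
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt(); // restore the caller's interrupt status
        }
        if (helper.isAlive()) {
            throw new IOException("Opening the stream timed out after " + timeoutMillis + " ms");
        }
        if (helper.error != null) {
            throw new IOException("Opening the stream failed", helper.error);
        }
        return helper.stream;
    }
}
```

A caller would pass the actual open call as the Callable, for example `TimedOpener.open(() -> new java.io.FileInputStream("data.bin"), 5000L)`, where the file name is a placeholder.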