1) Writing splits:
Splits are written to a split file, which is eventually persisted in HDFS, and are re-initialized from that file on the server side. This protocol was adopted because custom input format implementations come with their own record readers and input split classes. To stay generic and support all such custom implementations, we serialize the split metadata.
2) HadoopReadOperatorDescriptor: revised to remove its dependency on the base class, which is now redundant.
3) DatatypeHelper: slight modification so that the method hashMap2jobConf() uses Map instead of HashMap.
4) HDFSWriteOperatorDescriptor: revised with simpler constructors that do not require the client to configure output splits.
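
The split-file round trip described in item 1 can be sketched roughly as follows. This is an illustration only, not the code in this commit: SplitFileSketch, SerializableSplit, and RangeSplit are hypothetical names, the interface stands in for Hadoop's Writable contract, the class-name-then-fields layout is an assumption, and a local file is used where the real code would write to HDFS.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitFileSketch {
    /** Hypothetical stand-in for Hadoop's Writable: a split writes and reads its own fields. */
    public interface SerializableSplit {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    /** Hypothetical custom split; real input formats define their own split classes. */
    public static class RangeSplit implements SerializableSplit {
        public String path;
        public long start;
        public long length;

        public RangeSplit() {} // no-arg constructor needed for reflective re-creation

        public RangeSplit(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeUTF(path);
            out.writeLong(start);
            out.writeLong(length);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            path = in.readUTF();
            start = in.readLong();
            length = in.readLong();
        }
    }

    /** Client side: write the split count, then each split's class name followed by its fields. */
    public static void writeSplits(File file, List<? extends SerializableSplit> splits) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(file)))) {
            out.writeInt(splits.size());
            for (SerializableSplit split : splits) {
                out.writeUTF(split.getClass().getName());
                split.write(out);
            }
        }
    }

    /** Server side: re-create each split from its class name, then let it read its own fields. */
    public static List<SerializableSplit> readSplits(File file)
            throws IOException, ReflectiveOperationException {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream(file)))) {
            int count = in.readInt();
            List<SerializableSplit> splits = new ArrayList<>(count);
            for (int i = 0; i < count; i++) {
                String className = in.readUTF();
                SerializableSplit split = (SerializableSplit)
                        Class.forName(className).getDeclaredConstructor().newInstance();
                split.readFields(in);
                splits.add(split);
            }
            return splits;
        }
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("splits", ".bin");
        file.deleteOnExit();
        writeSplits(file, Arrays.asList(new RangeSplit("/data/part-0", 0L, 64L)));
        System.out.println(readSplits(file).size() + " split(s) restored");
    }
}
```

Storing the class name next to the serialized fields is what keeps the reader generic: it never needs to know the concrete split type in advance, only that the class has a no-arg constructor and can read its own fields.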
/home/raman/research/work/hyracks-trunk/hyracks/hyracks/hyracks-dataflow-hadoop
git-svn-id: https://hyracks.googlecode.com/svn/trunk@135 123451ca-8445-de46-9d55-352943316053
8 files changed
tree: e49b8ff25b5794209007c5e7438feb3919ed940d