[ASTERIXDB-2713][EXT] CSV & TSV support for external dataset p3
- user model changes: no
- storage format changes: no
- interface changes: yes
  IRecordDataParser, IRecordReader, IRecordConverter
Details:
- record parser:
  - delimited-data (CSV/TSV) parser: ignore invalid records and issue a warning.
  - other parsers: continue to use their existing behaviour.
- stream parser:
  continues to use its existing behaviour.
- fixes:
  - fixed S3 stream read() to advance correctly to the next file and to
    notify consumers of the new file so that per-file properties such as
    the header are handled properly.
  - fixed localfs stream read() at the end of the current file, including
    notification of a new file source.
  - extracted the read() of both streams into shared code since they are
    now identical.
- report file, record number and field number in parser warnings
- propagate the stream name to parsers that need to report it
- add test cases
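
Illustration only: the "ignore and warn" behaviour described above for the
delimited-data parser can be sketched as follows. This is a minimal Python
sketch, not the code in this change; parse_delimited, source_name and
int_fields are hypothetical names, and the warning text is an assumed format
that simply carries the file, record number and field number.

```python
import csv
import io


def parse_delimited(text, source_name, int_fields=(), delimiter=","):
    """Parse delimited text whose first line is a header.

    Returns (records, warnings). An invalid record is skipped and described
    in a warning instead of aborting the whole parse.
    NOTE: hypothetical helper for illustration, not an AsterixDB API.
    """
    records, warnings = [], []
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader, None)  # the first line names the fields
    if header is None:
        return records, warnings

    for record_number, row in enumerate(reader, start=1):
        # A record with the wrong number of fields is invalid as a whole.
        if len(row) != len(header):
            warnings.append(
                f"{source_name}: record {record_number}: "
                f"expected {len(header)} fields, found {len(row)}"
            )
            continue

        values = list(row)
        bad_field = None
        for index in int_fields:
            try:
                values[index] = int(row[index])
            except ValueError:
                bad_field = index + 1  # report 1-based field numbers
                break

        if bad_field is not None:
            warnings.append(
                f"{source_name}: record {record_number}, field {bad_field}: "
                f"cannot parse {row[bad_field - 1]!r} as an integer"
            )
            continue

        records.append(dict(zip(header, values)))
    return records, warnings
```

Called with a header line followed by one valid record, one record with a
non-numeric value in an integer field, and one record with too few fields,
the sketch returns the single valid record plus two warnings, each naming
the source and the record number.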
Change-Id: Ie1ba545d753d8afef9cef4e290e058019a465201
Reviewed-on: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/5926
Reviewed-by: Ali Alsuliman <ali.al.solaiman@gmail.com>
Reviewed-by: Murtadha Hubail <mhubail@apache.org>
Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>
Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>
diff --git a/asterixdb/asterix-app/data/csv/header/h_mul_rec.csv b/asterixdb/asterix-app/data/csv/header/h_mul_rec.csv
new file mode 100644
index 0000000..23d0bcd
--- /dev/null
+++ b/asterixdb/asterix-app/data/csv/header/h_mul_rec.csv
@@ -0,0 +1,4 @@
+f1,f2,f3,f4
+1,2,3,"str"
+4,5,6,"rts"
+7,8,9,"srt"
\ No newline at end of file