Utilizing CSV file utilities
Comma-separated values (CSV) files are a very common data format. EDA provides the CSVUtil class, in com.neptuny.cpit.etl.util, which contains useful methods for processing such files.
The safeSplit method
This method splits a line of a CSV file into fields. It treats the double quote (") as a string delimiter, so a separator that appears inside a quoted value is not treated as a field boundary.
To see why this is useful, consider the following line, in which the second quoted field itself contains commas:
"first","12,123,23","a"
If you use the safeSplit method, the line is split into three fields as expected, rather than the five pieces that a plain split on the comma would produce. The following code example shows the call:
ArrayList<String> fields = CSVUtil.safeSplit(",", line);
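To make the quote handling concrete, the following is a minimal sketch of a quote-aware split. It is not the EDA implementation: the class name QuoteAwareSplitSketch and its split method are hypothetical, and the choice to keep the surrounding quotes in each field is an assumption made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; this is not the EDA CSVUtil source.
final class QuoteAwareSplitSketch {

    // Splits the line on the separator, ignoring separators between double quotes.
    static List<String> split(char separator, String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;   // entering or leaving a quoted string
                current.append(c);      // assumption: quotes are kept in the field
            } else if (c == separator && !inQuotes) {
                fields.add(current.toString());  // field boundary outside quotes
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString()); // the last field has no trailing separator
        return fields;
    }

    public static void main(String[] args) {
        String line = "\"first\",\"12,123,23\",\"a\"";
        System.out.println(split(',', line).size()); // prints 3
        System.out.println(line.split(",").length);  // prints 5 (plain split)
    }
}
```

The sketch tracks only whether the current position is inside a quoted string. It does not handle escaped quotes, which a production parser would need to consider.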
The detectSeparator method
This method automatically detects the field separator used in a line of a CSV file.
The following code example illustrates the usage of this method:
String sep = CSVUtil.detectSeparator(line);
The method finds the separator heuristically: it assumes that a character in the line that is not a word character, a number, a parenthesis, or a quoting delimiter is probably the field separator. Because this is a heuristic, the result can be wrong for lines that contain other punctuation, such as spaces or hyphens inside unquoted values.
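The heuristic can be sketched as follows. This is an illustrative approximation, not the EDA implementation: the class name SeparatorHeuristicSketch and its detect method are hypothetical, treating the single quote as a quoting delimiter is an assumption, and so is returning the first candidate found, since the text does not say how the real method chooses among several candidates.

```java
// Hypothetical sketch for illustration only; this is not the EDA CSVUtil source.
final class SeparatorHeuristicSketch {

    // Returns the first character that is not a letter, digit, underscore,
    // parenthesis, or quote. Returns null when the line has no such character.
    static String detect(String line) {
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            boolean wordOrNumber = Character.isLetterOrDigit(c) || c == '_';
            boolean parenthesis = c == '(' || c == ')';
            boolean quote = c == '"' || c == '\''; // assumption: both quote styles
            if (!wordOrNumber && !parenthesis && !quote) {
                return String.valueOf(c); // first candidate wins in this sketch
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(detect("\"first\",\"12,123,23\",\"a\"")); // prints ,
        System.out.println(detect("host;cpu(pct);42"));              // prints ;
    }
}
```

Under these assumptions, a line such as first name;last name would yield a space, not the semicolon, which shows why the detected separator is worth checking against a known sample line.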