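First, a brief note on the difference between CSV and TSV files:
TSV stands for tab-separated values: fields are separated by tab characters. For the format, see: http://en.wikipedia.org/wiki/Tab-separated_values
CSV stands for comma-separated values: fields are separated by commas. For the format, see: http://en.wikipedia.org/wiki/Comma-separated_values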
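The project needs to tidy the data in an existing TSV file into a new TSV file that is easier to use (by adding a few columns), which means both reading and writing TSV. This would be simple enough to implement by hand, but a ready-made library, Super CSV, already exists, so I decided to try it. Official site: http://supercsv.sourceforge.net/index.html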
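For comparison, here is a minimal sketch of what the hand-rolled version of "add a column to a TSV file" might look like without Super CSV. The names `TsvColumnAppender`, `appendColumn`, and the `fullName` column are hypothetical, invented for illustration. The sketch assumes that fields never contain tabs, quotes, or line breaks, which is the situation where a library starts to pay off.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Hand-rolled sketch: append one derived column to every row of a TSV file. */
final class TsvColumnAppender {

    /**
     * Returns a copy of the lines with one extra column appended.
     * The first line is treated as the header row.
     * Assumption: no field contains a tab, a quote, or a line break.
     */
    static List<String> appendColumn(List<String> lines, String newHeader,
                                     Function<String[], String> valueFn) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            // limit -1 keeps trailing empty fields, which split() drops by default
            String[] fields = line.split("\t", -1);
            String extra = (i == 0) ? newHeader : valueFn.apply(fields);
            out.add(line + "\t" + extra);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "firstName\tlastName",
                "John\tDunbar",
                "Bob\tDown");
        // Hypothetical derived column: first and last name joined by a space
        List<String> result = appendColumn(input, "fullName", f -> f[0] + " " + f[1]);
        result.forEach(System.out::println);
    }
}
```

For real files, the lines could be loaded with `java.nio.file.Files.readAllLines` and saved with `Files.write`; the sketch works on in-memory lists to keep it short.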
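The documentation is clear and easy to follow, and there are plenty of examples online of parsing CSV files with Super CSV. Given how little TSV and CSV differ, a single set of code should be able to handle both, with only the delimiter swapped. Super CSV does exactly that. Here is the example from the official site first: http://supercsv.sourceforge.net/examples_reading.html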
The CSV file to parse:
<pre>
customerNo,firstName,lastName,birthDate,mailingAddress,married,numberOfKids,favouriteQuote,email,loyaltyPoints
1,John,Dunbar,13/06/1945,"1600 Amphitheatre Parkway Mountain View, CA 94043 United States",,,"""May the Force be with you."" - Star Wars",jdunbar@gmail.com,0
2,Bob,Down,25/02/1919,"1601 Willow Rd. Menlo Park, CA 94025 United States",Y,0,"""Frankly, my dear, I don't give a damn."" - Gone With The Wind",bobdown@hotmail.com,123456
3,Alice,Wunderland,08/08/1985,"One Microsoft Way Redmond, WA 98052-6399 United States",Y,0,"""Play it, Sam. Play ""As Time Goes By."""" - Casablanca",throughthelookingglass@yahoo.com,2255887799
4,Bill,Jobs,10/07/1973,"2701 San Tomas Expressway Santa Clara, CA 95050 United States",Y,3,"""You've got to ask yourself one question: ""Do I feel lucky?"" Well, do ya, punk?"" - Dirty Harry",billy34@hotmail.com,36
</pre>
Code that parses it using the MapReader approach:
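<pre class="brush:java">
/**
 * An example of reading using CsvMapReader.
 */
private static void readWithCsvMapReader() throws Exception {

    ICsvMapReader mapReader = null;
    // ... (rest of the method omitted in this excerpt; see the official example linked above)
}

// ... (a later fragment of the same example)
final String emailRegex = "[a-z0-9._]+@[a-z0-9.]+"; // just an example, not very robust!
</pre>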
The sample code could hardly be clearer. Only one point needs explaining: the delimiter is set through CsvPreference.STANDARD_PREFERENCE. To parse a TSV file, simply replace it with CsvPreference.TAB_PREFERENCE.
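To illustrate why swapping only the delimiter is enough, here is a small hand-written record parser. It is not Super CSV's implementation, just a sketch of the general idea; `DelimitedLineParser` and `parse` are hypothetical names. It assumes the caller passes one complete record, and it treats a doubled quote inside a quoted field as a literal quote, as in the sample file.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of a quote-aware parser where the delimiter is just a parameter. */
final class DelimitedLineParser {

    /** Splits one record into fields using the given delimiter and quote character. */
    static List<String> parse(String record, char delimiter, char quote) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (inQuotes) {
                if (c != quote) {
                    current.append(c);           // delimiters inside quotes are plain text
                } else if (i + 1 < record.length() && record.charAt(i + 1) == quote) {
                    current.append(quote);       // doubled quote means one literal quote
                    i++;
                } else {
                    inQuotes = false;            // closing quote
                }
            } else if (c == quote) {
                inQuotes = true;                 // opening quote
            } else if (c == delimiter) {
                fields.add(current.toString());  // end of field
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());          // last field
        return fields;
    }

    public static void main(String[] args) {
        String csv = "1,John,\"Mountain View, CA 94043\",\"\"\"May the Force be with you.\"\" - Star Wars\"";
        String tsv = "1\tJohn\tMountain View, CA 94043\t\"\"\"May the Force be with you.\"\" - Star Wars\"";
        // Same code path for both formats; only the delimiter argument differs
        System.out.println(parse(csv, ',', '"'));
        System.out.println(parse(tsv, '\t', '"'));
    }
}
```

Both calls in `main` go through the same loop, which mirrors the point made above: the format difference is confined to one configuration value.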
Here is the relevant source:
<pre class="brush:java">
/**
 * Ready to use configuration that should cover 99% of all usages.
 */
public static final CsvPreference STANDARD_PREFERENCE = new CsvPreference.Builder('"', ',', "\r\n").build();

// ... (excerpt ends here)
</pre>
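The three values passed to the builder above are a quote character, a delimiter, and the end-of-line symbols. As a rough illustration of how such a triple might drive output, here is a hand-written row formatter. It is a sketch under stated assumptions, not Super CSV's writer: `DelimitedRowFormatter` and `format` are hypothetical names, and the quoting rule (quote a field only when it contains the delimiter, the quote character, or a line break) is the common convention rather than something taken from the library source.

```java
/** Sketch of a row formatter driven by quote character, delimiter, and end-of-line symbols. */
final class DelimitedRowFormatter {

    /** Joins the fields into one row, quoting only the fields that need it. */
    static String format(String[] fields, char quote, char delimiter, String endOfLine) {
        StringBuilder row = new StringBuilder();
        String q = String.valueOf(quote);
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                row.append(delimiter);
            }
            String field = fields[i];
            boolean needsQuoting = field.indexOf(delimiter) >= 0
                    || field.indexOf(quote) >= 0
                    || field.indexOf('\n') >= 0
                    || field.indexOf('\r') >= 0;
            if (needsQuoting) {
                // Escape embedded quotes by doubling them, then wrap the field in quotes
                row.append(quote).append(field.replace(q, q + q)).append(quote);
            } else {
                row.append(field);
            }
        }
        return row.append(endOfLine).toString();
    }

    public static void main(String[] args) {
        String[] fields = {"Bob", "Menlo Park, CA", "say \"hi\""};
        // Comma-separated with CRLF line endings, like the standard preference shown above
        System.out.print(format(fields, '"', ',', "\r\n"));
        // Tab-separated: the comma in the address no longer forces quoting
        System.out.print(format(fields, '"', '\t', "\n"));
    }
}
```

The parameter order follows the builder call in the excerpt so the two are easy to compare side by side.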