CSVJDBC - Interpreting Strings Instead of Integers in Aggregate Functions

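I am using the CSVJDBC driver to retrieve results from a CSV file. All record fields are interpreted as strings. How can I use the MAX aggregate function to get the largest integer in a column? As far as I know, csvjdbc does not support casting.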

Consider this sample file:

sequenceNumber,decimalNumber,randomInteger,email,testNumber
0,0.4868176550817932,560801,cleta.stroman@gmail.com,0.0
1,0.9889360969432277,903488,chelsie.roob@hotmail.com,1.0
2,0.8161798688893893,367870,hardy.waelchi@yahoo.com,2.0
3,0.926163166852633,588581,rafaela.white@hotmail.com,3.0
4,0.05084859872223901,563000,belle.hagenes@gmail.com,4.0
5,0.7636864392027013,375299,joey.beier@gmail.com,5.0
6,0.31433980690632457,544036,cornell.will@gmail.com,6.0
7,0.4061012200967966,41792,catalina.kemmer@gmail.com,7.0
8,0.3541002754332119,196272,raoul.bogisich@yahoo.com,8.0
9,0.4189826302561652,798405,clay.roberts@yahoo.com,9.0
10,0.9076084714059381,135783,angel.white@yahoo.com,10.0
11,0.565716974613909,865847,marlin.hoppe@gmail.com,11.0
12,0.9484076609924861,224744,anjali.stanton@gmail.com,12.0
13,0.05223710002804138,977787,harley.morar@hotmail.com,13.0
15,0.6270851001160621,469901,eldora.schmeler@yahoo.com,14.0

I use the following code snippet:

import org.relique.jdbc.csv.CsvDriver;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;


public class CSVDemo
{
    public static void main(String[] args)
    {
        try
        {
            // Load the driver.
            Class.forName("org.relique.jdbc.csv.CsvDriver");

            // Create a connection. The JDBC URL ends with the directory
            // containing the .csv files (hard-coded here).
            // A single connection is thread-safe for use by several threads.
            String CSVDIRECTORY = "/tmp/csv-directory/";
            String CSVDB = "mediumList";
            Connection conn = DriverManager.getConnection("jdbc:relique:csv:" + CSVDIRECTORY);

            // Create a Statement object to execute the query with.
            // A Statement is not thread-safe.
            Statement stmt = conn.createStatement();

            ResultSet results = stmt.executeQuery("SELECT MAX(decimalNumber) FROM " + CSVDB);

            // Dump out the results in CSV format to standard output
            // using the CsvJdbc helper function.
            boolean append = true;
            CsvDriver.writeToCsv(results, System.out, append);

            // Clean up
            conn.close();
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }
    }
}

When I execute the query, I get the expected result:

MAX([DECIMALNUMBER])
0.9889360969432277

But when I ask for the largest sequenceNumber, which is 19:

ResultSet results = stmt.executeQuery("SELECT MAX(sequenceNumber) FROM " + CSVDB);

I get 9 as the result:

MAX([SEQUENCENUMBER])
9

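It works for decimalNumber, and it also works for text. It does not work for sequenceNumber or testNumber, because csvjdbc returns the lexicographic maximum rather than the numeric one. Is there any way to solve this directly, or do I need to fetch all the records and select the maximum in Java?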
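
The effect can be reproduced without the driver. The following is a minimal sketch in plain Java; the class `LexicographicMaxDemo` and its two helper methods are made up for illustration and are not part of CsvJdbc. It takes a few sequence numbers from the sample file and computes their maximum once under string ordering and once under numeric ordering.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class LexicographicMaxDemo {

    // Maximum under String ordering: strings are compared character by
    // character, so "9" beats "10" because '9' > '1'.
    static String maxAsString(List<String> values) {
        return Collections.max(values);
    }

    // Maximum under numeric ordering: parse each value before comparing.
    // Assumes a non-empty list in which every entry is a valid int.
    static int maxAsInt(List<String> values) {
        return values.stream().mapToInt(Integer::parseInt).max().getAsInt();
    }

    public static void main(String[] args) {
        List<String> sequenceNumbers = Arrays.asList("0", "9", "10", "13", "15");
        System.out.println(maxAsString(sequenceNumbers)); // prints 9
        System.out.println(maxAsInt(sequenceNumbers));    // prints 15
    }
}
```

This also suggests why decimalNumber appears to work: every value in the sample starts with "0.", so string order and numeric order happen to agree for that column.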

Basic solution:

Here is my basic solution, which has to fetch all the numbers first:

        ResultSet results = stmt.executeQuery("SELECT sequenceNumber FROM " + CSVDB);
        int max = -1;

        // Fetch every value and keep the largest one, parsed as an integer.
        while (results.next()) {
            String raw = results.getString(1);

            int currentSeq = Integer.parseInt(raw);
            System.out.println("current_ " + raw);
            if (currentSeq > max) {
                max = currentSeq;
            }
        }

Is there a more elegant way?
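
If the values do have to be reduced on the client, the loop can at least be made a little more robust. The sketch below is only an illustration: `ClientSideMax` and `maxInt` are hypothetical names, and the list of strings stands in for the values that would be read with `results.getString(1)`. It returns an `OptionalInt` instead of a `-1` sentinel and skips fields that are not integers rather than throwing.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalInt;

class ClientSideMax {

    // Returns the largest value that parses as an int, or an empty
    // OptionalInt if the list holds no parsable value at all.
    static OptionalInt maxInt(List<String> rawValues) {
        OptionalInt max = OptionalInt.empty();
        for (String raw : rawValues) {
            if (raw == null) {
                continue; // e.g. a NULL column value
            }
            try {
                int value = Integer.parseInt(raw.trim());
                if (!max.isPresent() || value > max.getAsInt()) {
                    max = OptionalInt.of(value);
                }
            } catch (NumberFormatException e) {
                // Not an integer (for example "14.0" or an empty field): skip it.
            }
        }
        return max;
    }

    public static void main(String[] args) {
        List<String> column = Arrays.asList("0", "9", "10", "15", "n/a");
        System.out.println(maxInt(column)); // prints OptionalInt[15]
    }
}
```

This still transfers every row to the client, so it does not replace doing the aggregation in the query itself.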

Solution based on Joop Eggen's answer:

public int getMaxSequenceAggregate() {
    int max = 0;

    // Declare the column types so that sequenceNumber is compared numerically.
    Properties props = new Properties();
    props.put("columnTypes", "Int,Double,Int,String,Int");

    String query = "SELECT MAX(sequenceNumber) FROM " + this.filePath;

    // try-with-resources closes the result set, statement and connection,
    // even if the query throws.
    try (Connection connection = DriverManager.getConnection("jdbc:relique:csv:" + this.directoryPath, props);
         PreparedStatement statement = connection.prepareStatement(query);
         ResultSet result = statement.executeQuery()) {

        while (result.next()) {
            max = result.getInt(1);
            LOGGER.info("maximum sequence: " + max);
        }
    } catch (SQLException e) {
        e.printStackTrace();
    }

    return max;
}

You should specify the column types, because the first column is apparently being treated as a string, and as strings "9" > "10".

Properties props = new Properties();
props.put("columnTypes", "Integer,Double,Integer,String,Integer");
Connection conn = DriverManager.getConnection("jdbc:relique:csv:" + CSVDIRECTORY, props);

The following is from the CSV/JDBC documentation:

If columnTypes is set to an empty string then column types are inferred from the data.

I would guess this is desirable in most use cases. So Joop Eggen's example can be simplified to:

Properties props = new Properties();
props.put("columnTypes", "");
Connection conn = DriverManager.getConnection("jdbc:relique:csv:" + CSVDIRECTORY, props);

I tried it, and it shows dynamic type detection similar to that of other JDBC drivers. I wonder why this is not the default.
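
To give a rough idea of what inferring a column type from the data can mean, here is a hypothetical sketch. It is not CsvJdbc's actual inference logic, which may differ in the types it considers and in how it samples rows; `ColumnTypeGuess` and its methods are invented for this illustration. It simply picks the narrowest of Integer, Double and String that fits every value in a column.

```java
import java.util.Arrays;
import java.util.List;

class ColumnTypeGuess {

    static boolean isInt(String s) {
        try {
            Integer.parseInt(s.trim());
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // "Double" here means whatever Double.parseDouble accepts.
    static boolean isDouble(String s) {
        try {
            Double.parseDouble(s.trim());
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Picks the narrowest of Integer, Double, String that fits every value.
    // An empty column falls back to String.
    static String guessType(List<String> values) {
        if (values.isEmpty()) {
            return "String";
        }
        if (values.stream().allMatch(ColumnTypeGuess::isInt)) {
            return "Integer";
        }
        if (values.stream().allMatch(ColumnTypeGuess::isDouble)) {
            return "Double";
        }
        return "String";
    }

    public static void main(String[] args) {
        // A few values per column, taken from the sample file.
        System.out.println(guessType(Arrays.asList("0", "9", "15")));        // prints Integer
        System.out.println(guessType(Arrays.asList("0.0", "9.0", "14.0")));  // prints Double
        System.out.println(guessType(Arrays.asList("cleta.stroman@gmail.com",
                                                   "joey.beier@gmail.com"))); // prints String
    }
}
```

Under this kind of rule, sequenceNumber would be read as an integer, so MAX would compare numbers rather than strings.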