JDBC Sink Configuration Options batch.size
Specifies how many records to attempt to batch together for insertion into the destination table, when possible.
Type: int
Default: 3000
Valid Values: [0,…]
Importance: medium
So, this is from the Confluent site.
The importance is medium and the default is 3000. What if I want the Kafka
changes flushed every 30 seconds even if there are, say, only 27 Kafka messages
on the topic? What is the default setting under which processing occurs on an
elapsed-time basis? This clearly is catered for, since we can run many examples passing as little as a single record from, say, MySQL to SQL Server, but I cannot find the parameter for time-based processing. Can I influence it?
https://github.com/confluentinc/kafka-connect-jdbc/issues/290 has also noticed this. There is some interesting discussion there.
I think you should focus on the words "when possible".
consumer.max.poll.records
controls how many records are fetched from Kafka in one poll. Once the poll completes, the JDBC sink builds as many batches as needed, up to batch.size each, until the next consumer poll is invoked within consumer.max.poll.interval.ms.
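As a sketch of how these knobs fit together: the consumer settings can be tuned per connector via the `consumer.override.` prefix (available when the worker sets `connector.client.config.override.policy=All`). The connector name, topic, and connection URL below are placeholders, and the 30-second wait is an illustration, not a guaranteed flush interval — `fetch.max.wait.ms` only delays a poll while `fetch.min.bytes` is unmet; once records arrive, the sink writes them on the next put.

```properties
# Hypothetical JDBC sink config; name, topic, and URL are placeholders.
name=my-jdbc-sink
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
topics=orders
connection.url=jdbc:sqlserver://localhost:1433;databaseName=dest

# Upper bound on records per INSERT batch -- "when possible",
# i.e. never more than one poll's worth of records.
batch.size=3000

# Per-connector consumer overrides (worker must allow overrides via
# connector.client.config.override.policy=All).
# Cap how many records a single poll can return:
consumer.override.max.poll.records=500
# Make the broker hold a fetch open up to 30 s while fewer than
# fetch.min.bytes of data are available -- the closest thing to a
# time-based knob on the consume side:
consumer.override.fetch.max.wait.ms=30000
consumer.override.fetch.min.bytes=1048576
```

With these overrides, a sparse topic (the 27-message case from the question) is still delivered after at most 30 seconds, because the broker answers the fetch when `fetch.max.wait.ms` expires even if `fetch.min.bytes` was never reached.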