List only the row keys in HBase after a specific timestamp from the shell

Question

How can I list only the row keys (not the values or timestamps) in HBase after a specific timestamp, from the shell?

Answer 1

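Timestamps are bound to columns (cells), not to rows. So if you filter by timestamp, you will only get back some of the columns of a row.

If you have a table t1: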

ROW             COLUMN+CELL
ID1             column=d:actif, timestamp=25, value=false
ID1             column=d:name, timestamp=22, value="Sudipto"
ID1             column=m:lastMaj, timestamp=25, value=25
ID2             column=d:actif, timestamp=24, value=false
ID2             column=m:lastMaj, timestamp=24, value=24

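You can filter on the timestamp with:

scan 't1', { TIMERANGE => [23, 26] }

TIMERANGE is half-open: the lower bound is inclusive and the upper bound is exclusive, so this matches timestamps 23 through 25. But you will only get back:

ROW             COLUMN+CELL
ID1             column=d:actif, timestamp=25, value=false
ID1             column=m:lastMaj, timestamp=25, value=25
ID2             column=d:actif, timestamp=24, value=false
ID2             column=m:lastMaj, timestamp=24, value=24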

So you lose the column:

ROW             COLUMN+CELL
ID1             column=d:name, timestamp=22, value="Sudipto"
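
To make the effect concrete, here is a small plain-Ruby model of a time-range filter applied per cell. It does not talk to HBase: the helper names cells_in_time_range and row_keys are hypothetical, the data simply mirrors table t1, and the range 23 to 26 is an example value chosen for illustration.

```ruby
# Each cell is modeled as a hash; the data mirrors table t1.
CELLS = [
  { row: 'ID1', column: 'd:actif',   timestamp: 25, value: 'false' },
  { row: 'ID1', column: 'd:name',    timestamp: 22, value: 'Sudipto' },
  { row: 'ID1', column: 'm:lastMaj', timestamp: 25, value: '25' },
  { row: 'ID2', column: 'd:actif',   timestamp: 24, value: 'false' },
  { row: 'ID2', column: 'm:lastMaj', timestamp: 24, value: '24' }
].freeze

# Hypothetical helper: keeps cells with min_ts <= timestamp < max_ts,
# i.e. a half-open range (lower bound inclusive, upper bound exclusive).
def cells_in_time_range(cells, min_ts, max_ts)
  cells.select { |c| c[:timestamp] >= min_ts && c[:timestamp] < max_ts }
end

# Hypothetical helper: distinct row keys of the given cells, in order.
def row_keys(cells)
  cells.map { |c| c[:row] }.uniq
end

matched = cells_in_time_range(CELLS, 23, 26)
puts row_keys(matched).inspect
puts matched.map { |c| "#{c[:row]} #{c[:column]}" }.inspect
```

Both row keys survive the filter, but the d:name cell of ID1 (timestamp 22) is not among the matched cells, because the filter is applied cell by cell and not row by row.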

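However, if you want all the columns, there is a solution. You can filter on a metadata column (here m:lastMaj). Every time you modify a column of the row, you must also update this lastMaj value.

Here, when I modified "d:actif", I also modified "m:lastMaj" (both have timestamp 25).
When I want all the rows changed after a specific timestamp, I filter only on the timestamp stored in "m:lastMaj".

In the shell, the command to scan by value could be: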

import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
import org.apache.hadoop.hbase.util.Bytes

# Keep rows whose m:lastMaj value is greater than '25'.
# Values are compared as raw bytes, so this only orders correctly
# when the stored values have the same number of digits.
scan 't1', { FILTER =>
    SingleColumnValueFilter.new(
        Bytes.toBytes('m'),
        Bytes.toBytes('lastMaj'),
        CompareFilter::CompareOp.valueOf('GREATER'),
        Bytes.toBytes('25'))
}
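
As a rough illustration of the metadata approach, the following plain-Ruby sketch models rows as hashes and returns only the row keys whose m:lastMaj value is above a threshold. It does not talk to HBase; the helper name row_keys_modified_after is hypothetical, and comparing the values as integers is an assumption made here to sidestep the byte-wise ordering issue shown in the last lines.

```ruby
# Rows modeled as row key => { column => value }; data mirrors table t1.
ROWS = {
  'ID1' => { 'd:actif' => 'false', 'd:name' => 'Sudipto', 'm:lastMaj' => '25' },
  'ID2' => { 'd:actif' => 'false', 'm:lastMaj' => '24' }
}.freeze

# Hypothetical helper: row keys whose m:lastMaj value is strictly greater
# than the threshold. Values are compared as integers (an assumption).
# Rows without the column are skipped here; the real SingleColumnValueFilter
# emits such rows unless setFilterIfMissing(true) is set.
def row_keys_modified_after(rows, threshold)
  rows.select do |_key, cols|
    cols.key?('m:lastMaj') && cols['m:lastMaj'].to_i > threshold
  end.keys
end

puts row_keys_modified_after(ROWS, 24).inspect

# Why byte-wise comparison can surprise: strings compare character by
# character, so '100' sorts before '25' even though 100 > 25.
bytewise = '100' > '25'
numeric  = 100 > 25
puts bytewise.inspect
puts numeric.inspect
```

In the HBase shell itself, one way to drop the values from the result is to add a KeyOnlyFilter to the scan, for example FILTER => "KeyOnlyFilter()"; the shell still prints a column and timestamp for each cell, so the row keys may need to be extracted from the output afterwards.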

hadoop

hbase