ClickHouse: import a CSV file as a dictionary

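I'm a new ClickHouse user, and I'm trying to set up a dictionary that maps two-character airline codes to airline names, for use with the ontime database I created from the example data here: https://clickhouse.com/docs/en/getting-started/example-datasets/ontime/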

I then manually created a CSV file with these contents:

id,code,name
1,UA,United Airlines
2,HA,Hawaiian Airlines
3,OO,SkyWest
4,B6,Jetblue Airway
5,QX,Horizon Air
6,YX,Republic Airway
7,G4,Allegiant Air
8,EV,ExpressJet Airlines
9,YV,Mesa Airlines
10,WN,Southwest Airlines
11,OH,PSA Airlines
12,MQ,Envoy Air
13,9E,Endeavor Air
14,NK,Spirit Airlines
15,AA,American Airlines
16,DL,Delta Air Lines
17,AS,Alaska Airlines
18,F9,Frontier Airlines

Then I created the dictionary:

CREATE DICTIONARY airlinecompany
(
    id UInt64, 
    code String,
    company String

)
PRIMARY KEY id 
SOURCE(FILE(path '/var/lib/clickhouse/user_files/airlinenames.csv' format 'CSV'))
LAYOUT(FLAT())
LIFETIME(3600)

I can see that the dictionary was created:

┌─name───────────┐
│ airlinecompany │
│ ontime         │
└────────────────┘

But when I try to list its contents, I get this error:

Received exception from server (version 22.3.3):
Code: 27. DB::Exception: Received from localhost:9000. DB::Exception: Cannot parse input: expected ',' before: 'id,code,name\r\n1,UA,United Airlines\r\n2,HA,Hawaiian Airlines\r\n3,OO,SkyWest\r\n4,B6,Jetblue Airway\r\n5,QX,Horizon Air\r\n6,YX,Republic Airway\r\n7,G4,Allegiant Air\r\n8,EV,':
Row 1:
Column 0,   name: id,      type: UInt64, ERROR: text "id,code,na" is not like UInt64

: While executing CSVRowInputFormat. (CANNOT_PARSE_INPUT_ASSERTION_FAILED)

But I don't think the CSV starts with a , before id. Am I missing something in the CREATE statement, or do I need to generate the CSV in some particular way?

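*** Edit with the corrected statement: there were two main things I had been doing wrong

- The layout needs to be COMPLEX_KEY_HASHED()

- The primary key should be code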

CREATE DICTIONARY airlinecompany
(
    id UInt64, 
    code String,
    company String

)
PRIMARY KEY code
SOURCE(FILE(path '/var/lib/clickhouse/user_files/airlinenames.csv' format 'CSVWithNames'))
LAYOUT(COMPLEX_KEY_HASHED())
LIFETIME(3600)
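
With code as the key, a lookup might look like the sketch below. It assumes the dictionary loads successfully and lives in the current database (otherwise qualify it with the database name). The ontime column name Reporting_Airline is an assumption and may differ in your schema. Note also that the CSV header calls the third column name while the dictionary calls it company; whether that mismatch matters depends on the server's header-matching settings (for example input_format_with_names_use_header), which is not verified here.

```sql
-- Look up one airline name by its two-character code.
-- A COMPLEX_KEY_HASHED dictionary takes its key as a tuple.
SELECT dictGet('airlinecompany', 'company', tuple('UA')) AS airline;

-- Hypothetical use against the ontime table: count flights per airline name.
-- Reporting_Airline is an assumed column name; adjust it to your table.
SELECT
    Reporting_Airline AS code,
    dictGetOrDefault('airlinecompany', 'company', tuple(Reporting_Airline), 'unknown') AS airline,
    count() AS flights
FROM ontime
GROUP BY code, airline
ORDER BY flights DESC
LIMIT 10;
```

dictGetOrDefault is used in the second query so that codes missing from the CSV fall back to 'unknown' instead of an empty string.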

The CSV file contains a header row, so you need to use the CSVWithNames format instead of CSV:

CREATE DICTIONARY airlinecompany
(
    id UInt64, 
    code String,
    company String

)
PRIMARY KEY id 
SOURCE(FILE(path '/var/lib/clickhouse/user_files/airlinenames.csv' format 'CSVWithNames'))
LAYOUT(FLAT())
LIFETIME(3600)
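
To check that a definition like this one actually loads, something along these lines may help. It is a minimal sketch that assumes the dictionary is in the current database and keeps the FLAT layout with the UInt64 id key.

```sql
-- Force a reload so the file is read again right away.
SYSTEM RELOAD DICTIONARY airlinecompany;

-- Inspect the load state; last_exception holds the parse error if loading failed.
SELECT name, status, element_count, last_exception
FROM system.dictionaries
WHERE name = 'airlinecompany';

-- List the dictionary contents, as attempted in the question.
SELECT * FROM airlinecompany;

-- With a FLAT layout the key is a plain UInt64 value.
SELECT dictGet('airlinecompany', 'code', toUInt64(1)) AS code;
```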