Unable to load CSV file through Logstash
I'm new to the ELK stack, and I'm trying to load a locally stored .csv file through Logstash so that I can use it with Elasticsearch.
The Logstash configuration file looks like this:
input {
  file {
    path => "C:\ELK-Stack\Cars Data Set\cars.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    separator => ","
    columns => ["maker","model","mileage","manufacture-year","engine_displacement","engine_power","body_type","color_slug","stk_year","transmission","door_count","seat_count","fuel_type","date_created","date_last_seen","price_eur"]
  }
  mutate { convert => ["mileage", "integer"] }
  mutate { convert => ["price_eur", "float"] }
  mutate { convert => ["door_count", "integer"] }
  mutate { convert => ["engine_power", "integer"] }
  mutate { convert => ["seat_count", "integer"] }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]}
  index => "cars"
  document_type => "sold_cars"
  }
  stdout {}
}
The file path is: C:\ELK-Stack\Cars Data Set\cars.csv
I get the output shown below:
The .csv file has over a million rows. Any help would be appreciated.
Edit:
I'm now working with another dataset and can't load it through Logstash either.
input {
  file {
    path => "C:\ELK-Stack1.csv"
    start_position => "beginning"
    sincedb_path => "NUL"
  }
}
filter {
  csv {
    separator => ","
    columns => ["Unique Key","Created Date","Closed Date","Agency","Agency Name","Complaint Type","Descriptor","Location Type","Incident Zip","Incident Address","Street Name","Cross Street 1","Cross Street 2","Intersection Street 1","Intersection Street 2","Address Type","City","Landmark","Facility Type","Status","Due Date","Resolution Description","Resolution Action Updated Date","Community Board","BBL","Borough","X Coordinate (State Plane)","Y Coordinate (State Plane)","Open Data Channel Type","Park Facility Name","Park Borough","Vehicle Type","Taxi Company Borough","Taxi Pick Up Location","Bridge Highway Name","Bridge Highway Segment","Latitude","Longitude","Location"]
  }
  mutate { convert => ["Unique Key", "integer"] }
  mutate { convert => ["Created Date", "timestamp"] }
  mutate { convert => ["Closed Date", "timestamp"] }
  mutate { convert => ["Due Date", "timestamp"] }
  mutate { convert => ["Resolution Action Updated Date", "timestamp"] }
  mutate { convert => ["X Coordinate (State Plane)", "integer"] }
  mutate { convert => ["X Coordinate (State Plane)", "integer"] }
  mutate { convert => ["Latitude", "integer"] }
  mutate { convert => ["Longitude", "integer"] }
  mutate { convert => ["Location", "integer"] }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "311"
  }
  stdout {}
}
Any ideas what's wrong?
Your configuration has two errors. The first is a typo in the output block: a misplaced closing curly brace on the hosts line, which is described in the error log:
exception => "LogStash::ConfigurationError"
The offending line is: hosts => ["localhost:9200"]}
Here is the fixed configuration:
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "cars"
  }
  stdout {}
}
Also, since you are running Logstash 7.5: the document_type option was removed in version 7.0, so you should drop it from your output.
The second error is in your input block. You should use forward slashes in paths even when running on Windows, and a sincedb_path of /dev/null is a Linux/macOS setting; on Windows you should use NUL instead.
Here is the correct configuration:
input {
  file {
    path => "C:/ELK-Stack/Cars Data Set/cars.csv"
    start_position => "beginning"
    sincedb_path => "NUL"
  }
}
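Putting both fixes together, the complete pipeline for the cars dataset would look roughly like this (a sketch assembled from the config in your question, with the brace typo, the document_type option, the path separators, and the sincedb_path all corrected):

```conf
input {
  file {
    # Forward slashes work on Windows; sincedb_path => "NUL" is the
    # Windows equivalent of /dev/null (no sincedb file is kept).
    path => "C:/ELK-Stack/Cars Data Set/cars.csv"
    start_position => "beginning"
    sincedb_path => "NUL"
  }
}
filter {
  csv {
    separator => ","
    columns => ["maker","model","mileage","manufacture-year","engine_displacement","engine_power","body_type","color_slug","stk_year","transmission","door_count","seat_count","fuel_type","date_created","date_last_seen","price_eur"]
  }
  mutate { convert => ["mileage", "integer"] }
  mutate { convert => ["price_eur", "float"] }
  mutate { convert => ["door_count", "integer"] }
  mutate { convert => ["engine_power", "integer"] }
  mutate { convert => ["seat_count", "integer"] }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "cars"
    # document_type removed: the option is gone in Logstash 7.x.
  }
  stdout {}
}
```

You can check the syntax before running with `bin/logstash -f your-config.conf --config.test_and_exit`.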