Pulling information out of a string in a shell script
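I can't manage to pull the information I need out of a string in a shell script. I have read up on awk and sed and tried to come up with the right command, but I just can't figure it out. I hope you can help.
Suppose I have a string like this: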
["ids":2817262,"isvalid":true,"name":"somename","hasproperty":false,"ids":2262,"isvalid":false,"name":"somename","hasproperty":false,"ids":28182,"isvalid":true,"name":"somename","hasproperty":true]
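What I want to do is extract each of these properties into a separate array of strings. For example:
I want an array of ids: 2817262 2262 28182
an array of names: somename somename somename
an array of hasproperty values: false false true
Can anyone help me come up with the command I need? Also keep in mind that the string could be much longer than this, so it would help if the solution is not specific to 3 cases. Thanks a lot.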
You can use grep.
grep -oP '"ids":\K\d+' file
Example:
$ echo '["ids":2817262,"isvalid":true,"name":"somename","hasproperty":false,"ids":2262,"isvalid":false,"name":"somename","hasproperty":false,"ids":28182,"isvalid":true,"name":"somename","hasproperty":true]' | grep -oP '"ids":\K\d+'
2817262
2262
28182
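The question asks for arrays, while the grep command prints one value per line. The following is a minimal sketch of loading those lines into a bash array. It swaps in plain `grep -o` plus `cut` for `grep -oP`, on the assumption that PCRE support may be missing on some systems; the function name `values_of` is hypothetical, and the pattern only handles numeric values such as ids.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every value of a numeric key read from stdin,
# one per line. Uses only grep -o and cut, so no PCRE (-P) support is needed.
values_of() {
    local key=$1
    grep -o "\"$key\":[0-9]*" | cut -d: -f2
}

str='["ids":2817262,"isvalid":true,"name":"somename","hasproperty":false,"ids":2262,"isvalid":false,"name":"somename","hasproperty":false,"ids":28182,"isvalid":true,"name":"somename","hasproperty":true]'

# Read the helper output line by line into an array.
ids=()
while IFS= read -r id; do
    ids+=("$id")
done < <(values_of ids <<<"$str")

echo "${ids[@]}"
```

The loop works no matter how many records the string holds, which is what the question asks for.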
The grep solution is beautiful. Your question is tagged awk, though, and the awk solution is ugly:
echo '["ids":2817262,"isvalid":true,"name":"somename","hasproperty":false,"ids":2262,"isvalid":false,"name":"somename","hasproperty":false,"ids":28182,"isvalid":true,"name":"somename","hasproperty":true]' \
| awk '{n=split(substr($0,2,length($0)-2),x,",");
for(i=1;i<=n;i++) {split(x[i],a,":");
if(a[1]=="\"ids\"") print a[1],a[2]}}'
Output:
"ids" 2817262
"ids" 2262
"ids" 28182
Please accept the grep solution as the correct answer.
Since it is tagged awk:
awk '{while(x=match($0,/"ids":([^,]+)/,a)){print a[1];$0=substr($0,x+RLENGTH)}}' file
This keeps matching any "ids" entry, then replaces the line with only what follows the match.
Output
2817262
2262
28182
You can also do it this way (inspired by Wintermute's comment on another answer)
awk -v RS=",|]" 'sub(/^.*"ids":/,"")' file
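The match() loop in this answer uses the three-argument form of match(), which is a GNU awk extension. As a rough sketch of the same loop for other awks, the two-argument match() with RSTART and RLENGTH can do the job. The function name `extract_ids` is made up for illustration, and the pattern assumes ids are purely numeric.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every numeric "ids" value found on stdin.
# Relies only on two-argument match(), RSTART and RLENGTH.
extract_ids() {
    awk '{
        s = $0
        while (match(s, /"ids":[0-9]+/)) {
            # The key text "ids": is 6 characters long; skip it.
            print substr(s, RSTART + 6, RLENGTH - 6)
            # Continue searching after the current match.
            s = substr(s, RSTART + RLENGTH)
        }
    }'
}

printf '%s\n' '["ids":2817262,"isvalid":true,"ids":2262,"name":"somename","ids":28182]' | extract_ids
```

Working on a copy of the line (`s`) rather than on `$0` avoids re-splitting the record on every iteration.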
Here is a pure bash solution (wordy, isn't it? I tend to agree with @chepner):
# Keep the string on one line: a newline inside it would become part of the next key.
str='["ids":2817262,"isvalid":true,"name":"somename","hasproperty":false,"ids":2262,"isvalid":false,"name":"somename","hasproperty":false,"ids":28182,"isvalid":true,"name":"somename","hasproperty":true]'
# Remove [ ]
str=${str/\[/}
str=${str/\]/}
declare -a ids
declare -a names
declare -a properties
oldIFS="$IFS"
IFS=','
for record in $str
do
    type=${record%%:*}     # everything before the first colon (quotes included)
    value=${record##*:}    # everything after the last colon
    if [[ $type == \"ids\" ]]
    then
        ids[ids_i++]="$value"
    elif [[ $type == \"name\" ]]
    then
        names[names_i++]="$value"
    elif [[ $type == \"hasproperty\" ]]
    then
        properties[properties_i++]="$value"
    else
        echo "Ignored type: '$type'" >&2
    fi
done
IFS="$oldIFS"
echo "ids: ${ids[@]}"
echo "names: ${names[@]}"
echo "properties: ${properties[@]}"
Its only virtue is that it spawns no subprocesses.
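A possible variation on this pure bash approach, offered only as a sketch: splitting with `IFS=, read -ra` keeps the IFS change local to the `read` command and avoids the unquoted `$str` expansion, which would otherwise be subject to globbing. The function name `split_records` is hypothetical, and unlike the loop above this version strips the double quotes from values.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fill the global arrays ids, names and properties
# from a string shaped like the one in the question.
split_records() {
    local str=$1 record key value
    local -a fields
    ids=() names=() properties=()

    str=${str#\[}          # drop the leading [
    str=${str%\]}          # drop the trailing ]

    # IFS is changed only for this read; the shell's IFS stays untouched.
    IFS=, read -ra fields <<<"$str"

    for record in "${fields[@]}"; do
        key=${record%%:*};  key=${key//\"/}
        value=${record#*:}; value=${value//\"/}
        case $key in
            ids)         ids+=("$value") ;;
            name)        names+=("$value") ;;
            hasproperty) properties+=("$value") ;;
        esac
    done
}

split_records '["ids":2817262,"isvalid":true,"name":"somename","hasproperty":false,"ids":2262,"isvalid":false,"name":"somename","hasproperty":true]'
echo "ids: ${ids[*]}"
echo "names: ${names[*]}"
echo "properties: ${properties[*]}"
```

Like the original, this still assumes values contain no commas or colons.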
awk 'BEGIN {
   FS = ","      # set in BEGIN so it already applies to the first record
   Index = 0
   }
   {
   Field = 1     # restart at the first field of every record
   gsub( /[][]/,"")
   gsub( /"[a-z]*":/, "")
   while ( Field < NF) {
      ThisID[ Index]=$Field
      ThisName[ Index]=$(Field + 2)
      ThisProperty[ Index]=$(Field + 3)
      Index+=1
      Field+=4
      }
   }
END {
   for ( Iter=0;Iter<Index;Iter+=1) printf( "%s ", ThisID[Iter])
   printf "\n"
   for ( Iter=0;Iter<Index;Iter++) printf( "%s ", ThisName[Iter])
   printf "\n"
   for ( Iter=0;Iter<Index;Iter++) printf( "%s ", ThisProperty[Iter])
   printf "\n"
   }' YourFile
Or assign the arrays to whatever variables you prefer.
unset n
string='["ids":2817262,"isvalid":true,"name":"somename","hasproperty":false,"ids":2262,"isvalid":false,"name":"somename","hasproperty":false,"ids":28182,"isvalid":true,"name":"somename","hasproperty":true]'
while IFS=',' read -ra line
do
    ((n++))
    for i in "${line[@]//\"/}"
    do
        # eval runs its argument as shell code: use only on trusted input
        eval "${i%:*}[$n]=${i#*:}"
    done
done < <(sed 's/[][]//g;s/,"ids/\n"ids/g' <<<"$string")
The above produces 4 arrays (ids, isvalid, name, hasproperty). If you don't need isvalid, just add a test:
unset n
string='["ids":2817262,"isvalid":true,"name":"somename","hasproperty":false,"ids":2262,"isvalid":false,"name":"somename","hasproperty":false,"ids":28182,"isvalid":true,"name":"somename","hasproperty":true]'
while IFS=',' read -ra line
do
    ((n++))
    for i in "${line[@]//\"/}"
    do
        # skip isvalid; eval runs its argument as shell code, so use only on trusted input
        [ "${i%:*}" != "isvalid" ] && eval "${i/:/[$n]=}"
    done
done < <(sed 's/[][]//g;s/,"ids/\n"ids/g' <<<"$string")
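Because `eval` executes whatever the key and value contain, a more defensive way to do the dynamic assignment may be worth sketching. The version below assumes bash 4.1 or later, where `printf -v` can assign to an array element, and it only accepts a fixed list of key names. The function name `assign_field` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: store value in the array named by key, at position index.
# Unknown keys are rejected instead of being passed to the shell.
assign_field() {
    local key=$1 index=$2 value=$3
    case $key in
        ids|isvalid|name|hasproperty)
            # printf -v writes to an array element (bash 4.1 or later assumed)
            printf -v "${key}[$index]" '%s' "$value"
            ;;
        *)
            return 1
            ;;
    esac
}

assign_field ids 1 2817262
assign_field name 1 somename
echo "${ids[1]} ${name[1]}"
```

Inside the loops above, `assign_field "${i%:*}" "$n" "${i#*:}"` would take the place of the `eval` line.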
Given the input you posted, if all you want is a list of the items of each type, then this is all you need:
$ awk -v RS=, -F: '{gsub(/[[\]"\n]/,"")} /^ids/{print $2}' file
2817262
2262
28182
$ awk -v RS=, -F: '{gsub(/[[\]"\n]/,"")} /^name/{print $2}' file
somename
somename
somename
$ awk -v RS=, -F: '{gsub(/[[\]"\n]/,"")} /^hasproperty/{print $2}' file
false
false
true
$ awk -v RS=, -F: '{gsub(/[[\]"\n]/,"")} /^isvalid/{print $2}' file
true
false
true
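But it is unlikely that this is the right approach to your problem. As I mentioned in the comments, if you want some real help, edit your question to provide more information.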