How to segment file input into portions in Java
I need to separate each rule in the file below.
How can I do this in Java?
Here is the file content:
rule apt_regin_2011_32bit_stage1 {
meta:
copyright = "Kaspersky Lab"
description = "Rule to detect Regin 32 bit stage 1 loaders"
version = "1.0"
last_modified = "2014-11-18"
strings:
$key1={331015EA261D38A7}
$key2={9145A98BA37617DE}
$key3={EF745F23AA67243D}
$mz="MZ"
condition:
($mz at 0) and any of ($key*) and filesize < 300000
}
rule apt_regin_rc5key {
meta:
copyright = "Kaspersky Lab"
description = "Rule to detect Regin RC5 decryption keys"
version = "1.0"
last_modified = "2014-11-18"
strings:
$key1={73 23 1F 43 93 E1 9F 2F 99 0C 17 81 5C FF B4 01}
$key2={10 19 53 2A 11 ED A3 74 3F C3 72 3F 9D 94 3D 78}
condition:
any of ($key*)
}
rule apt_regin_vfs {
meta:
copyright = "Kaspersky Lab"
description = "Rule to detect Regin VFSes"
version = "1.0"
last_modified = "2014-11-18"
strings:
$a1={00 02 00 08 00 08 03 F6 D7 F3 52}
$a2={00 10 F0 FF F0 FF 11 C7 7F E8 52}
$a3={00 04 00 10 00 10 03 C2 D3 1C 93}
$a4={00 04 00 10 C8 00 04 C8 93 06 D8}
condition:
($a1 at 0) or ($a2 at 0) or ($a3 at 0) or ($a4 at 0)
}
rule apt_regin_dispatcher_disp_dll {
meta:
copyright = "Kaspersky Lab"
description = "Rule to detect Regin disp.dll dispatcher"
version = "1.0"
last_modified = "2014-11-18"
strings:
$mz="MZ"
$string1="shit"
$string2="disp.dll"
$string3="255.255.255.255"
$string4="StackWalk64"
$string5="imagehlp.dll"
condition:
($mz at 0) and (all of ($string*))
}
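As shown in the file, I need to separate each of the 4 rules found in the file input. Any idea how I can do this?
Please bear with me, I am new to this.
Thanks in advance!
After separating all 4 rules, I need to put each rule into an ArrayList.
For example:
arrayList[0]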
rule apt_regin_2011_32bit_stage1 {
meta:
copyright = "Kaspersky Lab"
description = "Rule to detect Regin 32 bit stage 1 loaders"
version = "1.0"
last_modified = "2014-11-18"
strings:
$key1={331015EA261D38A7}
$key2={9145A98BA37617DE}
$key3={EF745F23AA67243D}
$mz="MZ"
condition:
($mz at 0) and any of ($key*) and filesize < 300000
}
arrayList[1]
rule apt_regin_rc5key {
meta:
copyright = "Kaspersky Lab"
description = "Rule to detect Regin RC5 decryption keys"
version = "1.0"
last_modified = "2014-11-18"
strings:
$key1={73 23 1F 43 93 E1 9F 2F 99 0C 17 81 5C FF B4 01}
$key2={10 19 53 2A 11 ED A3 74 3F C3 72 3F 9D 94 3D 78}
condition:
any of ($key*)
}
arrayList[2]
rule apt_regin_vfs {
meta:
copyright = "Kaspersky Lab"
description = "Rule to detect Regin VFSes"
version = "1.0"
last_modified = "2014-11-18"
strings:
$a1={00 02 00 08 00 08 03 F6 D7 F3 52}
$a2={00 10 F0 FF F0 FF 11 C7 7F E8 52}
$a3={00 04 00 10 00 10 03 C2 D3 1C 93}
$a4={00 04 00 10 C8 00 04 C8 93 06 D8}
condition:
($a1 at 0) or ($a2 at 0) or ($a3 at 0) or ($a4 at 0)
}
And so on.
How should I do this?
Just for the record: if your problem is only to "segment" your input into "rules", then simply do:
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.List;

// "file" is the File (or file name) to read; the enclosing method must handle IOException
List<List<String>> sections = new ArrayList<>();
List<String> currentSection = null;
try (BufferedReader br = new BufferedReader(new FileReader(file))) {
    String line;
    while ((line = br.readLine()) != null) {
        if (line.startsWith("rule ")) {
            if (currentSection != null) {
                // we are finished with the previous section!
                sections.add(currentSection);
            }
            currentSection = new ArrayList<>();
            currentSection.add(line);
        } else if (currentSection != null && !line.trim().isEmpty()) {
            // any non-empty line goes into the current section;
            // lines before the first "rule " line are skipped
            currentSection.add(line);
        }
    }
} // end of try
if (currentSection != null) {
    // make sure to add the final section, too!
    sections.add(currentSection);
}
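Since the question asks for one list entry per rule, here is a minimal sketch of the same idea that joins each section into a single String. The class name RuleSplitter and the method splitRules are made up for illustration, and the sketch assumes that every rule starts on an unindented line beginning with "rule " and that no other line starts that way.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RuleSplitter {

    /**
     * Groups the given lines into one String per rule.
     * Assumption: a new rule begins at every line that starts with "rule ".
     */
    public static List<String> splitRules(List<String> lines) {
        List<String> rules = new ArrayList<>();
        StringBuilder current = null;
        for (String line : lines) {
            if (line.startsWith("rule ")) {
                if (current != null) {
                    // the previous rule is complete
                    rules.add(current.toString());
                }
                current = new StringBuilder(line);
            } else if (current != null && !line.trim().isEmpty()) {
                // non-empty lines belong to the rule currently being built
                current.append('\n').append(line);
            }
        }
        if (current != null) {
            // do not forget the last rule
            rules.add(current.toString());
        }
        return rules;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "rule apt_regin_rc5key {", "condition:", "any of ($key*)", "}",
                "",
                "rule apt_regin_vfs {", "condition:", "($a1 at 0)", "}");
        List<String> rules = splitRules(lines);
        System.out.println(rules.size() + " rules found");
        System.out.println(rules.get(1));
    }
}
```

For a real file, the lines could come from java.nio.file.Files.readAllLines(path); rules.get(0) then holds the first rule, rules.get(1) the second, and so on.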
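But: you are not very precise about your real requirements. I am pretty sure that your real problem is not "segmenting" that input file.
Most likely, your actual task is to read that file and, for each section in it, get some or all of its content for further processing.
In other words: you are actually asking "how do I parse/process this input". We can't answer that, because you didn't tell us what exactly you want to do with that data.
In essence, this is your option space:
- If the layout really is that fixed, then "parsing" boils down to understanding "first comes rule, then comes meta, which looks like ...". Meaning: you hard-code the structure of your data into your code. Example: you happen to "know" that the third line contains copyright = "some value". Then you use regular expressions (or simple String methods such as indexOf() and substring()) to extract the information you are interested in.
- If the file format is actually some kind of "standard" (such as XML, JSON, YAML, ...), then you can simply pick a third-party library that parses such files. For your example ... I can't tell; this is definitely not a format I am familiar with.
- Worst case, you need to write your own parser. Writing parsers is a complex but well-researched topic.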
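To make the first option a bit more concrete, here is a rough sketch of the "hard-coded layout" approach, applied to one rule that has already been split into lines. The class RuleMetaReader and its methods are hypothetical names, and the sketch assumes that section headers such as meta: end with a colon and that every meta entry has the form key = "value", as in the sample file.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RuleMetaReader {

    // Matches lines such as: copyright = "Kaspersky Lab"
    private static final Pattern KEY_VALUE =
            Pattern.compile("^\\s*(\\w+)\\s*=\\s*\"(.*)\"\\s*$");

    // Matches the header line, e.g.: rule apt_regin_vfs {
    private static final Pattern RULE_NAME = Pattern.compile("^rule\\s+(\\w+)");

    /** Returns the name after the "rule" keyword, or null if no header line is found. */
    public static String ruleName(List<String> ruleLines) {
        for (String line : ruleLines) {
            Matcher m = RULE_NAME.matcher(line);
            if (m.find()) {
                return m.group(1);
            }
        }
        return null;
    }

    /** Collects the key/value pairs found between "meta:" and the next section header. */
    public static Map<String, String> metaOf(List<String> ruleLines) {
        Map<String, String> meta = new LinkedHashMap<>();
        boolean inMeta = false;
        for (String line : ruleLines) {
            String trimmed = line.trim();
            if (trimmed.endsWith(":")) {
                // a section header: only "meta:" switches collection on
                inMeta = trimmed.equals("meta:");
                continue;
            }
            if (inMeta) {
                Matcher m = KEY_VALUE.matcher(line);
                if (m.matches()) {
                    meta.put(m.group(1), m.group(2));
                }
            }
        }
        return meta;
    }

    public static void main(String[] args) {
        List<String> rule = Arrays.asList(
                "rule apt_regin_vfs {",
                "meta:",
                "copyright = \"Kaspersky Lab\"",
                "description = \"Rule to detect Regin VFSes\"",
                "strings:",
                "$a1={00 02 00 08 00 08 03 F6 D7 F3 52}",
                "condition:",
                "($a1 at 0)",
                "}");
        System.out.println(ruleName(rule));
        System.out.println(metaOf(rule));
    }
}
```

This only works as long as the files really follow the layout of the sample; anything more flexible pushes you toward the second or third option.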