How can you create a map or list of substring index positions in Java?
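I am parsing many lines from a text file. The lines have a fixed length and width, but the data layout depends on how the line begins, for example "0301 ....". There are also lines starting with 11, 34, and so on, and each is split differently depending on that prefix.
Example: if the line starts with "03", it is split as follows: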
name = line.substring(2, 10);
surname = line.substring(11, 21);
id = line.substring(22, 34);
address = line.substring(35, 46);
Another example: if the line starts with "24", it is split as follows:
name = line.substring(5, 15);
salary = line.substring(35, 51);
empid = line.substring(22, 34);
department = line.substring (35, 46);
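So I have many substrings being assigned to many strings, which are then written to a new CSV file.
My question: is there an easy way to store the substring coordinates (indexes) and refer to them more easily later? For example: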
name = (2,10);
surname = (11,21);
... and so on
Or is there perhaps an alternative to using substring? Thanks!
Create a class named Line and store these objects instead of strings:
class Line {
int[] name;
int[] surname;
int[] id;
int[] address;
String line;
public Line(String line) {
this.line = line;
String startCode = line.substring(0, 2); // the prefix is two characters, so it can match "03" or "24"
switch(startCode) {
case "03":
this.name = new int[]{2, 10};
this.surname = new int[]{11, 21};
this.id = new int[]{22, 34};
this.address = new int[]{35, 46};
break;
case "24":
// same thing with different indices
break;
// add more cases
}
}
public String getName() {
return this.line.substring(this.name[0], this.name[1]);
}
public String getSurname() {
return this.line.substring(this.surname[0], this.surname[1]);
}
public String getId() {
return this.line.substring(this.id[0], this.id[1]);
}
public String getAddress() {
return this.line.substring(this.address[0], this.address[1]);
}
}
Then:
String line = "03 ...";
Line parsed = new Line(line);
parsed.getName();
parsed.getSurname();
...
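If you are going to retrieve name, surname, etc. from a Line object more than once, you could even cache each value the first time it is read, so that substring is not called repeatedly.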
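The caching idea could look roughly like the sketch below. It is not part of the Line class above: the class name CachedLine, the generic get(String) accessor, and the sample record are made up for illustration, and only the "03" ranges for name and surname are taken from the question.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical variant of the Line class: ranges live in a map and
// each substring is computed at most once per field.
class CachedLine {
    private final String line;
    private final Map<String, int[]> ranges = new HashMap<>(); // field name -> {start, end}
    private final Map<String, String> cache = new HashMap<>(); // field name -> extracted value

    CachedLine(String line) {
        this.line = line;
        String startCode = line.substring(0, 2);
        if (startCode.equals("03")) {
            ranges.put("name", new int[]{2, 10});
            ranges.put("surname", new int[]{11, 21});
        }
        // other record types would register their own ranges here
    }

    String get(String field) {
        int[] range = ranges.get(field);
        if (range == null) {
            throw new IllegalArgumentException("Unknown field: " + field);
        }
        // computeIfAbsent runs substring only the first time a field is requested
        return cache.computeIfAbsent(field, f -> line.substring(range[0], range[1]));
    }

    public static void main(String[] args) {
        // Sample record built with padding so the columns line up with the ranges
        CachedLine parsed = new CachedLine(String.format("03%-8s %-10s", "John", "Smith"));
        System.out.println(parsed.get("name").trim());
        System.out.println(parsed.get("name").trim()); // second call is served from the cache
    }
}
```

Whether this is worth it depends on how often each field is read. If every field is read exactly once before writing the CSV row, the cache only adds overhead.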
You could try something like this. I'll leave the bounds checking and optimization to you, but as a first step...
// Place inside a class; needs: import java.util.HashMap; import java.util.Map;
public static void main( String[] args ) {
Map<String, Map<String,IndexDesignation>> substringMapping = new HashMap<>();
// Put all the designations of how to map here
substringMapping.put( "03", new HashMap<>());
substringMapping.get( "03" ).put( "name", new IndexDesignation(2,10));
substringMapping.get( "03" ).put( "surname", new IndexDesignation(11,21));
// This determines which mapping value to use
Map<String,IndexDesignation> indexDesignationMap = substringMapping.get(args[0].substring(0,2));
// This holds the results
Map<String, String> resultsMap = new HashMap<>();
// Make sure we actually have a map to use
if ( indexDesignationMap != null ) {
// Now take this particular map designation and turn it into the resulting map of name to values
for ( Map.Entry<String,IndexDesignation> mapEntry : indexDesignationMap.entrySet() ) {
resultsMap.put(mapEntry.getKey(), args[0].substring(mapEntry.getValue().startIndex,
mapEntry.getValue().endIndex));
}
}
// Print out the results (and you can assign to another object here as needed)
System.out.println( resultsMap );
}
// Could also just use a list of two elements instead of this
static class IndexDesignation {
int startIndex;
int endIndex;
public IndexDesignation( int startIndex, int endIndex ) {
this.startIndex = startIndex;
this.endIndex = endIndex;
}
}
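For the bounds checking that was left open, one possible approach is sketched below. It is only a sketch under a few assumptions: the class name FixedWidthLayouts and the method toCsv are invented here, short lines are handled by clamping the indexes rather than throwing, and values are joined with plain commas without any CSV quoting.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class FixedWidthLayouts {
    // prefix -> (field name -> {start, end}); LinkedHashMap keeps the column order
    static final Map<String, Map<String, int[]>> LAYOUTS = new LinkedHashMap<>();
    static {
        Map<String, int[]> type03 = new LinkedHashMap<>();
        type03.put("name", new int[]{2, 10});
        type03.put("surname", new int[]{11, 21});
        LAYOUTS.put("03", type03);
    }

    // Returns the trimmed fields joined by commas, or null when no layout matches.
    static String toCsv(String line) {
        if (line.length() < 2) {
            return null;
        }
        Map<String, int[]> layout = LAYOUTS.get(line.substring(0, 2));
        if (layout == null) {
            return null;
        }
        StringJoiner csv = new StringJoiner(",");
        for (int[] range : layout.values()) {
            // Clamp both ends so a short line gives a truncated or empty field, not an exception
            int start = Math.min(range[0], line.length());
            int end = Math.min(range[1], line.length());
            csv.add(line.substring(start, end).trim());
        }
        return csv.toString();
    }

    public static void main(String[] args) {
        System.out.println(toCsv(String.format("03%-8s %-10s", "John", "Smith")));
        System.out.println(toCsv("03John")); // too short for the surname column
    }
}
```

If a truncated line should count as an error in your file format, replace the clamping with an explicit length check and reject the line instead.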
We can also get the result using regular expression patterns and streams.
Say we have a text file like this:
03SomeNameSomeSurname
24SomeName10000
The regex patterns use named groups to assign an attribute name to each piece of parsed text. So the pattern for the first line is:
^03(?<name>.{8})(?<surname>.{11})
The code is:
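// Place inside a class; requires Java 9+ (List.of, Matcher.results) and:
// import java.util.List; import java.util.regex.Pattern; import java.util.stream.Collectors;
public static void main(String[] args) {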
// Fixed Width File Lines
List<String> fileLines = List.of(
"03SomeNameSomeSurname",
"24SomeName10000"
);
// List all regex patterns for the specific file
List<Pattern> patternList = List.of(
Pattern.compile("^03(?<name>.{8})(?<surname>.{11})"), // Regex for String - 03SomeNameSomeSurname
Pattern.compile("^24(?<name>.{8})(?<salary>.{5})")); // Regex For String - 24SomeName10000
// Pattern for finding Group Names
Pattern groupNamePattern = Pattern.compile("\\?<([a-zA-Z0-9]*)>"); // the backslash must be doubled inside a Java string literal
List<List<String>> output = fileLines.stream().map(
line -> patternList.stream() // Stream over the pattern list
.map(pattern -> pattern.matcher(line)) // Create a matcher for the fixed width line and regex pattern
.filter(matcher -> matcher.find()) // Filter matcher which matches correctly
.map( // Transform matcher results into a String (Group Name = Matched Value)
matcher ->
groupNamePattern.matcher(matcher.pattern().toString()).results() // Find Group Names for the regex pattern
.map(groupNameMatchResult -> groupNameMatchResult.group(1) + "=" + matcher.group(groupNameMatchResult.group(1))) // Transform into String (Group Name = Matched Value)
.collect(Collectors.joining(","))) // Join results delimited with ,
.collect(Collectors.toList())
).collect(Collectors.toList());
System.out.println(output);
}
The output is a list of lists of strings containing the parsed attribute names and values.
[[name=SomeName,surname=SomeSurname], [name=SomeName,salary=10000]]
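If scanning the pattern text for group names feels fragile, a simpler variation is to list the field names next to each pattern and read them with Matcher.group(String). The sketch below does this for the "03" record only. The class name NamedGroupParser, the method parse03, and the field list are made up for illustration, while the pattern is the one shown above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupParser {
    // Pattern for lines starting with "03", using named groups
    static final Pattern TYPE_03 = Pattern.compile("^03(?<name>.{8})(?<surname>.{11})");
    // Field names are listed explicitly instead of being recovered from the pattern text
    static final List<String> TYPE_03_FIELDS = List.of("name", "surname");

    // Returns field name -> value in declaration order, or an empty map if the line does not match.
    static Map<String, String> parse03(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher matcher = TYPE_03.matcher(line);
        if (matcher.find()) {
            for (String field : TYPE_03_FIELDS) {
                fields.put(field, matcher.group(field));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(parse03("03SomeNameSomeSurname"));
        System.out.println(parse03("24SomeName10000")); // no match for this pattern
    }
}
```

Returning a map instead of a joined string keeps the values separate, which makes it easier to write them to CSV columns later.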