How can you create a map or list of substring index positions in Java?

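I am parsing many lines from a text file. Every line has a fixed length and width, but how the data is split depends on the start of the line, e.g. "0301 ....". There are also lines starting with 11, 34, and so on, and each kind is split differently based on that prefix.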

Example: if the line starts with "03", it is split like this:

name = line.substring(2, 10);
surname = line.substring(11, 21);
id = line.substring(22, 34);
address = line.substring(35, 46);

Another example: if the line starts with "24", it is split like this:

name = line.substring(5, 15);
salary = line.substring(35, 51);
empid = line.substring(22, 34);
department = line.substring(35, 46);

So I end up with a lot of substring calls assigned to a lot of strings, which are then written to a new CSV file.

My question is: is there a simple way to store the substring coordinates (indexes) and call them more easily later? For example:

name = (2,10);
surname = (11,21);

... and so on

Or is there perhaps an alternative to using substring? Thanks!

Create a class called Line and store objects of that class instead of strings:

class Line {

  int[] name;
  int[] surname;
  int[] id;
  int[] address;

  String line;

  public Line(String line) {
    this.line = line;

    String startCode = line.substring(0, 2); // two-character prefix, e.g. "03"
    switch(startCode) {
      case "03":
        this.name = new int[]{2, 10};
        this.surname = new int[]{11, 21};
        this.id = new int[]{22, 34};
        this.address = new int[]{35, 46};
        break;
      case "24":
        // same thing with different indices
        break;
      // add more cases
    }
  }

  public String getName() {
    return this.line.substring(this.name[0], this.name[1]);
  }

  public String getSurname() {
    return this.line.substring(this.surname[0], this.surname[1]);
  }

  public String getId() {
    return this.line.substring(this.id[0], this.id[1]);
  }

  public String getAddress() {
    return this.line.substring(this.address[0], this.address[1]);
  }
}

Then:

String line = "03 ...";

Line parsed = new Line(line);
parsed.getName();
parsed.getSurname();
...

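If you are going to retrieve the name, surname, etc. from a Line object several times, you can even cache each value on first access so that substring is not called repeatedly.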
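A minimal sketch of that caching idea, assuming a hypothetical `CachedLine` class (not part of the answer above) that only handles the "03" layout from the question. Each field is extracted on first access and the stored value is reused afterwards.

```java
// Hypothetical illustration: caches each substring the first time it is requested.
class CachedLine {
    private final String line;
    private final int[] name = {2, 10};     // "03" layout from the question
    private final int[] surname = {11, 21};

    private String cachedName;    // null until getName() is first called
    private String cachedSurname; // null until getSurname() is first called

    CachedLine(String line) {
        this.line = line;
    }

    String getName() {
        if (cachedName == null) {
            cachedName = line.substring(name[0], name[1]);
        }
        return cachedName;
    }

    String getSurname() {
        if (cachedSurname == null) {
            cachedSurname = line.substring(surname[0], surname[1]);
        }
        return cachedSurname;
    }
}
```

Whether this is worth it depends on how often each getter is called; for a single pass that writes each field once to the CSV file, the uncached version is probably enough.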

You could try something like this. I'll leave the bounds checking and optimization to you, but as a first step...

import java.util.HashMap;
import java.util.Map;

public static void main( String[] args ) {

    Map<String, Map<String,IndexDesignation>> substringMapping = new HashMap<>();

    // Put all the designations of how to map here

    substringMapping.put( "03", new HashMap<>());
    substringMapping.get( "03" ).put( "name", new IndexDesignation(2,10));
    substringMapping.get( "03" ).put( "surname", new IndexDesignation(11,21));

    // This determines which mapping value to use
    Map<String,IndexDesignation> indexDesignationMap = substringMapping.get(args[0].substring(0,2));

    // This holds the results
    Map<String, String> resultsMap = new HashMap<>();

    // Make sure we actually have a map to use
    if ( indexDesignationMap != null ) {
        // Now take this particular map designation and turn it into the resulting map of name to values

        for ( Map.Entry<String,IndexDesignation> mapEntry : indexDesignationMap.entrySet() ) {
            resultsMap.put(mapEntry.getKey(), args[0].substring(mapEntry.getValue().startIndex,
                    mapEntry.getValue().endIndex));
        }
    }

    // Print out the results (and you can assign to another object here as needed)
    System.out.println( resultsMap );
}

// Could also just use a list of two elements instead of this
static class IndexDesignation {
    int startIndex;
    int endIndex;
    public IndexDesignation( int startIndex, int endIndex ) {
        this.startIndex = startIndex;
        this.endIndex = endIndex;
    }
}
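The bounds checking left as an exercise above could look roughly like this. `FixedWidth.safeSubstring` is a hypothetical helper name, and returning an empty string for a field that lies outside the line is an assumption; throwing an exception or logging the bad line may suit your data better.

```java
// Hypothetical helper: clamps the requested range to the line length
// so that short lines do not throw StringIndexOutOfBoundsException.
class FixedWidth {
    static String safeSubstring(String line, int startIndex, int endIndex) {
        if (line == null || startIndex < 0
                || startIndex >= line.length() || endIndex <= startIndex) {
            return ""; // assumption: treat a missing field as empty
        }
        int end = Math.min(endIndex, line.length());
        return line.substring(startIndex, end);
    }
}
```

In the loop above, the call to `args[0].substring(...)` could then be swapped for `FixedWidth.safeSubstring(args[0], ...)` with the same two indexes.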

We can also get the result using regex patterns and streams.

Say we have a text file like this -

03SomeNameSomeSurname
24SomeName10000

The regex patterns use named groups to assign attribute names to the parsed text. So the pattern for the first line is -

^03(?<name>.{8})(?<surname>.{11})

The code is -

import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public static void main(String[] args) {

        // Fixed Width File Lines
        List<String> fileLines = List.of(
                "03SomeNameSomeSurname",
                "24SomeName10000"
        );
        // List all regex patterns for the specific file
        List<Pattern> patternList = List.of(
                Pattern.compile("^03(?<name>.{8})(?<surname>.{11})"), // Regex for String - 03SomeNameSomeSurname
                Pattern.compile("^24(?<name>.{8})(?<salary>.{5})")); // Regex For String - 24SomeName10000

        // Pattern for finding Group Names
        Pattern groupNamePattern = Pattern.compile("\\?<([a-zA-Z0-9]*)>");

        List<List<String>> output  = fileLines.stream().map(
                line -> patternList.stream() // Stream over the pattern list
                        .map(pattern -> pattern.matcher(line)) // Create a matcher for the fixed width line and regex pattern
                        .filter(matcher -> matcher.find()) // Filter matcher which matches correctly
                        .map( // Transform matcher results into String (Group Name = Matched Value)
                                matcher ->
                                        groupNamePattern.matcher(matcher.pattern().toString()).results() // Find Group Names for the regex pattern
                                                .map(groupNameMatchResult -> groupNameMatchResult.group(1) + "=" + matcher.group(groupNameMatchResult.group(1))) // Transform into String (Group Name = Matched Value)
                                .collect(Collectors.joining(","))) // Join results delimited with ,
                        .collect(Collectors.toList())
        ).collect(Collectors.toList());

        System.out.println(output);
    }

The output has the attribute names and attribute values parsed into a List of Strings.

[[name=SomeName,surname=SomeSurname], [name=SomeName,salary=10000]]
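If the field names are known up front, the second pattern that extracts group names is not strictly needed, because `Matcher.group(String)` reads a named group directly. A small sketch using the first pattern from above; the class name `NamedGroupExample` and the method `parse03` are only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupExample {
    // Same pattern as the "03" line above, compiled once and reused.
    private static final Pattern LINE_03 =
            Pattern.compile("^03(?<name>.{8})(?<surname>.{11})");

    // Returns {name, surname}, or null when the line does not match the "03" layout.
    static String[] parse03(String line) {
        Matcher matcher = LINE_03.matcher(line);
        if (!matcher.find()) {
            return null;
        }
        return new String[] { matcher.group("name"), matcher.group("surname") };
    }

    public static void main(String[] args) {
        String[] fields = parse03("03SomeNameSomeSurname");
        System.out.println(fields[0] + "," + fields[1]); // prints SomeName,SomeSurname
    }
}
```

This trades the generic group-name discovery for one small method per line type, which may be easier to read when there are only a handful of layouts.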