Cannot detect strikeout data from Excel using Apache POI

Question

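I am using Java 8, Excel and Apache POI for my project. I want to extract certain cell values from Excel with Java. I am trying to detect text that is struck out in Excel cells, but the text is formatted in slightly different ways, which is why I am running into problems.

Below is the layout of the data in my Excel sheet: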

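After extracting this data from Excel, I always store it in an ArrayList of strings, e.g. a = [text 1, text 2, text 3]. The code showing how I store the data in this ArrayList is given further down.

What I want:

I want to ignore all text that is struck out, so in the example above I expect the output for both the first and the second picture to be [text 2, text 3].

What I tried:

To detect struck-out values, I first tried the following code: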

XSSFRichTextString text = new XSSFRichTextString(a.get(0));                             
XSSFFont font = text.getFontAtIndex(0);
Boolean font_striked = font.getStrikeout();

But the code above does not work: font_striked comes back as null, whereas it should be true or false.

The code that partly works in my case, for single-line cell values, is:

boolean striked_out = sheet.getRow(row_index).getCell(column_index).getCellStyle().getFont().getStrikeout();

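This code only works when the cell holds a single-line value without a bulleted list like the one shown above. It fails for the bulleted case because it only looks at the font of the cell style, not at formatting applied to part of the text.

P.S. I believe that if I could somehow detect a single struck-out string among the bullet points in the ArrayList, I could make it work for all the data.

Following the answer below, I have updated my question with the code that shows how I build my ArrayList of strings.

How I convert the data from Excel into an ArrayList: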

String value_header = cell.getStringCellValue();
String[] newline_split = value_header.split("-");

for (int i = 0; i < newline_split.length; i++) {
    // Collapse line breaks and repeated whitespace ("\\s" must be escaped in a Java string)
    final_values = newline_split[i].
          replace("\n", " ").replaceAll("\\s{2,}", " ").trim();
    XSSFRichTextString text = new XSSFRichTextString(final_values);
    XSSFFont font = text.getFontAtIndex(0);
    Boolean font_striked = font.getStrikeout();
} // for ends here
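
The splitting and whitespace clean-up in the loop above can be tried on its own, without POI. This is a minimal sketch, assuming "-" is the bullet marker as in the question; the names `BulletSplit` and `splitBullets` are made up for illustration. It works on plain text only, so it carries no strikeout information.

```java
import java.util.ArrayList;
import java.util.List;

class BulletSplit {

    // Splits a cell's plain text into bullet items, assuming "-" marks each bullet.
    static List<String> splitBullets(String cellText) {
        List<String> items = new ArrayList<>();
        for (String part : cellText.split("-")) {
            // Turn line breaks into spaces and collapse repeated whitespace.
            String item = part.replace("\n", " ").replaceAll("\\s{2,}", " ").trim();
            if (!item.isEmpty()) { // split() yields an empty piece before the first "-"
                items.add(item);
            }
        }
        return items;
    }

    public static void main(String[] args) {
        System.out.println(splitBullets("- text 1\n- text  2\n- text 3"));
    }
}
```

Because the input here is a plain String, any formatting has to be read from the cell's rich text before this step, not after it.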

Answer 1

Here is how to get the strikethrough state in Excel using VBA:

Public Sub IsMyActivecellStriked()
    Debug.Print ActiveCell.Font.Strikethrough
End Sub

If you have something like this:

then you need to find a way to access the individual characters and check them. Something like this:

Option Explicit
Public Sub TestMe()

    Dim strRange    As String
    Dim varArr      As Variant
    Dim varStr      As Variant
    Dim lngStart    As Long
    Dim lngEnd      As Long

    strRange = [a1]
    varArr = Split(strRange, Chr(10))

    For Each varStr In varArr
       lngStart = InStr(1, strRange, varStr)
       Debug.Print [a1].Characters(Start:=lngStart, Length:=Len(varStr)).Font.Strikethrough
       Debug.Print [a1].Characters(Start:=lngStart, Length:=Len(varStr)).Text
    Next varStr

End Sub

This gives you the following in the Immediate window:

False
aaa
True
bbb
True
ccc
False
ddd

This should be translatable to Java with the POI library.
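
As a rough idea of how the line-by-line check might look in Java, here is a POI-free sketch. It assumes the strikeout state of each character is already available as a `boolean[]`; with POI this could perhaps be filled from `XSSFRichTextString.getFontAtIndex(i)`, treating a null font as not struck out. The class `LineStrikeCheck` and its method are hypothetical names. Unlike an `InStr` search that always starts at position 1, it keeps a running offset, so repeated lines are checked at their own position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LineStrikeCheck {

    // For each line of text, reports whether every character in it is struck out.
    // struck[i] is assumed to hold the strikeout state of text.charAt(i).
    static List<Boolean> struckLines(String text, boolean[] struck) {
        List<Boolean> result = new ArrayList<>();
        int offset = 0; // start index of the current line within text
        for (String line : text.split("\n", -1)) {
            boolean allStruck = !line.isEmpty();
            for (int i = offset; i < offset + line.length(); i++) {
                allStruck &= struck[i];
            }
            result.add(allStruck);
            offset += line.length() + 1; // skip the line break
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "aaa\nbbb\nccc\nddd";
        boolean[] struck = new boolean[text.length()];
        // Mark "bbb\nccc" (indices 4 to 10) as struck out.
        Arrays.fill(struck, 4, 11, true);
        System.out.println(struckLines(text, struck)); // [false, true, true, false]
    }
}
```

A line that is only partly struck out is reported as false here; whether that is the right rule depends on the data.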

Answer 2

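As I understand the question (please correct me if I am wrong), it should report whether the text in a cell is struck out (true or false).

I have created a demo below: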

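import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class ApachePOI {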
    public static void main(String[] args) {

        //Using workbook
        XSSFWorkbook workbook;

        try {
            //Access excel file as workbook
            workbook = new XSSFWorkbook(new FileInputStream(new File("/testExcelfile.xlsx")));

            // first sheet of excel file
            XSSFSheet xssfFirstSheet = workbook.getSheetAt(0);

            //Check for A1  cell that strikethrough or not
            boolean strikedOutTextStatus = xssfFirstSheet.getRow(0).getCell(0).getCellStyle().getFont().getStrikeout();

            //print status of A1 cell text
            System.out.println(strikedOutTextStatus);

            // Updated code: print the cell text without bullet markers and spaces
            if (strikedOutTextStatus) {

                String cellStringValue = xssfFirstSheet.getRow(0).getCell(0).getStringCellValue();

                System.out.println("cell Value  : " + cellStringValue.replace("-", "").replace(" ", ""));
            }


        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }           
    }    
}

Answer 3

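You need to get the RichTextString first, then go through all of its formatting runs and check whether each one is struck out. If it is not, take the corresponding substring and add it to the List: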

import org.apache.poi.ss.usermodel.*;
import org.apache.poi.ss.usermodel.CellType.*;
import org.apache.poi.xssf.usermodel.*;

import java.io.FileInputStream;

import java.util.List;
import java.util.ArrayList;

class ReadExcelRichTextCells {

 public static void main(String[] args) throws Exception {

  Workbook wb  = WorkbookFactory.create(new FileInputStream("ExcelRichTextCells.xlsx"));

  Sheet sheet = wb.getSheetAt(0);
  for (Row row : sheet) {
   for (Cell cell : row) {

    switch (cell.getCellTypeEnum()) {
     case STRING:
      XSSFRichTextString richtextstring = (XSSFRichTextString)cell.getRichStringCellValue();
      String textstring = richtextstring.getString();

      List<String> textparts = new ArrayList<String>();

      if (richtextstring.hasFormatting()) {
       for (int i = 0; i < richtextstring.numFormattingRuns(); i++) {

        if (richtextstring.getFontOfFormattingRun(i)==null || !richtextstring.getFontOfFormattingRun(i).getStrikeout()) {

         int indexofformattingrun = richtextstring.getIndexOfFormattingRun(i);
         String textpart = textstring.substring(indexofformattingrun, 
                                                indexofformattingrun + richtextstring.getLengthOfFormattingRun(i));
         String[] textpart_split = textpart.split("-");
         for (int j = 0; j < textpart_split.length; j++){
          String text = textpart_split[j].replace("\n", "").trim();       
          if (!"".equals(text)) textparts.add(text);
         }
        }
       } 
      } else {
       textparts.add(textstring);
      }

      System.out.println(textparts);
      break;

     //...
     default:
      System.out.println("default cell"); //should never occur
    }
   }
  }

  wb.close();

 }
}
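
The run-walking logic above can be exercised without an Excel file. The following sketch models each formatting run as a hypothetical `Run` (start index, length, strikeout flag), standing in for what `getIndexOfFormattingRun`, `getLengthOfFormattingRun` and `getFontOfFormattingRun(i).getStrikeout()` return; it is not POI code. As an assumption that differs slightly from the answer, it first joins the text of all runs that are not struck out and only then splits on "-", so that a bullet item spanning two runs stays in one piece.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RunFilter {

    // Hypothetical stand-in for one formatting run of a rich text string.
    static final class Run {
        final int start;
        final int length;
        final boolean strikeout;

        Run(int start, int length, boolean strikeout) {
            this.start = start;
            this.length = length;
            this.strikeout = strikeout;
        }
    }

    // Returns the bullet items whose text is not struck out.
    static List<String> unstruckItems(String text, List<Run> runs) {
        StringBuilder kept = new StringBuilder();
        for (Run run : runs) {
            if (!run.strikeout) {
                kept.append(text, run.start, run.start + run.length);
            }
        }
        List<String> items = new ArrayList<>();
        for (String part : kept.toString().split("-")) {
            String item = part.replace("\n", " ").trim();
            if (!item.isEmpty()) {
                items.add(item);
            }
        }
        return items;
    }

    public static void main(String[] args) {
        String text = "- text 1\n- text 2\n- text 3";
        List<Run> runs = Arrays.asList(
                new Run(0, 2, false),    // "- "
                new Run(2, 6, true),     // "text 1", struck out
                new Run(8, 18, false));  // "\n- text 2\n- text 3"
        System.out.println(unstruckItems(text, runs)); // [text 2, text 3]
    }
}
```

With real POI data, the `Run` list would be built inside the loop over `numFormattingRuns()`, treating a null font as not struck out, as the answer does.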
