C# saxonapi.Evaluate taking too long to run XQuery on 500MB XML with 13 million lines
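The application is compiled as 64-bit and runs on a machine with 4 CPUs and 16 GB of RAM. Three SaxonApi.Evaluate calls against a 500 MB XML file with 13 million lines account for 47 minutes of the total run time of 60 minutes. Each Evaluate call runs an XQuery that returns 80,000 items, each with 20 nodes.
What do we need to do to improve the performance of the SaxonApi.Evaluate method?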
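A few tips that you might find useful:
- Measure how query performance varies with source document size. Is it linear or quadratic? If it is quadratic, that is probably because you are doing some kind of join. If it is a simple join, the Saxon-EE optimizer may well give a substantial boost: download an evaluation copy and try it.
- With performance, the devil is in the detail. To explain the performance you are getting, we need to know every detail of what you are doing, to the point where we can reproduce the results ourselves. Telling us that you have a query that takes a long time, without even showing the query, is a waste of everyone's time.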
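One way to make the linear-versus-quadratic check concrete is sketched below. It assumes the run time roughly follows a power law $T(n) \approx c\,n^{k}$ in the document size $n$; that model, and the symbols $T$, $n$, $c$ and $k$, are illustrative assumptions rather than anything stated in the thread. Timing the same query on two document sizes $n_1 < n_2$ gives an estimate of the exponent:

```latex
k \;\approx\; \frac{\log\bigl(T(n_2)/T(n_1)\bigr)}{\log\bigl(n_2/n_1\bigr)}
```

For example, if doubling the document size doubles the run time, $k \approx \log 2/\log 2 = 1$ (linear); if it quadruples the run time, $k \approx \log 4/\log 2 = 2$ (quadratic), which would point towards a join.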
Here are a sample of the XML and the XQuery I am using. The large XML file contains 80K /Top/level1/Sch3K1 elements.
XML
<?xml version="1.0" encoding="UTF-8"?>
<Top>
<level1>
<Sch3K1>
<PartnershipInformation>
<PartnershipName>Partner1</PartnershipName>
<PartnershipFEIN>XXXXXXX</PartnershipFEIN>
<PartnerAddress>
<USAddress>
<AddressLine1Txt>xxxx</AddressLine1Txt>
<CityNm>City</CityNm>
<StateAbbreviationCd>MO</StateAbbreviationCd>
<ZIPCd>1111</ZIPCd>
</USAddress>
</PartnerAddress>
</PartnershipInformation>
<PartnerInformation>
<Individual>
<PartnerName>
<FirstName>Partner1 FName</FirstName>
<MiddleInitial>P</MiddleInitial>
<LastName>Partner1 LName</LastName>
</PartnerName>
<PartnerSSN>XXXXXX</PartnerSSN>
</Individual>
<PartnerAddress>
<USAddress>
<AddressLine1Txt>318 Some STREET</AddressLine1Txt>
<CityNm>City2</CityNm>
<StateAbbreviationCd>WY</StateAbbreviationCd>
<ZIPCd>2222</ZIPCd>
</USAddress>
</PartnerAddress>
<LimitedPartner>X</LimitedPartner>
<DomesticPartner>X</DomesticPartner>
<PartnersProfitBOY>0.00003779</PartnersProfitBOY>
<PartnersProfitEOY>0.0000319</PartnersProfitEOY>
<PartnersLossBOY>0.00003779</PartnersLossBOY>
<PartnersLossEOY>0.0000319</PartnersLossEOY>
<PartnersCapitalBOY>0.00003779</PartnersCapitalBOY>
<PartnersCapitalEOY>0.0000319</PartnersCapitalEOY>
<PartnersLiabilitiesNonrecourse>0</PartnersLiabilitiesNonrecourse>
<PartnersLiabilitiesQNF>0</PartnersLiabilitiesQNF>
<PartnersLiabilitiesRecourse>0</PartnersLiabilitiesRecourse>
<CapitalAccountBeginning>1858311</CapitalAccountBeginning>
<CapitalAccountIncrease>137711</CapitalAccountIncrease>
<CapitalAccountWithdrawls>646011</CapitalAccountWithdrawls>
<CapitalAccountEnding>1350011</CapitalAccountEnding>
<CapitalAccountMethod>
<TaxBasis>X</TaxBasis>
</CapitalAccountMethod>
<PartnerStateRes>WY</PartnerStateRes>
<ByApportionment>X</ByApportionment>
<ApportionmentPercentage>0.0360504</ApportionmentPercentage>
</PartnerInformation>
<PartnersShare>
<OrdinaryIncome>
<FederalAmount>111</FederalAmount>
<PerStateLaw>111</PerStateLaw>
<StateSourceNonRes>29</StateSourceNonRes>
</OrdinaryIncome>
<NetIncomeRentalRE/>
<NetIncomeRentalNonRE>
<FederalAmount>700</FederalAmount>
<PerStateLaw>700</PerStateLaw>
<StateSourceNonRes>25</StateSourceNonRes>
</NetIncomeRentalNonRE>
<GuaranteedPymts/>
<InterestIncome>
<FederalAmount>12</FederalAmount>
<PerStateLaw>12</PerStateLaw>
</InterestIncome>
<OrdinaryDividends/>
<RoyaltyIncome/>
<ShortTermCapGain>
<FederalAmount>3</FederalAmount>
<PerStateLaw>3</PerStateLaw>
</ShortTermCapGain>
<LongTermCapGain>
<FederalAmount>15</FederalAmount>
<PerStateLaw>15</PerStateLaw>
<StateSourceNonRes>1</StateSourceNonRes>
</LongTermCapGain>
<NetSection1231Gain>
<FederalAmount>475</FederalAmount>
<PerStateLaw>475</PerStateLaw>
<StateSourceNonRes>17</StateSourceNonRes>
</NetSection1231Gain>
<AttributableToSaleFarmAssets/>
<OtherIncome>
<FederalAmount>-596</FederalAmount>
<PerStateLaw>-596</PerStateLaw>
<StateSourceNonRes>-21</StateSourceNonRes>
<Explanation>Other income</Explanation>
</OtherIncome>
<Sec179Deduction/>
<OtherDeductions>
<FederalAmount>12</FederalAmount>
<PerStateLaw>12</PerStateLaw>
<StateSourceNonRes>0</StateSourceNonRes>
<Explanation>Total Other Deductions</Explanation>
</OtherDeductions>
<ForeignTransactions>
<FederalAmount>64338</FederalAmount>
<PerStateLaw>64338</PerStateLaw>
<StateSourceNonRes>0</StateSourceNonRes>
<Explanation>GrossIncomeFromAllSources</Explanation>
</ForeignTransactions>
<ForeignTransactions>
<FederalAmount>170</FederalAmount>
<PerStateLaw>170</PerStateLaw>
<StateSourceNonRes>0</StateSourceNonRes>
<Explanation>GeneralCategorySourcedAtPartnershipLevel</Explanation>
</ForeignTransactions>
<ForeignTransactions>
<FederalAmount>151</FederalAmount>
<PerStateLaw>151</PerStateLaw>
<StateSourceNonRes>0</StateSourceNonRes>
<Explanation>GeneralCategoryApportionedAtPartnerLevel</Explanation>
</ForeignTransactions>
<ForeignTransactions>
<FederalAmount>5</FederalAmount>
<PerStateLaw>5</PerStateLaw>
<StateSourceNonRes>0</StateSourceNonRes>
<Explanation>TotalForeignTaxes</Explanation>
</ForeignTransactions>
<AltMinTax>
<FederalAmount>480</FederalAmount>
<PerStateLaw>480</PerStateLaw>
<StateSourceNonRes>17</StateSourceNonRes>
<Explanation>Post 1986 depreciation adjustment</Explanation>
</AltMinTax>
<AltMinTax>
<FederalAmount>-636</FederalAmount>
<PerStateLaw>-636</PerStateLaw>
<StateSourceNonRes>-23</StateSourceNonRes>
<Explanation>Adjusted gain or loss</Explanation>
</AltMinTax>
<NondeductibleExpenses>
<FederalAmount>31</FederalAmount>
<PerStateLaw>31</PerStateLaw>
</NondeductibleExpenses>
<Distributions>
<DistSecurities>
<FederalAmount>6460</FederalAmount>
<PerStateLaw>6460</PerStateLaw>
</DistSecurities>
</Distributions>
<OtherInformation>
<InvestmentIncome>
<FederalAmount>12</FederalAmount>
<Adjustment>12</Adjustment>
<StateSourceNonRes>0</StateSourceNonRes>
<Explanation>Investment Income</Explanation>
</InvestmentIncome>
</OtherInformation>
<IncomeLossReconciliation>
<PerStateLaw>1413</PerStateLaw>
<StateSourceNonRes>51</StateSourceNonRes>
</IncomeLossReconciliation>
<GrossIncomeAllActivities/>
</PartnersShare>
<PartnersApportionmentFactors>
<FirstFactor>
<FactorUsed>Property</FactorUsed>
<Wisconsin>0</Wisconsin>
<TotalCompany>0</TotalCompany>
</FirstFactor>
<SecondFactor>
<FactorUsed>Payroll</FactorUsed>
<Wisconsin>0</Wisconsin>
<TotalCompany>0</TotalCompany>
</SecondFactor>
<ThirdFactor>
<FactorUsed>Sales</FactorUsed>
<Wisconsin>0</Wisconsin>
<TotalCompany>0</TotalCompany>
</ThirdFactor>
</PartnersApportionmentFactors>
<PartnersShareAddSub>
<Additions>
<TotalAdditions>0</TotalAdditions>
</Additions>
<Subtractions>
<TotalSubtractions>0</TotalSubtractions>
</Subtractions>
<TotalAdjustment>0</TotalAdjustment>
</PartnersShareAddSub>
</Sch3K1>
</level1>
</Top>
XQuery
for
$level1 at $currentlevel1Pos in if(exists(./x:top/x:level1)) then ./x:top/x:level1 else element{'level1'} {''},
$Sch3K1 at $currentSch3K1Pos in if(exists(./x:top/x:level1/x:Sch3K1)) then ./x:top/x:level1/x:Sch3K1 else element{'Sch3K1'} {''},
$PartnerInformation at $currentPartnerInformationPos in if(exists($Sch3K1/x:PartnerInformation)) then $Sch3K1/x:PartnerInformation else element{'PartnerInformation'} {''},
$PartnersProfitBOY at $currentPartnersProfitBOYPos in if(exists($Sch3K1/x:PartnerInformation/x:PartnersProfitBOY)) then $Sch3K1/x:PartnerInformation/x:PartnersProfitBOY else element{'PartnersProfitBOY'} {''},
$PartnersProfitEOY at $currentPartnersProfitEOYPos in if(exists($Sch3K1/x:PartnerInformation/x:PartnersProfitEOY)) then $Sch3K1/x:PartnerInformation/x:PartnersProfitEOY else element{'PartnersProfitEOY'} {''},
$PartnersLossBOY at $currentPartnersLossBOYPos in if(exists($Sch3K1/x:PartnerInformation/x:PartnersLossBOY)) then $Sch3K1/x:PartnerInformation/x:PartnersLossBOY else element{'PartnersLossBOY'} {''},
$PartnersLossEOY at $currentPartnersLossEOYPos in if(exists($Sch3K1/x:PartnerInformation/x:PartnersLossEOY)) then $Sch3K1/x:PartnerInformation/x:PartnersLossEOY else element{'PartnersLossEOY'} {''},
$PartnersCapitalBOY at $currentPartnersCapitalBOYPos in if(exists($Sch3K1/x:PartnerInformation/x:PartnersCapitalBOY)) then $Sch3K1/x:PartnerInformation/x:PartnersCapitalBOY else element{'PartnersCapitalBOY'} {''},
$PartnersCapitalEOY at $currentPartnersCapitalEOYPos in if(exists($Sch3K1/x:PartnerInformation/x:PartnersCapitalEOY)) then $Sch3K1/x:PartnerInformation/x:PartnersCapitalEOY else element{'PartnersCapitalEOY'} {''}
let $genlevel1 := false
let $genSch3K1 := false
let $prevSch3K1 := ./x:top/x:level1/x:Sch3K1[$currentSch3K1Pos+-1]
let $nextSch3K1 := ./x:top/x:level1/x:Sch3K1[$currentSch3K1Pos+1]
let $Sch3K1Count := count(./x:top/x:level1/x:Sch3K1)
let $genPartnerInformation := false
let $genPartnersProfitBOY := exists($Sch3K1/x:PartnerInformation/x:PartnersProfitBOY)
let $genPartnersProfitEOY := exists($Sch3K1/x:PartnerInformation/x:PartnersProfitEOY)
let $genPartnersLossBOY := exists($Sch3K1/x:PartnerInformation/x:PartnersLossBOY)
let $genPartnersLossEOY := exists($Sch3K1/x:PartnerInformation/x:PartnersLossEOY)
let $genPartnersCapitalBOY := exists($Sch3K1/x:PartnerInformation/x:PartnersCapitalBOY)
let $genPartnersCapitalEOY := exists($Sch3K1/x:PartnerInformation/x:PartnersCapitalEOY)
return
<Evaluation>
<FieldEntry>
<Name>x:PartnersProfitBOY</Name>
<Xpath>x:top/x:level1/x:Sch3K1/x:PartnerInformation/x:PartnersProfitBOY</Xpath>
<Value>{$Sch3K1/x:PartnerInformation/x:PartnersProfitBOY/data()}</Value>
<NextValue>{$nextSch3K1/x:PartnerInformation/x:PartnersProfitBOY/data()}</NextValue>
<PrevValue>{$prevSch3K1/x:PartnerInformation/x:PartnersProfitBOY/data()}</PrevValue>
<Index>{$currentSch3K1Pos}</Index>
<Count>{$Sch3K1Count}</Count>
<FieldKey>{$Sch3K1/x:PartnerInformation/x:PartnersProfitBOY/@FieldKey/data()}</FieldKey>
<NodeIsPresent></NodeIsPresent>
<HasChildNodes></HasChildNodes>
</FieldEntry>
<FieldEntry>
<Name>x:PartnersProfitEOY</Name>
<Xpath>x:top/x:level1/x:Sch3K1/x:PartnerInformation/x:PartnersProfitEOY</Xpath>
<Value>{$Sch3K1/x:PartnerInformation/x:PartnersProfitEOY/data()}</Value>
<NextValue>{$nextSch3K1/x:PartnerInformation/x:PartnersProfitEOY/data()}</NextValue>
<PrevValue>{$prevSch3K1/x:PartnerInformation/x:PartnersProfitEOY/data()}</PrevValue>
<Index>{$currentSch3K1Pos}</Index>
<Count>{$Sch3K1Count}</Count>
<FieldKey>{$Sch3K1/x:PartnerInformation/x:PartnersProfitEOY/@FieldKey/data()}</FieldKey>
<NodeIsPresent></NodeIsPresent>
<HasChildNodes></HasChildNodes>
</FieldEntry>
<FieldEntry>
<Name>x:PartnersLossBOY</Name>
<Xpath>x:top/x:level1/x:Sch3K1/x:PartnerInformation/x:PartnersLossBOY</Xpath>
<Value>{$Sch3K1/x:PartnerInformation/x:PartnersLossBOY/data()}</Value>
<NextValue>{$nextSch3K1/x:PartnerInformation/x:PartnersLossBOY/data()}</NextValue>
<PrevValue>{$prevSch3K1/x:PartnerInformation/x:PartnersLossBOY/data()}</PrevValue>
<Index>{$currentSch3K1Pos}</Index>
<Count>{$Sch3K1Count}</Count>
<FieldKey>{$Sch3K1/x:PartnerInformation/x:PartnersLossBOY/@FieldKey/data()}</FieldKey>
<NodeIsPresent></NodeIsPresent>
<HasChildNodes></HasChildNodes>
</FieldEntry>
<FieldEntry>
<Name>x:PartnersLossEOY</Name>
<Xpath>x:top/x:level1/x:Sch3K1/x:PartnerInformation/x:PartnersLossEOY</Xpath>
<Value>{$Sch3K1/x:PartnerInformation/x:PartnersLossEOY/data()}</Value>
<NextValue>{$nextSch3K1/x:PartnerInformation/x:PartnersLossEOY/data()}</NextValue>
<PrevValue>{$prevSch3K1/x:PartnerInformation/x:PartnersLossEOY/data()}</PrevValue>
<Index>{$currentSch3K1Pos}</Index>
<Count>{$Sch3K1Count}</Count>
<FieldKey>{$Sch3K1/x:PartnerInformation/x:PartnersLossEOY/@FieldKey/data()}</FieldKey>
<NodeIsPresent></NodeIsPresent>
<HasChildNodes></HasChildNodes>
</FieldEntry>
<FieldEntry>
<Name>x:PartnersCapitalBOY</Name>
<Xpath>x:top/x:level1/x:Sch3K1/x:PartnerInformation/x:PartnersCapitalBOY</Xpath>
<Value>{$Sch3K1/x:PartnerInformation/x:PartnersCapitalBOY/data()}</Value>
<NextValue>{$nextSch3K1/x:PartnerInformation/x:PartnersCapitalBOY/data()}</NextValue>
<PrevValue>{$prevSch3K1/x:PartnerInformation/x:PartnersCapitalBOY/data()}</PrevValue>
<Index>{$currentSch3K1Pos}</Index>
<Count>{$Sch3K1Count}</Count>
<FieldKey>{$Sch3K1/x:PartnerInformation/x:PartnersCapitalBOY/@FieldKey/data()}</FieldKey>
<NodeIsPresent></NodeIsPresent>
<HasChildNodes></HasChildNodes>
</FieldEntry>
<FieldEntry>
<Name>x:PartnersCapitalEOY</Name>
<Xpath>x:top/x:level1/x:Sch3K1/x:PartnerInformation/x:PartnersCapitalEOY</Xpath>
<Value>{$Sch3K1/x:PartnerInformation/x:PartnersCapitalEOY/data()}</Value>
<NextValue>{$nextSch3K1/x:PartnerInformation/x:PartnersCapitalEOY/data()}</NextValue>
<PrevValue>{$prevSch3K1/x:PartnerInformation/x:PartnersCapitalEOY/data()}</PrevValue>
<Index>{$currentSch3K1Pos}</Index>
<Count>{$Sch3K1Count}</Count>
<FieldKey>{$Sch3K1/x:PartnerInformation/x:PartnersCapitalEOY/@FieldKey/data()}</FieldKey>
<NodeIsPresent></NodeIsPresent>
<HasChildNodes></HasChildNodes>
</FieldEntry>
</Evaluation>
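First, there are some anomalies here.
- The query does not compile, because it uses the namespace prefix "x", which has not been declared. (But the source document does not appear to use a namespace.)
- The query refers to the top-level element as x:top, but in the source document it is Top.
- Some variables are bound to false, where false() was surely intended (Saxon gives a warning about this).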
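A minimal sketch of how these three anomalies might be corrected, assuming the real document is namespace-free like the sample and that the query is run with the document as the context item. Only one loop is shown; the element name Index is borrowed from the original return clause, and the generated attribute is purely illustrative.

```xquery
(: If the real document were in a namespace, the prefix would have to be
   declared in the prolog, for example (placeholder URI, not from the post):
   declare namespace x = "urn:example:placeholder";
   The sample document has no namespace, so the paths below are unprefixed. :)

(: false() is the boolean constant. A bare "false" is a path expression that
   selects child elements named false of the context item. :)
let $genlevel1 := false()

(: Element names are case-sensitive: the root of the sample is Top, not top. :)
for $Sch3K1 at $currentSch3K1Pos in /Top/level1/Sch3K1
return
  <Index generated="{$genlevel1}">{$currentSch3K1Pos}</Index>
```

With the sample document above as input, this should compile and return one Index element per Sch3K1.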
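Second, there are a lot of variables that are declared and never used, for example $PartnersCapitalBOY and $genPartnersCapitalBOY. In principle it is easy for the optimizer to ignore unused variables, but it is not always a good idea to give the optimizer unnecessary work, because it distracts it from finding the patterns where optimization can make a real difference.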
Third, I am suspicious of the repeated use of the construct:
(for) $PartnerInformation at $currentPartnerInformationPos
in if(exists($Sch3K1/x:PartnerInformation))
then $Sch3K1/x:PartnerInformation
else element{'PartnerInformation'} {''},
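The problem here is that a construct that creates new elements cannot be moved out of a loop, because XQuery is very fussy about the fact that such a construct has to create a distinct element each time it is executed. So (without actually checking what the optimizer does in detail) I suspect this construct inhibits possible optimizations.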
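The node-identity rule behind this can be seen in a small standalone sketch. The variable names $shared, $fresh and $reused are made up for illustration; the query needs no input document.

```xquery
(: A single element, constructed once when the global variable is evaluated. :)
declare variable $shared := <PartnerInformation/>;

(: Constructor inside the loop: every iteration must yield a new node. :)
let $fresh  := for $i in 1 to 2 return <PartnerInformation/>

(: Reference to the global variable: every iteration yields the same node. :)
let $reused := for $i in 1 to 2 return $shared

return (
  $fresh[1]  is $fresh[2],    (: false: two distinct nodes :)
  $reused[1] is $reused[2]    (: true: the same node twice :)
)
```

Because the "is" operator can tell the two cases apart, a processor is not free to evaluate an in-loop constructor once and reuse the result, which is consistent with the suspicion that the constructors get in the way of loop-lifting.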
Fourth, the clauses:
let $prevSch3K1 := ./x:top/x:level1/x:Sch3K1[$currentSch3K1Pos+-1]
let $nextSch3K1 := ./x:top/x:level1/x:Sch3K1[$currentSch3K1Pos+1]
are likely to be more efficient if ./x:top/x:level1/x:Sch3K1 is bound to a global variable.
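At first sight your query is pretty horrendous, with 9 nested loops each iterating over 80K elements: a naive implementation would execute the innermost code about 10^45 times, so if the innermost code took one nanosecond to execute, the total query would take 10^36 seconds, which is a rather long time given that the age of the universe is less than 10^18 seconds. So if this is running in an hour, the optimizer is doing pretty well.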
The only reason it is able to do so well is that so much of the query is evidently pointless.
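The order-of-magnitude estimate for the naive nested-loop evaluation can be spelled out as follows, taking the 9 loops of 80,000 iterations and the one nanosecond per innermost iteration from the text above ($N$ and $t$ are just labels introduced here for the iteration count and the run time):

```latex
\begin{align*}
N &= (8 \times 10^{4})^{9} = 8^{9} \times 10^{36} \approx 1.3 \times 10^{44}, \\
t &\approx N \times 10^{-9}\,\mathrm{s} \approx 1.3 \times 10^{35}\,\mathrm{s}.
\end{align*}
```

Rounded up to the next power of ten, these are the figures quoted in the answer; either way the naive run time is many orders of magnitude beyond anything that could finish.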
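Looking at the optimizer trace (-explain), I am actually surprised how little optimization is performed, and I suspect the main reason for this is the element constructors in the middle of the "for" clause.
I would start by simplifying the query:
- Eliminate all the unused variables.
- If you really need to create dummy elements to achieve an outer join, create those dummy elements once, as global variables, rather than creating them repeatedly within the loop.
With those changes the logic might become clearer. I suspect that in essence it is actually a very simple query.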
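A sketch of what such a simplified query might look like follows. It is not the answerer's code, and it rests on several assumptions: the document has no namespace and its root is Top, each Sch3K1 has at most one PartnerInformation, each field occurs at most once, and one Evaluation per Sch3K1 is the intended result. The names $allSch3K1, $fieldNames and local:field-entry are hypothetical, and the Name and Xpath values are written without the x: prefix.

```xquery
(: Evaluated once, instead of re-walking /Top/level1/Sch3K1 in every iteration. :)
declare variable $allSch3K1 := /Top/level1/Sch3K1;
declare variable $Sch3K1Count := count($allSch3K1);
declare variable $fieldNames := (
  'PartnersProfitBOY', 'PartnersProfitEOY',
  'PartnersLossBOY', 'PartnersLossEOY',
  'PartnersCapitalBOY', 'PartnersCapitalEOY'
);

(: Hypothetical helper: builds one FieldEntry for the named field of the
   Sch3K1 at position $pos, together with its neighbours' values. :)
declare function local:field-entry(
  $name as xs:string,
  $pos  as xs:integer
) as element(FieldEntry) {
  let $cur  := $allSch3K1[$pos]/PartnerInformation/*[local-name() = $name]
  let $next := $allSch3K1[$pos + 1]/PartnerInformation/*[local-name() = $name]
  let $prev := $allSch3K1[$pos - 1]/PartnerInformation/*[local-name() = $name]
  return
    <FieldEntry>
      <Name>{$name}</Name>
      <Xpath>Top/level1/Sch3K1/PartnerInformation/{$name}</Xpath>
      <Value>{data($cur)}</Value>
      <NextValue>{data($next)}</NextValue>
      <PrevValue>{data($prev)}</PrevValue>
      <Index>{$pos}</Index>
      <Count>{$Sch3K1Count}</Count>
      <FieldKey>{data($cur/@FieldKey)}</FieldKey>
      <NodeIsPresent/>
      <HasChildNodes/>
    </FieldEntry>
};

(: One loop over the records; no dummy elements, no unused variables. :)
for $pos in 1 to $Sch3K1Count
return
  <Evaluation>{
    for $name in $fieldNames
    return local:field-entry($name, $pos)
  }</Evaluation>
```

A missing field simply produces empty Value, NextValue or PrevValue elements, so no dummy elements are needed in this version. The local-name() predicate keeps the sketch short; writing out the six child steps by name, as the original query does, may be faster and is worth measuring.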
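Following Michael Kay's suggestions, I changed the FLWOR statement to use global variables for the element constructors and for some of the variable assignments. The return statement is unchanged and is not included below. When I run Query.exe, it takes 21 minutes with the changes, compared with 24 minutes before, to return the result. That is a slight improvement. The result saved to a file is 150 MB... so what am I missing? Thanks.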
let $docxml := doc("p.xml")
let $gSch3K1 := $docxml/Top/level1/Sch3K1
let $glevel1Element := element{'level1'} {''}
let $gSch3K1Element := element{'Sch3K1'} {''}
let $gPartnerInformationElement := element{'PartnerInformation'} {''}
for
$level1 at $currentlevel1Pos in if(exists($docxml/Top/level1)) then $docxml/Top/level1 else $glevel1Element,
$Sch3K1 at $currentSch3K1Pos in if(exists($docxml/Top/level1/Sch3K1)) then $docxml/Top/level1/Sch3K1 else $gSch3K1Element,
$PartnerInformation at $currentPartnerInformationPos in if(exists($Sch3K1/PartnerInformation)) then $Sch3K1/PartnerInformation else $gPartnerInformationElement
let $prevSch3K1 := $gSch3K1[$currentSch3K1Pos+-1]
let $nextSch3K1 := $gSch3K1[$currentSch3K1Pos+1]
let $Sch3K1Count := count($docxml/Top/level1/Sch3K1)
return
---