从查询中的数值表达式返回的数据类型

Question

在 cloudera 的 Impala 指南中（ https://www.cloudera.com/documentation/enterprise/5-8-x/topics/impala_conversion_functions.html) 有一个示例演示了如何使用 typeof() 函数来检查数值表达式的返回数据类型：

+--------------------------+
| typeof(5.30001 / 2342.1) |
+--------------------------+
| DECIMAL(13,11)           |
+--------------------------+

当我这样做时:

select typeof(5.30001),typeof(2342.1),typeof(5.30001 / 2342.1);

给出的是这样的

DECIMAL(6,5)    DECIMAL(5,1)    DECIMAL(13,11)

我的意思是前两个很明显，但我不知道为什么第三个的数据类型像 that.Can 返回的数据类型由数值表达式本身确定？另外，对于我表示为 decimal(13,5)/decimal(25,4)（例如）的列除法表达式，有没有办法确定返回的数据类型应该是什么？谢谢。

Answer 1

这个确实有点晦涩。下面是算术小数类型转换的相关代码。

  /**
   * Returns the result type for (t1 op t2) where t1 and t2 are both DECIMAL, used when
   * DECIMAL version 2 is enabled.
   *
   * These rules are similar to (post Dec 2016) Hive / sql server rules.
   * http://blogs.msdn.com/b/sqlprogrammability/archive/2006/03/29/564110.aspx
   * https://msdn.microsoft.com/en-us/library/ms190476.aspx
   *
   * TODO: implement V2 rules for ADD/SUB.
   *
   * Changes:
   *  - There are slight difference with how precision/scale reduction occurs compared
   *    to SQL server when the desired precision is more than the maximum supported
   *    precision.  But an algorithm of reducing scale to a minimum of 6 is used.
   */
  private static ScalarType getDecimalArithmeticResultTypeV2(Type t1, Type t2,
      ArithmeticExpr.Operator op) throws AnalysisException {
    Preconditions.checkState(t1.isFullySpecifiedDecimal());
    Preconditions.checkState(t2.isFullySpecifiedDecimal());
    ScalarType st1 = (ScalarType) t1;
    ScalarType st2 = (ScalarType) t2;
    int s1 = st1.decimalScale();
    int s2 = st2.decimalScale();
    int p1 = st1.decimalPrecision();
    int p2 = st2.decimalPrecision();
    int resultScale;
    int resultPrecision;

    switch (op) {
      case DIVIDE:
        // Divide result always gets at least MIN_ADJUSTED_SCALE decimal places.
        resultScale = Math.max(ScalarType.MIN_ADJUSTED_SCALE, s1 + p2 + 1);
        resultPrecision = p1 - s1 + s2 + resultScale;
        break;
      case MOD:
        resultScale = Math.max(s1, s2);
        resultPrecision = Math.min(p1 - s1, p2 - s2) + resultScale;
        break;
      case MULTIPLY:
        resultScale = s1 + s2;
        resultPrecision = p1 + p2 + 1;
        break;
      case ADD:
      case SUBTRACT:
      default:
        // Not yet implemented - fall back to V1 rules.
        return getDecimalArithmeticResultTypeV1(t1, t2, op);
    }
    // Use the scale reduction technique when resultPrecision is too large.
    return ScalarType.createAdjustedDecimalType(resultPrecision, resultScale);
  }

所以 5.30001 的精度为 6，比例为 5，而 2342.1 的精度为 5，比例为 1。我们有 s1 = 5 和 p2 = 5。输出比例为 s1 + p2 + 1 == 11，精度为6 - 5 + 1 + 11 == 13。这些规则在评论链接中有更好的解释。

从查询中的数值表达式返回的数据类型

Returned data type from a numeric expression in query

sql

hadoop

impala