U-SQL Column Type Conversion
I created a U-SQL query that reads input files from Data Lake Store and transforms the values. The final output is written back to Data Lake Store.
DECLARE @in string = "system/dbotable{*}.tsv";
DECLARE @out string = "system/temp.tsv";

@searchlog =
    EXTRACT Id int,
            Address string,
            number int
    FROM @in
    USING Extractors.Tsv();

@transactions =
    SELECT *,
           ROW_NUMBER() OVER(PARTITION BY Id ORDER BY Id DESC) AS RowNumber
    FROM @searchlog;

@result =
    SELECT Id,
           Address,
           number
    FROM @transactions
    WHERE RowNumber == 1;

OUTPUT @result
TO @out
USING Outputters.Tsv();
It fails with the following error:
Execution failed with error '1_SV1_Extract Error : '{"diagnosticCode":195887132,"severity":"Error","component":"RUNTIME","source":"User","errorId":"E_RUNTIME_USER_EXTRACT_COLUMN_CONVERSION_INVALID_ERROR","message":"Invalid character when attempting to convert column data.","description":"HEX: \"2243616E696E6522\" Invalid character when converting input record.\nPosition: line 1, column index: 1, column name: \"Id\".","resolution":"Check the input for errors or use \"silent\" switch to ignore over(under)-sized rows in the input.\nConsider that ignoring \"invalid\" rows may influence job results and that types have to be nullable for conversion errors to be ignored.","helpLink":""
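It looks like the Id column is not always an integer. The HEX value in the error, 2243616E696E6522, decodes to the text "Canine" (including the double quotes), which cannot be converted to int.
I would first extract the Id column as a string, and then in a second step try to convert it to an int with a user-defined function, as shown here: https://msdn.microsoft.com/en-us/library/azure/mt621309.aspx (the example there is based on DateTime).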
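A minimal sketch of that two-step approach, assuming an inline C# function expression built on int.TryParse is acceptable in place of a separately registered user-defined function. The rowset names @raw, @typed and @clean are hypothetical and not part of the original script.

```usql
DECLARE @in string = "system/dbotable{*}.tsv";
DECLARE @out string = "system/temp.tsv";

// Step 1: read Id as text so the extractor does not have to parse it.
@raw =
    EXTRACT Id string,
            Address string,
            number int
    FROM @in
    USING Extractors.Tsv();

// Step 2: convert Id with an inline C# function expression.
// Values that are not valid integers become null instead of failing the job.
@typed =
    SELECT (Func<string, int?>)(s =>
           {
               int parsed;
               return int.TryParse(s, out parsed) ? (int?) parsed : (int?) null;
           })(Id) AS Id,
           Address,
           number
    FROM @raw;

// Step 3: keep only the rows whose Id was numeric.
@clean =
    SELECT Id,
           Address,
           number
    FROM @typed
    WHERE Id != null;

OUTPUT @clean
TO @out
USING Outputters.Tsv();
```

The ROW_NUMBER() step from the question can then read from @clean instead of @searchlog. If the number column can also contain non-numeric text, the same pattern would apply to it.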
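Another option is to use silent:true in the extractor, so that rows that fail conversion are ignored automatically. Note that, according to the resolution text in the error message, the column types have to be nullable for conversion errors to be ignored.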
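A sketch of the silent variant, assuming the built-in Tsv extractor's silent parameter and nullable column types (int? instead of int), as the error's resolution text requires. The rowset name @valid and the extra null filter are assumptions added for illustration; the filter covers the case where a bad value comes through as null rather than the row being dropped.

```usql
DECLARE @in string = "system/dbotable{*}.tsv";
DECLARE @out string = "system/temp.tsv";

// Nullable types are needed for conversion errors to be ignored.
@searchlog =
    EXTRACT Id int?,
            Address string,
            number int?
    FROM @in
    USING Extractors.Tsv(silent : true);

// Defensive filter: discard rows that ended up without a usable Id.
@valid =
    SELECT Id,
           Address,
           number
    FROM @searchlog
    WHERE Id != null;

OUTPUT @valid
TO @out
USING Outputters.Tsv();
```

As the error message itself warns, silently ignoring invalid input can change the job's results, so it is worth checking how many rows are lost before relying on this.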