SSIS 2015 Script task convert text file to UTF8 in C# or VB
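I want to convert the txt files I generate to UTF-8 so that I can load them into my Azure SQL DW via PolyBase. The source files are required to be UTF-8.
MSDN has an "IO Streaming example" here that works well for a single job. However, I'm trying to build an SSIS solution for about 30 tables. I believe this approach would cause a race condition: the PS script would be locked by one SSIS package while another SSIS package needs it.
I'm a SQL developer, not a .NET developer, so please bear with me. Assuming I know how to pass parameters to a Script Task, how would I convert the script below into an SSIS C# Script Task?
PowerShell code from MSDN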
#Static variables
$ascii = [System.Text.Encoding]::ASCII
$utf16le = [System.Text.Encoding]::Unicode
$utf8 = [System.Text.Encoding]::UTF8
$ansi = [System.Text.Encoding]::Default
$append = $False
#Set source file path and file name
$src = [System.IO.Path]::Combine("<MySrcFolder>","<MyUtf8stage>.txt")
#Set source file encoding (using list above)
$src_enc = $ascii
#Set target file path and file name
$tgt = [System.IO.Path]::Combine("<MyDestFolder>","<MyFinalstage>.txt")
#Set target file encoding (using list above)
$tgt_enc = $utf8
$read = New-Object System.IO.StreamReader($src,$src_enc)
$write = New-Object System.IO.StreamWriter($tgt,$append,$tgt_enc)
while ($read.Peek() -ne -1)
{
$line = $read.ReadLine();
$write.WriteLine($line);
}
$read.Close()
$read.Dispose()
$write.Close()
$write.Dispose()
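Update
I found a similar post that I was able to adapt to my needs; I swear I searched a lot before posting. Anyway, this is what is working for me. If you see any improvements, please share: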
public void Main()
{
//$Package::SourceSQLObject = tablename
//$Package::StageFile_DestinationFolderPath = rootpath eg "C:\temp\"
string path = (string)Dts.Variables["$Package::StageFile_DestinationFolderPath"].Value;
string name = (string)Dts.Variables["$Package::SourceSQLObject"].Value;
string from = Path.Combine(path, name) + ".csv";
string to = Path.ChangeExtension(from, "txt");
Dts.Log("Starting " + to.ToUpper(), 0, null);
using (StreamReader reader = new StreamReader(from, Encoding.ASCII, false, 10))
using (StreamWriter writer = new StreamWriter(to, false, Encoding.UTF8, 10))
{
while (reader.Peek() >= 0)
{
writer.WriteLine(reader.ReadLine());
}
}
Dts.TaskResult = (int)ScriptResults.Success;
}
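A few things in the loop above may be worth a second look. `Encoding.ASCII` decodes every byte above 0x7F as `?`, `Encoding.UTF8` makes `StreamWriter` put a byte order mark at the start of a new file, and the `ReadLine`/`WriteLine` pair rewrites every line ending as `Environment.NewLine`. Whether any of that matters for PolyBase is not something I have verified. The following is only a sketch of one alternative, assuming you want the text copied character for character into UTF-8 without a BOM; the class name `EncodingConverter` and the method `ConvertToUtf8` are made up for illustration.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of SSIS or the MSDN sample.
public static class EncodingConverter
{
    // Decodes src with srcEncoding and writes it to dst as UTF-8 without a BOM.
    // Characters are copied in blocks, so the original line endings are kept.
    public static void ConvertToUtf8(string src, string dst, Encoding srcEncoding)
    {
        // false = do not emit the UTF-8 byte order mark
        var utf8NoBom = new UTF8Encoding(false);
        var buffer = new char[64 * 1024];

        using (var reader = new StreamReader(src, srcEncoding, false, buffer.Length))
        using (var writer = new StreamWriter(dst, false, utf8NoBom, buffer.Length))
        {
            int read;
            while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
            {
                writer.Write(buffer, 0, read);
            }
        }
    }
}
```

Inside the script task this could be called as `EncodingConverter.ConvertToUtf8(from, to, Encoding.GetEncoding(1252));`, where code page 1252 is just an assumption about how the source files were written; pass `Encoding.ASCII` if they really are 7-bit ASCII. Each call only touches the two files it is given, so packages working on different tables do not share anything.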
var mySrcFolder = ""; // something from user variables?
var myDestFolder = ""; // something from user variables?
var myUtf8stage = ""; // something from user variables?
var myFinalstage = ""; // something from user variables?
// Static variables
var ascii = System.Text.Encoding.ASCII;
var utf16le = System.Text.Encoding.Unicode;
var utf8 = System.Text.Encoding.UTF8;
var ansi = System.Text.Encoding.Default;
var append = false;
// Set source file path and file name
var src = System.IO.Path.Combine(
mySrcFolder,
String.Format("{0}.txt", myUtf8stage));
// Set source file encoding (using list above)
var src_enc = ascii;
// Set target file path and file name (the PowerShell original writes to <MyDestFolder>)
var tgt = System.IO.Path.Combine(
myDestFolder,
String.Format("{0}.txt", myFinalstage));
// Set target file encoding (using list above)
var tgt_enc = utf8;
using (var read = new System.IO.StreamReader(src, src_enc))
using (var write = new System.IO.StreamWriter(tgt, append, tgt_enc))
{
while (read.Peek() != -1)
{
var line = read.ReadLine();
write.WriteLine(line);
}
}
Your code indicates that you are trying to convert an ASCII file to UTF-8; however, the article also states the following:
As UTF-8 uses the same character encoding as ASCII PolyBase will also
support loading data that is ASCII encoded.
So my advice is to try loading the file with PolyBase first and check for any conversion issues before spending time converting the files.
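One cheap way to check this before loading: a file that contains only 7-bit ASCII bytes is byte for byte identical to its UTF-8 encoding, so it needs no conversion at all. Below is a minimal sketch of such a check, assuming the files are small enough to scan once before the load; the class `AsciiCheck` and the method `FindFirstNonAsciiByte` are hypothetical names, not anything from SSIS or PolyBase.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only.
public static class AsciiCheck
{
    // Returns the zero-based offset of the first byte above 0x7F,
    // or -1 if every byte in the file is 7-bit ASCII.
    public static long FindFirstNonAsciiByte(string path)
    {
        var buffer = new byte[64 * 1024];
        long offset = 0;

        using (var stream = File.OpenRead(path))
        {
            int read;
            while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                for (int i = 0; i < read; i++)
                {
                    if (buffer[i] > 0x7F)
                    {
                        return offset + i;
                    }
                }
                offset += read;
            }
        }

        return -1;
    }
}
```

If this returns -1 for a file, it can go to PolyBase as is. Any other result means the file is not plain ASCII, and the offset tells you where to look to work out which encoding it was actually written in.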