How should I use Task.Run in my code for proper scalability and performance?
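I'm starting to have serious doubts about my code, and I need some advice from more experienced programmers.
In my application, on a button click, the app runs a command that calls the ScrapJockeys method: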
if (UpdateJockeysPl) await ScrapJockeys(JPlFrom, JPlTo + 1, "jockeysPl"); //1 - 1049
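ScrapJockeys runs a for loop that repeats a block of code 20K - 150K times (depending on the case). Inside the loop I need to call a service method that takes a long time to execute. I also want to be able to cancel the loop and everything happening inside the loop/method.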
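Right now I'm using an approach with a list of tasks, where Task.Run is fired inside the loop. Inside each task I call an awaited service method, which cuts the total execution time to about 1/4 of what the synchronous code needs. Each task also has a cancellation token assigned, as in the example (GitHub link):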
public async Task ScrapJockeys(int startIndex, int stopIndex, string dataType)
{
    //init values and controls in here
    List<Task> tasks = new List<Task>();
    for (int i = startIndex; i < stopIndex; i++)
    {
        int j = i;
        Task task = Task.Run(async () =>
        {
            LoadedJockey jockey = new LoadedJockey();
            CancellationToken.ThrowIfCancellationRequested();
            if (dataType == "jockeysPl") jockey = await _scrapServices.ScrapSingleJockeyPlAsync(j);
            if (dataType == "jockeysCz") jockey = await _scrapServices.ScrapSingleJockeyCzAsync(j);
            //doing some stuff with results in here
        }, TokenSource.Token);
        tasks.Add(task);
    }
    try
    {
        await Task.WhenAll(tasks);
    }
    catch (OperationCanceledException)
    {
        //
    }
    finally
    {
        await _dataServices.SaveAllJockeysAsync(Jockeys.ToList()); //saves everything to JSON file
        //doing some stuff with UI props in here
    }
}
As for my question: is everything OK with my code? According to this article:
Many async newbies start off by trying to treat asynchronous tasks the same as parallel (TPL) tasks and this is a major misstep.
So what should I use instead?
And according to this article:
On a busy server, this kind of implementation can kill scalability.
So how should I do it?
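Please note that the service interface method signature is Task<LoadedJockey> ScrapSingleJockeyPlAsync(int index);
I'm also not 100% sure that I'm using Task.Run correctly in my service class. The methods inside it wrap their code in await Task.Run(() =>, as in the example (GitHub link):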
public async Task<LoadedJockey> ScrapSingleJockeyPlAsync(int index)
{
    LoadedJockey jockey = new LoadedJockey();
    await Task.Run(() =>
    {
        //do some time consuming things
    });
    return jockey;
}
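As far as I understand from the articles, this is a kind of anti-pattern. But I'm a bit confused: from what I have read, it should be fine...? If not, how should I replace it?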
As far as I understand from the articles, this is a kind of anti-pattern.
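It is an anti-pattern. However, if you can't modify the service implementation, you should at least be able to run the tasks in parallel. Something like this: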
public async Task ScrapJockeys(int startIndex, int stopIndex, string dataType)
{
    ConcurrentBag<Task> tasks = new ConcurrentBag<Task>();
    ParallelOptions parallelLoopOptions = new ParallelOptions() { CancellationToken = CancellationToken };
    try
    {
        // Parallel.For throws OperationCanceledException when the token is cancelled,
        // so it sits inside the try block together with the await.
        Parallel.For(startIndex, stopIndex, parallelLoopOptions, i =>
        {
            int j = i;
            switch (dataType)
            {
                case "jockeysPl":
                    tasks.Add(_scrapServices.ScrapSingleJockeyPlAsync(j));
                    break;
                case "jockeysCz":
                    tasks.Add(_scrapServices.ScrapSingleJockeyCzAsync(j));
                    break;
            }
        });
        await Task.WhenAll(tasks);
    }
    catch (OperationCanceledException)
    {
        //
    }
    finally
    {
        await _dataServices.SaveAllJockeysAsync(Jockeys.ToList()); //saves everything to JSON file
        //doing some stuff with UI props in here
    }
}
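On the UI side, you should use Task.Run when you have CPU-bound code that runs long enough that you need to move it off the UI thread. This is completely different from the server side, where using Task.Run at all is an anti-pattern.
In your case, all of your code appears to be I/O-based, so I don't see any need for Task.Run at all.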
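The distinction above can be illustrated with a minimal sketch. The names here (TaskRunGuidance, SumOfSquares, FetchAsync, HandleClickAsync) are hypothetical and do not come from the question's code; Task.Delay merely stands in for a real network call. The sketch only shows where Task.Run fits in a UI handler and where it does not.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskRunGuidance
{
    // CPU-bound work: a synchronous computation (hypothetical stand-in).
    public static long SumOfSquares(int n)
    {
        long total = 0;
        for (int i = 1; i <= n; i++) total += (long)i * i;
        return total;
    }

    // I/O-bound work: Task.Delay stands in for a network request (assumption).
    public static async Task<string> FetchAsync(int id, CancellationToken token)
    {
        await Task.Delay(50, token);
        return $"payload-{id}";
    }

    // Imagine this being awaited from a button-click handler.
    public static async Task<(long Sum, string Payload)> HandleClickAsync(CancellationToken token)
    {
        // CPU-bound: push it to the thread pool so the UI thread stays responsive.
        long sum = await Task.Run(() => SumOfSquares(1000), token);

        // I/O-bound: already asynchronous, so it is awaited directly without Task.Run.
        string payload = await FetchAsync(7, token);

        return (sum, payload);
    }
}
```

On a server there is no UI thread to protect, so the first call would normally just be `SumOfSquares(1000)` without the Task.Run wrapper.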
There is a statement in your question that conflicts with the code you provided:
I am calling an awaited service method
public async Task<LoadedJockey> ScrapSingleJockeyPlAsync(int index)
{
    await Task.Run(() =>
    {
        //do some time consuming things
    });
}
The lambda passed to Task.Run is not async, so the service method cannot possibly be awaited inside it. And indeed it is not.
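To make the difference concrete, here is a small sketch with hypothetical names (LambdaKinds, SlowValueAsync). A synchronous lambda can only get a result out of an asynchronous method by blocking a thread-pool thread, whereas an async lambda can await it and release the thread while waiting.

```csharp
using System;
using System.Threading.Tasks;

public static class LambdaKinds
{
    // Hypothetical asynchronous operation; Task.Delay simulates I/O.
    public static async Task<int> SlowValueAsync()
    {
        await Task.Delay(20);
        return 42;
    }

    // Synchronous lambda: 'await' is not allowed here, so the only way to
    // obtain the value is to block a thread-pool thread until it is ready.
    public static Task<int> WithSyncLambda() =>
        Task.Run(() => SlowValueAsync().GetAwaiter().GetResult());

    // Async lambda: the thread is released while the operation is pending,
    // and Task.Run unwraps the inner task into a plain Task<int>.
    public static Task<int> WithAsyncLambda() =>
        Task.Run(async () => await SlowValueAsync());
}
```

Both methods produce the same value; they differ in whether a thread is tied up while the operation is in flight.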
A better solution would be to load the HTML asynchronously (for example, with HttpClient.GetStringAsync) and then call HtmlDocument.LoadHtml, like this:
public async Task<LoadedJockey> ScrapSingleJockeyPlAsync(int index)
{
    LoadedJockey jockey = new LoadedJockey();
    ...
    string link = sb.ToString();
    var html = await httpClient.GetStringAsync(link).ConfigureAwait(false);
    HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
    doc.LoadHtml(html);
    if (jockey.Name == null)
    ...
    return jockey;
}
And remove the Task.Run from the for loop:
private async Task ScrapJockey(int j, string dataType)
{
    LoadedJockey jockey = new LoadedJockey();
    CancellationToken.ThrowIfCancellationRequested();
    if (dataType == "jockeysPl") jockey = await _scrapServices.ScrapSingleJockeyPlAsync(j).ConfigureAwait(false);
    if (dataType == "jockeysCz") jockey = await _scrapServices.ScrapSingleJockeyCzAsync(j).ConfigureAwait(false);
    //doing some stuff with results in here
}
public async Task ScrapJockeys(int startIndex, int stopIndex, string dataType)
{
    //init values and controls in here
    List<Task> tasks = new List<Task>();
    for (int i = startIndex; i < stopIndex; i++)
    {
        // pass the index explicitly; the helper has no access to the loop variable
        tasks.Add(ScrapJockey(i, dataType));
    }
    try
    {
        await Task.WhenAll(tasks);
    }
    catch (OperationCanceledException)
    {
        //
    }
    finally
    {
        await _dataServices.SaveAllJockeysAsync(Jockeys.ToList()); //saves everything to JSON file
        //doing some stuff with UI props in here
    }
}
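With 20K - 150K iterations, starting every request at once may overwhelm the remote site or the local connection pool. That concern is not raised in the answer, so the following is only an optional sketch under that assumption: it caps the number of in-flight operations with SemaphoreSlim and passes the cancellation token down to each operation. ThrottledScraper, ScrapOneAsync and maxConcurrency are hypothetical names; ScrapOneAsync stands in for the real service call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledScraper
{
    // Hypothetical stand-in for _scrapServices.ScrapSingleJockeyPlAsync(index).
    private static async Task<string> ScrapOneAsync(int index, CancellationToken token)
    {
        await Task.Delay(10, token); // simulated network I/O
        return $"jockey-{index}";
    }

    public static async Task<IReadOnlyList<string>> ScrapRangeAsync(
        int startIndex, int stopIndex, int maxConcurrency, CancellationToken token)
    {
        using var gate = new SemaphoreSlim(maxConcurrency);

        async Task<string> ScrapThrottledAsync(int index)
        {
            // Wait for a free slot; cancellation is observed while waiting.
            await gate.WaitAsync(token);
            try
            {
                return await ScrapOneAsync(index, token);
            }
            finally
            {
                gate.Release(); // free the slot even if the call failed
            }
        }

        // No Task.Run: each item is plain asynchronous I/O.
        List<Task<string>> tasks = Enumerable
            .Range(startIndex, stopIndex - startIndex)
            .Select(i => ScrapThrottledAsync(i))
            .ToList();

        // Results come back in the same order as the indices.
        return await Task.WhenAll(tasks);
    }
}
```

A call such as `await ThrottledScraper.ScrapRangeAsync(1, 1050, 8, TokenSource.Token)` would keep at most eight requests in flight; the right limit depends on the target site and would need to be measured.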