How to ensure that a long-running method is not executed in parallel by multiple tasks?

Question

I have a basic dictionary-based cache in a class.

private HashSet<string> _myCache = new HashSet<string>();

private void BuildMyCache()
{
    lock (_cacheLock)
    {
        try
        {
            _cacheRebuildInProgress = true;
            _myCache.Clear();
            // Populate the _myCache hashset here by querying the database
            _cacheRebuildInProgress = false;
        }
        catch (Exception ex)
        {
            _cacheRebuildInProgress = false;
        }
    }
}

Then I have this method in the same class. It can be called by multiple tasks at the same time, all running concurrently.

public bool ValueExistsInMyCache(string valueToSearch)
{
    while (true)
    {
        if (!_cacheRebuildInProgress)
        {
            if (_myCache.Count == 0 || stopWatch.ElapsedMilliseconds / 1000 / 60 >= _cacheSettings.TimespanMinutes)
            {
                BuildMyCache();
                stopWatch.Restart();
            }
            break;
        }
        else
        {
            Thread.Sleep(1000);
        }
    }

    var result = _myCache.Contains(valueToSearch);
    return result;
}

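So I think the ideal way for this to work is:

1- Multiple tasks may try to read from my cache at the same time. That is fine.

2- If a cache rebuild is in progress, I want the rebuild method to be called only once, with any subsequent tasks waiting while it rebuilds and then searching the rebuilt cache.

The question is: does the code above accomplish 2? And should I consider using a caching library for this simple use case, so that I don't get headaches later from hard-to-debug situations caused by something I may have written wrong?

Thanks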

Answer 1

You can use Interlocked.CompareExchange.

if(Interlocked.CompareExchange(ref myField, 1, 0) == 0){

}

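This should ensure that only one caller succeeds in the check, while all other callers fail. If you want to run the check more than once, you obviously need to reset the field, and use a lock or a memory barrier to ensure the write is not reordered by the compiler or CPU.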
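As a minimal sketch of the reset described above: the class name RebuildGuard, the field _rebuildInProgress, and the method TryRebuild are hypothetical names introduced for illustration, not part of the original answer. The flag is reset with Interlocked.Exchange, which acts as a full memory barrier.

```csharp
using System;
using System.Threading;

public sealed class RebuildGuard
{
    // 0 = idle, 1 = a rebuild is running (hypothetical field name).
    private int _rebuildInProgress;

    // Runs the rebuild only if no other caller is currently running it.
    // Returns true if this caller performed the rebuild, false if it was skipped.
    public bool TryRebuild(Action rebuild)
    {
        // Atomically flip 0 -> 1; only one caller observes the old value 0.
        if (Interlocked.CompareExchange(ref _rebuildInProgress, 1, 0) != 0)
            return false;

        try
        {
            rebuild();
            return true;
        }
        finally
        {
            // Interlocked.Exchange is a full fence, so this reset cannot be
            // reordered ahead of the writes made inside rebuild().
            Interlocked.Exchange(ref _rebuildInProgress, 0);
        }
    }
}
```

Note that callers who lose the race skip the rebuild and return immediately; they do not wait for it to finish. If waiting is required, a lock or an event is still needed on top of this.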

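Also note that HashSet<string> is not thread-safe, so if there is any chance of concurrent access (other than concurrent reads), all access needs to be inside a lock, or possibly some other synchronization mechanism. The posted example code does not look safe to me. ConcurrentDictionary might be a much safer option to use. That sleep also looks like a poor design; if some other thread is updating the cache, it might be a better option to wait for that to be done using some kind of event.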
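To illustrate the ConcurrentDictionary suggestion, here is a rough sketch of a thread-safe string set built on it. The wrapper class ConcurrentStringSet and its members are hypothetical names for this example; the dictionary value (a byte) is a placeholder, since only the keys matter.

```csharp
using System.Collections.Concurrent;

public sealed class ConcurrentStringSet
{
    // The byte value is unused; the keys alone represent set membership.
    private readonly ConcurrentDictionary<string, byte> _items =
        new ConcurrentDictionary<string, byte>();

    // Returns true if the value was newly added, false if it was already present.
    public bool Add(string value) => _items.TryAdd(value, 0);

    // Safe to call while other threads are adding or removing values.
    public bool Contains(string value) => _items.ContainsKey(value);

    public void Clear() => _items.Clear();
}
```

Each individual call is thread-safe, but a Clear followed by repopulation is not atomic as a whole: readers can observe an empty or half-filled set in between, so a rebuild still needs some coordination.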

Answer 2

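It sounds like you want a read/write lock. You allow multiple readers at the same time, but only one writer at a time (the thing rebuilding the cache), and the writer cannot operate while any reader is reading.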

private readonly ReaderWriterLockSlim _myCacheLock = new();
private readonly HashSet<string> _myCache = new HashSet<string>();

private void BuildMyCache()
{
    _myCacheLock.EnterWriteLock();
    try
    {
        _myCache.Clear();
        //Populate the _myCache hashset here by querying the database
    }
    finally
    {
        _myCacheLock.ExitWriteLock();
    }
}

public bool ValueExistsInMyCache(string valueToSearch)
{
    _myCacheLock.EnterReadLock();
    try
    {
        return _myCache.Contains(valueToSearch);
    }
    finally
    {
        _myCacheLock.ExitReadLock();
    }
}

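That said, it is probably easier to build the new cache into a new HashSet<string>, and only write it to _myCache once it is complete. This has the advantage that readers are not blocked while the new cache is being rebuilt: they keep using the old cache until the new one is ready.

You still need to lock around reads/writes of the _myCache field, because reading threads are otherwise free to simply ignore changes to that field.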

private readonly object _myCacheLock = new();
private HashSet<string> _myCache = new HashSet<string>();

private void BuildMyCache()
{
    var newCache = new HashSet<string>();
    // Populate newCache here by querying the database
    lock (_myCacheLock)
    {
        _myCache = newCache;
    }
}

public bool ValueExistsInMyCache(string valueToSearch)
{
    HashSet<string> myCache;
    lock (_myCacheLock)
    {
        myCache = _myCache;
    }
    return myCache.Contains(valueToSearch);
}

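Note that this approach does not prevent multiple concurrent calls to BuildMyCache from all building the cache in parallel. This is "safe", but it can be wasteful if that can happen.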
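One possible way to avoid those wasted parallel rebuilds is to serialize the rebuild with a separate lock and re-check inside it. The sketch below is only an illustration under assumptions: SwappingCache, the loadValues delegate (standing in for the database query), and maxAge are hypothetical names, and Volatile.Read/Volatile.Write replace the lock around the field access.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public sealed class SwappingCache
{
    // Immutable pair, so the set and its build time are published together.
    private sealed class Snapshot
    {
        public readonly HashSet<string> Values;
        public readonly DateTime BuiltAtUtc;

        public Snapshot(HashSet<string> values, DateTime builtAtUtc)
        {
            Values = values;
            BuiltAtUtc = builtAtUtc;
        }
    }

    private readonly object _rebuildLock = new object();
    private readonly Func<IEnumerable<string>> _loadValues; // hypothetical data source
    private readonly TimeSpan _maxAge;
    private Snapshot _current; // null until the first build; never mutated once published

    public SwappingCache(Func<IEnumerable<string>> loadValues, TimeSpan maxAge)
    {
        _loadValues = loadValues;
        _maxAge = maxAge;
    }

    public bool Contains(string value)
    {
        var snapshot = Volatile.Read(ref _current);
        if (IsStale(snapshot))
        {
            lock (_rebuildLock)
            {
                // Re-check: another thread may have rebuilt while we waited for the lock.
                snapshot = _current;
                if (IsStale(snapshot))
                {
                    snapshot = new Snapshot(new HashSet<string>(_loadValues()), DateTime.UtcNow);
                    Volatile.Write(ref _current, snapshot);
                }
            }
        }
        return snapshot.Values.Contains(value);
    }

    private bool IsStale(Snapshot snapshot) =>
        snapshot == null || DateTime.UtcNow - snapshot.BuiltAtUtc >= _maxAge;
}
```

In this variant, callers that find the cache missing or expired wait on the rebuild lock and then search the fresh snapshot, while callers that read a still-valid snapshot never take the lock.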

Answer 3

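What you have created yourself is a spinlock with an expiring cache. The latter is something the other answers do not seem to take into account.

Your solution will probably work, but it is unlikely to be the most performant or the most stable, and in general it is better not to reinvent the wheel.

Take a look at IMemoryCache; it will take care of expiration for you. You will have to add the Microsoft.Extensions.Caching.Abstractions and Microsoft.Extensions.Caching.Memory NuGet packages. Unfortunately, you have to do the locking yourself to avoid multiple threads populating the cache in parallel.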

    // "cache" below is assumed to be an IMemoryCache instance available to this class.
    private static readonly object locker = new object();

    public bool ValueExistsInMyCache(string valueToSearch)
    {
        string cacheKey = "myCache";

        bool? checkCache()
        {
            var hashSet = cache.Get<HashSet<string>>(cacheKey);
            if (hashSet != null)
                return hashSet.Contains(valueToSearch);
            return null;
        }

        var result = checkCache();
        if (result != null)
            return result.Value;

        lock (locker)
        {
            result = checkCache();
            if (result != null)
                return result.Value;
            var hashSet = new HashSet<string>();
            //populateHashset here
            result = hashSet.Contains(valueToSearch);
            cache.Set(cacheKey, hashSet, TimeSpan.FromMinutes(_cacheSettings.TimespanMinutes));
        }

        return result.Value;
    }

The code above is a simple example. You could refactor it further so that hashSet is no longer needed, but I'll leave that to you :-)

.net

c#