How to produce more than one clue from a record?

I'm writing a CluedIn crawler. When I fetch the data, I get back a record that contains an array of referenced records. The code looks like this:

public IEnumerable<object> GetData(CrawlJobData jobData)
{
    if (!(jobData is MyCrawlJobData myCrawlJobData))
    {
        yield break;
    }

    var client = clientFactory.CreateNew(myCrawlJobData);

    foreach (var myModel in client.GetMyModels())
    {
        yield return myModel;
    }
}

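Then, in the clue producer, I want to create one main clue plus one clue for each referenced record. The problem is that the MakeClueImpl override returns only a single Clue: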

protected override Clue MakeClueImpl([NotNull] MyModel input, Guid accountId)
{
    // ...
}

How can I work around this limitation?

Since GetData returns IEnumerable<object>, you can yield different model objects from that method:

public IEnumerable<object> GetData(CrawlJobData jobData)
{
    if (!(jobData is MyCrawlJobData myCrawlJobData))
    {
        yield break;
    }

    var client = clientFactory.CreateNew(myCrawlJobData);

    foreach (var myModel in client.GetMyModels())
    {
        yield return myModel;
                
        foreach (var relatedRecordModel in myModel.RelatedRecords)
        {
            yield return relatedRecordModel;
        }
    }
}

Then have two clue producers, one per model type:

protected override Clue MakeClueImpl([NotNull] MyModel input, Guid accountId)
{
    // TODO:
    return clue;
}

protected override Clue MakeClueImpl([NotNull] MyRelatedRecordModel input, Guid accountId)
{
    // TODO:
    return clue;
}
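To see the order in which objects leave GetData under this approach, here is a minimal, self-contained sketch with no CluedIn dependency. The MyModel and MyRelatedRecordModel classes below are hypothetical stand-ins for the question's models, and Describe only imitates per-type handling. It assumes the crawler routes each yielded object to the clue producer whose input type matches, as the answer above implies.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the question's models; these are not CluedIn types.
public class MyRelatedRecordModel
{
    public string Id { get; set; }
}

public class MyModel
{
    public string Id { get; set; }

    public List<MyRelatedRecordModel> RelatedRecords { get; } =
        new List<MyRelatedRecordModel>();
}

public static class FlattenSketch
{
    // Yields each parent model, immediately followed by its related records,
    // as one flat stream of objects (the same shape GetData produces above).
    public static IEnumerable<object> Flatten(IEnumerable<MyModel> models)
    {
        foreach (var model in models)
        {
            yield return model;

            foreach (var related in model.RelatedRecords)
            {
                yield return related;
            }
        }
    }

    // Imitates per-type dispatch: reports which kind of producer would
    // handle the item, based on its runtime type.
    public static string Describe(object item)
    {
        switch (item)
        {
            case MyModel m:
                return "MyModel:" + m.Id;
            case MyRelatedRecordModel r:
                return "MyRelatedRecordModel:" + r.Id;
            default:
                return "unknown";
        }
    }

    public static void Main()
    {
        var models = new[]
        {
            new MyModel
            {
                Id = "A",
                RelatedRecords =
                {
                    new MyRelatedRecordModel { Id = "A1" },
                    new MyRelatedRecordModel { Id = "A2" }
                }
            },
            new MyModel { Id = "B" }
        };

        foreach (var item in Flatten(models))
        {
            Console.WriteLine(Describe(item));
        }
    }
}
```

Each parent is emitted before its related records, so a model with no related records simply yields one object.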

Another approach is to build the clues inside the GetData method and yield them directly, then create a dummy clue producer that accepts a Clue:

public IEnumerable<object> GetData(CrawlJobData jobData)
{
    if (!(jobData is MyCrawlJobData myCrawlJobData))
    {
        yield break;
    }

    var client = clientFactory.CreateNew(myCrawlJobData);

    foreach (var myModel in client.GetMyModels())
    {
        yield return myModel;
                
        foreach (var relatedRecord in myModel.RelatedRecords)
        {
            // return a Clue per related record
            yield return MakeRelatedRecordClue(relatedRecord);
        }
    }
}

protected override Clue MakeClueImpl([NotNull] Clue input, Guid accountId)
{
    // The clue was already built in GetData, so pass it through unchanged.
    return input;
}