How to find the top 10 most frequently changed files over a period in TFS 2015 Version Control?

My team uses TFS 2015 as its ALM and version control system, and I want to analyze which files change most frequently.

I found that TFS does not offer this out of the box, but TFS 2015 has a REST API for querying the changesets of a file, like this:

http://{instance}/tfs/DefaultCollection/_apis/tfvc/changesets?searchCriteria.itemPath={filePath}&api-version=1.0

My project repository contains several thousand files, so querying them one by one is not a good idea. Is there a better solution?

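I don't think there is an out-of-the-box solution to your problem. I tried two different approaches: I initially focused on the REST API, but later switched to the SOAP API to see which features it supports.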

For all of the options below, the following API should be sufficient:

Install the client API link @NuGet

Install-Package Microsoft.TeamFoundationServer.ExtendedClient -Version 14.89.0 or later

The following extension method is needed in all of the options (ref):

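    using System.Collections.Generic;
    using System.Linq;

    public static class StringExtensions
    {
        // Returns true when source contains at least one of the lookFor strings.
        public static bool ContainsAny(this string source, List<string> lookFor)
        {
            if (!string.IsNullOrEmpty(source) && lookFor.Count > 0)
            {
                return lookFor.Any(source.Contains);
            }
            return false;
        }
    }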
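As a small usage sketch (the extension method is repeated so the snippet compiles on its own, and the demo class name is made up), note that the check is a plain substring match. An include list containing ".cs" therefore also matches ".css" and ".csproj", which is presumably why the examples further down carry a separate list of excluded extensions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StringExtensions
{
    // Same substring-based check as the extension method above.
    public static bool ContainsAny(this string source, List<string> lookFor)
    {
        if (!string.IsNullOrEmpty(source) && lookFor.Count > 0)
        {
            return lookFor.Any(source.Contains);
        }
        return false;
    }
}

public static class ContainsAnyDemo
{
    public static void Main()
    {
        var include = new List<string> { ".cs", ".js" };

        Console.WriteLine(".cs".ContainsAny(include));   // True
        Console.WriteLine(".css".ContainsAny(include));  // True: ".css" contains ".cs"
        Console.WriteLine(".txt".ContainsAny(include));  // False
        Console.WriteLine("".ContainsAny(include));      // False: empty source
    }
}
```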

Option 1: SOAP API

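With the SOAP API there is no strict need to limit the number of query results through the maxCount parameter, as described in this excerpt from the IntelliSense documentation of the QueryHistory method: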

maxCount: This parameter allows the caller to limit the number of results returned. QueryHistory pages results back from the server on demand, so limiting your own consumption of the returned IEnumerable is almost as effective (from a performance perspective) as providing a fixed value here. The most common value to provide for this parameter is Int32.MaxValue.

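Based on the maxCount documentation, I decided to extract statistics for each product in my source control system. It may be valuable to see how much code churn each system in the codebase has independently of the others, rather than limiting the result to 10 files for the entire codebase, which may contain hundreds of systems.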
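To illustrate what the maxCount remark means by limiting your own consumption of the returned IEnumerable, here is a minimal sketch. It does not use the TFS client at all: QueryPaged is a hypothetical stand-in for a source that produces results page by page, and the page size is an arbitrary assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyPagingSketch
{
    public static int PagesFetched;

    // Hypothetical stand-in for a server that hands back results one page at a time.
    public static IEnumerable<int> QueryPaged(int totalItems, int pageSize)
    {
        for (var start = 0; start < totalItems; start += pageSize)
        {
            PagesFetched++; // a page is "fetched" only when enumeration reaches it
            var end = Math.Min(start + pageSize, totalItems);
            for (var id = start; id < end; id++)
            {
                yield return id;
            }
        }
    }

    public static void Main()
    {
        // 1000 items are available, but only the first 5 are consumed.
        var firstFive = QueryPaged(totalItems: 1000, pageSize: 256).Take(5).ToList();
        Console.WriteLine("Items: {0}, pages fetched: {1}", firstFive.Count, PagesFetched);
        // prints "Items: 5, pages fetched: 1"
    }
}
```

Keep in mind that operators such as GroupBy or OrderByDescending have to read the whole sequence, so in the examples here it is the date range, not Take, that bounds how much history is pulled from the server.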

C# REST and SOAP (ExtendedClient) api reference

Install the SOAP API Client link @NuGet

Install-Package Microsoft.TeamFoundationServer.ExtendedClient -Version 14.95.2

The limiting criteria are:

  1. only specific paths in source control are scanned, since some systems in source control are older and possibly only there for historic purposes
  2. only certain file extensions are included, e.g. .cs, .js
  3. certain filenames are excluded, e.g. AssemblyInfo.cs
  4. items extracted for each path: 10
  5. from date: 120 days ago
  6. to date: today
  7. specific paths are excluded, e.g. folders containing release branches or archived branches

using Microsoft.TeamFoundation.Client;
using Microsoft.TeamFoundation.VersionControl.Client;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;

//Note: declare this method inside a class; it relies on the ContainsAny extension method above
public void GetTopChangedFilesSoapApi()
    {
        var tfsUrl = "https://<SERVERNAME>/tfs/<COLLECTION>";
        var domain = "<DOMAIN>";
        var password = "<PASSWORD>";
        var userName = "<USERNAME>";

        //Only interested in specific systems so will scan only these
        var directoriesToScan = new List<string> {
            "$/projectdir/subdir/subdir/subdirA/systemnameA",
            "$/projectdir/subdir/subdir/subdirB/systemnameB",
            "$/projectdir/subdir/subdir/subdirC/systemnameC",
            "$/projectdir/subdir/subdir/subdirD/systemnameD"
            };

        var maxResultsPerPath = 10;
        var fromDate = DateTime.Now.AddDays(-120);
        var toDate = DateTime.Now;

        var fileExtensionToInclude = new List<string> { ".cs", ".js" };
        var extensionExclusions = new List<string> { ".csproj", ".json", ".css" };
        var fileExclusions = new List<string> { "AssemblyInfo.cs", "jquery-1.12.3.min.js", "config.js" };
        var pathExclusions = new List<string> {
            "/subdirToForceExclude1/",
            "/subdirToForceExclude2/",
            "/subdirToForceExclude3/",
        };

        using (var collection = new TfsTeamProjectCollection(new Uri(tfsUrl), 
            new NetworkCredential(userName: userName, password: password, domain: domain)))
        {
            collection.EnsureAuthenticated();

            var tfvc = collection.GetService(typeof(VersionControlServer)) as VersionControlServer;

            foreach (var rootDirectory in directoriesToScan)
            {
                //Get changesets
                //Note: maxcount set to maxvalue since impact to server is minimized by linq query below
                var changeSets = tfvc.QueryHistory(path: rootDirectory, version: VersionSpec.Latest,
                    deletionId: 0, recursion: RecursionType.Full, user: null,
                    versionFrom: new DateVersionSpec(fromDate), versionTo: new DateVersionSpec(toDate),
                    maxCount: int.MaxValue, includeChanges: true,
                    includeDownloadInfo: false, slotMode: true)
                    as IEnumerable<Changeset>;

                //Filter changes contained in changesets
                var changes = changeSets.SelectMany(a => a.Changes)
                //ChangeType is a flags enum, so test the excluded bits with a mask
                //(a chain of != joined with || is always true and filters nothing)
                .Where(a => (a.ChangeType & (ChangeType.Lock | ChangeType.Delete | ChangeType.Property)) == 0)
                .Where(e => !e.Item.ServerItem.ContainsAny(pathExclusions))
                .Where(e => !e.Item.ServerItem.Substring(e.Item.ServerItem.LastIndexOf('/') + 1).ContainsAny(fileExclusions))
                //Path.GetExtension returns "" for items without an extension (e.g. folders) instead of throwing
                .Where(e => !System.IO.Path.GetExtension(e.Item.ServerItem).ContainsAny(extensionExclusions))
                .Where(e => System.IO.Path.GetExtension(e.Item.ServerItem).ContainsAny(fileExtensionToInclude))
                .GroupBy(g => g.Item.ServerItem)
                .Select(d => new { File=d.Key, Count=d.Count()})
                .OrderByDescending(o => o.Count)
                .Take(maxResultsPerPath);

                //Write top items for each path to the console
                Console.WriteLine(rootDirectory); Console.WriteLine("->");
                foreach (var change in changes)
                {
                    Console.WriteLine("ChangeCount: {0} : File: {1}", change.Count, change.File);
                }
                Console.WriteLine(Environment.NewLine);
            }
        }
    }
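The counting step at the heart of the method above can be tried without a TFS server. The following sketch runs the same GroupBy / OrderByDescending / Take pipeline over a made-up list of changed paths; the class name, method name and sample paths are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TopChangedFilesSketch
{
    // Counts how often each path occurs and returns the most frequent ones.
    public static List<KeyValuePair<string, int>> TopChanged(IEnumerable<string> changedPaths, int top)
    {
        return changedPaths
            .GroupBy(path => path)
            .Select(g => new KeyValuePair<string, int>(g.Key, g.Count()))
            .OrderByDescending(pair => pair.Value)
            .Take(top)
            .ToList();
    }

    public static void Main()
    {
        // Made-up sample: one entry per file per changeset.
        var changedPaths = new[]
        {
            "$/Demo/A.cs", "$/Demo/B.cs", "$/Demo/A.cs",
            "$/Demo/C.cs", "$/Demo/A.cs", "$/Demo/B.cs"
        };

        foreach (var entry in TopChanged(changedPaths, top: 2))
        {
            Console.WriteLine("ChangeCount: {0} : File: {1}", entry.Value, entry.Key);
        }
        // ChangeCount: 3 : File: $/Demo/A.cs
        // ChangeCount: 2 : File: $/Demo/B.cs
    }
}
```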

Option 2A: REST API

(!! problem identified by OP led to finding a critical defect in v.xxx-14.95.4 of api) - OPTION 2B is the workaround

defect discovered in v.xxx to 14.95.4 of api: The TfvcChangesetSearchCriteria type contains an ItemPath property which is supposed to limit the search to a specified directory. The default value of this property is $/, unfortunately when used GetChangesetsAsync will always use the root path of the tfvc source repository irrespective of the value set.

That said, this will still be a reasonable approach if the defect were to be fixed.

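One way to limit the impact on your SCM system is to specify limiting criteria for the query or queries through the TfvcChangesetSearchCriteria parameter of the GetChangesetsAsync member of the TfvcHttpClient type.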

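You do not need to check every file in your SCM system or project individually; examining the changesets for the specified period may be enough. Not all of the limiting values I use below are properties of the TfvcChangesetSearchCriteria type, so I wrote a short example to show how I would do it, i.e. you can specify the maximum number of changesets to consider initially as well as the specific project you want to look at.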

Note: The TfvcChangesetSearchCriteria type contains some additional properties that you may want to consider using.

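In the example below I use the REST API from a C# client and get the results from tfvc.
If you intend to use a different client language and call the REST services directly, e.g. JavaScript, the logic below should still give you some pointers.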

//targeted framework for example: 4.5.2
using Microsoft.TeamFoundation.SourceControl.WebApi;
using Microsoft.VisualStudio.Services.Client;
using Microsoft.VisualStudio.Services.Common;

using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Threading.Tasks;

//Note: declare this method inside a class; it relies on the ContainsAny extension method above
public async Task GetTopChangedFilesUsingRestApi()
    {
        var tfsUrl = "https://<SERVERNAME>/tfs/<COLLECTION>";
        var domain = "<DOMAIN>";
        var password = "<PASSWORD>";
        var userName = "<USERNAME>";

        //Criteria used to limit results
        var directoriesToScan = new List<string> {
            "$/projectdir/subdir/subdir/subdirA/systemnameA",
            "$/projectdir/subdir/subdir/subdirB/systemnameB",
            "$/projectdir/subdir/subdir/subdirC/systemnameC",
            "$/projectdir/subdir/subdir/subdirD/systemnameD"
        };

        var maxResultsPerPath = 10;
        var fromDate = DateTime.Now.AddDays(-120);
        var toDate = DateTime.Now;

        var fileExtensionToInclude = new List<string> { ".cs", ".js" };
        var folderPathsToInclude = new List<string> { "/subdirToForceInclude/" };
        var extensionExclusions = new List<string> { ".csproj", ".json", ".css" };
        var fileExclusions = new List<string> { "AssemblyInfo.cs", "jquery-1.12.3.min.js", "config.js" };
        var pathExclusions = new List<string> {
            "/subdirToForceExclude1/",
            "/subdirToForceExclude2/",
            "/subdirToForceExclude3/",
        };

        //Establish connection
        VssConnection connection = new VssConnection(new Uri(tfsUrl),
            new VssCredentials(new Microsoft.VisualStudio.Services.Common.WindowsCredential(new NetworkCredential(userName, password, domain))));

        //Get tfvc client
        var tfvcClient = await connection.GetClientAsync<TfvcHttpClient>();

        foreach (var rootDirectory in directoriesToScan)
        {
            //Set up date-range criteria for query
            var criteria = new TfvcChangesetSearchCriteria();
            //ToShortDateString depends on the current culture; use the ISO 8601 round-trip format instead
            criteria.FromDate = fromDate.ToString("o");
            criteria.ToDate = toDate.ToString("o");
            criteria.ItemPath = rootDirectory;

            //get change sets
            var changeSets = await tfvcClient.GetChangesetsAsync(
                maxChangeCount: int.MaxValue,
                includeDetails: false,
                includeWorkItems: false,
                searchCriteria: criteria);

            if (changeSets.Any())
            {
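                var sample = new List<TfvcChange>();
                var sampleLock = new object();

                //List<T> is not thread-safe, so guard AddRange when filling it from Parallel.ForEach
                Parallel.ForEach(changeSets, changeSet =>
                {
                    var changeSetChanges = tfvcClient.GetChangesetChangesAsync(changeSet.ChangesetId).Result;
                    lock (sampleLock)
                    {
                        sample.AddRange(changeSetChanges);
                    }
                });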

                //Filter changes contained in changesets
                //VersionControlChangeType is a flags enum, so test the excluded bits with a mask
                //(a chain of != joined with || is always true and filters nothing)
                var changes = sample.Where(a => (a.ChangeType & (VersionControlChangeType.Lock | VersionControlChangeType.Delete | VersionControlChangeType.Property)) == 0)
                .Where(e => e.Item.Path.ContainsAny(folderPathsToInclude))
                .Where(e => !e.Item.Path.ContainsAny(pathExclusions))
                .Where(e => !e.Item.Path.Substring(e.Item.Path.LastIndexOf('/') + 1).ContainsAny(fileExclusions))
                //Path.GetExtension returns "" for items without an extension (e.g. folders) instead of throwing
                .Where(e => !System.IO.Path.GetExtension(e.Item.Path).ContainsAny(extensionExclusions))
                .Where(e => System.IO.Path.GetExtension(e.Item.Path).ContainsAny(fileExtensionToInclude))
                .GroupBy(g => g.Item.Path)
                .Select(d => new { File = d.Key, Count = d.Count() })
                .OrderByDescending(o => o.Count)
                .Take(maxResultsPerPath);

                //Write top items for each path to the console
                Console.WriteLine(rootDirectory); Console.WriteLine("->");
                foreach (var change in changes)
                {
                    Console.WriteLine("ChangeCount: {0} : File: {1}", change.Count, change.File);
                }
                Console.WriteLine(Environment.NewLine);
            }
        }
    }
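Change types are flag values that can be combined, for example an edit together with a lock, so a change-type filter has to test individual bits rather than compare whole values. The sketch below uses a hypothetical DemoChangeType enum (its members and values are illustrative, not the real TFS enum) to show a mask test, and why a chain of != comparisons joined with || never filters anything.

```csharp
using System;

public static class ChangeTypeFilterSketch
{
    // Hypothetical stand-in for a [Flags] change-type enum; values are illustrative only.
    [Flags]
    public enum DemoChangeType
    {
        None = 0,
        Add = 1,
        Edit = 2,
        Delete = 4,
        Lock = 8,
        Property = 16
    }

    private const DemoChangeType Excluded =
        DemoChangeType.Lock | DemoChangeType.Delete | DemoChangeType.Property;

    // True when none of the excluded flags is set.
    public static bool IsRelevant(DemoChangeType changeType)
    {
        return (changeType & Excluded) == 0;
    }

    // The "!= A || != B" form: no value equals A and B at once, so this is always true.
    public static bool AlwaysTrue(DemoChangeType changeType)
    {
        return changeType != DemoChangeType.Lock || changeType != DemoChangeType.Delete;
    }

    public static void Main()
    {
        Console.WriteLine(IsRelevant(DemoChangeType.Edit));                        // True
        Console.WriteLine(IsRelevant(DemoChangeType.Edit | DemoChangeType.Lock));  // False
        Console.WriteLine(IsRelevant(DemoChangeType.Delete));                      // False
        Console.WriteLine(AlwaysTrue(DemoChangeType.Delete));                      // True
    }
}
```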

Option 2B

Note: This solution is very similar to OPTION 2A with the exception of a workaround implemented to fix a limitation in the REST client API library at time of writing. Brief summary - instead of invoking client api library to get changesets this example uses a web request direct to the REST API to fetch changesets, thus additional types were needed to be defined to handle the response from the service.
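When calling the REST endpoint directly, the query-string values need some care: the item path contains `$` and `/`, and dates formatted with the current culture may contain spaces or be ambiguous. A minimal sketch of building such a URL follows. The helper name Build, the server name and the choice of the ISO 8601 round-trip format are assumptions of this sketch, not something the REST API documentation prescribes.

```csharp
using System;
using System.Globalization;

public static class ChangesetsUrlSketch
{
    // Hypothetical helper: fills the changesets query with escaped, culture-independent values.
    public static string Build(string collectionUrl, string itemPath,
        DateTime fromDate, DateTime toDate, int top)
    {
        return string.Format(CultureInfo.InvariantCulture,
            "{0}/_apis/tfvc/changesets?searchCriteria.itemPath={1}" +
            "&searchCriteria.fromDate={2}&searchCriteria.toDate={3}&$top={4}&api-version=1.0",
            collectionUrl.TrimEnd('/'),
            Uri.EscapeDataString(itemPath),
            Uri.EscapeDataString(fromDate.ToString("o", CultureInfo.InvariantCulture)),
            Uri.EscapeDataString(toDate.ToString("o", CultureInfo.InvariantCulture)),
            top);
    }

    public static void Main()
    {
        // Placeholder server, path and dates.
        Console.WriteLine(Build("https://tfs.example.local/tfs/DefaultCollection", "$/Demo/SystemA",
            new DateTime(2016, 1, 1, 0, 0, 0, DateTimeKind.Utc),
            new DateTime(2016, 4, 30, 0, 0, 0, DateTimeKind.Utc), 100));
    }
}
```

Note that `$top` limits how many changesets the server returns, which is a different quantity from the number of files reported at the end.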

using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Threading.Tasks;

using Microsoft.TeamFoundation.SourceControl.WebApi;
using Microsoft.VisualStudio.Services.Client;
using Microsoft.VisualStudio.Services.Common;

using System.Text;
using System.IO;
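using Newtonsoft.Json;

//Note: declare this method and the helpers below inside a class; it relies on the ContainsAny extension method above
public async Task GetTopChangedFilesUsingDirectWebRestApiSO()
    {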
        var tfsUrl = "https://<SERVERNAME>/tfs/<COLLECTION>";
        var domain = "<DOMAIN>";
        var password = "<PASSWORD>";
        var userName = "<USERNAME>";

        var changesetsUrl = "{0}/_apis/tfvc/changesets?searchCriteria.itemPath={1}&searchCriteria.fromDate={2}&searchCriteria.toDate={3}&$top={4}&api-version=1.0";

        //Criteria used to limit results
        var directoriesToScan = new List<string> {
            "$/projectdir/subdir/subdir/subdirA/systemnameA",
            "$/projectdir/subdir/subdir/subdirB/systemnameB",
            "$/projectdir/subdir/subdir/subdirC/systemnameC",
            "$/projectdir/subdir/subdir/subdirD/systemnameD"
        };

        var maxResultsPerPath = 10;
        var fromDate = DateTime.Now.AddDays(-120);
        var toDate = DateTime.Now;

        var fileExtensionToInclude = new List<string> { ".cs", ".js" };
        var folderPathsToInclude = new List<string> { "/subdirToForceInclude/" };
        var extensionExclusions = new List<string> { ".csproj", ".json", ".css" };
        var fileExclusions = new List<string> { "AssemblyInfo.cs", "jquery-1.12.3.min.js", "config.js" };
        var pathExclusions = new List<string> {
            "/subdirToForceExclude1/",
            "/subdirToForceExclude2/",
            "/subdirToForceExclude3/",
        };

        //Establish connection
        VssConnection connection = new VssConnection(new Uri(tfsUrl),
            new VssCredentials(new Microsoft.VisualStudio.Services.Common.WindowsCredential(new NetworkCredential(userName, password, domain))));

        //Get tfvc client
        var tfvcClient = await connection.GetClientAsync<TfvcHttpClient>();

        foreach (var rootDirectory in directoriesToScan)
        {
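            //Escape each query-string value and format the dates as ISO 8601 so the URL does not depend on the current culture
            //Note: $top caps the number of changesets fetched, not the number of files reported; raise it if needed
            var url = string.Format(changesetsUrl, tfsUrl,
                Uri.EscapeDataString(rootDirectory),
                Uri.EscapeDataString(fromDate.ToString("o")),
                Uri.EscapeDataString(toDate.ToString("o")),
                maxResultsPerPath);
            var changeSets = Invoke<GetChangeSetsResponse>("GET", url, userName, password, domain).value;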

            if (changeSets.Any())
            {
                //Get changes
                var sample = new List<TfvcChange>();
                foreach (var changeSet in changeSets)
                {
                    sample.AddRange(tfvcClient.GetChangesetChangesAsync(changeSet.changesetId).Result);
                }

                //Filter changes
                //VersionControlChangeType is a flags enum, so test the excluded bits with a mask
                //(a chain of != joined with || is always true and filters nothing)
                var changes = sample.Where(a => (a.ChangeType & (VersionControlChangeType.Lock | VersionControlChangeType.Delete | VersionControlChangeType.Property)) == 0)
                .Where(e => e.Item.Path.ContainsAny(folderPathsToInclude))
                .Where(e => !e.Item.Path.ContainsAny(pathExclusions))
                .Where(e => !e.Item.Path.Substring(e.Item.Path.LastIndexOf('/') + 1).ContainsAny(fileExclusions))
                //Path.GetExtension returns "" for items without an extension (e.g. folders) instead of throwing
                .Where(e => !System.IO.Path.GetExtension(e.Item.Path).ContainsAny(extensionExclusions))
                .Where(e => System.IO.Path.GetExtension(e.Item.Path).ContainsAny(fileExtensionToInclude))
                .GroupBy(g => g.Item.Path)
                .Select(d => new { File = d.Key, Count = d.Count() })
                .OrderByDescending(o => o.Count)
                .Take(maxResultsPerPath);

                //Write top items for each path to the console
                Console.WriteLine(rootDirectory); Console.WriteLine("->");
                foreach (var change in changes)
                {
                    Console.WriteLine("ChangeCount: {0} : File: {1}", change.Count, change.File);
                }
                Console.WriteLine(Environment.NewLine);
            }
        }
    }

    private T Invoke<T>(string method, string url, string userName, string password, string domain)
    {
        var request = WebRequest.Create(url);
        var httpRequest = request as HttpWebRequest;
        if (httpRequest != null) httpRequest.UserAgent = "versionhistoryApp";
        request.ContentType = "application/json";
        request.Method = method;

        request.Credentials = new NetworkCredential(userName, password, domain); //ntlm 401 challenge support
        request.Headers[HttpRequestHeader.Authorization] = "Basic " + Convert.ToBase64String(Encoding.UTF8.GetBytes(domain + "\\" + userName + ":" + password)); //basic auth support if enabled on tfs instance

        try
        {
            using (var response = request.GetResponse())
            using (var responseStream = response.GetResponseStream())
            using (var reader = new StreamReader(responseStream))
            {
                string s = reader.ReadToEnd();
                return Deserialize<T>(s);
            }
        }
        catch (WebException ex)
        {
            if (ex.Response == null)
                throw;

            using (var responseStream = ex.Response.GetResponseStream())
            {
                string message;
                try
                {
                    message = new StreamReader(responseStream).ReadToEnd();
                }
                catch
                {
                    throw ex;
                }

                throw new Exception(message, ex);
            }
        }
    }

    public class GetChangeSetsResponse
    {
        public IEnumerable<Changeset> value { get; set; }
        public class Changeset
        {
            public int changesetId { get; set; }
            public string url { get; set; }
            public DateTime createdDate { get; set; }
            public string comment { get; set; }
        }
    }

    public static T Deserialize<T>(string json)
    {
        return JsonConvert.DeserializeObject<T>(json);
    }
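The Basic authentication header used in Invoke is simply the Base64 encoding of `DOMAIN\user:password`. A standalone sketch follows; the class name, method name and credentials are placeholders. It shows how the value is built and that it decodes back to the expected string. The backslash has to be written as "\\" inside a regular C# string literal.

```csharp
using System;
using System.Text;

public static class BasicAuthSketch
{
    // Builds the value of an HTTP Basic Authorization header for a domain account.
    public static string BuildHeader(string domain, string userName, string password)
    {
        var credentials = domain + "\\" + userName + ":" + password; // "\\" is one literal backslash
        return "Basic " + Convert.ToBase64String(Encoding.UTF8.GetBytes(credentials));
    }

    public static void Main()
    {
        // Placeholder credentials for illustration only.
        var header = BuildHeader("CONTOSO", "jdoe", "secret");
        var decoded = Encoding.UTF8.GetString(
            Convert.FromBase64String(header.Substring("Basic ".Length)));
        Console.WriteLine(decoded); // CONTOSO\jdoe:secret
    }
}
```

Basic authentication sends the credentials in an easily decoded form, so it should only be used over HTTPS.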

Other references:

C# REST and SOAP (ExtendedClient) api reference

REST API: tfvc Changesets

TfvcChangesetSearchCriteria type @MSDN