Version control of Azure Machine Learning workspace notebooks

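I am trying out the capabilities of the new Azure ML Workspace, but I can't find any option to track my notebooks in Git.

Is this possible? Can you do it the way you can with Azure Notebooks? If it is not possible... how are you supposed to work with these notebooks? Only inside this workspace?

Thanks!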

There is a whole concept around this called MLOps. There are also plenty of sample notebooks on it, showing, for example, how to use Azure ML together with Azure DevOps. E.g. here

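As far as I know, Git is not currently supported by Azure Machine Learning notebooks. If you are looking for a more fully featured development environment, I suggest setting one up locally. It is more work up front, but it gives you the ability to use version control. See this development environment setup guide: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-configure-environment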

| Environment                                                   | Pros                                                                                                                                                                                                                                    | Cons                                                                                                                                                                                 |
|---------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Cloud-based Azure Machine Learning compute instance (preview) | Easiest way to get started. The entire SDK is already installed in your workspace VM, and notebook tutorials are pre-cloned and ready to run.                                                                                           | Lack of control over your development environment and dependencies. Additional cost incurred for Linux VM (VM can be stopped when not in use to avoid charges). See pricing details. |
| Local environment                                             | Full control of your development environment and dependencies. Run with any build tool, environment, or IDE of your choice.                                                                                                             | Takes longer to get started. Necessary SDK packages must be installed, and an environment must also be installed if you don't already have one.                                      |
| Azure Databricks                                              | Ideal for running large-scale intensive machine learning workflows on the scalable Apache Spark platform.                                                                                                                               | Overkill for experimental machine learning, or smaller-scale experiments and workflows. Additional cost incurred for Azure Databricks. See pricing details.                          |
| The Data Science Virtual Machine (DSVM)                       | Similar to the cloud-based compute instance (Python and the SDK are pre-installed), but with additional popular data science and machine learning tools pre-installed. Easy to scale and combine with other custom tools and workflows. | A slower getting started experience compared to the cloud-based compute instance.                                                                                                    |
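
If you go the local-environment route and keep your notebooks in a Git repository, one common pain point is that cell outputs make diffs noisy. The following is only a minimal sketch of one way to handle that, not part of the Azure ML SDK or anything the guide above prescribes: it clears outputs from a notebook before you commit it. The function names `strip_notebook_outputs` and `strip_notebook_file` are made up for illustration, and the code assumes the standard `.ipynb` JSON layout (a top-level `cells` list with `cell_type`, `outputs`, and `execution_count` fields).

```python
import json


def strip_notebook_outputs(notebook_json: str) -> str:
    """Return notebook JSON with code-cell outputs and execution counts cleared.

    Assumes the usual .ipynb layout: a top-level "cells" list whose code cells
    carry "outputs" and "execution_count" keys. Everything else is left as-is.
    """
    notebook = json.loads(notebook_json)
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []             # drop rendered results, plots, logs
            cell["execution_count"] = None   # drop the In[n] counter
    # Serialize with a trailing newline so the file diffs cleanly in Git.
    return json.dumps(notebook, indent=1, ensure_ascii=False) + "\n"


def strip_notebook_file(path: str) -> None:
    """Hypothetical helper: rewrite the notebook at `path` in place, without outputs."""
    with open(path, encoding="utf-8") as f:
        cleaned = strip_notebook_outputs(f.read())
    with open(path, "w", encoding="utf-8") as f:
        f.write(cleaned)
```

You could call `strip_notebook_file("my_notebook.ipynb")` (the file name is a placeholder) before `git add`, so that only the code and markdown cells end up in the history.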