Setting fields in objects that need to be built
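I have a DecisionTree object that creates a machine learning model. DecisionTree has many fields that represent settings. Every field has a default value, and in most cases only one or two of them need to be changed.
The problem is that actually building the decision tree is computationally expensive. So instead of building the model when the object is created, I have the constructor only parse and store the data. The model is not built until DecisionTree.build is called. This allows settings to be changed before building. However, it also means that calling DecisionTree.predict before build will fail.
I know it is good practice to keep an object in a valid state at all times. But that would mean building the tree in the constructor, which is expensive, and then rebuilding it whenever any setting changes.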
Example 1: the build call is separate
DecisionTree tree = new DecisionTree(data, classes, attributes);
tree.predict(item); //This would error
tree.maxDepth = 15;
tree.infoGain = 0.5;
tree.build();
tree.predict(item); // Now it would work
Example 2: the build call is included in the constructor, settings are not
DecisionTree tree = new DecisionTree(data, classes, attributes); // This would take a long time to complete
tree.predict(item); //This would now work
tree.maxDepth = 15;
tree.infoGain = 0.5;
tree.build(); // This would once again take a long time to complete
tree.predict(item); // Done, but takes twice as long as the previous example
Example 3: all settings are passed to the constructor
DecisionTree tree = new DecisionTree(data, classes, attributes, null, null, 15, null, null, 0.5, null, null, null); // Settings are all included in constructor
tree.predict(item); //This would immediately be callable
My question is: are these three options the only ways to handle many settings? What is the standard/best practice?
I don't think fitting the algorithm in a separate method is bad practice. scikit-learn, for example, provides a separate method to fit the object; the constructor itself only initializes internal variables, and calling predict before fit simply throws NotFittedError. Besides, you may later want to extend your algorithm to work with, say, minibatches. In that case you cannot call the constructor more than once, so you will need something like a partial_fit method to fit the classifier on additional chunks of data. So you cannot do everything in the constructor.
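A minimal Java sketch of that separate-fit approach, applied to the DecisionTree from the question. Everything here is an assumption made for illustration: the NotBuiltException name (loosely modeled on scikit-learn's NotFittedError), the setter methods, the default values, and the placeholder body of build and predict. It is not the asker's actual class.

```java
// Hypothetical exception, loosely modeled on scikit-learn's NotFittedError.
class NotBuiltException extends IllegalStateException {
    NotBuiltException(String message) {
        super(message);
    }
}

// Simplified stand-in for the DecisionTree in the question.
class DecisionTree {
    private final double[][] data;
    private int maxDepth = 10;      // assumed default
    private double infoGain = 0.1;  // assumed default
    private boolean built = false;

    // Cheap constructor: it only stores the data, no tree is built yet.
    DecisionTree(double[][] data) {
        this.data = data;
    }

    void setMaxDepth(int maxDepth) {
        this.maxDepth = maxDepth;
        built = false; // a changed setting invalidates any existing model
    }

    void setInfoGain(double infoGain) {
        this.infoGain = infoGain;
        built = false;
    }

    int getMaxDepth() {
        return maxDepth;
    }

    double getInfoGain() {
        return infoGain;
    }

    boolean isBuilt() {
        return built;
    }

    void build() {
        // The expensive tree construction over `data` would go here.
        built = true;
    }

    int predict(double[] item) {
        if (!built) {
            throw new NotBuiltException("Call build() before predict()");
        }
        return 0; // placeholder: a real tree would be traversed here
    }
}
```

With this shape, calling predict on a freshly constructed tree throws NotBuiltException, while setMaxDepth(15), setInfoGain(0.5), build() and then predict succeeds. Going through setters instead of public fields lets the object notice that a setting changed after build and refuse to predict with a stale model.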
If you have a large number of parameters at initialization, you may find the Builder pattern useful.
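A rough sketch of how the Builder pattern could look for this case. The class name DecisionTreeModel, the builder method names, and the default values are all hypothetical; the expensive construction is only marked with a comment.

```java
// Always-valid model: instances can only be obtained through Builder.build().
final class DecisionTreeModel {
    private final int maxDepth;
    private final double infoGain;

    private DecisionTreeModel(Builder b) {
        this.maxDepth = b.maxDepth;
        this.infoGain = b.infoGain;
        // The expensive tree construction over b.data would run here, exactly once.
    }

    static Builder builder(double[][] data) {
        return new Builder(data);
    }

    int getMaxDepth() {
        return maxDepth;
    }

    double getInfoGain() {
        return infoGain;
    }

    int predict(double[] item) {
        return 0; // placeholder: a real tree would be traversed here
    }

    // Collects settings cheaply; nothing expensive happens until build().
    static final class Builder {
        private final double[][] data;
        private int maxDepth = 10;      // assumed default
        private double infoGain = 0.1;  // assumed default

        private Builder(double[][] data) {
            this.data = data;
        }

        Builder maxDepth(int maxDepth) {
            this.maxDepth = maxDepth;
            return this;
        }

        Builder infoGain(double infoGain) {
            this.infoGain = infoGain;
            return this;
        }

        DecisionTreeModel build() {
            return new DecisionTreeModel(this);
        }
    }
}
```

Usage would read `DecisionTreeModel.builder(data).maxDepth(15).infoGain(0.5).build()`. Only the settings that differ from the defaults are named, which avoids the long run of null arguments in Example 3, and the tree is built once, after all settings are known, so a model that exists can always predict.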