How to set optional properties on a Google Dataproc cluster using the Java API?
I am trying to create a Dataproc cluster using the Java API, following this documentation:
https://cloud.google.com/dataproc/docs/quickstarts/quickstart-lib
The sample code is as follows:
import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.dataproc.v1.Cluster;
import com.google.cloud.dataproc.v1.ClusterConfig;
import com.google.cloud.dataproc.v1.ClusterControllerClient;
import com.google.cloud.dataproc.v1.ClusterControllerSettings;
import com.google.cloud.dataproc.v1.ClusterOperationMetadata;
import com.google.cloud.dataproc.v1.InstanceGroupConfig;
import java.io.IOException;
import java.util.concurrent.ExecutionException;

public class CreateCluster {

  public static void createCluster() throws IOException, InterruptedException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String region = "your-project-region";
    String clusterName = "your-cluster-name";
    createCluster(projectId, region, clusterName);
  }

  public static void createCluster(String projectId, String region, String clusterName)
      throws IOException, InterruptedException {
    String myEndpoint = String.format("%s-dataproc.googleapis.com:443", region);

    // Configure the settings for the cluster controller client.
    ClusterControllerSettings clusterControllerSettings =
        ClusterControllerSettings.newBuilder().setEndpoint(myEndpoint).build();

    // Create a cluster controller client with the configured settings. The client only needs to be
    // created once and can be reused for multiple requests. Using a try-with-resources
    // closes the client, but this can also be done manually with the .close() method.
    try (ClusterControllerClient clusterControllerClient =
        ClusterControllerClient.create(clusterControllerSettings)) {
      // Configure the settings for our cluster.
      InstanceGroupConfig masterConfig =
          InstanceGroupConfig.newBuilder()
              .setMachineTypeUri("n1-standard-1")
              .setNumInstances(1)
              .build();
      InstanceGroupConfig workerConfig =
          InstanceGroupConfig.newBuilder()
              .setMachineTypeUri("n1-standard-1")
              .setNumInstances(2)
              .build();
      ClusterConfig clusterConfig =
          ClusterConfig.newBuilder()
              .setMasterConfig(masterConfig)
              .setWorkerConfig(workerConfig)
              .build();

      // Create the cluster object with the desired cluster config.
      Cluster cluster =
          Cluster.newBuilder().setClusterName(clusterName).setConfig(clusterConfig).build();

      // Create the Cloud Dataproc cluster and block until the operation completes.
      OperationFuture<Cluster, ClusterOperationMetadata> createClusterAsyncRequest =
          clusterControllerClient.createClusterAsync(projectId, region, cluster);
      Cluster response = createClusterAsyncRequest.get();

      // Print out a success message.
      System.out.printf("Cluster created successfully: %s", response.getClusterName());
    } catch (ExecutionException e) {
      System.err.println(String.format("Error executing createCluster: %s ", e.getMessage()));
    }
  }
}
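Following the documentation, I was able to create the cluster successfully, but there are a few optional properties that I can't figure out how to set here. For reference, the screenshot below shows where they can be set in the Google Cloud console.
These properties can also be added with the Google Cloud SDK, like this: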
gcloud dataproc clusters create my-cluster \
--region=region \
--properties=spark:spark.executor.memory=4g \
... other args ...
How can I set these using the Java API? The same goes for labels: how do we set labels on the cluster using the Java API?
You can check the complete API reference for the Dataproc Java client library.
Specifically, to set properties you want to look at SoftwareConfig.Builder. Similarly, you can associate a label with a cluster using Cluster.Builder.
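As a minimal sketch of what that could look like, the snippet below builds the same Cluster message as the question's sample, but attaches a SoftwareConfig carrying the Spark property and adds one label. It assumes the google-cloud-dataproc client library is on the classpath. The class name ClusterWithPropertiesAndLabels, the helper buildCluster, and the label env=dev are made up for illustration. The property key uses the same prefix:property form as the gcloud --properties flag.

```java
import com.google.cloud.dataproc.v1.Cluster;
import com.google.cloud.dataproc.v1.ClusterConfig;
import com.google.cloud.dataproc.v1.InstanceGroupConfig;
import com.google.cloud.dataproc.v1.SoftwareConfig;

// Hypothetical class name, used only for this illustration.
public class ClusterWithPropertiesAndLabels {

  // Builds the Cluster message only; no request is sent to Dataproc here.
  public static Cluster buildCluster(String clusterName) {
    // Counterpart of --properties=spark:spark.executor.memory=4g.
    // Call putProperties again (or putAllProperties with a Map) for more entries.
    SoftwareConfig softwareConfig =
        SoftwareConfig.newBuilder()
            .putProperties("spark:spark.executor.memory", "4g")
            .build();

    // Same master and worker settings as in the question's sample.
    InstanceGroupConfig masterConfig =
        InstanceGroupConfig.newBuilder()
            .setMachineTypeUri("n1-standard-1")
            .setNumInstances(1)
            .build();
    InstanceGroupConfig workerConfig =
        InstanceGroupConfig.newBuilder()
            .setMachineTypeUri("n1-standard-1")
            .setNumInstances(2)
            .build();

    // Properties live on the SoftwareConfig, which hangs off the ClusterConfig.
    ClusterConfig clusterConfig =
        ClusterConfig.newBuilder()
            .setMasterConfig(masterConfig)
            .setWorkerConfig(workerConfig)
            .setSoftwareConfig(softwareConfig)
            .build();

    // Labels are set on the Cluster itself, not on its config.
    return Cluster.newBuilder()
        .setClusterName(clusterName)
        .setConfig(clusterConfig)
        .putLabels("env", "dev") // hypothetical label key and value
        .build();
  }

  public static void main(String[] args) {
    Cluster cluster = buildCluster("your-cluster-name");
    // Print what was set, to check the message before sending it.
    System.out.println(cluster.getConfig().getSoftwareConfig().getPropertiesMap());
    System.out.println(cluster.getLabelsMap());
  }
}
```

The resulting Cluster object can then be passed to clusterControllerClient.createClusterAsync(projectId, region, cluster) exactly as in the question's code.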