How to set optional properties on a Google Dataproc cluster using the Java API?

I am trying to create a Dataproc cluster using the Java API, following this documentation: https://cloud.google.com/dataproc/docs/quickstarts/quickstart-lib

The sample code is as follows:


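import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.dataproc.v1.Cluster;
import com.google.cloud.dataproc.v1.ClusterConfig;
import com.google.cloud.dataproc.v1.ClusterControllerClient;
import com.google.cloud.dataproc.v1.ClusterControllerSettings;
import com.google.cloud.dataproc.v1.ClusterOperationMetadata;
import com.google.cloud.dataproc.v1.InstanceGroupConfig;
import java.io.IOException;
import java.util.concurrent.ExecutionException;

public class CreateCluster {

  public static void createCluster() throws IOException, InterruptedException {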
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String region = "your-project-region";
    String clusterName = "your-cluster-name";
    createCluster(projectId, region, clusterName);
  }

  public static void createCluster(String projectId, String region, String clusterName)
      throws IOException, InterruptedException {
    String myEndpoint = String.format("%s-dataproc.googleapis.com:443", region);

    // Configure the settings for the cluster controller client.
    ClusterControllerSettings clusterControllerSettings =
        ClusterControllerSettings.newBuilder().setEndpoint(myEndpoint).build();

    // Create a cluster controller client with the configured settings. The client only needs to be
    // created once and can be reused for multiple requests. Using a try-with-resources
    // closes the client, but this can also be done manually with the .close() method.
    try (ClusterControllerClient clusterControllerClient =
        ClusterControllerClient.create(clusterControllerSettings)) {
      // Configure the settings for our cluster.
      InstanceGroupConfig masterConfig =
          InstanceGroupConfig.newBuilder()
              .setMachineTypeUri("n1-standard-1")
              .setNumInstances(1)
              .build();
      InstanceGroupConfig workerConfig =
          InstanceGroupConfig.newBuilder()
              .setMachineTypeUri("n1-standard-1")
              .setNumInstances(2)
              .build();
      ClusterConfig clusterConfig =
          ClusterConfig.newBuilder()
              .setMasterConfig(masterConfig)
              .setWorkerConfig(workerConfig)
              .build();
      // Create the cluster object with the desired cluster config.
      Cluster cluster =
          Cluster.newBuilder().setClusterName(clusterName).setConfig(clusterConfig).build();

      // Create the Cloud Dataproc cluster.
      OperationFuture<Cluster, ClusterOperationMetadata> createClusterAsyncRequest =
          clusterControllerClient.createClusterAsync(projectId, region, cluster);
      Cluster response = createClusterAsyncRequest.get();

      // Print out a success message.
      System.out.printf("Cluster created successfully: %s", response.getClusterName());

    } catch (ExecutionException e) {
      System.err.println(String.format("Error executing createCluster: %s ", e.getMessage()));
    }
  }
}

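Following the documentation, I was able to create the cluster successfully, but there are a few optional properties that I can't figure out how to set here. For reference, these properties can be set in the Google Cloud Console.

They can also be added with the Google Cloud SDK, as follows: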

gcloud dataproc clusters create my-cluster \
    --region=region \
    --properties=spark:spark.executor.memory=4g \
    ... other args ...

How can I set these using the Java API? The same question applies to labels: how can they be set using the Java API?

Setting labels on a cluster

You can check the full API reference of the Dataproc Java client library.

Specifically, to set properties you want to look at SoftwareConfig.Builder. Similarly, you can associate a label with a cluster using Cluster.Builder.
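
As a minimal sketch of the above, the snippet below builds a Cluster message that carries one Spark property and one label. It assumes the `google-cloud-dataproc` client library (package `com.google.cloud.dataproc.v1`) is on the classpath. The class name `ClusterWithPropertiesAndLabels`, the helper `buildCluster`, and the label `env=dev` are hypothetical names chosen for illustration; the property mirrors the `--properties` value from the gcloud command in the question. No request is sent to Dataproc here; the code only constructs the message.

```java
import com.google.cloud.dataproc.v1.Cluster;
import com.google.cloud.dataproc.v1.ClusterConfig;
import com.google.cloud.dataproc.v1.InstanceGroupConfig;
import com.google.cloud.dataproc.v1.SoftwareConfig;

// Hypothetical class name, used only for this illustration.
public class ClusterWithPropertiesAndLabels {

  // Builds the Cluster message only; nothing is sent to Dataproc.
  public static Cluster buildCluster(String clusterName) {
    InstanceGroupConfig masterConfig =
        InstanceGroupConfig.newBuilder()
            .setMachineTypeUri("n1-standard-1")
            .setNumInstances(1)
            .build();
    InstanceGroupConfig workerConfig =
        InstanceGroupConfig.newBuilder()
            .setMachineTypeUri("n1-standard-1")
            .setNumInstances(2)
            .build();

    // Properties are a string-to-string map on SoftwareConfig.
    // Keys use the "prefix:property" form, as in the gcloud --properties flag.
    SoftwareConfig softwareConfig =
        SoftwareConfig.newBuilder()
            .putProperties("spark:spark.executor.memory", "4g")
            .build();

    ClusterConfig clusterConfig =
        ClusterConfig.newBuilder()
            .setMasterConfig(masterConfig)
            .setWorkerConfig(workerConfig)
            .setSoftwareConfig(softwareConfig)
            .build();

    // Labels are a string-to-string map on the Cluster itself.
    // "env" / "dev" is an assumed example label, not taken from the question.
    return Cluster.newBuilder()
        .setClusterName(clusterName)
        .setConfig(clusterConfig)
        .putLabels("env", "dev")
        .build();
  }

  public static void main(String[] args) {
    Cluster cluster = buildCluster("your-cluster-name");
    System.out.println(cluster.getConfig().getSoftwareConfig().getPropertiesMap());
    System.out.println(cluster.getLabelsMap());
  }
}
```

The resulting Cluster object can then be passed to `clusterControllerClient.createClusterAsync(projectId, region, cluster)` exactly as in the question's sample. To set several entries at once, the generated builders also offer `putAllProperties` and `putAllLabels`, which accept a `Map<String, String>`.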