无法再为 requirements.txt 文件中的 Google Cloud Dataflow 作业安装 `google-cloud-datastore` 依赖项

Can no longer install `google-cloud-datastore` dependency for Google Cloud Dataflow jobs in requirements.txt file

与上次 post 类似的事情再次发生:Google Python cloud-dataflow instances broke without new deployment (failed pubsub import)

基本上,一夜之间,我们所有的云数据流作业似乎无缘无故地中断了。没有新的部署,没有 SDK 更新,什么都没有,我们的团队只是在 StackDriver 通知中醒来,说我们的映射作业在一夜之间失败了。

这是作业失败的堆栈跟踪

Traceback (most recent call last):
    File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 609, in do_work work_executor.execute()
    File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", line 167, in execute op.start()
    File "apache_beam/runners/worker/operations.py", line 340, in apache_beam.runners.worker.operations.DoOperation.start def start(self):
    File "apache_beam/runners/worker/operations.py", line 341, in apache_beam.runners.worker.operations.DoOperation.start with self.scoped_start_state:
    File "apache_beam/runners/worker/operations.py", line 346, in apache_beam.runners.worker.operations.DoOperation.start pickler.loads(self.spec.serialized_fn))
    File "/usr/local/lib/python2.7/dist-packages/apache_beam/internal/pickler.py", line 225, in loads return dill.loads(s)
    File "/usr/local/lib/python2.7/dist-packages/dill/dill.py", line 277, in loads return load(file)
    File "/usr/local/lib/python2.7/dist-packages/dill/dill.py", line 266, in load obj = pik.load()
    File "/usr/lib/python2.7/pickle.py", line 864, in load dispatch[key](self)
    File "/usr/lib/python2.7/pickle.py", line 1096, in load_global klass = self.find_class(module, name)
    File "/usr/local/lib/python2.7/dist-packages/dill/dill.py", line 423, in find_class return StockUnpickler.find_class(self, module, name)
    File "/usr/lib/python2.7/pickle.py", line 1130, in find_class __import__(module)
    File "/usr/local/lib/python2.7/dist-packages/dataflow_pipeline/invoice_overages.py", line 26,
in <module> from google.cloud.datastore.helpers import entity_from_protobuf ImportError: No module named datastore.helpers

我已尝试在本地重现错误,这似乎是由于无法安装 google-cloud-datastore

这是我们当前的 requirements.txt 文件内容

Flask==0.12.2
apache-beam[gcp]
google-cloud-dataflow
gunicorn==19.7.1
google-cloud-datastore==1.3.0
pytz
google-cloud-pubsub
google-gax
grpc-google-iam-v1
googleapis-common-protos
google-cloud==0.32
six==1.10.0
protobuf

我目前几乎无法在本地重现此内容。我安装这些要求

httplib2==0.9.1
oauth2client==3.0.0
google-cloud-dataflow==2.5.0

我得到上面显示的错误,

$ python main.py
Traceback (most recent call last):
  File "main.py", line 25, in <module>
    import dataflow_pipeline.summarize_intervals as summarization_pipeline
  File "/Users/john/camio-mappers/box-counters-pipeline/dataflow_pipeline/summarize_intervals.py", line 31, in <module>
    from google.cloud.datastore.helpers import entity_from_protobuf
ImportError: No module named datastore.helpers
(venv)

但如果我再做 pip install --ignore-installed google-cloud-datastore

我遇到了这个疯狂的错误

$ python main.py
Traceback (most recent call last):
  File "main.py", line 25, in <module>
    import dataflow_pipeline.summarize_intervals as summarization_pipeline
  File "/Users/john/camio-mappers/box-counters-pipeline/dataflow_pipeline/summarize_intervals.py", line 31, in <module>
    from google.cloud.datastore.helpers import entity_from_protobuf
  File "/Users/john/camio-mappers/box-counters-pipeline/venv/lib/python2.7/site-packages/google/cloud/datastore/__init__.py", line 61, in <module>
    from google.cloud.datastore.batch import Batch
  File "/Users/john/camio-mappers/box-counters-pipeline/venv/lib/python2.7/site-packages/google/cloud/datastore/batch.py", line 24, in <module>
    from google.cloud.datastore import helpers
  File "/Users/john/camio-mappers/box-counters-pipeline/venv/lib/python2.7/site-packages/google/cloud/datastore/helpers.py", line 29, in <module>
    from google.cloud.datastore_v1.proto import datastore_pb2
  File "/Users/john/camio-mappers/box-counters-pipeline/venv/lib/python2.7/site-packages/google/cloud/datastore_v1/__init__.py", line 17, in <module>
    from google.cloud.datastore_v1 import types
  File "/Users/john/camio-mappers/box-counters-pipeline/venv/lib/python2.7/site-packages/google/cloud/datastore_v1/types.py", line 26, in <module>
    from google.cloud.datastore_v1.proto import datastore_pb2
  File "/Users/john/camio-mappers/box-counters-pipeline/venv/lib/python2.7/site-packages/google/cloud/datastore_v1/proto/datastore_pb2.py", line 17, in <module>
    from google.cloud.datastore_v1.proto import entity_pb2 as google_dot_cloud_dot_datastore__v1_dot_proto_dot_entity__pb2
  File "/Users/john/camio-mappers/box-counters-pipeline/venv/lib/python2.7/site-packages/google/cloud/datastore_v1/proto/entity_pb2.py", line 28, in <module>
    dependencies=[google_dot_api_dot_annotations__pb2.DESCRIPTOR,google_dot_protobuf_dot_struct__pb2.DESCRIPTOR,google_dot_protobuf_dot_timestamp__pb2.DESCRIPTOR,google_dot_type_dot_latlng__pb2.DESCRIPTOR,])
  File "/Users/john/camio-mappers/box-counters-pipeline/venv/lib/python2.7/site-packages/google/protobuf/descriptor.py", line 878, in __new__
    return _message.default_pool.AddSerializedFile(serialized_pb)
TypeError: Couldn't build proto file into descriptor pool!
Invalid proto descriptor for file "google/cloud/datastore_v1/proto/entity.proto":
  google.datastore.v1.PartitionId.project_id: "google.datastore.v1.PartitionId.project_id" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.PartitionId.namespace_id: "google.datastore.v1.PartitionId.namespace_id" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.PartitionId: "google.datastore.v1.PartitionId" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Key.partition_id: "google.datastore.v1.Key.partition_id" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Key.path: "google.datastore.v1.Key.path" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Key.PathElement.id_type: "google.datastore.v1.Key.PathElement.id_type" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Key.PathElement.kind: "google.datastore.v1.Key.PathElement.kind" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Key.PathElement.id: "google.datastore.v1.Key.PathElement.id" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Key.PathElement.name: "google.datastore.v1.Key.PathElement.name" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Key.PathElement: "google.datastore.v1.Key.PathElement" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Key: "google.datastore.v1.Key" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.ArrayValue.values: "google.datastore.v1.ArrayValue.values" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.ArrayValue: "google.datastore.v1.ArrayValue" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.value_type: "google.datastore.v1.Value.value_type" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.null_value: "google.datastore.v1.Value.null_value" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.boolean_value: "google.datastore.v1.Value.boolean_value" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.integer_value: "google.datastore.v1.Value.integer_value" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.double_value: "google.datastore.v1.Value.double_value" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.timestamp_value: "google.datastore.v1.Value.timestamp_value" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.key_value: "google.datastore.v1.Value.key_value" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.string_value: "google.datastore.v1.Value.string_value" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.blob_value: "google.datastore.v1.Value.blob_value" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.geo_point_value: "google.datastore.v1.Value.geo_point_value" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.entity_value: "google.datastore.v1.Value.entity_value" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.array_value: "google.datastore.v1.Value.array_value" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.meaning: "google.datastore.v1.Value.meaning" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value.exclude_from_indexes: "google.datastore.v1.Value.exclude_from_indexes" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Value: "google.datastore.v1.Value" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Entity.key: "google.datastore.v1.Entity.key" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Entity.properties: "google.datastore.v1.Entity.properties" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Entity.PropertiesEntry.key: "google.datastore.v1.Entity.PropertiesEntry.key" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Entity.PropertiesEntry.value: "google.datastore.v1.Entity.PropertiesEntry.value" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Entity.PropertiesEntry: "google.datastore.v1.Entity.PropertiesEntry" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Entity: "google.datastore.v1.Entity" is already defined in file "google/cloud/proto/datastore/v1/entity.proto".
  google.datastore.v1.Key.partition_id: "google.datastore.v1.PartitionId" seems to be defined in "google/cloud/proto/datastore/v1/entity.proto", which is not imported by "google/cloud/datastore_v1/proto/entity.proto".  To use it here, please add the necessary import.
  google.datastore.v1.Key.path: "google.datastore.v1.Key.PathElement" seems to be defined in "google/cloud/proto/datastore/v1/entity.proto", which is not imported by "google/cloud/datastore_v1/proto/entity.proto".  To use it here, please add the necessary import.
  google.datastore.v1.ArrayValue.values: "google.datastore.v1.Value" seems to be defined in "google/cloud/proto/datastore/v1/entity.proto", which is not imported by "google/cloud/datastore_v1/proto/entity.proto".  To use it here, please add the necessary import.
  google.datastore.v1.Value.key_value: "google.datastore.v1.Key" seems to be defined in "google/cloud/proto/datastore/v1/entity.proto", which is not imported by "google/cloud/datastore_v1/proto/entity.proto".  To use it here, please add the necessary import.
  google.datastore.v1.Value.entity_value: "google.datastore.v1.Entity" seems to be defined in "google/cloud/proto/datastore/v1/entity.proto", which is not imported by "google/cloud/datastore_v1/proto/entity.proto".  To use it here, please add the necessary import.
  google.datastore.v1.Value.array_value: "google.datastore.v1.ArrayValue" seems to be defined in "google/cloud/proto/datastore/v1/entity.proto", which is not imported by "google/cloud/datastore_v1/proto/entity.proto".  To use it here, please add the necessary import.
  google.datastore.v1.Entity.PropertiesEntry.value: "google.datastore.v1.Value" seems to be defined in "google/cloud/proto/datastore/v1/entity.proto", which is not imported by "google/cloud/datastore_v1/proto/entity.proto".  To use it here, please add the necessary import.
  google.datastore.v1.Entity.key: "google.datastore.v1.Key" seems to be defined in "google/cloud/proto/datastore/v1/entity.proto", which is not imported by "google/cloud/datastore_v1/proto/entity.proto".  To use it here, please add the necessary import.
  google.datastore.v1.Entity.properties: "google.datastore.v1.Entity.PropertiesEntry" seems to be defined in "google/cloud/proto/datastore/v1/entity.proto", which is not imported by "google/cloud/datastore_v1/proto/entity.proto".  To use it here, please add the necessary import.

我不知道这意味着什么。我似乎确实缺少列出的 .proto 文件,但为什么它们会丢失?

我的主要问题是,因为这是上次发生这种情况:为什么当我们不进行任何新部署时工作中断是可以的?假设是,如果我们不更改任何代码,代码就不会中断。如果我们不改变任何依赖关系,我们不应该以损坏的管道结束,然后我们必须争先恐后地修复。 Cloud Dataflow 已经过测试版并且应该是稳定的,如果它处于测试版并且受到 API 更改的影响,这是可以预料的,但此时它应该是稳定的。截至目前,我们的管道已关闭,我们不知道会持续多久。


我对 google-cloud-datastore==1.4.0 进行了建议的更改,它部分修复了我的管道,但现在它在另一点失败了。看来 google-cloud-pubsub 现在失败了,并出现类似的错误,即缺少 .proto 个文件

ERROR: (gcloud.app.deploy) Error Response: [9]
Application startup error:
    import dataflow_pipeline.tally_overages as overaging_pipeline
  File "/home/vmagent/app/dataflow_pipeline/tally_overages.py", line 29, in <module>
    from google.cloud import pubsub
  File "/env/local/lib/python2.7/site-packages/google/cloud/pubsub.py", line 17, in <module>
    from google.cloud.pubsub_v1 import PublisherClient
  File "/env/local/lib/python2.7/site-packages/google/cloud/pubsub_v1/__init__.py", line 17, in <module>
    from google.cloud.pubsub_v1 import types
  File "/env/local/lib/python2.7/site-packages/google/cloud/pubsub_v1/types.py", line 30, in <module>
    from google.cloud.pubsub_v1.proto import pubsub_pb2
  File "/env/local/lib/python2.7/site-packages/google/cloud/pubsub_v1/proto/pubsub_pb2.py", line 29, in <module>
    dependencies=[google_dot_api_dot_annotations__pb2.DESCRIPTOR,google_dot_protobuf_dot_duration__pb2.DESCRIPTOR,google_dot_protobuf_dot_empty__pb2.DESCRIPTOR,google_dot_protobuf_dot_field__mask__pb2.DESCRIPTOR,google_dot_protobuf_dot_timestamp__pb2.DESCRIPTOR,])
  File "/env/local/lib/python2.7/site-packages/google/protobuf/descriptor.py", line 878, in __new__
    return _message.default_pool.AddSerializedFile(serialized_pb)
TypeError: Couldn't build proto file into descriptor pool!
Invalid proto descriptor for file "google/cloud/pubsub_v1/proto/pubsub.proto":
  google.pubsub.v1.Topic.name: "google.pubsub.v1.Topic.name" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.Topic: "google.pubsub.v1.Topic" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.PubsubMessage.data: "google.pubsub.v1.PubsubMessage.data" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.PubsubMessage.attributes: "google.pubsub.v1.PubsubMessage.attributes" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.PubsubMessage.message_id: "google.pubsub.v1.PubsubMessage.message_id" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.PubsubMessage.publish_time: "google.pubsub.v1.PubsubMessage.publish_time" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.PubsubMessage.AttributesEntry.key: "google.pubsub.v1.PubsubMessage.AttributesEntry.key" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.PubsubMessage.AttributesEntry.value: "google.pubsub.v1.PubsubMessage.AttributesEntry.value" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.PubsubMessage.AttributesEntry: "google.pubsub.v1.PubsubMessage.AttributesEntry" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.PubsubMessage: "google.pubsub.v1.PubsubMessage" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.GetTopicRequest.topic: "google.pubsub.v1.GetTopicRequest.topic" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.GetTopicRequest: "google.pubsub.v1.GetTopicRequest" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.PublishRequest.topic: "google.pubsub.v1.PublishRequest.topic" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.PublishRequest.messages: "google.pubsub.v1.PublishRequest.messages" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.PublishRequest: "google.pubsub.v1.PublishRequest" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.PublishResponse.message_ids: "google.pubsub.v1.PublishResponse.message_ids" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.PublishResponse: "google.pubsub.v1.PublishResponse" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicsRequest.project: "google.pubsub.v1.ListTopicsRequest.project" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicsRequest.page_size: "google.pubsub.v1.ListTopicsRequest.page_size" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicsRequest.page_token: "google.pubsub.v1.ListTopicsRequest.page_token" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicsRequest: "google.pubsub.v1.ListTopicsRequest" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicsResponse.topics: "google.pubsub.v1.ListTopicsResponse.topics" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicsResponse.next_page_token: "google.pubsub.v1.ListTopicsResponse.next_page_token" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicsResponse: "google.pubsub.v1.ListTopicsResponse" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicSubscriptionsRequest.topic: "google.pubsub.v1.ListTopicSubscriptionsRequest.topic" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicSubscriptionsRequest.page_size: "google.pubsub.v1.ListTopicSubscriptionsRequest.page_size" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicSubscriptionsRequest.page_token: "google.pubsub.v1.ListTopicSubscriptionsRequest.page_token" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicSubscriptionsRequest: "google.pubsub.v1.ListTopicSubscriptionsRequest" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicSubscriptionsResponse.subscriptions: "google.pubsub.v1.ListTopicSubscriptionsResponse.subscriptions" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicSubscriptionsResponse.next_page_token: "google.pubsub.v1.ListTopicSubscriptionsResponse.next_page_token" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.ListTopicSubscriptionsResponse: "google.pubsub.v1.ListTopicSubscriptionsResponse" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.DeleteTopicRequest.topic: "google.pubsub.v1.DeleteTopicRequest.topic" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.DeleteTopicRequest: "google.pubsub.v1.DeleteTopicRequest" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.Subscription.name: "google.pubsub.v1.Subscription.name" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.Subscription.topic: "google.pubsub.v1.Subscription.topic" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.Subscription.push_config: "google.pubsub.v1.Subscription.push_config" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.Subscription.ack_deadline_seconds: "google.pubsub.v1.Subscription.ack_deadline_seconds" is already defined in file "google/cloud/proto/pubsub/v1/pubsub.proto".
  google.pubsub.v1.Subscription.retain_acked_messages: "google.pubsub.v1.Subscription.retain_acked_messages" is already defined in 
  google.pubsub.v1.Subscriber.Seek: "google.pubsub.v1.SeekRequest" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.
  google.pubsub.v1.Subscriber.Seek: "google.pubsub.v1.SeekResponse" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.
  google.pubsub.v1.Publisher.CreateTopic: "google.pubsub.v1.Topic" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.
  google.pubsub.v1.Publisher.CreateTopic: "google.pubsub.v1.Topic" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.
  google.pubsub.v1.Publisher.UpdateTopic: "google.pubsub.v1.Topic" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.
  google.pubsub.v1.Publisher.Publish: "google.pubsub.v1.PublishRequest" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.
  google.pubsub.v1.Publisher.Publish: "google.pubsub.v1.PublishResponse" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.
  google.pubsub.v1.Publisher.GetTopic: "google.pubsub.v1.GetTopicRequest" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.
  google.pubsub.v1.Publisher.GetTopic: "google.pubsub.v1.Topic" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.
  google.pubsub.v1.Publisher.ListTopics: "google.pubsub.v1.ListTopicsRequest" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.
  google.pubsub.v1.Publisher.ListTopics: "google.pubsub.v1.ListTopicsResponse" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.
  google.pubsub.v1.Publisher.ListTopicSubscriptions: "google.pubsub.v1.ListTopicSubscriptionsRequest" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.
  google.pubsub.v1.Publisher.ListTopicSubscriptions: "google.pubsub.v1.ListTopicSubscriptionsResponse" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.
  google.pubsub.v1.Publisher.DeleteTopic: "google.pubsub.v1.DeleteTopicRequest" seems to be defined in "google/cloud/proto/pubsub/v1/pubsub.proto", which is not imported by "google/cloud/pubsub_v1/proto/pubsub.proto".  To use it here, please add the necessary import.

我不太明白这些错误是什么意思。这是我的新 requirements.txt 文件

httplib2==0.9.1
oauth2client==3.0.0
google-cloud-dataflow==2.5.0
dill==0.2.6
Flask
gunicorn
pytz
googledatastore
google-cloud-datastore==1.4.0
google-cloud
google-cloud-pubsub

编辑 - 这是代码的导入行

文件 1

from __future__ import absolute_import

# standard imports
import datetime
import json
import logging
import base64
import collections
import traceback
import hashlib
from functools import reduce

# apache beam / dataflow imports
import apache_beam as beam
from apache_beam.io.gcp.datastore.v1.datastoreio import ReadFromDatastore
from apache_beam.options.pipeline_options import PipelineOptions

# google cloud datastore imports
from google.cloud.proto.datastore.v1 import query_pb2
from googledatastore import helper as datastore_helper, PropertyFilter, CompositeFilter
from google.cloud.datastore.helpers import entity_from_protobuf

# custom utility imports
from .util import *

文件 2

from __future__ import absolute_import

import datetime
import math
import json
import logging
import traceback
import collections
import hashlib
from functools import reduce

import apache_beam as beam
from apache_beam.io.gcp.datastore.v1.datastoreio import ReadFromDatastore
from apache_beam.options.pipeline_options import PipelineOptions

from google.cloud.proto.datastore.v1 import query_pb2
from googledatastore import helper as datastore_helper, PropertyFilter, CompositeFilter
from google.cloud.datastore.helpers import entity_from_protobuf
from google.cloud import pubsub

from .util import *

文件 3

from __future__ import absolute_import

import math
import json
import logging
import traceback
import uuid
import collections
import hashlib
import datetime
from functools import reduce

import apache_beam as beam
from apache_beam.io.gcp.datastore.v1.datastoreio import ReadFromDatastore
from apache_beam.options.pipeline_options import PipelineOptions

from google.cloud.proto.datastore.v1 import query_pb2
from googledatastore import helper as datastore_helper, PropertyFilter, CompositeFilter
from google.cloud.datastore.helpers import entity_from_protobuf
from google.cloud import pubsub

from .util import *

并来自 util.* 个文件

import re
import os
import math
import datetime
import json
import logging
import base64
import traceback
import itertools
import pytz
import requests
from functools import reduce

# google cloud datastore imports
from google.cloud.proto.datastore.v1 import query_pb2
from googledatastore import helper as datastore_helper, PropertyFilter, CompositeFilter
from google.cloud.datastore.helpers import entity_from_protobuf

我最终通过制作最少的管道集合并一个一个地添加依赖项直到我找到一些有效的组合来让它工作。依赖如下

Flask
gunicorn
apache-beam[gcp]==2.6.0
oauth2client==3.0.0
google-cloud-datastore==1.3.0
google-cloud-pubsub==0.28.0
google-cloud-core==0.27.0
google-cloud==0.34.0

仍然很沮丧,因为这些东西在没有任何警告和我们这边没有采取任何行动的情况下崩溃了,我们的生产管道已经离线了一个星期,但现在已经恢复了,希望这会对那里的人有所帮助。

编辑 -

说得太快了。这种组合在本地有效,但在云中仍会中断 >:-(

编辑2 -

所以.. Google 似乎停止从 requirements.txt 文件安装。我的 setup.py 文件中有这个

REQUIRED_PACKAGES = ['google-cloud']

看起来这是唯一安装的依赖项!当我添加这段代码时

if os.path.exists('requirements.txt'):
    with open('requirements.txt') as fh:
        REQUIRED_PACKAGES=[line.strip() for line in fh.readlines()]

一切顺利!现在我的管道运行了。

所以看起来 Google 改变了他们停止从 requirements.txt 安装的地方。

什么!?!