Pandas join: Can only compare identically-labeled Series objects

I have two DataFrames, telemetry and errors1, and I am performing a pandas join on them.

The telemetry DataFrame looks like this

and the errors1 DataFrame looks like this

The join is performed like this:

error_count= telemetry.join(errors1, on= ((telemetry['machineID'] == errors1['machineID']) 
                               & (telemetry['datetime'] == errors1['datetime'])), 
                            how='left')

This raises the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-222-84983d093331> in <module>
----> 1 error_count= telemetry.join(errors1, on= ((telemetry['machineID'] == errors1['machineID']) 
      2                                & (telemetry['datetime'] == errors1['datetime'])), 
      3                             how='left')

/anaconda/envs/azureml_44cb7df5d7402b6a151767e96abfe35d/lib/python3.6/site-packages/pandas/core/ops/common.py in new_method(self, other)
     62         other = item_from_zerodim(other)
     63 
---> 64         return method(self, other)
     65 
     66     return new_method

/anaconda/envs/azureml_44cb7df5d7402b6a151767e96abfe35d/lib/python3.6/site-packages/pandas/core/ops/__init__.py in wrapper(self, other)
    519 
    520         if isinstance(other, ABCSeries) and not self._indexed_same(other):
--> 521             raise ValueError("Can only compare identically-labeled Series objects")
    522 
    523         lvalues = extract_array(self, extract_numpy=True)

ValueError: Can only compare identically-labeled Series objects
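
The traceback shows the error coming from the `==` comparisons, before `join` is ever called: the two Series being compared do not share the same index. The `on` parameter of `DataFrame.join` also expects column or index level names rather than a boolean expression. A minimal sketch that reproduces the comparison error in isolation, using two hypothetical Series of different lengths as stand-ins for the `machineID` columns:

```python
import pandas as pd

# Hypothetical stand-ins for the two machineID columns; the lengths differ,
# so the two Series do not carry identical index labels.
left = pd.Series([1, 1, 2], name="machineID")
right = pd.Series([1, 2], name="machineID")

message = None
try:
    # Element-wise comparison requires both Series to have the same index.
    left == right
except ValueError as exc:
    message = str(exc)

print(message)
```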

Edit 1: If I instead use error_count= telemetry.join(errors1.set_index(['machineID','datetime']), on=['machineID', 'datetime'], how='left'), I get the following error.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-228-845bbda5ab1b> in <module>
----> 1 error_count= telemetry.join(errors1.set_index(['machineID','datetime']), on=['machineID', 'datetime'], how='left')

/anaconda/envs/azureml_44cb7df5d7402b6a151767e96abfe35d/lib/python3.6/site-packages/pandas/core/frame.py in join(self, other, on, how, lsuffix, rsuffix, sort)
   7204         """
   7205         return self._join_compat(
-> 7206             other, on=on, how=how, lsuffix=lsuffix, rsuffix=rsuffix, sort=sort
   7207         )
   7208 

/anaconda/envs/azureml_44cb7df5d7402b6a151767e96abfe35d/lib/python3.6/site-packages/pandas/core/frame.py in _join_compat(self, other, on, how, lsuffix, rsuffix, sort)
   7227                 right_index=True,
   7228                 suffixes=(lsuffix, rsuffix),
-> 7229                 sort=sort,
   7230             )
   7231         else:

/anaconda/envs/azureml_44cb7df5d7402b6a151767e96abfe35d/lib/python3.6/site-packages/pandas/core/reshape/merge.py in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate)
     84         copy=copy,
     85         indicator=indicator,
---> 86         validate=validate,
     87     )
     88     return op.get_result()

/anaconda/envs/azureml_44cb7df5d7402b6a151767e96abfe35d/lib/python3.6/site-packages/pandas/core/reshape/merge.py in __init__(self, left, right, how, on, left_on, right_on, axis, left_index, right_index, sort, suffixes, copy, indicator, validate)
    629         # validate the merge keys dtypes. We may need to coerce
    630         # to avoid incompat dtypes
--> 631         self._maybe_coerce_merge_keys()
    632 
    633         # If argument passed to validate,

/anaconda/envs/azureml_44cb7df5d7402b6a151767e96abfe35d/lib/python3.6/site-packages/pandas/core/reshape/merge.py in _maybe_coerce_merge_keys(self)
   1148             # datetimelikes must match exactly
   1149             elif needs_i8_conversion(lk) and not needs_i8_conversion(rk):
-> 1150                 raise ValueError(msg)
   1151             elif not needs_i8_conversion(lk) and needs_i8_conversion(rk):
   1152                 raise ValueError(msg)

ValueError: You are trying to merge on datetime64[ns] and object columns. If you wish to proceed you should use pd.concat

I suggest you use pd.merge:

df = pd.merge(telemetry, errors1, how='left', left_on=['machineID','datetime'], right_on = ['machineID','datetime'])
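
Note that the second traceback in the question reports a merge on datetime64[ns] and object columns, which suggests that errors1['datetime'] holds strings rather than timestamps; in that case pd.merge would likely hit the same dtype check. A minimal sketch, assuming the strings are parseable by pd.to_datetime, that converts the column before merging. The sample values and the volt and errorID columns below are made up for illustration; only machineID and datetime come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the key column names come from the question.
telemetry = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2015-01-01 06:00:00",
        "2015-01-01 07:00:00",
        "2015-01-01 06:00:00",
    ]),
    "machineID": [1, 1, 2],
    "volt": [176.2, 162.9, 170.9],
})
errors1 = pd.DataFrame({
    "datetime": ["2015-01-01 07:00:00"],  # stored as strings, not datetime64
    "machineID": [1],
    "errorID": ["error1"],
})

# Make the key dtypes match before merging.
errors1["datetime"] = pd.to_datetime(errors1["datetime"])

# Both frames use the same key names, so a single `on` list is enough.
error_count = pd.merge(telemetry, errors1, how="left", on=["machineID", "datetime"])
print(error_count)
```

With how='left', every telemetry row is kept, and the errors1 columns are NaN wherever no matching (machineID, datetime) pair exists.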