Is there a difference between series.tolist() and a list in pandas?

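I defined a function z. It works when I pass it a list, but when I pass it a Series (even after converting the Series to a list) it returns the wrong answer. The input to my function z has to be a Series. How can I fix this?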

list1 = [np.nan, 14975, 98121]
series1 = pd.Series([np.nan,14975,98121])

z(series1.tolist())
['0', '0', '0']

z(list1)
['0', '1', '98121']

My z function is:

def z(each):
    zipcode_list = []
    for i in each:   
        try:
            if zipcodes.is_real(str(i)):
                zip_code = str(i)      
            else:
                zip_code = str(1)
        except:
            zip_code = str(0)   
        zipcode_list.append(zip_code)
    return zipcode_list

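pandas does return a proper list here (see https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.tolist.html), but the values in it are floats rather than integers. A Series that contains NaN is stored with dtype float64, so 14975 becomes 14975.0 and str(i) gives '14975.0'. Looking up a zip code from that string raises an error, the except branch catches it, and every element ends up as '0'.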

You can see this by running the following:

import pandas as pd
import numpy as np
import zipcodes
list1 = [np.nan, 14975, 98121]
series1 = pd.Series([np.nan,14975,98121])


def z(each):
    zipcode_list = []
    for i in each:
        print(i, type(i))
        try:
            if zipcodes.is_real(str(i)):
                zip_code = str(i)
            else:
                zip_code = str(1)
        except Exception:
            zip_code = str(0)
        zipcode_list.append(zip_code)
    return zipcode_list


print(z(series1.tolist()))

print(z(list1))

Output:

nan <class 'float'>
14975.0 <class 'float'>
98121.0 <class 'float'>
['0', '0', '0']
nan <class 'float'>
14975 <class 'int'>
98121 <class 'int'>
['0', '1', '98121']
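The floats come from the Series itself rather than from tolist(). The short check below is only an illustrative sketch: it rebuilds the same Series, then inspects its dtype and the element types of the resulting list. It needs only pandas and NumPy, not zipcodes.

```python
import numpy as np
import pandas as pd

series1 = pd.Series([np.nan, 14975, 98121])

# NaN cannot be stored in a plain integer column, so the whole Series is float64.
print(series1.dtype)

# tolist() hands back ordinary Python floats, one per element.
as_list = series1.tolist()
print([type(v).__name__ for v in as_list])

# This is the string that z ends up passing to the zip code lookup.
print(str(as_list[1]))  # 14975.0
```

So by the time tolist() is called, the integers are already gone. The list literal list1 keeps 14975 and 98121 as int because a plain Python list can mix types freely.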

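Converting each value to an integer inside z, before it is turned into a string and checked, solves your problem. int(np.nan) raises a ValueError, which the except clause catches, so NaN still maps to '0'. See: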

import pandas as pd
import numpy as np
import zipcodes
list1 = [np.nan, 14975, 98121]
series1 = pd.Series([np.nan,14975,98121])


def z(each):
    zipcode_list = []
    for i in each:
        try:
            if zipcodes.is_real(str(int(i))):
                zip_code = str(int(i))
            else:
                zip_code = str(1)
        except Exception:
            zip_code = str(0)
        zipcode_list.append(zip_code)
    return zipcode_list


print(z(series1.tolist()))
# ['0', '1', '98121']

print(z(list1))
# ['0', '1', '98121']
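If you would rather not rely on a broad except to handle NaN, the conversion can be made explicit. The helper below is a hypothetical sketch (the name to_zip_string is not part of pandas or zipcodes): it turns a value into a digit string and returns None for missing or non-integral input. It uses only the standard library.

```python
import math


def to_zip_string(value):
    """Return value as a digit string, or None if it is missing or not a whole number."""
    if value is None:
        return None
    if isinstance(value, float):
        # NaN and fractional floats cannot represent a zip code.
        if math.isnan(value) or not value.is_integer():
            return None
        value = int(value)  # 14975.0 -> 14975
    return str(value)


values = [float("nan"), 14975.0, 98121]
print([to_zip_string(v) for v in values])  # [None, '14975', '98121']
```

Inside z, you could then call to_zip_string(i) first and append '0' when it returns None, leaving the is_real check for the remaining strings.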