Python versatile index slicing
I am hoping someone can help me convert this Excel logic to Python:
=IF(LEFT(A8,5)="Total",A9,I8)
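So I want to find everything within a range and then create a new column using the first element of that range. The problem is that the name of the range can change.
One solution I currently have is to turn the column into the index and select manually by index label, like this: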
Sales = df.loc['1000 - Cash and Equivalents':'Total - 1000 - Cash and Equivalents']
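The problem is that this name may change and the range may contain fewer or more rows, so the selection needs to be more generic; I can't hard-code a numbered range.
Here is a sample of the data:
And after the transformation my data should look like this:
Use: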
import pandas as pd

df = pd.read_csv('PL2.csv', encoding='cp1252', engine='python')
# create a helper df holding the rows whose first column starts with 'Total'
df1 = df.loc[df.iloc[:, 0].str.startswith('Total', na=False), df.columns[0]].to_frame('total')
# label that opens each range: the total label without the 'Total - ' prefix
df1['first'] = df1['total'].str.replace('Total - ', '', regex=False)
print(df1.head(10))
                                    total                          first
17                   Total - 4000 - Sales                   4000 - Sales
21  Total - 4200 - Discounts & Allowances  4200 - Discounts & Allowances
24       Total - 4400 - Excise and Duties       4400 - Excise and Duties
25                          Total - Sales                          Sales
37      Total - 5000 - Cost of Goods Sold      5000 - Cost of Goods Sold
#create index by first column
df = df.set_index(df.columns[0])
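# filter function - if a label is not matched, return an empty df
def get_dict(df, first, last):
    try:
        # .copy() so that adding a column does not write into a slice of the original
        df = df.loc[first:last].copy()
        df['Sub-Category'] = first
    except KeyError:
        df = pd.DataFrame()
    return df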
#in dictionary comprehension create dict of DataFrames
d = {k: get_dict(df, k, v) for k, v in zip(df1['first'], df1['total'])}
#print (d)
#select Sales df
print (d['Sales'])
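If a single frame with the new column is enough, the Excel formula can also be mirrored more directly. The following is only a sketch under assumptions: the toy frame and the column names `Account` and `Amount` are made up for illustration and are not taken from PL2.csv. A row is treated as the start of a range when it is the first row or when the row above it starts with 'Total'; the start label is then carried down with a forward fill.

```python
import pandas as pd

# Hypothetical toy data standing in for the real file (names and values are assumptions).
toy = pd.DataFrame({
    "Account": [
        "4000 - Sales",
        "Item A",
        "Item B",
        "Total - 4000 - Sales",
        "4200 - Discounts & Allowances",
        "Item C",
        "Total - 4200 - Discounts & Allowances",
    ],
    "Amount": [0, 10, 20, 30, 0, 5, 5],
})

# True on rows whose label starts with 'Total' (same test as LEFT(A8,5)="Total").
is_total = toy["Account"].str.startswith("Total", na=False)

# A row opens a new range if it is the first row or the row above is a Total row.
starts_range = is_total.shift(1, fill_value=True)

# Keep the label only on range-opening rows, then carry it down (the I8 branch in Excel).
toy["Sub-Category"] = toy["Account"].where(starts_range).ffill()

print(toy)
```

With this column in place, `toy.groupby("Sub-Category")` gives one group per range without needing to know the range names in advance. Note that a Total row that directly follows another Total row becomes its own range here, which is also what the Excel formula would do.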