How to divide all the values in each column by the values in a given row?
How can I divide all rows by the values in the last row?
col1 col2 col3
'A' 2 3
'B' 8 9
'C' 7 5
'fre' 12 13
I want to divide every value in col2 by 12 and every value in col3 by 13:
col1 col2 col3
'A' 2/12 3/13
'B' 8/12 9/13
'C' 7/12 5/13
'fre' 12/12 13/13
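Please suggest an approach that works for a large number of columns. I want to do this for every column except the first one (as in the example above).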
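Assuming the id (the col1 value) of the last row is unique:
If you need to loop over a large number of columns and want the last row itself left unchanged: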
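from pyspark.sql import SparkSession
from pyspark.sql.functions import col, when
# Reuse the active session if there is one (e.g. in a notebook or the pyspark shell)
spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame(
    [
        ('A', 2, 3),
        ('B', 8, 9),
        ('C', 7, 5),
        ('fre', 12, 13)
    ],
    ['col1', 'col2', 'col3']
)
# Get the last row as a Row object on the driver (DataFrame.tail needs Spark 3.0+)
lr = df.tail(1)[0]
# col1 value of the last row, used in otherwise() to leave that row unchanged
l_col1 = lr[0]
# Divide every column except the first by its last-row value
for c, v in zip(df.columns[1:], lr[1:]):
    df = df.withColumn(c, when(col('col1') != l_col1, col(c) / v).otherwise(v))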
If you want the last row divided as well:
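from pyspark.sql import SparkSession
from pyspark.sql.functions import col
# Reuse the active session if there is one (e.g. in a notebook or the pyspark shell)
spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame(
    [
        ('A', 2, 3),
        ('B', 8, 9),
        ('C', 7, 5),
        ('fre', 12, 13)
    ],
    ['col1', 'col2', 'col3']
)
# Get the last row as a Row object on the driver (DataFrame.tail needs Spark 3.0+)
lr = df.tail(1)[0]
# Divide every column except the first by its last-row value, last row included
for c, v in zip(df.columns[1:], lr[1:]):
    df = df.withColumn(c, col(c) / v)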
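To sanity-check the Spark output on a small sample, the same arithmetic can be written in plain Python. This is only a sketch and not part of the PySpark solution: the helper name divide_by_last_row and its keep_last flag are made up for illustration, and it assumes the rows have already been collected into a list of tuples with the key in the first position.

```python
def divide_by_last_row(rows, keep_last=False):
    """Divide every non-key value by the matching value in the last row.

    rows: list of tuples whose first element is the key (col1).
    keep_last: if True, the last row is returned unchanged (first variant);
               if False, it is divided too and becomes all 1.0 (second variant).
    """
    divisors = rows[-1][1:]
    result = []
    for i, (key, *values) in enumerate(rows):
        if keep_last and i == len(rows) - 1:
            # Leave the divisor row as it is
            result.append((key, *values))
        else:
            result.append((key, *(x / d for x, d in zip(values, divisors))))
    return result


rows = [('A', 2, 3), ('B', 8, 9), ('C', 7, 5), ('fre', 12, 13)]
divided = divide_by_last_row(rows)               # last row divided as well
kept = divide_by_last_row(rows, keep_last=True)  # last row left unchanged
```

Comparing divided or kept against df.collect() from the matching variant above is a quick way to confirm the column loop did what you expected.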