How to generate 2 columns of incremental values and all their unique combinations with pandas?

Question

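I need to create a 2-column DataFrame.

The first column contains the values from 7000 to 15000 in increments of 500 (7000, 7500, 8000, ..., 14500, 15000).

The second column contains all integers from 6 to 24.

I need a simple way to generate these values and all of their unique combinations: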

6,7000
6,7500
6,8000
....
24,14500
24,15000

Answer 1

You can use numpy.arange to generate the sequences of numbers, numpy.repeat and numpy.tile to build the cross product, and numpy.c_ or numpy.column_stack to stack the results into two columns:

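import numpy as np
import pandas as pd

x = np.arange(6, 25)             # 6..24 inclusive
y = np.arange(7000, 15001, 500)  # 7000..15000 inclusive, step 500

# Repeat each x once per y value, tile y once per x value, then stack as columns
pd.DataFrame(np.c_[x.repeat(len(y)), np.tile(y, len(x))])
# Equivalent: pd.DataFrame(np.column_stack([x.repeat(len(y)), np.tile(y, len(x))]))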
      0      1
0     6   7000
1     6   7500
2     6   8000
3     6   8500
4     6   9000
..   ..    ...
318  24  13000
319  24  13500
320  24  14000
321  24  14500
322  24  15000

[323 rows x 2 columns]
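
The frame above has the default column labels 0 and 1. As a minimal sketch of the same repeat/tile idea with named columns, the helper below wraps it in a function. The function name `cross_frame` and the column names `n` and `value` are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd


def cross_frame(x, y, names=("n", "value")):
    """Return every (x, y) pair as a two-column DataFrame.

    `cross_frame` and the default column names are illustrative,
    not part of NumPy or pandas.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    return pd.DataFrame({
        names[0]: np.repeat(x, len(y)),  # each x repeated len(y) times
        names[1]: np.tile(y, len(x)),    # the whole y sequence repeated len(x) times
    })


df = cross_frame(np.arange(6, 25), np.arange(7000, 15001, 500))
print(df.head())
print(df.shape)
```

With 19 values in the first sequence and 17 in the second, the result has 19 * 17 = 323 rows, matching the output shown above.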

Another idea is to use itertools.product:

from itertools import product
pd.DataFrame(list(product(x,y)))
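
For completeness, here is a self-contained sketch of the product approach with named columns, next to a pandas-only variant built on `pd.MultiIndex.from_product`. The second variant is not from the original answer; it is included as an alternative under the assumption that plain `range` objects are acceptable inputs. The column names `n` and `value` are hypothetical.

```python
from itertools import product

import pandas as pd

ns = range(6, 25)                 # 6..24 inclusive
values = range(7000, 15001, 500)  # 7000..15000 inclusive, step 500

# itertools.product yields (n, value) tuples with n varying slowest
df_product = pd.DataFrame(list(product(ns, values)), columns=["n", "value"])

# pandas-only alternative: build the cross product as a MultiIndex,
# then turn its levels into ordinary columns
df_index = pd.MultiIndex.from_product(
    [ns, values], names=["n", "value"]
).to_frame(index=False)

print(df_product.head())
print(df_index.tail())
```

Both frames list the pairs in the same order as the NumPy version, with the first column varying slowest.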

Timeit results:

# Henry's answer in comments
In [44]: %timeit pd.DataFrame([(x,y) for x in range(6,25) for y in range(7000,15001,500)])
657 µs ± 169 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

# My solution
In [45]: %%timeit
    ...: x = np.arange(6, 25)
    ...: y = np.arange(7000, 15001, 500)
    ...: 
    ...: pd.DataFrame(np.c_[x.repeat(len(y)),np.tile(y, len(x))])
    ...:
    ...:
155 µs ± 13.7 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

# Using `np.column_stack`
In [49]: %%timeit
    ...: x = np.arange(6, 25)
    ...: y = np.arange(7000, 15001, 500)
    ...: 
    ...: pd.DataFrame(np.column_stack([x.repeat(len(y)),np.tile(y, len(x))]))
    ...:
121 µs ± 10.2 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

# `itertools.product` solution
In [62]: %timeit pd.DataFrame(list(product(x,y)))
489 µs ± 7.18 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
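
The timings above come from IPython's %timeit magic. To repeat the comparison in a plain Python script, a rough sketch with the standard-library `timeit` module could look like the following. The function names and the `number`/`repeat` settings are arbitrary choices for illustration, and absolute numbers will differ by machine and library version.

```python
import timeit
from itertools import product

import numpy as np
import pandas as pd


def with_comprehension():
    return pd.DataFrame([(x, y) for x in range(6, 25)
                         for y in range(7000, 15001, 500)])


def with_column_stack():
    x = np.arange(6, 25)
    y = np.arange(7000, 15001, 500)
    return pd.DataFrame(np.column_stack([x.repeat(len(y)), np.tile(y, len(x))]))


def with_product():
    x = np.arange(6, 25)
    y = np.arange(7000, 15001, 500)
    return pd.DataFrame(list(product(x, y)))


candidates = {
    "comprehension": with_comprehension,
    "column_stack": with_column_stack,
    "product": with_product,
}

timings = {}
for name, func in candidates.items():
    # Best of 5 repeats, 200 calls each, reported as seconds per call
    timings[name] = min(timeit.repeat(func, number=200, repeat=5)) / 200
    print(f"{name:>15}: {timings[name] * 1e6:.1f} µs per call")
```

Unlike the `%timeit pd.DataFrame(list(product(x,y)))` line above, this sketch creates the arrays inside every timed function, so each candidate pays the same setup cost.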
