解释 Numpy-Array 中的特定位
Interpreting specific bits in Numpy-Array
我正在尝试解释 HDF 文件或更准确地说是文件中的光栅带。有一个用于质量评估的 Band,它将位信息表示为相应的整数(例如 01000000 为 64)。
根据我想要计数的特定位,例如有多少像素在位位置 5 上为 1。如果之前的计数考虑到了它们,则不应再次计数。现在我正在根据自己的优先级列表更改每个单元格中的条目。这需要很长时间,而且我真的确信有更快的方法,因为我以前从未使用过位。
这是我的代码:
from osgeo import gdal
import numpy as np
QA_Band = gdal.Open(hdf.GetSubDatasets()[B][0],gdal.GA_ReadOnly)
QA = QA_Band.ReadAsArray()
# Calculate Bit-Representation of QA-Band
bin_vec = np.vectorize(np.binary_repr)
QAB = bin_vec(QA, width = 8)
# Filter by Bit-Values
QAB = np.where(QAB == '11111111', "OutOfSwath", QAB)
for i in range(QAB.shape[0]):
for j in range(QAB.shape[1]):
if QAB[i,j][6] == '1':
QAB[i,j] = "Cloud"
Cloud = (QAB == "Cloud").sum()
elif QAB[i,j][4] == '1':
QAB[i,j] = "Cloud Shadow"
Shadow = (QAB == "Cloud Shadow").sum()
elif QAB[i,j][5] == '1':
QAB[i,j] = "Adjacent Cloud"
AC = (QAB == "Adjacent Cloud").sum()
elif QAB[i,j][7] == '1':
QAB[i,j] = "Cirrus"
Cirrus = (QAB == "Cirrus").sum()
elif QAB[i,j][3] == '1':
QAB[i,j] = "Snow/Ice"
SnowC = (QAB == "Snow/Ice").sum()
elif QAB[i,j][2] == '1':
QAB[i,j] = "Water"
WaterC = (QAB == "Water").sum()
elif QAB[i,j][0:1] == '11':
QAB[i,j] = "High Aerosol"
elif QAB[i,j][0:1] == '10':
QAB[i,j] = "Avrg. Aerosol"
elif QAB[i,j][0:1] == '01':
QAB[i,j] = "Low Aerosol"
elif QAB[i,j][0:1] == '00':
QAB[i,j] = "Aerosol Climatology"
这将导致代表不同事物的字符串数组,但如前所述需要时间。
任何有关如何访问位表示的帮助都会有所帮助:)
您可以使用 numpy 函数 unpackbits 来处理位。对于 numpy,我们更喜欢使用 numpy 方法和函数而不是 python for 循环——通常它更快。所以你可以将每个数字解包到第三个轴,然后像 QAB[i, j][5] == '1'
这样的条件变成 result[bits[5]]
。我颠倒了你的 elif 子句的顺序,就好像你 QAB[i, j][6] == '1'
然后它将它设置为 "Cloud"
并且从不 运行 over 子句所以这种情况应该是最后重写 overs 如果我们 运行 每个条件。另外,您的最后一个案例 QAB[i,j][0:1] == '11'
从未触发,因为左侧的长度始终为 1。所以我用您的意思 QAB[i,j][0:2] == ...
.
的消耗重写了
"""
>>> print(bits)
[[[False False False]
[False False True]]
<BLANKLINE>
[[False False False]
[False False True]]
<BLANKLINE>
[[False False False]
[False False True]]
<BLANKLINE>
[[False False False]
[False False True]]
<BLANKLINE>
[[False False False]
[False True True]]
<BLANKLINE>
[[False False False]
[False False True]]
<BLANKLINE>
[[False True True]
[ True False True]]
<BLANKLINE>
[[ True False True]
[ True True True]]]
>>> print(result)
[['Cirrus' 'Cloud' 'Cloud']
['Cloud' 'Cloud Shadow' 'OutOfSwath']]
>>> Cloud
3
"""
import numpy as np
QA = [[1, 2, 3], [3, 9, 255]]
bits = np.unpackbits(np.array(QA, dtype=np.uint8, ndmin=3), axis=0).astype(bool)
result = np.array([[""] * bits.shape[2] for _ in range(bits.shape[1])], dtype=object)
result[~bits[0] & ~bits[1]] = "Aerosol Climatology"
result[~bits[0] & bits[1]] = "Low Aerosol"
result[bits[0] & ~bits[1]] = "Avrg. Aerosol"
result[bits[0] & bits[1]] = "High Aerosol"
result[bits[2]] = "Water"
result[bits[3]] = "Snow/Ice"
result[bits[7]] = "Cirrus"
result[bits[5]] = "Adjacent Cloud"
result[bits[4]] = "Cloud Shadow"
result[bits[6]] = "Cloud"
result[bits.all(axis=0)] = "OutOfSwath"
Cloud = (result == "Cloud").sum()
Shadow = (result == "Cloud Shadow").sum()
AC = (result == "Adjacent Cloud").sum()
Cirrus = (result == "Cirrus").sum()
SnowC = (result == "Snow/Ice").sum()
WaterC = (result == "Water").sum()
您的程序逻辑实际上精确映射到 numpy.select
,根据条件(布尔)数组列表从数组列表中按元素选择,第一个匹配获胜。所以你可以紧凑地写一些像
conditions = QAB&(128>>np.arange(8))[:,None,None]
values = ["OutOfSwath","Cloud","Cloud Shadow","Adjacent Cloud","Cirrus","Snow/Ice",
"Water","High Aerosol","Avrg. Aerosol","Low Aerosol","Aerosol Climatology"]
np.select([QAB==255,*conditions[[6,4,5,7,3,2]],*(QAB==np.arange(0,256,64)[::-1,None,None])],values)
我正在尝试解释 HDF 文件或更准确地说是文件中的光栅带。有一个用于质量评估的 Band,它将位信息表示为相应的整数(例如 01000000 为 64)。
根据我想要计数的特定位,例如有多少像素在位位置 5 上为 1。如果之前的计数考虑到了它们,则不应再次计数。现在我正在根据自己的优先级列表更改每个单元格中的条目。这需要很长时间,而且我真的确信有更快的方法,因为我以前从未使用过位。
这是我的代码:
from osgeo import gdal
import numpy as np
QA_Band = gdal.Open(hdf.GetSubDatasets()[B][0],gdal.GA_ReadOnly)
QA = QA_Band.ReadAsArray()
# Calculate Bit-Representation of QA-Band
bin_vec = np.vectorize(np.binary_repr)
QAB = bin_vec(QA, width = 8)
# Filter by Bit-Values
QAB = np.where(QAB == '11111111', "OutOfSwath", QAB)
for i in range(QAB.shape[0]):
for j in range(QAB.shape[1]):
if QAB[i,j][6] == '1':
QAB[i,j] = "Cloud"
Cloud = (QAB == "Cloud").sum()
elif QAB[i,j][4] == '1':
QAB[i,j] = "Cloud Shadow"
Shadow = (QAB == "Cloud Shadow").sum()
elif QAB[i,j][5] == '1':
QAB[i,j] = "Adjacent Cloud"
AC = (QAB == "Adjacent Cloud").sum()
elif QAB[i,j][7] == '1':
QAB[i,j] = "Cirrus"
Cirrus = (QAB == "Cirrus").sum()
elif QAB[i,j][3] == '1':
QAB[i,j] = "Snow/Ice"
SnowC = (QAB == "Snow/Ice").sum()
elif QAB[i,j][2] == '1':
QAB[i,j] = "Water"
WaterC = (QAB == "Water").sum()
elif QAB[i,j][0:1] == '11':
QAB[i,j] = "High Aerosol"
elif QAB[i,j][0:1] == '10':
QAB[i,j] = "Avrg. Aerosol"
elif QAB[i,j][0:1] == '01':
QAB[i,j] = "Low Aerosol"
elif QAB[i,j][0:1] == '00':
QAB[i,j] = "Aerosol Climatology"
这将导致代表不同事物的字符串数组,但如前所述需要时间。
任何有关如何访问位表示的帮助都会有所帮助:)
您可以使用 numpy 函数 unpackbits 来处理位。对于 numpy,我们更喜欢使用 numpy 方法和函数而不是 python for 循环——通常它更快。所以你可以将每个数字解包到第三个轴,然后像 QAB[i, j][5] == '1'
这样的条件变成 result[bits[5]]
。我颠倒了你的 elif 子句的顺序,就好像你 QAB[i, j][6] == '1'
然后它将它设置为 "Cloud"
并且从不 运行 over 子句所以这种情况应该是最后重写 overs 如果我们 运行 每个条件。另外,您的最后一个案例 QAB[i,j][0:1] == '11'
从未触发,因为左侧的长度始终为 1。所以我用您的意思 QAB[i,j][0:2] == ...
.
"""
>>> print(bits)
[[[False False False]
[False False True]]
<BLANKLINE>
[[False False False]
[False False True]]
<BLANKLINE>
[[False False False]
[False False True]]
<BLANKLINE>
[[False False False]
[False False True]]
<BLANKLINE>
[[False False False]
[False True True]]
<BLANKLINE>
[[False False False]
[False False True]]
<BLANKLINE>
[[False True True]
[ True False True]]
<BLANKLINE>
[[ True False True]
[ True True True]]]
>>> print(result)
[['Cirrus' 'Cloud' 'Cloud']
['Cloud' 'Cloud Shadow' 'OutOfSwath']]
>>> Cloud
3
"""
import numpy as np
QA = [[1, 2, 3], [3, 9, 255]]
bits = np.unpackbits(np.array(QA, dtype=np.uint8, ndmin=3), axis=0).astype(bool)
result = np.array([[""] * bits.shape[2] for _ in range(bits.shape[1])], dtype=object)
result[~bits[0] & ~bits[1]] = "Aerosol Climatology"
result[~bits[0] & bits[1]] = "Low Aerosol"
result[bits[0] & ~bits[1]] = "Avrg. Aerosol"
result[bits[0] & bits[1]] = "High Aerosol"
result[bits[2]] = "Water"
result[bits[3]] = "Snow/Ice"
result[bits[7]] = "Cirrus"
result[bits[5]] = "Adjacent Cloud"
result[bits[4]] = "Cloud Shadow"
result[bits[6]] = "Cloud"
result[bits.all(axis=0)] = "OutOfSwath"
Cloud = (result == "Cloud").sum()
Shadow = (result == "Cloud Shadow").sum()
AC = (result == "Adjacent Cloud").sum()
Cirrus = (result == "Cirrus").sum()
SnowC = (result == "Snow/Ice").sum()
WaterC = (result == "Water").sum()
您的程序逻辑实际上精确映射到 numpy.select
,根据条件(布尔)数组列表从数组列表中按元素选择,第一个匹配获胜。所以你可以紧凑地写一些像
conditions = QAB&(128>>np.arange(8))[:,None,None]
values = ["OutOfSwath","Cloud","Cloud Shadow","Adjacent Cloud","Cirrus","Snow/Ice",
"Water","High Aerosol","Avrg. Aerosol","Low Aerosol","Aerosol Climatology"]
np.select([QAB==255,*conditions[[6,4,5,7,3,2]],*(QAB==np.arange(0,256,64)[::-1,None,None])],values)