How to convert an ndarray to an array in Python
I have a problem. I have the following lines:
s=codecs.open('file.csv', encoding="utf-8").read()
array1=np.asarray(s.splitlines())
print(array1)
Then I get this array as the result:
['39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K'
'50, Self-emp-not-inc, 83311, Bachelors, 13, Married-civ-spouse, Exec-managerial, Husband, White, Male, 0, 0, 13, United-States, <=50K'
'38, Private, 215646, HS-grad, 9, Divorced, Handlers-cleaners, Not-in-family, White, Male, 0, 0, 40, United-States, <=50K'
...
'36, Private, 146311, 9th, 5, Married-civ-spouse, Machine-op-inspct, Husband, White, Male, 0, 0, 40, United-States, <=50K'
'47, Self-emp-not-inc, 159869, Doctorate, 16, Married-civ-spouse, Craft-repair, Husband, White, Male, 0, 0, 50, United-States, <=50K'
'21, Private, 204641, Some-college, 10, Never-married,']
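What I want is to convert it to:
[['39', 'State-gov', '77516', 'Bachelors', '13', ..., '<=50K'], ['50', ...]]
Right now it is an array with a single row and many columns, where each column is one string. I want each of those strings to become its own row, split into columns of separate fields.
I have no idea how to do this. I tried to split it but could not.
Can someone help me?
Thanks!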
Method 1: Generate the desired array from the file
If you are starting from a CSV file, you might as well use np.genfromtxt.
If filename.csv looks like:
39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K
50, Self-emp-not-inc, 83311, Bachelors, 13, Married-civ-spouse, Exec-managerial, Husband, White, Male, 0, 0, 13, United-States, <=50K
Then:
>>> import numpy as np
>>> new_arr = np.genfromtxt('filename.csv', dtype='str')
>>> new_arr
array([['39,', 'State-gov,', '77516,', 'Bachelors,', '13,',
'Never-married,', 'Adm-clerical,', 'Not-in-family,', 'White,',
'Male,', '2174,', '0,', '40,', 'United-States,', '<=50K'],
['50,', 'Self-emp-not-inc,', '83311,', 'Bachelors,', '13,',
'Married-civ-spouse,', 'Exec-managerial,', 'Husband,', 'White,',
'Male,', '0,', '0,', '13,', 'United-States,', '<=50K']],
dtype='<U19')
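Note that the output above keeps the comma attached to each field, because np.genfromtxt splits on whitespace when no delimiter is given. If the goal is fields without commas, as in the question, one option is to pass delimiter=',' together with autostrip=True. The following is only a sketch under some assumptions: the temporary file and the two sample rows are there purely to make the example self-contained, and the name rows is arbitrary.

```python
import os
import tempfile

import numpy as np

# Two sample rows copied from the question; the temporary file exists
# only so that this example can run on its own.
sample = (
    "39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, "
    "Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K\n"
    "50, Self-emp-not-inc, 83311, Bachelors, 13, Married-civ-spouse, "
    "Exec-managerial, Husband, White, Male, 0, 0, 13, United-States, <=50K\n"
)

with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, encoding="utf-8"
) as f:
    f.write(sample)
    path = f.name

try:
    # delimiter="," splits each line on commas;
    # autostrip=True removes the space that follows each comma.
    rows = np.genfromtxt(
        path, dtype=str, delimiter=",", autostrip=True, encoding="utf-8"
    )
finally:
    os.remove(path)

print(rows.shape)
print(rows[0])
```

With a real file you would pass its path directly instead of writing a temporary one. Every line needs the same number of fields for this to work, so a truncated final line like the one shown in the question would have to be dealt with first.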
Method 2: Fix the array
Otherwise, if you already have the array:
>>> arr
array(['39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K',
'50, Self-emp-not-inc, 83311, Bachelors, 13, Married-civ-spouse, Exec-managerial, Husband, White, Male, 0, 0, 13, United-States, <=50K'],
dtype='<U133')
You can iterate over it and split each string to get the output you want:
>>> new_arr = np.array([i.split() for i in arr])
>>> new_arr
array([['39,', 'State-gov,', '77516,', 'Bachelors,', '13,',
'Never-married,', 'Adm-clerical,', 'Not-in-family,', 'White,',
'Male,', '2174,', '0,', '40,', 'United-States,', '<=50K'],
['50,', 'Self-emp-not-inc,', '83311,', 'Bachelors,', '13,',
'Married-civ-spouse,', 'Exec-managerial,', 'Husband,', 'White,',
'Male,', '0,', '0,', '13,', 'United-States,', '<=50K']],
dtype='<U19')
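As with Method 1, split() with no argument splits on whitespace, so the commas stay attached to the fields. A possible variant, sketched here with a hypothetical helper named split_fields, splits on the comma and strips the surrounding spaces:

```python
import numpy as np

# The two rows shown above, stored as a 1-D array of strings.
arr = np.array([
    "39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, "
    "Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K",
    "50, Self-emp-not-inc, 83311, Bachelors, 13, Married-civ-spouse, "
    "Exec-managerial, Husband, White, Male, 0, 0, 13, United-States, <=50K",
])


def split_fields(line):
    """Split one comma-separated line and trim whitespace around each field."""
    return [field.strip() for field in line.split(",")]


# Each string becomes a list of fields; equal-length lists stack into a 2-D array.
new_arr = np.array([split_fields(line) for line in arr])

print(new_arr.shape)
print(new_arr[0])
```

This relies on every row having the same number of fields. The last line printed in the question ('21, Private, 204641, Some-college, 10, Never-married,') is cut short, so it would not fit into a regular two-dimensional array and would need to be dropped or padded beforehand.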