ValueError: unknown url type: 0.0 error in sklearn

I have a simple script that tries to convert a CSV data file into the format accepted by the svm_light tool. Here is the code:

import csv
import sys
import numpy as np
from sklearn.cross_validation import train_test_split

def svm_light_conversion(row):
    conv_row = row[len(row) - 1] + ' '

    for i in xrange(len(row) - 1):
        conv_row = conv_row + str(i + 1) + ':' + str(row[i]) + ' '

    return conv_row

def reaData(inputfile):

    with open(inputfile, 'r') as inFile: 
        reader = csv.reader(inFile)
        my_content = list(reader)

    my_content = my_content[0:len(my_content) - 1]

    return my_content

def converToSVMLiteFormat(outputfile, train, test):

    train_file = outputfile + '_train.dat'
    test_file = outputfile + '_test.dat'
    #svm_light conversion for training data
    with open(train_file, 'wb') as txtfile:
        for i in xrange(len(train)):
            converted_row = svm_light_conversion(train[i]) + '\n'

            txtfile.write(converted_row)

    txtfile.close()

    #svm_light conversion for test data#
    with open(test_file, 'wb') as txtfile:
        for i in xrange(len(test)):
            converted_row = svm_light_conversion(test[i]) + '\n'

            txtfile.write(converted_row)

    txtfile.close()



def main():

    inputfile = sys.argv[1]
    outputfile = sys.argv[2]

    content = reaData(inputfile)

    train, test = train_test_split(content, train_size = 0.8) #split data
    converToSVMLiteFormat(outputfile, train, test)



if __name__ == "__main__":
    main()

It used to work fine, but now it suddenly fails with this error:

(env)fieldsofgold@fieldsofgold-VirtualBox:~/new$ python prac.py data.csv outt
Traceback (most recent call last):
  File "prac.py", line 4, in <module>
    from sklearn.cross_validation import train_test_split
  File "/home/fieldsofgold/new/env/local/lib/python2.7/site-packages/sklearn/cross_validation.py", line 32, in <module>
    from .metrics.scorer import check_scoring
  File "/home/fieldsofgold/new/env/local/lib/python2.7/site-packages/sklearn/metrics/__init__.py", line 7, in <module>
    from .ranking import auc
  File "/home/fieldsofgold/new/env/local/lib/python2.7/site-packages/sklearn/metrics/ranking.py", line 30, in <module>
    from ..utils.stats import rankdata
  File "/home/fieldsofgold/new/env/local/lib/python2.7/site-packages/sklearn/utils/stats.py", line 2, in <module>
    from scipy.stats import rankdata as _sp_rankdata
  File "/home/fieldsofgold/new/env/local/lib/python2.7/site-packages/scipy/stats/__init__.py", line 338, in <module>
    from .stats import *
  File "/home/fieldsofgold/new/env/local/lib/python2.7/site-packages/scipy/stats/stats.py", line 189, in <module>
    from . import distributions
  File "/home/fieldsofgold/new/env/local/lib/python2.7/site-packages/scipy/stats/distributions.py", line 10, in <module>
    from ._distn_infrastructure import (entropy, rv_discrete, rv_continuous,
  File "/home/fieldsofgold/new/env/local/lib/python2.7/site-packages/scipy/stats/_distn_infrastructure.py", line 44, in <module>
    from new import instancemethod
  File "/home/fieldsofgold/new/new.py", line 10, in <module>
    response2 = urllib2.urlopen(row[12])
  File "/usr/lib/python2.7/urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 396, in open
    protocol = req.get_type()
  File "/usr/lib/python2.7/urllib2.py", line 258, in get_type
    raise ValueError, "unknown url type: %s" % self.__original
ValueError: unknown url type: 0.0

Can anyone help me parse this error? It seems to occur somewhere inside sklearn, but I don't fully understand what could be going wrong. Thanks.

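If you follow the traceback starting from this line in your file

from sklearn.cross_validation import train_test_split

you can see that it sets off a chain of imports. But if you read further down the traceback, you'll find this: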

    from new import instancemethod
  File "/home/fieldsofgold/new/new.py", line 10, in <module>

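Python's standard library ships a module named new.py. However, you have also created a module named new.py in your own directory. Because of the priority of imports, Python looks first in the directory containing the script you ran (here, the same as your current working directory). Only if it doesn't find the module there does it try the other locations, in the order given by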

>>> import sys
>>> sys.path
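
To see which file a module name would resolve to, one option is to walk sys.path by hand. The sketch below is a simplified illustration, not Python's real import machinery: the helper name find_module_candidates is made up for this example, and it only looks for package directories and plain .py/.pyc files, ignoring built-in modules, C extensions, and zip imports.

```python
import os
import sys


def find_module_candidates(name, search_path=None):
    """Return files that could satisfy `import name`, in search order.

    Hypothetical helper for illustration only: it checks each directory
    for a package, a .py source file, and a .pyc file, and ignores
    built-in modules, C extensions, and zip archives.
    """
    if search_path is None:
        search_path = sys.path
    candidates = []
    for entry in search_path:
        # An empty entry in sys.path stands for the current directory.
        directory = entry or os.getcwd()
        package_init = os.path.join(directory, name, "__init__.py")
        if os.path.isfile(package_init):
            candidates.append(package_init)
        for suffix in (".py", ".pyc"):
            path = os.path.join(directory, name + suffix)
            if os.path.isfile(path):
                candidates.append(path)
    return candidates


if __name__ == "__main__":
    # Under the simplifications above, the first path printed is the one
    # that wins; more than one path means a module is being shadowed.
    for path in find_module_candidates("new"):
        print(path)
```

Run from the directory in the question, this would be expected to list the local new.py ahead of the standard-library copy, which matches what the traceback shows.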

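So basically Python imported the wrong new.py, and things snowballed from there. To avoid the problem, simply rename your new folder and your new.py file to something else. Also make sure to delete the new.pyc file that was created, because its presence alone is enough for Python to try importing from it.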
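
The rename-and-clean-up step can also be scripted. This is a minimal sketch, assuming the shadowing file sits in a single directory; the function name unshadow_module and the replacement name my_new used below are placeholders chosen for this example.

```python
import os


def unshadow_module(directory, old_name, new_name):
    """Rename old_name.py to new_name.py and delete stale bytecode.

    Illustrative helper, not part of any library. Returns the list of
    original paths that were renamed or removed.
    """
    touched = []
    source = os.path.join(directory, old_name + ".py")
    target = os.path.join(directory, new_name + ".py")
    if os.path.isfile(source):
        if os.path.exists(target):
            raise ValueError("refusing to overwrite " + target)
        os.rename(source, target)
        touched.append(source)
    # Python 2 writes old_name.pyc next to the source; a leftover copy
    # would still be importable after the rename.
    for suffix in (".pyc", ".pyo"):
        stale = os.path.join(directory, old_name + suffix)
        if os.path.isfile(stale):
            os.remove(stale)
            touched.append(stale)
    return touched
```

For the directory in the traceback, the call would look like unshadow_module("/home/fieldsofgold/new", "new", "my_new"). Any other script that imports the old name would need updating as well.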

For curiosity's sake, here is the content of the standard-library file, located on Windows at ./Python27/Lib/.

"""Create new objects of various types.  Deprecated.
This module is no longer required except for backward compatibility.
Objects of most types can now be created by calling the type object.
"""
from warnings import warnpy3k
warnpy3k("The 'new' module has been removed in Python 3.0; use the 'types' "
            "module instead.", stacklevel=2)
del warnpy3k

from types import ClassType as classobj
from types import FunctionType as function
from types import InstanceType as instance
from types import MethodType as instancemethod
from types import ModuleType as module

from types import CodeType as code
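
As the docstring says, these names are only aliases for objects in the types module, so code that used from new import instancemethod can call types.MethodType directly. A small sketch, using a made-up Greeter class, that should behave the same on Python 2.7 and Python 3:

```python
import types


class Greeter(object):
    """Toy class used only for this example."""

    def __init__(self, name):
        self.name = name


def greet(self):
    return "hello from " + self.name


g = Greeter("prac.py")
# Bind the plain function to one instance. new.instancemethod was an
# alias of types.MethodType, so this is the equivalent call.
g.greet = types.MethodType(greet, g)
print(g.greet())  # prints: hello from prac.py
```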