Azure / R Server - rxKmeans: write a file with no header
I am doing k-means clustering in Azure / R Server and need to be able to write a file with no header.
So far I have tried:
k1 <- rxKmeans(formula = ~ var1 + var2 + var3, data = df, seed = 10, numClusters = 5
, outFile = dfOut, extraVarsToWrite = c('CUST_ID'), overwrite = T
, outColName = F
)
I get this error:
Error in rxuHandleClusterJobTryFailure(retObject, hpcServerJob, autoCleanup) :
Error completing job on cluster:
Error : rxIsCharacterScalarNonEmpty(outColName) is not TRUE
I have also tried:
k1 <- rxKmeans(formula = ~ var1 + var2 + var3, data = df, seed = 10, numClusters = 5
, outFile = dfOut, extraVarsToWrite = c('CUST_ID'), overwrite = T
, header = F
)
which returns:
Error in rxuHandleClusterJobTryFailure(retObject, hpcServerJob, autoCleanup) :
Error completing job on cluster:
Error in rxKmeansBase(formula = formula, data = data, outDataSource = outDataSource, :
unused argument (header = FALSE)
Any other suggestions?
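The problem was that I was giving conflicting instructions in the file definition and in the rxKmeans call.
I fixed it by leaving the header argument out of the rxKmeans call and setting firstRowIsColNames to FALSE in the RxTextData definition of the output file. rxKmeans has no header argument, and outColName must be a non-empty character string (the name of the cluster column), so neither can be used to turn the header off.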
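# Output path; hdfsFS is assumed to have been defined earlier (e.g. with RxHdfsFileSystem())
kmeansFile <- '~/clusters/ClusterOutput.tsv'
# The header is controlled on the data source: firstRowIsColNames = FALSE means no header row
dfOut <- RxTextData(kmeansFile, fileSystem = hdfsFS, firstRowIsColNames = FALSE)
k1 <- rxKmeans(formula = ~ var1 + var2 + var3, data = df, seed = 10, numClusters = 5
, outFile = dfOut, extraVarsToWrite = c('id_num'), overwrite = TRUE
# , outColName = F   # removed: must be a non-empty string, not a logical
# , header = F       # removed: not an argument of rxKmeans
)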
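For anyone who wants to check the expected file layout on a small local sample first, here is a minimal sketch in base R. It is not RevoScaleR code: it uses stats::kmeans and write.table as stand-ins, and the data frame and the temporary output path are made up for illustration. The idea is the same as above: the header is switched off where the file is written, not in the clustering call.

```r
# Local base-R analogue (NOT RevoScaleR): cluster, then write the result without a header row.
# The data below is synthetic and only mirrors the column names used in the question.
set.seed(10)
df <- data.frame(
  id_num = 1:20,
  var1 = rnorm(20),
  var2 = rnorm(20),
  var3 = rnorm(20)
)

# Cluster on the three variables, as in the rxKmeans formula ~ var1 + var2 + var3
fit <- kmeans(df[, c("var1", "var2", "var3")], centers = 5)

# Keep the id column next to the cluster assignment (the role of extraVarsToWrite)
out <- data.frame(id_num = df$id_num, cluster = fit$cluster)

# Hypothetical output location: a temporary file instead of HDFS
out_file <- tempfile(fileext = ".tsv")

# col.names = FALSE is what suppresses the header row here
write.table(out, out_file, sep = "\t",
            row.names = FALSE, col.names = FALSE, quote = FALSE)

# The first line should be a data row, not column names
first_line <- readLines(out_file, n = 1)
print(first_line)
```

If the first line printed contains "id_num" or "cluster", a header was written; with col.names = FALSE it should contain only values.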