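Confused about concatenation of strings in Rcpp
I am trying to loop through a data frame in Rcpp and concatenate chunks of words that are separated by space.
I have read several answers on Stack Overflow, but I am still very confused about how strings are concatenated in Rcpp.
I know that in C++ you can simply use the + operator to concatenate strings.
Here is my Rcpp function: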
cppFunction('
Rcpp::StringVector formTextBlocks(DataFrame frame) {
#include <string>
using namespace Rcpp;
NumericVector frame_x = as<NumericVector>(frame["x"]);
LogicalVector space = as<LogicalVector>(frame["space"]);
Rcpp::StringVector text=as<StringVector>(frame["text"]);
if (text.size() == 0) {
return text;
}
int dfSize = text.size();
for(int i = 0; i < dfSize; ++i) {
if ( i !=dfSize ) {
if (space[i]==true) {
text[i]=text[i] + text[i+1] ;
}
}
}
return text;
}
')
Compilation fails with error: no match for 'operator+'.
How do I concatenate strings inside the loop?
Because operator+ is defined for std::string, the easiest approach is to convert the text column to a std::vector<std::string> rather than an Rcpp::StringVector:
Rcpp::cppFunction('
std::vector<std::string> formTextBlocks(DataFrame frame) {
LogicalVector space = as<LogicalVector>(frame["space"]);
std::vector<std::string> text=as<std::vector<std::string>>(frame["text"]);
if (text.size() == 0) {
return text;
}
int dfSize = text.size();
for(int i = 0; i < dfSize - 1; ++i) {
if (space[i]==true) {
text[i]=text[i] + text[i+1];
}
}
return text;
}
')
set.seed(20191129)
textBlock <- data.frame(space = sample(c(TRUE, FALSE), 100, replace = TRUE),
text = sample(LETTERS, 100, replace = TRUE),
stringsAsFactors = FALSE)
formTextBlocks(textBlock)
#> [1] "B" "N" "G" "BM" "M" "O" "C" "F" "OQ" "Q" "FH" "H" "D" "HK" "KH"
#> [16] "H" "S" "LX" "XO" "OY" "Y" "E" "VD" "D" "TN" "N" "LL" "LQ" "Q" "F"
#> [31] "XX" "X" "S" "R" "P" "L" "M" "GK" "KD" "DD" "D" "H" "M" "M" "K"
#> [46] "N" "GP" "PG" "G" "P" "G" "O" "N" "NY" "Y" "OX" "X" "LX" "XF" "FS"
#> [61] "SE" "E" "PS" "S" "YD" "D" "F" "Z" "H" "ZN" "N" "OM" "M" "XH" "HV"
#> [76] "V" "OX" "X" "J" "BZ" "Z" "FZ" "ZE" "E" "SV" "V" "G" "F" "DZ" "ZF"
#> [91] "F" "PB" "B" "K" "N" "U" "B" "PV" "V" "C"
Created on 2019-11-29 by the reprex package (v0.3.0)
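Notes:
- I removed the #include and using lines. They are not needed and do not belong inside the function definition.
- I removed the i != dfSize test, which can never be false.
- The loop runs one iteration shorter, because you access element i+1.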
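To see the loop logic in isolation, here is a minimal sketch in plain C++ with no Rcpp involved. The name form_text_blocks and the use of std::vector<bool> for the space column are assumptions made for illustration. They are not part of the answer above or of the Rcpp API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical standalone helper (not part of Rcpp): appends text[i + 1] to
// text[i] wherever space[i] is true, mirroring the loop in the answer.
// Assumes both columns have the same length, as columns of a data frame do.
std::vector<std::string> form_text_blocks(std::vector<std::string> text,
                                          const std::vector<bool>& space) {
    assert(space.size() >= text.size());
    // Stop one element short of the end, because each step reads i + 1.
    // Writing the bound as i + 1 < size() also keeps an empty input safe.
    for (std::size_t i = 0; i + 1 < text.size(); ++i) {
        if (space[i]) {
            text[i] = text[i] + text[i + 1];  // std::string defines operator+
        }
    }
    return text;
}
```

Calling form_text_blocks({"A", "B", "C", "D"}, {true, false, true, true}) yields "AB", "B", "CD", "D". The last element is never modified, because there is no element after it to append.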