Merging two vectors on precisely defined intervals

Given two vectors:

vec_nums <- 1:20
vec_ltrs <- letters[1:10]

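I would like to write a function that merges them so that each element of the second vector appears at a precisely defined position in the first vector. For example, running: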

vec_mrg <- funMergeVectsByPlace(x = vec_nums, y = vec_ltrs, position = 3)

should return the following in vec_mrg:

 [1] "a"  "b" "1"  "c"  "d"  "2"  "f"  "g"  "3"  "i"  "j"  "4"  "l"  "m"  "5" ...

Desired characteristics:

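  1. The function places the elements of the vector passed via y = at the position given in position =, counted from the left. So position = 3 should be understood as "every third place", occupying places 3, 6, ...
  2. The function should work with numeric, string, and factor vectors and return an ordered factor.
  3. The function should work with factor, string, and numeric vectors.
  4. If the vector y is shorter than the number of insertions in x, the function should return the remaining part of x without adding anything.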

Suggested structure

I envisage the function being structured as follows:

funMergeVectsByPlace <- function(x, y, position = 3) {

  # Convert
  vec_a <- as.character(x)
  vec_b <- as.character(y)


  # Missing part
  # Combine two vectors 


  # Create ordered factor
  vec_fac <- factor(vec_mrg, 
                   # levels = 
                   # I want the levels to reflect the order of elements in vec_mrg
                   )

  # Return
  return(vec_fac)
}
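
One way to make the levels follow the order of the elements, sketched here under the assumption that the merged vector is already a character vector, is to pass unique() of that vector as the levels. The sample vector below is made up for illustration and merely stands in for the output of the missing merge step.

```r
# Hypothetical merged vector standing in for the result of the missing step
vec_mrg <- c("a", "b", "1", "c", "d", "2")

# unique() keeps first occurrences in their original order, so the levels
# follow the element order instead of the default sorted order
vec_fac <- factor(vec_mrg, levels = unique(vec_mrg), ordered = TRUE)

print(levels(vec_fac))
```

Using unique() also keeps this working when the merged vector contains repeated values, since factor() does not accept duplicated levels.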

Examples

Simplest

As for attempts, the simplest approach:

vec_mrg <- c(vec_nums, vec_ltrs)
vec_mrg <- order(vec_mrg)

But this does not create the desired order.
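
For context on why this attempt falls short: order() returns the integer permutation that would sort its argument, not the values themselves, and a sort is not the interleaving asked for in any case. A minimal illustration follows; the names vec_all and idx are introduced here only for the example.

```r
vec_nums <- 1:20
vec_ltrs <- letters[1:10]

# c() coerces the integers to character, since a vector has a single type
vec_all <- c(vec_nums, vec_ltrs)

# order() yields index positions (a permutation of 1..30), not merged values
idx <- order(vec_all)

print(class(vec_all))
print(class(idx))
```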

Loop

for (i in 1:length(vec_nums)) {
  pos <- position 
  vec_nums[pos] <- vec_ltrs[i]
  pos <- pos + pos
  # i will be out of bounds and the way to move the other vector is missing
}

vec_mrg <- function(x,y,pos) {
  res <- y
  counter <- seq(floor(length(y)/(pos-1)))
  for(i in counter) {
    res <- append(res, x[i], seq(pos-1,by=pos, length.out=length(counter))[i])
  }
  res
}

vec_mrg(vec_nums, vec_ltrs, 3)
#[1] "a" "b" "1" "c" "d" "2" "e" "f" "3" "g" "h" "4" "i" "j"
#[15] "5"

A solution without a loop:

funMergeVectsByPlace <- function( x, y, position )
{
  n <- min( length(y)%/%(position-1), length(x) )    
  A <- rbind( matrix(head(y,n*(position-1)),position-1), head(x,n) )

  rest <- c( x[-(1:n)], y[-(1:(n*(position-1)))] )

  c(c(A),rest)
}
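
The interleaving step in the function above relies on R storing matrices column by column. The sketch below isolates that idea on small, made-up inputs (the values of x and y here are hypothetical and chosen only so the matrix is easy to read); it is an illustration of the technique, not a replacement for the function.

```r
# Hypothetical small inputs, chosen only to keep the matrix readable
y <- c("a", "b", "c", "d")
x <- 1:2
position <- 3

# Each column holds position - 1 consecutive elements of y
A <- matrix(y, nrow = position - 1)

# rbind() adds one element of x as the last row of each column;
# the integers are coerced to character along the way
A <- rbind(A, x)

# c() reads the matrix column by column and drops the dim attributes,
# which produces the interleaved order
interleaved <- c(A)
print(interleaved)
```

Each column of A corresponds to one group of position elements in the result, which is why the function first trims y to a multiple of position - 1 and handles the leftover elements separately.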

Speed comparison with Lafortune's solution:

> library(microbenchmark)

> vec_nums <- 1:20

> vec_ltrs <- letters[1:10]

> microbenchmark(Lafortune  = vec_mrg(vec_nums,vec_ltrs,3),
+                mra68 = funMergeVectsByPlace(vec_nums,vec_ltrs,3),
+                times .... [TRUNCATED] 
Unit: microseconds
      expr     min      lq      mean  median      uq      max neval
 Lafortune 137.677 143.112 161.12006 146.734 153.980 2931.512 10000
     mra68  77.443  81.067  92.13208  83.331  86.954 2718.204 10000

Larger vectors:

> vec_nums <- 1:2000

> vec_ltrs <- letters[rep(1:10,100)]

> microbenchmark(Lafortune  = vec_mrg(vec_nums,vec_ltrs,3),
+                mra68 = funMergeVectsByPlace(vec_nums,vec_ltrs,3),
+                times .... [TRUNCATED] 
Unit: milliseconds
      expr       min        lq      mean    median        uq      max neval
 Lafortune 32.993883 40.991796 63.758011 51.171020 90.122351 456.9748  1000
     mra68  1.101865  1.489533  2.468496  1.751299  3.338881 230.0460  1000

> v1 <- vec_mrg(vec_nums,vec_ltrs,3)

> v2 <- funMergeVectsByPlace(vec_nums,vec_ltrs,3)

Note that the vec_mrg function does not append the rest of the x vector to the result, whereas funMergeVectsByPlace does. Otherwise the results are the same:

> v1 <- vec_mrg(1:20,letters[1:10],3)

> v2 <- funMergeVectsByPlace(1:20,letters[1:10],3)

> v1
 [1] "a" "b" "1" "c" "d" "2" "e" "f" "3" "g" "h" "4" "i" "j" "5"

> v2
 [1] "a"  "b"  "1"  "c"  "d"  "2"  "e"  "f"  "3"  "g"  "h"  "4"  "i"  "j"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "20"

> identical(v1,v2[1:length(v1)])
[1] TRUE

Neither vec_mrg nor funMergeVectsByPlace returns a factor. If factor(...) is included, both functions become slower, but funMergeVectsByPlace is still faster than vec_mrg.