How to transform a range into individual positions

I have a huge set of range data in GenomicRanges format. Converted to a data.frame, it looks like this example:

file <- "seqnames start end width strand 
chr1  2  5  4      *
chr2  3  7  5      *"
file <- read.table(text = file, header = TRUE)

I want to break these ranges down into individual positions, as in this example:

file2 <- "seqnames Position 
chr1  2
chr1  3
chr1  4
chr1  5
chr2  3
chr2  4
chr2  5
chr2  6
chr2  7"

file2 <- read.table(text = file2, header = TRUE)

How can I do this?

If you are using Bioconductor's GenomicRanges, then

> GPos(GRanges(c("chr1:2-5", "chr2:3-7")))
GPos object with 9 positions and 0 metadata columns:
      seqnames       pos strand
         <Rle> <integer>  <Rle>
  [1]     chr1         2      *
  [2]     chr1         3      *
  [3]     chr1         4      *
  [4]     chr1         5      *
  [5]     chr2         3      *
  [6]     chr2         4      *
  [7]     chr2         5      *
  [8]     chr2         6      *
  [9]     chr2         7      *
  -------
  seqinfo: 2 sequences from an unspecified genome; no seqlengths

You may first need to build the GRanges object from the data.frame, with

GRanges(file)
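To get from the data.frame all the way to the two-column layout asked for in the question, the steps can be chained. This is only a sketch, assuming the GenomicRanges package is installed; the names `df`, `gr`, `gpos`, and `positions` are made up for illustration, and `makeGRangesFromDataFrame()` is used here as a more explicit alternative to calling `GRanges()` on the data.frame.

```r
library(GenomicRanges)

# Example input mirroring the question's data.frame
df <- data.frame(
  seqnames = c("chr1", "chr2"),
  start    = c(2L, 3L),
  end      = c(5L, 7L),
  strand   = c("*", "*")
)

# data.frame -> GRanges -> one entry per position
gr   <- makeGRangesFromDataFrame(df)
gpos <- GPos(gr)

# Pull the chromosome names and positions back out into a plain data.frame
positions <- data.frame(
  seqnames = as.character(seqnames(gpos)),
  Position = pos(gpos)
)
```

`positions` should then have one row per base covered by the original ranges.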

We can use data.table (this works as long as each seqnames value occurs in only one row):

library(data.table)
setDT(file)[, .(position = start:end), by = seqnames]

#    seqnames position
# 1:     chr1        2
# 2:     chr1        3
# 3:     chr1        4
# 4:     chr1        5
# 5:     chr2        3
# 6:     chr2        4
# 7:     chr2        5
# 8:     chr2        6
# 9:     chr2        7
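If a chromosome can appear in more than one row, grouping by `seqnames` alone is not enough, because `start:end` expects a single start and a single end per group. A minimal base R sketch that expands each row separately is shown here; the input data, including the second chr1 range, is invented for illustration.

```r
# Hypothetical input in which chr1 has two separate ranges
ranges <- data.frame(
  seqnames = c("chr1", "chr2", "chr1"),
  start    = c(2L, 3L, 10L),
  end      = c(5L, 7L, 11L)
)

# One integer vector of positions per row
positions <- Map(seq, ranges$start, ranges$end)

# Repeat each seqname once per position in its range, then flatten
result <- data.frame(
  seqnames = rep(ranges$seqnames, lengths(positions)),
  Position = unlist(positions)
)
```

Rows stay in the order of the input, so the second chr1 range comes after the chr2 positions.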