Recommendations for visualization of text (tweets) and their frequencies in R
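I am looking for coding examples in R or R libraries to visualize word frequencies and relations in a network graph, very similar to this example: http://koaning.io/word-clouds.html (I am referring not to the word clouds, but to the network graph on the homepage).
So far I have cleaned the data and have about 1 million rows of clean text, and I have calculated correlations and word frequencies.
I would highly appreciate any advice and tips on this.
All the best,
René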
As a starter, consider for example:
library(quanteda)
library(igraph)
# Fix the random seed so the word cloud and graph layouts are reproducible
set.seed(1)
txt <- "I am looking for coding examples in R or R-libraries to visualize words frequencies and relations in a network graph, very similar to this example: http://koaning.io/word-clouds.html (I refer not to the worldclouds, but to the network graph on the homepage)
So far I have cleaned the data and have about 1 million rows with clean text and calculated correlations and word frequencies.
I would highly appreciate if you can advise me and give me some tips on that.
All the best, René"
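# Word cloud of terms that occur at least twice.
# Note: this is the older quanteda API; newer releases use textplot_wordcloud() instead of plot() on a dfm.
plot(dfm(txt), min.freq=2L)
# Edge list of adjacent word pairs (bigrams joined by "_", then split again).
# Note: newer quanteda releases replace tokenize() with tokens() and tokens_ngrams().
edges <- do.call(rbind, strsplit(tokenize(x=txt, ngrams=2L, conc="_")[[1]], "_"))
# Undirected word graph; simplify() drops duplicate edges and self-loops
g <- graph_from_edgelist(edges, directed = FALSE)
g <- simplify(g)
# Size each vertex by its degree (number of distinct neighbouring words)
plot(g, vertex.size=degree(g))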
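For a corpus of about 1 million rows, the graph above would probably be too dense to read, so it may help to count how often each word pair occurs and keep only the frequent ones. The following is a rough sketch of that idea using base R and igraph only. The `tweets` vector, the `tokenise` helper, and the `min_count` threshold are made-up names for illustration and not part of the original answer, and the regular-expression tokenizer is a deliberate simplification (it splits on anything outside a-z, so accented letters are lost).

```r
library(igraph)

# Hypothetical stand-in for the cleaned tweets (one string per row).
tweets <- c(
  "network graph of word frequencies",
  "word frequencies in a network graph",
  "plot word frequencies with igraph"
)

# Simplified tokenizer (assumption): lowercase, split on non-letters,
# and drop the empty strings that strsplit() can leave behind.
tokenise <- function(x) {
  lapply(strsplit(tolower(x), "[^a-z]+"), function(w) w[nzchar(w)])
}
toks <- tokenise(tweets)

# Word frequencies over the whole corpus.
freq <- table(unlist(toks))

# Adjacent word pairs within each row; rows with fewer than two words add none.
pairs <- do.call(rbind, lapply(toks, function(w) {
  if (length(w) < 2L) return(NULL)
  cbind(head(w, -1L), tail(w, -1L))
}))

# Treat pairs as undirected by ordering the two words, and drop self-pairs.
edge_df <- data.frame(
  from = pmin(pairs[, 1], pairs[, 2]),
  to = pmax(pairs[, 1], pairs[, 2]),
  weight = 1L,
  stringsAsFactors = FALSE
)
edge_df <- edge_df[edge_df$from != edge_df$to, ]

# Count how often each pair occurs, then keep only the frequent pairs.
edge_df <- aggregate(weight ~ from + to, data = edge_df, FUN = sum)
min_count <- 2L  # hypothetical threshold; raise it for large corpora
edge_df <- edge_df[edge_df$weight >= min_count, ]

# The third column (weight) is stored as an edge attribute.
g <- graph_from_data_frame(edge_df, directed = FALSE)
V(g)$freq <- as.integer(freq[V(g)$name])

# Vertex size reflects word frequency, edge width reflects pair count.
plot(g, vertex.size = 5 + 3 * V(g)$freq, edge.width = E(g)$weight)
```

If correlations between words have already been computed, they could be used as the `weight` column in place of the pair counts, with the threshold applied to the correlation value instead.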