Instances of interactions between partners
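Research context: speakers (authors) and recipients interact in written exchanges about specific discussion topics. The first speaker in a thread is the original poster of that thread.
The data look like this: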
# dput() of the example data, assigned to df so that the code further down can use it
df <- structure(list(
  topic = c(1, 1, 1, 1, 1, 1, 2, 2),
  thread = c(1, 1, 1, 2, 2, 2, 3, 3),
  speaker_id = c(111, 111, 111, 222, 222, 222, 111, 222),
  recipient_id = c(222, 333, 444, 111, 555, 444, 222, 111),
  dyad = structure(c(1L, 2L, 3L, 1L, 5L, 4L, 1L, 1L),
    .Label = c("111_222", "111_333", "111_444", "222_444", "222_555"),
    class = "factor")),
  class = "data.frame", row.names = c(NA, -8L), codepage = 65001L)
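The dyad column appears to be an order-independent label for the pair, so that 111 writing to 222 and 222 writing to 111 fall under the same dyad. Here is a minimal sketch of how such a label could be derived in base R. It assumes the convention is smaller id, underscore, larger id. That matches the labels in the sample data, but the post does not say how the column was actually built.

```r
# Ids copied from the sample data
speaker_id   <- c(111, 111, 111, 222, 222, 222, 111, 222)
recipient_id <- c(222, 333, 444, 111, 555, 444, 222, 111)

# Assumed convention: "<smaller id>_<larger id>", so the direction of the
# message does not matter
dyad <- paste(pmin(speaker_id, recipient_id),
              pmax(speaker_id, recipient_id),
              sep = "_")
print(dyad)
```

pmin() and pmax() work element-wise, so the whole column is labelled in one vectorised call rather than row by row.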
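The goal is to create two variables:
- threads_partnered: in how many threads within a discussion topic did the speaker and recipient partner (i.e., form a dyad, or interact directly)?
- threads_present: in how many threads of the discussion topic (excluding the given thread) do the speaker and recipient both appear as recipients, without partnering (forming a dyad)?
Based on the example data, the result would be: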
╔═══════╦════════╦═════════╦═══════════╦═════════╦═══════════╦══════════════════════════════════════════╦═════════╦════════════════════════════════════════════╗
║ topic ║ thread ║ speaker ║ recipient ║ dyad    ║ threads   ║ note                                     ║ threads ║ note                                       ║
║       ║        ║ id      ║ id        ║         ║ partnered ║                                          ║ present ║                                            ║
╠═══════╬════════╬═════════╬═══════════╬═════════╬═══════════╬══════════════════════════════════════════╬═════════╬════════════════════════════════════════════╣
║ 1.00  ║ 1      ║ 111     ║ 222       ║ 111_222 ║ 2         ║ 111 and 222 interacted (made a dyad)     ║ 0       ║ Outside the given thread (thread #1) of    ║
║       ║        ║         ║           ║         ║           ║ in two different threads (thread #1, #2) ║         ║ the given topic (topic #1), 111 and 222    ║
║       ║        ║         ║           ║         ║           ║ within topic 1                           ║         ║ are not found together as recipients       ║
║       ║        ║         ║           ║         ║           ║                                          ║         ║ other than being in a dyad                 ║
╠═══════╬════════╬═════════╬═══════════╬═════════╬═══════════╬══════════════════════════════════════════╬═════════╬════════════════════════════════════════════╣
║ 1.00  ║ 1      ║ 111     ║ 333       ║ 111_333 ║ 1         ║ 111 and 333 interacted in                ║ 0       ║                                            ║
║       ║        ║         ║           ║         ║           ║ one thread (thread #1)                   ║         ║                                            ║
╠═══════╬════════╬═════════╬═══════════╬═════════╬═══════════╬══════════════════════════════════════════╬═════════╬════════════════════════════════════════════╣
║ 1.00  ║ 1      ║ 111     ║ 444       ║ 111_444 ║ 1         ║ 111 and 444 interacted in                ║ 1       ║ 111 and 444 are found in thread #2,        ║
║       ║        ║         ║           ║         ║           ║ one thread (thread #1)                   ║         ║ where they did not interact (made a dyad), ║
║       ║        ║         ║           ║         ║           ║                                          ║         ║ but were only recipients of                ║
║       ║        ║         ║           ║         ║           ║                                          ║         ║ the original speaker (222)                 ║
╠═══════╬════════╬═════════╬═══════════╬═════════╬═══════════╬══════════════════════════════════════════╬═════════╬════════════════════════════════════════════╣
║ 1.00  ║ 2      ║ 222     ║ 111       ║ 111_222 ║ 2         ║ 111 and 222 interacted in two different  ║ 0       ║                                            ║
║       ║        ║         ║           ║         ║           ║ threads within topic 1                   ║         ║                                            ║
╠═══════╬════════╬═════════╬═══════════╬═════════╬═══════════╬══════════════════════════════════════════╬═════════╬════════════════════════════════════════════╣
║ 1.00  ║ 2      ║ 222     ║ 555       ║ 222_555 ║ 1         ║ 222 and 555 interacted in one thread     ║ 0       ║                                            ║
╠═══════╬════════╬═════════╬═══════════╬═════════╬═══════════╬══════════════════════════════════════════╬═════════╬════════════════════════════════════════════╣
║ 1.00  ║ 2      ║ 222     ║ 444       ║ 222_444 ║ 1         ║ 222 and 444 interacted in one thread     ║ 1       ║ 222 and 444 are found together             ║
║       ║        ║         ║           ║         ║           ║                                          ║         ║ in thread #1, where they did not           ║
║       ║        ║         ║           ║         ║           ║                                          ║         ║ interact                                   ║
╠═══════╬════════╬═════════╬═══════════╬═════════╬═══════════╬══════════════════════════════════════════╬═════════╬════════════════════════════════════════════╣
║ 2.00  ║ 3      ║ 111     ║ 222       ║ 111_222 ║ 1         ║ 111 and 222 interacted in one thread     ║ 0       ║                                            ║
║       ║        ║         ║           ║         ║           ║ (thread 3) within topic 2                ║         ║                                            ║
╠═══════╬════════╬═════════╬═══════════╬═════════╬═══════════╬══════════════════════════════════════════╬═════════╬════════════════════════════════════════════╣
║ 2.00  ║ 3      ║ 222     ║ 111       ║ 111_222 ║ 1         ║ same as above                            ║ 0       ║                                            ║
╚═══════╩════════╩═════════╩═══════════╩═════════╩═══════════╩══════════════════════════════════════════╩═════════╩════════════════════════════════════════════╝
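I'm not entirely sure this meets your needs, but it may help in some respects.
I created a custom function that takes the speaker, recipient, thread, and topic, and determines threads_present based on your description. It looks at the other threads in the same topic and leaves out the rows in which the speaker and recipient form a dyad. A thread then qualifies only if both the speaker and the recipient appear in it as the recipient of some row. These threads are then counted.
The second variable, threads_partnered, is more straightforward and was described in the comments. After group_by on topic and dyad, you can use n_distinct to get the number of unique threads.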
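library(tidyr)
library(dplyr)
library(purrr)

# For one row (speaker, recipient, thread, topic), count the other threads in
# the same topic where both ids appear as recipients. Reads the global df.
my_fun <- function(the_speaker, the_recipient, the_thread, the_topic) {
  df %>%
    filter(
      topic == the_topic,
      thread != the_thread,
      # leave out rows where the two form a dyad ("<smaller id>_<larger id>")
      dyad != paste(min(the_speaker, the_recipient), max(the_speaker, the_recipient), sep = "_")) %>%
    group_by(thread) %>%
    # keep threads in which both ids occur in recipient_id
    filter(all(c(the_speaker, the_recipient) %in% recipient_id)) %>%
    ungroup() %>%
    distinct(thread) %>%
    count(name = "threads_present")
}

df %>%
  # call my_fun once per row; each call returns a one-row tibble
  mutate(threads_present = pmap(
    list(the_speaker = speaker_id, the_recipient = recipient_id, the_thread = thread, the_topic = topic),
    my_fun)
  ) %>%
  unnest(cols = threads_present) %>%
  group_by(topic, dyad) %>%
  # number of distinct threads in which the dyad interacted within the topic
  mutate(threads_partnered = n_distinct(thread))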
Output
  topic thread speaker_id recipient_id dyad    threads_present threads_partnered
  <dbl>  <dbl>      <dbl>        <dbl> <fct>             <int>             <int>
1     1      1        111          222 111_222               0                 2
2     1      1        111          333 111_333               0                 1
3     1      1        111          444 111_444               1                 1
4     1      2        222          111 111_222               0                 2
5     1      2        222          555 222_555               0                 1
6     1      2        222          444 222_444               1                 1
7     2      3        111          222 111_222               0                 1
8     2      3        222          111 111_222               0                 1
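As a cross-check, the same two counts can be sketched in base R, with the data passed to the helper explicitly instead of read from a global df. This is an illustrative alternative under the same reading of the definitions, not the method used above. count_present is a hypothetical helper name, and the dyad label is rebuilt here as smaller id, underscore, larger id, which is an assumption about how the original column was made.

```r
# Sample data from the question, rebuilt as a plain data frame
df <- data.frame(
  topic        = c(1, 1, 1, 1, 1, 1, 2, 2),
  thread       = c(1, 1, 1, 2, 2, 2, 3, 3),
  speaker_id   = c(111, 111, 111, 222, 222, 222, 111, 222),
  recipient_id = c(222, 333, 444, 111, 555, 444, 222, 111)
)
# Assumed dyad convention: "<smaller id>_<larger id>"
df$dyad <- paste(pmin(df$speaker_id, df$recipient_id),
                 pmax(df$speaker_id, df$recipient_id), sep = "_")

# threads_partnered: distinct threads per (topic, dyad) combination
df$threads_partnered <- ave(df$thread, paste(df$topic, df$dyad),
                            FUN = function(x) length(unique(x)))

# Hypothetical helper: for row i, count the other threads in the same topic
# in which both ids occur as recipients, ignoring rows of the dyad itself
count_present <- function(i, data) {
  cur <- data[i, ]
  others <- data[data$topic == cur$topic &
                   data$thread != cur$thread &
                   data$dyad != cur$dyad, ]
  if (nrow(others) == 0) return(0L)
  both <- tapply(others$recipient_id, others$thread, function(r) {
    all(c(cur$speaker_id, cur$recipient_id) %in% r)
  })
  sum(both)
}

df$threads_present <- vapply(seq_len(nrow(df)), count_present,
                             integer(1), data = df)
print(df)
```

On the sample data this should give the same threads_present and threads_partnered values as the output above. Passing the data frame as an argument makes the helper easier to reuse and test on other data.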