Instances of interactions between partners

Question

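Research context: speakers (authors) and recipients interact in written communication threads about specific discussion topics. The first speaker in a thread is its original poster.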

The data look like this:

df <- structure(list(topic = c(1, 1, 1, 1, 1, 1, 2, 2), thread = c(1, 
1, 1, 2, 2, 2, 3, 3), speaker_id = c(111, 111, 111, 222, 222, 
222, 111, 222), recipient_id = c(222, 333, 444, 111, 555, 444, 
222, 111), dyad = structure(c(1L, 2L, 3L, 1L, 5L, 4L, 1L, 1L), .Label = c("111_222", 
"111_333", "111_444", "222_444", "222_555"), class = "factor")), class = "data.frame", row.names = c(NA, 
-8L), codepage = 65001L)
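
The dyad labels in the sample data appear to be order-independent: the smaller id always comes first, so 111 writing to 222 and 222 writing to 111 both map to 111_222. As a minimal sketch, assuming that convention holds for the full data set, the label could be rebuilt from the two id columns with pmin and pmax:

```r
# Id columns copied from the sample data above.
speaker_id   <- c(111, 111, 111, 222, 222, 222, 111, 222)
recipient_id <- c(222, 333, 444, 111, 555, 444, 222, 111)

# Assumed convention: smaller id first, larger id second, joined by "_".
dyad <- paste(pmin(speaker_id, recipient_id),
              pmax(speaker_id, recipient_id),
              sep = "_")
dyad
```

This yields a character vector; wrap it in factor() if a factor column like the one in the sample data is needed.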

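The goal is to create two variables:

threads_partnered: in how many threads within a discussion topic did the speaker and the recipient partner (i.e., form a dyad by interacting directly)?
threads_present: in how many threads within the discussion topic (excluding the given thread) do the speaker and the recipient both appear as recipients without partnering (forming a dyad)?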

Based on the sample data, the result would be:

╔═══════╦════════╦═════════╦═══════════╦═════════╦═══════════╦══════════════════════════════════════════╦═════════╦════════════════════════════════════════════╗
║ topic ║ thread ║ speaker ║ recipient ║   dyad  ║  threads  ║                   note                   ║ threads ║                    note                    ║
║       ║        ║    id   ║     id    ║         ║ partnered ║                                          ║ present ║                                            ║
╠═══════╬════════╬═════════╬═══════════╬═════════╬═══════════╬══════════════════════════════════════════╬═════════╬════════════════════════════════════════════╣
║  1.00 ║    1   ║   111   ║    222    ║ 111_222 ║     2     ║ 111 and 222 interacted (made a dyad)     ║    0    ║ Outside the given thread (thread #1) of    ║
║       ║        ║         ║           ║         ║           ║ in two different threads (thread #1, #2) ║         ║ the given topic (topic #1), 111 and 222    ║
║       ║        ║         ║           ║         ║           ║ within topic 1                           ║         ║ are not found together as recipients       ║
║       ║        ║         ║           ║         ║           ║                                          ║         ║ other than being in a dyad                 ║
╠═══════╬════════╬═════════╬═══════════╬═════════╬═══════════╬══════════════════════════════════════════╬═════════╬════════════════════════════════════════════╣
║  1.00 ║    1   ║   111   ║    333    ║ 111_333 ║     1     ║ 111 and 333 interacted in                ║    0    ║                                            ║
║       ║        ║         ║           ║         ║           ║ one thread (thread #1)                   ║         ║                                            ║
╠═══════╬════════╬═════════╬═══════════╬═════════╬═══════════╬══════════════════════════════════════════╬═════════╬════════════════════════════════════════════╣
║  1.00 ║    1   ║   111   ║    444    ║ 111_444 ║     1     ║ 111 and 444 interacted in                ║    1    ║ 111 and 444 are found in thread #2,        ║
║       ║        ║         ║           ║         ║           ║ one thread (thread #1)                   ║         ║ where they did not interact (made a dyad), ║
║       ║        ║         ║           ║         ║           ║                                          ║         ║ but were only recipients of                ║
║       ║        ║         ║           ║         ║           ║                                          ║         ║ the original speaker (222)                 ║
╠═══════╬════════╬═════════╬═══════════╬═════════╬═══════════╬══════════════════════════════════════════╬═════════╬════════════════════════════════════════════╣
║  1.00 ║    2   ║   222   ║    111    ║ 111_222 ║     2     ║ 111 and 222 interacted in two different  ║    0    ║                                            ║
║       ║        ║         ║           ║         ║           ║ threads within topic 1                   ║         ║                                            ║
╠═══════╬════════╬═════════╬═══════════╬═════════╬═══════════╬══════════════════════════════════════════╬═════════╬════════════════════════════════════════════╣
║  1.00 ║    2   ║   222   ║    555    ║ 222_555 ║     1     ║ 222 and 555 interacted in one thread     ║    0    ║                                            ║
╠═══════╬════════╬═════════╬═══════════╬═════════╬═══════════╬══════════════════════════════════════════╬═════════╬════════════════════════════════════════════╣
║  1.00 ║    2   ║   222   ║    444    ║ 222_444 ║     1     ║ 222 and 444 interacted in one thread     ║    1    ║ 222 and 444 are found together             ║
║       ║        ║         ║           ║         ║           ║                                          ║         ║ in thread #1, where they did not           ║
║       ║        ║         ║           ║         ║           ║                                          ║         ║ interact                                   ║
╠═══════╬════════╬═════════╬═══════════╬═════════╬═══════════╬══════════════════════════════════════════╬═════════╬════════════════════════════════════════════╣
║  2.00 ║    3   ║   111   ║    222    ║ 111_222 ║     1     ║ 111 and 222 interacted in one thread     ║    0    ║                                            ║
║       ║        ║         ║           ║         ║           ║ (thread 3) within topic 2                ║         ║                                            ║
╠═══════╬════════╬═════════╬═══════════╬═════════╬═══════════╬══════════════════════════════════════════╬═════════╬════════════════════════════════════════════╣
║  2.00 ║    3   ║   222   ║    111    ║ 111_222 ║     1     ║ same as above                            ║    0    ║                                            ║
╚═══════╩════════╩═════════╩═══════════╩═════════╩═══════════╩══════════════════════════════════════════╩═════════╩════════════════════════════════════════════╝
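
To make the threads_present rule concrete, the third row of the table (speaker 111, recipient 444, thread 1) can be worked through by hand. This is only an illustrative base R sketch on the topic 1 rows of the sample data; the helper names pair, other, direct and both_recipients are made up for the example.

```r
# Rows of topic 1 from the sample data.
thread       <- c(1, 1, 1, 2, 2, 2)
speaker_id   <- c(111, 111, 111, 222, 222, 222)
recipient_id <- c(222, 333, 444, 111, 555, 444)

# Given row: speaker 111, recipient 444, in thread 1.
pair  <- c(111, 444)
other <- thread != 1          # rows outside the given thread

# Do the two interact directly (in either direction) in another thread?
direct <- other & ((speaker_id == 111 & recipient_id == 444) |
                   (speaker_id == 444 & recipient_id == 111))
any(direct)

# For each other thread, check whether both ids occur among its recipients.
both_recipients <- tapply(recipient_id[other], thread[other],
                          function(r) all(pair %in% r))
sum(both_recipients)
```

Here the pair never interacts directly outside thread 1, and both ids are recipients in thread 2, which gives the count of 1 shown in the table.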

Answer 1

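I'm not entirely sure this will meet your needs, but perhaps it will help in some way.

I created a custom function that takes the speaker, recipient, thread, and topic, and determines threads_present based on your description. It looks at the other threads in the same topic and checks that they do not include the speaker and recipient as a dyad. Finally, a thread should include both the speaker and the recipient as recipients in some row. The matching threads are then counted.

The second variable, threads_partnered, is more straightforward and was described in the comments. After group_by on topic and dyad, you can use n_distinct to determine the number of unique threads.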

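library(tidyr)
library(dplyr)
library(purrr)

# Count the other threads in the same topic where both people appear
# as recipients. Note: my_fun reads the data frame `df` from the
# calling environment.
my_fun <- function(the_speaker, the_recipient, the_thread, the_topic) {
  df %>%
    filter(
      topic == the_topic,      # same topic
      thread != the_thread,    # every thread except the given one
      # drop rows where the two interact directly (labels are "smaller_larger")
      dyad != paste(min(the_speaker, the_recipient), max(the_speaker, the_recipient), sep = "_")) %>%
    group_by(thread) %>%
    # keep threads in which both ids occur as recipients
    filter(all(c(the_speaker, the_recipient) %in% recipient_id)) %>%
    ungroup() %>%
    distinct(thread) %>%
    count(name = "threads_present")
}

df %>%
  # apply my_fun row by row; each call returns a one-row tibble
  mutate(threads_present = pmap(
    list(the_speaker = speaker_id, the_recipient = recipient_id, the_thread = thread, the_topic = topic),
    my_fun)
  ) %>%
  unnest(cols = threads_present) %>%
  # number of distinct threads per dyad within each topic
  group_by(topic, dyad) %>%
  mutate(threads_partnered = n_distinct(thread))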

Output

  topic thread speaker_id recipient_id dyad    threads_present threads_partnered
  <dbl>  <dbl>      <dbl>        <dbl> <fct>             <int>             <int>
1     1      1        111          222 111_222               0                 2
2     1      1        111          333 111_333               0                 1
3     1      1        111          444 111_444               1                 1
4     1      2        222          111 111_222               0                 2
5     1      2        222          555 222_555               0                 1
6     1      2        222          444 222_444               1                 1
7     2      3        111          222 111_222               0                 1
8     2      3        222          111 111_222               0                 1
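
The threads_partnered step can also be checked in isolation. The following is a minimal sketch, assuming only the topic, thread and dyad columns of the sample data; df_small and partnered are hypothetical names used for this example, and dyad is stored as a plain character column rather than a factor.

```r
library(dplyr)

# Sample data from the question, reduced to the columns this step needs.
df_small <- data.frame(
  topic  = c(1, 1, 1, 1, 1, 1, 2, 2),
  thread = c(1, 1, 1, 2, 2, 2, 3, 3),
  dyad   = c("111_222", "111_333", "111_444", "111_222",
             "222_555", "222_444", "111_222", "111_222")
)

# Count distinct threads per dyad within each topic, then drop the grouping.
partnered <- df_small %>%
  group_by(topic, dyad) %>%
  mutate(threads_partnered = n_distinct(thread)) %>%
  ungroup()

partnered$threads_partnered
```

The trailing ungroup() is a design choice for this sketch: it returns an ungrouped tibble so that later operations are not silently applied per group.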
