How can I extract movie names with a regular expression?

Here is some sample data:

1|Toy Story (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Toy%20Story%20(1995)|0|0|0|1|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0

2|GoldenEye (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?GoldenEye%20(1995)|0|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0|1|0|0

I want to extract the movie names together with the year:

Toy Story (1995)

GoldenEye (1995)

Thanks a lot!

In Java, this can be done relatively easily with String.split:

String str = "1|Toy Story (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Toy%20Story%20(1995)|0|0|0|1|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0";
// split takes a regex, so the pipe must be escaped as "\\|"
String movieName = str.split("\\|")[1];
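
Since the question asks for a regular expression, here is a minimal sketch of the same extraction with java.util.regex. The class name MovieTitleExtractor and the method extractTitle are made up for illustration, and the pattern assumes the title is always the second pipe-delimited field and never contains a pipe itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class MovieTitleExtractor {
    // Skip the first field (the id), then capture everything up to the next pipe.
    // Inside a character class, '|' is a literal and needs no escaping.
    private static final Pattern TITLE = Pattern.compile("^[^|]*\\|([^|]*)\\|");

    // Returns the second pipe-delimited field, or null if the line
    // does not contain at least two pipes.
    static String extractTitle(String line) {
        Matcher m = TITLE.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String line = "1|Toy Story (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Toy%20Story%20(1995)|0|0|0|1|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0";
        System.out.println(extractTitle(line)); // Toy Story (1995)
    }
}
```

For data this regular, split is probably simpler. The regex version mainly helps if you want to stop scanning after the second field or later tighten the pattern, for example to require a four-digit year in parentheses.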

This looks like pipe-delimited (|) data, so you can read it with read.table:

df <- read.table(sep = "|", text="
1|Toy Story (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Toy%20Story%20(1995)|0|0|0|1|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0
2|GoldenEye (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?GoldenEye%20(1995)|0|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0|1|0|0")

Then select the second column:

df[, 2]
# [1] Toy Story (1995) GoldenEye (1995)
# Levels: GoldenEye (1995) Toy Story (1995)