How can I extract movie names with a regular expression?
Here is some sample data:
1|Toy Story (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Toy%20Story%20(1995)|0|0|0|1|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0
2|GoldenEye (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?GoldenEye%20(1995)|0|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0|1|0|0
I want to extract the movie names together with the year:
Toy Story (1995)
GoldenEye (1995)
Thanks a lot!
In Java, this can be done fairly easily with String.split:
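String str = "1|Toy Story (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Toy%20Story%20(1995)|0|0|0|1|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0";
// split takes a regex, so the pipe must be escaped; in a Java string literal that is "\\|"
String movieName = str.split("\\|")[1]; // index 1 is the second field, the title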
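Since the question asks specifically for a regular expression, here is a minimal sketch of that approach in Java. It assumes every line starts with a numeric id, followed by a pipe and a title that ends in a four-digit year in parentheses. The class and method names (MovieTitleExtractor, extractTitle) are made up for illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MovieTitleExtractor {
    // Assumed layout: numeric id, '|', title ending in "(YYYY)", '|', remaining fields.
    // Group 1 captures the title including the year.
    private static final Pattern TITLE =
            Pattern.compile("^\\d+\\|([^|]*\\(\\d{4}\\))\\|");

    // Returns the title with its year, or an empty Optional if the line does not fit the assumed layout.
    static Optional<String> extractTitle(String line) {
        Matcher m = TITLE.matcher(line);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String[] lines = {
            "1|Toy Story (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Toy%20Story%20(1995)|0|0|0|1|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0",
            "2|GoldenEye (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?GoldenEye%20(1995)|0|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0|1|0|0"
        };
        for (String line : lines) {
            // Lines that do not match are silently skipped in this sketch.
            extractTitle(line).ifPresent(System.out::println);
        }
    }
}
```

For data this regular, split is simpler; the regex version mainly helps if you also want to check that the second field really looks like a title followed by a year.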
This looks like pipe-delimited (|) data, so:
df <- read.table(sep = "|", text="
1|Toy Story (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Toy%20Story%20(1995)|0|0|0|1|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0
2|GoldenEye (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?GoldenEye%20(1995)|0|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0|1|0|0")
Then select the second column:
df[, 2]
# [1] Toy Story (1995) GoldenEye (1995)
# Levels: GoldenEye (1995) Toy Story (1995)