6. Remove punctuation from a line of text
going <- "a1~!@#$%^&*bcd(){}_+:efg\"<>?,./;'[]-="
gsub(pattern = "[[:punct:]]+",replacement = "",x = going)
7. Remove numbers from a string that contains alphanumeric characters
c2 <- "day of 2nd ID5 Conference 19 12 2005"
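From the string above, the desired output is "day of 2nd ID5 Conference". A simple "[[:digit:]]+" regular expression will not work here, because it matches every digit in the string, including the ones inside "2nd" and "ID5". Instead, we anchor the match on word boundaries ("\\b") so that only standalone numbers are removed.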
gsub(pattern = "\\b\\d+\\b",replacement = "",x = c2)
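To make the difference concrete, the following is a minimal sketch that runs both patterns on the same string. The variable names all_digits, standalone, and cleaned are illustrative only, and the trimws() step is an addition not present in the original snippet: it removes the spaces left behind by the deleted numbers.

```r
c2 <- "day of 2nd ID5 Conference 19 12 2005"

# Removing every digit also damages "2nd" and "ID5"
all_digits <- gsub(pattern = "[[:digit:]]+", replacement = "", x = c2)

# Removing only standalone numbers leaves "2nd" and "ID5" intact
standalone <- gsub(pattern = "\\b\\d+\\b", replacement = "", x = c2)

# The deleted numbers leave their separating spaces behind, so trim them
# (trimws() is an extra step added for this sketch)
cleaned <- trimws(standalone)

print(all_digits)
print(cleaned)
```

Without the trimws() call, the result still ends in the spaces that used to separate the removed numbers.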
8. Find the positions of digits in a string
string <- "there were 2 players each in 8 teams"
gregexpr(pattern = '\\d', text = string) # returns a list with match attributes; or,
unlist(gregexpr(pattern = '\\d', text = string)) # as a plain vector of positions
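The object returned by gregexpr() carries more than the start positions. As a small sketch (the names m, positions, match_len, and digits are illustrative), the match lengths are stored in the "match.length" attribute, and regmatches() can turn the same object into the matched text:

```r
string <- "there were 2 players each in 8 teams"

# One list element per input string; here there is only one
m <- gregexpr(pattern = "\\d", text = string)

# Start index of each match, with the attributes stripped
positions <- as.integer(m[[1]])

# Length of each match, stored as an attribute on the result
match_len <- attr(m[[1]], "match.length")

# The matched substrings themselves
digits <- regmatches(string, m)[[1]]

print(positions)
print(digits)
```

Using "\\d+" instead of "\\d" would report one position per whole number, which matters as soon as the string contains multi-digit values.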
9. Extract the information inside parentheses in a string
string <- "What are we doing tomorrow ? (laugh) Play soccer (groans) (cries)"
gsub("[\\(\\)]","",regmatches(string, gregexpr("\\(.*?\\)", string))[[1]])
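Explanation: this solution uses lazy (non-greedy) matching. First, regmatches() extracts the parenthesized words, that is (laugh), (groans), and (cries). Then gsub() simply removes the parentheses.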
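To see why the lazy quantifier matters, here is a short sketch comparing the greedy pattern "\\(.*\\)" with the lazy pattern "\\(.*?\\)" on the same input. The variable names greedy, lazy, and inside are illustrative only.

```r
string <- "What are we doing tomorrow ? (laugh) Play soccer (groans) (cries)"

# Greedy: .* runs from the first "(" to the last ")", giving one long match
greedy <- regmatches(string, gregexpr("\\(.*\\)", string))[[1]]

# Lazy: .*? stops at the nearest ")", giving one match per pair of parentheses
lazy <- regmatches(string, gregexpr("\\(.*?\\)", string))[[1]]

# Strip the parentheses from the lazy matches
inside <- gsub("[()]", "", lazy)

print(greedy)
print(inside)
```

The greedy version returns a single element spanning everything between the first opening and the last closing parenthesis, while the lazy version returns the three separate words.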
10. Extract only the first number in a range
x <- c("75 to 79", "80 to 84", "85 to 89")
gsub(" .*\\d+", "", x)
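The pattern above works by deleting everything from the first space through the last number. An alternative sketch, not taken from the original, captures the leading number instead and keeps only that group; the names first and first_num are illustrative.

```r
x <- c("75 to 79", "80 to 84", "85 to 89")

# Capture the digits at the start of each string and drop the rest
first <- sub("^(\\d+).*$", "\\1", x)

# Convert to numeric if the values are needed for arithmetic
first_num <- as.numeric(first)

print(first)
```

Anchoring on the start of the string makes the intent explicit and does not depend on a space being the separator.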
11. Extract email addresses from a given string
string <- c("My email address is abc@boeing.com","my email address is def@jobs.com","aescher koeif","paul renne")
unlist(regmatches(x = string, gregexpr(pattern = "[[:alnum:]]+\\@[[:alpha:]]+\\.com",text = string)))
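Note that the pattern above only matches addresses that end in ".com" and whose local part is purely alphanumeric. The following is a rough sketch of a more permissive pattern, offered as an assumption about what a broader match might look like rather than a complete email validator. The sample address at example.org and the names loose_pattern and emails are made up for illustration.

```r
string <- c("My email address is abc@boeing.com",
            "reach me at first.last@example.org",  # hypothetical address
            "paul renne")

# Local part: letters, digits, dot, underscore, hyphen
# Domain: letters, digits, dot, hyphen, ending in a dot plus 2+ letters
loose_pattern <- "[[:alnum:]._-]+@[[:alnum:].-]+\\.[[:alpha:]]{2,}"

emails <- unlist(regmatches(x = string,
                            gregexpr(pattern = loose_pattern, text = string)))

print(emails)
```

Strings without an address contribute nothing to the result, just as in the original example.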