Welcome to Vigges Developer Community - Open, Learning, Share
Welcome To Ask or Share your Answers For Others


r - Regex match exact number of letters

Let's say I want to find all words in which the letter "e" appears exactly two times. When I define this pattern:

pattern1 <- "e.*e" 
grep(pattern1, stringr::words, value = T)

the regex also matches words such as "therefore", because "e" appears at least two times in them. The point is that I don't want my pattern to mean "at least n times"; I want it to mean "exactly n times".

This pattern...

  pattern2 <- "e{2}"

...finds words with two letters "e", but only if they appear one right after the other ("feel", "agree" etc.). I'd like to combine these two patterns to find all words with an exact number of not necessarily consecutive occurrences of the letter "e".



1 Answer


You may use:

^(?:[^e]*e){2}[^e]*$

See the regex demo. The (?:...) is a non-capturing group that allows quantifying a sequence of subpatterns and is thus easily adjustable to match 3, 4 or more specific sequences in a string.

Details

  • ^ - start of string
  • (?:[^e]*e){2} - 2 occurrences of
    • [^e]* - any 0+ chars other than e
    • e - an e
  • [^e]* - any 0+ chars other than e
  • $ - end of string

See the R demo below:

x <- c("feel", "agre", "degree")
rx <- "^(?:[^e]*e){2}[^e]*$"
grep(rx, x, value = TRUE)
## => [1] "feel"
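
Since only the quantifier changes when you need a different count, the pattern can also be built programmatically. Below is a minimal sketch: exactly_n_pattern is a hypothetical helper name (not part of base R or stringr), it assumes the target is a single plain letter with no special meaning inside a bracket expression, and perl = TRUE is passed here as a cautious choice for the (?:...) group.

```r
# Hypothetical helper: builds the "exactly n occurrences" pattern
# for a single literal letter. Assumes `letter` is a plain character
# that needs no escaping inside [...] (e.g. not "^" or "]").
exactly_n_pattern <- function(letter, n) {
  stopifnot(nchar(letter) == 1, n >= 1)
  sprintf("^(?:[^%s]*%s){%d}[^%s]*$", letter, letter, as.integer(n), letter)
}

x <- c("feel", "agre", "degree", "therefore")

# Words containing exactly two "e"
two_e <- grep(exactly_n_pattern("e", 2), x, value = TRUE, perl = TRUE)

# Words containing exactly three "e"
three_e <- grep(exactly_n_pattern("e", 3), x, value = TRUE, perl = TRUE)
```

With this input, two_e holds only "feel", while three_e holds "degree" and "therefore".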

Note that instead of value = T it is safer to use value = TRUE as T might be redefined in the code above.
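
To illustrate that point, here is a small sketch of what can go wrong. Reassigning T to FALSE is a contrived assumption made for the demonstration; in real code it usually happens by accident, for example when T is used as a variable name for a time or temperature value.

```r
x <- c("feel", "agre", "degree")
rx <- "^(?:[^e]*e){2}[^e]*$"

# T is an ordinary variable, so it can be reassigned (contrived here)
T <- FALSE

# value = T now means value = FALSE: grep returns positions, not strings
idx <- grep(rx, x, value = T, perl = TRUE)

# TRUE is a reserved word and cannot be reassigned: grep returns the strings
vals <- grep(rx, x, value = TRUE, perl = TRUE)

# Remove the shadowing variable so that T refers to TRUE again
rm(T)
```

Here idx is an integer vector of matching positions, whereas vals contains the matching words themselves.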

