I wrote a TUI application to help you practice Python regular expressions. There are more than 100 exercises covering both the builtin re and third-party regex module.

If you have pipx, use pipx install regexexercises to install the app. See the repo for source code and other details.

  • @[email protected]
    link
    fedilink
    45 months ago

    I would argue that having distinct match and search helps readability. The difference between match('((([0-9]+-[0-9]+)|([0-9]+))[,]?)+[^,]', s) and search('((([0-9]+-[0-9]+)|([0-9]+))[,]?)+[^,]', s) is clear without the need for me to parse the regular expression myself. It also helps code reuse. Consider that you have PHONE_NUMBER_REGEX defined somewhere. If you only had a method to “search” but not to “match”, you would have to do something like search(f"\A{PHONE_NUMBER_REGEX}\Z", s), which is error-prone and less readable. Most likely you would end up having at least two sets of precompiled regex objects (i.e. PHONE_NUMBER_REGEX and PHONE_NUMBER_FULLMATCH_REGEX). It is also a fairly common practice in other languages’ regex libraries (cf. [1,2]). Golang, which is usually very reserved in the number of ways to express the same thing, has 16 different matching methods[3].

    Regarding re.findall, I see what you mean, however I don’t agree with your conclusions. I think it is a useful convenience method that improves readability in many cases. I’ve found these usages from my code, and I’m quite happy that this method was available[4]:

    digits = [digit_map[digit] for digit in re.findall("(?=(one|two|three|four|five|six|seven|eight|nine|[0-9]))", line)]
    [(minutes, seconds)] = re.findall(r"You have (?:(\d+)m )?(\d+)s left to wait", text)
    

    [1] https://docs.oracle.com/javase/7/docs/api/java/util/regex/Matcher.html

    [2] https://en.cppreference.com/w/cpp/regex

    [3] https://pkg.go.dev/regexp

    [4] https://github.com/search?q=repo%3Ahades%2Faoc23 findall&type=code

    • @alyth
      link
      35 months ago

      Thank you for the very thorough reply! This is kind of high quality stuff you love to see on Lemmy. Your use cases seem very valid.