Python

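`\w` (word character) matches **any single letter, digit or underscore**. For ASCII text this is the same as `[a-zA-Z0-9_]`; with `str` patterns in Python 3 it also matches non-ASCII letters and digits unless the `re.ASCII` flag is set.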
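A minimal sketch of what `\w` picks up, with and without `re.ASCII`. The sample string is made up for illustration and is not part of the original notes.

```python
import re

# Illustrative input (an assumption, not from the notes); \u00e9 is the letter e-acute
sample = "user_42 caf\u00e9-bar!"

# \w+ grabs runs of letters, digits and underscores; ' ', '-' and '!' end a run
words = re.findall(r"\w+", sample)
print(words)

# With re.ASCII, \w is limited to [a-zA-Z0-9_], so the accented letter ends the run
ascii_words = re.findall(r"\w+", sample, flags=re.ASCII)
print(ascii_words)
```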

Match object

  • A Match Object is an object containing information about the search and the result.
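  • This object is returned by `re.search()`, `re.match()` and `re.fullmatch()` (each returns `None` when nothing matches); `re.finditer()` yields one per match. `re.findall()` returns plain strings or tuples, not Match objects.
  • Has properties and methods:
    1. `.span()` returns a tuple containing the start and end positions of the match.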

    2. `.string` returns the string passed into the function

    3. `.group()` returns the part of the string where there was a match

      import re
      txt = "My name is Javeed Ali Khan"
      result = re.search(r"(\bJ\w+) (\bA[li]{2})", txt)
      print(result.span())
      print(result.group()) # entire match
      print(result.group(0)) # entire match (group 0)
      print(result.group(1)) # 1st group
      print(result.group(2)) # 2nd group
      print(result.string)
      
      (11, 21)
      Javeed Ali
      Javeed Ali
      Javeed
      Ali
      My name is Javeed Ali Khan
      

re methods 1

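| Function | Description |
|---|---|
| `re.compile(pattern, flags)` | Compile `pattern` into a regular expression object, with optional `flags` |
| `re.match(pattern, string)` | Match `pattern` only at the beginning of `string` |
| `re.search(pattern, string)` | Match `pattern` anywhere in `string` (first occurrence) |
| `re.split(pattern, string)` | Split `string` by occurrences of `pattern` |
| `re.sub(pattern, repl, string)` | Replace occurrences of `pattern` in `string` with `repl` |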
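The functions above can be sketched side by side as follows. The log-like `text` and the variable names are assumptions chosen for illustration.

```python
import re

# Illustrative input (an assumption, not from the notes)
text = "2021-05-11 build ok, 2021-06-14 build failed"

date_re = re.compile(r"(\d{4})-(\d{2})-(\d{2})")  # compile once, reuse below

at_start = date_re.match(text)           # succeeds: the text begins with a date
not_at_start = re.match(r"build", text)  # None: 'build' is not at position 0
anywhere = re.search(r"build", text)     # first 'build' anywhere in the text

parts = re.split(r",\s*", text)          # split on a comma plus optional spaces
masked = date_re.sub("<date>", text)     # replace every date

print(at_start.group(), not_at_start, anywhere.start())
print(parts)
print(masked)
```

Compiling is optional here; the module-level functions accept the pattern string directly, as the `re.match` and `re.search` calls show.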

re method Objects 1

| Method | Description |
|---|---|
| `match.group("name")` | Return subgroup "name" of the match |
| `match.groups()` | Return a tuple containing all subgroups of the match |
| `match.groupdict()` | Return a dict containing all named subgroups of the match |
| `match.start(group)` | Return the start index of the substring matched by `group` |
| `match.end(group)` | Return the end index of the substring matched by `group` |
| `match.span(group)` | Return a 2-tuple of the start and end indices of `group` in the match |
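A short sketch of these Match methods on a pattern with two named groups. The group names `first` and `last` and the input string are hypothetical.

```python
import re

# Hypothetical input and group names, chosen only for illustration
m = re.search(r"(?P<first>\w+) (?P<last>\w+)", "name: Ada Lovelace")

first = m.group("first")      # subgroup by name
all_groups = m.groups()       # every subgroup, in order
by_name = m.groupdict()       # named subgroups as a dict
last_start = m.start("last")  # index where the 'last' group begins
last_end = m.end("last")      # index just past the 'last' group
first_span = m.span("first")  # (start, end) of the 'first' group

print(first, all_groups, by_name)
print(last_start, last_end, first_span)
```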

re methods 2

| Function | Description |
|---|---|
| `re.fullmatch(pattern, string)` | Match only if the whole `string` matches `pattern` |
| `re.findall(pattern, string)` | Return all non-overlapping matches of `pattern` in `string`, as a list of strings (or of tuples if the pattern has several groups) |
| `re.finditer(pattern, string)` | Return an iterator yielding Match objects over non-overlapping matches of `pattern` in `string` |
| `re.subn(pattern, repl, string)` | Replace the leftmost non-overlapping occurrences of `pattern` in `string` with `repl`, and return a tuple `(new_string, number_of_substitutions)` |
| `re.purge()` | Clear the regular expression cache |
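A small sketch of the second batch of functions, assuming a made-up `log` string.

```python
import re

# Illustrative input (an assumption, not from the notes)
log = "id=7 id=42 id=x"

whole = re.fullmatch(r"\d+", "12345")       # Match: the entire string is digits
partial = re.fullmatch(r"\d+", "12345abc")  # None: 'abc' is left over

ids = re.findall(r"id=(\d+)", log)          # one group, so a list of strings
spans = [m.span() for m in re.finditer(r"id=(\d+)", log)]  # positions from Match objects
new_log, count = re.subn(r"id=\d+", "id=#", log)           # new string plus substitution count

print(whole is not None, partial)
print(ids, spans)
print(new_log, count)
```

`finditer` is the one to reach for when positions or named groups are needed, since `findall` only hands back strings.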

Difference between match, search and findall

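| `match` | `search` | `findall` |
|---|---|---|
| 1. Finds the first occurrence, but only at the start of the string | 1. Finds the first occurrence anywhere in the string | Returns all non-overlapping occurrences |
| 2. Returns `None` if the match is on a later line | 2. Checks all lines, unlike `match` | Returns a list of strings |
| 3. Returns a Match object | 3. Returns a Match object | or a list of tuples of strings when the pattern has more than one group; never Match objects |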
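The three behaviours can be compared on a two-line string. This is a minimal sketch; the input text is made up.

```python
import re

# Illustrative two-line input (an assumption, not from the notes)
text = "first line\nsecond line"

m_match = re.match(r"second", text)        # None: match only tries position 0
m_search = re.search(r"second", text)      # Match: search scans the whole string
plain = re.findall(r"\w+ line", text)      # no groups, so a list of strings
grouped = re.findall(r"(\w+) (line)", text)  # two groups, so a list of tuples

print(m_match)
print(m_search.span())
print(plain)
print(grouped)
```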

findall example

```
>>> import re
>>> cc_list = '''Ezra Koenig <ekoenig@vpwk.com>,
... Rostam Batmanglij <rostam@vpwk.com>,
... Chris Tomson <ctomson@vpwk.com>,
... Bobbi Baio <bbaio@vpwk.com'''
>>> matched = re.findall(r'\w+\@\w+\.\w+', cc_list)
>>> matched
['ekoenig@vpwk.com', 'rostam@vpwk.com', 'ctomson@vpwk.com', 'bbaio@vpwk.com']
>>> matched = re.findall(r'(\w+)\@(\w+)\.(\w+)', cc_list)
>>> matched
[('ekoenig', 'vpwk', 'com'), ('rostam', 'vpwk', 'com'), ('ctomson', 'vpwk', 'com'), ('bbaio', 'vpwk', 'com')]
>>> names = [x[0] for x in matched]
>>> names
['ekoenig', 'rostam', 'ctomson', 'bbaio']
```

Examples

Named group

```(?P<name>regex)```

    import re

    print("start")

    # First sample text; it is overridden by the second assignment below.
    txt = """Javeed Ali Khan Mohammed 2284440597 lisai Taaina Immune by first dose Last update: Tuesday 8 June, 09:42 PM Immune by first dose until 22/10/2021 New Services Display All > COVID-19 Vaccine Certify Mobile Organ Health Donation Passport Number D00 000 of Javeed Ali Khan Mohammed 2284440597 lisai Taaina Immune by first dose """
    txt = """Immune by first dose Last update Mon, 14 Jun 12:00 PM O Current Permits Immune by first dose Last update Mon, 14 Jun 12:00 PM O Current Permits"""

    # Earlier attempt, kept for reference. It does not compile in Python because
    # named groups must be written (?P<name>...), not (?<name>...) or (?p<name>...).
    # pattern = r"""(?P<day_of_week>(mon|tues|wed(nes)?|thur(s)?|fri|sat(ur)?|sun)(day)?)(?<day_of_month>\s*\d+)(?<month>\s*(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|(Nov|Dec)(?:ember)?),\s*)(?p<time>1[0-2]|0?[1-9]):([0-5][0-9]) ?([AaPp][Mm])"""

    # Working pattern: four named groups, with optional commas after the day and the month.
    pattern = r"""(?P<day_of_week>(mon|tues|wed(nes)?|thur(s)?|fri|sat(ur)?|sun)(day)?,?)(?P<day_of_month>\s*\d+)(?P<month>\s*(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|(Nov|Dec)(?:ember)?),?\s*)(?P<time>(1[0-2]|0?[1-9]):([0-5][0-9]) ?([AaPp][Mm]))"""
    # Variant with unnamed groups only:
    # pattern = r"""((mon|tues|wed(nes)?|thur(s)?|fri|sat(ur)?|sun)(day)?)(\s*\d+)(\s*(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|(Nov|Dec)(?:ember)?),\s*)(1[0-2]|0?[1-9]):([0-5][0-9]) ?([AaPp][Mm])"""

    # re.IGNORECASE lets the lowercase day names match "Mon".
    result = re.search(pattern, txt, re.IGNORECASE)
    print("result is")
    print(result)
    if result:
        print(result.group("day_of_week"))
        print(result.group("day_of_month"))
        print(result.group("month"))
        print(result.group("time"))

Output (the Match repr is from the older Python the example was run on; recent versions print it as `<re.Match object; span=..., match=...>`). The leading spaces in the `day_of_month` and `month` lines come from the `\s*` inside those groups:

    start
    result is
    <_sre.SRE_Match object at 0x7fb2d4181510>
    Mon,
     14
     Jun
    12:00 PM