I’m trying to search for r'CONTENTS\.\n+CHAPTER I\.' within a string from Project Gutenberg, but I’m getting an AttributeError because the pattern doesn’t match, even though the same pattern does match outside the function. My code is below:
from urllib import request
import re

def gutenberg(url):
    response = request.urlopen(url)
    raw = response.read().decode('utf8')
    print(re.search(r"CONTENTS\.\n+CHAPTER I\.", raw).group())

a = gutenberg("https://www.gutenberg.org/files/76/76-0.txt")
Output:
...
print(re.search(r"CONTENTS\.\n+CHAPTER I\.",raw).group())
AttributeError: 'NoneType' object has no attribute 'group'
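The traceback above is what happens whenever re.search() finds nothing: it returns None, and None has no .group() method. A minimal sketch of guarding against that, using a hypothetical helper name find_or_none and made-up sample strings (neither is from the code above):

```python
import re

def find_or_none(pattern, text):
    """Return the matched text, or None when the pattern does not match."""
    match = re.search(pattern, text)
    # re.search returns None on failure, so check before calling .group()
    return match.group() if match is not None else None

# Sample inputs are illustrative only
print(find_or_none(r"CONTENTS\.", "Complete\nCONTENTS.\n"))
print(find_or_none(r"CONTENTS\.\n+CHAPTER I\.", "no such text here"))
```

This does not make the pattern match; it only turns the crash into a value that can be tested.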
And outside the function:
a="""Complete
CONTENTS.

CHAPTER I. Civilizing"""
re.search(r"CONTENTS\.\n+CHAPTER I\.",a).group()
Output:
'CONTENTS.\n\nCHAPTER I.'
However, it works fine within the function when there’s no newline character in the pattern: print(re.search(r"CONTENTS\.",raw).group()).
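Since the pattern without a newline does match, one way to see which characters actually follow CONTENTS. in the downloaded text is to print the repr() of that region. A minimal sketch, assuming a hypothetical helper show_context and a sample string that uses \r\n line endings (that is an assumption for illustration; the real file's line endings are not shown here):

```python
def show_context(text, needle, width=20):
    """Return repr() of `needle` plus up to `width` characters after it."""
    start = text.find(needle)
    if start == -1:
        return None
    end = start + len(needle) + width
    # repr() makes invisible characters such as \r and \n visible
    return repr(text[start:end])

# Illustrative sample with Windows-style line endings (assumed, not downloaded)
sample = "Complete\r\nCONTENTS.\r\n\r\nCHAPTER I. Civilizing"
print(show_context(sample, "CONTENTS."))
```

Calling show_context(raw, "CONTENTS.") inside the function would show whether the line breaks are plain \n or something else.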
So, I believe I need something like flags.
What I’ve tried:
print(re.search(r"CONTENTS\.\n+CHAPTER I\.",raw,re.M).group())
pattern=re.compile(r'CONTENTS.\n+CHAPTER I.')
print(pattern.search(raw).group())
I even tried adding a backslash to my pattern, r"CONTENTS\.\\n+CHAPTER I\.", and got the same AttributeError.
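For reference on the re.M attempt: re.MULTILINE only changes where ^ and $ match; it does not change what a literal \n in the pattern matches. A short sketch with made-up strings (the \r\n sample is an assumption, not the actual downloaded text):

```python
import re

text = "first\nsecond"
# Without re.M, ^ only matches at the very start of the string
print(re.findall(r"^\w+", text))
# With re.M, ^ also matches right after each newline
print(re.findall(r"^\w+", text, re.M))

# A literal \n in the pattern behaves the same with or without the flag,
# so a \r sitting before each \n still blocks the match
crlf = "CONTENTS.\r\n\r\nCHAPTER I."
print(re.search(r"CONTENTS\.\n+CHAPTER I\.", crlf, re.M))
```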
I read about flags=regex.VERSION1, but I couldn’t find any information about it in the latest Python regex guide, so I haven’t tried it.
Any ideas on how to search for a multiline pattern within a function?
In general, what confuses me most is the different behavior of re.search() inside and outside the function. Is there a concept I’m not aware of?
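On the inside-versus-outside point: re.search() itself behaves the same in both places, so any difference has to come from the string being searched. One possible explanation, not confirmed by anything shown here, is that the downloaded text uses \r\n line endings while the hand-typed string uses \n. A minimal sketch of a pattern that tolerates both, with the hypothetical names PATTERN and find_contents:

```python
import re

# (?:\r?\n)+ accepts one or more line breaks, each either \n or \r\n
PATTERN = re.compile(r"CONTENTS\.(?:\r?\n)+CHAPTER I\.")

def find_contents(text):
    """Return the matched span, or None if the headings are not found."""
    match = PATTERN.search(text)
    return match.group() if match else None

# Illustrative samples; the second one assumes Windows-style line endings
unix_text = "Complete\nCONTENTS.\n\nCHAPTER I. Civilizing"
windows_text = "Complete\r\nCONTENTS.\r\n\r\nCHAPTER I. Civilizing"
print(repr(find_contents(unix_text)))
print(repr(find_contents(windows_text)))
```

An alternative under the same assumption would be to normalize the text first, for example raw.replace("\r\n", "\n"), and keep the original pattern.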
Thanks in advance! I’ll appreciate any help!