Is there a way to grab all matches of a regex query within a specific, non-unique div in Python? I am aware that Beautiful Soup would be useful, but we are specifically not allowed to use it; why, I don't know. I essentially need all the matches in one particular div. Another way of looking at it: I want only those matches of a findall query that come from the first div.
e.g.
<div class="text">
<p class="content1">content1</p>
<p class="content2">content2</p>
</div>
<div class="text">
<p class="content3">content3</p>
<p class="content4">content4</p>
</div>
I can find the contents of the div easily with
(?<=<div class="text"><p class="content">)(.+?)(?=</p></div>)
(and then .group(0)), and I can find a single piece of content, e.g. content1, totally fine. But I need to find content1 and content2, not content3 and content4.
This also has to take into account that there can be one or more <p>content</p> tags in a div. Any recommendations?
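For illustration, here is a minimal sketch of the kind of two-step approach that might work: isolate the first div with re.search, then run re.findall on just that substring. It assumes the divs are never nested and that the markup looks like the example above; the variable names (html, first_div, paragraphs) are made up for this sketch.

```python
import re

# Sample markup copied from the example in the question.
html = '''<div class="text">
<p class="content1">content1</p>
<p class="content2">content2</p>
</div>
<div class="text">
<p class="content3">content3</p>
<p class="content4">content4</p>
</div>'''

# Step 1: capture the inner HTML of the first <div class="text"> only.
# re.DOTALL lets "." match newlines; the lazy ".*?" stops at the first
# </div>, which is only correct when the div contains no nested <div>.
first_div = re.search(r'<div class="text">(.*?)</div>', html, re.DOTALL)

# Step 2: collect the text of every <p ...>...</p> inside that div.
# This works for one paragraph or many, since findall returns all matches.
paragraphs = []
if first_div:
    paragraphs = re.findall(r'<p\b[^>]*>(.*?)</p>', first_div.group(1), re.DOTALL)

print(paragraphs)  # ['content1', 'content2']
```

Because re.search returns only the first match, content3 and content4 are never seen by the second pattern. To target a different div, re.finditer over the same div pattern could be used to pick the one needed.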