StackOverflow Point

Alex Hales
Asked: May 30, 2022 (2022-05-30T22:54:52+00:00)

python – How to findall() a sequence of regular expressions to a pandas dataframe?


I am extracting some patterns with the pandas findall function. However, I have several regular expressions. How can I run findall with N regular expressions in pandas?

For example, let's say that I would like to extract all the numbers and all the dates inside a specific column:

In:

import pandas as pd

dfs = pd.DataFrame(data={'c1': ['This dataset 11/12/98 contains 5,000 rows, which were sampled from a 500,000 11/12/12 row dataset spanning the same time period. Throughout these analyses',

                                'the number of events you count will be about 100 times smaller than they 11/12/78 actually were, but the 01/12/11 proportions of events will still generally be reflective that larger dataset. In this case, a sample is fine because our purpose is to learn methods of data analysis with Python, not to create 100% accurate recommendations to Watsi.']})

dfs

Out:

    c1
0   This dataset 11/12/98 contains 5,000 rows, whi...
1   the number of events you count will be about 1...

I tried passing a list of patterns to findall, but I am getting the following error:

In:

dfs['patterns'] = dfs['c1'].str.findall([r'\d+',r'(\d+/\d+/\d+)']).apply(', '.join)

dfs

Out:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-64-af2969e06a61> in <module>()
----> 1 dfs['patterns'] = dfs['c1'].str.findall([r'\d+',r'(\d+/\d+/\d+)']).apply(', '.join)
      2 dfs

/usr/local/lib/python3.5/site-packages/pandas/core/strings.py in wrapper2(self, pat, flags, **kwargs)
   1268 
   1269     def wrapper2(self, pat, flags=0, **kwargs):
-> 1270         result = f(self._data, pat, flags=flags, **kwargs)
   1271         return self._wrap_result(result)
   1272 

/usr/local/lib/python3.5/site-packages/pandas/core/strings.py in str_findall(arr, pat, flags)
    827     extractall : returns DataFrame with one column per capture group
    828     """
--> 829     regex = re.compile(pat, flags=flags)
    830     return _na_map(regex.findall, arr)
    831 

/usr/local/Cellar/python3/3.5.2_2/Frameworks/Python.framework/Versions/3.5/lib/python3.5/re.py in compile(pattern, flags)
    222 def compile(pattern, flags=0):
    223     "Compile a regular expression pattern, returning a pattern object."
--> 224     return _compile(pattern, flags)
    225 
    226 def purge():

/usr/local/Cellar/python3/3.5.2_2/Frameworks/Python.framework/Versions/3.5/lib/python3.5/re.py in _compile(pattern, flags)
    279     # internal: compile pattern
    280     try:
--> 281         p, loc = _cache[type(pattern), pattern, flags]
    282         if loc is None or loc == _locale.setlocale(_locale.LC_CTYPE):
    283             return p

TypeError: unhashable type: 'list'
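
The last frames of the traceback show that `str.findall` hands `pat` straight to `re.compile`, which expects a single pattern rather than a list. A minimal sketch that reproduces the failure with the standard library alone (the variable names are illustrative and not from the original post):

```python
import re

# Illustrative list of patterns, mirroring the call in the question
# (the capture group is omitted here; it does not affect the error).
patterns = [r"\d+", r"\d+/\d+/\d+"]

# re.compile accepts one pattern (a string or an already compiled pattern).
# Passing a list fails before any matching takes place.
try:
    re.compile(patterns)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

So the error comes from the type of the argument, not from the regular expressions themselves.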

Therefore, how can I "stack", "nest" or apply several regexes with the findall function? What I expect as output is the matches of each regular expression, separated by commas, in a single column:

   col
0  '11/12/98', '5', '000', '500', '000', '11/12/12'
1  '100', '11/12/78', '01/12/11', '100'

UPDATE

I also tried:

dfs['patterns'] = dfs['c1'].str.map(findall(),[r'\d+',r'(\d+/\d+/\d+)']).apply(', '.join)
dfs
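
One possible approach, offered as a sketch rather than a definitive answer: merge the patterns into a single alternation so that `str.findall` receives one regex. The sample rows below are abbreviated versions of the ones in the question, and `patterns` and `combined` are illustrative names. Two assumptions are worth flagging: the date pattern is listed first so that it wins over the bare `\d+`, and the capturing parentheses from the original date pattern are dropped, because with a capture group `findall` returns the group contents instead of the whole match.

```python
import pandas as pd

# Abbreviated versions of the two sample rows from the question.
dfs = pd.DataFrame({
    "c1": [
        "This dataset 11/12/98 contains 5,000 rows, sampled from a 500,000 11/12/12 row dataset.",
        "about 100 times smaller than they 11/12/78 actually were, but the 01/12/11 proportions; not 100% accurate.",
    ]
})

# Assumption: the more specific pattern goes first. With alternation the
# leftmost alternative that matches wins, so a leading \d+ would consume
# the day of each date before the date pattern is tried.
patterns = [r"\d+/\d+/\d+", r"\d+"]

# Wrap each pattern in a non-capturing group and join with "|",
# giving one regex that str.findall can compile.
combined = "|".join(f"(?:{p})" for p in patterns)

# findall returns a list of matches per row; str.join flattens it to text.
dfs["patterns"] = dfs["c1"].str.findall(combined).str.join(", ")

print(dfs["patterns"].tolist())
```

The matches come out in the order they appear in the text, which lines up with the expected output above. If separate results per pattern are preferred, calling `str.findall` once per pattern and storing each result in its own column is another option.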

© 2022 Stackoverflow Point. All Rights Reserved.
