Asked by Alex Hales on June 2, 2022 (2022-06-02T19:25:18+00:00)

excel – Extract text from a specific pattern in text file using python


I have a text file from which I am trying to extract titles into an Excel column. The required titles sit inside a specific comment pattern:

COM *******************
COM * Title 1*
COM *******************

COM ***************************
COM * Sub 1 *
COM ***************************
{
...TEXT DETAILS...
}
COM ***************************
COM * Sub 2 *
COM ***************************
{
...TEXT DETAILS...
}


COM *******************
COM * Title 2*
COM *******************

COM ***************************
COM * T2 Sub 1  *
COM ***************************
{
...TEXT DETAILS...
}
COM ***************************
COM * T2 Sub 2 *
COM ***************************
{
...TEXT DETAILS...
}

The required output of the title extraction is a list:

['Title 1', 'Sub 1',..,'T2 Sub 2']

or an Excel column such as:

CATEGORY
Title 1
Sub 1
Sub 2

Title 2
T2 Sub 1
T2 Sub 2
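
For illustration, a column like this could be written with Python's standard csv module, which Excel can open. This is only a sketch: it produces a CSV file rather than a native .xlsx workbook, and the function name write_category_column is hypothetical, not part of any library.

```python
import csv
import io

def write_category_column(titles, out_file):
    """Write the titles as a single column under a CATEGORY header."""
    writer = csv.writer(out_file)
    writer.writerow(["CATEGORY"])       # header row
    for title in titles:
        writer.writerow([title])        # one title per row

# Demonstration with an in-memory buffer instead of a real file.
buffer = io.StringIO()
write_category_column(["Title 1", "Sub 1", "Sub 2"], buffer)
print(buffer.getvalue())
```

When writing to a real file, open it with newline="" (for example open(path, "w", newline="")) so the csv module controls the line endings itself.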

What I am unable to implement is matching the ‘COM *****’ border lines and picking out the middle line that holds the title. I recently extracted strings based on a pattern in what I think was a similar problem.

In that case the input text file was in this format:

CTG 'GEN:LT'                               
{
TEXT DETAILS....
}

CTG 'GEN:FR'                               
{
TEXT DETAILS....
}

CTG 'GEN:G_L02'                                
{
TEXT DETAILS....
}

CTG 'GEN:ER'                               
{
TEXT DETAILS....
}

CTG 'GEN:C1' 
{
TEXT DETAILS....
}

My goal was to extract the single-quoted string that follows CTG. My idea was to detect lines containing CTG and take the quoted string next to it. Here is how I implemented it:

import re

def getCtgName(text):
    # Return every single-quoted string found in the text.
    return re.findall(r"'(.+?)'", text)

mylines = []                                  # Lines of the input file.
with open('filepath.txt', 'rt') as myfile:    # Open .txt for reading text.
    for myline in myfile:                     # For each line in the file,
        mylines.append(myline.rstrip('\n'))   # strip newline and add to list.

columns = []
substr = "CTG"                    # Substring to search for.
for line in mylines:              # Line to be searched.
    names = getCtgName(line)
    if substr in line and names:  # Skip CTG lines that have no quoted name.
        columns.append(names[0])
print(columns)

This gave the output:

['GEN:LT', 'GEN:FR',..., 'GEN:C1']

I believe similar logic can be used to extract the titles between those comment (COM ****) lines. Any help with the code, logic, or resources will be appreciated. Thank you!
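
One possible approach, sketched under the assumption that every title line starts with COM and wraps the title in a single pair of asterisks, as in the sample above: match the title lines directly with a regular expression that rejects the all-asterisk border lines. The names TITLE_LINE and extract_titles are illustrative only, and the sample text is copied from the format shown in the question.

```python
import re

# "COM", whitespace, "*", then a title that contains no asterisks, then "*".
# Border lines such as "COM *****" fail because the title must start with
# a character that is neither an asterisk nor whitespace.
TITLE_LINE = re.compile(r"^COM\s+\*\s*([^*\s][^*]*?)\s*\*\s*$")

def extract_titles(lines):
    """Return the titles found on 'COM * title *' lines, in file order."""
    titles = []
    for line in lines:
        match = TITLE_LINE.match(line.strip())
        if match:
            titles.append(match.group(1))
    return titles

sample = """\
COM *******************
COM * Title 1*
COM *******************

COM ***************************
COM * Sub 1 *
COM ***************************
{
...TEXT DETAILS...
}
COM ***************************
COM * T2 Sub 1  *
COM ***************************
{
...TEXT DETAILS...
}
"""

print(extract_titles(sample.splitlines()))
# ['Title 1', 'Sub 1', 'T2 Sub 1']
```

For a real file, the same function can be given the open file object (for example extract_titles(myfile) inside a with open(...) block), since it only iterates over lines. If titles can themselves contain asterisks, the pattern would need to be adjusted.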
