Alex Hales
Asked: June 2, 2022

pandas – python nested for loop keeps running indefinitely, yet the intended list is being created

I have a dataframe called 'dft' of Netflix TV shows and movies, with a column named "listed_in" whose entries are strings listing all the genres a title is classified under. Each row can list a different number of genres, separated by commas.

A single entry looks like, for example: 'Documentary','International TV Shows','Crime TV Shows'. Another row may list a different number of genres, some of which may also appear in other rows.

Now I want to create a list of the unique genres across all rows.

genres = []

for i in range(0,len(dft['listed_in'].str.split(','))):
    for j in range(0,len(dft['listed_in'].str.split(',')[i])):
        if (dft['listed_in'].str.split(',')[i][j]) not in genres:
            genres.append(dft['listed_in'].str.split(',')[i][j])
        else:
            pass

This keeps the kernel running indefinitely, yet the list is being created: if I interrupt the kernel after some time and print the list, it is there.

Then I create a dataframe from this list, intending to add a column with the number of times each genre appears in the original dataframe.
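
A likely explanation, judging only from the code above, is that the loop is extremely slow rather than infinite: `dft['listed_in'].str.split(',')` re-splits the whole column every time it is evaluated, and it is evaluated several times per inner iteration, so the total work grows roughly with the square of the number of rows. The sketch below splits the column once and lets pandas collect the unique values. The sample `dft` is a hypothetical stand-in for the real data, and the `.str.strip()` step assumes entries may have a space after each comma.

```python
import pandas as pd

# Hypothetical stand-in for the real dft; only the 'listed_in' column matters.
dft = pd.DataFrame({
    "listed_in": [
        "Documentary, International TV Shows, Crime TV Shows",
        "Crime TV Shows, TV Dramas",
        "Documentary",
    ]
})

# Split every row exactly once, then flatten to one genre per row.
# strip() is an assumption: it removes any space left after a comma.
all_genres = dft["listed_in"].str.split(",").explode().str.strip()

# unique() keeps first-appearance order, like the original append loop.
genres = all_genres.dropna().unique().tolist()
print(genres)
```

Without the `strip()` call, a genre with a leading space would be treated as distinct from the same genre without one, which is also how the original loop behaves.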

data = {'Genres':genres,'count':[0 for i in range(0,len(genres))]}
gnr = pd.DataFrame(data = data)

Then, to set the count column to each genre's number of occurrences:

for i in range(0,65):
    for j in range(0,514):
        if gnr.loc[i,'Genres'] in (dft['listed_in'].str.split(',').index[j]):
            gnr.loc[i,'count'] = gnr.loc[i,'count'] + dft['listed_in'].str.split(',').value_counts()[j]
        else:
            pass

Again, this code keeps running indefinitely, but after interrupting it I saw that the count for the first entry had been updated in the gnr dataframe.

I don’t know what is happening.
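
The counting loop appears to suffer from the same problem: the split, and here also `value_counts()`, is recomputed on each of the 65 × 514 passes. A minimal sketch of one alternative, assuming the goal is the number of rows each genre appears in, is to explode the split column and count once. The sample `dft` is a hypothetical stand-in for the real data, and stripping whitespace is an assumption about how the entries are formatted.

```python
import pandas as pd

# Hypothetical stand-in for the real dft; only the 'listed_in' column matters.
dft = pd.DataFrame({
    "listed_in": [
        "Documentary, International TV Shows, Crime TV Shows",
        "Crime TV Shows, TV Dramas",
        "Documentary",
    ]
})

# One genre per row, computed a single time, then counted in one pass.
counts = (
    dft["listed_in"]
    .str.split(",")
    .explode()
    .str.strip()       # assumption: drop any space left after a comma
    .value_counts()
)

# Turn the counts into a two-column dataframe shaped like 'gnr'.
gnr = counts.rename_axis("Genres").reset_index(name="count")
print(gnr)
```

This produces the genre list and the counts in a single step, so the separate list-building loop is not needed for this purpose.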
