I’m new to SQLite and I ran into a problem while trying to update two columns of a SQLite database for a sentiment analysis on tweets using the TextBlob library. I have to update 6.5 million rows and I want to do it as efficiently as possible. I used the following code:
import sqlite3
from textblob import TextBlob

conn = sqlite3.connect('tweets.db')
c = conn.cursor()

# Load every English tweet as a (tweet_id, text) row
english_query = """
SELECT tweet_id, text
FROM tweetInfo
WHERE lang = 'en'
"""
c.execute(english_query)
e_tweets = c.fetchall()
conn.commit()

# Build one [polarity, subjectivity, tweet_id] entry per tweet
data_list = []
for i in range(len(e_tweets)):
    small_list = [TextBlob(e_tweets[i][1]).sentiment.polarity,
                  TextBlob(e_tweets[i][1]).sentiment.subjectivity,
                  e_tweets[i][0]]
    data_list.append(small_list)

update_query = '''
UPDATE tweetInfo
SET polarity = ?, subjectivity = ?
WHERE tweet_id = ?
'''
data = data_list

# Write all results back with a single executemany call
c.executemany(update_query, data)
conn.commit()
I used the executemany function because I found online that it is supposed to be fast when handling large amounts of data. The code seems to work fine, but it takes multiple hours to finish, so I’m wondering whether I did anything wrong with the executemany function or with the code in general. Does anyone have a solution for this?
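
For reference, here is a minimal sketch of two things that may be worth checking in the approach described above. Both are assumptions, not confirmed by the post: that `tweet_id` has no index (in which case each `UPDATE ... WHERE tweet_id = ?` would scan the whole table), and that scoring each tweet once instead of twice saves a noticeable share of the time. The `score` function and the index name `idx_tweet_id` are hypothetical; `score` is a stand-in for TextBlob so the sketch runs without extra dependencies, and the in-memory table only mimics the columns named in the post.

```python
import sqlite3


def score(text):
    """Hypothetical stand-in for TextBlob(text).sentiment.

    Returns a (polarity, subjectivity) pair so the sketch runs on its own.
    """
    polarity = 1.0 if "good" in text else 0.0
    return (polarity, 0.5)


def update_sentiment(conn, score_fn=score):
    cur = conn.cursor()
    # Assumption: tweet_id is not indexed yet. Without an index, every
    # UPDATE ... WHERE tweet_id = ? has to search the whole table.
    cur.execute(
        "CREATE INDEX IF NOT EXISTS idx_tweet_id ON tweetInfo (tweet_id)"
    )
    cur.execute("SELECT tweet_id, text FROM tweetInfo WHERE lang = 'en'")
    rows = cur.fetchall()

    # Score each tweet once and reuse the result for both columns.
    data = []
    for tweet_id, text in rows:
        polarity, subjectivity = score_fn(text)
        data.append((polarity, subjectivity, tweet_id))

    cur.executemany(
        "UPDATE tweetInfo SET polarity = ?, subjectivity = ? "
        "WHERE tweet_id = ?",
        data,
    )
    conn.commit()
    return len(data)


# Tiny in-memory demo table with the columns used in the post.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tweetInfo (tweet_id INTEGER, text TEXT, lang TEXT, "
    "polarity REAL, subjectivity REAL)"
)
conn.executemany(
    "INSERT INTO tweetInfo (tweet_id, text, lang) VALUES (?, ?, ?)",
    [(1, "good day", "en"), (2, "bonjour", "fr"), (3, "rainy", "en")],
)
updated = update_sentiment(conn)
print(updated, "rows updated")
```

With TextBlob installed, passing `score_fn=lambda text: TextBlob(text).sentiment` should work in place of the stand-in, since `sentiment` unpacks into polarity and subjectivity. Whether the index is actually missing can be checked with `EXPLAIN QUERY PLAN` on the update statement.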