I’m new to SQLite and ran into a problem while updating two columns of a SQLite database as part of a sentiment analysis of tweets with the TextBlob library. I have to update 6.5 million rows and want to do it as efficiently as possible. This is the code I used:
import sqlite3
from textblob import TextBlob

conn = sqlite3.connect('tweets.db')
c = conn.cursor()

# Fetch the id and text of every English tweet
english_query = """
SELECT tweet_id, text
FROM tweetInfo
WHERE lang = 'en'
"""
c.execute(english_query)
e_tweets = c.fetchall()
conn.commit()

# Build one [polarity, subjectivity, tweet_id] entry per tweet
data_list = []
for i in range(len(e_tweets)):
    small_list = [TextBlob(e_tweets[i][1]).sentiment.polarity, TextBlob(e_tweets[i][1]).sentiment.subjectivity,
                  e_tweets[i][0]]
    data_list.append(small_list)

# Write both scores back to the matching row
update_query = '''
UPDATE tweetInfo
SET polarity = ?, subjectivity = ?
WHERE tweet_id = ?
'''
data = data_list
c.executemany(update_query, data)
conn.commit()
I used executemany because I read online that it is supposed to be fast when handling large amounts of data. The code seems to work, but it takes multiple hours to finish, so I’m wondering whether I’m misusing executemany or whether something else in the code is wrong. Does anyone have a solution for this?
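For anyone comparing approaches, here is a minimal sketch of one variation that may be worth timing. It is not a confirmed diagnosis. It rests on two assumptions: that tweet_id has no index yet (without one, every UPDATE has to scan the table to find its row), and that the sentiment is only needed once per tweet (the loop above builds two TextBlob objects per tweet). The names update_sentiment, score and idx_tweetInfo_tweet_id are hypothetical; the table and column names are the ones from the code above.

```python
import sqlite3


def update_sentiment(conn, score):
    """Score every English tweet once and write both values back.

    `score` is a hypothetical callable: text -> (polarity, subjectivity).
    Returns the number of tweets that were scored.
    """
    # Assumption: tweet_id is not indexed yet. If it is already the
    # primary key or otherwise indexed, this extra index is unnecessary.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_tweetInfo_tweet_id "
        "ON tweetInfo (tweet_id)"
    )

    rows = conn.execute(
        "SELECT tweet_id, text FROM tweetInfo WHERE lang = 'en'"
    ).fetchall()

    def params():
        # Call the scorer once per tweet and reuse both values.
        for tweet_id, text in rows:
            polarity, subjectivity = score(text)
            yield polarity, subjectivity, tweet_id

    # "with conn" commits once after all updates (or rolls back on error).
    with conn:
        conn.executemany(
            "UPDATE tweetInfo SET polarity = ?, subjectivity = ? "
            "WHERE tweet_id = ?",
            params(),
        )
    return len(rows)
```

With TextBlob, the scorer could be a small function that builds the blob once, for example s = TextBlob(text).sentiment followed by return s.polarity, s.subjectivity, and the call would then be update_sentiment(sqlite3.connect('tweets.db'), that_function). Whether the index or the scoring dominates the runtime depends on the data, so it is worth timing each part on a small sample first.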