Data Analysis of BBI Tweets Between 19th and 20th of May, 2021 – A Project by Predictive Analytical Resources Limited (PARL).
Twitter engagement is growing among all groups in Kenya, including politicians, the youth and employees of different companies. Many Kenyans use the platform to express their opinions on matters that affect the country. Most media stations in Kenya also use it to report what is happening in the country. Additionally, many professionals in Kenya, such as lecturers, lawyers and doctors, use it to voice their opinions.
In Kenya, May 2021 saw intense discussion of the Building Bridges Initiative (BBI), which was initiated through the Handshake between the President and the Opposition Leader. The heightened discussion was brought about by the ruling of a five-judge High Court bench against the BBI document.
Our main aim was to use the Twitter API to collect tweets about the BBI document posted on the 19th and 20th of May, 2021. It was during these days that a famous Kenyan lawyer, Ahmednassir Abdullahi, hosted one of the largest Twitter Spaces, with over 7000 participants, to deconstruct and discuss the BBI document and the BBI ruling.
We wanted to collect tweets to help us understand the following:
- The distribution of tweets according to location.
- The distribution of tweets according to the device used.
- How many tweeps have verified Twitter accounts?
- How many tweets were quotes?
- The distribution of tweets between the two dates.
- Whose tweet was retweeted the most?
- How many tweets were retweets?
- What are the most commonly used words?
- What are the top 10 hashtags?
- Who are the top 10 users mentioned?
Methodology
To answer these questions, we collected tweets using the Twitter API and saved them to a CSV file containing the following information: the date and time the tweet was created, the user’s full name, the username, whether the account is verified, the user’s location, the user’s followers count, the user’s following count, when the account was created, the tweet text, the number of likes for the tweet, the number of retweets for the tweet, the device used and whether the tweet is a quote.
Using the Twitter API, we collected 10,000 tweets revolving around the search word “BBI”.
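As a rough illustration of the saving step, the sketch below writes tweet records to CSV with one column per field listed above. The column names and the sample record are our own assumptions for illustration, not the names used in the project, and the step that fetches tweets from the Twitter API is left out.

```python
import csv
import io

# Hypothetical column names for the 13 fields described in the methodology.
FIELDS = [
    "created_at", "name", "username", "verified", "location",
    "followers_count", "following_count", "account_created_at",
    "text", "likes", "retweets", "source", "is_quote",
]

def write_tweets(rows, handle):
    """Write an iterable of dicts (one per tweet) as CSV with a header row."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        # Fields a tweet lacks (e.g. an undisclosed location) become empty cells.
        writer.writerow({field: row.get(field, "") for field in FIELDS})

# Made-up example record, not taken from the real dataset.
sample = [{
    "created_at": "2021-05-20 10:00:00",
    "username": "example_user",
    "verified": False,
    "text": "Reading the BBI ruling",
    "source": "Twitter for Android",
}]

buffer = io.StringIO()  # stands in for an open file such as open("tweets.csv", "w", newline="")
write_tweets(sample, buffer)
csv_text = buffer.getvalue()
```

Filling missing fields with empty cells keeps every row at 13 columns, so the file loads cleanly into a dataframe later.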
Results
The dataframe had 10,000 rows and 13 columns. Each row represents a single tweet collected randomly from Twitter, and the columns hold the specified information about that tweet.
Each observation (row) in the dataset had the information shown in this image. Each of the 13 columns holds a specific piece of information about the tweep and the tweet.
In this dataframe, most of the Twitter users had a missing location, most likely because many users prefer not to disclose where they are. Among users who gave a location, Nairobi was the most common. Some users gave their location simply as Kenya, which is too general, but it was the second most common value.
The most used devices were Android phones, the Web App and iPhones, in that order. A small proportion of the sample also tweeted from an iPad.
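Distributions like the ones for location and device can be tallied with a simple counter. The sketch below is a minimal illustration, assuming the column values are available as a list of strings; the `distribution` helper, the `(missing)` label and the sample values are ours, not part of the original analysis.

```python
from collections import Counter

def distribution(values, missing_label="(missing)"):
    """Return (value, count, percent) tuples, most frequent first."""
    # Treat None, empty and whitespace-only values as missing.
    cleaned = [(v or "").strip() or missing_label for v in values]
    counts = Counter(cleaned)
    total = len(cleaned)
    return [(value, n, round(100 * n / total, 1))
            for value, n in counts.most_common()]

# Made-up locations for illustration only.
locations = ["Nairobi", "", "Kenya", None, "Nairobi", ""]
location_table = distribution(locations)
```

The same helper could be applied to the device column, or to the verified and quote flags after converting them to strings.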
Only a very small proportion of the sample has verified Twitter accounts. This might be because the Twitter verification process is long and requires a lot of proof. Also, most of the tweeps use fake names and accounts, which would make getting verified very difficult.
Tweets from the original source were just 3.4% of the whole sample. The rest of the tweets were quotes.
Of all the tweets, 24.4% were original tweets and 75.6% were retweets, so our dataset consists mostly of retweets. Most users are therefore either retweeting content with additional information, which makes their tweets quotes, or retweeting without additional information, which makes their tweets both retweets and quotes.
There were 7563 retweets in this dataset. Of these, tweets from @ahmednasirlaw were the most retweeted, followed by tweets from @NelsonHavi. Both were the main speakers in the Twitter Space created by @ahmednasirlaw, which is why their tweets were retweeted the most. On the same day, News Gang, a popular program on Citizen TV, was being aired. This explains why tweets from @citizentvkenya, and from some of its presenters, were also among the most retweeted.
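With 7563 retweets out of 10,000 tweets, the retweet share works out to 75.63%, in line with the 75.6% reported earlier. One possible way to compute that share and rank the most retweeted accounts is sketched below. It assumes that retweets appear in the collected text with the conventional `RT @username:` prefix, which may not match how the project identified them; the handles in the sample are made up.

```python
import re
from collections import Counter

# Assumption: a retweet's text starts with "RT @username:".
RT_PATTERN = re.compile(r"^RT @(\w+):")

def retweet_summary(texts):
    """Return (retweet share in percent, [(original author, count), ...])."""
    matches = (RT_PATTERN.match(text) for text in texts)
    authors = [m.group(1) for m in matches if m]
    share = 100 * len(authors) / len(texts) if texts else 0.0
    return share, Counter(authors).most_common()

# Made-up tweets for illustration only.
sample_texts = [
    "RT @user_a: The ruling explained",
    "RT @user_a: Join the space tonight",
    "RT @user_b: Full judgement here",
    "My own thoughts on BBI",
]
share, ranking = retweet_summary(sample_texts)
```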
65.4% of the tweets were tweeted on 20-05-2021 and 34.6% of the tweets on 19-05-2021, with the first tweet collected being on 19th at 12:51:18 and the last tweet on 20th at 23:59:17.
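The split between the two dates can be obtained by grouping tweets on the date part of their timestamp. The sketch below assumes timestamps stored as `YYYY-MM-DD HH:MM:SS` strings, which is our assumption about the CSV format; the first and last sample values are the timestamps quoted above, and the middle one is made up.

```python
from collections import Counter
from datetime import datetime

def share_by_date(timestamps, fmt="%Y-%m-%d %H:%M:%S"):
    """Return {ISO date: percent of tweets}, ordered by date."""
    days = [datetime.strptime(ts, fmt).date().isoformat() for ts in timestamps]
    counts = Counter(days)
    return {day: round(100 * n / len(days), 1)
            for day, n in sorted(counts.items())}

sample_times = [
    "2021-05-19 12:51:18",  # first tweet collected
    "2021-05-20 08:00:00",  # made-up value
    "2021-05-20 23:59:17",  # last tweet collected
]
date_shares = share_by_date(sample_times)
```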
The most commonly used hashtags for that period were #BBI, #BBIJudgement, #HotSauce2ndWin, #iHeartAwards, #JKLive and #NewsGang.
#JKLive is a popular show hosted by Jeff Koinange on Citizen TV and aired every Wednesday, which explains why #JKLive was one of the popular hashtags. Similarly, News Gang is aired on Citizen TV on Thursdays. The talk of the day was the High Court ruling on the BBI document, which is why #BBI and #BBIJudgement were the most used hashtags during that period.
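Top hashtags and top mentioned users can both be extracted from the tweet text with a regular expression. The following is a minimal sketch rather than the method used in the project: the `top_tokens` helper is hypothetical, it lowercases tokens so that `#BBI` and `#bbi` are counted together, and the sample tweets are made up.

```python
import re
from collections import Counter

def top_tokens(texts, prefix, n=10):
    """Count tokens starting with `prefix` ('#' or '@'), case-insensitively."""
    pattern = re.compile(re.escape(prefix) + r"(\w+)")
    counts = Counter(
        prefix + token.lower()
        for text in texts
        for token in pattern.findall(text)
    )
    return counts.most_common(n)

# Made-up tweets for illustration only.
sample_texts = [
    "#BBI ruling discussed on #JKLive",
    "RT @example_user: #bbi #BBIJudgement",
    "No tags here",
]
top_hashtags = top_tokens(sample_texts, "#")
top_mentions = top_tokens(sample_texts, "@")
```

Note that counting mentions this way also counts the account named in an `RT @username:` prefix, so retweets may need to be filtered out first if only deliberate mentions are wanted.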
Finally, the image shows the most used words in the tweet text. The Building Bridges Initiative was the talk of the day, which is why the most used words during that period were related to it.
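A word-frequency count of this kind could be produced roughly as follows. This is only a sketch under our own assumptions: links, mentions and hashtags are stripped before counting, and the small stopword list is illustrative and far shorter than what a real analysis would use.

```python
import re
from collections import Counter

# Illustrative stopword list (an assumption); "rt" removes the retweet marker.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "on",
             "is", "was", "for", "rt"}

def top_words(texts, n=10):
    """Return the n most common words after basic cleaning."""
    words = []
    for text in texts:
        text = re.sub(r"https?://\S+", " ", text)  # drop links
        text = re.sub(r"[@#]\w+", " ", text)       # drop mentions and hashtags
        words += [w for w in re.findall(r"[a-z']+", text.lower())
                  if w not in STOPWORDS and len(w) > 1]
    return Counter(words).most_common(n)

# Made-up tweets for illustration only.
sample_texts = [
    "RT @example_user: The BBI ruling is historic https://example.com/x",
    "BBI ruling and the court #BBI",
]
common_words = top_words(sample_texts, 2)
```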
Conclusion
Twitter holds a huge amount of information that can be harnessed to give insights to businesses and companies, whether by searching for one keyword, several keywords or hashtags, or by pulling tweets from a particular account.
We at Predictive Analytical Resources Limited have the capacity to transform your real data into insights that can help drive your business forward.