A few months ago, I created a bot on Twitter. @AllTheLanguages tweets a new language from the Ethnologue database once every hour or so, and will do so for about a year. Give or take. Sometimes the bot goes down and I have to reboot it. And there are some other bugs too. But more on that in another post…

When I tell people that I made a twitter bot, the first thing they ask (after “why?”) is “how?” Well, today, I’m going to answer that! Why? Because it was fun! How? Well, it’s complicated…

If you search Google for “how to make a twitter bot” you’ll get a large number of mediocre tutorials. They either (1) assume you know zero programming, and make a really lame twitter bot for you (like this WolframAlpha answer bot clone), or (2) assume you already know a lot of programming, and gloss over the finer details (like how to set up your Raspberry Pi as a bot server). Some make claims like “your bot will be up and running in just 20 minutes!” (Hint: it won’t take just 20 minutes. This bot took me about three days of programming, debugging, and wrestling with the server.) Other tutorials assume other things, like what programming language you’re using (this one is a good tutorial, but assumes you’re using JavaScript), what operating system you’re using, and whether or not you have server space freely available to you (on the cloud or elsewhere).

My tutorial assumes you know some programming (functions, loops, lists), preferably in Python, and that you know how to install packages and run simple commands from your terminal, preferably on Mac/Unix. Hopefully, some of the basic concepts will still apply if you’re not using Python or Unix. I also assume you either already have server space, or are willing to learn how to get server space, or are willing to pay real money for server space.

If you don’t know how to do all these things, I recommend learning a programming language over at codecademy and coming back here later. If you do know how to write a function and install a package in your favorite programming language, read on…

The first thing you need before making a twitter bot is stuff to tweet. There are a couple of different ways you can go about this. You could write a whole bunch of tweets and then schedule them to be tweeted at regular intervals. This is time-consuming, because you have to write all the content yourself ahead of time. The whole point of robots is automation!

Other ideas: You could use some language algorithms to generate random content, like @metaphorminute. You could pull data from news sites like @earthquakesSF. You could make a bot that answers questions like @DearAssistant. Or you could make a bot that retweets or replies to any tweet which mentions certain keywords, like these bots.

I decided that I would make a bot which tweeted all the languages and a little bit of info about them. Since there are over 7,000 languages listed in Ethnologue, I calculated that if my bot tweeted once every 70 minutes or so, it’d take about one year to get through them all.
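If you want to check that kind of arithmetic for your own bot, here is a rough sketch. It uses the round numbers from the paragraph above (7,000 tweets, 70 minutes apart); the function names are just ones I made up for illustration.

```python
# Back-of-the-envelope schedule check, using round numbers.
MINUTES_PER_DAY = 24 * 60
MINUTES_PER_YEAR = 365 * MINUTES_PER_DAY


def days_to_finish(num_tweets, minutes_between_tweets):
    """Days needed to post num_tweets at a fixed interval."""
    return num_tweets * minutes_between_tweets / float(MINUTES_PER_DAY)


def interval_for_one_year(num_tweets):
    """Minutes between tweets so that num_tweets fill exactly 365 days."""
    return MINUTES_PER_YEAR / float(num_tweets)


if __name__ == "__main__":
    print(round(days_to_finish(7000, 70), 1))
    print(round(interval_for_one_year(7000), 1))
```

With these numbers, 7,000 tweets at 70 minutes apart take roughly 340 days, and stretching them over a full 365 days would mean tweeting about every 75 minutes. Plug in your own tweet count and interval to see how long your bot will run.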

Once you have an idea, you’ll need a twitter account and its associated API keys. Signing up for a twitter account is fairly straightforward (just go to twitter.com/signup and follow the directions), but if you already have an account you’ll need to use a different email address for your new bot account. Once the profile is made, you’ll need to add a profile picture and cover picture, follow some other twitter accounts relevant to yours, and get your API keys. API keys are secret passwords that allow you to write a program which connects to your twitter account. In order to get your API keys, you’ll need to go to http://apps.twitter.com/ and create a new app. Fill out the form, choose your app, and click on the “API keys” tab. It looks like this:

[Screenshot: the “API keys” tab of a twitter app, with the keys barred out]

I’ve barred out my API keys, since you shouldn’t share them. Copy/paste your API keys to a text file and keep them safe for now. You’ll need them to write your bot’s code in a bit!

Next, you’ll need to write the code that generates tweets. This part really depends on what you want to tweet. You’ll have to come up with your own code. Just as an example, the core functionality of my bot looks like this:

def ethnologue(lngcode):
    url = "https://www.ethnologue.com/language/"+str(lngcode)
    html = urlopen(url).read()
    Language = re.search(r'(?<=\>)[^\<\>]+(?=\</h1\>)',html).group()
    Loc = re.search(r'(?<=[A-Z]"\>)[^\<\>]+(?=\</a\>\</h2\>)',html)
    Location = " of " + Loc.group() if Loc else ""
    End = re.search(r'(\>\d\d?\D? \()(\w+ ?\w+?)(?=\)\.)',html)
    Endangerment = " " + End.group(2) if End else ""
    return Language + " is a" + Endangerment.lower() + " language" + Location + ". More info: " + url

Let’s go through this line by line. In line 1, I’m defining a function called “ethnologue”, which takes “lngcode” as a parameter. In line 2, I’m building the URL of the language page associated with that code. For example, “eng” is the language code for English, so it goes to http://www.ethnologue.com/language/eng. In line 3, I’m downloading the HTML of that page. In lines 4 through 8, I’m using some funky (and rather clunky, and somewhat error-prone) regular expressions to pull out specific information about the language: its name, its location, and its endangerment status. The location and status are optional, so each falls back to an empty string when its regular expression finds nothing. In the last line, I’m returning the thing I want to tweet: a sentence composed of some strings and some of the variables I found using the regular expressions.
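If regular expressions are new to you, it can help to try them on a tiny piece of HTML before pointing them at a real website. Below is a small sketch that does that. The patterns are the same ones as in my function, just with the unnecessary backslashes dropped. The sample HTML, the name “Examplish”, and the helper extract_fields() are all made up for illustration; the real Ethnologue markup may well look different.

```python
import re

# Made-up markup for illustration only; the real pages may differ.
SAMPLE_HTML = (
    '<h1 class="title">Examplish</h1>'
    '<h2><a href="/country/XX">Exampleland</a></h2>'
    '<p>Language status: <span>6b (Threatened).</span></p>'
)


def extract_fields(html):
    """Return (language, location, endangerment); missing parts are ''."""
    # The text sitting directly in front of the closing </h1> tag.
    language = re.search(r'(?<=>)[^<>]+(?=</h1>)', html).group()
    # The link text inside an <h2>, where the link target ends in a capital letter.
    loc = re.search(r'(?<=[A-Z]">)[^<>]+(?=</a></h2>)', html)
    # A status like "6b (Threatened)." where group 2 is the word(s) in brackets.
    end = re.search(r'(>\d\d?\D? \()(\w+ ?\w+?)(?=\)\.)', html)
    location = loc.group() if loc else ""
    endangerment = end.group(2).lower() if end else ""
    return language, location, endangerment


if __name__ == "__main__":
    print(extract_fields(SAMPLE_HTML))
```

Keeping the “pull things out of the HTML” step separate from the “download the page” step like this also makes it easier to see which regular expression broke when the website changes its layout.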

Regardless of what your bot will tweet, you’ll need to install several packages. Importantly, if you decide to host on the cloud, you’ll need to have your packages installed on the cloud server as well.

Which packages do you install? Well, for example, if you want to space out your tweets by a certain amount of time (as in my case, tweeting once every hour-ish), you’ll need the module “time”, which comes with Python and doesn’t need installing. You’ll also need “tweepy” to talk to twitter; that one you do have to install (for example, with pip install tweepy). If you’re using R, “twitteR” is a good package for this too. Other programming languages have their own packages for these things; it just depends on what language you’re working in and what you want your bot to do. So my bot program starts off by importing my packages:

from urllib import urlopen  # Python 2 only; in Python 3, urlopen lives in urllib.request
import time, re, tweepy, sys

Then it does some twitter authorization stuff. That looks like this:

CONSUMER_KEY = "yourAPIKeyHere"
CONSUMER_SECRET = "yourAPISecretHere"
ACCESS_KEY = "yourAccessKeyHere"
ACCESS_SECRET = "yourAccessSecretHere"
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
api = tweepy.API(auth)

If you’re building your own bot in Python, you can copy/paste this part of the code and replace the variables with the keys you got from twitter earlier.
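One thing to be careful about: if the keys are pasted straight into the script, anyone who sees the script sees the keys. A common alternative is to keep them in environment variables and read them when the bot starts. Here is a minimal sketch of that idea. The variable names (TWITTER_CONSUMER_KEY and so on) and the load_keys() helper are my own invention, not something twitter or tweepy requires.

```python
import os

# Hypothetical variable names; any names work as long as they match your shell setup.
KEY_NAMES = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_KEY",
    "TWITTER_ACCESS_SECRET",
)


def load_keys(environ=os.environ):
    """Return the four credentials in KEY_NAMES order, or raise if any are missing."""
    missing = [name for name in KEY_NAMES if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return tuple(environ[name] for name in KEY_NAMES)
```

You would then write CONSUMER_KEY, CONSUMER_SECRET, ACCESS_KEY, ACCESS_SECRET = load_keys() in place of the four hard-coded strings, and leave the three tweepy lines as they are.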

Next, my program has my ethnologue() function, followed by a list of all 7,106 language codes from Ethnologue. There is probably a less clunky way of doing this, but this was easier/faster than trying to write some kind of loop that went through every three-letter combination and checked to see if it was on Ethnologue and also checked to see if it had already been tweeted or not. Again, depending on what your bot will be tweeting, you may or may not have something like this.
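One possible less clunky way, if you’d rather not paste thousands of codes into your script: keep them in a plain text file, one per line, and keep a second file of the codes that have already been tweeted. This is only a sketch of that idea, not what my bot actually does, and the file layout and function names are assumptions of mine.

```python
import os


def load_codes(path):
    """Read one code per line, skipping blank lines; returns [] if the file is absent."""
    if not os.path.exists(path):
        return []
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def remaining_codes(all_path, done_path):
    """Codes from all_path that are not yet recorded in done_path, in file order."""
    done = set(load_codes(done_path))
    return [code for code in load_codes(all_path) if code not in done]


def mark_done(done_path, code):
    """Append a code to the done file right after it has been tweeted."""
    with open(done_path, "a") as handle:
        handle.write(code + "\n")
```

The nice side effect is that if the bot goes down and you restart it, remaining_codes() picks up where it left off instead of starting over from the first language.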

Lastly, I have the loop that does the actual tweeting. With the tweepy package, this part is surprisingly easy. It looks like this:

while len(languageCodes) > 0:
    tweetLng = languageCodes.pop()
    api.update_status(ethnologue(tweetLng))
    time.sleep(4000)

Line 1 says that the loop should keep running while there is at least one (more than zero) language left to tweet in my list of languageCodes. Line 2 pops a language code off the end of the list (thus deleting it from the list), and line 3 takes that language code, passes it into my ethnologue() function from earlier, and updates my twitter status with the result. Line 4 makes the process wait 4000 seconds (about 67 minutes) before repeating.
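One weakness of a loop this simple is that a single error (a page that won’t load, a regular expression that finds nothing, twitter rejecting a tweet) stops the whole bot. Here is a hedged sketch of a more forgiving version. The run_bot() function and its parameters are my own invention for illustration; the tweet-making and posting functions are passed in so you can try the loop without touching the network.

```python
import time


def run_bot(codes, make_tweet, post, interval_seconds=4000, sleep=time.sleep):
    """Tweet once per code; a failure on one code is reported and skipped.

    In the real bot, make_tweet would be ethnologue() and post would be
    api.update_status. Returns the list of codes that failed.
    """
    failed = []
    while len(codes) > 0:
        code = codes.pop()
        try:
            post(make_tweet(code))
        except Exception as error:  # broad on purpose: keep the bot alive
            print("Skipping %s: %s" % (code, error))
            failed.append(code)
        sleep(interval_seconds)
    return failed
```

With the pieces from this post, the call would look something like run_bot(languageCodes, ethnologue, api.update_status). The codes that come back in the failed list are ones you can look at by hand later.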

This is, of course, just an example. My goal here isn’t to get you to write a bot that tweets ethnologue data, but rather for you to come up with your own bot!! You can replace the ethnologue() function with some other function that you write to generate tweets, you can change the frequency of tweets, and you can change the condition under which it tweets. Each bot is unique! You can post your ideas or ask for help in the comments below!

So that’s it! You’ve written a twitter bot’s code! Buuuuuut… we’re not done. You have to run the script somewhere, and keep it running until you want your bot to stop tweeting (in my case, for a whole year).

If you have a very reliable internet connection and a computer you’re willing to never turn off, you can run the script from that computer and your bot will keep tweeting forever (or until you have a power outage). Or you can just remember to turn your bot on every time you’re on your computer, and your bot will go to sleep when you go to sleep. But, this isn’t an option for most people.

Fortunately, there are a large number of cloud hosting services available. Unfortunately, many of them require a monthly subscription (typically something small, like $5/month, but still more than I was willing to pay just for one twitter bot). There’s also Amazon Web Services, which offers a year of free hosting for apps and data, and then has reasonable prices after the first year. I calculated that with the size of my twitter bot, it’ll cost a few pennies every month to host. A few pennies sounded a lot better than a few dollars, so I went with this option.

The catch with AWS is that it’s not very intuitive or user-friendly if you’re new to programming. It’s really designed for app developers, not Average Joe and Jane. If you know Linux, this is a great option. I’ll be honest, just getting everything set up was very difficult for me, but I learned a lot about Linux in the process. Luckily, I have a subscription to Lynda.com through my University, and was able to take advantage of this tutorial on AWS. I won’t go into detail about setting up the cloud hosting, but the point is you’ll either need to host it yourself, pay someone to host it for you, or use a free service like AWS but be forced to deal with a steep learning curve.

One of the most valuable things I learned through this last step of getting the bot to run on the cloud was nohup, which is a Linux command that allows a script to keep running even after you close the terminal. So, after I log on to my virtual Linux machine on the Amazon cloud, I download my twitbot.py file onto the virtual machine, and then type:

nohup python twitbot.py &

And that’s it! Even after I log out of AWS, my bot will keep happily tweeting! :)
