Inspired by Amy Iris, I have written a little bit of automation for Twitter. On Twitter, it is not easy to find others by interest. This little piece of code runs a search on the terms you specify and then checks the bio of each poster for your search terms. For each user who matches, it follows them for you.
#!/usr/bin/env python
import getopt
import re
import simplejson
import sys
import time
import twitter
import urllib2

from getpass import getpass
from urllib import urlencode


def compile_filter(query):
    # Split the query into required terms and excluded ("-term") terms.
    good = []
    bad = []
    words = query.split()
    for word in words:
        if word[0] == '-' and len(word) > 1:
            # Drop the leading '-' so the pattern matches the term itself.
            bad.append(re.compile(word[1:], re.IGNORECASE))
        else:
            good.append(re.compile(word, re.IGNORECASE))
    return (good, bad)


def filter_user_by_bio(user, filter, api=None):
    if api is None:
        api = twitter.Api()
    bio = api.GetUser(user).GetDescription()
    if bio is None:
        return False  # We only follow those with bios
    good, bad = filter
    goodmatches = []
    # Any excluded term in the bio rejects the user outright.
    for word in bad:
        if not word.search(bio) is None:
            return False
    # Every required term must appear in the bio.
    for word in good:
        if not word.search(bio) is None:
            goodmatches.append(word)
    if good == goodmatches:
        return True
    return False


def follow_by_query(username, password, q, rpp=None, lang=None):
    filter = compile_filter(q)
    api = twitter.Api(username=username, password=password)
    friends = []
    for user in api.GetFriends():
        friends.append(user.GetScreenName())
    goodusers = []
    for user in get_users_from_search(q, rpp, lang):
        if filter_user_by_bio(user, filter, api):
            goodusers.append(user)
    newusers = []
    for user in goodusers:
        if not user in friends:
            api.CreateFriendship(user)
            friends.append(user)
            newusers.append(user)
    return newusers


def get_users_from_search(query, resultnum=None, lang=None):
    q = []
    rpp = 10
    q.append(urlencode({'q': query}))
    if not lang is None:
        q.append(urlencode({'lang': lang}))
    if not resultnum is None:
        rpp = resultnum
    q.append(urlencode({'rpp': rpp}))
    # The second argument to urlopen is sent as POST data.
    response = urllib2.urlopen(
        'http://search.twitter.com/search.json?',
        '&'.join(q)
    )
    data = simplejson.load(response)
    for result in data['results']:
        yield result['from_user']


def print_usage():
    sys.stderr.write("""
Usage: %s -u username [-p password] [-r search_result_number] [-l language]
        terms ...

    -l language = Filter search by language.
    -p password = Optional. If not supplied, you will be asked for it.
    -r search_result_number = Number of results to pull from twitter searches.
    -u username = twitter username.
""" % sys.argv[0])


if __name__ == '__main__':
    optlist, args = getopt.getopt(sys.argv[1:], 'l:p:r:u:')
    if not args:
        sys.stderr.write("You must specify search terms\n")
        print_usage()
        raise SystemExit, 1
    optd = dict(optlist)
    if not '-u' in optd:
        sys.stderr.write("You must specify a user\n")
        print_usage()
        raise SystemExit, 1
    username = optd['-u']
    query = " ".join(args)
    if not '-p' in optd:
        sys.stderr.write("Password:")
        password = getpass("")
    else:
        password = optd['-p']
    rpp = None
    if '-r' in optd:
        rpp = optd['-r']
    lang = None
    if '-l' in optd:
        lang = optd['-l']
    try:
        newusers = follow_by_query(username, password, query, rpp, lang)
    except urllib2.HTTPError, e:
        sys.stderr.write("Cannot connect to Twitter\n")
        sys.stderr.write(str(e))
        sys.stderr.write("\n")
    else:
        if newusers:
            print ", ".join(newusers), 'Added!'
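As a rough illustration of what get_users_from_search sends to the search endpoint, the sketch below builds the same parameter string by hand. The helper name build_search_params is hypothetical (it is not part of the script), and the import fallback is only there so the snippet runs on both Python 2 and Python 3.

```python
try:
    from urllib import urlencode  # Python 2, as in the script
except ImportError:
    from urllib.parse import urlencode  # Python 3


def build_search_params(query, resultnum=None, lang=None):
    """Hypothetical helper mirroring the parameter handling in get_users_from_search."""
    params = [urlencode({'q': query})]
    if lang is not None:
        params.append(urlencode({'lang': lang}))
    # Fall back to ten results when no count is given, as the script does.
    rpp = 10 if resultnum is None else resultnum
    params.append(urlencode({'rpp': rpp}))
    return '&'.join(params)


print(build_search_params("python -monty -ball", 20, "en"))
# prints: q=python+-monty+-ball&lang=en&rpp=20
```

Note that the excluded terms are passed through to the search as well as being applied to the bios afterwards.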
Usage is as follows, assuming the script is named twitsheep.py:
Usage: ./twitsheep.py -u username [-p password] [-r search_result_number] [-l language]
terms ...
-l language = Filter search by language.
-p password = Optional. If not supplied, you will be asked for it.
-r search_result_number = Number of results to pull from twitter searches.
-u username = twitter username.
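To show how these flags are parsed, here is a small sketch that feeds getopt the same option string the script uses ('l:p:r:u:') with a sample argument list. The argument values are made up for illustration.

```python
import getopt

# Sample arguments as the shell would pass them (values are illustrative).
argv = ['-u', 'someuser', '-r', '20', '-l', 'en', 'python -monty -ball']

# A colon after a letter means that flag takes a value.
optlist, args = getopt.getopt(argv, 'l:p:r:u:')
optd = dict(optlist)

print(optd['-u'])    # the username
print('-p' in optd)  # False here, so the script would prompt for a password
print(args)          # whatever is left over becomes the search terms
```

getopt.getopt stops scanning at the first non-option argument, so excluded terms such as -monty are safe inside a quoted string or after a plain term. A query that starts with an unquoted -term would be read as an unknown option.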
Running the program without arguments also prints the usage. It is best to run this with cron or Scheduled Tasks no more often than every thirty minutes. By default it checks ten search results, but you can turn that up to about 30. If you start getting 400 errors, a.k.a. Bad Request, you are being throttled by Twitter's DoS protection. In that case, consider requesting fewer search results or waiting longer between searches.
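If you would rather have the script slow itself down when throttled, one possible approach is to retry with a growing delay. This is only a sketch and not part of the script: call_with_backoff is a hypothetical helper, and the retry count and starting delay are arbitrary assumptions, not values recommended by Twitter.

```python
import time

try:
    from urllib2 import HTTPError  # Python 2, as in the script
except ImportError:
    from urllib.error import HTTPError  # Python 3


def call_with_backoff(func, retries=3, delay=60, sleep=time.sleep):
    """Hypothetical helper: call func(), waiting longer after each HTTP 400."""
    for attempt in range(retries + 1):
        try:
            return func()
        except HTTPError as e:
            # Only a 400 is treated as throttling; give up after the last retry.
            if e.code != 400 or attempt == retries:
                raise
            sleep(delay)
            delay *= 2  # double the wait before the next attempt
```

With this in place, the call in the main block could be wrapped as call_with_backoff(lambda: follow_by_query(username, password, query, rpp, lang)). When running from cron, simply lengthening the schedule is probably the simpler fix.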
You can see an active test of this script here. It is running with this command line:
./twitsheep.py -u twitsheep -r 20 -l en "python -monty -ball"
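To see how a query like that is meant to be applied to a bio, here is a simplified, standalone sketch of the filter. The helper bio_matches is hypothetical and the sample bios are invented. It treats a leading '-' as "must not appear" and, unlike the script, escapes each term with re.escape so it is matched literally.

```python
import re


def bio_matches(bio, query):
    """Hypothetical helper: True if bio has every plain term and no '-' term."""
    if not bio:
        return False  # like the script, skip users without a bio
    for word in query.split():
        negated = word.startswith('-') and len(word) > 1
        term = word[1:] if negated else word
        found = re.search(re.escape(term), bio, re.IGNORECASE) is not None
        if found == negated:
            # Either a required term is missing or an excluded term is present.
            return False
    return True


query = "python -monty -ball"
print(bio_matches("Python developer and coffee drinker", query))  # True
print(bio_matches("Monty Python fan", query))                     # False
print(bio_matches("I keep a ball python as a pet", query))        # False
print(bio_matches("Ruby on Rails consultant", query))             # False
```

Matching is by substring, as it is with re.search in the script, so -ball would also reject a bio that mentions "football".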
If you have any features you would like integrated into this, please leave a comment.

interesting
Looks good, will try it sometime :)
The article is very good. Please write more.
I have been looking around for this kind of information. Will you post some more in the future? I'll be grateful if you will.
I've been needing to write a robust twitter submission server for a while. I'll post that when I get it done. :)
Very nice. I will have to take it for a spin. Thanks for writing this, btw.
Can we write the code without the 're'? I get uncomfortable at the sight of an re and, to be honest, it makes the entire code look ugly..~\~
Made a new post just for you. ;)
Loving it!! Will check this out shortly as well as the revisited version you did, lol. :).
(Love the design/colors!!)
--
Brie
I am new to the Twitter bot/spidering world and I think this was exactly what I was looking for. I run internet marketing for a client of mine and I wanted an easier way to interact with Twitter so I didn't have to run tedious tasks like following followers of another Twitter account. Thank you.