So @mitblogs_ebooks exists

x-posted from the MIT Admissions blog

Some of you may remember @horse_ebooks, a Twitter account that, before it was bought and subverted by Buzzfeed, was a truly delightful gibberish machine, spouting pseudorandomly generated spam tweets from a collection of source texts. Some of my favorites:



Fraudulent or not, @horse_ebooks helped inspire an entire genre of surrealist _ebooks-style Twitter bots, which actually do take source texts and produce randomly generated tweets inflected by the voice of various academics, journalists, and programmers. Because they are randomly generated, many, perhaps most, of these tweets aren't very funny. But some of them are really funny, if in an admittedly odd way, because while they are consonant in subject and voice with the source texts, they are probabilistically written in ways that the 'actual' authors never would. The practical result is that you get tweets which sound strangely familiar but are off just enough to be startling and (sometimes) funny. 

A few months ago I decided I wanted to make one for the blogs. Over the last few weeks, after reading and committee ended, I actually did. Here's how: 

First, I wrote a crude but effective scraper in Python. This script crawls the blogs, downloads every entry ever written, uses the BeautifulSoup library to parse the HTML, and writes each parsed line to a text file.
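
For the curious, here is a minimal sketch of the parse-and-write step described above. It is not the original scraper: to stay free of third-party dependencies it swaps BeautifulSoup for Python's built-in html.parser, it leaves out the crawling and downloading entirely, and the names EntryTextExtractor, html_to_lines, and append_lines are made up for illustration.

```python
from html.parser import HTMLParser


class EntryTextExtractor(HTMLParser):
    """Collects visible text chunks, skipping <script> and <style> contents.

    Stand-in for BeautifulSoup; the class name is hypothetical.
    """

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.lines = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Collapse runs of whitespace so each text chunk fits on one line.
        text = " ".join(data.split())
        if text and self._skip_depth == 0:
            self.lines.append(text)


def html_to_lines(html):
    """Return the visible text of one HTML page as a list of lines."""
    parser = EntryTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.lines


def append_lines(lines, path):
    """Append the parsed lines to the source text file, one per line."""
    with open(path, "a", encoding="utf-8") as corpus:
        for line in lines:
            corpus.write(line + "\n")
```

Calling html_to_lines on each downloaded entry and passing the result to append_lines would build up the single text file that the generator reads. Note that text split by inline tags (a link in the middle of a sentence, say) comes out as separate lines in this sketch.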

Then, I cobbled together a tweet generator in Ruby. This script takes the text file as a source, uses the MarkyMarkov gem to map probabilistic word relationships, randomly generates sentences, rolls a D20 to decide if they should be SHOUTED IN ALL CAPS, and posts the final result to Twitter.
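
The generator itself is Ruby and leans on the MarkyMarkov gem, but the underlying idea fits in a few lines of Python. What follows is a hand-rolled, first-order word chain, not MarkyMarkov's API. The 140-character cap and the rule that only a natural 20 triggers ALL CAPS are assumptions, the function names are hypothetical, and posting to Twitter is left out.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words that follow it in the source."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return dict(chain)


def generate(chain, rng, max_chars=140):
    """Walk the chain from a random starting word.

    Stops at a word with no recorded successor, or when adding the
    next word would exceed max_chars (assumed tweet length limit).
    """
    if not chain:
        raise ValueError("source text needs at least two words")
    word = rng.choice(sorted(chain))
    out = [word]
    while word in chain:
        # Repeated successors appear repeatedly in the list, so common
        # word pairs are proportionally more likely to be picked.
        word = rng.choice(chain[word])
        if len(" ".join(out + [word])) > max_chars:
            break
        out.append(word)
    return " ".join(out)


def maybe_shout(tweet, rng):
    """Roll a D20; on a 20 (assumed threshold), SHOUT the tweet."""
    roll = rng.randint(1, 20)
    return tweet.upper() if roll == 20 else tweet
```

Usage would look something like maybe_shout(generate(build_chain(source_text), rng), rng) with rng = random.Random(). Passing the random number generator in explicitly makes the output repeatable when you seed it, which helps when debugging something that is random by design.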

I uploaded the source text and the Ruby script to scripts, a free hosting service operated by MIT students for the MIT community, and set up a cron job to run it every three hours.

TL;DR: @mitblogs_ebooks is now a thing. Everything it says is randomly generated from a source text of every blog entry ever written. I like to think of it as admissions advice from an alternate universe, spoken not by any single blogger but by the rumbling chorus of a collective, semi-sentient blogger organism:



So there you go. I had never used Ruby, or MarkyMarkov, or a lot of other things before I began this piece of carpentry, but I personally find that trying (and failing, and trying again) is the best way to learn. In making @mitblogs_ebooks, I learned a lot, and sometimes the thing I made even makes me laugh because of how weird it is, which is an added bonus. If you want to try your own exercise in computationally generated weirdness, you can download my code here. Happy making! 

Chris Peterson

About Chris Peterson

Chris Peterson works, teaches, and researches at MIT. As an Assistant Director of Undergraduate Admissions, he oversees MIT's recruitment and evaluation of top academic and technical talent, as well as advising on communications strategy and strategic initiatives; as a Lecturer in CMS/W, he has taught a popular course surveying social media research and scholarship; as an affiliate of the Center for Civic Media, he helps to lead the Mapping Information Access project. He also serves on the Board of Directors at the National Coalition Against Censorship and as a Fellow at the Digital Ecologies Research Project. When not on campus you can probably find him eating hamburgers nearby. Thesis: User-Generated Censorship: Manipulating the Maps Of Social Media