Monthly Archives: April 2013

Using PyInstaller to make EXEs from Python scripts (and a 48-hour game design compo)

How to Create a Single Windows Executable from a Python and PyGame Project (Summary)

Here’s how you use PyInstaller and PyGame to create a single-file executable from a project that has a data directory that contains resources like images, fonts, and music.

  1. Get PyInstaller.
    • On Windows, you might also need pywin32 (and possibly MinGW if you don’t have Visual Studio).
    • On Mac OS X, you will need XCode’s command line tools. To install the Command Line tools, first install XCode from the App Store, then go to Preferences – Downloads and there is an option to download them there.
  2. Modify your code so that whenever you refer to your data directory, you wrap it using the following function:
    def resource_path(relative):
        if hasattr(sys, "_MEIPASS"):
            return os.path.join(sys._MEIPASS, relative)
        return os.path.join(relative)

    An example of usage would be

    filename = 'freesansbold.ttf'
    myfontfile = resource_path(os.path.join(data_dir, filename)

    This is mostly for convenience – it allows you to access your resources while developing, but then it’ll add the right prefix when it’s in the deployment environment.

  3. Specify exactly where your fonts are (and include them in the data directory). In other words, don’t use font = Font(None, 26). Instead, use something like font = Font(resource_path(os.path.join('data', 'freesansbold.ttf')), 14).
  4. Generate the .spec file.
    • Windows: (You want a single EXE file with your data in it, hence --onefile).
      python --onefile
    • Mac OS X: (You want an App bundle with windowed output, hence --windowed).
      python --windowed
  5. Modify the .spec file so that you add your data directory (note that these paths are relative paths to your main directory.
    • Windows: Modify the section where it says exe EXE = (pyz, and add on the next line:
      Tree('data', prefix='data'),
    • Mac OS X: Modify the section where it says app = BUNDLE(coll, and add on the next line:
      Tree('data', prefix='data'),
  6. Rebuild your package.
    python your_main_file.spec
  7. Look for your .exe or your .app bundle in the dist directory.

Phew! That took me a long time – the better part of a few hours to figure out. This post on the PyInstaller list really helped.

So why was I trying to package a Python executable file anyway? Read on…

Ludum Dare 26: 48-hour Game Design Compo

This weekend, I decided to participate in a 48-hour game design “competition”. Ludum Dare is a compo that asks you to create a video game from scratch in a 48-hour time period – you have to write your code and create all of your assets in that time period.

This means no reusing graphics, pictures, music, or sound from other projects, for example. You’re also not supposed to reuse code either. I decided to participate on the Thursday the day before. Most people use the previous weekend as a “warmup weekend” to test their tools, get some practice, and so forth. (My entry is located here, by the way).

I’ll do a more detailed compo writeup later, but I just want to concentrate on one thing that kept me up for hours after the competition: getting a Windows executable created from a Python project that uses PyGame and a data directory.

Python, Distribution, and You

I rather enjoy Python as a programming language. The syntax is reasonably concise, the language does a lot of things for you, and it’s well-laid out. There’s also a lot of good support in the form of third-party libraries. I’ve been using Python for various things for the past few years (usually small scripts for data extraction and analysis in research).

One thing I had never thought about before was distributing a Python project as an executable package, and while it was on my mind throughout the entire compo, I didn’t actually learn the process of creating the package until the last hour of the comp before submission. After you submit your primary platform, Ludum Dare allows you around 48 hours to compile for Windows, since the majority of reviewers use Windows.

The ideal submission is a single binary file (an .exe file for Windows) that doesn’t have to extract a lot of data, so that it’s easy for people to download and run your game.

PyInstaller vs. Py2exe vs. Py2app

I went on a wild goose chase trying to find out how to make a single executable file out of a Python project that would include all of my data assets. I first tried py2exe and py2app. py2app mostly worked all right, but py2exe was a pretty big mess.

The end story is that PyInstaller is newer and shinier than py2exe, and that you need to secret sauce code that someone out there on the Internet found before I did. PyInstaller basically runs EXE files by extracting the assets into a temporary data file that has a path _MEIPASS in it ((technical details here). Be sure that you check that every file is loaded in through that wrapper. The Tree() TOC syntax was also confusing, but basically, it’s the relative path of your data files and it will automatically load all of the files in that directory. Make sure it exists in the EXE portion (Windows) or the APP portion (Mac).

There’s a Make/Build cycle in PyInstaller to generate the spec file and build it in a single step as well – I find it easier to do that to generate the spec file and do an initial binary run, then to modify the spec and run PyInstaller again with the spec file as the argument. PyInstaller is pretty smart about rebuilding, and you save a lot of time.

I think in the long run, if you compare py2exe, py2app, and PyInstaller, PyInstaller is the program worth learning. It did have a pretty sharp curve for me – it didn’t help that I was trying to do this late at night after a challenging weekend!

If you do wish to use py2app to build your Mac OS X application bundle, then do keep in mind that you need to have a import pygame._view because of some kind of obscure issue.

Anyway, that’s all there is to this post for now.


Here’s the I used for py2app.

from setuptools import setup

APP = ['']
DATA_FILES = ['data']

    "argv_emulation": False,
    "compressed" : True,
    # "iconfile":'data/game.icns',        

    options={'py2app': OPTIONS},

The Whats and Hows of Programmers’ Foraging Diets: What Types of Information are Programmers Looking for?

Information seeking is one of the most important activities in human-computer interaction! One of the most influential theories in understanding, modelling, and predicting information seeking is information foraging theory. In our research, we want to understand what kinds of diets – that is, the types of information goals programmers seek while debugging. By investigating the information diets of professional programmers from an information foraging theory perspective, our work aims to help bridge the gap between results from software engineering research and Information Foraging Theory foundations as well as results from human-computer interaction research.

A pork chop taken by johnnystilletto on Flickr

Is this tasty?

A head of broccoli by Jim Mead

Is this tasty?

My co-author, David Piorkowski, is travelling soon to Paris to present our latest work: “The Whats and Hows of Programmers’ Foraging Diets”. It’s a great time to expand on this paper. Here’s the PDF Preprint!

Our Method

We had two coders examine video of nine professional programmers to identify what exactly they were looking for when trying to fix a bug in an unfamiliar open-source program. We tried to identify their overall diet by identifying if they asked questions (and received answers) belonging to one of four categories: (1) finding a place to start in code, (2) expanding on that initial starting point, (3) understanding a group of code, or (4) understanding groups of groups of code.

What is a programmer’s diet while debugging?

Overall, we found that programmers spend 50% of their debugging time foraging for information.

Surprisingly, even though all participants were pursuing the same overall goal (the bug), they sought highly diverse diets. For example, Participant 2 asked mostly about groups of groups, Participant 3 asked about finding a place to start, Participant 5 didn’t really ask about anything at all, and Participant 6 also looked for a place to start. This suggests a need for debugging tools to support “long tail” demand curves of programmer information.

How did a programmer consume these diets?

How exactly did programmers go about finding what they wanted to consume?

Again, participants used a diverse mix of strategies. Participants spent only 24% of their time following between-patch foraging strategies (such as code inspection or simply reading the package explorer straight up-and-down), but between-patch foraging (such as doing data flow or control flow) has received most of the research attention.

Surprisingly, search was not a very popular strategy, accounting for less than 15% of participants’ information foraging – and not used at all by 4 of our 9 participants—suggesting that tool support is still critical for non-search strategies in debugging!

Whats Meets Hows

Participants stubbornly pursued particular information in the face of high costs and meager returns. Some participants followed a single pattern over and over again, using the same strategy. For example, in the cases that involved a programmer looking for Type 1-initial goals, participants used code search and spatial strategies extensively but not particularly fruitfully. This emphasizes a key difference between software development and other foraging domains: the highly selective nature of programmers’ dietary needs!


Thus, we considered what programmers want in their diets and how they forage to fulfill each of their dietary needs. Our results suggest that the diet perspective can help reveal when programming tools help to reduce this net demand—and when they do not—during the 50% of debugging time programmers spend foraging.

References and Links

Are you going to be at CHI 2013? Where and when is David’s talk?  It’s on Thursday, May 2, at 11:00 in Room Blue… be there!

D. Piorkowski, S. D. Fleming, I. Kwan, M. Burnett, C. Scaffidi, R. Bellamy, J. Jordhal. The Whats and Hows of Programmers’ Foraging Diets, to appear in ACM Conference on Human-Computer Interaction (CHI), Paris, France, 2013. PDF Preprint

Our paper on the CHI 2013 web site

And… in case you haven’t seen it yet, the video preview!

Picture of tasty pork chop by Johnny Stilleto. Picture of tasty broccoli by Jim Mead.

Creating CHI Video Previews on the Cheap

Our submission to the ACM Conference on Human Factors of Computing was accepted a while ago. As part of that submission, we were also required to create a video preview for the conference.

This presented to us a few logistical problems. First, none of us had camcorders. We had iPhones, which can record video in a pinch, but that can be rather spotty as far as recording goes. Second, we did not have a lot of time in which to arrange to do principal photography or other setups that would require fieldwork. The CHI video is not particularly long – under one minute – but it still requires us to know what we’re shooting, and to ensure that we can do this all in a clean, professional manner. When you have a week, and when you’re also considering the camera-ready version of a paper, trying to direct a video is a lot of work on top of that.

We decided to take a simpler route and use an animated video instead. This presented its own suite of problems. The first one was that no one on our team had computer-based animation experience in Flash or other art tools. Thus, it was up to us to figure out how we could do this nicely with inexpensive, off-the-shelf software.

First, here’s the video…!

And here’s the preprint of the paper! D. Piorkowski, S. D. Fleming, I. Kwan, M. Burnett, C. Scaffidi, R. Bellamy, J. Jordhal. The Whats and Hows of Programmers’ Foraging Diets, ACM Conference on Human-Computer Interaction (CHI), Paris, France, 2013.

Getting a decent video preview together using inexpensive software

If you want to make an animated video and don’t know other animation tools, you can build something reasonable in Apple Keynote and iMovie! With some high-resolution graphics, transitions and Keynote actions, good builds, and a high-resolution export, we were able to create something that, while it’s not going to win us any awards, serves as a presentable video preview.

I used Apple Keynote 5 because it exports high-frame Quicktime movies. I tried to use PowerPoint on various systems, but it exported movies that were not at a high-enough framerate to animate transitions properly.

The main workflow works like this:

  1. Use Keynote to create an animated presentation, then export it to a MOV file.
  2. Record yourself talking about the slides using a headset and Audacity
  3. Use iMovie (included for free in all Macs) to match the talking to the slides

It should be noted that if you do happen to know how to use a professional animation and video tool like Adobe Flash, Adobe Premiere, Apple FInal Cut Pro, etc., you should use those tools instead! This guide is meant for people who don’t have time to learn a real animation suite or don’t have something to field or screen-record.

When you create your presentation

We decided to use an animated figure doing some actions, some text-based transitions, and a voiceover. These won’t get us the “CHI video preview of the year”, but they’ll communicate the idea across! David, in this case, drew a few initial pictures that I used for the basis of the video.

He gave me static images, so the first thing I did was use an image editor to cut out the hands to “animate” them. I believe that’s the extent of the actual hand-drawn animation in the video. The other main animation that you see is the IFT “magnifying glass”, which is animated using Keynote’s build effects.

Here are some tips:

  • Use the effects for moving, building in, and building out liberally. These make your presentation look like an actual video with movement instead of just a set of still slides. Make sure you set them to fire automatically and after an appropriate delay. Here’s some ideas:
    • Swap two images using a really fast (0.1 second) Dissolve effect to get minor animations going.
    • Use a PNG with a partly-transparent background that moves in front of the scene – this is great for magnifying glasses, speech bubbles, and so forth.
    • Use actions such as “move” and “rotate” at the same time to add more dynamic movements.
  • Be sure that the slides advance automatically. Put a lot of “time” between your slides. Even after exporting the times won’t be accurate, they’ll be off by a few fractions of a second, and that’s enough to mess things up. You’re going to need to fix these manually in iMovie.
  • Break up everything into separate slides. If you can afford to have something not animated on the screen (good for static builds), then you can take a still screenshot and insert that into iMovie – it should seamlessly transition.

Even if you get all of your timings right within the presentation itself, the Quicktime movie that Keynote exports will still not have good timings. This is simply due to limitations with frames per second and how fast you can get transitions to fire in Keynote. You’ll want to edit the resulting file in iMovie.

Editing the Audio

My colleague David recorded the audio. I asked that he use a headset and that he record in a quiet room. Doing so will save you a lot of headaches! If possible, you want to record in a reasonably small room without any noises (closets and bathrooms work great in an emergency!). I don’t know how long it took him to record the video, but it was a reasonably clean recording.

I used Audacity, an open-source program, for my audio editing. I personally sliced up the audio into separate sentences, it made it easier to match the beginning of each sentence to the actual transition. I also took the time to edit out any “ums” or “ers” that were in the reading. His voiceover was pretty good overall but I was able to fine-tune it.

Once that is done, you have to match up the audio with the video in iMovie. This is a reasonably tedious, manual process. iMovie is accurate to something like a tenth of a second, and likes to nudge things to the closest 0.4 seconds, so you do want your video to be liberal enough to not go up against these limitations. Use the “clip adjustments” and “audio adjustments” settings. If necessary, slice up the audio into smaller segments and drag and drop them into iMovie at the right points to fine-tune.


I added music to my video. Be aware of licensing arrangements! When you submit a video to the ACM, they require you to have proof of copyright usage of any media, which includes the video and music. In my case, I asked permission from the original author. (The track, by the way, is called World of Snow by DDRKirby(ISQ)).

One thing you don’t want is for your music to drown out your speaker. iMovie almost automates this process, though! There’s an iMovie feature called “ducking” that reduces the volume of the music while someone is talking – this is a common radio technique. For all of your voiceover tracks, set them to “duck other audio tracks” to a small value, like 8%. If you happen to have any gaps in your voiceover, you may need to insert some silence so that your audio doesn’t duck in and out while your speaker pauses.

In Conclusion

I am by no means a professional video editor. No one on our paper-writing team was! That was what led us down this path to creating a short video on the cheap, but the reality is that a lot of people out there aren’t professionals, and many had not ever edited a video clip or tried creating a movie. I think this just underlines the fact that researchers in any discipline have to have a large number of diverse skills, and require the ability to be resourceful and adaptable. We couldn’t afford the time or energy to go shooting video or learning video-editing software, but we also wanted something that was reasonably engaging and that would stand on its own!

I hope that some of the content on this particular post will help some others out there who are working on brief video clips or previews of their own content, be it research papers or other work. Enjoy!

Automating the Web with Selenium: Complete Tasks Automatically and Write Test Cases!

While teaching Software Engineering I during the Winter 2013 term, I learned of a web testing suite named Selenium.

I was on the lookout for a good unit testing suite for Javascript. I had previously been introduced to the YUI Testing Framework, which provides a console and enables you to easily write and run tests from a browser window, but one limitation of YUI is that, out of the box, it doesn’t support interaction with the site itself. So, while the basic usage is good for verifying libraries and similar, I wanted something that was able to interact with page elements in a different way.

Enter Selenium. Now, here’s a way to not only automate web interactions, but to make them into automated unit tests for my research projects. This post is going to serve as a brief introduction to Selenium and how to start using it to interact with websites.


Selenium is a web automation framework that enables a user to essentially script a web site. In this post we’ll talk about using Selenium version 2, which is the main, up-to-date version of the software.

There are two main ways for you to interact with a website.

The Selenium IDE: Recording actions by Demonstration

First, you can use the Selenium IDE, which is a Firefox plugin that allows you to demonstrate interactions with a web site. You essentially record actions and then as you click around the page, type in form elements, and push buttons, the IDE records the actions for you in a pseudo-markup language. (If you happen to follow my other research, it is actually somewhat similar to CoScripter, another program-by-demonstration tool for the web, but with better web interaction and fewer features. For example, Selenium cannot store temporary data into tables.)

Thus, even with minimal web development or programming experience, you could create a script for Selenium that plays back, for example, a series of clicks on a page, completes a survey, or similar.

There are some limitations with the IDE. The main one that led me to using the WebDriver (below) is that it doesn’t interact very well with “contentEditable” div tags and other HTML5 elements.

Note that the Selenium IDE is version 1.x, which does NOT correspond to Selenium itself (which is version 2.x).

The Selenium WebDriver: Programming actions in code

The second way to interact with a website is using the Selenium Webdriver, which is a driver that essentially launches a website and then enables you to look through that website’s DOM to interact with elements on the page. Thus, you can use your own browser, like Firefox or Internet Explorer to load and navigate a web site.

I wanted Selenium to be able to work with a highly interactive web app: Gidget. Gidget is a programming game for kids and teenagers that I am working on in collaboration with Andy Ko and Michael Lee. Gidget runs with a lot of HTML5, JQuery, and Javascript and is extremely visual – something that seems perfect for Selenium.

Unfortunately, since Gidget isn’t available publicly yet, I can’t actually put tests on the blog that’ll directly run Gidget so instead I’ll use Google.

Building a Selenium Script

Selenium has a number of bindings in Java, Ruby, Python, and Javascript. I personally chose to use Python – I like its concise syntax and the fact that it has pretty good library support. I’ll focus exclusively on Python in this particular post. Most of the Java bindings can be derived from the Python commands if you remove the underscores and instead use CamelCase – for example, “element.is_enabled()” in Python would be “element.isEnabled()” in Java.

To install Selenium, you generally need to only type

pip install selenium

assuming that Python exists already on your system.

The first place I’m going to point you at is the Selenium Documentation Example of WebDriver, where they already include an example of making a query on Google. Here’s a reproduction of the code.

from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from import WebDriverWait
from import expected_conditions as EC

# Create a new instance of the Firefox driver
driver = webdriver.Firefox()

# go to the google home page

# find the element that's name attribute is q (the google search box)
inputElement = driver.find_element_by_name("q")

# type in the search

# submit the form (although google automatically searches now without submitting)

# the page is ajaxy so the title is originally this:
print driver.title

    # we have to wait for the page to refresh, the last thing that seems to be updated is the title
    WebDriverWait(driver, 10).until(EC.title_contains("cheese!"))

    # You should see "cheese! - Google Search"
    print driver.title


It works exactly as advertised and opens up a web browser, goes to Google, and then searches for “cheese!”. However, there are a number of features that are essential to actually testing web pages.

Waiting for Pages

The unfortunate reality of web pages nowadays is that you have to wait a lot. Whether it’s a form submit where you have to wait for the POST request to complete, or some really slow JQuery fading box, not everything you want to interact with is available. To get around this, you have to use the “wait” commands in Selenium.

The Selenium documents do provide a few examples but I had to do a lot of searching and testing to get things working so I’ll just provide my use cases here directly.

To wait for an element to appear in Selenium, you need to provide an explicit wait along with a condition. It basically waits until either the identified element loads, or until the timeout passes (at which point it will throw an exception). There is an example of that in the Selenium example above, but if you have Google Instant turned on, you’ll realise notice that Google now returns search results to you as soon as you start typing. How can you interact with page elements if you don’t even know what’s going to pop up, when?

In this case, we’re going to wait until the “Search Results” text pops up.

from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from import WebDriverWait
from import Select
from import By
from import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException

if __name__ == "__main__":
    driver = webdriver.Firefox()
    wait = WebDriverWait(driver, 100)
    inputElement = driver.find_element_by_name("q")
    inputElement.send_keys("Irwin Kwan")
    wait.until(EC.element_to_be_clickable((By.XPATH, "//a[@href='']")))
    blog = driver.find_element_by_xpath("//a[@href='']")

Basically, the driver loads, we set up a “wait” that waits up to 100ms, and then we do a search on Google for my name. Note that we don’t actually submit the form – instead, we wait for the link to my blog to appear in the Google search results, find the element, then click on it. And yes, this is a cheap way to rack up those hits for my blog, so everyone run this code. 😉

There are a lot of expected conditions available in Selenium and this is probably the most important thing to be aware of when first starting. You can’t interact with page elements unless they’re available, after all. Here are some of the useful ones:

  • EC.element_to_be_clickable: Wait for the element to be clickable. Good for elements that aren’t always visible or enabled. I use this a lot in the game.
  • EC.visibility_of_element_located: Wait for the element to be visible. A lot of pages load all of their content, but hide it from the user. Here’s how you ensure that what’s being interacted with is actually visible.
  • EC.presence_of_element_located: The element exists somewhere on the page.

There are a lot of these expected conditions. A full list is available on the API document page.

The second key aspect here is the interaction. To do this, you have to “find” the element. So far, I’ve used two main ways of finding an element: ID, and XPath.

Finding elements by ID

Finding an element by ID is pretty much what it sounds like: searching for an element using its ID tag. ID tags are unique in the DOM and are therefore ideal for searching and testing. You can use it like this:


Finding elements by XPath

XPath is a query language designed to navigate XML. I won’t go through all of the details of XPath here, but I’ll present a few basic use cases. You basically can search on a number of conditions that you specify in the language so you can see if that element exists in the DOM.

  • If you want to search if a specific tag exists:
  • If you want to search for nested tags:
  • If you want to search that text within the tags matches:
    driver.find_element_by_xpath("//h1[text()='Heading 1']")
  • If you want to search for text in attributes
    driver.find_element_by_xpath("//img[@title='Irwin Kwan']")

With these two find_element commands, you can find most of what you need in your web pages. Selenium supports a number of other ways to search, including searching by CSS, but I haven’t needed to use it yet.

Dynamic Interactions with a Page

Another issue I encountered with the Gidget game is clicking through introduction text when I didn’t know how many pages were present. Essentially, the problem is that there’s a button that I have to push on the page, and if I push it a certain number of times, it’ll be disabled. However, I don’t know how many times I have to push it because it might be different each time.

I managed to get around this with a little fragment of code below:

wait.until(EC.element_to_be_clickable((By.ID, 'main_buttonMissionTextNext')))
while EC.element_to_be_clickable((By.ID,'main_buttonMissionTextNext')):
    if not driver.find_element_by_id("main_buttonMissionTextNext").click().is_enabled():
    wait.until(EC.element_to_be_clickable((By.ID, 'main_buttonMissionTextNext')))

There were two gotchas here: First, I wasn’t aware of it at the time, but the button was actually regenerated whenever you clicked it, so I had to search for it again when I clicked it. Second, I didn’t realise that there was an is_enabled() function you could use to test if an element is enabled or not. But now, you know!

I posted a StackOverflow question about this (which I ended up answering myself).

Using the Selenium IDE and the Selenium WebDriver Together To Save Time

While writing code is nice and fun, HTML pages are very large and are covered with tags with various IDs. It becomes tedious very quickly, even with good web development debugging tools, to search through the DOM elements to identify what you need to interact with, then writing the code to search and click on it.

So work smarter, not harder and use the Selenium IDE. If you record your actions with a page and save it in the Selenium IDE, you can use “Export Test Case As… > Python 2 / unittest / WebDriver”. Now you have Python code for that series of actions that you just performed and can integrate it into your other tests.

In my case, I used the Selenium IDE to automatically complete an exit survey in Gidget, then exported it and used the code in my other test suites. It’s a great way to save time writing code.

Making your code into a Test Suite

In Python, the unit testing is built in. You simply have to import unittest, create a class that extends it, and then write your setup and teardown functions, along with a method beginning with “test”. Here’s the previous code for interacting with Google converted into a Python test suite.

import unittest

from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from import WebDriverWait
from import Select
from import By
from import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException

class WebTester(unittest.TestCase):
    def setUp(self):
        self.driver = webdriver.Firefox()
        self.wait = WebDriverWait(self.driver, 100)
    def test_load_blog(self):
        inputElement = self.driver.find_element_by_name("q")
        inputElement.send_keys("Irwin Kwan")
        self.wait.until(EC.element_to_be_clickable((By.XPATH, "//a[@href='']")))
        blog = self.driver.find_element_by_xpath("//a[@href='']")
        self.wait.until(EC.visibility_of_element_located((By.XPATH, "//h2[text()='Irwin Kwan']")))
        self.assertTrue(self.driver.find_element_by_xpath("//h2[text()='Irwin Kwan']"))
    def tearDown(self):
if __name__ == "__main__":

If you place multiple test suite classes in the file, unittest will run them all. Selenium will launch a new, fresh browser instance for each one as well.

Conclusion: Web Automation is Pretty Cool

I’m really just a new user. Selenium has a number of features that I haven’t needed and therefore don’t know much about. There’s a “remote control” mode that allows you to use a separate server to run tests for you. There are ways to store session variables, load specific Firefox profiles with add-ons, and there is a “Selenium server” mode as well. If you need these features, chances are you’ll be able to find information about them on the Selenium site documents.

I feel that this information should set most people up with enough information to get started with Selenium and making it work for them in a useful way. I hope that this post is useful to you guys!

Happy automating!

Reading and Writing just got easier with Markdown

I need to quickly take notes and share them as part of my job. It may be keeping a log during meetings, recording field notes during an observation session, or just doing preliminary notes for a web page I’m posting. However, a lot of the existing tools on computers have limitations.

  • Plain text is easy to write and generally easy to read, but there’s a lack of formatting. You can’t identify headings, create tables, insert images, or easily create hyperlinks. You can keep that information using some kind of markup language…. like HTML.
  • HTML is standard. It’s all over the place, you can format your pages nicely, and it does a reasonably good job at separating information from presentation, especially if you stick with the vanilla tags. Everyone can view HTML on their browser as well, so it’s very potable. Unfortunately, HTML is also really verbose and difficult to write. You can’t easily add linebreaks or headings. Miss a close tag and you’re toast. You have to type a LOT. You can ease this by using a WYSIWYG editor, but then your HTML markup becomes very ugly, and you’re committed to using that editor.
  • LaTeX is also very flexible but is mostly limited to PDF output (yes, you can output to HTML, but that requires a non-trivial amount of commitment to installation and setup).
  • Microsoft Word is easy to use, but is not portable to systems without Word. It exports to HTML but also has reasonably ugly HTML output. It also takes a really long time to launch and is difficult to check into a version control system.

So what is the solution? Well, I’ve discovered Markdown a few months ago, which is a lightweight markup language that is readable as plain text, but allows easy conversion to outputs like HTML. So what this means is that you can write your documents in a plain text editor, read them in a plain text editor, but also convert them easily into a web-viewable format. I’m beginning to take notes and make initial drafts of documents in Markdown because you can easily go from Markdown to almost anything else by either copy-and-pasting it, or by converting it to your target output format.

On Mac OS X, I use a command-line tool called MultiMarkdown by Fletcher Penny which is easily installable if you have Homebrew ( Then, you can either convert your documents using the command line, or use one of many text editors set up to compile your plain text into something else, like HTML.

I personally use the TextMate bundle that’s supplied by the author of MultiMarkdown. I can easily output my text file as HTML with a keyboard command, and then I can move that onto a website or similar if needed. Just today, I set up TextMate to write a timestamp when I press “ll<tab>” (that’s two lowercase letter Ls followed by a tab) so that I can easily take notes during a user study. Now, I don’t have to watch the time and try to write it in. And, since it’s in Markdown, the time shows up as bold text underneath some nice headers. If I want, I can copy and paste my original notes into Excel, post them online, or just look at them in the plain text editor without seeing all of the ugly HTML associated with it.

I’m not sure what the Windows equivalent is, but there are a lot of Markdown text editors out there and even Javascript client-side applications that allow you to write Markdown and show you a real-time preview of the Markdown in HTML! That is pretty nifty.

There are a lot of Markdown tutorials on the web, but here’s a brief command reference just to emphasize its simplicity.

Writing Headings

# This is a Level 1 heading.

## This is a Level 2 heading.

### This is a Level 3 heading. 

You can also Underline your headings.

Creating lists

* List item
* List item
    * Nested List Item
1. Numbered item
2. Numbered item


Here is some [text of the link]( "Example")

You could also use reference-style [labels][1].

[1]: "Example"


![alt text](/path/to/image.jpg "Title")

Code spans

This is a Unix command: `find . -type d -exec chmod 755 {} \;`

You can also include larger code blocks by indenting code with 4 spaces or 1 tab.


**Bold text** __Also bold text__ *Italic text* _Also Italic text_

MultiMarkdown also has syntax for tables and equations that I haven’t yet had a need for (and therefore haven’t learned), but I suspect that could come in handy as well.

In any case, I am feeling already that Markdown is going to make my life easier because it’s lightweight, you can write and read it anywhere, and it is easily transformed to useful output formats. I like to keep my thinking about formatting separate from content, and this enables me to do so in a lightweight way. While LaTeX is nice for print documents and Word is good for collaborating with others, having a simple system for quick notes is really convenient as well.