
Do People Seek Information Like Animals Forage for Food? An Introduction to Information Foraging Theory

Much of my research at Oregon State University examines debugging using a lens called Information Foraging Theory. I’ve written a few posts on this topic but I haven’t really given a good overview of what Information Foraging Theory is and what it provides for software engineering.

The theory, in a nutshell, is a theory of human behaviour that describes how people forage for information. People are theorized to forage in a way that provides maximum benefit for minimum cost, and to make decisions based on input from the environment that affects this cost/benefit ratio. This theory is applicable to software engineering because software engineering is a very information-seeking-intensive activity. People spend a lot of time looking for things, from “What does this variable do?” to “Where can I start investigating this problem?”

Another reason why this theory is valuable in software engineering is that software engineering research is often built on ideas but not necessarily on underlying theories. Information foraging theory provides a theoretical framework that can help consolidate previous results, explain why previous tools and findings have worked, and predict how people may behave in the future.

Now that we have an idea of what it is and why it’s relevant to software engineering, let’s dive into what information foraging is. Much of this post is adapted from An Information Foraging Theory Perspective on Tools for Debugging, Refactoring, and Reuse Tasks, which appeared in the ACM Transactions on Software Engineering and Methodology (TOSEM), 2013. In another post, we’ll talk about how it relates to software engineering research.

Information Foraging Theory: What it is

Information Foraging Theory was originally proposed by Peter Pirolli and Stuart Card at what was then Xerox PARC to explain how individuals search the web for information. The idea was inspired by ecology’s Optimal Foraging Theory, which holds that foraging animals attempt to maximize their energy intake (by finding food) over the time required to find that food.

Constructs and Theory

In Information Foraging Theory, the human, called a predator, is looking for information in an environment, like the web. A predator can seek information from an information source, called an information patch. Many patches make up an information topology. Patches are connected to each other through links, and each link has a certain cost to go from one patch to another. Within each patch, there are information features. These features might be words or sentences on a screen, graphics and pictures, icons, even colours and shapes.

A rounded, shaded rectangle contains hexagons with numbers inside them. Some of these hexagons are associated with outgoing links to other shaded rectangles that each have their own hexagons with numbers in them. The links have a number on top of them representing the cost of traversing the link.

Information patches (shaded boxes) in an information topology. In each information patch, there are features (hexagons) with a numerical value. Some of these features are attached to links (dashed line). Each link navigates to a different patch and has a cost.
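If it helps to see the figure as a data structure, here is a minimal sketch of a topology in Python. The class names (Link, Feature, Patch), the patch names, and all of the numbers are made up for illustration; they are not taken from the figure or from the TOSEM article.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Link:
    target: str   # name of the patch this link leads to
    cost: float   # cost of traversing the link (for example, seconds)


@dataclass
class Feature:
    value: float                 # how valuable this feature is to the predator
    link: Optional[Link] = None  # set only when the feature is attached to a link


@dataclass
class Patch:
    name: str
    features: List[Feature] = field(default_factory=list)

    def links(self):
        """Return the outgoing links attached to this patch's features."""
        return [f.link for f in self.features if f.link is not None]


# A hypothetical topology: patch A links out to patches B and C.
topology = {
    "A": Patch("A", [Feature(3), Feature(1, Link("B", cost=2)), Feature(5, Link("C", cost=4))]),
    "B": Patch("B", [Feature(7)]),
    "C": Patch("C", [Feature(2), Feature(4)]),
}

for link in topology["A"].links():
    print("A ->", link.target, "costs", link.cost)
```

Keeping the link on the feature, rather than on the patch, mirrors the idea that some features act as the visible "handle" for a link.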

The predator has an information goal in mind and wants to seek information that satisfies that goal. This predator forages through the information topology seeking prey, which are information features that are related to the predator’s goal.

The activity of getting at information has a cost (usually time) but consuming information from a source also has an associated value (how relevant or important the information is). After consuming some amount of information (which is called prey), the predator may decide that it’s no longer worth the predator’s time to continue processing that patch and the predator navigates away from the patch to a new one that is considered more valuable.

Some information features are connected to links. In web pages, links are usually located in particular places, are coloured differently, and are sometimes underlined when you mouse over them. These features are called cues. A predator can use these cues to try to predict the value of the information on the other side of a link.

Three-panel representation of a developer looking at a screen of information. In the first panel, the developer is staring at a panel at the top of the screen. In the second panel, the developer is choosing to move to a new part of the same screen. In the third panel, the developer has chosen an alternate route of changing the view to look at an entirely new screen.

A developer decides whether to continue foraging in the same screen of information, or whether to refresh a view (which has a cost) and get new information.

So, a developer who is foraging for information has to make a decision whether to stay within the current patch and continue processing the information in it or to access a different patch and process information from there. To make the optimal decision, the developer wants highest value information for the lowest cost!

If we decide to use math to represent this relationship, it looks like this:

$$\text{Predator's desired choice} = \max\left(\frac{V}{C}\right)$$

The predator wants to maximize value V of processing information and minimize the cost C of travelling to find information.
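As a rough sketch of that max(V / C) choice, the snippet below picks the option with the best value-to-cost ratio. The option names and the (V, C) numbers are invented for illustration, and costs are assumed to be strictly positive.

```python
def best_choice(options):
    """Return the name of the option with the highest value-to-cost ratio.

    options maps a name to a (value, cost) pair; cost must be > 0.
    """
    return max(options, key=lambda name: options[name][0] / options[name][1])


# Hypothetical (V, C) pairs for a predator deciding what to do next.
options = {
    "stay in current patch": (2.0, 1.0),   # ratio 2.0
    "follow link to patch B": (7.0, 2.0),  # ratio 3.5
    "follow link to patch C": (6.0, 4.0),  # ratio 1.5
}

print(best_choice(options))
```

Note that patch C offers more raw value than staying put, but its higher cost makes it the worst of the three ratios.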

This is pretty basic so far—everyone wants to maximize their value and get the lowest cost! What is really interesting about this theory is what people’s perceptions of high value and low cost are.

Perceptions and Scents

Even though a predator wants to maximize value and get low cost, one of the main issues is that predators don’t know everything. They only know what they can see currently. Thus, predators perceive an expected value and an expected cost whenever they are processing information features from a patch, including the cues that indicate if a patch is worth leaving.

Since most patches have multiple cues, the predator has to make a number of estimates, based on each cue (and possibly other factors), about whether to leave the patch and which link to follow. The predator’s estimate of how likely a link is to lead to prey is called information scent. In practice, scent is often represented by measures of textual similarity. Scent is also influenced by attention: for example, how large the cue is visually, or where the cue is positioned.
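To make "textual similarity" a bit more concrete, here is one simple way it could be computed: cosine similarity over word counts between the goal and each cue. This is only a sketch of the general idea, not the specific model used in any of the papers mentioned in this post, and the goal and cue strings are made up.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two strings."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical information goal and link cues.
goal = "null pointer exception in save dialog"
cues = ["open save dialog", "printer settings", "save dialog exception handler"]

# Rank the cues from strongest to weakest estimated scent.
ranked = sorted(cues, key=lambda cue: cosine_similarity(goal, cue), reverse=True)
for cue in ranked:
    print(round(cosine_similarity(goal, cue), 3), cue)
```

A cue that shares no words with the goal scores zero, which is one reason purely word-based scent measures can miss relevant links that use different vocabulary.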

Summary of Information Foraging Theory Constructs

That’s a lot of constructs. Fortunately, Fleming et al. (in an article that I helped write) built a pretty handy table to remind everyone what all of these concepts are.

| Construct | Description |
| --- | --- |
| Topology | Collection of information patches and links between those patches within a particular information environment |
| Information patch | Region in the topology that contains information features |
| Links L | Traversable arcs between patches |
| Information features | Elements of the environment that the predator can process to gain knowledge |
| Cues | Set of information features associated with a particular link |
| Predator | Person in search of information |
| Information goal | Set of information features that the predator wants to find |
| Prey | An individual feature in the goal set |
| Information scent | Given a link with an associated cue, the predator’s estimation of the probability that traversing the link will lead to prey |
| Attention | Amount of attention that a predator pays to a particular cue |
| Information value V | Benefit of processed information to the predator |
| Interaction cost C | Cost to the predator of interacting with the environment |
| Expected value E(V) | Value that the predator anticipates gaining through a particular course of action (e.g., following a particular link) |
| Expected cost E(C) | Cost that the predator anticipates incurring in following a particular course of action |

IFT’s Key Constructs, adapted from Fleming et al. 2013, An Information Foraging Theory Perspective on Tools for Debugging, Refactoring, and Reuse Tasks, ACM Transactions on Software Engineering and Methodology.

Predictions and Validations

A lot of scientific work has developed mathematical models of information foraging theory in the web domain. Pirolli and Card (1999) investigated models to predict how people surf the web; this work was further augmented by incorporating scent (Chi et al. 2000; Chi et al. 2001).

Information foraging theory has also since been used to investigate collaborative search on the web, as well as social media tagging.

Next time: Information Foraging Theory in Software Engineering

Now that we have an idea of what information foraging theory is, I will present an overview next time about how this theory’s been applied in software engineering. So far, information foraging theory has been applied primarily to debugging tasks. Margaret Burnett has been leading the charge in this direction, but the concept is beginning to take hold in other areas of software engineering. Nan Niu, for instance, recently published at ICSE a requirements engineering paper on traceability using constructs from information foraging theory.

Stay tuned for the next part in this series!

Automating the Web with Selenium: Complete Tasks Automatically and Write Test Cases!

While teaching Software Engineering I during the Winter 2013 term, I learned of a web testing suite named Selenium.

I was on the lookout for a good unit testing suite for Javascript. I had previously been introduced to the YUI Testing Framework, which provides a console and enables you to easily write and run tests from a browser window, but one limitation of YUI is that, out of the box, it doesn’t support interaction with the site itself. So, while the basic usage is good for verifying libraries and similar, I wanted something that was able to interact with page elements in a different way.

Enter Selenium. Now, here’s a way to not only automate web interactions, but to make them into automated unit tests for my research projects. This post is going to serve as a brief introduction to Selenium and how to start using it to interact with websites.

Selenium

Selenium is a web automation framework that enables a user to essentially script a web site. In this post we’ll talk about using Selenium version 2, which is the main, up-to-date version of the software.

There are two main ways for you to interact with a website.

The Selenium IDE: Recording actions by Demonstration

First, you can use the Selenium IDE, which is a Firefox plugin that allows you to demonstrate interactions with a web site. You start recording, and then as you click around the page, type in form elements, and push buttons, the IDE records the actions for you in a pseudo-markup language. (If you happen to follow my other research, it is actually somewhat similar to CoScripter, another program-by-demonstration tool for the web, but with better web interaction and fewer features. For example, Selenium cannot store temporary data into tables.)

Thus, even with minimal web development or programming experience, you could create a script for Selenium that plays back, for example, a series of clicks on a page, completes a survey, or similar.

There are some limitations with the IDE. The main one that led me to using the WebDriver (below) is that it doesn’t interact very well with “contentEditable” div tags and other HTML5 elements.

Note that the Selenium IDE is version 1.x, which does NOT correspond to Selenium itself (which is version 2.x).

The Selenium WebDriver: Programming actions in code

The second way to interact with a website is using the Selenium WebDriver, which is a driver that essentially launches a website in a browser and then enables you to look through that website’s DOM to interact with elements on the page. Thus, you can use your own browser, like Firefox or Internet Explorer, to load and navigate a web site.

I wanted Selenium to be able to work with a highly interactive web app: Gidget. Gidget is a programming game for kids and teenagers that I am working on in collaboration with Andy Ko and Michael Lee. Gidget runs with a lot of HTML5, JQuery, and Javascript and is extremely visual – something that seems perfect for Selenium.

Unfortunately, since Gidget isn’t available publicly yet, I can’t actually put tests on the blog that’ll directly run Gidget so instead I’ll use Google.

Building a Selenium Script

Selenium has bindings for a number of languages, including Java, Ruby, Python, and Javascript. I personally chose to use Python because I like its concise syntax and the fact that it has pretty good library support. I’ll focus exclusively on Python in this particular post. Most of the Java bindings can be derived from the Python commands if you remove the underscores and instead use camelCase. For example, “element.is_enabled()” in Python would be “element.isEnabled()” in Java.
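The renaming rule is mechanical enough to write down. The helper below, to_java_name, is my own illustration and not part of Selenium. It only captures the naming convention, so it won’t tell you whether a given Python helper actually has a Java counterpart.

```python
def to_java_name(python_name):
    """Convert a snake_case method name to the camelCase form used in Java."""
    first, *rest = python_name.split("_")
    return first + "".join(part.capitalize() for part in rest)


for name in ["is_enabled", "send_keys", "get_attribute"]:
    print(name, "->", to_java_name(name))
```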

To install Selenium, you generally only need to type

pip install selenium

assuming that Python exists already on your system.

The first place I’m going to point you at is the Selenium Documentation Example of WebDriver, where they already include an example of making a query on Google. Here’s a reproduction of the code.

from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Create a new instance of the Firefox driver
driver = webdriver.Firefox()

# go to the google home page
driver.get("http://www.google.com")

# find the element whose name attribute is q (the google search box)
inputElement = driver.find_element_by_name("q")

# type in the search (lowercase, so that the title check below matches)
inputElement.send_keys("cheese!")

# submit the form (although google automatically searches now without submitting)
inputElement.submit()

# the page is ajaxy so the title is originally this:
print(driver.title)

try:
    # we have to wait for the page to refresh, the last thing that seems to be updated is the title
    WebDriverWait(driver, 10).until(EC.title_contains("cheese!"))

    # You should see "cheese! - Google Search"
    print(driver.title)

finally:
    driver.quit()

It works exactly as advertised and opens up a web browser, goes to Google, and then searches for “cheese!”. However, there are a number of features that are essential to actually testing web pages.

Waiting for Pages

The unfortunate reality of web pages nowadays is that you have to wait a lot. Whether it’s a form submit where you have to wait for the POST request to complete, or some really slow JQuery fading box, not everything you want to interact with is available right away. To get around this, you have to use the “wait” commands in Selenium.

The Selenium documents do provide a few examples but I had to do a lot of searching and testing to get things working so I’ll just provide my use cases here directly.

To wait for an element to appear in Selenium, you need to provide an explicit wait along with a condition. It basically waits until either the identified element loads, or until the timeout passes (at which point it will throw an exception). There is an example of that in the Selenium example above, but if you have Google Instant turned on, you’ll notice that Google now returns search results to you as soon as you start typing. How can you interact with page elements if you don’t even know what’s going to pop up, or when?

In this case, we’re going to wait until the link to my blog pops up in the search results.

from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support.ui import Select
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException

if __name__ == "__main__":
    driver = webdriver.Firefox()
    wait = WebDriverWait(driver, 100)
    driver.get("http://google.com")
    
    inputElement = driver.find_element_by_name("q")
    inputElement.send_keys("Irwin Kwan")
    
    wait.until(EC.element_to_be_clickable((By.XPATH, "//a[@href='https://irwinhkwan.wordpress.com/']")))
    blog = driver.find_element_by_xpath("//a[@href='https://irwinhkwan.wordpress.com/']")
    blog.click()
    
    driver.quit()

Basically, the driver loads, we set up a “wait” that waits up to 100 seconds (WebDriverWait timeouts are given in seconds), and then we do a search on Google for my name. Note that we don’t actually submit the form. Instead, we wait for the link to my blog to appear in the Google search results, find the element, then click on it. And yes, this is a cheap way to rack up those hits for my blog, so everyone run this code. 😉

There are a lot of expected conditions available in Selenium and this is probably the most important thing to be aware of when first starting. You can’t interact with page elements unless they’re available, after all. Here are some of the useful ones:

  • EC.element_to_be_clickable: Wait for the element to be clickable. Good for elements that aren’t always visible or enabled. I use this a lot in the game.
  • EC.visibility_of_element_located: Wait for the element to be visible. A lot of pages load all of their content, but hide it from the user. Here’s how you ensure that what’s being interacted with is actually visible.
  • EC.presence_of_element_located: The element exists somewhere on the page.

There are a lot of these expected conditions. A full list is available on the selenium.webdriver.support.expected_conditions API document page.
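If the wait-plus-condition pattern feels a bit magical, it may help to see a stripped-down version of the idea: call a condition over and over until it returns something truthy, or give up once a timeout passes. The sketch below is my own simplification and not Selenium’s actual code; wait_until, WaitTimeout, and the fake condition are all made-up names.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy (hypothetical name)."""


def wait_until(condition, timeout, poll_interval=0.5):
    """Poll condition() until it returns a truthy value or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # hand back whatever the condition found
        if time.monotonic() >= deadline:
            raise WaitTimeout("condition not met within %s seconds" % timeout)
        time.sleep(poll_interval)


# A fake condition that only succeeds on its third call, standing in for
# "the element I want has finally shown up on the page".
attempts = {"count": 0}


def element_ready():
    attempts["count"] += 1
    return "the element" if attempts["count"] >= 3 else False


print(wait_until(element_ready, timeout=2, poll_interval=0.01))
```

The useful part of this shape is that a successful wait returns whatever the condition found, so the caller does not have to look the element up a second time.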

The second key aspect here is the interaction. To do this, you have to “find” the element. So far, I’ve used two main ways of finding an element: ID, and XPath.

Finding elements by ID

Finding an element by ID is pretty much what it sounds like: searching for an element using its id attribute. IDs are supposed to be unique within a page, which makes them ideal for searching and testing. You can use it like this:

driver.find_element_by_id("menubutton_id")

Finding elements by XPath

XPath is a query language designed to navigate XML. I won’t go through all of the details of XPath here, but I’ll present a few basic use cases. You basically can search on a number of conditions that you specify in the language so you can see if that element exists in the DOM.

  • To check whether a specific tag exists:
    driver.find_element_by_xpath("//h1")
  • To search for nested tags:
    driver.find_element_by_xpath("/html/body/h1")
  • To match the text within a tag:
    driver.find_element_by_xpath("//h1[text()='Heading 1']")
  • To match the value of an attribute:
    driver.find_element_by_xpath("//img[@title='Irwin Kwan']")

With these two find_element commands, you can find most of what you need in your web pages. Selenium supports a number of other ways to search, including searching by CSS selector, but I haven’t needed to use them yet.
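If you want to experiment with XPath-style queries without launching a browser, Python’s built-in xml.etree.ElementTree understands a small subset of XPath. The sketch below runs rough equivalents of the four queries above against a tiny made-up page. Some assumptions to be aware of: ElementTree paths are relative to the root element (hence the leading “.”), it has no text() function (the “[.='...']” form needs Python 3.7 or later), and the src attribute is invented for the example.

```python
import xml.etree.ElementTree as ET

# A tiny, well-formed stand-in for a web page.
page = ET.fromstring(
    "<html><body>"
    "<h1>Heading 1</h1>"
    "<img title='Irwin Kwan' src='photo.png'/>"
    "</body></html>"
)

any_h1 = page.find(".//h1")                          # roughly //h1
nested_h1 = page.find("./body/h1")                   # roughly /html/body/h1 (root is <html>)
by_text = page.find(".//h1[.='Heading 1']")          # roughly //h1[text()='Heading 1']
by_attr = page.find(".//img[@title='Irwin Kwan']")   # same attribute test as in Selenium
missing = page.find(".//h2")                         # no match: ElementTree returns None

print(any_h1.tag, nested_h1.text, by_text.text, by_attr.get("title"), missing)
```

One behavioural difference: when nothing matches, ElementTree’s find returns None, whereas Selenium’s find_element calls raise NoSuchElementException.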

Dynamic Interactions with a Page

Another issue I encountered with the Gidget game is clicking through introduction text when I didn’t know how many pages were present. Essentially, the problem is that there’s a button that I have to push on the page, and if I push it a certain number of times, it’ll be disabled. However, I don’t know how many times I have to push it because it might be different each time.

I managed to get around this with a little fragment of code below:

wait.until(EC.element_to_be_clickable((By.ID, 'main_buttonMissionTextNext')))
while True:
    # the button is regenerated on every click, so look it up again each time
    driver.find_element_by_id("main_buttonMissionTextNext").click()
    # stop once the freshly generated button is disabled
    if not driver.find_element_by_id("main_buttonMissionTextNext").is_enabled():
        break
    wait.until(EC.element_to_be_clickable((By.ID, 'main_buttonMissionTextNext')))

There were two gotchas here: First, I wasn’t aware of it at the time, but the button was actually regenerated whenever you clicked it, so I had to search for it again when I clicked it. Second, I didn’t realise that there was an is_enabled() function you could use to test if an element is enabled or not. But now, you know!

I posted a StackOverflow question about this (which I ended up answering myself).
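The same pattern can be pulled out into a small helper and tried without a browser. In the sketch below, click_until_disabled, FakeButton, and FakePage are all made-up names. The fake page hands out a brand-new button object on every lookup to imitate the way the real button was regenerated after each click.

```python
def click_until_disabled(find_button, max_clicks=100):
    """Click the button returned by find_button() until it reports disabled.

    find_button is called again after every click, because the page may
    replace the button element each time. Returns the number of clicks made.
    """
    for clicks in range(1, max_clicks + 1):
        find_button().click()
        if not find_button().is_enabled():
            return clicks
    raise RuntimeError("button still enabled after %d clicks" % max_clicks)


class FakeButton:
    """Stand-in for a page element with click() and is_enabled()."""

    def __init__(self, page):
        self.page = page

    def click(self):
        self.page.remaining -= 1

    def is_enabled(self):
        return self.page.remaining > 0


class FakePage:
    """Pretend page with an unknown number of intro text pages to click through."""

    def __init__(self, pages_of_text):
        self.remaining = pages_of_text

    def find_button(self):
        return FakeButton(self)  # a new object on every lookup


page = FakePage(pages_of_text=4)
print(click_until_disabled(page.find_button))
```

With a real driver, find_button could be something like lambda: driver.find_element_by_id("main_buttonMissionTextNext"), and you would probably still want the explicit wait between clicks. The max_clicks cap is just a safety net so a button that never disables can’t hang the test forever.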

Using the Selenium IDE and the Selenium WebDriver Together To Save Time

While writing code is nice and fun, HTML pages are very large and are covered with tags with various IDs. It becomes tedious very quickly, even with good web development debugging tools, to search through the DOM elements to identify what you need to interact with, then write the code to search and click on it.

So work smarter, not harder and use the Selenium IDE. If you record your actions with a page and save it in the Selenium IDE, you can use “Export Test Case As… > Python 2 / unittest / WebDriver”. Now you have Python code for that series of actions that you just performed and can integrate it into your other tests.

In my case, I used the Selenium IDE to automatically complete an exit survey in Gidget, then exported it and used the code in my other test suites. It’s a great way to save time writing code.

Making your code into a Test Suite

In Python, unit testing is built in. You simply have to import unittest, create a class that extends unittest.TestCase, write your setUp and tearDown methods, and add at least one method whose name begins with “test”. Here’s the previous code for interacting with Google converted into a Python test suite.

import unittest

from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support.ui import Select
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException

class WebTester(unittest.TestCase):
    def setUp(self):
        self.driver = webdriver.Firefox()
        self.wait = WebDriverWait(self.driver, 100)
        
    def test_load_blog(self):
        self.driver.get("http://google.com")
    
        inputElement = self.driver.find_element_by_name("q")
        inputElement.send_keys("Irwin Kwan")
    
        self.wait.until(EC.element_to_be_clickable((By.XPATH, "//a[@href='https://irwinhkwan.wordpress.com/']")))
        blog = self.driver.find_element_by_xpath("//a[@href='https://irwinhkwan.wordpress.com/']")
        blog.click()
        
        self.wait.until(EC.visibility_of_element_located((By.XPATH, "//h2[text()='Irwin Kwan']")))
        self.assertTrue(self.driver.find_element_by_xpath("//h2[text()='Irwin Kwan']"))
    
    def tearDown(self):
        self.driver.quit()
    
if __name__ == "__main__":
    unittest.main()

If you place multiple test case classes in the file, unittest will run them all. Because setUp runs before every test method, Selenium will launch a new, fresh browser instance for each test as well.
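You can see the setUp/tearDown lifecycle without opening a real browser. In this sketch, FakeBrowser is a made-up stand-in for the WebDriver that simply counts how many times it has been “launched”, and run_demo is a hypothetical helper that runs the test case programmatically.

```python
import unittest


class FakeBrowser:
    """Made-up stand-in for a WebDriver; counts how many were created."""

    launched = 0

    def __init__(self):
        FakeBrowser.launched += 1
        self.is_open = True

    def quit(self):
        self.is_open = False


class LifecycleDemo(unittest.TestCase):
    def setUp(self):
        # runs before every test method
        self.browser = FakeBrowser()

    def tearDown(self):
        # runs after every test method, even if the test failed
        self.browser.quit()

    def test_first(self):
        self.assertTrue(self.browser.is_open)

    def test_second(self):
        self.assertTrue(self.browser.is_open)


def run_demo():
    """Run LifecycleDemo and return (tests run, browsers launched)."""
    FakeBrowser.launched = 0
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(LifecycleDemo)
    result = unittest.TestResult()
    suite.run(result)
    return result.testsRun, FakeBrowser.launched


print(run_demo())
```

Two test methods mean two launches, which is worth keeping in mind because starting a real browser is slow.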

Conclusion: Web Automation is Pretty Cool

I’m really just a new user. Selenium has a number of features that I haven’t needed and therefore don’t know much about. There’s a “remote control” mode that allows you to use a separate server to run tests for you. There are ways to store session variables, load specific Firefox profiles with add-ons, and there is a “Selenium server” mode as well. If you need these features, chances are you’ll be able to find information about them on the Selenium site documents.

I feel that this should give most people enough information to get started with Selenium and make it work for them in a useful way. I hope that this post is useful to you!

Happy automating!