Tag Archives: human-computer interaction

Do People Seek Information Like Animals Forage for Food? An Introduction to Information Foraging Theory

Much of my research at Oregon State University examines debugging using a lens called Information Foraging Theory. I’ve written a few posts on this topic but I haven’t really given a good overview of what Information Foraging Theory is and what it provides for software engineering.

The theory, in a nutshell, is a theory of human behaviour that describes how people forage for information. They are theorized to forage in a way to provide maximum benefit for minimum value and to make decisions based on input from the environment that affects this cost/benefit ratio. This theory is applicable to software engineering because software engineering is a very information-seeking intensive activity. People spend a lot of time looking for things—whether it’s “What does this variable do?” down to, “Where can I start investigating this problem?”

Another reason why this theory is valuable in software engineering is because software engineering research often is built on ideas but not necessarily on underlying theories. Information foraging theory provides a theoretical framework that can help consolidate previous results and provide not only an explanation for why previous tools and findings have worked in the past, but also can make predictions for how people may behave in the future.

Now that we have an idea of what it is and why it’s relevant to software engineering, let’s dive into what information foraging is. Much of this post is adapted from material that appears in An Information Foraging Theory Perspective on Tools for Debugging, Refactoring, and Reuse Tasks that appears in the ACM Transactions on Software Engineering and Methodology (TOSEM), 2013. In another post, we’ll talk about how it relates to software engineering research.

Information Foraging Theory: What it is

Information Foraging Theory was originally proposed by Peter Pirolli and Stuart Card at what was then Xerox PARC to explain how individuals search the web for information. The idea was inspired by ecology’s Optimal Foraging Theory which is the idea that foraging animals attempt to maximize their energy intake (by finding food) over the time required to find that food.

Constructs and Theory

In Information Foraging Theory, the human, called a predator, is looking for information in an environment, like the web. A predator can seek information from an information source, called an information patch, and a topology is made up of many patches. Many patches make up an information topology. Patches are connected to each other through links—each link requires a certain cost to go from one patch to another. Within each patch, there are information features. These features might be words or sentences on a screen, graphics and pictures, icons, even colours and shapes.

A rounded, shaded rectangle contains hexagons with numbers inside them. Some of these hexagons are associated with outgoing links to other shaded rectangles that each have their own hexagons with numbers in them. The links have a number on top of them representing the cost of traversing the link.

Information patches (shaded boxes) in an information topology. In each information patch, there are features (hexagons) with a numerical value. Some of these features are attached to links (dashed line). Each link navigates to a different patch and has a cost.

The predator has an information goal in mind and want to seek information that satisfies that goal. This predator forages through the information topology seeking prey, which are information features that are related to the predator’s goal.

The activity of getting at information has a cost (usually time) but consuming information from a source also has an associated value (how relevant or important the information is). After consuming some amount of information (which is called prey), the predator may decide that it’s no longer worth the predator’s time to continue processing that patch and the predator navigates away from the patch to a new one that is considered more valuable.

Some information features are connected to links. In web pages, links are usually located in particular places, are coloured differently, and are sometimes underlined when you mouse over them. These features are called cues. A predator can use these cues to try to predict the value of the information on the other side of a link.

Three-panel representation of a developer looking at a screen of information. In the first panel, the developer is staring at a panel at the top of the screen. In the second panel, the developer is choosing to move to a new part of the same screen. In the third panel, the developer has chosen an alternate route of changing the view to look at an entirely new screen.

A developer decides whether to continue foraging in the same screen of information, or whether to refresh a view (which has a cost) and getting new information.

So, a developer who is foraging for information has to make a decision whether to stay within the current patch and continue processing the information in it or to access a different patch and process information from there. To make the optimal decision, the developer wants highest value information for the lowest cost!

If we decide to use math to represent this relationship, it looks like this:

A mathematical formula: Predator's desired choice equals max(V over C).

The predator wants to maximize value V of processing information and minimize the cost C of travelling to find information.

This is pretty basic so far—everyone wants to maximize their value and get the lowest cost! What is really interesting about this theory is what people’s perceptions of high value and low cost are.

Perceptions and Scents

Even though a predator wants to maximize value and get low cost, one of the main issues is that predators don’t know everything. They only know what they can see currently. Thus, predators perceive an expected value and an expected cost whenever they are processing information features from a patch, including the cues that indicate if a patch is worth leaving.

Since most patches have multiple cues, this means that the predator has to make a number of estimations, based on the cue (and possibly other factors) about whether to leave the patch. This is called information scent. Scent is often represented in practice by measures of textual similarity. Scent is also influenced by the amount of attention—for example, how big the cue’s visual size is, or the position of the cue.

Summary of Information Foraging Theory Constructs

That’s a lot of constructs. Fortunately, Fleming et al. (in an article that I helped write) built a pretty handy table to remind everyone what all of these concepts are.

Construct Description
Topology Collection of information patches and links between those patches within a particular information environment
Information patch Region in the topology that contain information features
Links L Traversable arcs between patches
Information features Elements of the environment that the predator can process to gain knowledge
Cues Set of information features associated with a particular link
Predator Person in search of information
Information goal Set of information features that the predator wants to find
Prey An individual feature in the goal set
Information scent Given a link with an associated cue, the predator’s estimation of the probability that traversing the link will lead to prey
Attention Amount of attention that a predator pays to a particular cue
Information value V Benefit of processed information to the predator
Interaction cost C Value that the predator anticipates gaining through a particular course of action (e.g., following a particular link)
Expected value E(V) Value that the predator anticipates gaining through a particular course of action (e.g., following a particular link)
Expected cost E(C) Cost that the predator anticipates incurring in following of a particular course of action

IFT’s Key Constructs, adapted from Fleming et al. 2013, An Information Foraging Theory Perspective on Tools for Debugging, Refactoring, and Reuse Tasks, ACM Transactions on Software Engineering and Methodology.

Predictions and Validations

There’s a lot of scientific work that has designed mathematical models of information foraging theory in the web domain. Pirolli and Card, 1999 investigated models to predict how people surf the web; this work was further augmented by incorporationg scent Chi et al. 2000, Chi et al. 2001.

Information foraging theory has also since been used to investigate collaborative search on the web, as well as social media tagging.

Next time: Information Foraging Theory in Software Engineering

Now that we have an idea of what information foraging theory is, I will present an overview next time about how this theory’s been applied in software engineering. So far, information foraging theory has been applied primarily to debugging tasks. Margaret Burnett has been leading the charge in this direction, but the concept is beginning to take hold in other areas of software engineering. Nan Niu, for instance, recently published at ICSE a requirements engineering paper on traceability using constructs from information foraging theory.

Stay tuned for the next part in this series!

Advertisement

The Whats and Hows of Programmers’ Foraging Diets: What Types of Information are Programmers Looking for?

Information seeking is one of the most important activities in human-computer interaction! One of the most influential theories in understanding, modelling, and predicting information seeking is information foraging theory. In our research, we want to understand what kinds of diets – that is, the types of information goals programmers seek while debugging. By investigating the information diets of professional programmers from an information foraging theory perspective, our work aims to help bridge the gap between results from software engineering research and Information Foraging Theory foundations as well as results from human-computer interaction research.

A pork chop taken by johnnystilletto on Flickr

Is this tasty?

A head of broccoli by Jim Mead

Is this tasty?

My co-author, David Piorkowski, is travelling soon to Paris to present our latest work: “The Whats and Hows of Programmers’ Foraging Diets”. It’s a great time to expand on this paper. Here’s the PDF Preprint!

Our Method

We had two coders examine video of nine professional programmers to identify what exactly they were looking for when trying to fix a bug in an unfamiliar open-source program. We tried to identify their overall diet by identifying if they asked questions (and received answers) belonging to one of four categories: (1) finding a place to start in code, (2) expanding on that initial starting point, (3) understanding a group of code, or (4) understanding groups of groups of code.

What is a programmer’s diet while debugging?

Overall, we found that programmers spend 50% of their debugging time foraging for information.

Surprisingly, even though all participants were pursuing the same overall goal (the bug), they sought highly diverse diets. For example, Participant 2 asked mostly about groups of groups, Participant 3 asked about finding a place to start, Participant 5 didn’t really ask about anything at all, and Participant 6 also looked for a place to start. This suggests a need for debugging tools to support “long tail” demand curves of programmer information.

How did a programmer consume these diets?

How exactly did programmers go about finding what they wanted to consume?

Again, participants used a diverse mix of strategies. Participants spent only 24% of their time following between-patch foraging strategies (such as code inspection or simply reading the package explorer straight up-and-down), but between-patch foraging (such as doing data flow or control flow) has received most of the research attention.

Surprisingly, search was not a very popular strategy, accounting for less than 15% of participants’ information foraging – and not used at all by 4 of our 9 participants—suggesting that tool support is still critical for non-search strategies in debugging!

Whats Meets Hows

Participants stubbornly pursued particular information in the face of high costs and meager returns. Some participants followed a single pattern over and over again, using the same strategy. For example, in the cases that involved a programmer looking for Type 1-initial goals, participants used code search and spatial strategies extensively but not particularly fruitfully. This emphasizes a key difference between software development and other foraging domains: the highly selective nature of programmers’ dietary needs!

Takeaways

Thus, we considered what programmers want in their diets and how they forage to fulfill each of their dietary needs. Our results suggest that the diet perspective can help reveal when programming tools help to reduce this net demand—and when they do not—during the 50% of debugging time programmers spend foraging.

References and Links

Are you going to be at CHI 2013? Where and when is David’s talk?  It’s on Thursday, May 2, at 11:00 in Room Blue… be there!

D. Piorkowski, S. D. Fleming, I. Kwan, M. Burnett, C. Scaffidi, R. Bellamy, J. Jordhal. The Whats and Hows of Programmers’ Foraging Diets, to appear in ACM Conference on Human-Computer Interaction (CHI), Paris, France, 2013. PDF Preprint

Our paper on the CHI 2013 web site

And… in case you haven’t seen it yet, the video preview!

Picture of tasty pork chop by Johnny Stilleto. Picture of tasty broccoli by Jim Mead.

CHI2013 Paper Accepted: The Whats and Hows of Programmers’ Foraging Diets

In more news from the conference acceptance front, our CHI paper “The Whats and Hows of Programmers’ Foraging Diets” has also been accepted. This paper examines how programmers forage for information while they are debugging and in particular pays specific attention to the types of information they seek. These participants were trying to track down a Java bug in JEdit using Eclipse.

The findings of this paper include the fact that participants used very diverse strategies to pursue the same task, that the participants’ enrichment strategies of searching and breakpoint debugging (that is, modifying the environment by providing information) were very repetitive, and the fact that participants often foraged for information within a single information patch (especially participants who scanned the package explorer and the outline view thoroughly).

I’ll do a more detailed writeup on this paper soon, when the camera-ready version is prepared!

D. Piorkowski, S. D. Fleming, I. Kwan, M. Burnett, C. Scaffidi, R. Bellamy, J. Jordhal. The Whats and Hows of Programmers’ Foraging Diets, to appear in ACM Conference on Human-Computer Interaction (CHI), Paris, France, 2013.

Information Foraging Theory for Collaborative Software Development at FutureCSD

IFT Poster for Future of Collaborative Software Engineering

A poster describing the potential application of Information Foraging Theory to the way people seek information in social collaborative software development settings.

This is a preview of my poster that will be presented at the Future of Collaborative Software Engineering workshop held in conjunction with the Conference on Computer Supported Collaborative Work 2012 in Seattle (Feb. 11-15).

The poster explores how a theory of how people forage for information in their environments might be applied to a social setting where information may be in people’s heads as well as in the artifacts that they work on.

Information foraging theory in general is a theory that postulates that people search for information in a way similar to how foragers in the wild search for food. In an environment, the forager wants to maximize the amount of high-value food for as low a cost as possible. In addition, indicators in the environment, like cues, suggest to the forager where high-yield places may be.