I’ve spent the last couple of months working on an experiment design framework in follow-up to an initial study concerning pragmatic understanding of UI designs.
I discovered a few frameworks out there for constructing online experiments, but I’m working on several experiments involving hundreds of submissions and requiring:
- Subject anonymity - In at least one experiment, questions are sensitive and anonymity will be essential
- Relatively browser-neutral - I don’t want to exclude people on the basis that they don’t have the most current browser technology
- Extensible - I want to be able to easily re-use code across experiments
- Deployable to Amazon Mechanical Turk - I should be able to deploy an online experiment across multiple venues, but especially via AMT.
- Secure, reliable, … - It’s a lot of work to ensure all layers of your software stack are resilient on the open Internet.
I started down the path of writing my own JS front-end built on top of a node.js server and MongoDB. I learned early on that I could avoid back-end development by using services from firebase.com. But I spent quite a bit of time spinning up on the angular.js front-end framework, which I had also started to use for other purposes. Both Firebase and angular.js are spectacular new technologies and I wish I had more time to go down this route!
However, time is flying and progress has still been slow. Following my schedule, I set aside front-end development and moved on to learning how to integrate with Mechanical Turk. And… I quickly discovered an alternative development option.
Hopefully, I will have a new blog post in a few weeks detailing useful information on how to extend Qualtrics for the design of behavioral UI experiments. Thus far, everything seems doable. Qualtrics uses Prototype.js and has made it incredibly easy to interact with their Question API.
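As a taste of what that kind of integration could look like, here is a minimal sketch of the sort of logic I expect to embed in a Qualtrics question. Qualtrics provides a per-question `Qualtrics.SurveyEngine.addOnload` hook; the timing helper below is my own illustration (the name `makeTrialTimer` and the injected clock are my assumptions, not part of the Qualtrics API), kept standalone so the logic can be tested outside the survey:

```javascript
// Hypothetical helper for measuring response latency to a stimulus.
// In a Qualtrics question this would be created inside
// Qualtrics.SurveyEngine.addOnload(function () { ... }) and wired to
// the question's click handler; here it stands alone.
function makeTrialTimer(now) {
  // `now` is an injected clock (e.g., Date.now) so tests can be deterministic.
  var start = null;
  return {
    show: function () { start = now(); },   // call when the stimulus appears
    respond: function () {                  // call when the subject responds
      if (start === null) throw new Error("stimulus not shown yet");
      return now() - start;                 // latency in milliseconds
    }
  };
}
```

In a real survey, `show` would fire in the onload hook and `respond` in the question’s click handler, with the latency written back to the response data.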
Interacting with graphical user interfaces (GUIs) sometimes feels conversational and sometimes not. A dialogue box that asks a yes-no question feels much the same as a verbal yes-no question. And a blog post can feel very un-conversational in the sense of a lecture or commentary.
When we speak our own language about everyday matters, we speak effortlessly with little thought to how to produce an utterance and little thought to how to understand one spoken to us. It’s a bit different with written language. We endure years of school learning to read and write following style and rhetorical guidelines which we memorize and practice endlessly until we know how to recognize and avoid passive constructions - as well as generate them with practiced ease. Reading and writing text is just not as intuitive as speaking conversationally and most people require years of practice to achieve any competency.
When GUIs sprang into public view in the 70s and 80s, it became possible for people to interact with a computer without having to learn seemingly arbitrary, complex commands typed into a terminal window. GUIs made sense. You could point and push buttons. An icon was a symbol that stood for something meaningful like a “program”. And there was plain old English language mixed in with icons and graphics (since GUIs originated in the United States). GUIs felt natural.
During all of the hype around GUI-based interfaces, software developers learned to adopt the notion of software patterns. The idea was to document shared knowledge of best practices for solving common or typical problems in software design. Software patterns are a fantastic concept for anyone learning to write software. Writing software and designing UIs have much in common with learning how to read and write text. Both require a lot of practice before you become any good at them. Software design patterns help new programmers speed up the process so they don’t have to solve every problem they encounter through brute trial-and-error.
When desktop computers became really popular, and web interaction even more so in the 2000s, some software engineers began to specialize in “front-end” development. Anyone dabbling with web interaction might consult the Yahoo UI design pattern library while developing a new web application. In fact, Yahoo’s purpose for its design pattern library was to solve a business problem. They wanted a way to communicate standards across development teams in order to increase “consistency, predictability, and usability” across their site - and their brand.
Human-Computer Interaction (HCI) guidelines capture specific problems, examples, usage, rationales, supporting research, standards, etc. Patterns range across stylistic conventions (e.g., page headers and footers), attentional mechanisms such as animation, navigation and organization, layout, common functions such as registration or login, and even more complex patterns such as social sharing and feedback.
In addition to useful, everyday patterns, the notions of anti-patterns and dark patterns document common practices that may be ineffective (anti-patterns) or ethically questionable (dark patterns).
It turns out that when we learn language, some concepts and patterns become entrenched with frequent use. So, for example, the phrase “I don’t know” is not something most of us have to think about before uttering. The grammatical construction I (subject) + do (1st person aux verb) + negation + know (infinitive) is not something you have to think about assembling before you say it. Such forms become routinized through repeated activation and use, such that both production and understanding require less cognitive effort and become more automatic.
This sort of automatization is not limited to conversational speech. Routinization happens at many levels of production and understanding. For example, when you see a familiar word like “key”, pronunciation is automatic. It also interacts with syntactic chunks such that when someone says, “cat in the ___”, you anticipate “hat”.
Moreover, when you see a tall hat with red and white stripes, you may immediately think cat-in-the-hat, as well as “cat”, “Dr. Seuss”, and any number of related concepts. According to Pickering and Garrod (2004), priming may occur at different levels, including the lexical, syntactic, semantic, and situation-model levels.
Routinization isn’t limited to long-term memory. Suppose you are having a conversation with a friend. You are talking about a movie of which neither of you can remember the name. So you say, “that movie with Harrison Ford”. It would not be surprising if your friend then referred to the same movie as, “the Harrison Ford movie”. You can routinize a reference to something during the course of an interaction in order to communicate more easily.
Priming is applicable to GUI patterns. When the GUI presents a dialog box like the one below, it is familiar. It offers a choice (1) or (2). Typically, the choice is binary - cancel or accept; yes or no; permit or deny, etc. You read the text and make a choice.
Image credit - (Why ok buttons in dialog boxes work best on the right)
Once you’ve seen this pattern, you don’t have to ponder over similar dialog boxes each time you encounter one. In fact, the more often you see and recognize a design pattern, the more entrenched it becomes. The same cognitive architecture that supports your understanding of language also supports and facilitates understanding of user interfaces.
So let’s talk about the difference between interaction design patterns and linguistic patterns. One obvious difference is that while we use language all day long, only a few of us know how to produce UIs. And when people interact with UIs, they aren’t interacting directly with the designer. So the designer doesn’t get direct (or continuous) feedback on how well the user understands the interaction. Production is not directly linked to comprehension in a real-time feedback loop as in face-to-face dialog; it is a bit more like interacting with monologue text. This means that patterns are not aligned and refined in the same way.
Interaction design patterns help improve communication, but because they are easily modified by the designer, with no direct feedback about how such changes affect a user’s comprehension, there is a propensity for non-obvious error.
In fact, there might be a lot of information packed into a dialog box. Dialog boxes don’t have to be simple binary choices. The UI designer can make a dialog box for any purpose. Here’s a very simple one where explicit choice is omitted. I guess I can say “not okay” by clicking the “x”. Maybe. Hmmm.
Here’s one where the designer decided that a form could be a dialog box. What happens if I don’t fill something out right?
This is a familiar open document dialog. Sometimes you can open more than one document but there is no way to know without trying it.
Here’s a complex dialog that combines the basic cancel/accept pattern with other buttons and choices. From experience, I expect that I can do a bunch of things and then choose “OK” or “Cancel” when I’m done. But it requires a bit more thought on my part, and also a bit of trust: if I spend a lot of time on this and then mess something up, I don’t know what will happen.
Here’s a dialog box where I worry that if I click on the hyperlink, I don’t know whether I’m still in this dialog or I’ve been sent off on a wild goose chase.
Below, is it more confusing if I move the “cancel” button like this? Will the user even see the “cancel” button? Will they become confused because the design conflicts with expectation?
Considering the examples above, it’s easy for UI and software designers to break such a simple design pattern by:
- packaging the information differently (adding more to a screen or component, for example)
- altering labels so that our expectations are jarred
- altering the position of a component so it is not in a familiar position
- creating a semantic mis-match between text and button labels (if you don’t understand the text what the heck do you do?)
- dispersing choice across form buttons, images, and text (e.g., embedded hyperlinks in text)
Common design patterns are easily broken. Any designer or developer with a text editor can do so without understanding the impact on comprehension. Producing comprehensible GUIs requires as much practice as writing clear, well-structured text. Design patterns arguably serve as a cognitive aid, boosting the mechanisms that support routinization and automatization of understanding. But as designers we need to be very sensitive to the effect of alteration, no matter how benign the change might seem. Clearly, we learn such sensitivity writing prose. Why not user interaction?
Last Fall, late into comprehensive exams, I paused to consider the effort it took to research and produce answers to questions that would only be used in a 2-hour comprehensive exam. Of course, if I was lucky, some of the writing might make it into my dissertation. And the time spent reading, thinking, and learning was potentially invaluable. But the process was painful. For each of four papers, I considered my question, amassed sources, crammed notes into Evernote, outlined, wrote, re-wrote, re-thought, re-searched, re-collected, re-assembled, re-drafted and finally spent a day inserting and formatting sources properly. That was just for a first draft.
In the end, all of my work was in disconnected Evernote notes and Word documents.
Like many other doctoral students, I came to the realization that my method was incredibly inefficient and could not scale. In fact, I realized that my largest problem was actually the inability to re-use my own thinking, knowledge, and experience.
So I read everything I could find on academic workflows.
What I found was a huge discussion of the relative merits and trade-offs of organization (knowledge-management) tools, brainstorming tools, note-taking, bibliography, and writing tools. Naturally, I downloaded a variety and began to explore. At first, I couldn’t see why there was so much overlapping functionality. For instance, various students and researchers would admit to using multiple brainstorming tools simultaneously. What the heck?! (I’ll get back to this later.)
Fairly quickly, I came to the conclusion that my best chance at settling on a workflow was to study the workflows of researchers who were very skilled. I reasoned that someone who publishes a lot was probably good at re-using previous thinking and work. This post by Steven Berlin Johnson describing his use of DevonThink is representative.
While developing my own workflow, I ran across an article by Pirolli and Card (2005) that reminded me intensely of academic workflows! They had done a cognitive task analysis (CTA) of intelligence analysts. What they described was sense-making. In essence, sense-making is a set of processes for fitting data into a mental model and fitting a mental model to the data (Klein, Moon, & Hoffman, 2006). The first part (foraging for data) is posed as an “information foraging loop”, the second (fitting data to a model) as a “sense-making loop”. Another way to look at this is as an interleaving of bottom-up processes (search, filter, and understand) with top-down processes (find evidence, examine counter-arguments, and re-evaluate). The diagram below is extracted from Pirolli and Card (2005).
My current workflow is represented below. In effect, it represents a simple CTA that generalizes the academic workflow that I now use.
Though I drew in arrows that indicate a sort of continuous back-and-forth process at a micro-level, there do seem to be larger foraging and sense-making loops. It seems so, since I spend a day or days at a time doing focused work within specific tools. In fact, I spend a lot of time in Scrivener. Writing seems to require much more effort than note-taking in other parts of my workflow. Before adopting this workflow, I probably spent most of my time in the foraging loop - and then lost or forgot much of what I had learned. (Let’s face it, hunting is more fun.)
The major tools I use appear under the loops in the diagram above:
- Papers 2 - bibliography and PDF management. I do a lot of reading and note-taking directly in Papers 2.
- DevonThink - content management which gets feeds directly from RSS feeds and web clippings. DT is a big shoebox for everything I collect that is not a PDF.
- Skim PDF reader - free PDF reader with some very useful note-taking and export functionality.
- Tinderbox - thinking and brainstorming
- Microsoft OneNote - note-taking using pen and clippings on my Surface Pro. SkyDrive sync lets me view and edit notes on the Mac, as well.
- Scrivener - writing. Even my blog is written using Scrivener. Any notes captured in tools above can be imported into Scrivener by converting them to PDF first.
Not shown is a production tool. I typically export from Scrivener to LaTeX (the subject of a future post).
Another way to look at this process is in terms of note-taking. At each stage in the process, my notes become more compact and inter-connected. When I read an article, I may outline summaries and extract concepts and quotes. But when I’m sense-making, I’m primarily working in Tinderbox or OneNote, and what’s important is thinking and bridging - the relationships between notes. Those notes are typically short, but need to link back to summaries, extracts, and sources. When I write prose, I work in Scrivener. The links back to intermediate sources (Tinderbox & OneNote) and original sources (Papers & DT) are important to maintain. Sometimes there is a need to go back and work on material in more depth. But, in the end, all of these intermediary products are re-usable.
Now, back to the question of overlapping functionality. Why is it that, at least on the Mac, there is so much overlap in functionality between “academic workflow” tools? Why would I want to take notes in multiple tools? It turns out that note-taking is a multi-faceted beast. Depending on where I am in the sense-making process, I’m taking different sorts of notes. For example, below are different sorts of notes serving different sorts of purposes - which may be used in different parts of the sense-making loop. I adapted the scheme below from an analysis of historians using DT.
Labels, tags, notes (summaries, maps, highlights, etc.) are all really just notes. We just use them in different ways. And some are obviously more compact than others. As Mark Bernstein of eastgate.com remarks:
- Notes are records, reminding us of ideas and observations that we might otherwise forget
- Shared notes are a medium, an efficient communication channel between colleagues and collaborators
- Notes are a process for clarifying thinking and for refining inchoate ideas
(These remarks can be found in Bernstein’s talk in the YouTube video linked at the bottom of this post.)
There are also other small workflows which impact productivity in a cumulative sense. One of the most valuable has been a small workflow triggered whenever I save a PDF to the desktop. When this happens, a Hazel rule immediately imports the document into Papers2. This forces me to source the document right away, so that when I need to reference it downstream, it’s already entered into Papers2.
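The Hazel rule itself can be trivial. Hazel can run an embedded shell script against each matched file; a one-liner like the following sketches the hand-off (the app name “Papers2” and the exact rule setup are assumptions about the local install - Hazel passes the matched file path as `$1`):

```shell
#!/bin/sh
# Hazel rule action: when a PDF lands on the Desktop, hand it to Papers2.
# Hazel supplies the matched file path as $1.
open -a "Papers2" "$1"
```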
In the end, I hope to capture my own knowledge and experience over time so that I can build on previous work without having to re-think and re-organize every time I start a new paper. It’s an experiment in progress. FYI - I still use Evernote. But more for random odds-and-ends such as notes on programming and how to keep my blog running smoothly.
Here are some of the references online that I found particularly inspiring for my own workflow. Interestingly, I found lots of useful information about DT from historians, while historical fiction writers seem more excited by tools like Tinderbox. Scrivener, however, definitely crosses boundaries. Apparently, all of us are deeply concerned with writing and production. Enjoy!
Klein, G., Moon, B., & Hoffman, R. (2006). Making sense of sensemaking 2: A macrocognitive model. IEEE Intelligent Systems, 21(5).
Pirolli, P., & Card, S. (2005). The sensemaking process and leverage points for analyst technology as identified through cognitive task analysis. In Proceedings of the International Conference on Intelligence Analysis.
It is interesting to think that we still have much to learn about the most studied part of the brain: the visual cortex. Given this, how is it that we understand so little about such a basic neurological phenomenon as “visual integration”? By this I mean: how do we take basic signals coded as color, shape, orientation, and motion, and make meaning of them?
A persistent, yet controversial, concept in linguistics is the Whorf-Sapir relativity hypothesis. This hypothesis holds that language itself affects how we see the world: our perception of the world is influenced by language. Over the years, cognitive psychologists and others have made reference to Whorf-Sapir in a variety of studies, but I had not seen any neurophysiological support for this theory until recently. In fact, there is plenty of evidence to suggest that all humans perceive basic visual stimuli similarly. So how could language possibly affect perception?
Color categorization has long been of interest to linguistic anthropologists and cognitive psychologists. Berlin and Kay (1969) conducted a cross-linguistic study of color categorization, discovering that color categories are not uniform across languages. There seemed to be focal, or prototypic, colors, but the number varied depending on the language. Also, boundary members varied in terms of categorical membership (that is, whether a particular color was classified as blue versus green). However, in their study, speakers of languages that classified colors differently did not perceive colors differently. Berlin and Kay hypothesized a set of universal constraints and, thus, some predictability for color categorization. Their study also seemed to provide strong evidence against the relativity hypothesis: linguistic categories do not affect or change perception. (Also, see Kay and McDaniel (1978), which is available as a downloadable PDF.)
In the intervening years, a number of researchers have countered Berlin and Kay’s theory, primarily on the basis of methodological flaws. More recently, however, a 2009 functional MRI study (Siok, et al.) has lent credible biological evidence that language plays a role in the categorical perception of color. The faculty for language is found in the left hemisphere of the brain. But by looking at brain activation maps across both hemispheres in a visual search task, these researchers discovered that enhanced activity for stimuli in the right visual field coincided with enhanced activity in the part of the left hemisphere concerned with lexical processing. The results suggest that language may provide some top-down control, modulating the activation of the visual cortex.
In 2012, Loreto, Mukherjee, and Tria published a paper in PNAS demonstrating through a multiagent simulation how a population, subject to the perceptual constraint “Just Noticeable Difference”, categorized and named colors through a purely cultural negotiation. They claim strong quantitative agreement with the World Color Survey (WCS) pioneered by Berlin and Kay.
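To get a feel for how a “purely cultural negotiation” can produce consensus, here is a drastically simplified sketch of the underlying naming-game dynamic: agents converge on a shared name through repeated pairwise interactions. This is my own toy illustration, not the authors’ model - it omits the perceptual channel and the Just Noticeable Difference constraint entirely:

```javascript
// Toy "naming game": agents negotiate a shared name for a single object
// through random pairwise interactions. A drastic simplification of the
// category-game model in Loreto et al. (2012): no colors, no JND.
function namingGame(nAgents, nRounds, rand) {
  var agents = [];                          // one word inventory per agent
  for (var i = 0; i < nAgents; i++) agents.push([]);
  var nextWord = 0;                         // counter for freshly invented words
  for (var round = 0; round < nRounds; round++) {
    var s = Math.floor(rand() * nAgents);   // speaker index
    var h = Math.floor(rand() * nAgents);   // hearer index
    if (s === h) continue;
    if (agents[s].length === 0) agents[s].push(nextWord++); // invent a name
    var word = agents[s][Math.floor(rand() * agents[s].length)];
    if (agents[h].indexOf(word) >= 0) {
      agents[s] = [word];                   // success: both collapse to it
      agents[h] = [word];
    } else {
      agents[h].push(word);                 // failure: hearer learns the word
    }
  }
  return agents;
}
```

Each failed interaction spreads a word; each success collapses both inventories to the winning word, which is what eventually drives the whole population to consensus - the negotiation is entirely cultural, with no central authority.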
In the end, the question of whether language affects perception does have a foundation in neurophysiological processes. Perhaps not to the extent originally conceived by linguistic relativists. But maybe there is additional supportive evidence not quite so dependent on basic visual stimuli.
Berlin, B. & Kay, P. (1969). Basic Color Terms. Berkeley: University of California Press.
Kay, P. and McDaniel, C. (1978). The linguistic significance of the meanings of basic color terms. Language, 54(3):610-646.
Loreto, V., Mukherjee, A., & Tria, F. (2012). On the origin of the hierarchy of color names. Proceedings of the National Academy of Sciences, 109(18), 6819–6824.
Siok, W., Kay P., Wang, W., Chan, A., Chen, L., Luke, K., & Tan, L. (2009). Language regions of the brain are operative in color perception. Proceedings of the National Academy of Sciences, 106(20), 8140-8145.