Enterprise Information Architecture, with Louis Rosenfeld: Seminar Notes and Opinions

Chicago, September 29, 2005

The running definition of IA: “the art and science of structuring, organizing and labeling information to help people find and manage information.” However, it’s still impossible to distill a single definition that accurately describes IA, even among its experts.

Since I’m aware that my own track record with presenting information for public use is troubled, I signed up for this seminar to get some help. Since you might be in the same boat, I’m posting my notes for you to get a taste of what’s presented, too.

Disclaimer

Some of the information you’re about to read is the copyrighted intellectual property of Louis Rosenfeld. Much of it is my own scribbling too though – they were my notes on the lecture, after all. I wrote down what interested me, as it pertained to my own work.

To avoid even the appearance of plagiarism, I’m quite fine with attributing all of this to Lou – unless it’s italicized. Those were my own thoughts added after the notes were taken, while they were being typed up.

If you find these tips, observations and ideas useful, you really owe it to yourself to talk with Lou in person. Get your hands on the Polar Bear book from O’Reilly. Attend his seminar. Participate in the discussion and get yourself educated in person.

There’s tremendously more information presented than I could take notes on, and Lou tailored the presentation to the audience: there were whole sections of slides we didn’t cover that you might greatly benefit from. Go to the happy hour and have a beer with him. I’m glad I did, and you’ll be glad you did.

On With The Show

There’s only one IA rule: the Pareto Principle, or "80/20" rule:

20% of content satisfies 80% of users’ needs
20% of possible IA options address 80% of content
20% of IA options address 80% of users’ needs

IA’s goal: figure out which 20%. How? There are no hard and fast rules on that part. If you work with a consultant who says there are, get a different consultant.

That always sounds to me like, “You need me to do this, because it’s complicated and the only easy part is telling you what you’re doing wrong” – especially in the book, after reading the umpteenth “trained librarian” reference.

In the big picture, there are certainly cases where IA is critical. Airport Signs do have to be direct and properly placed. It’s the smaller things that I never see addressed: "a brochure page has 8 option categories: title, border, typestyle, logo/image, text, color, background color, page order. There are 13 other pages too. How do I ask a user to pick those in a way that’s not overwhelming?" Maybe I need to RTFM again, but smaller examples than Airports or Global Corporations – like the brochure options – would be appreciated.

Basically, pick the few ways of organizing information that get the most bang for the buck.

So what’s an Enterprise, then? Any organization so big that one hand doesn’t know what the other is doing (or wants to kill it).

IA’s calling in this environment is to provide centralized access to information in a widely dispersed environment, where information is frequently organized by company hierarchy (the “org chart”), and not in a way that users practically think of it.

EIA depends a lot on E context.

A major assumption central to all of this EIA discussion is that there’s some kind of content management process out there, with people creating content for use on a website. That’s not always true. And at this point, the single developer for a not-quite-enterprise company starts wondering if this’ll be worth his entrance fee. But buck up, camper, there are some valuable nuggets ahead.

Competition between trends in handling information in an Enterprise leads users around in circles. On one side of the “pendulum”:

Cheap and easy democratizing technology (internal blog, anyone?) and human tendency toward autonomy.

On the other:

Users’ desire for single point of access and management’s desire to control costs and communications (i.e. Centralization).

They cancel each other out.

You end up with “silos” of information stored and used differently in separate departments, and internal (or worse: external) users who can’t find authoritative, consistent and readily accessible information on a single topic across the various collections.

E.g.: “How come I didn’t know your department was developing a product similar to ours?” or “Why do our sales and support staff keep giving our customers inconsistent information?”

There end up being about 4 main methods of managing information in this E culture, that manifest themselves on a corporate website. They’re not mutually exclusive by any means, and Lou gives a practical roadmap to improving each:

Top down navigation – Characterized by site-design-by-org-chart.
Bottom up navigation – Content Management System Heaven. Multiple content creators, metadata out the wazoo, on-the-fly link creation and management by a software system that fits info into various templates.
Search – The ever-present search box. Who cares what org system it’s in, as long as it’s indexed? Ah, but is your search doing what it ought to?
Guerilla Information Architecture – Blogs, wikis, intranets, tied together in some kind of corporate “portal”.

Plunging into details…

Top Down Navigation:

Try and anticipate what the users will be asking when they get here. This, however, leads to internal politics divvying up the homepage, and multiple internal vocabularies in conflict.

How do you anticipate what questions users will have? Ask the receptionist what questions they hear from outside callers most often. That’s not the only way of course, but it’ll be pretty useful. Your receptionist IS your corporate face to anyone calling or coming in the front door.

On Re-Designs:

“Re-design” implies only a cosmetic change. Too many redesigns are just changing the window dressing on the same product – decidedly not changing the product itself.

If the problem that prompts a redesign is that the information isn’t being used (because it’s not well presented/organized/found/etc), then changing what color it is doesn’t genuinely help.

Further, “re-design” invites corporate in-fighting between divisions that each think they deserve the most pixels on the new homepage. Stay out of this fight.

Ubiquitous Google spidering brings people into your site past the homepage anyway. Let the execs settle their homepage space squabbles themselves. If you get caught up in that argument, you lose the ability to fairly look past the homepage into the much larger size of the rest of the site and its own, more important IA needs.

The homepage isn’t unimportant, but if you’re distracted by it, you could dis-serve the rest of the site.

Sitemaps
A very useful element of a site is a map. In fact, there’s even a mystical name for this handy tool. “Sitemap” (ooh, ahh). You might have heard of them. You might have created them, gotten tired of their 1997-ness, and never created them again. But forgiving the term’s age, it IS a useful thing for a newbie needing to get acquainted with what you do, and where to find information about it.

However, if your sitemap very closely mirrors your org-chart, it’s probably not presenting information to the newb in an intuitive manner (intuitive to the newb, not you. You don’t count.)

Sitemaps have another spiffy side-effect: they give spiders a map of your site’s nooks and crannies. In some cases, a single link from the sitemap is all the connection a page needs to your site.

Lou advocates being “sneaky”. Sitemaps are rarely seen by the politically powerful in your organization. Use that passed-over-ness to organize the sitemap the way you think the site would be better presented. If you get lucky, you could actually change the rest of the site’s structure too, when a lost executive finds it and decides he’s come up with a new way to present the information about your company to the web.

Doesn’t matter that you finished that site map in your own organizational model months ago – it’s his idea now (isn’t your job glamorous?).

I’m not trying to be gender specific. Women make fantastic bosses and executives too, given their … well, differences from men. They simply think about things differently, and that’s a great thing. So when I use some male pronoun, I mean female too. Duh.

If you’ve got so much crap that a single site index is just insurmountable to create and organize, split it up into content-specific sitemaps. Create a single map to those. It’s easier to maintain them separately, and easier for a user to digest as well.

Guides
Another very useful element to complex (and a lot of…) information? Guides.

What are the 3-5 most important things people need to know/do at my site?

At my company, we need to use this concept for more than just our Online Services section (hidden behind a login requirement). It’s very useful for new customers to know how a process works. What happens, how to proceed, etc. “How do I work with the Emergency Response people? What do I have to do to get set up, what happens when I actually have an emergency, what should I expect them to do, ask, need, etc?” Ditto disposal processes, and anything else big.

Guides should be created by “subject matter experts”. People who know it inside and out. Who handle the customers. Who answer the questions. Who know what will be asked.

IT has no business doing this, and trust me, they WILL try to pawn it off on you.

What Do They Want to Know?
How else do you anticipate what questions people will have when they reach your site? Look at what they ask! Geez, this stuff writes itself.

You have a search box? Start looking at its logs. Try the searches yourself, even – but don’t let your searches taint the logs, ok?

Specifically:

what terms are people searching for?
when?
how many results do they get?
What does this tell you?
Do people misspell some words a lot?
Do they think of things in terms you’re using?
Do they find what they’re looking for?*
Do you need to tweak your search engine to index things that aren’t coming up?
Do you need to start adding synonyms or thesaurus capabilities to your search system because dammit, they just keep asking for the same idea in different words?
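
Here’s a minimal Python sketch of that kind of log review. The log format (tab-separated timestamp, query, result count) and the sample rows are my own assumptions for illustration; your search engine’s logs will almost certainly look different.

```python
from collections import Counter

# Hypothetical log format: one search per line, tab-separated
# "timestamp<TAB>query<TAB>result_count". Adjust to your engine's real logs.
SAMPLE_LOG = """\
2005-09-29T09:01:00\tinvoicing\t12
2005-09-29T09:02:10\tInvoicing\t12
2005-09-29T09:05:42\thazardus waste\t0
2005-09-29T10:15:03\temergency response\t4
2005-09-29T10:20:31\thazardus waste\t0
"""

def summarize_searches(log_text):
    """Count how often each term is searched and which terms return nothing."""
    term_counts = Counter()
    zero_result_counts = Counter()
    for line in log_text.splitlines():
        if not line.strip():
            continue
        _timestamp, query, result_count = line.split("\t")
        term = query.strip().lower()  # treat "Invoicing" and "invoicing" alike
        term_counts[term] += 1
        if int(result_count) == 0:
            zero_result_counts[term] += 1
    return term_counts, zero_result_counts

terms, zero_hits = summarize_searches(SAMPLE_LOG)
print(terms.most_common(3))    # the most frequent searches
print(zero_hits.most_common()) # searches that came back empty
```

Terms that keep showing up in the zero-result list are the first candidates for spelling help, synonyms, or missing content.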

*How do you know if they found what they’re looking for? Look at the web traffic logs right after their search. See what pages they went to. Look at the referrer data – do they end up at several pages from the same search results page? Do they start a new path from the results? Do they go away?

Lou says there’s not a tool that adequately does this. Note to self: Write a simple grep based on IP and referrer fields. Sell it to every IA junkie trying to justify their existence and build a department à la Lou’s evil plan. Retire.
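
A rough sketch of what that “simple grep” might look like in Python, assuming the traffic log has already been parsed into (ip, path, referrer) rows in time order and that search results live at a hypothetical /search?q=... URL. Both the field layout and the URL shape are my assumptions, not anything from the seminar.

```python
from collections import defaultdict
from urllib.parse import urlparse, parse_qs

# Hypothetical, already-parsed traffic log rows in time order: (ip, path, referrer).
# Real logs need parsing first, and the "/search?q=" URL shape is an assumption.
HITS = [
    ("10.0.0.1", "/search?q=invoicing", "/"),
    ("10.0.0.1", "/guides/invoices", "/search?q=invoicing"),
    ("10.0.0.1", "/billing/faq", "/search?q=invoicing"),
    ("10.0.0.2", "/search?q=pricing", "/"),
    ("10.0.0.2", "/products", "/"),
]

def clicks_from_search(hits):
    """Map (ip, query) to the pages that visitor opened from that results page."""
    clicks = defaultdict(list)
    for ip, path, referrer in hits:
        ref = urlparse(referrer)
        if ref.path != "/search":
            continue  # this hit did not come from a results page
        for query in parse_qs(ref.query).get("q", []):
            clicks[(ip, query)].append(path)
    return dict(clicks)

def abandoned_searches(hits):
    """List (ip, query) pairs where the visitor searched but never clicked a result."""
    clicked = clicks_from_search(hits)
    abandoned = []
    for ip, path, _referrer in hits:
        url = urlparse(path)
        if url.path != "/search":
            continue
        for query in parse_qs(url.query).get("q", []):
            if (ip, query) not in clicked:
                abandoned.append((ip, query))
    return abandoned

print(clicks_from_search(HITS))
print(abandoned_searches(HITS))
```

Several clicks from one results page suggest the first result wasn’t the answer; a search with no clicks suggests nothing looked right. Keep in mind that an IP address is only a rough stand-in for a person, since many users can share one.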

Other ways to know what they’re looking for:

What parts of your site get the most traffic?
What questions does the customer service staff answer most?

A lot of people are “old-school”. Can’t find what they want on the new-fangled inkernet? Call. That’s a valid option, too… maybe your information isn’t laid out so they can find their answer. Count yourself fortunate if they call, instead of thinking, “What a moron.”

Make all these top questions and interests the most prominent things on your site. Remember the one IA rule: 80% of the people want only 20% of your content.
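
One way to put a number on “which 20%” is to rank pages by traffic and keep adding them until you’ve covered 80% of all views. A minimal Python sketch follows; the page names and view counts are made up purely for illustration.

```python
def pages_covering(page_views, share=0.8):
    """Return the most-visited pages that together account for `share` of all views."""
    total = sum(page_views.values())
    covered = 0
    top_pages = []
    ranked = sorted(page_views.items(), key=lambda item: item[1], reverse=True)
    for page, views in ranked:
        if covered >= share * total:
            break  # the pages collected so far already cover the target share
        top_pages.append(page)
        covered += views
    return top_pages

# Made-up traffic numbers, purely for illustration.
PAGE_VIEWS = {
    "/guides/invoices": 520,
    "/emergency-response": 300,
    "/about": 50,
    "/careers": 40,
    "/press": 30,
    "/history": 25,
    "/locations": 20,
    "/partners": 15,
    "/awards": 10,
    "/sitemap": 10,
}

print(pages_covering(PAGE_VIEWS))
```

In this invented data set, two of the ten pages carry more than 80% of the traffic. Real numbers are rarely that tidy, but the ranking still tells you what deserves the prominent spots.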

But you can’t use this to entirely trump the politics. You like having a job, don’t you? Balance the information desired to be seen by the user, with the information (read: marketing) desired to be seen by the organization.

Want to know what % of space to give to each? Me too.

It varies by business goals, politics, etc… but executives love solid statistics on use. It appeals to their “what can I sell the visitors?” lust. If they can tie their marketing schtuff into content the user’s coming for anyway, it’s a higher-percentage conversion – just ask Google Ads.

In Lou’s diagram of Top-Down navigation, Product Hierarchy is analogous to Service Descriptions/Hierarchy, except you need to generalize them. My own biggest issue with every IA example I’ve seen is that it assumes companies always have products. After all, most do. But service companies see themselves differently at some levels, and most of the Polar Bear book and this lecture neglect that.

It took a break-time conversation with Lou to illuminate for me that Services can be described a lot like products; you just don’t sell them in carts. To fully describe something you don’t fully understand, like wide-ranging and varying services, you have to have someone else provide the information – including the FAQs, hurdles, regular flow of work, etc. – unlike a product that has a relatively well-defined SKU, color, size, etc.

I’d been trying to figure out how to take that on by myself. I felt like an idiot for asking.

Bottom Up Navigation
The most easily grasped example of this to me was cross-sells. You’re buying pants, and the site shows you matching styles/colors of shirts. This is navigation based on content that cuts across product hierarchies.

Ways to map related content to each other:

Put sticky notes on a whiteboard. Have users draw lines by their desired use. You start to see thought patterns and associations that you might not have had on your own, and if they’re dominant among a set of test users, are probably the right way to associate things.
Show pages without navigation and ask a user: “Where would you want to go next? What do you think is missing from this picture?”

Bottom Up systems usually rely on a CMS. CMSs are difficult because they only contain information; they don’t describe the content for you. After buying yourself a hundred-thousand-dollar CMS, there’s still a tremendous expense in filling it and making it useful.

It’s hard to get content experts to fill it in, because the topics are complex. It’s time consuming. The systems (business, more than software) are confusing to describe.

In a CMS, links between related content based on metadata as the “connective tissue” should be created on the fly. This is so that “archived” or removed data doesn’t continue to be linked. 404 bad.
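
A minimal Python sketch of the idea, assuming each piece of content carries a set of metadata tags and an “archived” flag. The record layout and field names are my own invention, not any particular CMS’s data model.

```python
# Hypothetical content records; the field names are assumptions for illustration.
PAGES = [
    {"url": "/pants/khaki", "tags": {"khaki", "casual"}, "archived": False},
    {"url": "/shirts/oxford", "tags": {"casual", "cotton"}, "archived": False},
    {"url": "/shirts/old-polo", "tags": {"casual", "khaki"}, "archived": True},
    {"url": "/suits/wool", "tags": {"formal", "wool"}, "archived": False},
]

def related_links(page, all_pages):
    """Compute related links at request time from shared metadata tags.

    Archived pages are skipped, so a removed page never leaves a dead link behind.
    """
    scored = []
    for other in all_pages:
        if other is page or other["archived"]:
            continue
        shared = len(page["tags"] & other["tags"])
        if shared:
            scored.append((shared, other["url"]))
    # Most shared tags first; ties broken alphabetically by URL.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for _shared, url in scored]

print(related_links(PAGES[0], PAGES))
```

Because the links are worked out when the page is requested rather than stored, archiving the old polo shirt is enough to make it disappear from every “related” list.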

So metadata is pretty important… It defines a common vocabulary for describing the data you want to display. Everybody agrees on the term “color” instead of “hue” or “shade”, etc. That one sounds easy, so try this example (I won’t give away Lou’s exact object experiment that wonderfully demonstrates this problem):

Think of all the ways you would describe your computer’s display. Got a pretty good list?

Now think of the ways it was marketed to you. Oh… you didn’t include price in your list? Didn’t include warranty in your list?

That’s a single dimension of discovery.

Now think of the display in the terms that its manufacturer marketed it to the place you bought it from. Manufacturing cost didn’t show up in your lists? Country of origin wasn’t in your lists? Quality control wasn’t in your lists?

Get it?

Metadata is a sea of description so deep you can’t ever reliably find the bottom of it. This is where Lou says he gets a reputation for being "down" on metadata. He’s not "down" on it – it’s just not the silver bullet CMS companies try to convince you it is.

CMS’s are only effective when they actually contain a lot of it for any piece of data. But the CMS company’s business is in making the container, not filling it. That’s your job.

So obviously, setting up vocabularies to confine that sea of descriptions is quite important from early on. And then… oh yeah, you have to decide which terms are equivalent. Thesaurus. Have multiple vocabularies for different subjects? Thesauri. And they have to relate to each other to be useful. Getting a headache yet? Buying the CMS is only the beginning, my friend. Mwahaha.
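
At its smallest, a thesaurus is just a mapping from every variant to one agreed-upon term. A minimal Python sketch, with a made-up two-entry thesaurus (the terms are only examples, and real ones get much bigger and messier):

```python
# A tiny, made-up thesaurus: each preferred term lists variants that mean the same thing.
THESAURUS = {
    "color": ["hue", "shade", "colour"],
    "display": ["monitor", "screen"],
}

def build_lookup(thesaurus):
    """Invert the thesaurus so any variant points back to its preferred term."""
    lookup = {}
    for preferred, variants in thesaurus.items():
        lookup[preferred] = preferred
        for variant in variants:
            lookup[variant] = preferred
    return lookup

def normalize(term, lookup):
    """Return the preferred term, or the cleaned-up input if the thesaurus doesn't know it."""
    cleaned = term.strip().lower()
    return lookup.get(cleaned, cleaned)

LOOKUP = build_lookup(THESAURUS)
print(normalize("Shade", LOOKUP))     # a known variant
print(normalize("warranty", LOOKUP))  # not in the thesaurus, passed through unchanged
```

The code is the easy part. Deciding that “shade” really does mean “color” for your content, and keeping that decision current, is where the time goes.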

Faceted Classification
Defined as "Context-specific vocabularies." Lou’s example of this is epicurious.com. It stores recipes and lets you search (browse, actually) for them from several points of view.

You can find recipes by holiday or seasonal association.
You can find them by cooking method.
Ingredients.
Cuisine-style.
Etc.

I’ll bet you never thought about so many ways to describe potato salad.

That the different “flavors” of vocabularies have overlapping members means you can narrow down your information by selecting terms from each vocabulary.

For instance: Think of Christmas related food.

Now, also, food that contains chocolate.

Now also, it’s a drink.

Now also, it’s hot.

Ended up with “hot chocolate”? Good (If you didn’t, bugger off). Until you got to the last “facet” you could also have gotten a nice Baileys Irish Cream & chocolate syrup served under mistletoe to set your night ablaze.

That’s faceted classification. It can be extremely useful in some data sets, but it’s also pretty hard to get set up, and to maintain.
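
The narrowing-down above can be sketched in a few lines of Python. The recipes and facet names below are invented for illustration (this is not Epicurious data), and each recipe gets only one value per facet to keep the sketch short.

```python
# Made-up recipes tagged along a few facets; not real Epicurious data.
RECIPES = [
    {"name": "hot chocolate", "holiday": "christmas", "ingredient": "chocolate",
     "course": "drink", "temperature": "hot"},
    {"name": "Irish cream with chocolate syrup", "holiday": "christmas",
     "ingredient": "chocolate", "course": "drink", "temperature": "cold"},
    {"name": "chocolate yule log", "holiday": "christmas", "ingredient": "chocolate",
     "course": "dessert", "temperature": "cold"},
    {"name": "potato salad", "holiday": "independence day", "ingredient": "potato",
     "course": "side", "temperature": "cold"},
]

def narrow(recipes, **facets):
    """Keep only the recipes that match every selected facet value."""
    return [
        recipe["name"]
        for recipe in recipes
        if all(recipe.get(facet) == value for facet, value in facets.items())
    ]

# Each extra facet shrinks the list a little more.
print(narrow(RECIPES, holiday="christmas"))
print(narrow(RECIPES, holiday="christmas", ingredient="chocolate", course="drink"))
print(narrow(RECIPES, holiday="christmas", ingredient="chocolate", course="drink",
             temperature="hot"))
```

The filtering is trivial. The hard part, as noted, is tagging every item consistently along every facet and keeping it that way.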

Search
This jumps way into usability, but the two topics aren’t entirely distinct. I’ll post my notes from Steve Krug’s "Don’t Make Me Think" seminar from the following day in the near future.

Search boxes on a web site are only effective if they’re consistent in location and function.

It’s better to offer advanced search options in context. That is, in the results, rather than as a blank slate accessed by an “Advanced Search” link that clutters up the space around your nice clean “Search” box.

Analogy: Contextual help in an application is way better than just the application and 2-inch thick manual thunked down beside your computer.

According to Jakob Nielsen, over half of all web users are search-dominant. They go right to the search box before trying to browse your site for information. I thought I was search dominant because I’m always using Ctrl-F, before I heard that definition.

A built-in spell check in your search system (e.g. “Did you mean…”) might help 10% of the searches that happen on your site. If about half your users are search-dominant, then you’re really aiding roughly 5% of your total user base (10% of 50%). Here’s a stat your execs probably won’t sneeze at, in this context: You could be losing 5% of your business to typos you’re not fixing. Mmmm, delicious info.
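
For a feel of how little it takes to get a basic “Did you mean…”, here’s a minimal sketch using Python’s standard difflib module. The list of known terms and the 0.8 similarity cutoff are my own assumptions; a real search engine would draw suggestions from its index and tune the threshold.

```python
import difflib

# Hypothetical list of terms your search index actually knows about.
KNOWN_TERMS = ["hazardous", "invoice", "disposal", "emergency", "engineer"]

def did_you_mean(query, known_terms=KNOWN_TERMS):
    """Suggest the closest known term for a query that isn't in the index, if any."""
    cleaned = query.strip().lower()
    if cleaned in known_terms:
        return None  # spelled fine, nothing to suggest
    # cutoff=0.8 is an arbitrary similarity threshold chosen for this sketch.
    matches = difflib.get_close_matches(cleaned, known_terms, n=1, cutoff=0.8)
    return matches[0] if matches else None

print(did_you_mean("hazardus"))  # a typo close to a known term
print(did_you_mean("invoice"))   # already a known term
```

Feed it the misspellings from your own search logs and you’ll quickly see whether the suggestions are worth showing.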

Word Stemming (you searched for hazard we also returned hazards, hazardous, hazardously, but not the Dukes of Hazzard) is also a helpful sounding idea on its face, but consider how much extra data you might be throwing back at the user.

“Engineer” matching “Engineers” might be fine.

But matching “Engineered” might not be what a person looking for a job was interested in at all.

No system can tell what the user means by “Engineer”. Even in context – let’s say they were just surfing your jobs list… what if they’re ADHD, and now their interest has changed from a job, to how bridges are built to withstand storm forces?

You just can’t anticipate that well, and while the ADHD scatterbrained user sounds contrived, users do things we think are stupid, because we don’t know what their motivations and frames of reference really are.

It ultimately might be best just to search for what the user typed, and offer to find variants of the word as links from the search result page.
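
A minimal Python sketch of that approach: search for exactly what was typed, and return the variants separately so the results page can offer them as links. The index and the suffix list are made up, and the suffix matching is deliberately naive; it is not a real stemming algorithm.

```python
# Made-up index: word -> pages containing that exact word.
INDEX = {
    "engineer": ["/jobs/engineer"],
    "engineers": ["/jobs/team"],
    "engineered": ["/products/bridges"],
    "hazard": ["/safety/overview"],
}

# Deliberately naive suffix list; a real stemmer is far more careful.
SUFFIXES = ("s", "ed", "ing", "ous")

def search_with_variants(word, index=INDEX):
    """Return exact matches, plus variant words to offer as links rather than auto-include."""
    word = word.strip().lower()
    results = index.get(word, [])
    variants = [word + suffix for suffix in SUFFIXES if word + suffix in index]
    return results, variants

results, variants = search_with_variants("Engineer")
print(results)   # pages for the exact word only
print(variants)  # other forms the user can choose to search for
```

The job hunter gets the jobs page, and the bridge enthusiast is one click away from “engineered” without it cluttering everyone else’s results.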

And while we’re on search results, don’t show users that pathetic “relevancy score” your search engine produces. It won’t mean anything to them unless the first few results are all spot-on matches and the 4th is only a 10% match. (That’s so rare that the exception seems stupid to me. The results are usually all 100 or 98.)

Want to know how well your search engine is performing? Go do similar searches at your competitors’ sites.

The next day we saw an example where one site’s clustering (show 4 similar results, offer a link to see more, then show 4 results from another category, à la CNet) was presented in a confusing manner. It looked like a much smaller competitor offered many more products matching a specific term.

Obviously, that clustering interface needs some help – but that’s not something you could determine just by doing searches at that site. It didn’t look confusing on its own. You had to compare it to another site’s search results to see it.

That was a very useful nugget for me.

Search results are better when they show you what section of a site the result came from, especially on a large (think Ford) site, where “pricing” might show results from product promotions, product descriptions, credit departments, etc.

A user looking for a car price might not care about the credit department (yet), so this saves them frustration.

Make sure your search results appear to be distinctly different pages!

Google does this for us, so it’s not something we think of anymore. If your search engine shows a small set of information from PDF files and the first 3 results all show that same small set, and oops, they all have the same default information in them, then the top 3 search results for that topic look identical and useless.

Review your search logs!

Best Bets
“Best Bets” results are pretty useful… They’re like a short-circuit to the information you’re most likely after when you type a specific term.

If a person searches for “invoicing” and you’ve got a guide that describes the invoice payment process on your site, that’s probably the info they want… but invoicing could show up in lots of other documents first, so short-circuiting the list and displaying that “Best Bet” is a good idea.
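
A minimal Python sketch of a Best Bets layer sitting in front of whatever your search engine returns. The mapping and the URLs are hypothetical; in practice the list would be hand-picked from your top search terms.

```python
# Hand-picked "Best Bets": search term -> the page an editor decided people probably want.
BEST_BETS = {
    "invoicing": "/guides/invoice-payment",
}

def with_best_bet(query, engine_results, best_bets=BEST_BETS):
    """Put the curated Best Bet (if any) ahead of the engine's own ranking."""
    bet = best_bets.get(query.strip().lower())
    if bet is None:
        return list(engine_results)
    # Drop the bet from the regular list so it is not shown twice.
    return [bet] + [url for url in engine_results if url != bet]

# Pretend this is what the search engine returned, in its own relevance order.
engine_results = ["/news/2005/invoicing-update", "/billing/faq", "/guides/invoice-payment"]
print(with_best_bet("Invoicing", engine_results))
```

Queries without a Best Bet pass through untouched, so you only have to curate the handful of terms that matter most.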

Managing Knowledge
Quick, name a publishing system that lets people easily post information that other people want, with minimal overhead, easily searchable, easily updateable, and available to the public?

Did you say “blog” or “wiki”? Good. Now, does your intranet actually have one? Oh.

So imagine “Bob” in HR is the guy who best understands filing expense reports, but his productivity is low because everybody calls him all freaking day because he doesn’t have a way of posting that info for everyone to use. Gee, that sounds inefficient.

The availability of sending an internal email with the instructions is not enough. You’re hiring or transferring people all the time, and they weren’t on Bob’s original distribution list, or they just didn’t exist when Bob sent it.

Post it somewhere on an accessible system already! Use an intranet and allow authenticated access from the outside world, and now even your remote locations can get to it.

Expand on that idea: Tie your employee directory to internally available blogs. Now hear this: “Not only is this the information you need to contact me at my desk, or by cell or email, but here’s the questions I get asked frequently. Here are the answers.” Oooooh. My mouth just watered.

Decision Making
Now you’ve got all this great info about what you should be changing in your business world. How do you prioritize it?

How do you identify what things are causing you the most grief? And not just your opinion, but actual data?

Card sorts, surveys, logs, remote observation (VNC to a user who’s having problems doing something).

Want software to make that easier? EZSort (card sorting), SurveyMonkey (surveys), WebTrends (logs), etc.

Card sorting online can both raise awareness of content’s existence and indicate how users expect information to be presented. How much do you want to bet that most people at your company don’t even know there’s a page about…?

Validating IA’s Presence in a Corporation
This is where my attention nose-dived. My company doesn’t have these internal political structures to the degree assumed. I particularly liked this advice: “Charge a lot so they’ll respect what you do, versus thinking of you as just an HTML jockey that they can commoditize and devalue”. A whole lot of description ensued of CYA tactics.

Sorry, not interested in them right now, but thanks for the slides, I’ll keep them around in case we get that big someday.

Conclusion
I rate the seminar highly and found a lot of this to be very useful stuff. I just wish there were more of some parts, and less of others.

Here’s what I’d like to also get out of it, and maybe factors that’ll help shape future discussions and decisions:

More about the methods. Leaving all the “practice” stuff out of the “theory” makes it hard for me to go home and tell my boss what I learned.

Separation of IA and Politics. I could literally go home and tell my boss that a quarter of what I learned was,

“I might have to be sneaky, potentially undermining you and your boss, appealing to higher powers to establish a new department to do work that we’re already pretty good at because it’s mostly common sense.

“I learned a step by step model of creating my own department, justifying my own internal budget and pricing to other departments, elevating myself by necessarily hiring a diverse staff of people with non-overlapping skills you wouldn’t normally hire for in our industry.

“I learned that because this task is both absolutely necessary and impossible to nail down, I’ll have to hire an outside consultant to help us out at hourly rates. And coincidentally, I also learned how to contact one of the pre-eminent authors (and consultants, I might add) in the field.”

Now, I know that wasn’t the intention in the seminar, and that the info really would help those in Enterprise situations with realistic problems and plans. I don’t discredit this stuff in the least, nor do I think it was out of place given the subject, “Enterprise Information Architecture”.

But I sure would have appreciated the IA without so much E.

Tools! It would’ve been nice to hear mention of the major tools and players in the market to address common IA situations. I’m not even looking for recommendations, just ideas to go research. My boss will ask me, “Did you find a recommendation to help us with X?” And I have to say “Nope”.

Now, I did look at the back of the presentation slide book afterward, and there’s a list of tools, so maybe I need to just chill out and RTFM some more, but the list doesn’t distinguish between tools at the large end of data, and tools for smaller companies. At least a little discussion of it would be nice, when we’re talking about search or faceted classification that heavily relies on tools.

I’d definitely recommend this seminar to others, and given a guarantee of focused discussion on tools, more methods, and more examples, would probably even go again myself.
