
Taxonomy Strategies LLC

Taxonomy & metadata strategies for effective content management


Melbourne, Sydney, Canberra Masterclass

6-15 June 2007

Copyright 2007 Taxonomy Strategies LLC. All rights reserved.

Today's agenda
9:00-9:10    10 min  Introduction
9:10-9:15     5 min  Warm-up exercise
9:15-9:45    30 min  Taxonomy fundamentals: Building taxonomies
9:45-10:00   15 min  Taxonomy exercise
10:00-10:30  30 min  Taxonomy fundamentals: Taxonomy business case
10:30-11:00  30 min  Tea Break
11:00-12:00  60 min  Taxonomy governance
12:00-12:30  30 min  Capabilities self-assessment
12:30-13:30  60 min  Lunch
13:30-14:30  60 min  Taxonomy benchmarking
14:30-14:45  15 min  Benchmarking exercise
14:45-15:15  30 min  Tea Break
15:15-16:15  60 min  Content tagging
16:15-16:30  15 min  Tagging exercise
16:30-17:00  30 min  Q&A

Taxonomy Strategies LLC The business of organized

Who I am: Joseph Busch


• Over 25 years in the business of organized information.
  - Founder, Taxonomy Strategies LLC
  - Director, Solutions Architecture, Interwoven
  - VP, Infoware, Metacode Technologies (acquired by Interwoven, November 2000)
  - Program Manager, Getty Foundation
  - Manager, Pricewaterhouse
• Metadata and taxonomies community leadership.
  - President, American Society for Information Science & Technology
  - Director, Dublin Core Metadata Initiative
  - Adviser, National Research Council Computer Science and Telecommunications Board
  - Reviewer, National Science Foundation Division of Information and Intelligent Systems
  - Founder, Networked Knowledge Organization Systems/Services

What we do

Organize Stuff


For us, taxonomy work includes:


• Metadata specification defines the properties needed to describe content so that it can be found & used.
• Vocabularies are collections of terms that are used to specify some of the metadata properties.
  - Some vocabularies are big and hierarchical, some are small and flat.
• An application profile specifies what metadata & vocabularies are required, and then represents them formally.
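As a rough illustration of how these three pieces fit together, the sketch below models a metadata specification, two vocabularies, and an application profile as plain Python data. The property names, vocabulary terms, and the `validate` helper are all hypothetical; they are not drawn from any particular standard or tool.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set

# Hypothetical controlled vocabularies: one small and flat, one hierarchical.
VOCABULARIES: Dict[str, Set[str]] = {
    "content_type": {"Report", "Policy", "Form"},
    "topic": {"Health", "Health/Food Safety", "Health/Toxicity"},
}


@dataclass(frozen=True)
class Property:
    """One metadata property in the specification."""
    name: str
    required: bool
    vocabulary: Optional[str] = None  # name of a controlled vocabulary, if any


# Hypothetical application profile: which properties and vocabularies are required.
PROFILE: List[Property] = [
    Property("title", required=True),
    Property("content_type", required=True, vocabulary="content_type"),
    Property("topic", required=False, vocabulary="topic"),
]


def validate(record: Dict[str, str]) -> List[str]:
    """Return a list of problems found when checking a record against the profile."""
    errors = []
    for prop in PROFILE:
        value = record.get(prop.name)
        if value is None:
            if prop.required:
                errors.append(f"missing required property: {prop.name}")
            continue
        if prop.vocabulary and value not in VOCABULARIES[prop.vocabulary]:
            errors.append(
                f"{prop.name}: '{value}' is not in the {prop.vocabulary} vocabulary"
            )
    return errors
```

Calling `validate({"title": "Annual report", "content_type": "Report"})` returns an empty list, while a record with no title and an unknown content type returns one message per problem.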

Recent & current projects: http://www.taxonomystrategies.com/html/clients.htm

Government | Commercial | Not-for-Profit

Who are you? What sectors do you work in?


Your Role
• Administrator
• Records Manager
• Content Manager
• Communications
• Editor
• Information Architect
• Usability Expert
• Librarian
• Knowledge Engineer
• Ontologist
• Chief Information Officer

Industrial Sector
• Agriculture & Processing: Food, Lumber, Pulp & Paper
• Financial Services: Banking & Insurance
• Government: Public administration, Public safety
• High Tech: Computers, Software & Telecommunications
• Heavy Manufacturing: Steel, Automobiles & Aircraft
• Manufacturing: Consumer Products
• Medical & Health Care
• Mining & Refining: Petrochemicals, Oil & Gas
• Pharmaceuticals

Why are you here?


• What are the key questions that you want answered in today's workshop?
• Please rank the questions from the most important (5) to the least important (1).
• Please provide your job title, organization and department; your name is optional.

Priority (1-5) | Questions

Your title or role:
Your org or industry:
Your dept:
Your name (optional):


The Taxonomy problem: How to pick from > 5,000 faucets? By:
• Category
• Price
• Brand
• Color/Finish
• # Handles
• Series Name
• Water Filter?
• Faucet Spray
• Handle Shape
• Soap Dispenser?

The main issue: What goes here?

• When do the things in the list change?
• How do we maintain the list?
• What rules do we follow?

Seven phases of taxonomy development


[Gantt chart: the seven phases scheduled across weeks 1-12]

Phase | Activity
1 Identify Objectives | Conduct interviews
2 Inventory Resources | Identify, gather & review resources
3 Specify Metadata | Define fields & purpose
4 Model Content | Define content chunks & XML DTDs
5 Specify Vocabularies | Compile controlled vocabularies
6 Specify Procedures | Develop workflow, rules & procedures
7 Test & Train | Manually tag small sample

Taxonomy design phases need to be iterated


1 Identify Objectives
  - Plan & Prototype: Interview core team and stakeholders
  - Alpha Dev & Test: Review tagged samples, default procedures
  - Beta D&T: Interview alpha users
  - Final D&T: Interview beta users

2 Inventory Resources
  - Plan & Prototype: Identify, gather & review resources
  - Alpha Dev & Test: Gather additional resources, if any
  - Beta D&T: Gather additional sources, if any

3 Specify Metadata
  - Plan & Prototype: Define fields & purpose
  - Alpha Dev & Test: Revise if needed, bake into alpha CMS
  - Beta D&T: Modify CMS for beta
  - Final D&T: Modify for 1.0

4 Model Content
  - Plan & Prototype: Define content chunks & XML DTDs
  - Alpha Dev & Test: Revise if needed, bake into alpha CMS
  - Beta D&T: Modify CMS for beta
  - Final D&T: Modify for 1.0

5 Specify Vocabularies
  - Plan & Prototype: Compile controlled vocabularies
  - Alpha Dev & Test: Revise, use in alpha CMS
  - Beta D&T: Revise, use in beta CMS
  - Final D&T: Revise using team procedure

6 Specify Procedures
  - Plan & Prototype: Develop workflow rules & procedures
  - Alpha Dev & Test: alpha workflows in CMS
  - Beta D&T: Modify & extend workflows
  - Final D&T: Finalize procedure materials

7 Test & Train
  - Plan & Prototype: Manually tag small sample
  - Alpha Dev & Test: Use alpha CMS to tag larger sample
  - Beta D&T: Use beta CMS to tag larger sample
  - Final D&T: Finalize training materials & train staff

Licensing an existing taxonomy


See: Factiva's taxonomy, www.taxonomywarehouse.com
• There are usually license fees, but these will be less than the effort to develop an equivalent taxonomy.
• But pre-existing taxonomies rarely fit an organization's needs and may require extensive customization.

Recommendation
• Adopt a faceted approach.
• Reuse existing (especially internal) vocabularies for as many of the facets as possible.
• Plan on doing full-custom Content Type and Topic taxonomies.

Free sources for 8 common taxonomies


Organization
  - Definition: Organizational structure.
  - Potential sources: SP 800-87, U.S. Government Manual, Your organizational structure, etc.

Content Type
  - Definition: Structured list of the various types of content being managed or used.
  - Potential sources: Dublin Core Type Vocabulary, AGLS Document Type, Your records management policy, etc.

Industry
  - Definition: Broad market categories such as lines of business, life events, or industry codes.
  - Potential sources: SIC, NAICS, Your market segments, etc.

Location
  - Definition: Place of operations or constituencies.
  - Potential sources: FIPS 5-2, FIPS 55-3, ISO 3166, UN Statistics Div, US Postal Service, Your sales regions, etc.

Business Activity
  - Definition: Business activities or functions performed to accomplish mission and goals.
  - Potential sources: Federal Enterprise Architecture Business Reference Model, Enterprise ontology, Your business functions, etc.

Topic
  - Definition: Business topics relevant to your mission & goals.
  - Potential sources: Federal Register Thesaurus, NAL Agricultural Thesaurus, Your research areas, etc.

Audience
  - Definition: Subset of constituents to whom a piece of content is directed or is intended to be used by.
  - Potential sources: GEM, ERIC Thesaurus, IEEE LOM, Your psychographics or personas, etc.

Products & Services
  - Definition: Names of products/programs and services.
  - Potential sources: ERP system, Your products and services, etc.

Typical product catalog: A-Z, then idiosyncratic categories


How to analyze existing product catalog categories: Principles and priorities


Preparing a product catalog for facet browsing (aka Guided Navigation) requires a category hierarchy and additional attributes.

Principles
1. Categories and subcategories that could be swapped are candidates for conversion to attributes.
2. Repeated lists of subcategories signal a possible need for an attribute.
3. The number of attributes should not exceed six or seven, so not all attribute candidates should be used.
   - Avoid selecting strongly correlated attributes, such as Weight and Shipping Weight.

Priorities
1. Choose Categories that apply to many products over those with few products.
2. Choose Attributes that apply to many Categories over those that apply only to very few categories.
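Principle 2 lends itself to a simple mechanical check. The sketch below, which assumes a made-up catalog held as a dictionary of category to subcategory list, groups categories that share exactly the same set of subcategories; such groups are candidates for conversion to an attribute. The catalog contents and the function name are illustrative only.

```python
from collections import defaultdict

# Hypothetical catalog: category -> list of subcategories.
catalog = {
    "Kitchen Faucets": ["Chrome", "Bronze", "Nickel"],
    "Bath Faucets": ["Chrome", "Bronze", "Nickel"],
    "Shower Heads": ["Nickel", "Chrome", "Bronze"],
    "Accessories": ["Soap Dispensers", "Filters"],
}


def repeated_subcategory_lists(catalog, min_repeats=2):
    """Map each repeated set of subcategories to the categories that use it."""
    groups = defaultdict(list)
    for category, subcategories in catalog.items():
        # Order is ignored: the same names in any order count as a repeat.
        groups[frozenset(subcategories)].append(category)
    return {
        tuple(sorted(subcats)): sorted(categories)
        for subcats, categories in groups.items()
        if len(categories) >= min_repeats
    }


candidates = repeated_subcategory_lists(catalog)
```

Here the three finish names repeat under three categories, which suggests a single attribute (for example Color/Finish) rather than three parallel subcategory lists.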

Product categories example: Wireless carrier

Products: Accessories, Content, Phones, Services

Subcategories: Batteries, Cases, Chargers, Data, Hands-Free, Headsets, Miscellaneous, Purchased, Subscription, Versatile Phones, Smart Devices, Basic Phones, Prepaid Phones, International Only Phones, Mobile Broadband Cards, Conferencing, Internet / Data, Landline Phone, Network & Roaming, Relay Services, Solutions, Wireless Data

Product attributes example: Digital cameras in an electronics catalog


• Types of attributes
  - Generic attributes
    - Brand/Product Family/Model
    - Price Range
    - Usually Ships
  - Merchandising attributes
    - Usage (E-mail, Internet Browsing, Programming, ...)
    - Segment (Home, Business, Education, Government, ...)
    - Region & Country
    - Most Popular
    - New
    - Related Products
  - Specialized attributes
    - Capacity (Battery; Memory; MB; GB; BPS, ...)
    - Resolution (DPI; Megapixels; XGA, XGA, UXGA, ...)
    - Size (Display; Screen; ...)
    - Standard (a, b, g, n, ...; scsi, ata, sata, eide, ...; dimm, simm, ...)
    - Type (Camera; Battery; Display; Printer; Server; Storage; Switch; ...)

Example facet values, with product counts:
Resolution: 3 Megapixels (4), 4 Megapixels (5), 5 Megapixels (27), 6-8 Megapixels (21)
Brand: Canon (15), Fuji (10), Kodak (17), Nikon (8), Olympus (9)
Type: Point & Shoot (25), Digital SLR (10), Packages (5)
Price Range: $100-250 (5), $250-500 (16), $500-1000 (19), More than $1000 (3)

Faceted taxonomy theory & practice


• How many terms are needed to provide sufficient granularity? Not as many as you think!
• Post-coordinate indexing allows several simple controlled vocabularies to be combined, rather than using a single large pre-coordinated vocabulary.
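A minimal sketch of post-coordinate indexing, assuming each item is tagged with one term from each of a few small facets and that terms are combined only at search time. The items, facet names, and the `search` helper are hypothetical.

```python
# Hypothetical items, each tagged independently on several small facets.
items = [
    {"id": 1, "audience": "Kids", "health": "Sun Protection", "substance": "Ozone"},
    {"id": 2, "audience": "Industry", "health": "Toxicity", "substance": "Pesticide"},
    {"id": 3, "audience": "Kids", "health": "Toxicity", "substance": "Pesticide"},
]


def search(items, **facets):
    """Return ids of items that match every requested facet value.

    The combination of terms is made at query time (post-coordination),
    so no combined category such as "Kids > Pesticide" has to exist in
    the vocabulary beforehand.
    """
    return [
        item["id"]
        for item in items
        if all(item.get(facet) == value for facet, value in facets.items())
    ]


kids = search(items, audience="Kids")
kids_and_pesticide = search(items, audience="Kids", substance="Pesticide")
```

Adding a second facet value narrows the result set without any change to the vocabularies themselves.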

The power of faceted taxonomy


• 4 independent categories of 10 nodes each have the same discriminatory power as one hierarchy of 10,000 nodes (10^4).
  - Easier to maintain
  - Easier to tag by content authors
  - Can be easier to navigate
• It's more effective to increase the number of facets than to increase the number of terms per facet.

Audience: Advocacy; Contractors & Grantees; Environmental Professionals; Federal Facilities; General Public; Industry; Kids; Researchers & Scientists; Small Business; Students

Health: Advisory; Exposure; Food Safety; Health Assessment; Health Effect; Health Risk; Occupational Health; Pesticide Effects; Sun Protection; Toxicity

Industry: Agriculture & Cattle; Automobile Repair; Chemical; Dry Cleaning; Electronics & Computer; Energy; Extractive Industries; Food Processing; Leather Tanning & Finishing; Metal Finishing

Substance: Allergen; Biological Contaminant; Carcinogen; Chemical; Explosive; Liquid Waste; Microorganism; Ozone; Pesticide; Radioactive Waste
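The arithmetic behind this claim can be checked directly. The sketch below uses placeholder term names rather than the real vocabularies, and compares two ways of spending the same 40 terms; the helper name `distinct_combinations` is an assumption made for illustration.

```python
from itertools import product


def distinct_combinations(facets):
    """Count the distinct one-term-per-facet combinations."""
    return sum(1 for _ in product(*facets.values()))


# Four facets of 10 placeholder terms each (40 terms to maintain).
four_by_ten = {
    name: [f"{name}-{i}" for i in range(10)]
    for name in ("Audience", "Health", "Industry", "Substance")
}

# The same 40 terms spent on two larger facets of 20 terms each.
two_by_twenty = {
    name: [f"{name}-{i}" for i in range(20)]
    for name in ("Audience", "Health")
}

terms_maintained = sum(len(terms) for terms in four_by_ten.values())
power_four_facets = distinct_combinations(four_by_ten)    # 10 ** 4
power_two_facets = distinct_combinations(two_by_twenty)   # 20 ** 2
```

With the same maintenance burden of 40 terms, four facets distinguish 10,000 combinations while two facets distinguish 400, which is the sense in which adding facets beats adding terms per facet.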

Automatically created taxonomies


• Documents can be clustered based on similarities and differences.
• Problems:
  - Typically only a single hierarchy
  - No overall plan
  - Results hard for people to navigate

What does "North" mean on this map?

Automatic taxonomy construction software


• Software can scan large quantities of content and extract statistically significant words and phrases.
• Example:
  - Archive of 10 publications analyzed for topics related to copyright.
• Software does a poor job of:
  - De-duplication.
  - Turning significant words and phrases into a larger structure.
  - Discriminating between gold and garbage.
• Software is good for:
  - Getting an understanding of the key noun phrases in a large collection.
  - Providing test cases for evaluating a taxonomy.

Source: Sample data courtesy of nStein.
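To give a feel for what such software does at its simplest, the sketch below counts two-word phrases across a few made-up sentences. Raw frequency is used here as a crude stand-in for statistical significance, and the stopword list, sample text, and function name are assumptions; commercial tools use considerably more sophisticated methods.

```python
import re
from collections import Counter

# Tiny, hypothetical stopword list; real tools use much larger ones.
STOPWORDS = {"the", "of", "a", "and", "to", "in", "is", "for", "on"}


def candidate_phrases(documents, min_count=2):
    """Return (phrase, count) pairs for two-word phrases seen at least min_count times."""
    counts = Counter()
    for text in documents:
        words = re.findall(r"[a-z]+", text.lower())
        for first, second in zip(words, words[1:]):
            # Skip pairs that contain a stopword; they are rarely useful terms.
            if first in STOPWORDS or second in STOPWORDS:
                continue
            counts[f"{first} {second}"] += 1
    return [(phrase, n) for phrase, n in counts.most_common() if n >= min_count]


docs = [
    "Fair use is a limit on copyright law.",
    "Copyright law protects authors; fair use protects readers.",
    "The copyright law of fair use.",
]
phrases = candidate_phrases(docs)
```

The output is a flat list of frequent phrases. It says nothing about duplicates, hierarchy, or which phrases matter, which is where the editorial work described above comes in.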

Most popular flickr tags on 20 Feb 2007 http://www.flickr.com/photos/tags/

Sort flickr categories into 5 or fewer groups. Then label each group.

Taxonomy exercise: Facet grouping

• Universal taxonomy facets
  - By location (spatially)
  - By time (chronologically)
  - By type (genre)
  - By physical properties (size, color, shape, etc.)
  - By subject (topic)

Richard Saul Wurman. Information Architects (1996)


Business case and motivations for taxonomies


• How are we going to use content, metadata, and taxonomies in applications to obtain business benefits?

What technology analysts have said: Add metadata to search on!


• Adding metadata to unstructured content allows it to be managed like structured content. Applications that use structured content work better.
• Enriching content with structured metadata is critical for supporting search and personalized content delivery.
• Content that has been adequately tagged with metadata can be leveraged in usage tracking, personalization and improved searching.
• Better structure equals better access: Taxonomy serves as a framework for organizing the ever-growing and changing information within a company. The many dimensions of taxonomy can greatly facilitate Web site design, content management, and search engineering. If well done, taxonomy will allow for structured Web content, leading to improved information access.

Fundamentals of taxonomy ROI


• Tagging content using a taxonomy is a cost, not a benefit.
• There is no benefit without exposing the tagged content to users in some way that cuts costs or improves revenues.
• Putting taxonomy into operation requires UI changes and/or backend system changes, as well as data changes.
• You need to determine those changes, and their costs, as part of the ROI.

Product utilization: Taxonomy compared to search


• Conversion rate increases.
  - HomeDepot.com: Double-digit increase.
  - 1-800-Flowers.com: More than a 10% increase.
  - Otto Group (Kaleidoscope, Freemans, Grattan, and lookagain catalogs): 130% increase.
• Lift in average order size.

Product catalog: Taxonomy compared to search

Benefit: Increased conversion rate & revenue lift


Web sales net income: $80,000,000
Increased conversion rate: 30% = $24,000,000
Order size lift: 10% = $8,000,000
Potential revenue increase per year: $32,000,000
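The figures above follow from two multiplications and a sum. The sketch below reproduces them; the function and parameter names are illustrative, and it follows the slide in adding the two effects rather than compounding them.

```python
def revenue_lift(base_sales, conversion_increase_pct, order_size_lift_pct):
    """Return (conversion gain, order size gain, total) in whole dollars.

    Both percentages are applied to the same base and then added,
    as on the slide; they are not compounded.
    """
    conversion_gain = base_sales * conversion_increase_pct // 100
    order_size_gain = base_sales * order_size_lift_pct // 100
    return conversion_gain, order_size_gain, conversion_gain + order_size_gain


conversion_gain, order_size_gain, total_gain = revenue_lift(
    base_sales=80_000_000,
    conversion_increase_pct=30,
    order_size_lift_pct=10,
)
```

Swapping in your own sales base and more conservative percentages gives a quick sensitivity check on the business case.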

Usability research: Taxonomy compared to search


• We found that users preferred a browsing oriented interface for a browsing task, and a direct search interface when they knew precisely what they wanted.
  Marti Hearst (and others)
• The category interface is superior to the list interface in both subjective and objective measures.
  Hao Chen & Susan Dumais

Usability research: Taxonomy compared to search

[Chart: Median search time in seconds (axis 0 to 140) for the Category and List interfaces, shown separately for targets in the top 20 results and not in the top 20 results. Callouts: Category is 36% faster; Category is 48% faster.]

Source: Chen & Dumais

Time saved: Taxonomy compared to search

1 hour per day searching x 36% faster = 22 minutes each day

22 minutes x 250 working days per year = 5,500 minutes or 92 hours per year
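The same calculation as a small function, so that the assumptions (one hour of searching per day, 36% faster, 250 working days) can be varied. The function name and defaults are illustrative; rounding to whole minutes and whole hours mirrors the slide.

```python
def time_saved(minutes_searching_per_day=60, percent_faster=36, working_days=250):
    """Return (minutes saved per day, minutes saved per year, hours saved per year)."""
    # 60 minutes x 36% = 21.6, which the slide rounds to 22 minutes.
    minutes_per_day = round(minutes_searching_per_day * percent_faster / 100)
    minutes_per_year = minutes_per_day * working_days
    hours_per_year = round(minutes_per_year / 60)
    return minutes_per_day, minutes_per_year, hours_per_year


per_day, per_year_minutes, per_year_hours = time_saved()
```

Note that the daily figure is rounded before it is multiplied out, so the yearly total inherits that rounding.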

Time saved: Taxonomy compared to search

Benefit: Increase service efficiency

Number of call center calls per month: 50,000
Average cost per call: $20
Call response costs per month: $1,000,000
Total call response costs per year: $12,000,000
Percentage of self-serviced calls due to improved information browsing: 30%
Service costs savings per year: $3,600,000
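A short sketch of the call center calculation, assuming, as the slide does, that every self-serviced call avoids the full average cost of a call. The function and parameter names are illustrative.

```python
def call_center_savings(calls_per_month, cost_per_call, self_service_pct):
    """Return (monthly cost, yearly cost, yearly savings) in whole dollars."""
    monthly_cost = calls_per_month * cost_per_call
    yearly_cost = monthly_cost * 12
    # Assumes each deflected call saves the full average cost per call.
    yearly_savings = yearly_cost * self_service_pct // 100
    return monthly_cost, yearly_cost, yearly_savings


monthly_cost, yearly_cost, yearly_savings = call_center_savings(
    calls_per_month=50_000,
    cost_per_call=20,
    self_service_pct=30,
)
```

The self-service percentage is the input with the least evidence behind it, so it is the one worth testing with lower values.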

Trusted advisers: Taxonomy avoids costs


• The amount of time wasted in futile searching for vital information is enormous, leading to staggering costs.
  Sue Feldman
• Sun's usability experts calculated that 21,000 employees were wasting an average of six minutes per day due to inconsistent intranet navigation structures. When lost time was multiplied by staff salaries, the estimated productivity loss exceeded $10M per year, about $500 per employee per year.
  Jakob Nielsen, useit.com

Knowledge workers spend up to 2.5 hours each day looking for information

[Pie chart segments: Communicating; Searching; Creating]

But find what they are looking for only 40% of the time.
Source: Kit Sims Taylor

Knowledge workers spend more time re-creating existing content than creating new content

[Pie chart segments: Communicating; Searching; Recreating existing content, 25%; Creating new content, 8%]

Source: Kit Sims Taylor (cited by Sue Feldman in her original article)

Cost saved by not recreating content

Benefit: Increase in productivity


Number of employees: 100
Average employee salary: $80,000
Employee costs per year: $8,000,000
Increase in productivity from not re-creating content: 25%
Employee cost savings per year: $2,000,000

Business case summary


1. Classifications and classification-like schemes are being used to facilitate information seeking in the workplace, and on the web.
2. Users take advantage of (and prefer) this type of scheme (faceted navigation) when it is made available in the user interface.
3. Hierarchical or facet navigation can be guided by the User Interface.
4. Facet navigation is best combined with keyword searching. E.g., keyword search followed by faceted navigation of results.


Taxonomy requires business processes

• Taxonomies must change, gradually, over time if they are to remain relevant.
• Maintenance processes need to be specified so that the changes are based on rational cost/benefit decisions.

Taxonomy governance can be viewed as a standards process


• Taxonomy must evolve, but in a predictable way.
• Team structure, with an appeals process.
  - Taxonomy stewardship is a part-time role at most organizations.
  - Team needs to make decisions based on costs and benefits.
• Documentation and educational materials.
• Comment-handling responsibilities (part of error-correction process).
• Issue Logs.
• Release Schedule.

Taxonomy governance: Change process overview


[Diagram: taxonomy governance environment. CV = Controlled Vocabulary.]

1: External controlled vocabularies (CVs) change on their own schedule.
2: Taxonomy Team decides when to update snapshots of external CVs.
3: Team adds value to snapshots through definitions, synonyms, classification rules, training materials, etc.
4: Updated versions of CVs are published to consumers.

CV Sources: External Standard Vocabularies, Subject Codes, CVs from other internal sources, Internally Created CVs
Taxonomy Tool: working copies of CVs, maintained in the tool by the Taxonomy Team
CV Consumers: Site Search Tool, Portal, Web CMS, DMS, DAM, Tagging Tool, Search UI
Other labels in the diagram: Taxonomy Facets, Project Archives, Working Papers, NASA Expertise Competencies

Who should build the taxonomy?


• The taxonomy (and metadata specification) should be produced by a cross-functional team which includes business, technical, information management, and content creation stakeholders.
• The team should plan on maintaining the taxonomy as well as building it.
  - Maintenance will not (usually) be anyone's full-time job.
  - Exact mix of people on team will change.
• It should be built in an iterative fashion, with more content and broader review for each iteration.

Taxonomy governance: Generic team charter


• Taxonomy Team is responsible for maintaining:
  - The Taxonomy, a multi-faceted classification scheme.
  - Associated taxonomy materials, such as:
    - Editorial Style Guides.
    - Taxonomy Training Materials.
    - Metadata Standard.
  - Team rules and procedures for change management.
• Taxonomy Team will consider costs and benefits of suggested changes.
• Taxonomy Team will:
  - Manage relationship between providers of source vocabularies and consumers of the Taxonomy.
  - Identify new opportunities for use of the Taxonomy across the enterprise to improve information management practices.
  - Promote awareness and use of the Taxonomy.

Taxonomy governance team: Generic roles


Business Lead
  - Keeps committee on track with larger business objectives.
  - Balances cost/benefit issues to decide appropriate levels of effort.
  - Obtains needed resources if those on committee can't accomplish a particular task.

Technical Specialist
  - Estimates costs of proposed changes in terms of amount of data to be retagged, additional storage and processing burden, software changes, etc.
  - Helps obtain data from various systems.

Taxonomy Specialist
  - Suggests potential taxonomy changes based on analysis of query logs, indexer feedback.
  - Makes edits to taxonomy, installs into system with aid of IT specialist.

Content Specialist
  - Committee's liaison to content creators.
  - Estimates costs of proposed changes in terms of editorial process changes, additional or reduced workload, etc.

Content Owners
  - Reality check on process change suggestions.
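One way a taxonomy specialist might mine query logs for change suggestions is sketched below: count normalized queries and report the frequent ones that match no existing term or synonym. The term list, the sample log, and the function name are hypothetical, and exact string matching is a deliberate simplification.

```python
from collections import Counter

# Hypothetical taxonomy labels and synonyms, lower-cased.
TAXONOMY_TERMS = {"faucet", "tap", "soap dispenser", "water filter"}


def missing_concepts(query_log, known_terms, min_count=2):
    """Return (query, count) pairs for frequent queries not covered by the taxonomy."""
    counts = Counter(q.strip().lower() for q in query_log if q.strip())
    return [
        (query, n)
        for query, n in counts.most_common()
        if n >= min_count and query not in known_terms
    ]


log = [
    "Faucet", "bidet", "water filter", "Bidet ", "bidet",
    "pot filler", "pot filler", "towel bar",
]
suggestions = missing_concepts(log, TAXONOMY_TERMS)
```

Each suggestion is only a candidate. Whether it becomes a new term, a synonym for an existing term, or nothing at all is the kind of cost/benefit decision the team roles above are meant to make.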

Where taxonomy changes come from


[Diagram: sources of taxonomy change, with a Firewall boundary between End User and staff systems.]

Inputs to the Taxonomy Editor
• Query log analysis (End User, via Application UI and Application Logic)
• Staff notes on "missing" concepts (Tagging Staff, via Tagging UI and Content Tagging Logic)
• Requests from other parts of the organization

Recommendations by Editor
1. Small taxonomy changes (labels, synonyms)
2. Large taxonomy changes (retagging, application changes)
3. New "best bets" content.

Team Considerations (Taxonomy Team)
1. Business goals.
2. Changes in user experience.
3. Retagging cost.

Taxonomy maintenance processes


• Different organizations will need to consider their own change processes.
  - Organization 1: A custodian is responsible for the content, but checks facts with department heads before making changes.
  - Organization 2: Analysts suggest changes, editors approve, copyeditors verify consistency.
  - Organization 3: Marketing reps ask for a change, taxonomy editor makes demo, web representative approves it.
• Change process MUST also consider cost of implementing the change:
  - Retagging data.
  - Reconfiguring auto-classifier.
  - Retraining staff.
  - Changes in user expectations.

Taxonomy maintenance workflow


[Flowchart. Roles: Analyst, Editor, Copywriter, Sys Admin. Steps: Suggest new name/category; Review new name; Problem? (Yes/No); Copy edit new name; Problem? (Yes/No); Add to enterprise Taxonomy. Systems: Taxonomy Tool, Taxonomy.]

Sample taxonomy editor: Data Harmony


[Screenshot panels: Hierarchy Browser; Standard Term Info]

Taxonomy editing tools vendors


• Most popular taxonomy editor is MS Excel.
• An immature area: no vendors are in the upper-right quadrant!
• High functionality / high cost products ($100K+).
• MultiTes is widely used, cheap with

[Quadrant chart: Ability to Execute (low to high) vs. Completeness of Vision; labeled quadrants include Niche Players and Visionaries.]

Taxonomy maturity model


• Taxonomy governance processes must fit the organization.
• As consultants, we notice different levels of maturity in the business processes around content management, taxonomy, and metadata.
• Honestly assess your organization's metadata maturity in order to design appropriate governance processes.
• We are starting to define a maturity model, similar to the Software Capability Maturity Model (CMM):
  - Initial: Ad hoc, each project begins from scratch.
  - Repeatable: Procedures defined and used, but not standardized across organization or are misapplied to projects.
  - Defined: Standard processes are tailored for project needs. Strategic training for long-range goals is in place.
  - Managed: Projects managed using quantitative quality measures. Process itself is measured and controlled.
  - Optimizing: Continual process improvement. Extremely accurate project estimation.

Purpose of maturity model


• Estimating the maturity of an organization's information management processes tells us:
  - How involved the taxonomy development and maintenance process should be.
    - Overly sophisticated processes will fail.
  - What to recommend as first steps.
• Maturity is not a goal; it is a characterization of an organization's methods for achieving particular goals.
• Mature processes have expenses which must be justified by consequent cost savings or revenue gains.
• IT Maturity may not be core to your business.

Taxonomy maturity scorecard


[Scorecard table. Columns: Initial, Repeatable, Defined, Managed, Optimizing. Rows: Organizational Structure, Executive Sponsorship, Budgeting, Hiring & Training, Quality Assurance, Manual Processes, Automated Processes, Project Management, Estimating & Scheduling, Cost Control, Project Methodology, Design and Execution, Planning, Design Excellence, Development Maturity. Each row is marked with * in one column; two cells carry notes 1 and 2.]

Note 1: X is starting to examine search query logs, which is an important first step in improving search. But this is only an isolated example.
Note 2: IT has a project methodology they are trying to use across all projects. But not all business units have project methodologies.

Taxonomy governance self-assessment


Background
1. Rate your organization's overall taxonomy maturity from 1 to 10.
   Immature 1 ... 4 5 6 ... 10 Mature
2. What type of change was most recently made to your organization's taxonomy management environment?
   Standards | Tools | People | Functionality | Data Quality
3. What is the area for your organization's taxonomy management environment improvement?
   Standards | Tools | People | Functionality | Data Quality

Basic
1. Is there a process in place to examine search query logs?
   Yes | No
2. Is there an organization-wide metadata standard, such as the Dublin Core, for use by search tools?
   Yes | No

Intermediate
1. Is there an ongoing data cleansing procedure to look for any redundant, obsolete or trivial content (ROT)?
   Yes | No
   If there is a process, describe it briefly.
2. Does the search engine index more than 4 repositories around the organization?
3. Are system features and metadata fields added based on cost/benefit analysis, or because they are easy to do with the current applications and tools?
   Cost/Benefit | Easy
4. Are applications and tools acquired after requirements have been analyzed, or are major purchases sometimes made to use up year-end money?
   Requirements | Year-End
5. Are there hiring and training practices for metadata and taxonomy positions?
   Yes | No
   If there is training, describe it briefly.

Advanced
1. Are there established qualitative and quantitative measures of metadata quality?
   Yes | No
   If there are measures, describe them briefly.
2. Can the CEO explain the return on investment (ROI) for content management, search and metadata?
   Yes | No

2005 Maturity survey: Search practices


n=87

Practice | Not current practice | Being developed | In practice | Former practice | NA or Unknown
Search Box in standard place on all web pages. | 20% (12) | 11% (7) | 62% (38) | 2% (1) | 5% (3)
Search engine indexes multiple repositories in addition to web sites. | 25% (15) | 21% (13) | 44% (27) | 2% (1) | 8% (5)
Spell Checking. | 31% (19) | 18% (11) | 38% (23) | 0% (0) | 13% (8)
Synonym Searching. | 41% (25) | 23% (14) | 30% (18) | 0% (0) | 7% (4)
Search results grouped by date, location, or other factors in addition to simple relevance score. | 37% (22) | 20% (12) | 37% (22) | 0% (0) | 7% (4)
Queries are logged and the logs are regularly examined. | 31% (19) | 25% (15) | 31% (19) | 5% (3) | 8% (5)
Common queries identified, 'best' pages for those queries are found, and search engine configured to return them at the top. (Best Bets) | 46% (28) | 25% (15) | 21% (13) | 0% (0) | 8% (5)
Advanced computation of relevance based on data in addition to the text of the document. | 43% (26) | 16% (10) | 25% (15) | 0% (0) | 16% (10)
A faceted search tool, such as Endeca, has been implemented for the organization's external site or product catalog search. | 68% (41) | 7% (4) | 10% (6) | 0% (0) | 15% (9)
A faceted search tool, such as Endeca, has been implemented for the organization's internal website(s) or portal. | 57% (34) | 15% (9) | 17% (10) | 0% (0) | 12% (7)

2005 Maturity survey: Metadata practices


n=87

Practice | Not current practice | Being developed | In practice | Former practice | NA or Unknown
Metadata standards are developed for the needs of each system with no overall attempt to unify them. | 22% (13) | 12% (7) | 37% (22) | 20% (12) | 10% (6)
An Organization-wide metadata standard exists and new systems consider it during development. | 37% (22) | 37% (22) | 20% (12) | 0% (0) | 7% (4)
The Organization-wide metadata standard is based on the Dublin Core. | 52% (30) | 16% (9) | 21% (12) | 0% (0) | 12% (7)
Multiple repositories comply with metadata standard. | 52% (31) | 20% (12) | 17% (10) | 0% (0) | 12% (7)
A Cataloging Policy document exists to teach people how to tag data in compliance with organizational metadata standard. | 48% (29) | 20% (12) | 20% (12) | 0% (0) | 12% (7)
The Cataloging Policy document is revised periodically. | 48% (29) | 15% (9) | 17% (10) | 0% (0) | 20% (12)
A centralized metadata repository exists to aggregate and unify metadata from disparate sources. | 57% (34) | 17% (10) | 17% (10) | 0% (0) | 10% (6)
Metadata is manually entered into web forms. | 15% (9) | 12% (7) | 61% (36) | 3% (2) | 8% (5)
Metadata is generated automatically by software. | 38% (23) | 18% (11) | 27% (16) | 2% (1) | 15% (9)
Metadata is generated automatically, then reviewed manually for correction. | 48% (29) | 18% (11) | 17% (10) | 2% (1) | 15% (9)

2005 Maturity survey: Taxonomy practices


n=87

Practice | Not current practice | Being developed | In practice | Former practice | NA or Unknown
Org Chart Taxonomy: one based primarily on the structure of the organization. | 36% (21) | 10% (6) | 34% (20) | 5% (3) | 15% (9)
Products Taxonomy: one based primarily on the products and/or services offered by the organization. | 37% (22) | 10% (6) | 32% (19) | 5% (3) | 15% (9)
Content Types Taxonomy: one based primarily on the different types of documents. | 28% (16) | 21% (12) | 40% (23) | 5% (3) | 7% (4)
Topical Taxonomy: one based primarily on topics of interest to the site users. | 20% (12) | 36% (21) | 34% (20) | 3% (2) | 7% (4)
Faceted Taxonomy: one which uses several of the approaches above. | 32% (19) | 29% (17) | 34% (20) | 0% (0) | 5% (3)
The Taxonomy, or a portion of it, was licensed from an outside taxonomy vendor. | 75% (44) | 3% (2) | 14% (8) | 0% (0) | 8% (5)
The Taxonomy follows a written 'style guide' to ensure its consistency over time. | 47% (28) | 22% (13) | 20% (12) | 0% (0) | 10% (6)
The Taxonomy is maintained using a taxonomy editing tool other than MS Excel. | 35% (21) | 17% (10) | 40% (24) | 2% (1) | 7% (4)
The Taxonomy was validated on a representative sample of content during its development. | 28% (17) | 22% (13) | 33% (20) | 3% (2) | 13% (8)
A Roadmap for the future evolution of the Taxonomy has been developed. | 38% (23) | 40% (24) | 13% (8) | 0% (0) | 8% (5)

Taxonomy testing methods


Method | Process | Who | Requires | Validation
Walk-thru | Show & explain | Taxonomist; SME; Team | Rough taxonomy | Approach; Appropriateness to task
Walk-thru | Check conformance to editorial rules | Taxonomist | Draft taxonomy; Editorial rules | Consistent look and feel
Usability Testing | Contextual analysis (card sorting, scenario testing, etc.) | Users | Rough taxonomy; Tasks & answers | Tasks are completed successfully; Time to complete task is reduced
User Satisfaction | Survey | Users | Rough taxonomy; UI mockup; Search prototype | Reaction to taxonomy; Reaction to new interface; Reaction to search results
Tagging Samples | Tag sample content with taxonomy | Taxonomist; Team; Indexers | Sample content; Rough taxonomy (or better) | Content fit; Fills out content inventory; Training materials for people & algorithms

Walk-through method: Show & explain

ABC Computers.com

Content Type: Award; Case Study; Contract & Warranty; Demo; Magazine; News & Event; Product Information; Services; Solution; Specification; Technical Note; Tool; Training; White Paper; Other Content Types

Competency: Business & Finance; Interpersonal Development; IT Professionals Technical Training; IT Professionals Training & Certification; PC Productivity; Personal Computing Proficiency

Industry: Banking & Finance; Communications; E-Business; Education; Government; Healthcare; Hospitality; Manufacturing; Petro-chemicals; Retail / Wholesale; Technology; Transportation; Other Industries

Service: Assessment, Design & Implementation; Deployment; Enterprise Support; Client Support; Managed Lifecycle; Asset Recovery & Recycling; Training

Product Family: Desktops; MP3 Players; Monitors; Networking; Notebooks; Printers; Projectors; Servers; Services; Storage; Televisions; Other Brands

Audience: All; Business; Employee; Education; Gaming Enthusiast; Home; Investor; Job Seeker; Media; Partner; Shopper (First Time, Experienced, Advanced); Supplier

Line of Business: All; Home & Home Office; Gaming; Government, Education & Healthcare; Medium & Large Business; Small Business

Region-Country: All; Asia-Pacific; Canada; EMEA; Japan; Latin America & Caribbean; United States

Walk-through method: Editorial rules consistency check

- Abbreviations
- Ampersands
- Capitalization
- General, More, Other
- Languages & character sets
- Length limits
- Multiple parents
- Plural vs. singular form
- Scope notes
- Serial comma
- Sources of terms
- Spaces
- Synonyms & acronyms
- Term order (Alphabetic or )
- Term label order (Direct vs. inverted)

Rule Name | Editorial Rule
Abbreviations | Abbreviations, other than colloquial terms and acronyms, shall not be used in term labels. Example: Public Information. NOT: Public Info.
Ampersands | The ampersand [&] character shall be used instead of the word "and". Example: Licensing & Compliance. NOT: Licensing and Compliance.
Capitalization | Title case capitalization shall be used. Example: Customer Service. NOT: CUSTOMER SERVICE. NOT: Customer service. NOT: customer service.
General, More, Other | The term labels General, More, and Other shall be used for categories which contain content items that are not further classifiable. Example: Other Property; Other Services; General Information; General Audience.
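Part of the consistency check can be automated. The sketch below is a minimal illustration, not the workshop's tooling: `check_label` is a hypothetical helper that tests a term label against simplified versions of three of the rules (Abbreviations, Ampersands, Capitalization). The regular expressions are assumptions and are stricter than a human editor would be; for example, they do not exempt colloquial terms or acronyms that end in a period.

```python
import re


def check_label(label: str) -> list[str]:
    """Return the names of the editorial rules a term label appears to break."""
    problems = []
    # Abbreviations: a letter followed by a period at the end of a word, e.g. "Info."
    if re.search(r"[A-Za-z]\.(\s|$)", label):
        problems.append("Abbreviations")
    # Ampersands: the word "and" should be written as "&".
    if re.search(r"\band\b", label, flags=re.IGNORECASE):
        problems.append("Ampersands")
    # Capitalization: every word starts with a capital, and a multi-word
    # label is not written entirely in capitals.
    words = re.findall(r"[A-Za-z]+", label)
    starts_lower = any(word[0].islower() for word in words)
    all_caps = len(words) > 1 and all(word.isupper() for word in words)
    if starts_lower or all_caps:
        problems.append("Capitalization")
    return problems


for label in ["Public Info.", "Licensing and Compliance", "Customer service"]:
    print(label, "->", check_label(label))
```

Labels that pass these checks still need a human read for the rules that depend on judgment, such as scope notes and sources of terms.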

Task-based testing*

* Based on Donna Maurer's usability work with the Australian government

- 15 representative questions were selected
  - Perspective of various organizational units
  - Most frequent website searches
  - Most frequently accessed website content
  - Correct answers to the questions were agreed in advance by team.
- 15 users were tested
  - Did not work for the organization
  - Represented target audiences
- Testers were asked "where would you look for ..." under which facet: Topic, Commodity, or Geography?
  - Then, under which category?
  - Then, under which sub-category?
  - Tester choices were recorded
- Testers were asked to think aloud
  - Notes were taken on what they said
- Pre- and post-questions were asked
  - Tester answers were recorded

Task-based testing: Representative questions

1. How much cotton is imported from China?
2. What are the impacts of "mad cow" disease on U.S. meat production, sales?
3. What is the average farm income level in your state?
4. How much of our diet comes from fast food?
5. How many people receive WIC benefits (Special Supplemental Nutrition Program for Women, Infants, and Children)?
6. How much acreage is planted to genetically engineered corn?
7. What is the cost of foodborne illness in the United States?
8. What part of food costs go to farmers, retailers?
9. Which States produce the most tobacco?
10. What percentage of farms in the United States are small farms?
11. What are the costs and benefits associated with providing more traceability in the U.S. food supply?
12. How many people in America don't get enough to eat?
13. What is behind the trade balance (surplus or deficit) in agricultural goods?
14. What is the extent of conservation compliance? How does that impact farmers' decisions?
15. What are the impacts of foreign trade restrictions on U.S. farmers, U.S. food prices?

Task-based testing: Closed card sorting

3. What is the average farm income level in your state?

1. Topics
2. Commodities
3. Geographic Coverage

1. Topics
  1.1 Agricultural Economy
  1.2 Agriculture-Related Policy
  1.3 Diet, Health & Safety
  1.4 Farm Financial Conditions
  1.5 Farm Practices & Management
  1.6 Food & Agricultural Industries
  1.7 Food & Nutrition Assistance
  1.8 Natural Resources & Environment
  1.9 Rural Economy
  1.10 Trade & International Markets

1.4 Farm Financial Conditions
  1.4.1 Costs of Production
  1.4.2 Commodity Outlook
  1.4.3 Farm Financial Management & Performance
  1.4.4 Farm Income
  1.4.5 Farm Household Financial Well-being
  1.4.6 Lenders & Financial Markets
  1.4.7 Taxes

Task based testing Card sort analysis


Find-it Tasks
1. Cotton 2. Mad cow 3. Farm income 4. Fast food 5. WIC 6. GE Corn 7. Foodborne illness 8. Food costs 9. Tobacco 10. Small Farms 11. Traceability 12. Hunger 13. Trade balance 14. Conservations 15. Trade restrictions

User 1
Cotton Cattle Farm Income

User 2
Cotton Food Safety Farm Income Asia Cattle

User 3

User 4
Cotton Cattle Farm Income

User 5
Cotton Cattle Farm Income Diet Quality & Nutrition WIC Program Corn

US States

Food Consumption Diet Quality & Nutrition WIC Program Corn WIC Program Corn

Food Expenditures Diet Quality & Nutrition WIC Program Corn WIC Program Corn

Foodborne Disease Foodborne Disease Consumer Food Safety Food Prices Tobacco Farm Structure Food System Food Security Market Structure Tobacco Farm Structure Labeling Policy Food Security Market Analysis Tobacco Farm Structure Food Safety Innovations Food Security

Foodborne Disease Foodborne Disease Food Expenditures Retailing & Wholesaling Tobacco Farm Structure Tobacco Farm Structure

Food Safety Policy Food Prices Food Security Food Security Commodity Trade

Commodity Trade Trade & Intl Markets

Commodity Trade Market Analysis

Cropping Practices Conservation Policy Conservation Policy Conservation Policy Conservation Policy Trade Policy Food Safety & Trade WTO Market Analysis Commodity Trade

Task based testing Card sort results


- In 80% of the trials users looked for information under the categories that we expected them to look for it.
- Breaking up topics into facets makes it easier to find information, especially information related to commodities.

Task based testing Card sort results


Test Questions | % Correct | % Agree
1. Cotton | 91% | 82%
2. Mad cow | 73% | 64%
3. Farm income | 100% | 55%
4. Fast food | 91% | 73%
5. WIC | 100% | 100%
6. GE corn | 100% | 100%
7. Foodborne illness | 82% | 82%
8. Food costs | 55% | 27%
9. Tobacco | 100% | 100%
10. Small farms | 91% | 91%
11. Traceability | 36% | 18%
12. Hunger | 100% | 73%
13. Trade balance | 36% | 64%
14. Conservation | 91% | 91%
15. Trade restrictions | 55% | 36%

Legend: Possible change required. Change required.

Notes:
- Policy of Traceability needs to be clarified. Use quasi-synonyms.
- On these trials, only 50% looked in the right category, & only 27-36% agreed on the category.
- Possible error in categorization of this question because 64% thought the answer should be Commodity Trade.
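The two percentages can be computed directly from the recorded tester choices. The sketch below is one way to do it under stated assumptions: "% Correct" is taken as the share of testers whose choice falls in the set of answers agreed in advance, and "% Agree" as the share who picked the single most common category, whether or not it was correct. The slides do not define "% Agree", so that reading, the function name `score_task`, and the sample choices are all illustrative.

```python
from collections import Counter


def score_task(choices: list[str], correct: set[str]) -> tuple[int, int]:
    """Return (% correct, % agree) for one find-it task, rounded to whole percents.

    choices: the category each tester picked for this task.
    correct: the categories agreed in advance as right answers.
    """
    n = len(choices)
    pct_correct = 100 * sum(choice in correct for choice in choices) / n
    # Assumed definition: agreement is the share of testers who chose the
    # most popular category, right or wrong.
    most_common_count = Counter(choices).most_common(1)[0][1]
    pct_agree = 100 * most_common_count / n
    return round(pct_correct), round(pct_agree)


# Illustrative data only: 11 testers, 7 of whom pick a category that was
# not the agreed answer.
choices = ["Commodity Trade"] * 7 + ["Market Analysis"] * 4
print(score_task(choices, correct={"Market Analysis"}))
```

A task where agreement is high but correctness is low, as in this example, suggests the agreed answer rather than the testers may be what needs revisiting.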

Task-based testing User satisfaction survey


- Was it easy, medium or difficult to choose the appropriate Topic?
  Easy / Medium / Difficult
- Was it easy, medium or difficult to choose the appropriate Commodity?
  Easy / Medium / Difficult
- Was it easy, medium or difficult to choose the appropriate Geographic Coverage?
  Easy / Medium / Difficult

User satisfaction survey Results


[Chart: difficulty rating by facet (Topic, Commodity, Geography). Vertical scale runs from Easy to Difficult, with ticks from 0.50 to 2.00; higher is more difficult.]

User interface survey Which search UI is better?


- Criteria
  - User satisfaction
  - Success completing tasks
  - Confidence in results
  - Fewer dead ends
- Methodology
  - Design tasks from specific to general
  - Time performance
  - Calculate success rates
  - Survey subjective criteria
  - Pay attention to survey hygiene: participant selection, counterbalancing, T-scores

Source: Yee, Swearingen, Li, & Hearst

User interface survey Results (1)


Which interface would you rather use for these tasks? | Google-like Baseline | Faceted Category
Find images of roses | 15 | 16
Find all works from a certain period | 2 | 30
Find pictures by 2 artists in the same media | 1 | 29

Overall assessment | Google-like Baseline | Faceted Category
More useful for your usual tasks | 4 | 28
Easiest to use | 8 | 23
Most flexible | 6 | 24
More likely to result in dead-ends | 28 | 3
Helped you learn more | 1 | 31
Overall preference | 2 | 29

Source: Yee, Swearingen, Li, & Hearst

User interface survey Results (2)


[Bar chart: ratings on a 0-9 scale for the Google-like Baseline and Faceted Category interfaces on Easy to Use, Simple, Flexible, Tedious, Interesting, Easy to Browse, Enjoyable, and Overwhelming.]

Source: Yee, Swearingen, Li, & Hearst

Tagging samples How many items?


Goal | Number of Items | Criteria
Illustrate metadata schema | 1-3 | Random (excluding junk)
Develop training documentation | 10-20 | Show typical & unusual cases
Qualitative test of small vocabulary (<100 categories) | 25-50 | Random (excluding junk)
Quantitative test of vocabularies* | 3-10X number of categories | Use computer-assisted methods when more than 10-20 categories. Pre-existing metadata is the most meaningful.

* Quantitative methods require large amounts of tagged content. This requires specialists, or software, to do tagging. Results may be very different than how real users would categorize content.
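The "3-10X number of categories" guideline is simple arithmetic, shown here as a short sketch. The function name `quantitative_sample_range` is hypothetical; the multipliers come from the table, and the example vocabulary size of 179 terms is borrowed from the U.S. Dept. of Ed. example later in the deck.

```python
def quantitative_sample_range(num_categories: int) -> tuple[int, int]:
    """Return the (minimum, maximum) number of tagged items suggested by
    the 3-10X guideline for a quantitative vocabulary test."""
    return 3 * num_categories, 10 * num_categories


low, high = quantitative_sample_range(179)
print(f"Tag between {low} and {high} items")
```

Sample sizes in this range are why the table recommends computer-assisted methods or pre-existing metadata for anything beyond a small vocabulary.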

Tagging samples Manually tagged metadata sample


Attribute | Values
Title | Jupiter's Ring System
URL | http://ringmaster.arc.nasa.gov/jupiter/
Description | Overview of the Jupiter ring system. Many images, animations and references are included for both the scientist and the public.
Content Types | Web Sites; Animations; Images; Reference Sources
Audiences | Educators; Students
Organizations | Ames Research Center
Missions & Projects | Voyager; Galileo; Cassini; Hubble Space Telescope
Locations | Jupiter
Business Functions | Scientific and Technical Information
Disciplines | Planetary and Lunar Science
Time Period | 1979-1999

Tagging samples Spreadsheet for tagging 10s-100s of items


1) Clickable URLs for sample content
2) Review small sample and describe
3) Drop-down for tagging (including Other entry for the unexpected)
4) Flag questions

Rough bulk tagging Facet demo (1)


- Collections: 4 content sources
  - NTRS, SIRTF, Webb, Lessons Learned
- Taxonomy
  - Converted MultiTes format into RDF for Seamark
- Metadata
  - Converted from existing metadata on web pages, or
  - Created using simple automatic classifier (string matching with terms & synonyms)
  - 250k items, ~12 metadata fields, 1.5 weeks effort
- OOTB Seamark user interface, plus logo
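A "simple automatic classifier (string matching with terms & synonyms)" can be very small. The sketch below is an assumption about what such a classifier might look like, not the one used in the demo: the vocabulary, its synonyms, and the function name `classify` are all hypothetical. It assigns a preferred term whenever the term or one of its synonyms appears in the text as a whole word or phrase, ignoring case.

```python
import re

# Hypothetical vocabulary: preferred term -> list of synonyms.
VOCABULARY = {
    "Jupiter": ["Jovian"],
    "Hubble Space Telescope": ["HST", "Hubble"],
    "Images": ["image", "photo", "photos"],
}


def classify(text: str, vocabulary: dict[str, list[str]]) -> list[str]:
    """Return the preferred terms whose label or synonyms occur in the text."""
    tags = []
    for term, synonyms in vocabulary.items():
        for phrase in [term, *synonyms]:
            # Whole-word, case-insensitive match on the phrase.
            pattern = r"\b" + re.escape(phrase) + r"\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                tags.append(term)
                break  # one hit is enough to assign the preferred term
    return tags


description = ("Overview of the Jupiter ring system. Many images, animations "
               "and references are included.")
print(classify(description, VOCABULARY))
```

String matching of this kind is rough: it cannot tell what a document is mainly about, which is why the slide calls the result rough bulk tagging.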

Rough bulk tagging Facet demo (2)


Document distribution How evenly does it divide the content?


- Documents do not distribute uniformly across categories
- Zipf (1/x) distribution is expected behavior
- 80/20 rule in action (actually 70/20 rule)

[Chart: Measured v Expected Distribution of Top 10 Content Types in Library of Congress Database. Y-axis: Number of Records (0 to 350,000). X-axis: Top 10 Content Types, including Bibliography, Congresses, Biography, Fiction, Juvenile literature, Periodicals, Maps, Exhibitions, and Statistics. Annotations: Leading candidate for splitting; Leading candidates for merging.]
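The expected curve in a measured-versus-expected comparison can be sketched as follows, assuming the expected count for the category at rank r is proportional to 1/r and that the expected counts sum to the measured total. The function names and the sample counts are hypothetical, not the Library of Congress figures.

```python
def zipf_expected(total: float, num_categories: int) -> list[float]:
    """Expected counts for ranks 1..n when counts are proportional to 1/rank."""
    harmonic = sum(1 / rank for rank in range(1, num_categories + 1))
    return [total / (rank * harmonic) for rank in range(1, num_categories + 1)]


def compare(measured: list[int]) -> list[tuple[int, int, float]]:
    """Return (rank, measured count, expected count) rows, largest category first."""
    ranked = sorted(measured, reverse=True)
    expected = zipf_expected(sum(ranked), len(ranked))
    return [(rank, m, round(e, 1))
            for rank, (m, e) in enumerate(zip(ranked, expected), start=1)]


# Made-up category counts for illustration.
for row in compare([120, 900, 300, 80, 450]):
    print(row)
```

Categories that sit far above the expected curve are the ones the chart marks as candidates for splitting, and those far below it as candidates for merging.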

Document distribution How evenly does it divide the content?


- Methodology: 115 randomly selected URLs from corporate intranet search index were manually categorized. Inaccessible files and junk were removed.
- Results: Slightly more uniform than Zipf distribution. Above the curve is better than expected.

[Chart: Measured v Expected Intranet Content Type Distribution. Y-axis: # Documents (0 to 25). X-axis: Content Type (News & Events; People, Groups & Places; Operations & Internal Communications; Regulations, Policies, Procedures & Templates; Papers & Presentations; Other & Unclassified; Marketing & Sales; Programs, Proposals, Plans & Schedules; Manuals & Learning Materials).]

Document distribution How does taxonomy shape match that of content?


Background:
- Hierarchical taxonomies allow comparison of fit between content and taxonomy areas

Methodology:
- 25,380 resources tagged with taxonomy of 179 terms. (Avg. of 2 terms per resource)
- Counts of terms and documents summed within taxonomy hierarchy

Results:
- Roughly Zipf distributed (top 20 terms: 79%; top 30 terms: 87%)
- Mismatches between term% and document% flagged

Term Group | % Terms | % Docs
Administrators | 7.8 | 15.8
Community Groups | 2.8 | 1.8
Counselors | 3.4 | 1.4
Federal Funds Recipients and Applicants | 9.5 | 34.4
Librarians | 2.8 | 1.1
News Media | 0.6 | 3.1
Other | 7.3 | 2.0
Parents and Families | 2.8 | 6.0
Policymakers | 4.5 | 11.5
Researchers | 2.2 | 3.6
School Support Staff | 2.2 | 0.2
Student Financial Aid Providers | 1.7 | 0.7
Students | 27.4 | 7.0
Teachers | 25.1 | 11.4

Source: Courtesy Keith Stubbs, U.S. Dept. of Ed.
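Flagging mismatches between term% and document% can be sketched as a ratio test. The slide does not say how mismatches were flagged, so the threshold below (a factor of 3 in either direction) and the function name `flag_mismatches` are assumptions; the percentages are the ones given on the slide.

```python
# (term group, % of taxonomy terms, % of tagged documents), from the slide.
GROUPS = [
    ("Administrators", 7.8, 15.8),
    ("Community Groups", 2.8, 1.8),
    ("Counselors", 3.4, 1.4),
    ("Federal Funds Recipients and Applicants", 9.5, 34.4),
    ("Librarians", 2.8, 1.1),
    ("News Media", 0.6, 3.1),
    ("Other", 7.3, 2.0),
    ("Parents and Families", 2.8, 6.0),
    ("Policymakers", 4.5, 11.5),
    ("Researchers", 2.2, 3.6),
    ("School Support Staff", 2.2, 0.2),
    ("Student Financial Aid Providers", 1.7, 0.7),
    ("Students", 27.4, 7.0),
    ("Teachers", 25.1, 11.4),
]


def flag_mismatches(groups, ratio: float = 3.0) -> list[str]:
    """Return groups whose share of terms and share of documents differ
    by at least `ratio` times in either direction (assumed threshold)."""
    flagged = []
    for name, pct_terms, pct_docs in groups:
        larger, smaller = max(pct_terms, pct_docs), min(pct_terms, pct_docs)
        if smaller == 0 or larger / smaller >= ratio:
            flagged.append(name)
    return flagged


print(flag_mismatches(GROUPS))
```

A group with many terms but few documents (such as Students here) may be over-built, while one with few terms but many documents (such as Federal Funds Recipients and Applicants) may need finer subdivision.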

Usability testing How intuitive (repeatable) are the categorizations (1)?


- Methodology: Closed Card Sort
  - For alpha test of a grocery site
  - 15 testers put each of 71 best-selling product types into one of 10 pre-defined categories
  - Categories where fewer than 14 of 15 testers put product into same category were flagged

Usability testing How intuitive (repeatable) are the categorizations (2)?


Usability testing How intuitive (repeatable) are the categorizations?


% of Testers | Cumulative % of Products | With Poly-Hierarchy
15/15 | 54% | 69%
14/15 | 70% | 83%
13/15 | 77% | 93%
12/15 | 83% | 100%
11/15 | 85% | 100%
<11/15 | 100% | 100%

The #1 underused source of quantitative information on how to improve your taxonomy?

Query Logs & Click Trails


Query log & click trail examination Who are the users & what are they looking for?
- Only 30-40% of organizations regularly examine their logs*.
- Sophisticated software available, but don't wait.
- 80% of value comes from basic reports

Query log & click trail examination Query log


UltraSeek Reporting
- Top queries
- Queries with no results
- Queries with no click-through
- Most requested documents
- Query trend analysis
- Complete server usage summary
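Three of these basic reports need nothing more than a counter. The sketch below is not UltraSeek's reporting; it assumes a hypothetical log in which each row records the query text, the number of results returned, and the number of results clicked, and the function name `basic_reports` is illustrative.

```python
from collections import Counter

# Hypothetical log rows: (query, number of results, number of clicks).
LOG = [
    ("annual leave", 12, 1),
    ("Annual Leave", 12, 0),
    ("expence form", 0, 0),
    ("org chart", 5, 0),
    ("annual leave", 12, 2),
]


def basic_reports(log, top_n: int = 10) -> dict:
    """Build top queries, no-result queries, and no-click-through queries."""
    # Normalize case and whitespace so variants of a query count together.
    rows = [(query.strip().lower(), results, clicks)
            for query, results, clicks in log]
    top_queries = Counter(query for query, _, _ in rows).most_common(top_n)
    no_results = sorted({query for query, results, _ in rows if results == 0})
    # Total clicks per query, counting only searches that returned results.
    clicks_by_query = Counter()
    for query, results, clicks in rows:
        if results > 0:
            clicks_by_query[query] += clicks
    no_click_through = sorted(
        query for query, total in clicks_by_query.items() if total == 0)
    return {
        "top_queries": top_queries,
        "no_results": no_results,
        "no_click_through": no_click_through,
    }


for name, report in basic_reports(LOG).items():
    print(name, report)
```

Queries with no results often point to missing synonyms or misspellings, and queries with no click-through to results that do not look relevant, both of which feed back into the taxonomy.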

Query log & click trail examination Click trail packages


- iWebTrack
- NetTracker
- OptimalIQ
- SiteCatalyst
- Visitorville
- WebTrends

Summary Start a Measure & Improve mindset


- Taxonomy changes do not stand alone
  - Search system improvements
  - Navigation improvements
  - Content improvements
  - Process improvements

Benchmarking exercise
- What are 5 representative questions that your users ask or tasks that your users do when using your application?
- Is it currently easy, medium or difficult to answer these questions or accomplish these tasks?

Questions or Tasks | Rating (Easy/Medium/Difficult)

Conclusion What is a good taxonomy?


- Incremental, extensible process that identifies and enables owners, and engages stakeholders.
- Quick implementation that provides measurable results as quickly as possible.
- A means to an end, and not the end in itself.
- Not perfect, but it does the job it is supposed to do, such as improving search and navigation.
- Improved over time, and maintained.


Tagging Overview
- Tagging is better than the words that happen to occur in a piece of content.
- All tagging is useful
  - End user tagging
  - Tagging by librarians
  - Automated tagging by OS and algorithms
- Content should be tagged throughout its lifecycle, each time the content is handled and used, so that it accrues value or its significance is diminished.

MS Office: File Properties

How many people fill this in?

Organize

How many people click on this?

What is social tagging?


- End user tagging
- Easy, intuitive tagging interfaces
- Almost instantaneous feedback
  - Enables people to tag & re-tag content in response to seeing their tags in context with other tags.
- Emergent categories
  - Resembles open card sort process in which patterns emerge rather than validating categories using closed card sorts.

Social tagging innovators


- flickr founders
  - Caterina Fake
  - Stewart Butterfield
- del.icio.us founder
  - Joshua Schachter
- del.icio.us & flickr are now both part of Yahoo!
- As of April 2006 flickr had 130 million photos posted by 3 million registered users.

Four tagging rules for end users


Rule | Description
Use specific terms | Apply the most specific terms when tagging content. But do not tag every possible topic, just the ones that are most important or best characterize the content as a whole.
Use multiple terms | Use as many terms as necessary to describe overall what the content is about & why it is important. Do not over-tag.
Use appropriate terms | Only fill in the facets & values that make sense. Not all facets apply to all content.
Consider how content will be used | Anticipate how the content will be searched for in the future, & how to make it easy to find it. Remember that search engines can only operate on explicit information.

Agenda
- Content Tagging
- Tagging Interface

Requirements for a tagging interface


- Automated form fill-in (automatically fills in known data)
- Tagging precedents (see tags already assigned by others)
- Controlled vocabularies, e.g., with pull-down list
- Multi-valued tags
- Geo-tagging
- Group tagging
- Clean-up tag tools, e.g., alpha list
- Batch editing
- Share/Don't share (Public/Private)
- Identified owner (who can be emailed)
- Almost immediate feedback, e.g., tag cloud

Form fill-in: Automatically filled-in known data

Form fill-in: Automatically filled-in known data

Manual form fill-in w/ check boxes, pull-down lists, etc.

Auto keyword & summarization

Form fill-in: Automatically filled-in known data

Auto-categorization Rules & pattern matching

Parse & lookup (recognize names)

Tagging precedents: See tags assigned by others

Multi-valued group tagging

Group geo-tagging

Group geo-tagging

Clean up tag tools: Alpha list

Batch edit

Share or don't share tagging

Bulk tagging
- ID collection of related content items by pattern or context
- Then, apply same attributes to all content items

Tag a folder
- Drag & drop content items into folder
- Then, content items inherit properties of folder
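Folder inheritance can be sketched as a merge of the folder's tags into each item dropped into it. The classes below are hypothetical and are not drawn from any particular content management product. The sketch assumes that tags are grouped by facet, that each facet can hold several values, and that inherited values are added to an item's existing tags rather than replacing them.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    name: str
    # facet name -> set of values already assigned to this item
    tags: dict[str, set[str]] = field(default_factory=dict)


@dataclass
class Folder:
    name: str
    tags: dict[str, set[str]]
    items: list[Item] = field(default_factory=list)

    def drop(self, item: Item) -> None:
        """Add an item to the folder and merge the folder's tags into it."""
        for facet, values in self.tags.items():
            item.tags.setdefault(facet, set()).update(values)
        self.items.append(item)


folder = Folder("Jupiter images",
                {"Locations": {"Jupiter"}, "Content Types": {"Images"}})
photo = Item("ring.jpg", {"Content Types": {"Animations"}})
folder.drop(photo)
print(photo.tags)
```

Merging rather than overwriting keeps any tags an author applied before the item was filed, which fits the idea that content accrues tags throughout its lifecycle.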

Workflow
- Approve & improve mindset

Create Content -> Review & Improve -> Add Metadata -> Review & Improve -> Publish

Interactive rewards
- Almost instantaneous exposure of tags in simple user interfaces on the web provides positive reinforcement for user tagging that simply did not exist before.
- For example,
  - Most popular
  - Tag clouds
  - Alerts

Most popular

Another example is "most emailed" from, e.g., the NY Times.

Tag cloud


Alerts
- New (content selected by date)
- Subscriptions (content selected by tags)
- Interest (content selected by other people)
- Individual (content selected for you by other people)

Is faceted indexing the future of social tagging?

Tagging exercise: Blog tagging (a)

ALA Tech Source. http://www.techsource.ala.org/blog/2007/04/google-buys-oclc-announces-new-products.html



Tagging exercise: Blog tagging (b)

HBSP. http://discussionleader.hbsp.com/davenport/2007/04/cause_and_effect_reporting_raw.html#comments

Tagging exercise: Taxonomy facets, definitions

Taxonomy Facets | Descriptions
Business activity | Use for common business function or activity such as finance, marketing and sales.
Industry / Product | Use for content that is about or related to an industrial sector or product such as construction equipment.
Geography | Use for content that is about a region, country or city.
Organization | Use for named organizations, brands and business entities.
Person / Role | Use for named people and the roles people have in organizations.
Content Type | Use for content genres such as letters, memos and reports.
Audience | Use to indicate the intended audience.
Topic | Use for other business and associated topics that the content is about or related to.

Tagging exercise: Taxonomy facets, values

Business activity: Accounting; Auditing; Finance; HR management; IT; Marketing; Operations management; Sales

Geography: Africa; Americas; Antarctica; Asia; Europe; Oceania; Global; Historical geography; Oceans & seas; Regions

Industry / Product: Agriculture; Mining; Utilities; Construction; Manufacturing; Wholesale trade; Retail trade; Transportation & warehousing; Information; Finance & insurance; Real estate; Professional; Management; Administrative support; Education; Health care; Arts, entertainment & recreation; Accommodation & food; Other services; Public administration

Organization / Entity: Business entities; Companies & brands; Government agencies; International NGOs; Organization types

People / Role: Business Leaders; Thought Leaders; Political Leaders; Roles

Content Type: Basic facts & information; Blog; Brochure; Database; E-mail; Letter; Memo; Multimedia; Report; Newsletter; Podcast; Press Release; Research & Analysis; RSS Feed

Audience: Consumer; Employee; Manager; Executive

Taxonomy Facets | Tags
Business activity |
Industry / Product |
Geography |
Organization |
Person / Role |
Content Type |
Audience |
Topic |

Summary
- There are lessons to be learned from web tagging about how to get good metadata in document and content management applications.
- Document and content management system tagging must be simple, and it must almost instantaneously make relevant work products easier to find.

Taxonomy Strategies

LLC

Questions?
Joseph A. Busch + 415-377-7912 jbusch@taxonomystrategies.com http://www.taxonomystrategies.com
