
A Metadata Primer (and why you should care)

If you are looking for information on what metadata is and how it can help your customers and end users find your data, you’ll want to pay close attention to Don’s latest blog entry.


The fundamental selling point of DITA (Darwin Information Typing Architecture) XML is content reuse—storing small chunks of content in one place and using them across multiple instances.

Content reuse dramatically decreases the volume of content technical publications teams must manage. It also increases quality by ensuring consistent messaging across all documentation.

Soren Weimann has a video explaining the idea of content reuse in DITA using Lego™ blocks (which represent individual topics), paper outlines of Lego™ blocks (maps), a bowl (to represent a Content Management System) and a camera (to illustrate output).

One of the oft-cited challenges of content reuse is that it is hard to do in practice. Many organizations that have implemented DITA have little or no content reuse; it is typically treated as a future enhancement that has not been implemented yet.

The most common complaint: “It’s hard to find similar topics.”

A comprehensive metadata strategy solves this challenge.

What is metadata?

Metadata is information about your content—think of it as the label on a can of soup.

Without opening the can, the label tells you:

  • Ingredients
  • Nutritional information
  • Directions for use
  • Related recipes

Similarly, metadata helps you understand a topic’s content without opening the file. When organized effectively, it becomes a powerful search and classification tool.

Why metadata matters for content reuse

Metadata makes similar topics easier to find, which is critical for reuse.

Example: Your company is developing the SuperWorks 300 based on the SuperWorks 200, with improved electronics and remote update capabilities.

With proper metadata, you can quickly identify:

  • Reusable content from SuperWorks 200
  • Topics requiring updates for new electronics features
  • Gaps where new content is needed

Sorting topics this way saves time and improves accuracy.
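As a minimal sketch of that triage, assume each topic carries a set of applicable products and a set of feature tags. The `Topic` class, the sample topics, and the feature names below are all hypothetical and exist only to illustrate the idea; a real DITA repository would store this metadata in the topics or in a content management system.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Hypothetical metadata record for one topic."""
    title: str
    products: set
    features: set = field(default_factory=set)


# Hypothetical inventory of existing SuperWorks 200 topics.
topics = [
    Topic("Installing the unit", {"SuperWorks 200"}),
    Topic("Replacing the control board", {"SuperWorks 200"}, {"electronics"}),
    Topic("Cleaning the housing", {"SuperWorks 200"}),
]


def triage(topics, base_product, changed_features, new_features):
    """Split the base product's topics into reusable and needs-update,
    and list new features that no existing topic covers."""
    base = [t for t in topics if base_product in t.products]
    reusable = [t.title for t in base if not (t.features & changed_features)]
    needs_update = [t.title for t in base if t.features & changed_features]
    covered = set().union(*(t.features for t in topics))
    gaps = sorted(new_features - covered)
    return reusable, needs_update, gaps


reusable, needs_update, gaps = triage(
    topics,
    base_product="SuperWorks 200",
    changed_features={"electronics"},
    new_features={"remote update"},
)
print(reusable)
print(needs_update)
print(gaps)
```

The three lists correspond to the three bullets above: content to reuse as is, topics to revise, and features that still need new content.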

What is a metadata strategy?

Like a soup label, technical content metadata should include:

  • Topic information – What it covers
  • Product applicability – Which products/models
  • Authorship – Who created or revised it

An effective metadata strategy balances:

  1. Current needs – Understanding existing processes and pain points
  2. Future requirements – Anticipating product evolution

Important: Metadata strategy evolves with your product lines. What was critical 20 years ago (e.g., “uses Freon-12”) may now be obsolete. New criteria (e.g., “IoT sensor capability”) emerge constantly.

Options for (useful) metadata include:

  • Author information
  • Audience information
  • Content information/usage
  • Applicable product information
  • Review status

Our experience is that full-text search is not a substitute for a metadata strategy because it returns too many false positives.
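A small sketch shows how such a false positive can arise. The two topics and their field names are made up for illustration: the second topic merely mentions the SuperWorks 200 in its text, while its metadata says it applies to the SuperWorks 300.

```python
# Hypothetical topics: "products" is metadata, "body" is the topic text.
topics = [
    {
        "title": "Replacing the filter",
        "products": ["SuperWorks 200"],
        "body": "Remove the SuperWorks 200 filter cover.",
    },
    {
        "title": "Upgrade notes",
        "products": ["SuperWorks 300"],
        "body": "Unlike the SuperWorks 200, this model updates remotely.",
    },
]


def full_text_search(topics, phrase):
    """Return titles of topics whose body text contains the phrase."""
    return [t["title"] for t in topics if phrase.lower() in t["body"].lower()]


def metadata_search(topics, product):
    """Return titles of topics whose metadata lists the product."""
    return [t["title"] for t in topics if product in t["products"]]


print(full_text_search(topics, "SuperWorks 200"))
print(metadata_search(topics, "SuperWorks 200"))
```

The full-text query returns both topics, including the one that only mentions the older model in passing; the metadata query returns only the topic that actually applies to it.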

A (potential) boon for end-users

Extending metadata into and outside of the enterprise is not just for content reuse; it also helps end users find that needle in the (content) haystack. Examples include supporting faceted search on a website, retrieving a repair topic based on diagnostic codes, or serving up content based on product features.

Amazon uses metadata to help you zero in on (and buy) exactly what you are looking for. A search for “solar panel” lets me narrow the results by:

  • Rated Power
  • Feature keyword
  • Brand
  • Avg. Customer Review

What about taxonomy?

Related to metadata is taxonomy, the practice and science of classification. The basic idea behind a taxonomy is to provide a controlled vocabulary for metadata attributes and to specify relationships between the terms in that vocabulary. With a taxonomy, a search for one thing can automatically return results related to that thing. If you use Amazon, you are probably familiar with the concept: selecting a hammer also shows items related to a hammer.

Screwdrivers and tool bags are classified similarly to hammers, and an interest in one may lead to an interest in another.
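A toy taxonomy makes the two ingredients concrete: a controlled vocabulary (only known terms are accepted) and relationships between terms. The category names and the link between categories below are assumptions chosen to match the hammer example, not a real product classification.

```python
# Hypothetical controlled vocabulary: each term maps to its broader category.
BROADER = {
    "hammer": "hand tools",
    "screwdriver": "hand tools",
    "tool bag": "tool storage",
}

# Hypothetical associative links between categories.
RELATED_CATEGORIES = {
    "hand tools": {"tool storage"},
}


def canonical(term):
    """Normalize a term and reject anything outside the vocabulary."""
    key = term.strip().lower()
    if key not in BROADER:
        raise ValueError(f"'{term}' is not in the controlled vocabulary")
    return key


def related_terms(term):
    """Return other terms in the same category or in a linked category."""
    key = canonical(term)
    category = BROADER[key]
    categories = {category} | RELATED_CATEGORIES.get(category, set())
    return sorted(t for t, c in BROADER.items() if c in categories and t != key)


print(related_terms("Hammer"))
```

Looking up "Hammer" returns the screwdriver (same category) and the tool bag (linked category), while a term outside the vocabulary raises an error instead of quietly creating a new tag.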

Putting it together

A thoughtful metadata and taxonomy strategy can be a critical part of implementing a viable content reuse strategy. Investing the time to standardize the vocabulary and identify the “Five Ws” (Who, What, Why, When, Where) goes a long way toward helping authors find similar topics.
