[These are my notes from the KMWorld 2012 Conference. Since I’m publishing them as soon as possible after the end of a session, they may contain the occasional typographical or grammatical error. Please excuse those. To the extent I’ve made any editorial comments, I’ve shown those in brackets.]
- Unstructured Content. Unstructured content is highly varied: it can range from a Twitter feed to a Word document or a scanned image. It can cover as many subjects as arise in our lives.
- Information Management, circa 1950. In the 1950s, the focus was on manual tagging, content management, indexing, search and distribution. What is the same today? The scope of information and the process (e.g., tagging, indexing, search, etc.). The great crime is the continuing reliance on manual tagging. What is different? The technology and the variety of information have changed. Further, we’ve moved from “information overload” to what is currently called “Big Data.” (Calling it Big Data suggests that we can cope with it, in a way that we couldn’t cope with Information Overload.) Other changes are the velocity of information and the complexity of requirements. The complexity relates to different audiences interested in different aspects of the information we have. It also relates to the different uses of that information.
- Flow. For the purposes of information management today, Flow = velocity × volume. (This is not entirely accurate according to fluid dynamics, but works for KM.) Being able to harness information in real time (in that flow) gives you a competitive advantage and efficiencies. It is the relationships between data that present the opportunities.
- Content Intelligence Allows You to “Enrich” Your Content. This is now the Holy Grail for organizations. Knowing our content helps us find opportunities, gives us competitive advantage and helps us stop knowledge from leaking out of the organization. At the heart of content intelligence is labeling = metadata. Another key element of content intelligence is extracting the key information. We also need to classify the content so that we can provide indicators as to what it’s about. Given how much content there is and how many topics are covered by that content, it is impossible to manually tag it all effectively (especially since you don’t know who is looking for content and what they are looking for). Therefore, deriving metadata should occur at the point of use, not at the point of archiving. This is a huge reason why manual tagging can’t work. Content needs to be integrated with the existing collection. Finally, it needs to be found, so we need tools to help surface the content relevant to the person looking for it.
- Closing Questions. Can you afford the risks inherent in manual tagging? Can you afford to ignore customer feedback via huge unstructured data flows (e.g., via social media)? Do you have the means to track trends reliably?
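[To make the "derive metadata at the point of use" idea from the Content Intelligence note a little more concrete, here is a minimal sketch of my own, not something shown in the session. It assumes a toy approach: tags are computed only when someone searches, with the searcher's own terms ranked first and the document's most frequent words filling the rest. The names `STOP_WORDS`, `tokenize` and `derive_tags` are hypothetical, and a real content intelligence product would do far more than count words.]

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it",
              "for", "on", "with", "our", "we"}


def tokenize(text):
    """Lowercase the text and return its alphabetic words, minus stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower())
            if w not in STOP_WORDS]


def derive_tags(document, query, top_n=3):
    """Derive tags when the document is requested, not when it is archived.

    Query terms that occur in the document come first, followed by the
    document's most frequent remaining terms.
    """
    counts = Counter(tokenize(document))

    # Terms the searcher cares about that the document actually contains.
    query_terms = []
    for term in tokenize(query):
        if term in counts and term not in query_terms:
            query_terms.append(term)

    # Remaining terms, most frequent first (ties keep document order).
    frequent = [t for t, _ in counts.most_common() if t not in query_terms]
    return (query_terms + frequent)[:top_n]


doc = "Customers complain about billing delays. Billing errors frustrate customers."
print(derive_tags(doc, "billing complaints"))
print(derive_tags(doc, "delays"))
```

[The same document gets a different set of tags depending on who is asking, which is exactly what a fixed manual tag applied at archiving time cannot do.]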