While the popularity of wikis in the corporate world continues to grow due to the ease with which they allow anyone to publish web-based, shareable and easily accessible content, they also have a number of serious limitations that have been difficult to overcome, mostly because of their inherent design. The most significant of these are:
1. Hierarchical structure of pages -- unless you maintain multiple hierarchical relationships (or references) among your pages, you always have to remember where a specific page (or topic) lives. Not only that, but the consumers of this content must also always be aware of the structure or taxonomy. Without that awareness, endless navigation through the trees or wildcard searches is the only other way to find what you (or they) seek. In the corporate environment, people seldom have the time or opportunity to construct such taxonomies unless it is a dedicated activity or the wiki package comes with some pre-defined and ready-to-use templates.
2. Content updates -- Once a wiki grows to hundreds or thousands of pages, it becomes much more difficult to find content and keep it up-to-date, since updating is a manual process. However, if you can't keep it current, the repository loses its credibility and the wiki quickly becomes shelf-ware.
3. Contextual relevance -- While people create content on wikis at will, its organization and hierarchy are often pretty arbitrary. Furthermore, there may or may not be a context associated with the content, or if there is one, it is often only implied by the author. So you end up with lots of content seemingly without any contextual relevance. Once again, this becomes difficult to maintain and leverage over time.
Despite the good intentions of generating and organizing a lot of content with ease, I have seen wikis become cumbersome, unwieldy and useless fairly quickly.
With this in mind, I envisioned (and began researching two years ago) wikis that could support semantic annotations, thereby adding some needed structure to a naturally unstructured content store. Semantic annotations, I believed, would allow content to become much more self-organized, accessible using contextual searches and queries, and updatable using software agents (where relevant) -- all adding up to some powerful capabilities.
I found a number of projects exploring this concept during this time. The one closest to the vision that I had, and the one that we at zAgile have chosen to leverage in our platform, is the result of efforts by Zdenko (Denny) Vrandečić & the team at Institute AIFB, Universität Karlsruhe, in the form of an extension to MediaWiki called Semantic MediaWiki (or SMW). We believe it to be a very efficient implementation of semantic technology on top of a wiki engine.
Built on a very popular wiki (MediaWiki is the engine behind Wikipedia), this extension allows the user to annotate any page in a free-form manner. It’s the easiest and most user-friendly approach to adding some structure to a vastly unstructured pool of content. Once annotated, pages can be organized in any way using templates and queries, and the content of any or all of these pages can easily be reflected elsewhere through simple semantic queries, rearranged, summarized, tabulated, accessed by external applications, etc. The idea is not much different from the queries and result sets that we are used to getting from relational tables. However, the depth and expressivity of semantic annotations go beyond what we are normally able to express in relational models (of course! but that is another topic), and interfacing with external applications (both in and out; using RDF exports) is much easier.
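To make the comparison with relational queries a bit more concrete, here is a minimal sketch in Python. It assumes SMW's inline `[[property::value]]` annotation syntax; everything else (the `extract_annotations` and `ask` helpers, the page titles, and the property names) is hypothetical and invented for illustration. It is not how SMW itself is implemented; within the wiki you would use SMW's own inline queries instead.

```python
import re
from collections import defaultdict

# SMW-style inline annotation: [[property::value]] or [[property::value|label]]
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def extract_annotations(wikitext):
    """Return {property: [values]} for every [[property::value]] in the text."""
    props = defaultdict(list)
    for name, value in ANNOTATION.findall(wikitext):
        props[name.strip().lower()].append(value.strip())
    return dict(props)


def ask(pages, **conditions):
    """Return the sorted titles of pages whose annotations meet every condition."""
    hits = []
    for title, text in pages.items():
        props = extract_annotations(text)
        if all(value in props.get(name, []) for name, value in conditions.items()):
            hits.append(title)
    return sorted(hits)


# Hypothetical pages, invented for this example.
PAGES = {
    "Status Report 12": "Written by [[author::Lucas]] for [[project::ProjectX]].",
    "Design Notes": "Drafted by [[author::Maria]] for [[project::ProjectX]].",
    "Invoice 7": "Issued by [[author::Lucas]] for [[project::ProjectY]].",
}

if __name__ == "__main__":
    print(ask(PAGES, author="Lucas"))
    print(ask(PAGES, author="Lucas", project="ProjectX"))
```

The point of the sketch is only that, once pages carry annotations, "all pages by Lucas on ProjectX" becomes a query over properties rather than a walk through a page hierarchy.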
This capability, in my opinion, makes the wiki a very dynamic and valuable application.
For our purposes, we have taken this foundation of MediaWiki + SMW and extended it in the zAgile Platform to provide some interesting and useful functionality for the Software Engineering domain, although the functionality would be equally relevant anywhere else (our focus is only SE, for now :)). We have integrated applications with it both to create content in the wiki and to access content from it.
We have enabled the wiki to become a valuable 'Knowledge Repository' rather than a 'Content Repository'.
· First, we have extended (or constrained, depending upon your point-of-view) the general free-form annotation capability of SMW with a well-defined (and expressive) set of ontologies for the Software Engineering domain. Thus, the wiki now has the 'semantic awareness' of typical SE concepts, such as Artifact, Project, Person, Role, Team, Process, Work Product, Task, etc. This set will continue to grow with a richer and broader set of ontologies as we move forward. Rather than allowing the users to arbitrarily create annotations, we constrain the wiki's semantic 'openness' by importing (or referencing) a very specific set of ontologies. This is mostly to ensure semantic integrity of the knowledge domain, as well as interoperability between tools (the core value of the zAgile platform). If you continue to create arbitrary 'semantics' in your content, it is unlikely that anyone else will understand the meaning. So constraining the annotations has been a critical part of our efforts.
· Second, we have developed tools that generate 'annotated' content in the wiki. Using zAgile's Method Composer, for example, you can define (or derive from) a methodology that may be relevant for your organization or project -- and publish it on the wiki with appropriate annotations that describe Lifecycles, Milestones, Tasks, Documents, Roles, etc. Using our Doc2Wiki converter, you can take MS Office, OpenOffice or PDF files and load them into the wiki in a WYSIWYG form, with annotations that describe basic properties such as Creation/Modification Dates, Author, Subject, Title, etc. The converter also allows you to extend the semantics of the documents based on the type being uploaded (an Invoice, a Status Report, a Project Plan, etc.).
With these applications and widgets generating 'annotated' content on the wiki, you can now begin to organize this content using simple queries. For example, a set of pre-defined templates (with embedded queries) can now automatically generate a My ProjectX page on the wiki: a page that includes sections containing My ProjectX Milestones, Processes, Phases, Teams, Roles, Status Reports, Invoices, Artifacts (and types), etc. None of this needs to be organized or maintained manually. Based on the templates and queries, any combination or composition of pages is possible, thus allowing you to create virtual binders of projects, products, people, processes -- at will -- and share them across the organization. This has been done before by many others using wikis, but only manually, and most of the time such content has been incomplete and out-of-date. That is always a good start, but it is not sustainable through the project lifecycle and therefore generates lots of useless content.
· And finally, we have extended the typical search capability across this content using our Semantic Search engine. This is a slight paradigm shift from the way we normally use search. Rather than searching on keywords, which can return any type of content match, you can now search for a specific concept. In this model, for example, one is looking for Person-, Document-, Project- or Product-related information. Thus the search string provided will return results that match the concept being searched on. Performing a search on a document with the string "Lucas" returns documents authored by Lucas. Similarly, a search on a project with the string "Lucas" will return Projects in which Lucas may have played some role. Rather than returning a free-form result set, the search returns not only the concept being searched but also relevant properties and relations to other concepts (Document X -- authored by Lucas; created on Jan 19, 2008; copyrighted by zAgile Inc.; associated with Project Y; created in Inception Phase, etc.).
And of course, this search engine exists outside of the wiki (as a portlet) -- another example of the ease of interaction of semantic data with external apps.
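As a rough illustration of concept-based search (and emphatically not a description of how zAgile's Semantic Search engine is built), here is a small Python sketch. The `Entity` class, the `semantic_search` function, the "team member" property, and the "Document Z" / "Maria" entries are all hypothetical names made up for the example; the Document X properties are taken from the example above.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    concept: str  # e.g. "Document", "Project", "Person"
    name: str
    properties: dict = field(default_factory=dict)


def semantic_search(entities, concept, text):
    """Return (entity, matching properties) pairs for entities of one concept.

    An entity matches when any of its property values contains the search
    text (case-insensitive). Entities of other concepts are never returned.
    """
    needle = text.lower()
    results = []
    for entity in entities:
        if entity.concept != concept:
            continue
        matched = {
            key: value
            for key, value in entity.properties.items()
            if needle in str(value).lower()
        }
        if matched:
            results.append((entity, matched))
    return results


# Sample data; "team member", "Document Z" and "Maria" are hypothetical.
ENTITIES = [
    Entity("Document", "Document X", {
        "authored by": "Lucas",
        "created on": "Jan 19, 2008",
        "associated with": "Project Y",
        "created in": "Inception Phase",
    }),
    Entity("Document", "Document Z", {"authored by": "Maria"}),
    Entity("Project", "Project Y", {"team member": "Lucas"}),
]

if __name__ == "__main__":
    for entity, matched in semantic_search(ENTITIES, "Document", "Lucas"):
        # Show the hit along with all of its properties and relations.
        print(entity.name, matched, entity.properties)
```

The same string gives different answers depending on the concept you ask for: searching Documents for "Lucas" finds what he authored, while searching Projects finds where he played a role, and each hit carries its properties and relations along with it.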
-Sanjiva
Prague, Jan ’08