Document Management Systems
V. Balasubramanian, Ph.D.
E-Papyrus, Inc.
November 3, 1999
Overview
Introduction
Definitions
Benefits
Types of Documents: Industries
Applications
DM Components
DM Functionality
DM Functionality for the Web
Merrill Lynch Case Study
Conclusions
References
Research and Job Opportunities
Introduction
Eras of Systems:
1960s and 1970s: Computational Systems (CS)
1980s and 1990s: Database Management Systems (DBMS), Image Management Systems (IMS)
Late 1990s: Document Management Systems (DMS), Knowledge Management Systems (KMS)
First Decade of 21st Century: Multimedia Management Systems (MMS)
Estimated that 90% of an organization’s information is in
documents rather than structured databases (Sprague, 1995).
True today more than ever.
Introduction
Limitations of RDBMS for document management
Based on E-R data models
Suitable for structured data
Traditional business applications, decision support systems,
reporting tools
No inherent support to manage electronic documents
Introduction
• Documents are the results of most business processes. They can be made of multiple media.
• Once you have them, you need to manage them.
• Only when you have documents can you have relationships among them (hypertext).
• If you have a process for creating, reviewing, and approving documents, you need workflow.
• When you have documents, you need ways to retrieve them.
[Figure: Convergence of enabling technologies. Hypertext, Workflow, Information Retrieval, and Multimedia converge on Document Management.]
Definitions
A document is an artifact resulting from the transformation of a
set of ideas by people following a set of processes.
An electronic document has the following characteristics
(Sprague, 1995):
holds information of multiple media: text, graphics, audio, video
contains multiple structures: headers, footers, TOC, sections,
paragraphs, tables
is dynamic: can be updated on the fly
may depend on other documents
Definitions
[Figure: People turn Input into Output through a Process. Technology enables change in the process.]
Process: Decision making process, design process, etc.
Input: Thoughts, ideas, issues, concerns
Output: Documents (memos, news, design documents, white
papers, marketing literature, contracts, manuals)
People: Executives, Designers, Lawyers, Scientists
Definitions
Document Management (DM) (Sprague, 1995): creation, storage,
organization, transmission, retrieval, manipulation, update,
archival and retirement of documents based on organizational
needs.
Benefits
Sprague (1995) states that document management systems
(DMS) enable:
Generation of revenue producing products
For publishing industry, documents are a direct source of revenue
Organizational Communication
Concepts, ideas, decisions are shared in the form of electronic
documents to increase efficiency and effectiveness
Business Process Re-engineering
Current business processes designed around paper documents;
electronic documents help to reduce cycle time
Organizational Memory
Both hard data and soft/tacit knowledge stored as documents providing
access to history, design/decision rationale, expertise, best practices, etc.
Benefits
Reduce time to create, review, approve and publish mission critical
documents
Increase accessibility to information; retrieval using business
characteristics and full-text searches
Ensure currency
Provide access and version control
Enable enterprise-wide collaboration; reduce email
Facilitate workflows (sequential and parallel)
Maintain audit trail
Increase re-use of components (produce multiple documents from
same components)
Publish electronic & paper documents simultaneously
Types of Industries & Documents
Industry Segment               Document Type
Automobile, Construction       Engineering drawings
Pharmaceutical                 New drug applications to FDA
Insurance                      Claims
Financial                      Product brochures, swaps and derivatives
Consulting                     Contracts and agreements
Architecture, Engineering      Blueprints and photographs
Consumer Products, Financial   Marketing literature
Lawyers                        Legal briefs
Airlines*                      Manuals and handbooks
All                            Memos/White Papers
* It is said that Boeing ships three plane loads full of manuals for every plane.
Applications
Financial
Product catalogs (marketing information): Org Comm
Back-office: confirmation of trades, customized letters and promotions:
Revenue Generation
Policies: Org Comm
Pharmaceutical
New drug applications submitted to FDA (approximately 600 volumes
of 200 pages each): Business Process Re-engineering
Product labeling information:
Standard operating procedures, laboratory manuals: Org Comm
Organizational knowledge on drug development: Org Memory
Regulatory guidelines: Org Memory
Competitive intelligence
DM Components
Document Management = Attribute Management + Content Management
Attribute Management: Authors, Title, Description, Creation Date, Version Number, Modified Date, ...
Content Management: Text, Graphics, Index Terms
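The split between attributes and content can be sketched as a small data structure. This is only an illustration in Python: the class names and the sample values are assumptions, and the field names simply mirror the attribute list on this slide.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Attributes:
    """Descriptive meta-data, managed separately from the content."""
    authors: List[str]
    title: str
    description: str
    creation_date: date
    modified_date: date
    version_number: int = 1


@dataclass
class Content:
    """The body of the document plus the terms used to index it."""
    text: str
    graphics: List[str] = field(default_factory=list)  # e.g. image file names
    index_terms: List[str] = field(default_factory=list)


@dataclass
class Document:
    """A document is its attributes plus its content."""
    attributes: Attributes
    content: Content


# Hypothetical sample document, for illustration only.
doc = Document(
    Attributes(["A. Author"], "Fund overview", "Marketing summary",
               date(1999, 11, 3), date(1999, 11, 3)),
    Content("Text of the overview.", index_terms=["fund", "overview"]),
)
```

Keeping the two parts in separate objects reflects the idea that attributes and content can be stored and searched independently.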
DM Components
[Figure: Document management functions surround the core components (Attributes and Content): Create/Capture, Store/Organize, Control/Access/Version, Transmit/Route, Review/Annotate, Retrieve/Synthesize, Assemble/Publish/Print, Retain/Archive. The functions support four applications: Organizational Communication, BPR, Org Memory, and Revenue Generation.]
DM Functionality
Capture/Create
Scanning paper, importing electronic documents
Capture meta-data or attributes: author, date, title, keywords,
document type, purpose, business characteristics
Check-in/Check-Out
Locking mechanism to prevent overwriting
Store/Organize
Compound documents made of components of multiple media
types
Structured as hierarchies: cabinets/folders
Distributed storage of content and meta-data
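The check-in/check-out locking described above can be sketched as follows. This is a minimal in-memory illustration, not the design of any particular product: the `Repository` class, its method names, and the `CheckedOutError` exception are all assumptions made for the example.

```python
class CheckedOutError(Exception):
    """Raised when a document is already locked by another user."""


class Repository:
    def __init__(self):
        self._versions = {}   # document name -> list of saved contents
        self._locks = {}      # document name -> user holding the lock

    def add(self, name, content):
        self._versions[name] = [content]

    def check_out(self, name, user):
        """Lock the document for one user and return its latest content."""
        holder = self._locks.get(name)
        if holder is not None and holder != user:
            raise CheckedOutError(f"{name} is checked out by {holder}")
        self._locks[name] = user
        return self._versions[name][-1]

    def check_in(self, name, user, content):
        """Save a new version, release the lock, return the new version number."""
        if self._locks.get(name) != user:
            raise PermissionError(f"{user} has not checked out {name}")
        self._versions[name].append(content)  # older versions are kept
        del self._locks[name]
        return len(self._versions[name])

    def version(self, name, number):
        """Return the content of a given version (numbered from 1)."""
        return self._versions[name][number - 1]
```

While one user holds the lock, a second `check_out` on the same document raises `CheckedOutError`, so nobody can overwrite work in progress. Each check-in appends rather than replaces, so earlier versions stay retrievable.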
DM Functionality
Access/Version Control
Provide access to members with various roles and privileges:
author (Read/Write/Delete), reviewer (Read/Annotate), approver
(Read, Change Status)
Provide version management so that older versions can be accessed
for historical or legal reasons
Retrieve/Synthesize
Powerful retrieval mechanisms based on attributes, concepts, full-text
Stored queries that can be executed periodically
Automatic change notifications
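Retrieval by attributes and by full text, together with a stored query, can be sketched as below. The sample documents, the attribute names, and the function names are hypothetical. The index is a plain inverted index from words to document ids, which is one simple way to support full-text search and is not claimed to be what any DMS product uses.

```python
import re
from collections import defaultdict

# Hypothetical documents: attributes plus a text body.
DOCS = {
    1: {"type": "brochure", "author": "lee", "text": "Pacific fund performance summary"},
    2: {"type": "policy", "author": "kim", "text": "Trade confirmation policy"},
    3: {"type": "brochure", "author": "kim", "text": "Education fund benefits and performance"},
}


def build_index(docs):
    """Map each lower-cased word to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for word in re.findall(r"\w+", doc["text"].lower()):
            index[word].add(doc_id)
    return index


def search(docs, index, words=(), **attributes):
    """Return ids of documents containing every word and matching every attribute."""
    hits = set(docs)
    for word in words:
        hits &= index.get(word.lower(), set())
    return sorted(d for d in hits
                  if all(docs[d].get(k) == v for k, v in attributes.items()))


INDEX = build_index(DOCS)

# A stored query is just a saved set of search arguments that can be re-run later.
stored_query = {"words": ["performance"], "type": "brochure"}
results = search(DOCS, INDEX, **stored_query)
```

Re-running `stored_query` on a schedule and comparing the result with the previous run is one possible basis for change notifications.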
DM Functionality
Transmit/Route
Create workflows among stakeholders and monitor status
Encrypt/decrypt sensitive information
Review/Annotate
Enable reviewers to read and annotate documents; merge
annotations
Assemble/Publish/Print
Assemble views by combining components based on audience
WYSIWYG displays on screen in native format or printing
Retain/Archive
Set up rules to retain published and original content (and versions)
or to send it to long-term storage (optical disks)
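The Transmit/Route function can be illustrated with a small workflow sketch that combines sequential and parallel review. The `Workflow` class and its methods are assumptions made for this example. Stages run one after another, and everyone named in a stage must sign off, in any order, before the next stage starts.

```python
class Workflow:
    """Stages run in sequence; reviewers within a stage act in parallel."""

    def __init__(self, stages):
        # Each stage is the set of people who must all sign off.
        self._pending = [set(stage) for stage in stages]

    def status(self):
        """Report who the document is currently waiting on."""
        if not self._pending:
            return "approved"
        return "waiting on " + ", ".join(sorted(self._pending[0]))

    def sign_off(self, person):
        if not self._pending or person not in self._pending[0]:
            raise ValueError(f"{person} is not a reviewer in the current stage")
        self._pending[0].discard(person)
        if not self._pending[0]:
            self._pending.pop(0)   # stage complete; move on to the next one


# Hypothetical routing: editor first, then attorney and product manager in parallel.
wf = Workflow([["editor"], ["attorney", "product manager"]])
wf.sign_off("editor")
```

Calling `wf.status()` at any point gives a simple way to monitor where a document sits in its review cycle.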
DM Functionality for the Web
Immature Web infrastructure for industrial-strength,
document-intensive applications
Need to extend Web infrastructure using document
management functionality (Rein, et al., 1997)
IETF Working Group (WEBDAV) defining standards to extend
HTTP for:
name space management
overwrite protection
version management
meta-data management
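To make the WEBDAV extensions more concrete, the sketch below maps the areas listed above to the HTTP methods defined in the base WebDAV specification (RFC 2518) and builds the XML body of a PROPFIND request for reading meta-data. The mapping is a rough reading rather than an official table, and the helper `propfind_body` is a hypothetical name. Version management is left out of the mapping because RFC 2518 defers versioning to later work. No network request is made.

```python
import xml.etree.ElementTree as ET

DAV = "DAV:"  # XML namespace used by WebDAV

# Rough mapping from the areas listed above to RFC 2518 methods (an assumption
# of this sketch, not an official table).
METHODS = {
    "name space management": ["MKCOL", "COPY", "MOVE"],
    "overwrite protection": ["LOCK", "UNLOCK"],
    "meta-data management": ["PROPFIND", "PROPPATCH"],
}


def propfind_body(*property_names):
    """Build the XML body of a PROPFIND request for the named DAV: properties."""
    ET.register_namespace("D", DAV)
    root = ET.Element(f"{{{DAV}}}propfind")
    prop = ET.SubElement(root, f"{{{DAV}}}prop")
    for name in property_names:
        ET.SubElement(prop, f"{{{DAV}}}{name}")
    return ET.tostring(root, encoding="unicode")


# Ask for two standard properties: last-modified time and display name.
body = propfind_body("getlastmodified", "displayname")
```

A client would send `body` with the PROPFIND method to a resource URL. The server's reply lists the requested properties, which is how attribute-style meta-data becomes reachable over HTTP.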
DM Functionality for the Web
Complementary Technologies
Document Management                                   Web Technologies
Manage large amounts of material                      Deliver multiple media
Provide consistent and predictable structure          Provide user interface and navigation
                                                      Enable hyper-linking
Ensure currency
Facilitate non-technical authors with templates       Facilitate non-technical authors with WYSIWYG tools
Support roles, responsibilities and access control
Enable workflow
Publish multiple views
Enable version control
Provide document locking
Enable recording of attributes                        Enable attribute searching using meta-tags
Stable, well-defined functionality                    Continuously evolving
Merrill Lynch Case Study*
Objectives
Manage and deliver large amounts of product and services
marketing material in multiple media via the Intranet/Internet.
Provide a consistent structure and user interface.
Enable linking of related material.
Ensure information is up-to-date.
Facilitate non-technical authors in creating content.
Support well-defined roles, responsibilities, and access control for
various stakeholders in various departments.
Enable workflow between authors, product managers, content
administrators, editors, attorneys, and system administrators.
*Hypertext ‘97 Proceedings and Communications of the ACM, July 1998
ML Case Study (Objectives)
Objectives (Continued...)
Enable assembling and publishing of different views of marketing
information for different audiences: financial consultants, clients,
and the public.
Provide version control to support regulatory requirements.
Provide a locking or concurrency control mechanism to prevent
two or more people from simultaneously updating the same
content.
Enable searching and retrieval of content using predefined business
characteristics of products and services.
ML Case Study (User Interface)
[Figure: Page layout. Top frame: global navigation bar and locator bar. Left frame: topics linked to components within the body, and buttons. Body frame: title, Description component, related documents, and Client Suitability component.]
ML Case Study (Publishing)
[Figure: Publishing. Components of a product (Description, Client Suitability, Client Benefits, FC Benefits, Performance) carry component attributes naming their audience (FC, Client, Public) and product attributes (Client Segment, Financial Goal). Assembly combines them into an FC View, a Client View, and a Public View. Example product: ML Pacific Fund.]
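The assembly of audience-specific views can be sketched as a filter over tagged components. The component names follow the case study, but the audience assignments below are invented for illustration and are not the ones used at Merrill Lynch. The function name `assemble_view` is also an assumption.

```python
# Hypothetical components for one product. "audiences" lists who may see each one:
# financial consultants ("fc"), clients ("client"), or the public ("public").
COMPONENTS = [
    {"name": "Description", "audiences": {"fc", "client", "public"}},
    {"name": "Client Suitability", "audiences": {"fc", "client"}},
    {"name": "Client Benefits", "audiences": {"fc", "client"}},
    {"name": "FC Benefits", "audiences": {"fc"}},
    {"name": "Performance", "audiences": {"fc", "client"}},
]


def assemble_view(components, audience):
    """Return, in stored order, the names of the components visible to an audience."""
    return [c["name"] for c in components if audience in c["audiences"]]


public_view = assemble_view(COMPONENTS, "public")
client_view = assemble_view(COMPONENTS, "client")
fc_view = assemble_view(COMPONENTS, "fc")
```

Because every view is derived from the same stored components, a correction to one component reaches all three views the next time they are published.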
ML Case Study (Architecture)
[Figure: Integrated hypermedia and document management functionality. Authors create and update components, editors edit them, and legal staff comment on them, with notifications between the groups. The technical team creates and administers graphics, interface components, and templates. The Document Management System, supported by a link checker and a full-text indexer, publishes to a staging Web for preview and review and to the production Web for release.]
New Drug Applications (NDA)
Pharmaceutical companies spend an average of $350 million over 10
years to manufacture drugs and conduct clinical trials.
No assurance product will make it to the market; even if it does, only 7
years to recover costs and make profits.
Information about drug, safety, efficacy, risk-benefit ratio, adverse
events, etc., reported in NDA to FDA.
About 600 volumes of 200 pages each. Heavily paper-oriented.
On average, the FDA takes a year to 18 months to review an NDA.
Reducing cycle time by producing documents in electronic form that
can be reviewed both internally and externally saves about $1 million a
day.
Big push by FDA to go completely electronic in the next few years.
More importance to electronic document management and publishing.
Issues
DMS not good at relationship management; cannot easily
manage links between documents.
Template management not easy.
True joint authoring and merging of components or documents is
not possible.
Different vendors specialize in different parts of the market
making system integration a challenging task.
Web-based document management systems are emerging only
now.
Conclusions
DMS will:
become the primary living repositories for organizational
information/intellectual assets
enable linking of related information (hypertext)
provide workflow facilities for various stakeholders
increase accessibility to information through meta-data and full-
text retrieval and agents
enable handling of multimedia
Conclusions
Evolution of information management systems
[Figure: Filesystems, then DBMS (Hierarchical/Networked, Relational, Object-Oriented), then DMS, then Knowledge Management Systems (Org Learning Systems, Org Memory Systems).]
References
Balasubramanian, V., and Bashian, A. (1998). Document Management and Web
Technologies: Alice Marries the Mad Hatter, Communications of the ACM, July
1998.
Balasubramanian, V., Bashian, A., and Porcher, D. (1997). A Large-Scale
Hypermedia Application using Document Management and Web Technologies,
Proceedings of Hypertext ‘97, ACM Press.
Rein, G. L., McCue, D. L., and Slein, J. A. (1997). A Case for Document
Management Functions on the Web, Communications of the ACM, September
1997.
Documentum: http://www.documentum.com
Opentext/Livelink: http://www.opentext.com
Saros/Mezzanine: http://www.saros.com
PC DOCS: http://www.pcdocs.com
Research and Job Opportunities
Reviewing and implementing WEBDAV recommendations to
extend Web infrastructure
Template management: propagation of changes to documents
instantiated out of templates
Indexing and retrieval based on concepts, synonyms
Increasing number of jobs in pharmaceutical and financial
sectors
Managing Web content using DMS
UI, Server-side programming, Web-DMS gateways
Link management