Enterprise content management (ECM) is an approach to organizing the unstructured data owned by a company. This process covers everything that happens to content from creation until deletion.
ECM increasingly relies on cloud storage and automated processing.
What is Content?
Most organizations store large amounts of unstructured data. Much of this is in files such as:
- Word documents
- Excel files
- Audio files
- Video files
- Call recordings
- Social media records
For people within the organization, finding this data isn't always easy. There are questions about how and where to store it, how to make it available to other areas of the business, and how to decide when it is no longer required.
ECM is an approach to content management that creates a clear workflow for all stages of the content lifecycle.
What is the Content Lifecycle?
The content lifecycle begins when an organization acquires content, and it ends when that content is no longer needed. Each stage of the lifecycle requires clear policies and the right technology infrastructure to support it.
First, the organization obtains the content. This can happen in several ways, such as:
- A user creates and saves an Excel spreadsheet
- The mail department scans and saves an incoming letter
- A supplier sends an email with a PDF invoice
- The training department prepares an educational presentation
- A customer service agent speaks to a customer, and the system records the call
- The social media team archives a selection of posts as a text file for future analysis
All of these scenarios have the same result: the organization has a new digital file that it must distribute, store, and eventually delete.
Content is unstructured data, which means that it is not stored in a structured format, such as a relational database table. It is also not semi-structured, like a CSV or JSON file.
To keep track of this content, the ECM process needs an indexing system. This system can take many forms, including:
- Filing: The document moves to the relevant folder within an organized storage repository. This approach is similar to the pre-digital strategy of keeping paper documents in an indexed filing cabinet.
- Standardized file names: The document name includes some important indexing information, such as the file's creation date. This method is useful for sorting large amounts of content together.
- Tagging: Content has tags that indicate the nature of the file. This approach helps when a piece of content is relevant to multiple departments. For example, a customer's refund request could have finance, billing, and customer relations tags.
- Metadata: Metadata contains a description of the content. Metadata can be structured data, so it is possible to incorporate this into a relational database, making it available to queries.
- Conversion: If the content is not compliant with the ECM process, it might benefit from conversion into another format. For example, some methodologies rely heavily on text search to retrieve content. If a text document is scanned and saved as an image, the text won't appear in search. As a workaround, optical character recognition (OCR) can read these documents and convert them to text. The converted text can act as metadata, or function as a discrete piece of searchable content.
Organizations may use a combination of these methods if required. For large volumes of data, companies will generally lean towards indexing techniques that can be easily automated.
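As a rough illustration of how standardized file names, tags, and metadata might work together, here is a minimal Python sketch. It assumes a small in-memory SQLite database as the metadata store. The table layout, the naming pattern, and the helper names (`standardized_name`, `register`, `find_by_tag`) are hypothetical choices for this example, not part of any particular ECM product.

```python
import sqlite3
from datetime import date


def standardized_name(created: date, department: str, title: str, ext: str) -> str:
    """Build a sortable file name that leads with the creation date."""
    slug = "-".join(title.lower().split())
    return f"{created.isoformat()}_{department.lower()}_{slug}.{ext}"


def build_index() -> sqlite3.Connection:
    """Create an in-memory metadata index with a separate tag table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE content (id INTEGER PRIMARY KEY, file_name TEXT, created TEXT)"
    )
    conn.execute("CREATE TABLE tags (content_id INTEGER, tag TEXT)")
    return conn


def register(conn: sqlite3.Connection, file_name: str, created: date, tags: list) -> int:
    """Record one piece of content together with its tags."""
    cur = conn.execute(
        "INSERT INTO content (file_name, created) VALUES (?, ?)",
        (file_name, created.isoformat()),
    )
    conn.executemany(
        "INSERT INTO tags (content_id, tag) VALUES (?, ?)",
        [(cur.lastrowid, tag) for tag in tags],
    )
    return cur.lastrowid


def find_by_tag(conn: sqlite3.Connection, tag: str) -> list:
    """Return file names carrying the given tag, oldest first."""
    rows = conn.execute(
        "SELECT c.file_name FROM content c JOIN tags t ON t.content_id = c.id "
        "WHERE t.tag = ? ORDER BY c.created",
        (tag,),
    )
    return [row[0] for row in rows]


# Example: index a customer's refund request under three departmental tags.
conn = build_index()
name = standardized_name(date(2024, 3, 5), "Finance", "Refund Request", "pdf")
register(conn, name, date(2024, 3, 5), ["finance", "billing", "customer relations"])
print(name)
print(find_by_tag(conn, "billing"))
```

Because the tags live in their own table, one file can belong to several departments without being copied, and the same index can answer ordinary SQL queries.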
New content often triggers a linked business process. For example, an incoming invoice must trigger a payment process. Therefore, a person or an automated process must route the invoice to the payment team.
The ECM process must outline business rules that clarify:
- Which content goes to which department?
- How to flag content that requires immediate action?
- How to handle conflicts, such as when content has multiple owners?
- How to deal with duplicates and inconsistencies?
- Who is responsible for deciding the routing procedures?
- What types of documents can go directly to the archive?
Often, this step requires the business to step back and look at its processes as a whole. When does content trigger a new process step?
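Business rules like those listed above could be captured in a simple routing function. The following Python sketch is only one possible shape. The document types, department names, and the `manual_review` fallback are hypothetical and would come from the organization's own policy.

```python
from dataclasses import dataclass


@dataclass
class Content:
    file_name: str
    doc_type: str          # e.g. "invoice", "refund_request", "call_recording"
    urgent: bool = False   # set when the content requires immediate action


# Hypothetical routing table: document type -> owning department.
ROUTES = {
    "invoice": "payments",
    "refund_request": "customer_relations",
}

# Hypothetical document types that skip routing and go straight to the archive.
ARCHIVE_DIRECTLY = {"call_recording", "social_media_export"}


def route(item: Content) -> dict:
    """Decide where a new piece of content goes and whether to flag it."""
    if item.doc_type in ARCHIVE_DIRECTLY:
        destination = "archive"
    else:
        # Unknown types fall back to a queue where a person decides.
        destination = ROUTES.get(item.doc_type, "manual_review")
    return {"file": item.file_name, "destination": destination, "flag": item.urgent}


print(route(Content("inv-001.pdf", "invoice", urgent=True)))
print(route(Content("call-778.wav", "call_recording")))
```

Keeping the rules in plain data structures makes it easier for the people responsible for routing procedures to review and change them without touching the logic.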
4) Archival and Retrieval
When content is no longer in use by a process, the organization moves it to a storage repository.
This repository may be local storage, such as an on-premises hard drive, or a cloud storage solution like Box. For long-term storage, the company may store content in a data lake, which is a repository for large quantities of unstructured data, including video and audio. Data lake storage is relatively cost-effective, and content stored in a data lake can be used for analytics purposes.
In all cases, content must be available for retrieval whenever required. Retrieval scenarios can include:
- A business user needs to access a specific file
- An analyst needs all files from a specific date range
- An automated process requires access to the file
- A customer logs in to view one of their own files
- The file is available for viewing or downloading from a company website
ECM policy should set realistic expectations for the level of file availability required.
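To make one of these scenarios concrete, the sketch below shows how an analyst's date-range request might be served when file names follow a standardized pattern. It assumes every indexed name starts with an ISO creation date (`YYYY-MM-DD`). That convention and the function name `files_in_range` are assumptions made for this example.

```python
from datetime import date


def files_in_range(file_names, start: date, end: date) -> list:
    """Return names whose leading YYYY-MM-DD prefix falls within [start, end]."""
    matches = []
    for name in file_names:
        try:
            created = date.fromisoformat(name[:10])
        except ValueError:
            continue  # Skip names that do not follow the naming convention.
        if start <= created <= end:
            matches.append(name)
    return sorted(matches)


archive = [
    "2024-01-15_invoice.pdf",
    "2024-03-02_letter.pdf",
    "notes.txt",
    "2023-12-31_call.wav",
]
print(files_in_range(archive, date(2024, 1, 1), date(2024, 1, 31)))
```

Files that do not follow the convention are silently skipped here. A real system would more likely log them or fall back to stored metadata.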
Content reaches the end of its lifecycle when:
- The business no longer requires the content
- Customers or third parties have no reasonable expectation of being able to access the content
- There is no legal requirement to continue storing the content
- There is a valid request to delete the content, for instance under GDPR
ECM policy will typically clarify the ways to assess these criteria. For example, some financial data must be retained for a minimum period.
Companies can delete content manually or set up an automatic process where content is flagged for deletion when it has reached a specified age. The deletion process will remove all copies of the content from the company's repositories. It will also expunge any relevant metadata.
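An automatic flagging step of this kind might look like the following Python sketch. The retention periods, the category names, and the order in which the checks are applied are all assumptions made for illustration. Real values and precedence rules depend on the organization's policy and its legal obligations.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical minimum retention periods in days, keyed by content category.
RETENTION_DAYS = {"financial": 7 * 365, "general": 2 * 365}


@dataclass
class Record:
    file_name: str
    category: str
    created: date
    legal_hold: bool = False         # a legal requirement to keep the content
    erasure_requested: bool = False  # a valid deletion request was received


def due_for_deletion(record: Record, today: date) -> bool:
    """Return True when the record should be flagged for deletion."""
    if record.legal_hold:
        return False  # Assumed precedence: a legal hold blocks deletion.
    if record.erasure_requested:
        return True
    age_days = (today - record.created).days
    # Unknown categories use the "general" period in this sketch.
    limit = RETENTION_DAYS.get(record.category, RETENTION_DAYS["general"])
    return age_days > limit


today = date(2024, 1, 1)
records = [
    Record("old-memo.pdf", "general", date(2020, 1, 1)),
    Record("ledger.xlsx", "financial", date(2020, 1, 1)),
    Record("contract.pdf", "general", date(2010, 1, 1), legal_hold=True),
]
flagged = [r.file_name for r in records if due_for_deletion(r, today)]
print(flagged)
```

The function only flags content. Removing every copy and the associated metadata would be a separate step, ideally one that is logged for audit purposes.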
What are the Main Enterprise Content Management Services?
Several commercial vendors currently offer ECM services. Some of the market leaders include:
- IBM ECM
- Xerox DocuShare
- Oracle Enterprise Content Management
- Veeva Vault
All of these systems offer cloud-based enterprise content management services.