The ability to migrate content from paper, from legacy file stores, from Access databases etc. forms an essential part of most document management and other SharePoint projects. Protecting your content once it is there, and recovering it when needed, is equally important.
The problem of content ingest (moving content into a system) breaks down into a number of parts:
- Identifying what needs to be moved into SharePoint (and what doesn’t)
- Cleaning and classifying the documents and other content
- Building libraries etc. in SharePoint with an appropriate Information Architecture
- Copying the documents and content into SharePoint (and removing the originals)
- Enhancing the copied content with further metadata and other improvements
The Cloud2 Content Ingest Pack is designed to address this breadth of needs.
Moving documents from legacy systems
The two most common document stores in use today are File Servers and Outlook/Exchange Server. Both are badly designed for modern content management and are heavily abused. Neither supports the metadata that is vital to good document management.
AvePoint’s DocAve tools, combined with Concept Search for classification and de-duplication and Cloud2 consulting, enable large sets of content to be moved into SharePoint or, in some cases, to remain where they are while being accessed and managed via SharePoint.
Moving paper
For very large paper document migrations, Cloud2 work with a number of leading digitisation bureaus and agencies who can scan, OCR and tag substantial volumes of documents.
For ad hoc and onsite scanning needs, Cloud2 work closely with Kodak, using their high performance scanners and SharePoint enabled software.
Kodak have expanded their scanning solutions with two new software products for Microsoft SharePoint. KODAK Document Viewer Software and KODAK Scan and View Software allow scanning and document viewing to be embedded directly in a SharePoint page. When a document is selected in a SharePoint library, the viewing software shows an instant preview of it in an adjacent web part, in the same way that users of Outlook can preview documents without opening the full document. The Scan software allows simple capture of documents from an attached scanner straight into SharePoint from a web part.
Kodak have also improved their Capture Plus Pro integration with SharePoint, supporting a variety of scanning scenarios including very high volume scanning.
Content classification
A major issue with ‘old’ content is that it rarely has any meaningful metadata. Even core items such as title and author are incomplete or incorrect. While a certain amount of insight can be obtained from the filename and the directory structure (folder name), there is a large gap between what exists in legacy content and what is needed in a modern Information Architecture driven intranet.
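As a rough illustration of the limited insight a filename and folder structure can give, the sketch below derives a few candidate metadata values from a legacy file path. The function name, the field names and the example path are all hypothetical, and this is not how any of the products mentioned here work; it only shows how little a path alone can tell you.

```python
from pathlib import PureWindowsPath


def metadata_from_path(path: str) -> dict:
    """Guess basic metadata from a legacy file path (hypothetical field names)."""
    p = PureWindowsPath(path)
    # Drop the drive or share root (if any) so only real folder names remain.
    parts = p.parts[1:] if p.anchor else p.parts
    return {
        # Use the filename without its extension as a provisional title.
        "title": p.stem.replace("_", " ").replace("-", " ").strip(),
        "file_type": p.suffix.lstrip(".").lower(),
        # Folder names, outermost first; these often hint at department or year.
        "folders": list(parts[:-1]),
    }


# Example path is invented for illustration.
info = metadata_from_path(r"S:\Finance\Invoices\2011\acme_invoice_0042.pdf")
```

Values such as author, document type or retention category cannot be recovered this way, which is the gap that content-based classification is meant to fill.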
While SharePoint provides some tools and tricks for managing classification, such as bulk entry via the datasheet view and setting a field with a default value during upload, these all require a knowledgeable user to apply the correct classifications and metadata. For large-scale ingest this rapidly becomes impractical.
The Cloud2 Ingest Pack uses technology from Concept Searching to automatically develop taxonomies and to classify large numbers of documents against these based on their content.
Correctly classifying content with metadata ensures efficient search, views, workflow, information governance and more. It also allows effective de-duplication and content clustering.
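To make the idea of de-duplication concrete, here is a minimal sketch that groups byte-for-byte identical documents by a SHA-256 hash of their content. This is a deliberately simple stand-in, not the technique used by Concept Searching or any other product named here: it only catches exact copies, and the function name and sample documents are hypothetical.

```python
import hashlib
from collections import defaultdict


def find_exact_duplicates(documents: dict) -> list:
    """Return groups of document names whose content is byte-for-byte identical.

    `documents` maps a document name to its raw bytes (hypothetical input shape).
    """
    groups = defaultdict(list)
    for name, content in documents.items():
        # Identical content always produces the same digest.
        digest = hashlib.sha256(content).hexdigest()
        groups[digest].append(name)
    # Keep only digests shared by more than one document.
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Sample data invented for illustration.
sample = {
    "a.doc": b"policy v1",
    "copy of a.doc": b"policy v1",
    "b.doc": b"policy v2",
}
duplicates = find_exact_duplicates(sample)
```

Near-duplicates, such as the same document saved in two formats or with minor edits, would not be matched by a hash and need content-aware comparison.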
Backup and recovery
Once a SharePoint platform is in use it is vital that a robust strategy is in place to ensure content can be recovered in the event that a problem arises. This may be due to a hardware or software failure or, commonly, user error. Regardless of the cause, it is essential that not only can the system and its content be brought back safely but also that it can be done quickly and with minimal loss of data in the period between the failure and the last backup.
Although most current server backup technologies can support backup and recovery of the underlying SQL Server, they are largely inadequate for use with a sophisticated SharePoint platform. Recovering from such a backup would require restoring the entire farm, perhaps to a virtual platform from where the missing content can be manually copied over. This can easily take a couple of days. Furthermore, they tend not to differentiate between content that is rarely updated, such as published policies, and information that may be updated every few minutes. For the latter, a backup solution that only runs daily can lead to significant data loss.
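The scale of that potential loss can be estimated with simple arithmetic. The sketch below is a back-of-envelope calculation only; the function name is hypothetical, and the five-minute update interval is an assumed stand-in for "every few minutes", not a figure from any product.

```python
def worst_case_lost_updates(backup_interval_minutes: float,
                            update_interval_minutes: float) -> int:
    """Upper bound on updates made since the last backup if a failure
    happens just before the next backup is due."""
    return int(backup_interval_minutes // update_interval_minutes)


# Assumed figures: one update every 5 minutes.
daily = worst_case_lost_updates(24 * 60, 5)   # backup once a day
hourly = worst_case_lost_updates(60, 5)       # backup once an hour
```

Under these assumptions a daily backup could lose up to 288 updates to a single item, against 12 with an hourly backup, which is why frequently changing content benefits from a more frequent schedule than rarely changing content.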
We advise all our clients to invest in a dedicated SharePoint backup solution. We recommend the AvePoint DocAve Backup and Restore product and can provide further information, a demonstration and pricing as required.


