Archive Deployment Scenarios & Checklist
- Litigation & Discovery
- Storage Optimization
- Regulatory Compliance & Audit Trail
- Selective Message Retention in a Multi-Site Environment
- Fault-Tolerant, Cost-Effective Retention in a Multi-Site Environment
- Create Policies
- Plan Your Architecture & Storage Requirements
- Pre-Deployment Planning
- Post-Deployment Plan
Archive Deployment Scenarios
You might be experiencing challenges with storage or litigation relating to a very important source of unstructured data in organizations: email. To help you reflect on solutions, we have listed problems that real-life customers faced and how we helped solve them with a true legal archive (Don't have much time? Read the Top 10 Things You Should Know About Archiving instead).
Scenario 1 Litigation & Discovery
The MovieTime Popcorn Company has been running the same primary email system for 4 years. Because large post office sizes were straining space on the servers, it implemented an auto-archive policy whereby all emails older than 90 days are automatically archived to the users’ workstations. Recently, a litigation case required MovieTime to produce emails from 34 different individuals dating back 2 years (Learn more on why organizations need to possess executive email records here). Consequently, the IT staff had to recover backup tapes and access archives on 34 local machines, 2 of which could not be accessed because their local hard drives had failed and all locally archived information had been lost.
- Centrally managed and searchable archive repository
- 100% retention
- Discovery of user accounts under investigation
The MovieTime Popcorn Company needs easy access to information, so archiving email to XML format for easy discovery and access was deemed a mandatory requirement. An archiving policy was devised whereby all data from existing archives and the live system would be published to XML for discovery purposes. MovieTime wished to retain all information in a centralized location with controlled access, so they could start to enforce a retention and deletion policy to avoid further litigation issues.
Planning was conducted to determine the necessary disk space required to conduct the conversion and storage of information.
All Client Archives were copied to the server and published to XML using NetGovern running on multiple systems.
NetGovern was used to create an Archiving Policy (Learn more about creating policies here) that exported all data older than 30 days to the XML repository. The policies were configured to run every day.
Scenario 2 Storage Optimization
Fleetburg College provides messaging services for approximately 10,000 faculty and students. The Faculty has longer tenure, but student accounts are active for approximately 4 years. The College has email post offices in excess of 160GB and needs to implement some sort of data storage management (Is it cheaper to keep everything forever or to manage data? Learn it here). Up until this point, they have not implemented any type of archiving due to the costs of deploying excessive storage to accommodate it and the fact that the student accounts are accessed using a web-based email client. Currently, message discovery is not an issue as there are only 1-2 requests a year to review a limited number of accounts. The college wants to provide all users with direct access to all their archived data.
- Reduce post office storage sizes
- Provide client access to all archived data
- Lower the costs of message storage
First, an internal archiving policy had to be designed. It was decided that all data older than 120 days or 1 term would be automatically exported into XML format. Fleetburg adopted a storage architecture, whereby they would place the archive indexes on the current SAN and would store the physical archived data and attachments on inexpensive NAS devices which they were able to acquire for under 1/10th the cost of traditional SAN storage. Due to the reduced storage costs, Fleetburg estimated it was less expensive to acquire duplicate storage devices and mirror the data between two of its campus sites rather than increase their current tape backup capacity.
NetGovern was utilized to run archiving profiles for different groups of users. The live email data was archived to the network directory and automatically divided into multiple folders based on year.
All users were able to access the first 120 days' worth of email directly from the live message system.
All users were able to access the previous 4-48 months of information via the NetGovern web portal.
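The 120-day threshold and the per-year folder layout described in this scenario can be illustrated with a minimal sketch. The function names and the directory layout are assumptions made for illustration only; they do not represent how NetGovern itself names or organizes archive folders.

```python
from datetime import date, timedelta
from pathlib import PurePosixPath

ARCHIVE_AFTER_DAYS = 120  # threshold taken from the Fleetburg policy


def is_archivable(received: date, today: date,
                  threshold_days: int = ARCHIVE_AFTER_DAYS) -> bool:
    """Return True when a message is older than the archiving threshold."""
    return (today - received) > timedelta(days=threshold_days)


def archive_folder(root: str, user: str, received: date) -> PurePosixPath:
    """Build a per-user, per-year folder path (hypothetical layout)."""
    return PurePosixPath(root) / user / str(received.year)
```

Messages that pass `is_archivable` would be written under the folder returned by `archive_folder`, so each year of a user's mail lands in its own directory on the lower-cost storage.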
Scenario 3 Regulatory Compliance & Audit Trail
Eighth National Bank has recently been under intense pressure to comply with SEC and NASD regulations for financial institutions. There are approximately 450 employees across multiple sites but with a centralized email infrastructure. The company has been up to this point keeping everything in the live message system and using proxy rights to check users’ mailboxes but cannot guarantee full retention or compliance to SEC regulators for the 20 individuals that fall under this requirement.
- Provide SEC Compliance
- Create full retention policies for individuals under SEC mandate
- Provide the ability to search email messages, tag them where necessary and provide SEC with audit report of the messages reviewed and screened
NetGovern was used to create a multi-tiered policy where specific users were under 100% retention for the 3-year period specified by the SEC. The remaining users were provided with the ability to delete non-relevant emails. Data was stored on a NetApp FAS storage device with a Compliance module for WORM storage. Through the NetGovern integration, however, only the data for the 20 financial trading analysts was committed in compliant WORM format, while all other data was stored on the same device in erasable format. The SEC compliance officer was granted access to just those 20 accounts and was able to perform keyword searches, including reusable word lists, to search for suspicious messages. Once found, those messages could be reviewed and tagged with comments. An audit report at the end of each month showed the number of messages that had been reviewed over that period, with any comments, thus satisfying NASD 3010 regulations and the SEC.
NetGovern was also configured to automatically delete all messages from these accounts once their 3-year retention time had expired and the NetApp system automatically released them from their WORM state (Learn more about balancing retention and deletion policies here).
Dual archive policies were created; one general retention policy and one for financial analysts.
WORM Compliant storage was implemented with the NetApp Fabric-Attached Storage device.
Archiving and data destruction jobs were configured to provide full archive data management.
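The two-tier policy in this scenario amounts to a per-user storage decision plus a retention expiry check. The sketch below is one possible way to express that logic. All names are hypothetical, and it assumes the 3-year period is counted from the date the message was archived, which the scenario does not specify.

```python
from dataclasses import dataclass
from datetime import date

RETENTION_YEARS = 3  # retention period cited for the SEC-mandated accounts


@dataclass(frozen=True)
class StorageDecision:
    worm: bool             # commit in non-erasable (WORM) form
    user_may_delete: bool  # whether the user may delete non-relevant mail


def decide_storage(user: str, regulated_users: set[str]) -> StorageDecision:
    """Regulated users get full retention on WORM; everyone else is erasable."""
    if user in regulated_users:
        return StorageDecision(worm=True, user_may_delete=False)
    return StorageDecision(worm=False, user_may_delete=True)


def retention_expired(archived_on: date, today: date,
                      years: int = RETENTION_YEARS) -> bool:
    """Return True once the retention period has fully elapsed."""
    try:
        expiry = archived_on.replace(year=archived_on.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year.
        expiry = archived_on.replace(year=archived_on.year + years, day=28)
    return today >= expiry
```

A deletion job built on this idea would only consider messages for which `retention_expired` is true, leaving everything still inside the retention window untouched.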
Scenario 4 Selective Message Retention in a Multi-Site Environment
The City of Dwight Falls has about 1200 employees dispersed across three separate sites connected via a T1 link. The City has been under ever-increasing pressure to provide information based on state access to information requirements and has been spending approximately 28 hours to service each information request. In addition, the City also has a policy to furnish departing elected officials with a copy of their mailboxes when they leave office.
City legal counsel has determined that they can implement a 5-year retention policy (Learn more about the need for electronic records retention policies here) in accordance with state guidelines. Their policies are to keep all messages for elected officials but provide all other city staff with 1 year of online email and the ability to delete transient messages and, in accordance with privacy legislation, not archive certain personal items.
- Provide retention of messages for a period of 5 years for discovery
- Support multiple sites
- Create different archive policies by user groups
- Configure two separate publishing profiles, one for full retention and one for user discretion retention and set to run on a daily basis
- Provide portability of messages
NetGovern was used to create a centralized archival repository of all messages. T1 lines were sufficient to collect email from each of the remote sites and centralize the repository with the computer services group. NetGovern was used to create a Personal folder in each user mailbox. Then two archiving policies were created, one for elected officials which took a copy of all emails older than 5 days and copied them to the central repository and another policy applied to all other staff which archived all messages older than 30 days with the exception of those items within the Trash and Personal folder structures.
A 180-day auto-deletion policy was enabled for the live email data.
Users were provided with access to all data older than 365 days via the NetGovern web portal.
When officials left the City, their accounts were fully archived and the archives exported to single or multiple CD/DVD volumes along with a utility to view and search their information.
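The two archiving policies in this scenario differ only in their age threshold and in which folders they skip. A minimal sketch of that selection rule follows. The class and field names are hypothetical, and matching on the top-level folder name is an assumption about how "within the Trash and Personal folder structures" would be evaluated.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ArchivePolicy:
    min_age_days: int
    excluded_folders: tuple[str, ...] = ()

    def selects(self, folder_path: str, received: date, today: date) -> bool:
        """Return True when a message should be copied to the archive."""
        # Anything under an excluded top-level folder is never archived.
        top_level = folder_path.strip("/").split("/")[0]
        if top_level in self.excluded_folders:
            return False
        return (today - received).days > self.min_age_days


# Thresholds and exclusions as described in the scenario.
ELECTED_OFFICIALS = ArchivePolicy(min_age_days=5)
CITY_STAFF = ArchivePolicy(min_age_days=30,
                           excluded_folders=("Trash", "Personal"))
```

Keeping the exclusions in the policy object rather than in the job itself means the same daily job can run both profiles and only the policy data differs.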
Scenario 5 Fault-Tolerant, Cost-Effective Retention in a Multi-Site Environment
Finance Co. has been running a distributed email system with separate email servers in over 40 remote sites. They want to centralize the email services to reduce cost and implement a clustered, fault-tolerant environment. Unfortunately, there has never been a retention policy in the organization and individuals have been able to keep as much email as they want, as well as to maintain personal archives (Learn more about the challenges brought by personal archives here). The amount of email across the 40 separate systems would cause the size of the cluster to be larger than the customer desires. A solution to manage the information in order to consolidate systems is required.
- Create a retention policy
- Organize existing unmanaged data
- Support multiple sites
- Provide a fault-tolerant system
- Provide a cost-effective storage solution
An archiving policy was created and presented internally. It was decided that all data older than 90 days would be automatically exported into XML format. NetGovern was implemented and used to migrate the existing unmanaged data from the live message store into the XML repository. NetGovern was configured to automatically archive any new data older than 90 days to the XML repository. End users would have access to the archived data via NetGovern Search.
As a storage solution, Finance Co. opted for NetGovern Store because of its fault tolerance across multiple nodes, cost-effectiveness, and high scalability. A single fault-tolerant storage solution was preferred to multiple storage solutions.
Archive Deployment Checklist
If you have now established that an archive could solve your problem, you might be wondering where to start. The following pages list what you need to consider to take on this project. Drafting policies is the first thing that should be done. Once policies are created, you can start thinking about the architectural requirements of your environment. Then, you must plan the deployment of your archive. Finally, the changes to your system need to be communicated to your colleagues post-deployment.
1 Create Policies
Feel free to copy the content of the NetGovern Acceptable Email Usage Policy and adapt it to your organizational requirements. If you prefer to start from scratch, the key elements of your policy should include:
- Purpose of the policy
- Scope of the policy (who is affected by it)
- Explanation of what and how email is being monitored and manipulated
- Clear description of what is and what is not acceptable
- State what constitutes a breach
- Disciplinary procedure in cases of policy breach
It is useful to give each employee a pamphlet explaining what the email policy stipulates. The guide should not only clarify what is or is not deemed adequate, but it should also demonstrate the benefits of having a policy in place.
It is essential to receive employee support, agreement and acceptance of the policy. Employees should be educated about the policy to ensure that they understand it. State clearly the reasons for the action undertaken: to emphasize your point, perhaps cite recent court cases, productivity loss statistics and other relevant data. In addition, communicate the benefits to the employees and the business in the same way that you would sell the benefits of your product or service to your customers. It can also be beneficial to provide users with feedback on how the email policy is helping your business.
You need to remind employees (and inform new hires) of the email policy on a recurring basis. You can do this by sending the policy out via email every 6 months, by including it in your employee handbooks, by holding seminars on the most effective ways of using email, and by reporting back on the benefits of having the policy in place.
2 Plan Your Architecture & Storage Requirements
Now that you have captured the information about your users, your environment and your requirements, it is time to formulate your architecture. The first thing to plan is your archive repository, that is, where the archived data will reside. A number of factors will govern the location of your information. Among these, you will certainly want to consider the following:
Remote Locations: Can your current WAN Links sustain the transfer of data across the WAN on a nightly basis without significant impact to existing applications? You may be faced with the prospect of either increasing WAN link sizes or placing local repositories at each site. As a general rule, distributed archives are not recommended from a performance and management perspective, but they are possible.
Centralized or Distributed: Sometimes it may be necessary to place archive repositories on multiple servers across your organization due to available storage space or due to political divisions (e.g., Departmental Archives).
Security: It may be required to place more sensitive email accounts (e.g., CXO-level Archives) on a separate server or location to facilitate different security requirements.
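The WAN question above is largely arithmetic: how long would one night's archive data take to cross the link? The sketch below gives a rough estimate under stated assumptions. The function names are hypothetical, `usable_fraction` is an assumed share of the link left over for archive traffic, and protocol overhead is ignored.

```python
def transfer_hours(data_gb: float, link_mbps: float,
                   usable_fraction: float = 0.5) -> float:
    """Estimate the hours needed to move data_gb across a WAN link."""
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in the range (0, 1]")
    bits = data_gb * 8 * 1000**3  # decimal gigabytes to bits
    usable_bps = link_mbps * 1_000_000 * usable_fraction
    return bits / usable_bps / 3600


def fits_in_window(data_gb: float, link_mbps: float, window_hours: float,
                   usable_fraction: float = 0.5) -> bool:
    """Return True when the transfer completes within the nightly window."""
    return transfer_hours(data_gb, link_mbps, usable_fraction) <= window_hours
```

For reference, a T1 line carries 1.544 Mbit/s, so `transfer_hours(1, 1.544, 1.0)` estimates the time to move one gigabyte over a fully dedicated T1. If the nightly volume does not fit the window, that points toward a larger link or a local repository at the site.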
You should, by this stage, have a good understanding of how much storage space you will need immediately and how much you will need in 12-18 months or longer. You should also have an understanding of your access needs (whether you will require direct client access, web-based access, etc.) and what the frequency of access will be.
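A storage estimate for the 12-18 month horizon can be as simple as a linear projection. The sketch below assumes a constant monthly intake and a flat safety margin. Both are simplifying assumptions, and the function name and parameters are hypothetical.

```python
def projected_storage_gb(current_gb: float, monthly_growth_gb: float,
                         months: int, headroom: float = 0.2) -> float:
    """Project archive size after a number of months, plus a safety margin."""
    if months < 0 or headroom < 0:
        raise ValueError("months and headroom must be non-negative")
    return (current_gb + monthly_growth_gb * months) * (1 + headroom)
```

Running the projection for both 12 and 18 months gives a range to size against rather than a single figure. Real mail volume rarely grows linearly, so the inputs are worth revisiting after a pilot.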
Most organizations already employ sophisticated storage area networks composed of high-performance SCSI or Fibre Channel hard drives. Typically, these systems provide exceptional performance and capacity, but at a premium price. Storing archival data on primary storage networks does not typically make sense from a cost perspective. The following scenarios represent some storage options you may wish to consider:
Low-Cost Storage Solution
- Number of Archived Accounts: <1000
- Access to Archived Data by Users: Mild to Frequent
- Access Method: Web Only for Users
- Auditors Access to Data: Infrequent
In this model, end users would access archived data older than a certain number of days fairly infrequently, so the storage system would not have to support frequent simultaneous direct access requests, and immediate retrieval speeds are not of prime importance. In this instance, data can be written directly to a lower-cost Network Attached Storage (NAS) device for inexpensive storage of archived data. NAS devices also provide for inexpensive expansion and advanced features since they are usually based on Windows Storage Server and therefore can support Single Instance Storage and Compression.
Compliance & Security Solution
- Number of Archived Accounts: Unlimited
- Access to Archived Data by Users: Infrequent
- Access Method: Web Only for Users
- Auditors Access to Data: Moderate
- Compliance Requirement: Non-alterable, Write Once, Read Many Storage, Data Encryption & Wiping
Requirements for this business model assume that while email users will access archived data, the archives will be accessed infrequently to moderately and that access will only be conducted for these users through the web interface. It is also agreed that a small latency for retrieving items is acceptable. On the other hand, the organization wishes to store the data in a secured format that cannot be modified or altered and, in some circumstances, to ensure that data is wiped and unrecoverable after it reaches its retention threshold.
A scalable and secure architecture is recommended whereby the storage system provides the security and data integrity checking needed to ensure data protection. NetGovern allows this by integrating with Caringo SWARM, which was developed specifically for archival content. It provides a robust and easily expandable system that stores files as objects directly to disk without requiring a traditional file system interface. Without a file system, Caringo SWARM does not suffer from any file browsing security issues. Additionally, it provides a low management, low maintenance system where data is continuously protected through intelligent replication with continuous data healing, ensuring instant recovery from hardware failures and the complete elimination of backups.
High Availability Storage
- Number of Archived Accounts: Unlimited
- Access to Archived Data by Users: Moderate to Frequent
- Access Method: Web Only for Users
- Auditors Access to Data: Moderate to Frequent
- Data Integrity/Security: Yes
Replication is far more desirable than tape backups to preserve the integrity of your messages. Depending on your current infrastructure, you will need to determine if you currently have the capacity to include the additional archive data volume in your NAS devices to accommodate the implementation of a data replication solution. With this solution, duplicate NAS devices would be installed, providing integrated hardware level data mirroring with snapshot technology allowing the organization to deploy an offsite mirrored storage unit that creates backup snapshots. When using data replication, recovery from disaster or system failure is virtually immediate. Redundancy can also be used successfully to replicate data between two remote repositories to provide fault tolerance and localized global searching.
3 Pre-Deployment Planning
As with any IT initiative, you will want a formal plan in place with expected timelines and resourcing for various tasks. The plan allows you to schedule other tasks around the implementation so that impact to the environment and the users is managed and kept to a minimum.
In larger environments, it is recommended to implement a pilot project for a number of reasons:
- Validating data transfer rates and WAN loading
- Providing initial hands-on deployment training
- Providing proof of concept to all parties
- Determining scope and formal project rollout timeframes
Archiving Deployment Strategies
During deployment of an archiving solution, information must be captured from the live message system and placed in the archive. You must perform a full archive of the system before you can run incremental archives.
It is during this initial phase that performance may seem slow; however, one has to understand that in some systems that have never been cleaned up or have had no restrictions applied, there can be literally tens of thousands of messages per mailbox with an undisclosed number of attachments which all have to be copied off the live message server and placed into an archive repository.
It is here that one should concentrate on adopting the policies that were defined previously, especially with respect to Primary Archive retention thresholds. Do not be afraid to apply additional hardware to the task of exporting messages and mailboxes.
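The full-then-incremental sequence described above can be sketched as a simple planning step: the first run has nothing in the archive and therefore takes everything, while later runs take only what is not yet archived. The function name and the use of message IDs to track archived items are assumptions for illustration, not a description of how any particular product tracks state.

```python
from typing import Iterable


def plan_archive_run(mailbox_ids: Iterable[str],
                     archived_ids: set[str]) -> tuple[str, list[str]]:
    """Return the run mode and the message IDs still to be archived."""
    pending = [m for m in mailbox_ids if m not in archived_ids]
    # An empty archive means this is the initial, full pass.
    mode = "full" if not archived_ids else "incremental"
    return mode, pending
```

This also shows why the initial pass is the slow one: its pending list is the entire mailbox, whereas each later run only covers what arrived or aged past the threshold since the previous run.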
Does your Messaging System have a Clean Bill of Health?
Before deploying an archiving solution, it is important to ensure that your messaging system is in optimal condition. Your live messaging system is the source for all information that will be placed into the archive system. As with all messaging systems, especially large ones, errors are to be expected. One error that is frequently encountered is a missing record. If a user cannot access an email record due to an error, then no archiving software will be able to access that record either. It is therefore imperative that you seek out and fix any messaging errors before deploying your new archiving solution.
4 Post-Deployment Plan
After successfully deploying an archiving solution, it is important to advise your employees about the archiving solution. It is especially important that employees understand the archiving policy that has been adopted. Communicating the following information internally could facilitate adherence to the archiving policy:
- The fact that a new archiving policy is in effect
- The effective date of the archiving policy
- Why the policy was created
- How the policy affects business correspondence
- How the policy affects personal correspondence
- How compliance with the policy will be monitored
- Penalties for policy breach
The archiving policy should be presented to employees of the organization once it is implemented. The policy should be presented to new members of the organization at the time of hire. It is advisable to present the archiving policy in written form to avoid any misunderstanding of the policy.