PSTs - The many-faceted challenge of personal archives
It has been common knowledge for years that PSTs are a problem. Many methods have been offered to resolve the issues, yet a lot of organizations are still stuck with the problem.
The PST challenges
The challenges with PSTs are many. The first challenge that your IT team will encounter is storage. This is usually a self-made problem. The email team, trying to manage the size of the Exchange system for performance and cost reasons, imposes quotas on users. However, under pressure from the organization, the team lets end users create PSTs so they can keep the longer-term emails that are required for the normal operation of the business. Users are smart, though, and do not want to lose their data, so they will often copy PSTs on a regular basis to network drives. Not only does the organization end up with the information back on its servers, but now it has multiple copies of the data. This can grow to a tremendous size. We recently helped a hospital that had accumulated more than 35 TB of PSTs on its file servers over the years.
The second challenge is from a compliance and security perspective. By spreading information across PSTs, we increase the risk to the organization. We increase the risk of losing information, as emails and PSTs will contain sensitive information. If the file servers are widely accessible, or if the PSTs are located on removable drives that get taken home, the chances of sensitive information being lost are much greater. We also increase the cost of compliance: when the data sits on servers spread throughout the network, enforcing legal holds and applying retention policies to it becomes a lot more difficult.
There is another, more subtle problem with PSTs. There is a strong possibility that some users will have brought PSTs from other organizations, such as a history of emails from a previous job. As an organization, it is probably not appropriate for you to keep these emails around. They may actually expose you to a lawsuit if you are using stolen information to run your business.
What can be done with PSTs?
The first solution that comes to mind for most organizations when there is a storage problem is to use the Cloud. In the large data centers that sustain today’s public clouds, the cost for storage is very low. In the case of email, many organizations are moving to O365, where mailboxes are practically unlimited, so users will expect to continue to access older emails.
The solution of choice is to bring the PSTs into O365, in each user's mailbox. Microsoft offers a number of ways this can be done, but these methods are not that simple to use. Testament to this fact is that a few years ago Microsoft was offering free PST uploads to O365 to help with on-boarding users to O365. They have since stepped away from offering the service for free and now charge for the service. There also are various self-service methods, as described below.
The first is a method that leverages Azure storage and is described in Microsoft's documentation. It is a two-step process that allows you to 1) upload the files to Azure, and 2) run an import job from O365. Creating the jobs and building the PST mapping file is an error-prone process. This method is rather complex and does not scale to a large number of accounts. Furthermore, it does not help with collection and risk management. We will address risk management shortly.
The second option Microsoft offers is to ship a drive with the PSTs. The only difference from the Azure upload method above is that, instead of uploading the files to Azure yourself, you ship them to Microsoft on a drive. Microsoft then uploads the information into Azure, using the above method, for a fee.
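Because the mapping file is where many of the errors creep in, it can help to generate it from a list of PST-to-mailbox assignments instead of editing it by hand. The following is a minimal sketch, not Microsoft's tooling. The column names are written as I understand them from Microsoft's PST import documentation and should be verified against the current version; the function name, the default target folder, and the validation rules are assumptions made for illustration.

```python
import csv
import io

# Column names as understood from Microsoft's PST import documentation.
# Assumption: verify them against the current documentation before use.
MAPPING_COLUMNS = [
    "Workload", "FilePath", "Name", "Mailbox", "IsArchive",
    "TargetRootFolder", "ContentCodePage", "SPFileContainer",
    "SPManifestContainer", "SPSiteUrl",
]


def build_mapping_csv(pst_owners, to_archive=False, target_root="/ImportedPst"):
    """Render mapping-file text from a {pst_file_name: mailbox_address} dict.

    Hypothetical helper: rejects obviously bad rows before any import job
    is created, so mistakes surface early.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=MAPPING_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for pst_name, mailbox in sorted(pst_owners.items()):
        if not pst_name.lower().endswith(".pst"):
            raise ValueError(f"not a PST file: {pst_name}")
        if "@" not in mailbox:
            raise ValueError(f"not a mailbox address: {mailbox}")
        # Columns not listed here are left empty by DictWriter.
        writer.writerow({
            "Workload": "Exchange",
            "Name": pst_name,
            "Mailbox": mailbox,
            "IsArchive": "TRUE" if to_archive else "FALSE",
            "TargetRootFolder": target_root,
        })
    return buffer.getvalue()
```

Calling `build_mapping_csv({"alice.pst": "alice@example.com"})` returns the header line followed by one row per PST. Even with a generator like this, the owner of every PST still has to be known beforehand.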
None of these methods proposed by Microsoft solves the fundamental problem with PSTs. PSTs are located all over the place, so they must be collected in a central location for sorting before they can be imported. The owner of the PSTs must also be identified and the information sorted intelligently so the tool can run. Also, there is no way to ensure that sensitive information that was collected over the years under a different security paradigm is actually filtered prior to putting the information into the Cloud.
There is one final thought with regard to PSTs: what about the PSTs for users who are no longer with the organization? What do you do with these? From a business standpoint, they may still be needed. Your retention policy may call for you to keep these files for a certain period of time. It is also possible that certain users have shared their PSTs with managers or colleagues prior to leaving the organization. Special consideration should be made when sorting these emails.
The appropriate method to resolve the PST problem is a simple four-step process. This process will cover all of the issues related to bringing the PSTs to a cloud service. With every project, change management is important, so you will need to communicate with your users every step of the way. This starts with disabling the creation of PSTs and the addition of emails to them. The last thing we want is for users to create new PSTs while we are in the process of collecting them. This will have to be communicated to users with appropriate timelines.
1. COLLECTION
The first thing to do with PSTs is to collect them. If your files are already centralized in one location, this will be easy. However, there is a good chance you have PSTs all over your network. Although it is possible to find where active PSTs (those connected to Outlook) are, you will need to run a scan to find the PSTs that are stored in other locations. This can be done through smart use of a desktop management tool such as SCCM. There are a few scripts out there that may help you do this. We also provide a set of tools to inventory and collect the PSTs to a central location. The inventory will allow you to set a timeline and plan a strategy for your overall project. These tools are typically run on each workstation via a simple install using the desktop management software or a login script. They run transparently in the background to perform the work.
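To give an idea of what such a scan involves, here is a minimal sketch of an inventory script. It is not the NetGovern tooling and not an SCCM feature. It assumes the drive or share to scan is reachable as a local path, and the function names and report layout are made up for illustration. It counts the PSTs, totals their size, and groups byte-identical copies, which is how the duplicate PSTs on network drives show up.

```python
import hashlib
import os
from collections import defaultdict


def find_pst_files(root):
    """Yield (path, size_in_bytes) for every .pst file under root."""
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            if name.lower().endswith(".pst"):
                path = os.path.join(folder, name)
                try:
                    yield path, os.path.getsize(path)
                except OSError:
                    # File vanished or is unreadable; skipped in this sketch.
                    continue


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def inventory(root):
    """Summarize the PSTs under root and group byte-identical copies."""
    entries = list(find_pst_files(root))
    by_digest = defaultdict(list)
    for path, _size in entries:
        try:
            by_digest[file_digest(path)].append(path)
        except OSError:
            # A PST that is open in Outlook may be locked; it is still
            # counted, but not checked for duplicates.
            continue
    duplicates = [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
    return {
        "count": len(entries),
        "total_bytes": sum(size for _path, size in entries),
        "duplicate_groups": duplicates,
    }
```

Hashing full files is slow on multi-gigabyte PSTs, so a real scan would likely compare sizes first and hash only the candidates that match.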
2. IDENTIFICATION & SORTING
Once you have collected the PSTs from various locations (network drives, personal computers), a certain number of them will need to be identified. PSTs do not come with an owner or identifier. Unless a PST was connected to Outlook, it will be necessary to look inside it to find out who owned it. This can be very tricky, as the result depends entirely on what the user copied into the PST. NetGovern has the capability to identify and sort the PSTs by owner. This is based on a set of rules that were developed over time and through experience. This sorting process is highly automated and runs in two steps. The first is to create a repository for all active users. The second is to process the PSTs to find the owner of each one and drop it in the correct folder.
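One simple rule of this kind can be sketched as follows. This is not NetGovern's rule set, only an illustration of the idea. It assumes the sender and recipient addresses of each message have already been extracted from the PST by some other tool, and it guesses that the owner is the active user who appears on the largest share of messages. The function name and the `min_share` threshold are hypothetical.

```python
from collections import Counter


def guess_owner(messages, known_users, min_share=0.5):
    """Guess a PST owner from already-extracted message metadata.

    messages: iterable of dicts with "sender" (str) and "recipients" (list of str).
    known_users: addresses of active users.
    Returns the address that appears on the largest share of messages,
    or None when there is no clear candidate.
    """
    known = {user.lower() for user in known_users}
    counts = Counter()
    total = 0
    for message in messages:
        total += 1
        addresses = {message["sender"].lower()}
        addresses.update(r.lower() for r in message["recipients"])
        # Count each known user at most once per message.
        counts.update(addresses & known)
    ranked = counts.most_common(2)
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # Tie between two users: leave for manual review.
    owner, hits = ranked[0]
    return owner if hits / total >= min_share else None
```

A PST for which this returns `None` stays unidentified, so a second rule or a manual review has to handle it.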
3. FILTERING & CLEANSING
If we are left with PSTs that are not identified, these can also be imported into NetGovern Archive for processing, but you will have to decide what to do with them. One of the methods is to create an account in Office 365 for every user that has PSTs and who has left the organization. You would then bring the PSTs into Office 365. This can be a complex and expensive proposition. Another method is to bring these into NetGovern. Delegate access can be granted to users to be able to search and find relevant emails. For example, a manager may be granted access to emails for employees that have left the organization. Access can also be granted to the appropriate individuals for eDiscovery purposes.
A data move also presents a tremendous opportunity to clean up your data. Before you move your data to the Cloud, you can take advantage of the move to "cleanse" it. There is sensitive information that should not be found in emails. Examples include credit card information, which must be stored encrypted at all times. Other types of sensitive information include medical files (PHI) and any personally identifiable information (PII). There may also be information that your organization deems sensitive, such as design information or unannounced products, that you may be more comfortable removing from the emails before moving them to the Cloud. NetGovern provides the search capabilities and the tools to find and remove this type of information.
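As an illustration of what such a search looks for, the sketch below flags likely credit card numbers in message text. It is not the NetGovern search engine. It assumes the message body is already available as plain text, it looks for runs of 13 to 19 digits with optional spaces or dashes, and it uses the Luhn checksum to discard most random digit strings. The function names and the redaction marker are made up for the example.

```python
import re

# 13 to 19 digits, each optionally followed by one space or dash.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits):
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # Double every second digit from the right.
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text):
    """Return the digit strings in text that look like card numbers."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


def redact_card_numbers(text):
    """Replace likely card numbers with a marker, leaving other digits alone."""
    def replace(match):
        digits = re.sub(r"[ -]", "", match.group())
        return "[REDACTED]" if luhn_valid(digits) else match.group()
    return CARD_CANDIDATE.sub(replace, text)
```

A pattern like this produces both false positives and misses, so in practice the hits would be reviewed before anything is removed. PHI and PII need different detection rules.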
4. IMPORT or STORE
Finally, once you have the appropriate clean data in hand, jobs can be run to inject this data at high speed into O365 mailboxes. This data may be injected in the primary mailbox or in the personal archive mailbox. If you have the appropriate license, you may want to inject data into the personal archive mailbox to create the same end user experience. You may also decide at this point to apply your retention policy and inject only a certain number of emails into each mailbox. A copy can also be kept in the NetGovern system for eDiscovery purposes if required. Depending on your business needs, and the cost of Office 365 licenses, it may also be more appropriate to simply keep the data in NetGovern Archive. We can certainly help you make the right ROI calculation.
Handling PSTs is a lot more complex than just pushing data to the Cloud. There are retention concerns and user concerns that need to be taken into consideration. With the right tool set, at the end of this simple four-step process, you end up with the right sanitized data in the right location and have solved a pretty hairy PST problem.
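Applying a retention policy at import time comes down to a date cutoff. The sketch below shows the idea under stated assumptions: each message is a dict with a timezone-aware `received` timestamp, and the retention period is given in days. The function name and the seven-year period in the usage example are hypothetical and do not describe how NetGovern or O365 implement retention.

```python
from datetime import datetime, timedelta, timezone


def select_for_import(messages, retention_days, now=None):
    """Split messages into (to_import, expired) by received date.

    messages: iterable of dicts with a timezone-aware "received" datetime.
    retention_days: messages older than this many days are not imported.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    to_import, expired = [], []
    for message in messages:
        if message["received"] >= cutoff:
            to_import.append(message)
        else:
            expired.append(message)
    return to_import, expired


# Usage: keep roughly seven years of mail, as of a fixed reference date.
reference = datetime(2024, 1, 1, tzinfo=timezone.utc)
sample = [
    {"id": 1, "received": datetime(2023, 6, 1, tzinfo=timezone.utc)},
    {"id": 2, "received": datetime(2015, 1, 1, tzinfo=timezone.utc)},
]
keep, drop = select_for_import(sample, retention_days=7 * 365, now=reference)
```

Returning the expired messages separately, instead of discarding them, leaves room to keep them in an archive copy for eDiscovery while only the retained set is injected into the mailbox.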