What is Purging
Purging is the process of freeing up space in the database or of deleting obsoletedata that is not required by the system. The purge process can be based on the age of the data or the type of data. Archiving is the process of backing up the obsolete data that will be deleted during the purge process
Reasons not to purge:
1. Disk space is cheap, why not keep the data
2. You don't know if you will ever "Need" the data
3. What if we are audited
Reason to purge the data
1. Improve performance - Why would you want to search through 1 Million records when you could search through 500k records - Query times,
2. Legal reason -PII requires you to setup a purging policy.
3. Data beyond X number of years is irrelevant.
4. Costs of maintaining data - physical disks, space, bloated databases
Sample Purging policy
• Every day, purge data older than 45 days from the current data set, moving it to the history data set.
• The first day of every month, purge data older than 13 months from the history data set, moving it to the dormant data set.
• Every day, purge data older than eight years from the dormant data set
Even if your organization utilizes an ODS where you have current information for X years and then an EDW where you have historical data, the EDW would still need to have some sort of purging policy or you will experience problems down the road in database performance or lack thereof. Working with your data governance team to review issues with data, setting data standards and retention policy can only be good for your organization. Sometimes your organization has to champion the data governance for your organization in order to show the business that data is important to your organization but maintaining that data is important to both IT and the business.
No comments:
Post a Comment