David Neiger, 31 July 2009, 1:41 PM
Acronis and Symantec are spruiking the massive benefits of a new 'de-duping' technology that squashes your home backups into less disk space.
One of the biggest problems with large hard disks (both at home and in data centres) is where to back up all of that data. Let’s face it, you would have to be insane to attempt to back up your 1TB hard disk onto 228 DVDs. Even with data compression (which saves about 50 – 60% on typical word processing and database files, considerably less on music and video) that’s still around 100 DVDs!
Although more and more people are opting for big NAS boxes with multiple drive arrays at home, the latest weapon against the terabytes of data stored on hard disks throughout the world is deduplication, or deduping. This technology has existed in data centres for several years but is only now affordable for home and small business use.
Deduping technology looks for identical files, and identical blocks within files, and backs up only one copy of the data; any subsequent copies are replaced by links to the original. According to Acronis, who have just released version 10 of their Acronis Backup software, deduping can save up to 90% of backup storage. By our calculations that could let you fit that 1TB hard disk onto just 23 DVDs, although in practice this would be extremely unlikely unless you kept saving the same file over and over again in different directories.
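To make the idea concrete, here is a minimal sketch of block-level deduping in Python. It is not how Acronis or Symantec implement it; the fixed 4096-byte block size, the use of SHA-256 and the function names `dedupe` and `restore` are all our own assumptions for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size; real products may chunk differently


def dedupe(files, block_size=BLOCK_SIZE):
    """Store each unique block once and describe every file as a list of block hashes."""
    store = {}      # hash -> block bytes (the single stored copy)
    manifests = {}  # file name -> ordered list of hashes (the "links" to stored blocks)
    for name, data in files.items():
        hashes = []
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            digest = hashlib.sha256(block).hexdigest()
            store.setdefault(digest, block)  # keep only the first copy of a block
            hashes.append(digest)
        manifests[name] = hashes
    return store, manifests


def restore(name, store, manifests):
    """Rebuild a file by following its list of block hashes."""
    return b"".join(store[digest] for digest in manifests[name])


# Hypothetical test data: a 4-block file, an exact copy, and an edited copy.
report = b"".join(bytes([i]) * BLOCK_SIZE for i in range(4))
files = {
    "docs/report.doc": report,
    "backup/report_copy.doc": report,
    "docs/report_v2.doc": report + b"extra paragraph",
}

store, manifests = dedupe(files)
raw_bytes = sum(len(data) for data in files.values())
stored_bytes = sum(len(block) for block in store.values())
print(f"raw: {raw_bytes} bytes, stored: {stored_bytes} bytes, "
      f"saving: {1 - stored_bytes / raw_bytes:.0%}")
```

In this toy example the three files share four blocks, so only five unique blocks are stored and roughly two thirds of the space is saved. That is only because two of the three files are near-identical copies, which is exactly the situation where deduping shines.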
On a more practical level, if you run a server with several virtual machines, there is a very good chance that you have several instances of exactly the same file stored both on the host server and in each of the VMs. Even on workstations there are often several copies of DLLs, common files and other data spread throughout the hard disk. Mail servers also tend to host a considerable amount of duplicated data, as several users often keep their own copies of the same emails.
If you are wondering just how much money you could save, Acronis have a calculator on their website which shows you the massive savings you can make by using their technology. In our little test environment with 200GB of data on the servers and 30GB on the workstations, we could save a massive $13,197 per year. Bring on the Porsche!

As with any ROI calculator, the figures are heavily skewed in the company’s favour. In reality, achieving a 90% saving in data storage would rely on the server holding multiple copies of the same data, such as a mail server where everyone has cc’ed everyone else the same joke.
It also assumes that you do multiple full backups over time, and that most of the data does not change, so that the same data is backed up over and over again (and that you don’t just do incremental backups). The other assumption is that it costs around $10 - 15 per GB to back up this data. Last time we looked, 1TB drives were selling for under $200 (20c per GB), although if you were to back up to tape and pay staff to manage the tapes then data storage could be expensive. Maybe in a large data centre you could achieve significant cost savings. For home or small business use, though, we suspect that deduping will let you cram more backups onto the hard disk but will not deliver such massive cost savings, particularly at the current prices of large hard drives.
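The effect of those assumptions can be roughed out in a few lines of Python. This is a back-of-envelope sketch, not the vendor's calculator: the ten retained full backups, the 2% change rate between backups and the function name `backup_storage_gb` are our own assumptions, while the 230GB data set and the two per-GB prices come from the figures above.

```python
def backup_storage_gb(full_size_gb, num_backups, change_rate, dedup=True):
    """Estimate GB needed to keep num_backups full backups.

    change_rate is the assumed fraction of data that changes between backups.
    With dedup, only the first backup plus the changed data is stored.
    """
    if not dedup:
        return full_size_gb * num_backups
    return full_size_gb + full_size_gb * change_rate * (num_backups - 1)


data_gb = 200 + 30   # servers plus workstations, from the test environment
backups_kept = 10    # assumption: ten full backups retained
change_rate = 0.02   # assumption: 2% of the data changes between backups

plain = backup_storage_gb(data_gb, backups_kept, change_rate, dedup=False)
deduped = backup_storage_gb(data_gb, backups_kept, change_rate)
saved_gb = plain - deduped

print(f"without dedup: {plain:.0f}GB, with dedup: {deduped:.1f}GB")
print(f"storage saving: {saved_gb / plain:.0%}")
for label, price_per_gb in [("calculator-style $10/GB", 10.0), ("bare disk at 20c/GB", 0.20)]:
    print(f"{label}: ${saved_gb * price_per_gb:,.2f} saved")
```

With these inputs the storage saving comes out close to the headline 90%, but the dollar value depends almost entirely on the price per GB: about $20,000 at $10 per GB versus about $400 at 20c per GB.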