“But I don’t want to back up now. I’ll do it tomorrow, I swear.”
Is there a soul alive who has created digital data and has not been exposed to the concept of data backup (redundancy) in case that data vanishes?
It does not seem likely. Backup is a lot like insurance: no one wants to buy it (or even talk about it). But what could the consequences be if we don’t create a data backup when appropriate, and a catastrophe occurs?
Vanishing data could be caused by a disk drive crash, a malware attack, an electrical surge, ransomware that holds a drive’s data hostage, or other events such as device theft. What about operating system (OS) and program files? Should they be backed up along with associated data files?
Digital data does not always refer to data created with computers or computer networks. Cameras, cell phones, smart TVs, DVRs, and a host of other electronic devices can be included in this conversation.
The dominant factor in redundancy preparation is almost always determined by the value of the digital data. Let’s consider some of the basic concepts of why backing up digital information should be considered. First, I hope this story sets the tone.
Floppy Disk Backup at ComputerLand
In 1983, I was working for ComputerLand, a retail computer store in Oklahoma City. During the 1980s, ComputerLand was the largest retail computer enterprise in the U.S., showcasing Apple and IBM personal computers as its anchor products.
The building of one of our clients, a mail order company, burned completely to the ground. The company was within a hair of losing its business entirely, with employees losing their jobs and lives changing forever. However, someone had the foresight to complete a backup of the entire company database on floppy disks and secure it in the fire safe. The data remained intact through the fire.
Because they were a client, management offered (free of charge) to take that large stack of 5¼ inch floppy disks and restore them to a new 10 megabyte hard disk. For three days, all day long, many of us took turns restoring (inserting one disk, then another) their database back to a usable state using the PC DOS Restore utility – one 360K floppy disk at a time. A very slow process, but a business was saved!
Why is Having a Backup So Important?
Organizations (or individuals) that lose a whole building and everything in it, as in the 9/11 attacks or the Oklahoma City bombing, and do not have an off-site backup have a very high chance of losing their business.
Losing a business this way depends on more than one factor, including the type of business. For example, an ice cream store that loses data due to a lightning strike can open its doors the next day and commence scooping ice cream again, without too much consternation.
But for a financial institution or an online company, where data availability is critical, the situation is very different if data is compromised at any level.
So what are the primary reasons to keep an accurate backup of business data?
The first concept, and perhaps the dominant one, is the concept of livelihood. Today’s business records are kept almost completely in digital format and stored locally and/or online. Data is produced and kept on just about every type of transaction imaginable. Losing just the right kind and amount of data can create a reduction or complete elimination of positions within any type of organization.
When data is lost, jobs can be lost, bills potentially may not be paid, and lives can be changed. A relatively simple task of backing up data to multiple locations can prevent many recovery challenges.
Next is the concept of legality. Organizations that accumulate data under legal restrictions must take extra caution when planning a data backup strategy. For example, losing financial or medical data due to carelessness or poor planning may cost more than just the data itself. Government watchdogs could be looming.
Going to court, paying for and defending yourself or your company, does not seem to be a highly productive activity for any person or organization. Legal issues can drain human and financial resources, as well as lower personal or corporate morale.
There is a reason why there are many underground and above ground secure data centers (also known as data backup storage centers or bunkers) scattered around the world. Many financial (and other) institutions send backup copies of their data to these secure data bunker centers. Backups are usually in the form of digital tape media. They can be delivered (overnight) to these centers after the day’s business activity ends and the backups are completed. Many organizations have dedicated “backup personnel” with the responsibility of seeing that their organizations’ redundancy process is completed in a timely manner.
The majority of organizations that produce data may not be under legal or mandatory data redundancy requirements. However, as a matter of policy they may incorporate programs to protect their data. This can include some or all of the following: local tape drive backups, hard drive imaging (an exact copy of the hard drive in use), R.A.I.D. (Redundant Array of Independent Disks), server imaging to remote locations, copying data files to offline hard disks or to optical media (CD-ROM, DVD-ROM, Blu-ray), and a host of other partial procedures.
For Windows users, partial redundancies could include creating a Windows Recovery Drive, keeping a backup copy of the Windows Registry, or using the System Restore (restore point) feature. For Mac (Apple Macintosh) systems, Time Machine is the built-in backup utility. Linux and UNIX operating systems have a multitude of backup programs available. Many are open source and free; others may be unique to a specific Linux distribution, such as Ubuntu or Red Hat. A simple internet search will provide many hours of backup education.
The third concept is convenience, or rather extreme inconvenience. If a dataset is lost and no redundancy exists, “inconvenient” is the mildest possible description. Really! Does this need much explanation? How inconvenient would it be to lose your livelihood, go to court, or explain to superiors or customers that their data is lost?
A data catastrophe (e.g., a hard drive crash) is bad enough. But in IT (information technology) terms, convenience is having an identical, up-to-date dataset close at hand, so that all that is needed is for a handy tech to install it and get the system up and running again within minutes! Redundancy planning and implementation can be acutely convenient compared to the alternative.
How to Back Up Data
The tools available today for data backup and security are seemingly unlimited. It could take many books to break down the dozens of software and hardware options and how they relate to the multiple operating systems used by different organizations. Instead, here are some basic technologies popular today.
The first is not a specific technology, but it is a key consideration: planning. Optimal data recovery depends on optimal planning.
Planning Ahead for Optimal Recovery
Optimal planning consists of understanding what backup technologies are available, how they are installed, their function, how they are maintained, the level of technical sophistication, the costs involved, and understanding the procedures used to restore lost datasets. It has been said that “no two computer networks are the same.” This means that perhaps no two data backup strategies would be the same even on very similar networks.
A successful backup plan covers three major categories of files: operating system (OS) files, program files, and data files. As mentioned, data is the primary target, both for a backup strategy and for hackers. However, having intact OS and program files can eliminate hours of additional labor restoring a hard drive. They are less critical than data because organizations either have the installation media in their possession or hold licenses that allow the software to be downloaded and reinstalled.
Is data the only backup target? If it is, OS and program files will have to be reinstalled after a loss, and they may have to be updated as well. Updating to the most recent versions of these files adds more time to recovery. It is almost a certainty that in today’s internet climate, not updating OS and program files can lead to security risks and software operation glitches. Organizations may also want to create a backup security policy to protect these backups. Consider holding a backup planning meeting to create a plan and get approval for it.
Copy & Paste
But what is a computer backup? You have probably already used the most common backup technique! The most common utility known for duplicating files for safekeeping is the copy feature offered by all common operating systems. While copy is not considered a formal backup strategy, it is widely used to create file redundancy. It’s quick and easy to use.
Windows graphical user interface (GUI) users know this as copy-and-paste (Ctrl+C, then Ctrl+V) or cut-and-paste (Ctrl+X, then Ctrl+V). From the command line, MS-DOS, PC DOS, and Microsoft Windows offer the copy and xcopy commands, and Windows also includes robocopy. Both xcopy and robocopy offer more features than the easy-to-use copy command. More complex data backup software is also available.
Many of the Linux distributions that have a GUI offer the copy and paste function. From the Linux command line, the cp utility can be used.
Mac users also have copy and paste functions. OS X offers both copy and paste and cut and paste functions. The copy and paste function can be done two ways: with Command+C and Command+V, or by dragging with the mouse while holding the Option key.
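The same copy-for-safekeeping idea can also be scripted. The following is a minimal Python sketch, not a formal backup tool; the function name `copy_for_safekeeping` and the example paths are hypothetical and chosen only for illustration.

```python
import shutil
from pathlib import Path


def copy_for_safekeeping(source: Path, destination_dir: Path) -> Path:
    """Copy one file or one whole folder into destination_dir; return the new path."""
    destination_dir.mkdir(parents=True, exist_ok=True)
    target = destination_dir / source.name
    if source.is_dir():
        # copytree copies the folder recursively; dirs_exist_ok allows repeat runs
        # (requires Python 3.8 or newer).
        shutil.copytree(source, target, dirs_exist_ok=True)
    else:
        # copy2 copies the contents and tries to preserve timestamps as well.
        shutil.copy2(source, target)
    return target


if __name__ == "__main__":
    # Hypothetical paths; point these at a real file and a real backup drive.
    source = Path("report.txt")
    if source.exists():
        print(copy_for_safekeeping(source, Path("backup_copies")))
```

Like copy and paste, this creates a second copy only at the moment it runs, so it has to be repeated whenever the original changes.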
Backup media is not discussed in detail here. Backup media generally includes one or more of the following: magnetic tape systems, external or internal (to the computer) hard disk drives, USB (jump/thumb) drives, optical media such as CD-ROMs, DVDs, and Blu-ray discs, remote backup servers (computers located elsewhere, such as across the room or in another city), or online secure data backup.
Full, Differential, Incremental vs. R.A.I.D.
Data backup strategies vary widely. Two common approaches that have been around since the 1970s and 1980s are the full, differential, and incremental backup types on the one hand, and R.A.I.D. on the other. These are not necessarily alternatives to one another; they can be used together to help provide the most secure redundancy possible. Each performs a very different service and can be used with various kinds of data backup software.
The full strategy functions almost like it sounds. Although all strategies have software settings allowing customization, the full backup service is designed to back up “everything” on a hard disk drive. This includes all OS, program, and data files. Should a computer hard disk drive become unusable or be stolen, the full backup can be used to recreate the state of the hard disk drive as of the moment the full backup was completed.
The full strategy has an important advantage. Only one file is created, containing everything on the hard disk drive. All data, programs, OS and user settings are kept intact. Restore the file to a new or blank hard disk and everything should be exactly as it was at the moment the backup was taken. It’s also fairly easy to copy or distribute a single file to provide additional security for the backed up data.
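To make the “one file containing everything” idea concrete, here is a minimal Python sketch that packs a folder into a single compressed archive and restores it elsewhere. It is an illustration under stated assumptions: it works at the folder level rather than imaging an entire disk, and the function names `full_backup` and `restore_full_backup` are hypothetical.

```python
import tarfile
from pathlib import Path


def full_backup(source_dir: Path, archive_path: Path) -> Path:
    """Pack every file under source_dir into one compressed archive file."""
    # Keep archive_path outside source_dir so the archive does not include itself.
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname="." stores paths relative to source_dir, so the restore
        # target can be a different folder or a different machine.
        archive.add(source_dir, arcname=".")
    return archive_path


def restore_full_backup(archive_path: Path, target_dir: Path) -> None:
    """Unpack the archive into target_dir. Only use archives you created yourself."""
    target_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(target_dir)


if __name__ == "__main__":
    # Hypothetical paths for illustration only.
    data = Path("company_data")
    if data.is_dir():
        full_backup(data, Path("full_backup.tar.gz"))
```

The single archive can then be copied to a second drive or an off-site location like any other file.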
Challenges also follow the full strategy. That one single file is often enormous in size. Depending on how much information is on a hard disk drive, it can take hours for the backup process to complete. It can also take hours for the file to reassemble on a new hard disk drive during the restore process. The backup file only contains data as of the date of backup. Any changes made since the backup are not included. Individual files are not accessible, meaning that to access any programs, folders, or files on the backed up media, the backup file would have to be restored first. This can be an inconvenience.
(Note: There are newer backup systems now available that do allow for individual folder and file access. Some of these newer systems also allow for an accelerated backup and restore process, saving time.)
For backing up files that have changed since the last full backup, differential or incremental strategies can be used in conjunction with the full strategy. A differential is a type of backup that copies all the data that has changed since the last full backup.
A full/differential recovery would include restoring the last full backup first, and then the last differential backup performed. Differential file sizes would be much smaller, allowing for a quicker restore, at least on the differential restore. Again using this system, two files are used for restoring, the last full and the last differential.
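On DOS and Windows systems, backup software tracks changed files with the archive bit, a file attribute that is set whenever a file is created or modified. The differential type of backup does not clear the archive bit. This means that the next time a differential backup is performed, it again backs up everything that has changed since the last full backup. Previous differential files can be discarded.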
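The selection step of a differential backup can be sketched in a few lines of Python. This sketch makes one loud assumption: instead of the archive bit, it compares each file’s modification time against the time of the last full backup, which is a portable stand-in and not how any particular backup product works. The function name `files_changed_since` is hypothetical.

```python
from pathlib import Path


def files_changed_since(source_dir: Path, last_full_backup_time: float) -> list[Path]:
    """List files under source_dir modified after the last full backup.

    last_full_backup_time is a Unix timestamp (seconds). It stays fixed between
    differential runs, just as a differential backup leaves the archive bit alone.
    """
    changed = []
    for path in sorted(source_dir.rglob("*")):
        if path.is_file() and path.stat().st_mtime > last_full_backup_time:
            changed.append(path.relative_to(source_dir))
    return changed
```

Because the reference time never moves until the next full backup, each run returns everything changed since that full backup, so the list grows from one differential to the next.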
A full/incremental strategy is also an option. With an incremental strategy, each backup captures only the changes made since the most recent backup of any type, which is usually the previous incremental backup. A completed incremental backup does clear the archive bit, so files it has already captured are not copied again by the next incremental backup unless they change in the meantime.
For example, suppose a full backup runs on Friday, incremental backups run on Monday, Tuesday, Wednesday, and Thursday, and the hard disk crashes the following Friday. Bringing a replacement drive up to date through Thursday would mean restoring a total of five files: the last full backup, followed by the incremental backups for Monday through Thursday, in order.
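The restore order for both schedules can be worked out mechanically. The sketch below is a simplified illustration: the `restore_chain` function and the backup labels are hypothetical, and it assumes every backup after the last full one is of a single type (all differential or all incremental).

```python
def restore_chain(backups: list[tuple[str, str]]) -> list[str]:
    """Return the backup labels to restore, in order.

    backups is a chronological list of (label, kind) pairs, where kind is
    "full", "differential", or "incremental".
    """
    full_positions = [i for i, (_, kind) in enumerate(backups) if kind == "full"]
    if not full_positions:
        raise ValueError("no full backup to start from")
    start = full_positions[-1]
    chain = [backups[start][0]]          # always begin with the last full backup
    later = backups[start + 1:]
    kinds = {kind for _, kind in later}
    if kinds == {"incremental"}:
        # Every incremental since the full backup is needed, oldest first.
        chain.extend(label for label, _ in later)
    elif kinds == {"differential"}:
        # Only the most recent differential is needed.
        chain.append(later[-1][0])
    elif kinds:
        raise ValueError("this sketch does not handle mixed schedules")
    return chain


week = [("Fri full", "full"), ("Mon", "incremental"), ("Tue", "incremental"),
        ("Wed", "incremental"), ("Thu", "incremental")]
print(restore_chain(week))  # ['Fri full', 'Mon', 'Tue', 'Wed', 'Thu']
```

With the same week run as differentials instead, the function returns only two labels: the Friday full backup and the Thursday differential.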
Backup and Restore Schedule Samples
| Backup | Full | Differential | Incremental | Restore from Differential | Restore from Incremental |
| --- | --- | --- | --- | --- | --- |
| Backup #1 | All data | - | - | Last Full | Last Full |
| Backup #2 | All data | Changes from backup #1 | Changes from backup #1 | - | Backup #2 |
| Backup #3 | All data | Changes from backup #1 | Changes from backup #2 | - | Backup #3 |
| Backup #4 | All data | Changes from backup #1 | Changes from backup #3 | - | Backup #4 |
| Backup #5 | All data | Changes from backup #1 | Changes from backup #4 | Backup #5 | Backup #5 |
Determining which system and backup frequency would be optimal for any given computer or network requires evaluating how the systems are used, the importance of timely recovery, and the volume of data to be restored. In addition, a hybrid of full, differential, and incremental can be implemented.
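A rough calculation can help with that evaluation. The sketch below compares storage used and restore effort for the three schedules. The numbers and the `compare_strategies` function are hypothetical, and the model is deliberately simple: it assumes each day changes a different, fixed amount of existing data, so the total data size stays constant and no file changes twice.

```python
def compare_strategies(full_gb: int, daily_change_gb: int, days_after_full: int) -> dict:
    """Estimate storage and restore effort after one full backup plus n daily backups."""
    n = days_after_full
    return {
        "full every day": {
            "stored_gb": full_gb * (n + 1),
            "restored_gb": full_gb,
            "files_to_restore": 1,
        },
        "full + differential": {
            # Day k's differential holds k days of changes: 1 + 2 + ... + n.
            # This counts every differential; discarding old ones lowers the total.
            "stored_gb": full_gb + daily_change_gb * n * (n + 1) // 2,
            "restored_gb": full_gb + daily_change_gb * n,
            "files_to_restore": 2 if n else 1,
        },
        "full + incremental": {
            # Each incremental holds only one day of changes.
            "stored_gb": full_gb + daily_change_gb * n,
            "restored_gb": full_gb + daily_change_gb * n,
            "files_to_restore": n + 1,
        },
    }


if __name__ == "__main__":
    # Hypothetical figures: a 100 GB full backup, 2 GB changing per day, 4 daily backups.
    for name, numbers in compare_strategies(100, 2, 4).items():
        print(name, numbers)
```

Under these assumptions, daily full backups store 500 GB but restore from one file, differentials store 120 GB and restore from two files, and incrementals store 108 GB but restore from five files.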
R.A.I.D. is another option that can be used on its own, or with any other data security strategy. R.A.I.D. is not a single technique; it is offered in many different forms called levels. Some of the most popular levels are R.A.I.D. 1, R.A.I.D. 5, and R.A.I.D. 10.
R.A.I.D. usually requires more technical knowledge to install and implement. Knowledge of computer BIOSes, microprocessors, PC motherboards, controller cards, and software is required before optimal R.A.I.D. implementations can be incorporated into PCs, servers, and networks.
The basic concept of R.A.I.D. is that multiple hard drives are installed into an array. When a hard drive (or hard drive controller chip) fails, one of two actions can take place, depending on the R.A.I.D. level. First, a duplicate hard drive could be immediately available because of a process called mirroring (an exact copy). The mirrored hard drive is instantly online and the users do not know that an error (crash) has occurred.
Another R.A.I.D. level might send an error message to the administrator that a crash has occurred. The administrator can then rebuild the lost data onto a new hard drive using data stored on the remaining hard drives in the array. No data is lost; only a little time is required to get the system back up and running.
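Rebuilding from the remaining drives is possible because parity-based levels such as R.A.I.D. 5 store an XOR checksum alongside the data. The Python sketch below shows only that arithmetic on three small byte blocks standing in for three drives; real arrays stripe data and rotate parity across drives, and the function names here are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks: list[bytes], parity: bytes) -> bytes:
    """Recover the one missing block from the survivors plus the parity block."""
    return xor_blocks(surviving_blocks + [parity])


drives = [b"ABCD", b"EFGH", b"IJKL"]     # data blocks on three drives
parity = xor_blocks(drives)              # stored on a fourth drive

# Simulate losing the second drive, then rebuild it from the rest.
recovered = rebuild_missing([drives[0], drives[2]], parity)
print(recovered)  # b'EFGH'
```

XOR works here because combining the parity with every surviving block cancels them out and leaves only the missing block. This protects against a single drive failure per parity group, not against two at once.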
There are many variations of R.A.I.D. The levels differ from one another, and their implementation can also differ depending on the OS in use. R.A.I.D. is a very popular concept, and is widely included in data redundancy plans.
Much has been written about data security, and it is available in many places, including online. However, like a regular Sunday sermon, we users need to be constantly reminded to secure our data before it becomes lost forever!
Whether you are a home user who has spent many hours scanning your family albums, a financial or medical organization, or anything in between, you can help prevent inconvenience, loss of livelihood, and legal consequences by protecting important data. Now is a great time to plan for a data loss emergency.
Perform a data backup now!
Born in Billings, MT, Gary attended the Oklahoma City University, the University of Central Oklahoma, The University of Oklahoma, and Oklahoma City Community College. His proudest accomplishments come from helping students connect with job placement contacts. He’s worked with companies like Dell, Purina, and many others to facilitate great jobs.