PDQIE - PDQ Industrial Electric
Data Integrity, Backup and Recovery
In Information Technology, a backup or the process of backing up refers to making copies of data so that these
additional copies may be used to restore the original after a data loss event. The verb form is 'back up' in two
words, whereas the noun is 'backup' (often used like an adjective in compound nouns).
Backups have two distinct purposes. The primary purpose is to recover data after a loss, whether caused by data
deletion or data corruption. Data loss is a very common experience of computer users; 67% of internet users are
reported to have suffered serious data loss. The secondary purpose of backups is to recover data from a historical period
of time within the constraints of a user-defined data retention policy, typically configured within a backup
application for how long copies of data are required. Though backups popularly represent a simple form of
disaster recovery, and should be part of a disaster recovery plan, backups by themselves should not be
considered disaster recovery. Not all backup systems or backup applications are able to reconstitute a
computer system, or other complex configurations such as a computer cluster, Active Directory servers,
or a database server, by restoring only the data from a backup.
Since a backup system contains at least one copy of all data worth saving, the data storage requirements are
considerable. Organizing this storage space and managing the backup process is a complicated undertaking. A
data repository model can be used to provide structure to the storage. In the modern era of computing there are
many different types of data storage devices that are useful for making backups. There are also many different
ways in which these devices can be arranged to provide geographic redundancy, data security, and portability.
Before data is sent to its storage location, it is selected, extracted, and manipulated. Many different
techniques have been developed to optimize the backup procedure. These include optimizations for dealing with
open files and live data sources as well as compression, encryption, and de-duplication, among others. Many
organizations and individuals want confidence that the process is working as expected, and so work to define
measurements and validation techniques. It is also important to recognize the limitations and human factors
involved in any backup scheme.
STORAGE, THE BASE OF A BACKUP SYSTEM
DATA REPOSITORY MODELS
Any backup strategy starts with a concept of a data repository. The backup data needs to be stored
somehow and probably should be organized to a degree. It can be as simple as a sheet of paper with a list of all
backup tapes and the dates they were written or a more sophisticated setup with a computerized index, catalog, or
relational database. Different repository models have different advantages. This is closely related to choosing a
backup rotation scheme.
UNSTRUCTURED
An unstructured repository may simply be a stack of floppy disks or CD-R/DVD-R media with minimal information about
what was backed up and when. This is the easiest to implement, but probably the least likely to achieve a high
level of recoverability.
FULL + INCREMENTALS
A full + incremental repository aims to make it more feasible to store several copies of the source data. At first,
a full backup (of all files) is made. After that, any number of incremental backups can be made. There are many
different types of incremental backups, but they all attempt to only back up a small amount of data (when compared
to the size of a full backup). An incremental backup copies everything that has changed since the last backup
(full, differential or incremental). Restoring a whole system to a certain point in time would require locating the
last full backup taken previous to that time and all the incremental backups that cover the period of time between
the full backup and the particular point in time to which the system is supposed to be restored. The scope of an
incremental backup is typically defined as the period of time between other full or incremental backups. Different
implementations of backup systems frequently use specialized or conflicting definitions of these terms.
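To make the restore logic concrete, here is a minimal Python sketch (the BackupRun type and restore_chain helper are illustrative, not any particular product's API) that selects the last full backup taken before a target time plus every incremental in between:

```python
from dataclasses import dataclass

@dataclass
class BackupRun:
    kind: str         # "full" or "incremental"
    timestamp: float  # when the backup was taken

def restore_chain(runs, target_time):
    """Return the backups to replay, oldest first, to reach target_time."""
    # Last full backup taken at or before the target time.
    fulls = [r for r in runs if r.kind == "full" and r.timestamp <= target_time]
    if not fulls:
        raise ValueError("no full backup precedes the requested point in time")
    base = max(fulls, key=lambda r: r.timestamp)
    # Every incremental between the full backup and the target time.
    increments = sorted(
        (r for r in runs
         if r.kind == "incremental" and base.timestamp < r.timestamp <= target_time),
        key=lambda r: r.timestamp,
    )
    return [base] + increments
```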
DIFFERENTIAL BACKUP
A differential backup copies files that have been created or changed since the last full backup. It does not mark
files as having been backed up (in other words, the archive attribute is not cleared). If you are performing a
combination of full and differential backups, restoring files and folders requires that you have the last full as
well as the last differential backup.
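In selection terms, the only difference from an incremental backup is the reference time that files are compared against, as in this minimal sketch (the helper name and paths are illustrative; real tools may test the archive bit rather than modification times):

```python
import os

def changed_since(root, reference_time):
    """Yield paths under root modified after reference_time."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > reference_time:
                yield path

# Incremental: the reference is the previous backup of any kind.
# incremental_set = list(changed_since("/data", last_backup_time))
# Differential: the reference is always the last *full* backup.
# differential_set = list(changed_since("/data", last_full_backup_time))
```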
REVERSE DELTA
A reverse delta system stores the differences between current versions of a system and previous versions. A reverse
delta backup will start with a normal full backup. After the full backup is performed, the system will periodically
synchronize the full backup with the live copy, while storing the data necessary to reconstruct older versions.
This can be done using either hard links or binary diffs. This system works particularly well for large,
slowly changing data sets. Examples of programs that use this method are rdiff-backup and Time Machine.
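A rough sketch of the hard-link variant, in the spirit of Time Machine or rsync's --link-dest option (the snapshot function is illustrative): files unchanged since the previous snapshot are hard-linked to it, so each snapshot looks complete while only changed files consume new space.

```python
import filecmp
import os
import shutil

def snapshot(source, new_snap, prev_snap=None):
    """Create new_snap from source, hard-linking files unchanged since prev_snap."""
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        os.makedirs(os.path.join(new_snap, rel), exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(new_snap, rel, name)
            old = os.path.join(prev_snap, rel, name) if prev_snap else None
            if old and os.path.exists(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)       # unchanged: share the previous snapshot's copy
            else:
                shutil.copy2(src, dst)  # new or changed: store a fresh copy
```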
CONTINUOUS DATA PROTECTION
Instead of scheduling periodic backups, the system immediately logs every change on the host system. This is
generally done by saving byte or block-level differences rather than file-level differences. It differs from simple
disk mirroring in that it enables a roll-back of the log and thus restoration of an old image of the data.
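As a toy illustration of the idea (not how production CDP products are implemented), the sketch below journals the previous contents of every block on each write, so the device can be rolled back to any earlier moment:

```python
import time

class JournaledDevice:
    """Toy block device that logs every write so any past state can be rebuilt."""

    def __init__(self, nblocks, block_size=4096):
        self.blocks = [bytes(block_size) for _ in range(nblocks)]
        self.journal = []  # (timestamp, block_number, old_contents)

    def write(self, n, data):
        self.journal.append((time.time(), n, self.blocks[n]))
        self.blocks[n] = data

    def state_at(self, t):
        """Roll the log back to reconstruct the device as it was at time t."""
        blocks = list(self.blocks)
        for ts, n, old in reversed(self.journal):
            if ts > t:
                blocks[n] = old
        return blocks
```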
FULL SYSTEM BACKUP
This type of backup is designed to allow an entire PC to be recovered "bare metal", without separately
reinstalling the operating system, application software, and data. Most users understand that a backup will
prevent "data" from being lost; the expense in a full system recovery is in the hours it takes a technician to
rebuild a machine to the point of restoring the last data backup. A full system backup therefore makes a complete
image of the computer so that, if needed, it can be copied back to the PC, usually with disk-imaging software
such as Ghost, and the user can carry on from that point.
STORAGE MEDIA
Regardless of the repository model that is used, the data has to be stored on some data storage
medium somewhere.
MAGNETIC TAPE
Magnetic tape has long been the most
commonly used medium for bulk data storage, backup, archiving, and interchange. Tape has typically had an order of
magnitude better capacity/price ratio when compared to hard disk, but recently the ratios for tape and hard disk
have become a lot closer.[6] There are myriad formats, many of which are proprietary or specific to certain markets
like mainframes or a particular brand of personal computer. Tape is a sequential access medium, so even though
access times may be poor, the rate of continuously writing or reading data can actually be very fast. Some new tape
drives are even faster than modern hard disks. A principal advantage of tape is that it has been used for this
purpose for decades (much longer than any alternative) and its characteristics are well understood.
HARD DISK
The capacity/price ratio of hard disk
has been rapidly improving for many years. This is making it more competitive with magnetic tape as a bulk storage
medium. The main advantages of hard disk storage are low access times, availability, capacity and ease of use.[7]
External disks can be connected via local interfaces like SCSI, USB, FireWire, or eSATA, or via longer distance
technologies like Ethernet, iSCSI, or Fibre Channel. Some disk-based backup systems, such as Virtual Tape
Libraries, support data deduplication which can dramatically reduce the amount of disk storage capacity consumed by
daily and weekly backup data. The main disadvantages of hard disk backups are that they are easily damaged,
especially while being transported (e.g., for off-site backups), and that their stability over periods of years is
a relative unknown.
OPTICAL STORAGE
Blu-ray Discs dramatically
increase the amount of data possible on a single optical storage disk. Systems containing Blu-ray discs can store
massive amounts of data and be more cost efficient than hard drives and magnetic tape. Some optical storage systems
allow for cataloged data backups without human contact with the discs, supporting longer-term data integrity. A
recordable CD can be used as a backup device. One advantage of CDs is that they can be restored on any machine with
a CD-ROM drive. (In practice, writable CD-ROMs are not always universally readable.) In addition, recordable CDs
are relatively cheap. Another common format is recordable DVD. Many optical disk formats are WORM type, which makes
them useful for archival purposes since the data can't be changed. Other rewritable formats can also be utilized
such as CD-RW or DVD-RAM.
FLOPPY DISK
During the 1980s and early 1990s, many
personal/home computer users associated backing up mostly with copying to floppy disks. The low data capacity of a
floppy disk makes it an unpopular and obsolete choice today.
SOLID STATE STORAGE
Also known as flash memory,
thumb drives, USB flash drives, CompactFlash, SmartMedia, Memory Stick, Secure Digital cards, etc., these devices
are relatively costly for their low capacity, but offer excellent portability and ease-of-use.
REMOTE BACKUP SERVICE
As broadband internet access
becomes more widespread, remote backup services are gaining in popularity. Backing up via the internet to a remote
location can protect against some worst-case scenarios such as fires, floods, or earthquakes which would destroy
any backups in the immediate vicinity along with everything else. There are, however, a number of drawbacks to
remote backup services. First, Internet connections are usually slower than local data storage devices. Residential
broadband is especially problematic as routine backups must use an upstream link that's usually much slower than
the downstream link used only occasionally to retrieve a file from backup. This tends to limit the use of such
services to relatively small amounts of high value data. Secondly, users must trust a third party service provider
to maintain the privacy and integrity of their data, although confidentiality can be assured by encrypting the data
before transmission to the backup service with an encryption key known only to the user. Ultimately the backup
service must itself use one of the above methods so this could be seen as a more complex way of doing traditional
backups.
MANAGING THE DATA REPOSITORY
Regardless of the data repository model or data storage media used for backups, a balance needs to
be struck between accessibility, security and cost. These media management methods are not mutually exclusive and
are frequently combined to meet the needs of the situation. Using on-line disks for staging data before it is sent
to a near-line tape library is a common example.
ON-LINE
On-line backup storage is typically the most accessible type of data storage; a restore can begin in
milliseconds. A good example would be an internal hard disk or a disk array (perhaps connected to a SAN).
This type of storage is very convenient and speedy,
but is relatively expensive. On-line storage is quite vulnerable to being deleted or overwritten, either by
accident, by intentional malevolent action, or in the wake of a data-deleting virus payload.
NEAR-LINE
Near-line storage is typically less
accessible and less expensive than on-line storage, but still useful for backup data storage. A good example would
be a tape library with restore times ranging from seconds to a few minutes. A mechanical device is usually involved
in moving media units from storage into a drive where the data can be read or written. Generally it has safety
properties similar to on-line storage.
OFF-LINE
Off-line storage requires some direct
human action in order to make access to the storage media physically possible. This action is typically inserting a
tape into a tape drive or plugging in a cable that allows a device to be accessed. Because the data is not
accessible via any computer except during limited periods in which it is written or read back, it is largely immune
to a whole class of on-line backup failure modes. Access time will vary depending on whether the media is on-site
or off-site.
OFF-SITE DATA PROTECTION
To protect against a
disaster or other site-specific problem, many people choose to send backup media to an off-site vault. The vault
can be as simple as a system administrator's home office or as sophisticated as a disaster-hardened,
temperature-controlled, high-security bunker that has facilities for backup media storage. Importantly, a data
replica can be off-site but also on-line (e.g., an off-site RAID mirror). Such a replica has fairly limited value
as a backup, and should not be confused with an off-line backup.
BACKUP SITE OR DISASTER RECOVERY CENTER (DR CENTER)
In the event of a disaster, the data on backup media will not be sufficient to recover.
Computer systems onto which the data can be restored and properly configured networks are necessary too. Some
organizations have their own data recovery centers that are equipped for this scenario. Other organizations
contract this out to a third-party recovery center. Because a DR site is itself a huge investment, backing up is
very rarely considered the preferred method of moving data to a DR site. A more typical way would be remote disk
mirroring, which keeps the DR data as up to date as possible.
SELECTION AND EXTRACTION OF DATA
A successful backup job starts with selecting and extracting coherent units of data. Most data on
modern computer systems is stored in discrete units, known as files. These files are organized into filesystems.
Files that are actively being updated can be thought of as "live" and present a challenge to back up. It is also
useful to save metadata that describes the computer or the filesystem being backed up.
Deciding what to back up at any given time is a harder process than it seems. By backing up too
much redundant data, the data repository will fill up too quickly. Backing up an insufficient amount of data can
eventually lead to the loss of critical information.
FILES
COPYING FILES
Making copies of files is the simplest and most common way to perform a backup. A means to perform this basic
function is included in all backup software and all operating systems.
PARTIAL FILE COPYING
Instead of copying whole
files, one can limit the backup to only the blocks or bytes within a file that have changed in a given period of
time. This technique can use substantially less storage space on the backup medium, but requires a high level of
sophistication to reconstruct files in a restore situation. Some implementations require integration with the
source filesystem.
When backing up over a network, the rsync utility automatically transmits a minimum set of changes to bring an
earlier version of a file at the destination up to date with the current version at the source. Rsync can
dramatically reduce the network traffic needed to maintain a remote mirror of a large set of files undergoing
small, frequent changes.
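rsync's real algorithm uses rolling weak checksums plus strong checksums so it can match blocks at arbitrary offsets; the simplified fixed-offset sketch below conveys only the core idea of shipping just the blocks whose hashes differ:

```python
import hashlib

BLOCK = 4096

def block_hashes(data):
    return [hashlib.sha256(data[i:i + BLOCK]).digest()
            for i in range(0, len(data), BLOCK)]

def delta(old, new):
    """Blocks of `new` that differ from `old` (simplified, fixed-offset blocks)."""
    old_hashes = block_hashes(old)
    changes = []
    for i in range(0, len(new), BLOCK):
        idx = i // BLOCK
        block = new[i:i + BLOCK]
        if idx >= len(old_hashes) or hashlib.sha256(block).digest() != old_hashes[idx]:
            changes.append((i, block))  # only these blocks need to cross the network
    return changes

def apply_delta(old, changes, new_len):
    """Rebuild the new version from the old copy plus the changed blocks."""
    out = bytearray(old[:new_len].ljust(new_len, b"\0"))
    for offset, block in changes:
        out[offset:offset + len(block)] = block
    return bytes(out)
```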
FILESYSTEMS
FILESYSTEM DUMP
Instead of copying files within a filesystem, a copy of the whole filesystem itself can be made. This is also known
as a raw partition backup and is related to disk imaging. The process usually involves unmounting the filesystem
and running a program like dd (Unix). Because the disk is read sequentially and with large buffers, this type of
backup can be much faster than reading every file normally, especially when the filesystem contains many small
files, is highly fragmented, or is nearly full. But because this method also reads the free disk blocks that
contain no useful data, this method can also be slower than conventional reading, especially when the filesystem is
nearly empty. Some filesystems, such as XFS, provide a "dump" utility that reads the disk sequentially for high
performance while skipping unused sections. The corresponding restore utility can selectively restore individual
files or the entire volume at the operator's choice.
IDENTIFICATION OF CHANGES
Some filesystems have an archive bit for each file that says it was recently changed. Some backup software looks at
the date of the file and compares it with the last backup to determine whether the file was changed.
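On Windows, testing the archive attribute reduces to a single stat call, as in this sketch (the st_file_attributes field is Windows-only; the modification-time comparison described above works everywhere):

```python
import os
import stat

def needs_backup(path):
    """True if Windows has set the archive attribute since it was last cleared."""
    attrs = os.stat(path).st_file_attributes  # available on Windows only
    return bool(attrs & stat.FILE_ATTRIBUTE_ARCHIVE)
```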
VERSIONING FILE SYSTEM
A versioning filesystem keeps track of all changes to a file and makes those changes accessible to the user.
Generally this gives access to any previous version, all the way back to the file's creation time. An example of
this is the Wayback versioning filesystem for Linux.
LIVE DATA
If a computer system is in use while it is being backed up, the possibility of files being open for
reading or writing is real. If a file is open, the contents on disk may not correctly represent what the owner of
the file intends. This is especially true for database files of all kinds. The term fuzzy backup can be used to
describe a backup of live data that looks like it ran correctly, but does not represent the state of the data at
any single point in time. This is because the data being backed up changed in the period of time between when the
backup started and when it finished. For databases in particular, fuzzy backups are worthless.
SNAPSHOT BACKUP
A snapshot is an instantaneous function of some storage systems that presents a copy of the file system as if it
were frozen at a specific point in time, often by a copy-on-write mechanism. An effective way to back up live data
is to temporarily quiesce it (e.g. close all files), take a snapshot, and then resume live operations. At this
point the snapshot can be backed up through normal methods.[10] While a snapshot is very handy for viewing a
filesystem as it was at a different point in time, it is hardly an effective backup mechanism by itself.
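A toy analogue of copy-on-write snapshotting (illustrative only; real storage systems do this at the block layer): a snapshot freezes the layers written so far, and later writes land in a fresh layer that older snapshots never see.

```python
class CowStore:
    """Toy key-value store whose snapshots share data with the live copy."""

    def __init__(self):
        self._layers = [{}]       # newest writes live in the last layer

    def snapshot(self):
        """Freeze the current state; later writes leave the snapshot untouched."""
        frozen = self._layers[:]  # share the existing layers, copy nothing
        self._layers.append({})   # subsequent writes land in a fresh layer
        return frozen

    def put(self, key, value):
        self._layers[-1][key] = value

    @staticmethod
    def read(layers, key):
        """Look a key up in a snapshot (or in store._layers for the live view)."""
        for layer in reversed(layers):
            if key in layer:
                return layer[key]
        raise KeyError(key)
```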
OPEN FILE BACKUP
Many backup software packages feature the ability to handle open files in backup operations. Some simply check for
openness and try again later. File locking is useful for regulating access to open files.
When attempting to understand the logistics of backing up open files, one must consider that the backup process
could take several minutes to back up a large file such as a database. In order to back up a file that is in use,
it is vital that the entire backup represent a single-moment snapshot of the file, rather than a simple copy of a
read-through. This represents a challenge when backing up a file that is constantly changing. Either the database
file must be locked to prevent changes, or a method must be implemented to ensure that the original snapshot is
preserved long enough to be copied, all while changes are being preserved. If a file is backed up while it is
being changed, so that the first part of the backup reflects the data before a change and later parts reflect it
afterwards, the result is a corrupted and unusable file, as most large files contain internal references between
their various parts that must remain consistent throughout the file.
COLD DATABASE BACKUP
During a cold backup, the database is closed or locked and not available to users. The datafiles do not change
during the backup process so the database is in a consistent state when it is returned to normal operation.
HOT DATABASE BACKUP
Some database management systems offer a means to generate a backup image of the database while it is online and
usable "hot". This usually includes an inconsistent image of the data files plus a log of changes made while the
procedure is running. Upon restore, the changes in the log files are reapplied to bring the database back into a consistent state.
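Real DBMSs do this with write-ahead logs and log sequence numbers; the toy sketch below simply reapplies an ordered change log over the fuzzy image to reach a consistent end state:

```python
def recover(datafile_image, change_log):
    """Toy redo pass: reapply the ordered change log over a fuzzy image.

    `datafile_image` maps record ids to values as captured (possibly mid-update);
    `change_log` is the ordered list of (record_id, new_value) entries written
    while the backup ran.
    """
    db = dict(datafile_image)
    for record_id, new_value in change_log:  # the log covers the backup window
        db[record_id] = new_value
    return db
```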
METADATA
Not all information stored on the computer is stored in files. Accurately recovering a complete
system from scratch requires keeping track of this non-file data too.
SYSTEM DESCRIPTION
System specifications are
needed to procure an exact replacement after a disaster.
BOOT SECTOR
The boot sector can sometimes be recreated more easily than it can be saved. Still, it usually isn't a normal
file, and the system won't boot without it.
PARTITION LAYOUT
The layout of the original disk, as well as partition tables and filesystem settings, is needed to properly
recreate the original system.
FILE METADATA
Each file's permissions, owner, group, ACLs, and any other metadata need to be backed up for a restore to properly
recreate the original environment.
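A sketch of per-file metadata capture in Python (ACLs and extended attributes require platform-specific calls not shown here; the function name is illustrative):

```python
import os
import stat

def file_metadata(path):
    """Collect the metadata a restore needs to recreate the file faithfully."""
    st = os.lstat(path)
    return {
        "mode": stat.filemode(st.st_mode),  # permissions, e.g. '-rw-r--r--'
        "uid": st.st_uid,                   # owner (POSIX)
        "gid": st.st_gid,                   # group (POSIX)
        "mtime": st.st_mtime,               # last-modified time
        "size": st.st_size,
    }
```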
SYSTEM METADATA
Different operating systems have different ways of storing configuration information. Microsoft Windows keeps a
registry of system information that is more difficult to restore than a typical file.
MANIPULATION OF DATA AND DATASET
OPTIMIZATION
It is frequently useful or required to manipulate the data being backed up to optimize the backup
process. These manipulations can provide many benefits including improved backup speed, restore speed, data
security, media usage and/or reduced bandwidth requirements.
COMPRESSION
Various schemes can be employed to shrink the size of the source data to be stored so that it uses less storage
space. Compression is frequently a built-in feature of tape drive hardware.
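For software-side compression, a minimal example using Python's standard gzip module (the filenames are hypothetical; codecs such as bz2 and lzma trade better ratios for more CPU time):

```python
import gzip
import shutil

# Compress a file on its way to backup storage.
with open("payroll.db", "rb") as src, gzip.open("payroll.db.gz", "wb") as dst:
    shutil.copyfileobj(src, dst)
```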
DE-DUPLICATION
When multiple similar systems are backed up to the same destination storage device, there exists the potential for
much redundancy within the backed up data. For example, if 20 Windows workstations were backed up to the same data
repository, they might share a common set of system files. The data repository only needs to store one copy of
those files to be able to restore any one of those workstations. This technique can be applied at the file level or
even on raw blocks of data, potentially resulting in a massive reduction in required storage space. Deduplication
can occur on a server before any data moves to backup media, sometimes referred to as source/client side
deduplication. This approach also reduces bandwidth required to send backup data to its target media. The process
can also occur at the target storage device, sometimes referred to as inline or back-end deduplication.
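A minimal content-addressed chunk store illustrates block-level deduplication (the class and its methods are illustrative, not a real product's interface): every chunk is stored under its hash, so identical chunks across all backed-up systems are kept exactly once.

```python
import hashlib

class DedupStore:
    """Content-addressed chunk store: identical chunks are kept once."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # digest -> chunk bytes

    def put(self, data):
        """Store data, returning the list of chunk digests (the 'recipe')."""
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # duplicate chunks cost nothing
            recipe.append(digest)
        return recipe

    def get(self, recipe):
        """Reassemble a stored object from its recipe."""
        return b"".join(self.chunks[d] for d in recipe)
```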
DUPLICATION
Sometimes backup jobs are duplicated to a second set of storage media. This can be done to rearrange the backup
images to optimize restore speed or to have a second copy at a different location or on a different storage
medium.
ENCRYPTION
High capacity removable storage media such as backup tapes present a data security risk if they are lost or stolen.
Encrypting the data on these media can mitigate this problem, but presents new problems. Encryption is a CPU
intensive process that can slow down backup speeds, and the security of the encrypted backups is only as effective
as the security of the key management policy.
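A sketch of encrypting data before it reaches backup media, using the third-party Python 'cryptography' package (the file names are hypothetical; note that losing the key means losing the backup):

```python
# Requires the third-party 'cryptography' package (pip install cryptography).
from cryptography.fernet import Fernet

key = Fernet.generate_key()  # the key management policy must guard this
fernet = Fernet(key)

with open("backup.tar", "rb") as f:       # hypothetical backup artifact
    ciphertext = fernet.encrypt(f.read())
with open("backup.tar.enc", "wb") as f:
    f.write(ciphertext)

# Restore is only possible with the same key:
# plaintext = fernet.decrypt(ciphertext)
```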
MULTIPLEXING
When there are many more computers to be backed up than there are destination storage devices, the ability to use a
single storage device with several simultaneous backups can be useful.
REFACTORING
The process of rearranging the backup sets in a data repository is known as refactoring. For example, if a backup
system uses a single tape each day to store the incremental backups for all the protected computers, restoring one
of the computers could potentially require many tapes. Refactoring could be used to consolidate all the backups for
a single computer onto a single tape. This is especially useful for backup systems that do "incrementals forever"
style backups.
STAGING
Sometimes backup jobs are copied to a staging disk before being copied to tape. This process is sometimes referred
to as D2D2T, an acronym for Disk to Disk to Tape. This can be useful if there is a problem matching the speed of
the final destination device with the source device as is frequently faced in network-based backup systems. It can
also serve as a centralized location for applying other data manipulation techniques.
MANAGING THE BACKUP PROCESS
It is important to understand that backing up is a process. As long as new data is being created
and changes are being made, backups will need to be updated. Individuals and organizations with anything from one
computer to thousands (or even millions) of computer systems all have requirements for protecting data. While the
scale is different, the objectives and limitations are essentially the same. Likewise, those who perform backups
need to know to what extent they were successful, regardless of scale.
OBJECTIVES
RECOVERY POINT OBJECTIVE (RPO)
The point in time
that the restarted infrastructure will reflect. Essentially, this is the roll-back that will be experienced as a
result of the recovery. The most desirable RPO would be the point just prior to the data loss event. Making a more
recent recovery point achievable requires increasing the frequency of synchronization between the source data and
the backup repository.
RECOVERY TIME OBJECTIVE (RTO)
The amount of time
elapsed between disaster and restoration of business functions.
DATA SECURITY
In addition to preserving access to
data for its owners, data must be restricted from unauthorized access. Backups must be performed in a manner that
does not compromise the original owner's undertaking. This can be achieved with data encryption and proper media
handling policies.
LIMITATIONS
An effective backup scheme will take into consideration the limitations of the situation.
BACKUP WINDOW
The period of time when backups are permitted to run on a system is called the backup window. This is typically the
time when the system sees the least usage and the backup process will have the least amount of interference with
normal operations. The backup window is usually planned with users' convenience in mind. If a backup extends past
the defined backup window, a decision is made whether it is more beneficial to abort the backup or to lengthen the
backup window.
PERFORMANCE IMPACT
All backup schemes have some performance impact on the system being backed up. For example, for the period of time
that a computer system is being backed up, the hard drive is busy reading files for the purpose of backing up, and
its full bandwidth is no longer available for other tasks. Such impacts should be analyzed.
COSTS OF HARDWARE, SOFTWARE, LABOR
All types of storage media have a finite capacity with a real cost. Matching the correct amount of storage capacity
(over time) with the backup needs is an important part of the design of a backup scheme. Any backup scheme has some
labor requirement, but complicated schemes have considerably higher labor requirements. The cost of commercial
backup software can also be considerable.
NETWORK BANDWIDTH
Distributed backup systems can be affected by limited network bandwidth.
IMPLEMENTATION
Meeting the defined objectives in the face of the above limitations can be a difficult task. The
tools and concepts below can make that task more achievable.
SCHEDULING
Using a job scheduler can greatly
improve the reliability and consistency of backups by removing part of the human element. Many backup software
packages include this functionality.
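For example, a sketch using the third-party Python 'schedule' package (cron or a backup product's built-in scheduler would be more typical in practice; the job body is a stand-in):

```python
import time

import schedule  # third-party: pip install schedule

def run_backup():
    print("backup job started")  # stand-in for invoking the real backup command

# Run unattended every night during the backup window.
schedule.every().day.at("02:00").do(run_backup)

while True:
    schedule.run_pending()
    time.sleep(60)
```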
AUTHENTICATION
Over the course of regular operations, the user accounts and/or system agents that perform the backups need to be
authenticated at some level. The power to copy all data off of or onto a system requires unrestricted access. Using
an authentication mechanism is a good way to prevent the backup scheme from being used for unauthorized
activity.
CHAIN OF TRUST
Removable storage media are
physical items and must only be handled by trusted individuals. Establishing a chain of trusted individuals (and
vendors) is critical to defining the security of the data.
MEASURING THE PROCESS
To ensure that the backup scheme is working as expected, the process needs to include monitoring
key factors and maintaining historical data.
BACKUP VALIDATION
(also known as "backup success validation") The process by which owners of data can get information about how their
data was backed up. This same process is also used to prove compliance to regulatory bodies outside of the
organization, for example, an insurance company might be required under HIPAA to show "proof" that their patient
data are meeting records retention requirements.[16] Disaster, data complexity, data value and increasing
dependence upon ever-growing volumes of data all contribute to the anxiety around and dependence upon successful
backups to ensure business continuity. For that reason, many organizations rely on third-party or "independent"
solutions to test, validate, and optimize their backup operations (backup reporting).
REPORTING
In larger configurations, reports are
useful for monitoring media usage, device status, errors, vault coordination and other information about the backup
process.
LOGGING
In addition to the history of computer
generated reports, activity and change logs are useful for monitoring backup system events.
VALIDATION
Many backup programs make use of
checksums or hashes to validate that the data was accurately copied. These offer several advantages. First, they
allow data integrity to be verified without reference to the original file: if the file as stored on the backup
medium has the same checksum as the saved value, then it is very probably correct. Second, some backup programs can
use checksums to avoid making redundant copies of files, to improve backup speed. This is particularly useful for
the de-duplication process.
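A minimal sketch of checksum-based validation (SHA-256 via Python's standard hashlib; the function names are illustrative): the digest recorded at backup time lets the stored copy be verified later without consulting the original.

```python
import hashlib

def sha256_of(path, bufsize=1 << 20):
    """Stream a file through SHA-256 without loading it all into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(bufsize):
            h.update(chunk)
    return h.hexdigest()

def verify(path, saved_digest):
    """Compare the stored copy against the digest recorded at backup time."""
    return sha256_of(path) == saved_digest
```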
MONITORED BACKUP
Backup processes are monitored by a third party monitoring center. This center alerts users to any errors that
occur during automated backups. Monitored backup requires software capable of pinging the monitoring center's
servers in the case of errors. Some monitoring services also allow collection of historical metadata, which can be
used for Storage Resource Management purposes such as projecting data growth and locating redundant primary storage
capacity and reclaimable backup capacity. The Wizards Storage Portal is an example of a solution that monitors
IBM's well-known Tivoli Storage Manager (TSM) solution.
LORE
CONFUSION
Due to a considerable overlap in
technology, backups and backup systems are frequently confused with archives and fault-tolerant systems. Backups
differ from archives in the sense that archives are the primary copy of data, usually put away for future use,
while backups are a secondary copy of data, kept on hand to replace the original item. Backup systems differ from
fault-tolerant systems in the sense that backup systems assume that a fault will cause a data loss event, while
fault-tolerant systems assume that a fault will not.
ADVICE
* The more important the data that is stored on the computer, the greater the need for backing it up.
* A backup is only as useful as its associated restore strategy. For critical systems and data, the
restoration process must be tested.
* Storing the copy near the original is unwise, since many disasters such as fire, flood, theft, and
electrical surges are likely to cause damage to the backup at the same time. In these cases, both the original and
the backup medium are likely to be lost.
* Automated backup and scheduling should be considered, as manual backups can be affected by human error.
* Backups can fail for a wide variety of reasons. A verification or monitoring strategy is an important part
of a successful backup plan.
* Multiple backups on different media, stored in different locations, should be used for all critical
information.
* Backed up archives should be stored in open and standard formats, especially when the goal is long-term
archiving. Recovery software and processes may have changed, and software may not be available to restore data
saved in proprietary formats.
* System administrators and others working in the information technology field are routinely fired for not
devising and maintaining backup processes suitable to their organization.
* Even if you already have a tape backup system, an additional backup method may be worthwhile. Performing a
further backup to an external hard disk with an automatic backup program doubles the data security, and the
backed-up files on the external disk are easy to inspect.
Wikipedia