Home
This documentation describes the implementation details behind our backup and recovery tools, the reasoning behind them, and their usage.
Our backup and recovery mechanism consists of four tools: two backup tools and two recovery tools.
- Backup – Creates snapshots of HFiles
- LogCopier – Archives HLogs for future use
- Import – Restores a previous backup
- WALPlayer – Replays HLogs to achieve (mostly) consistent restores
The Backup tool is responsible for creating snapshots of HBase tables. Put simply, it copies the underlying HFiles that correspond to the tables being backed up and writes them to a location in HDFS where HBase cannot modify them. If you are familiar with HBase, you know that, in an environment as active as production, it frequently merges and deletes files due to events like flushes and region splits. It is virtually impossible to predict what the framework will do, which often results in copy errors. Our backup tool takes these failure scenarios into account and performs certain actions to create a successful backup.
HBase can be very active in the background, and this activity will inevitably cause failures. Backup was designed to embrace them. Since we can't control this activity, we work around it, and we find that this is easier to do at the region level. Files are copied one region at a time: if we can't successfully copy all of a region's files, the entire region is marked as failed and queued for the next attempt. In the meantime, we move on to the next region and repeat the same steps. When this pass completes, the next step is to assess why each failed region failed. Did the region split? Or were files merely added or removed? Knowing this helps us determine how to proceed. This tool is probably the most complex of the four, but the algorithm is fairly straightforward:
1. LR ← Get a list of all regions from .META.
2. For each region R in LR: if R is a daughter region of any other region in LR, remove R from LR. (According to the HBase book by Lars George, the parent regions contain the actual data. Daughter regions only contain references to it, which we do not want, at least until a compaction is performed on them and the parent is deleted.)
3. For each region R in LR (this step is done in parallel in a MapReduce job):
   - Flush the region.
   - LF ← List of files in this region.
   - For each file F in LF:
     - Copy F to the backup location.
     - If F fails to copy: remove R from the backup location, add R to the retry queue, and move on to the next region.
4. LFR ← List of failed regions from the retry queue.
5. LCR ← List of completed regions: LR - LFR.
6. If the size of LFR is greater than zero:
   - LR ← Get the most current regions from .META.
   - Remove from LR those regions which are already in LCR (we don't want to copy them twice).
   - Remove from LR those regions which are daughters of any region in LCR (we already have the parent data).
   - Go back to step 3.
7. Verify LCR by making sure that the start and end keys all line up.
The Backup tool works well on its own, but there can be a significant time window between the start of a backup and the time it finishes, and within that window a lot can change. By the time we have finished copying one region and moved on to the next, records could have been added or modified in the previous region, so some files may be more up to date than others. We need a way of achieving more consistent snapshots, and this is where LogCopier fits into the backup process.
HLog files contain a history of the changes to the database. In case of failure, HBase can replay them and guarantee that no data is lost, but once those changes are finally written into HFiles, HBase moves the log files into a directory where they are later deleted. The LogCopier process monitors the HLog directories, looks for new files that may have been added, and copies them into a different HDFS directory.
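As a rough illustration of the idea, the sketch below periodically scans the HBase log directory and copies anything it has not archived yet. The paths, scan interval, and flat archive layout are assumptions for illustration only; the real tool takes its directory and frequency from the -d and -m options shown later.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

/**
 * Minimal sketch of the LogCopier idea: periodically scan the HBase .logs
 * directory and copy any HLog file that has not been archived yet.
 * Paths and the interval are illustrative, not the tool's configuration.
 */
public class HLogArchiverSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path logDir = new Path("/hbase/.logs");                     // where HBase writes HLogs
        Path archiveDir = new Path("/user/espinozca/backup/logs");  // safe copy location

        while (true) {
            for (FileStatus server : fs.listStatus(logDir)) {        // one subdir per region server
                for (FileStatus hlog : fs.listStatus(server.getPath())) {
                    Path target = new Path(archiveDir, hlog.getPath().getName());
                    if (!fs.exists(target)) {
                        // copy (not move) so HBase can still manage its own log files
                        FileUtil.copy(fs, hlog.getPath(), fs, target, false, conf);
                    }
                }
            }
            Thread.sleep(10 * 60 * 1000L); // re-scan every 10 minutes
        }
    }
}
```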
Now that we have our backups, we need a way of importing them into HBase. Manually moving the files into the HBase directory does not simply work; every region needs to be registered with .META. The Import tool does some light checks to make sure that the files being imported are valid: it looks for .tableinfo and .regioninfo files and makes sure that they all look correct. It then starts importing tables one region at a time.
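The sketch below illustrates the kind of sanity check described above. The backup path and the assumed directory layout (one subdirectory per table, one per region) are illustrative assumptions, not the tool's actual code.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Sketch of a pre-import sanity check: every table directory in the backup
 * should carry a .tableinfo file, and every region directory a .regioninfo
 * file. Written for illustration; not the Import tool's code.
 */
public class BackupSanityCheckSketch {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path backupDir = new Path("/user/espinozca/backup/bak-example"); // illustrative path

        for (FileStatus table : fs.listStatus(backupDir)) {              // one subdir per table
            boolean hasTableInfo = false;
            for (FileStatus entry : fs.listStatus(table.getPath())) {
                String name = entry.getPath().getName();
                if (name.startsWith(".tableinfo")) {
                    hasTableInfo = true;
                } else if (!name.startsWith(".")) {                       // assumed: a region directory
                    Path regionInfo = new Path(entry.getPath(), ".regioninfo");
                    if (!fs.exists(regionInfo)) {
                        System.err.println("Missing .regioninfo in " + entry.getPath());
                    }
                }
            }
            if (!hasTableInfo) {
                System.err.println("Missing .tableinfo for table " + table.getPath().getName());
            }
        }
    }
}
```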
The WALPlayer tool is available in the HBase repository, but it is not set to be released until 0.94 [HBASE-5604]. We simply ported the code to work with the version we are currently using (0.92.1). It takes the directory containing the log files we archived with LogCopier. It can also take a time range to filter and apply only the updates that happened within that range.
This is an overview of how to create backups. The tools run independently of each other, so they can be run in any order. It is important to remember, though, that to accomplish a point-in-time backup you will need all the log files from the time the backup was created to the time you wish to restore to.
- Run LogCopier to copy HLog files to a safe location.
  - It should run as a monitored background process.
  - It should be configured to copy fairly frequently, say anywhere from every 10 minutes to every hour. A good rule of thumb is to copy files at the same rate that HBase rolls its log files.
  - If, for any reason, the process shuts down, it can be restarted and it will continue where it left off (as long as it is restarted before HBase gets around to deleting the logs).
- Run the Backup tool against the cluster.
  - Preferably at a time when there is less cluster activity and a lower chance of a major compaction. This helps reduce the copy window, which minimizes the odds of failure.
  - This tool should run less frequently than the LogCopier. Once or twice a week would work.
  - When complete, the program will output the backup directory, which can be passed to the Import tool as an argument.
Restoring a previous backup consists of the following steps, in order:
- Run the Import tool against the cluster.
  - It takes the directory generated by the Backup tool as input. The files need to reside on the same cluster.
  - Make sure HBase is running and that the tables you are restoring do not exist.
  - The tool works by moving files from the backup directory to HBase, so you lose your backup when you do this. You can tell it to copy the files instead of moving them; however, note that copying large files will significantly increase the time it takes to restore a backup.
- Run hbase hbck to verify that all regions look good to HBase.
- Run WALPlayer against the cluster.
  - As input, it takes the directory where the HLog files are archived (the same one passed to LogCopier) and a time range of log edits to replay.
  - The start time should be the start time of the backup, and the end time is the time to restore to (which should be equal to or later than the time the backup finished).
- Run hbase hbck one last time for assurance.
Make sure to check out the command-line arguments to see what options are available. The examples below cover the basics of how to perform a backup and restore.
When running LogCopier for the first time in a while, it is possible that it will fail to copy some logs because HBase may delete them before we get a chance to copy them. This should not happen on subsequent runs, but if it does, the copy interval may be set too high and should be lowered. For instance, if HBase rolls logs every 10 minutes and you only copy files every hour, you will probably miss some logs. Make sure the copy frequency is set to a sensible value. Finally, make sure the classpath points to all the necessary resources:
$ java org.oclc.firefly.hadoop.backup.LogCopier -d /user/espinozca/backup/logs -m 10
12/05/07 09:57:23 INFO backup.LogCopier: Copy frequency : 10 minutes
12/05/07 09:57:23 INFO backup.LogCopier: Archive directory : /user/espinozca/backup/logs
12/05/07 09:57:23 INFO backup.LogCopier: Starting run on Mon May 07 09:57:23 EDT 2012
12/05/07 09:57:24 INFO backup.LogCopier: HLog: hdfs://finddev07.dev.oclc.org:29318/hbase/.logs/finddev07.dev.oclc.org,29319,1335815149226/finddev07.dev.oclc.org%2C29319%2C1335815149226.1335815183030
12/05/07 09:57:25 INFO backup.LogCopier: HLog: hdfs://finddev07.dev.oclc.org:29318/hbase/.logs/finddev07.dev.oclc.org,29319,1335815149226/finddev07.dev.oclc.org%2C29319%2C1335815149226.1335815647578
12/05/07 09:57:26 INFO backup.LogCopier: HLog: hdfs://finddev07.dev.oclc.org:29318/hbase/.logs/finddev07.dev.oclc.org,29319,1335815149226/finddev07.dev.oclc.org%2C29319%2C1335815149226.1335815736477
12/05/07 09:57:27 INFO backup.LogCopier: HLog: hdfs://finddev07.dev.oclc.org:29318/hbase/.logs/finddev07.dev.oclc.org,29319,1335815149226/finddev07.dev.oclc.org%2C29319%2C1335815149226.1335815951712
12/05/07 09:57:28 INFO backup.LogCopier: HLog: hdfs://finddev07.dev.oclc.org:29318/hbase/.logs/finddev07.dev.oclc.org,29319,1335815149226/finddev07.dev.oclc.org%2C29319%2C1335815149226.1335816116619
12/05/07 09:57:28 INFO backup.LogCopier: HLog: hdfs://finddev07.dev.oclc.org:29318/hbase/.logs/finddev07.dev.oclc.org,29319,1335815149226/finddev07.dev.oclc.org%2C29319%2C1335815149226.1335816193148
12/05/07 09:57:30 INFO backup.LogCopier: Completed on Mon May 07 09:57:30 EDT 2012 (0m, 7s) Results: [ last: 6/0 total: 6/0 ]
This will continue running and will check for new logs every 10 minutes. Even though this job is not a MapReduce job, we can use the hadoop script for convenience and run it like this:
$ hadoop jar map-reduce-job.jar org.oclc.firefly.hadoop.backup.LogCopier -d /user/espinozca/backup/logs
Once we have some logs archived, we can create our backup. It is important to consider the number of mappers to run, since the regions to copy are divided evenly among them (in the example below, 37 regions are split between 2 map tasks).
It is also worth considering the initial and final replication values. The initial replication value (default 1) is the replication factor with which files are copied. The final replication value (default 3) is the replication factor that the copied files are set to once the backup completes. A low initial replication factor, like 1, reduces the copy window and minimizes the chances of failure.
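The sketch below shows the general idea using the Hadoop FileSystem API; the path, property, and values are illustrative assumptions, not the Backup tool's actual implementation.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Sketch of the initial/final replication idea: copy the backup files with a
 * low replication factor to shrink the copy window, then raise the replication
 * once the backup completes. The path and values below are illustrative only.
 */
public class ReplicationSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.setInt("dfs.replication", 1);   // initial replication: new files written with 1 replica
        FileSystem fs = FileSystem.get(conf);

        // illustrative path to one HFile copied by the backup job
        Path copiedHFile = new Path("/user/espinozca/backup/bak-example/SomeTable/region/cf/hfile");

        // ... region files are copied here with the initial (low) replication ...

        // final replication: once the backup is verified, bump each copied file to 3 replicas
        fs.setReplication(copiedHFile, (short) 3);
    }
}
```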
The following is an example of how to backup all tables in HBase:
$ hadoop jar firefly-map-reduce-job.jar org.oclc.firefly.hadoop.backup.Backup
12/05/07 10:11:00 INFO backup.Backup: HBase backup tool
12/05/07 10:11:00 INFO backup.Backup: --------------------------------------------------
12/05/07 10:11:00 INFO backup.Backup: Destination fs : hdfs://finddev07.dev.oclc.org:29318
12/05/07 10:11:00 INFO backup.Backup: Initial replication: 1
12/05/07 10:11:00 INFO backup.Backup: Final replication : 1
12/05/07 10:11:00 INFO backup.Backup: Number of attempts : Until nothing left to copy
12/05/07 10:11:00 INFO backup.Backup: Username : espinozca
12/05/07 10:11:00 INFO backup.Backup: Number map tasks : 2
12/05/07 10:11:00 INFO backup.Backup: Backup store path : hdfs://finddev07.dev.oclc.org:29318/user/espinozca/backup
12/05/07 10:11:00 INFO backup.Backup: --------------------------------------------------
... omitting output of ZK environment variables
12/05/07 10:11:01 INFO backup.Backup: Exporting the following tables:
12/05/07 10:11:01 INFO backup.Backup: . TransactionJournalProcessIdLocator
12/05/07 10:11:01 INFO backup.Backup: . PendingEvents
... omitting output of most tables being exported
12/05/07 10:11:01 INFO backup.Backup: Starting backup path: hdfs://finddev07.dev.oclc.org:29318/user/espinozca/backup/bak-20120507.101101.491
12/05/07 10:11:02 INFO backup.Backup: Backup bak-20120507.101101.491 (Attempt 1)
12/05/07 10:11:02 INFO backup.Backup: --------------------------------------------------
12/05/07 10:11:02 INFO backup.Backup: Number of regions : 37
12/05/07 10:11:02 INFO backup.Backup: Number of map tasks: 2
12/05/07 10:11:02 INFO backup.Backup: Mapper input path : /user/espinozca/tmp/backup/input/input-0
12/05/07 10:11:02 INFO backup.Backup: Mapper output path : /user/espinozca/tmp/backup/output/output-0
12/05/07 10:11:02 INFO backup.Backup: --------------------------------------------------
12/05/07 10:11:02 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/05/07 10:11:03 INFO input.FileInputFormat: Total input paths to process : 2
12/05/07 10:11:03 INFO mapred.JobClient: Running job: job_201204301545_0003
12/05/07 10:11:04 INFO mapred.JobClient: map 0% reduce 0%
12/05/07 10:11:17 INFO mapred.JobClient: map 50% reduce 0%
12/05/07 10:11:18 INFO mapred.JobClient: map 60% reduce 0%
12/05/07 10:11:21 INFO mapred.JobClient: map 100% reduce 0%
12/05/07 10:11:22 INFO mapred.JobClient: Job complete: job_201204301545_0003
12/05/07 10:11:22 INFO mapred.JobClient: Counters: 19
12/05/07 10:11:22 INFO mapred.JobClient: Backup
12/05/07 10:11:22 INFO mapred.JobClient: FilesCopied=66
12/05/07 10:11:22 INFO mapred.JobClient: RegionsCopied=37
12/05/07 10:11:22 INFO mapred.JobClient: Job Counters
... omitting output of other MR counters
12/05/07 10:11:22 INFO backup.Backup: MR job finished successfully
12/05/07 10:11:22 INFO backup.Backup: Checking table: Authority
... omitting output of verification
12/05/07 10:11:22 INFO backup.Backup: Verification passed succesfully
12/05/07 10:11:22 INFO backup.Backup: --------------------------------------------------
12/05/07 10:11:22 INFO backup.Backup: Backup located at: hdfs://finddev07.dev.oclc.org:29318/user/espinozca/backup/bak-20120507.101101.491-20120507.101122.698
12/05/07 10:11:22 INFO backup.Backup: Backup complete
Note the "Backup located at" line near the end of the output. This is the location of the backup. The name of the directory is bak-20120507.101101.491-20120507.101122.698, and the format of the name is
bak-<StartBackupTime>-<EndBackupTime>
The format of the start time and end time is
yyyyMMdd.kkmmss.SSS
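As an illustration, the snippet below parses those two timestamps out of a backup directory name and converts them to epoch milliseconds (which WALPlayer also accepts for its time range). This is a standalone helper written for illustration and is not part of the tools.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

/** Extracts the start/end timestamps embedded in a backup directory name. */
public class BackupTimeConverter {
    public static void main(String[] args) throws Exception {
        String dirName = "bak-20120507.101101.491-20120507.101122.698";
        String[] parts = dirName.split("-");           // ["bak", <StartBackupTime>, <EndBackupTime>]

        SimpleDateFormat dirFormat = new SimpleDateFormat("yyyyMMdd.kkmmss.SSS");
        Date start = dirFormat.parse(parts[1]);
        Date end = dirFormat.parse(parts[2]);

        System.out.println("Backup start (ms): " + start.getTime());
        System.out.println("Backup end   (ms): " + end.getTime());
    }
}
```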
These times can be useful when replaying logs. Finally, you can also back up individual tables instead of the entire database. This example creates copies of two tables, Country and State:
$ hadoop jar map-reduce-job.jar org.oclc.firefly.hadoop.backup.Backup -t Country,State
If we have data corruption or have accidentally lost some HBase files, the Import tool can help restore the database to a previous copy. Using the directory created in the previous example, we can import the entire database:
$ java org.oclc.firefly.hadoop.backup.Import -i /user/espinozca/backup/bak-20120507.101101.491-20120507.101122.698
... omitting output of ZK environment variables
12/05/07 10:32:57 INFO backup.Import: HBase import tool
12/05/07 10:32:57 INFO backup.Import: --------------------------------------------------
12/05/07 10:32:57 INFO backup.Import: Backup start time : Mon May 07 10:11:01 EDT 2012
12/05/07 10:32:57 INFO backup.Import: Backup end time : Mon May 07 10:11:22 EDT 2012
12/05/07 10:32:57 INFO backup.Import: Retain original copy: false
12/05/07 10:32:57 INFO backup.Import: HBase location : hdfs://finddev07.dev.oclc.org:29318/hbase
12/05/07 10:32:57 INFO backup.Import: Backup location : /user/espinozca/backup/bak-20120507.101101.491-20120507.101122.698
12/05/07 10:32:57 INFO backup.Import: --------------------------------------------------
12/05/07 10:32:57 INFO backup.Import: Importing tables
12/05/07 10:32:57 INFO backup.Import: . Authority
12/05/07 10:32:57 INFO util.FSTableDescriptors: Current tableInfoPath = hdfs://finddev07.dev.oclc.org:29318/hbase/Authority/.tableinfo.0000000001
12/05/07 10:32:57 INFO util.FSTableDescriptors: TableInfo already exists.. Skipping creation
... omitting output of most tables being imported
12/05/07 10:32:59 INFO backup.Import: Import results
12/05/07 10:32:59 INFO backup.Import: --------------------------------------------------
12/05/07 10:32:59 INFO backup.Import: Number of tables: 37
12/05/07 10:32:59 INFO backup.Import: Imported tables : 37
12/05/07 10:32:59 INFO backup.Import: Failed : 0
12/05/07 10:32:59 INFO backup.Import: --------------------------------------------------
12/05/07 10:32:59 INFO backup.Import: Import completed successfully.
We can also import only the tables we want. The following example imports the Country and State tables and tells the tool to copy the files rather than move them. Remember that files normally get moved into HBase, so this way we keep a copy of our backup.
$ java org.oclc.firefly.hadoop.backup.Import -i /user/espinozca/backup/bak-20120507.101101.491-20120507.101122.698 -t Country,State --copy
We can also use the hadoop script to run Import if we prefer:
$ hadoop jar map-reduce-job.jar org.oclc.firefly.hadoop.backup.Import -i /user/espinozca/backup/bak-20120507.101101.491-20120507.101122.698
The final step in the recovery process is to replay the HLogs so that we can get a point-in-time snapshot. For this tool, you have to specify the names of the tables whose logs you want to replay. It is also important to specify the start and end times for the logs to replay; otherwise, it will replay everything it finds. You can specify the time in milliseconds or in the following format:
yyyy-MM-dd'T'HH:mm:ss.SS (e.g. 2001-02-20T16:35:06.99)
Say we want to restore to the time 2012-05-07T11:11:22.69. Then we want to replay the logs from the start of the backup in the previous example (2012-05-07T10:11:01.49) to the time we want to restore. This is how to replay the logs for the Country and State tables given that time range:
$ hadoop jar firefly-map-reduce-job.jar org.apache.hadoop.hbase.mapreduce.WALPlayer -Dhlog.start.time=2012-05-07T10:11:01.49 -Dhlog.end.time=2012-05-07T11:11:22.69 /user/espinozca/testLogCopier Country,State
12/05/07 12:14:01 INFO mapreduce.HLogInputFormat: Found: hdfs://finddev07.dev.oclc.org:29318/user/espinozca/testLogCopier/2012-05-03/finddev07.dev.oclc.org%2C29319%2C1335815149226.1335815183030
12/05/07 12:14:01 INFO mapreduce.HLogInputFormat: Found: hdfs://finddev07.dev.oclc.org:29318/user/espinozca/testLogCopier/2012-05-03/finddev07.dev.oclc.org%2C29319%2C1335815149226.1335815647578
12/05/07 12:14:01 INFO mapreduce.HLogInputFormat: Found: hdfs://finddev07.dev.oclc.org:29318/user/espinozca/testLogCopier/2012-05-03/finddev07.dev.oclc.org%2C29319%2C1335815149226.1335815736477
12/05/07 12:14:01 INFO mapreduce.HLogInputFormat: Found: hdfs://finddev07.dev.oclc.org:29318/user/espinozca/testLogCopier/2012-05-03/finddev07.dev.oclc.org%2C29319%2C1335815149226.1335815951712
12/05/07 12:14:01 INFO mapreduce.HLogInputFormat: Found: hdfs://finddev07.dev.oclc.org:29318/user/espinozca/testLogCopier/2012-05-03/finddev07.dev.oclc.org%2C29319%2C1335815149226.1335816116619
12/05/07 12:14:01 INFO mapreduce.HLogInputFormat: Found: hdfs://finddev07.dev.oclc.org:29318/user/espinozca/testLogCopier/2012-05-03/finddev07.dev.oclc.org%2C29319%2C1335815149226.1335816193148
12/05/07 12:14:01 INFO mapred.JobClient: Running job: job_201204301545_0008
12/05/07 12:14:02 INFO mapred.JobClient: map 0% reduce 0%
12/05/07 12:14:12 INFO mapred.JobClient: map 33% reduce 0%
12/05/07 12:14:16 INFO mapred.JobClient: map 50% reduce 0%
12/05/07 12:14:17 INFO mapred.JobClient: map 66% reduce 0%
12/05/07 12:14:21 INFO mapred.JobClient: map 100% reduce 0%
12/05/07 12:14:23 INFO mapred.JobClient: Job complete: job_201204301545_0008
12/05/07 12:14:23 INFO mapred.JobClient: Counters: 15
12/05/07 12:14:23 INFO mapred.JobClient: Job Counters
12/05/07 12:14:23 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=28235
12/05/07 12:14:23 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
12/05/07 12:14:23 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
12/05/07 12:14:23 INFO mapred.JobClient: Launched map tasks=6
12/05/07 12:14:23 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
12/05/07 12:14:23 INFO mapred.JobClient: FileSystemCounters
12/05/07 12:14:23 INFO mapred.JobClient: HDFS_BYTES_READ=380813123
12/05/07 12:14:23 INFO mapred.JobClient: FILE_BYTES_WRITTEN=441084
12/05/07 12:14:23 INFO mapred.JobClient: Map-Reduce Framework
12/05/07 12:14:23 INFO mapred.JobClient: Map input records=0
12/05/07 12:14:23 INFO mapred.JobClient: Physical memory (bytes) snapshot=1115549696
12/05/07 12:14:23 INFO mapred.JobClient: Spilled Records=0
12/05/07 12:14:23 INFO mapred.JobClient: CPU time spent (ms)=24240
12/05/07 12:14:23 INFO mapred.JobClient: Total committed heap usage (bytes)=2411986944
12/05/07 12:14:23 INFO mapred.JobClient: Virtual memory (bytes) snapshot=5058162688
12/05/07 12:14:23 INFO mapred.JobClient: Map output records=0
12/05/07 12:14:23 INFO mapred.JobClient: SPLIT_RAW_BYTES=1332