- 
                Notifications
    
You must be signed in to change notification settings  - Fork 3.4k
 
HBASE-29644: Refresh_meta triggering compaction on user table #7385
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: HBASE-29081
Are you sure you want to change the base?
Conversation
10afd02    to
    3a989c8      
    Compare
  
    | 
           🎊 +1 overall 
 
 This message was automatically generated.  | 
    
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's nice to have such safeguard in case we face any unexpected attempt of performing a write operation in a read-only cluster, but it's not an acceptable solution for the use case here. We know something triggers compaction when the refresh_meta command is executed on a read replica cluster, so we should find out where that's been triggered and put a check there to avoid waste of resources, rather than relying on exception being thrown. That would cause log pollution and could create confusion for operators.
| 
           @wchevreuil Thanks for the suggestion, we will check what is the root cause of this.  | 
    
| 
           🎊 +1 overall 
 
 This message was automatically generated.  | 
    
| 
           @sharmaar12 Try the following: create a unit test which triggers the problem, attach debugger and set a breakpoint in your event handler   | 
    
e7cb788    to
    a279f76      
    Compare
  
    | 
           @wchevreuil @anmolnar  | 
    
| 
           🎊 +1 overall 
 
 This message was automatically generated.  | 
    
Link to JIRA: https://issues.apache.org/jira/browse/HBASE-29644 Description: Consider the two cluster setup with one being active and one read replica. If active cluster create a table with FILE based SFT. If you add few rows through active and do flushes to create few Hfiles and then do refresh_meta from read replica its triggering minor compaction. Which should not happen via read replica, it may create inconsitencies because active is not aware of that event. Cause: This is happening because we should block the compaction event in ReadOnlyController but we missed adding read only guard to preCompactSelection() function. Fix: Add internalReadOnlyGuard to preCompactSelection() in ReadOnlyController
a279f76    to
    465bb2c      
    Compare
  
    | 
           🎊 +1 overall 
 
 This message was automatically generated.  | 
    
| if (isReadOnlyEnabled()) { | ||
| LOG.info("Ignoring compaction request for " + region + ",because read-only mode is on."); | ||
| return; | ||
| } | ||
| 
               | 
          
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we don't simply disable compaction altogether in the read replica cluster? See line #343 in CompactionSplit, there's already a check for compaction enabled flag. I would rather refrain from polluting CompactiSplit code with logic for read replica.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use that approach but then one issue I can think of is that hbase.global.readonly.enabled property is dynamically configurable using update_all_config but is it true for hbase.hstore.compaction.enabled also?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like @wchevreuil 's idea.
How about adding the read-only check to the getter?
  public boolean isCompactionsEnabled() {
    return compactionsEnabled && !isReadOnlyEnabled();
  }You don't need to dynamically change the compaction flag.
wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then we may need to at least modify the log messages to mention that either compaction is disabled or readonly mode is on. Otherwise compaction may be enabled but we are logging it as disabled because of read-only mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LOG.info("Ignoring compaction request for " + region + 
  (!isReadOnlyEnabled ? ", because compaction is disabled." : " in read-only mode"));There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or just leave it as is, not a biggy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hbase.hstore.compaction.enabled
The actual property name is hbase.regionserver.compaction.enabled. Compaction is actual "switchable" via the Admin.compactionSwitch() method (we also expose an hbase shell command for that). The CompactSplit thread itself exposes a switchCompaction method which could be called on both RS startup and the dynamic config handler for the hbase.global.readonly.enabled property.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Be careful with switching the property directly. User might have intentionally disabled it and you should not enable it when go from R/O -> R/W mode. My approach seems safer to me.
Link to JIRA: https://issues.apache.org/jira/browse/HBASE-29644
Description:
Consider the two cluster setup with one being active and one read replica. If active cluster create a table with FILE based SFT. If you add few rows through active and do flushes to create few Hfiles and then do refresh_meta from read replica its triggering minor compaction. Which should not happen via read replica, it may create inconsitencies because active is not aware of that event.
Cause:
This is happening because we should block the compaction event in ReadOnlyController but we missed adding read only guard to preCompactSelection() function.
Fix:
Add internalReadOnlyGuard to preCompactSelection() in ReadOnlyController