Friday, April 22, 2011

Managing connections in HBase 0.90 and beyond

Users of HBase have complained about high number of connections to zookeeper after upgrading to 0.90.  Jean-Daniel, responding to user comments, did some initial work in HBASE-3734.

In the following discussion, the term connection refers to the connection between HBase client and HBase, managed by HConnectionManager.

In the early days of 0.90 release, some decisions were made in HBASE-2925 where a new connection would be established given a Configuration instance without looking at the connection-specific properties in it.

I made a comment in HBASE-3734 at 05/Apr/11 05:20 with the following two ideas:
  • We should reuse connection based on connection-specific properties, such as "hbase.zookeeper.quorum"
  • In order for HConnectionManager.deleteConnection() to work, reference counting has to be used.
I want to thank Karthick who is brave to bite the bullet and try to nail this issue through HBASE-3777.
 He and I worked together for over a week to come up with the solution - patch version 6.

We discovered a missing connection property, HConstants.ZOOKEEPER_ZNODE_PARENT, which caused TestHBaseTestingUtility to fail.

We refined the implementation several times based on the outcome of test results.

Here is summary of what we did:
  • Reference counting based connection sharing is implemented
  • There were 33 references to HConnectionManager.getConnection(), we make sure that all of those references are properly deleted (released)
  • Some modification in unit tests is made to illustrate the recommended approach - see TestTableMapReduce.java
For connection sharing, Karthick introduced the following:
  static class HConnectionKey {
    public static String[] CONNECTION_PROPERTIES = new String[] {
        HConstants.ZOOKEEPER_QUORUM,
        HConstants.ZOOKEEPER_ZNODE_PARENT,
        HConstants.ZOOKEEPER_CLIENT_PORT,
        HConstants.ZOOKEEPER_RECOVERABLE_WAITTIME,
        HConstants.HBASE_CLIENT_PAUSE,
        HConstants.HBASE_CLIENT_RETRIES_NUMBER,
        HConstants.HBASE_CLIENT_RPC_MAXATTEMPTS,
        HConstants.HBASE_RPC_TIMEOUT_KEY,
        HConstants.HBASE_CLIENT_PREFETCH_LIMIT,
        HConstants.HBASE_META_SCANNER_CACHING,
        HConstants.HBASE_CLIENT_INSTANCE_ID };

In a Configuration, if any of the above connection properties is unique, we would create a new connection. Otherwise an existing connection whose underlying connection properties carry the same values would be returned.

Initially attempt was made to use Java finalizer to clean up unused connections. It turned out object finalization is tricky - client may get a closed connection if multiple HTable instances are involved and some of them may go out of scope, leading to finalizer execution.

So for HTable, we expect user to explicitly call close() method once the HTable instance would no longer be used. Take a look at the modified src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java

Finally, we changed how TableOutputFormat closes connection. Previously HConnectionManager.deleteAllConnections(true) is called in TableRecordWriter() because it was an easy way to deal with connection leak. Now, calling table.close() is enough.

In order to make 0.90.3 release stable, HBASE-3777 wouldn't be integrated into 0.90.3

Epilog:
When Ramkrishna worked on HBASE-4052, he discovered a problem in TRUNK which was not in 0.90 branch. Namely, after Master failover, HBaseAdmin.getConnection() would still get the shared connection which points to the previous active master. Using this stale connection results in an IOException wrapped in UndeclaredThrowableException.

I provided fix for this problem through HBASE-4087 where HBaseAdmin constructor would detect such issue and ask HConnectionManager to remove the stale connection from its cache.

Here is the code snippet:
    for (; tries < numRetries; ++tries) {
      try {
        this.connection.getMaster();
        break;
      } catch (UndeclaredThrowableException ute) {
        HConnectionManager.deleteStaleConnection(this.connection);
        this.connection = HConnectionManager.getConnection(this.conf);       
      }

1 comment:

  1. Hbase is very huge file system used to store replica of MySQL data, used by Yahoo, Facebook.

    ReplyDelete