Thursday, March 31, 2011

Genericizing EndpointCoprocessor

Our new release will use the coprocessor framework of HBase.
HBASE-1512 provided a reference implementation for aggregation.

Originally, the value of every column was interpreted as a Long.
I made the implementation more generic by introducing ColumnInterpreter, which understands the schema of the underlying table (for example, by examining the column family and column qualifier).
Here are the 3 guidelines I followed during development:
  1. Users shouldn't have to modify HbaseObjectWritable directly for the interpreter class that is to be executed on the region server. This is achieved by making ColumnInterpreter extend Serializable.
  2. We (plan to) store objects of MeasureWritable, a relatively complex class, in HBase. Using an interpreter gives us flexibility in computing aggregates.
  3. We load AggregateProtocolImpl.class into CoprocessorHost. The interpreter feeds various values (such as Long.MIN_VALUE) of the concrete type (Long) into AggregateProtocolImpl. This simplifies class loading for CoprocessorHost.
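
To make the guidelines concrete, here is a minimal sketch of the idea, not the actual HBase code. The method names and signatures are assumptions for illustration, and a plain byte[] stands in for the HBase cell type. The point is that the aggregation code only talks to the interpreter, which supplies both the typed value and type-specific constants such as Long.MIN_VALUE.

```java
import java.io.Serializable;
import java.nio.ByteBuffer;
import java.util.List;

// Hypothetical, simplified interpreter (signatures are assumptions, not the HBase API).
// It turns a raw cell value into a typed value and supplies type-specific constants.
interface ColumnInterpreter<T> extends Serializable {
    T getValue(byte[] family, byte[] qualifier, byte[] rawValue);
    T getMinValue();                 // starting point for max(), e.g. Long.MIN_VALUE
    int compare(T left, T right);
}

class LongColumnInterpreter implements ColumnInterpreter<Long> {
    private static final long serialVersionUID = 1L;

    @Override
    public Long getValue(byte[] family, byte[] qualifier, byte[] rawValue) {
        // Treat anything that is not an 8-byte value as "no value".
        if (rawValue == null || rawValue.length != Long.BYTES) {
            return null;
        }
        return ByteBuffer.wrap(rawValue).getLong();
    }

    @Override
    public Long getMinValue() {
        return Long.MIN_VALUE;
    }

    @Override
    public int compare(Long left, Long right) {
        return Long.compare(left, right);
    }
}

// Stand-in for the region-side aggregation: it never names a concrete value type.
class MaxAggregator {
    static <T> T max(ColumnInterpreter<T> ci, byte[] family, byte[] qualifier,
                     List<byte[]> cells) {
        T max = ci.getMinValue();
        for (byte[] cell : cells) {
            T v = ci.getValue(family, qualifier, cell);
            if (v != null && ci.compare(v, max) > 0) {
                max = v;
            }
        }
        return max;
    }
}
```

Supporting a richer type such as MeasureWritable would then mean writing another interpreter, with MaxAggregator left unchanged.
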
During code review, we tried to distinguish the case where a particular region has no result by returning null.

However, we got the following due to type erasure:

2011-04-26 17:55:48,229 INFO  [IPC Server handler 3 on 64132] coprocessor.AggregateImplementation(66): Maximum from this region is TestTable,,1303840188042.18ec4a1af1b0931be64fc084d2eb9309.: null
2011-04-26 17:55:48,229 ERROR [IPC Server handler 3 on 64132] io.HbaseObjectWritable(336): Unsupported type class java.lang.Object
2011-04-26 17:55:48,229 ERROR [IPC Server handler 3 on 64132] io.HbaseObjectWritable(339): writeClassCode
2011-04-26 17:55:48,229 ERROR [IPC Server handler 3 on 64132] io.HbaseObjectWritable(339): write
2011-04-26 17:55:48,229 ERROR [IPC Server handler 3 on 64132] io.HbaseObjectWritable(339): writeObject
2011-04-26 17:55:48,229 ERROR [IPC Server handler 3 on 64132] io.HbaseObjectWritable(339): write
2011-04-26 17:55:48,229 ERROR [IPC Server handler 3 on 64132] io.HbaseObjectWritable(339): writeObject
2011-04-26 17:55:48,229 ERROR [IPC Server handler 3 on 64132] io.HbaseObjectWritable(339): write
2011-04-26 17:55:48,229 ERROR [IPC Server handler 3 on 64132] io.HbaseObjectWritable(339): run
2011-04-26 17:55:48,229 WARN  [IPC Server handler 3 on 64132] ipc.HBaseServer$Handler(1122): IPC Server handler 3 on 64132 caught: java.lang.UnsupportedOperationException: No code for unexpected class java.lang.Object
        at org.apache.hadoop.hbase.io.HbaseObjectWritable.writeClassCode(HbaseObjectWritable.java:343)
        at org.apache.hadoop.hbase.io.HbaseObjectWritable$NullInstance.write(HbaseObjectWritable.java:311)
        at org.apache.hadoop.hbase.io.HbaseObjectWritable.writeObject(HbaseObjectWritable.java:449)
        at org.apache.hadoop.hbase.client.coprocessor.ExecResult.write(ExecResult.java:74)
        at org.apache.hadoop.hbase.io.HbaseObjectWritable.writeObject(HbaseObjectWritable.java:449)
        at org.apache.hadoop.hbase.io.HbaseObjectWritable.write(HbaseObjectWritable.java:284)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1092)

The solution is to use Writable.class as the declared class when the value is null, since HbaseObjectWritable has no class code for java.lang.Object.
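
The failure mode can be reproduced without HBase. The sketch below uses hypothetical stand-in classes (ToyObjectWriter, RegionResult, and a local Writable marker interface), not the real HbaseObjectWritable. It only illustrates why a null result of an erased generic type ends up tagged as java.lang.Object, and how naming a supported class for the null case avoids that.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical marker interface standing in for Hadoop's Writable.
interface Writable {}

// Toy serializer: it can only write a type tag for classes it has a code for.
class ToyObjectWriter {
    private static final Set<Class<?>> KNOWN =
        new HashSet<>(Arrays.asList(Long.class, Writable.class));

    // Returns the class name that would be written as the type tag.
    static String writeObject(Object value, Class<?> declaredClass) {
        // A null carries no runtime class, so the declared class is all we have.
        Class<?> tag = (value != null) ? value.getClass() : declaredClass;
        if (!KNOWN.contains(tag)) {
            throw new UnsupportedOperationException(
                "No code for unexpected class " + tag.getName());
        }
        return tag.getName();
    }
}

class RegionResult<T> {
    private final T value;

    RegionResult(T value) {
        this.value = value;
    }

    // After erasure T is just Object, so generic code cannot name anything better for null.
    String writeNaive() {
        return ToyObjectWriter.writeObject(value, Object.class);
    }

    // Workaround: name a supported class explicitly when there is no value.
    String writeWithFallback() {
        Class<?> declared = (value != null) ? value.getClass() : Writable.class;
        return ToyObjectWriter.writeObject(value, declared);
    }
}
```

With a null value, writeNaive() throws UnsupportedOperationException while writeWithFallback() succeeds. With a non-null Long, both succeed because the runtime class is available.
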

We didn't consider the race condition in callbacks on the client side. See HBASE-3862.
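
As an illustration of the kind of race involved (this is a sketch under my own assumptions, not the HBASE-3862 patch), suppose per-region results arrive on several client threads and each one updates a shared running maximum. The check-then-set in the callback has to be atomic, and a null result from an empty region has to be skipped. MaxCallback and CallbackDemo are hypothetical names.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical callback, loosely modeled on a per-region result callback.
class MaxCallback {
    private Long max = null;

    // synchronized: the compare and the assignment must not interleave across threads.
    synchronized void update(Long regionMax) {
        if (regionMax == null) {
            return; // this region had no result
        }
        if (max == null || regionMax > max) {
            max = regionMax;
        }
    }

    synchronized Long getMax() {
        return max;
    }
}

class CallbackDemo {
    // Simulates `regions` region results being delivered from a thread pool.
    static Long run(int regions) throws InterruptedException {
        MaxCallback cb = new MaxCallback();
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int i = 1; i <= regions; i++) {
            final long v = i;
            // Every third region reports "no result" as null.
            pool.execute(() -> cb.update(v % 3 == 0 ? null : Long.valueOf(v)));
        }
        pool.shutdown();
        if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
            throw new IllegalStateException("callbacks did not finish in time");
        }
        return cb.getMax();
    }
}
```

Without synchronized on update, two threads could both pass the comparison and the smaller value could be written last.
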