This exposes SPL operators in the com.ibm.streamsx.hbase toolkit as Python methods.
The package follows the standard Python packaging layout so that it can be uploaded to PyPI.
The package is uploaded to PyPI in the standard way:
cd package
python setup.py sdist bdist_wheel upload -r pypi
Note: This is done using the ibmstreams account at pypi.org and requires a .pypirc file containing the credentials in your home directory.
Package details: https://pypi.python.org/pypi/streamsx.hbase
The documentation uses Sphinx and can be built locally using:
cd package/docs
make html
and viewed using
firefox package/docs/build/html/index.html
The documentation is also set up at readthedocs.io.
Documentation links:
The package can be tested with TopologyTester using IBM Streams.
The host name and the port of the Hadoop server have to be specified for testing with the environment variable HADOOP_HOST_PORT.
For example:
export HADOOP_HOST_PORT=hdp264.fyre.ibm.com:8020
The package creates an HBase configuration file (hbase-site.xml) from a template and replaces the Hadoop server name and the port with the values from the environment variable HADOOP_HOST_PORT.
Alternatively, the "hbase-site.xml" file can be specified for testing with the environment variable HBASE_SITE_XML.
For example:
export HBASE_SITE_XML=/usr/hdp/current/hbase-client/conf/hbase-site.xml
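The two ways of supplying the configuration can be sketched as follows. This is a minimal illustration, not the package's actual code: the template content, the function names `parse_host_port` and `resolve_hbase_site_xml`, and the rule that HBASE_SITE_XML takes precedence over HADOOP_HOST_PORT are all assumptions made for the example.

```python
import os
from string import Template

# Hypothetical template; the one shipped with the package may look different.
HBASE_SITE_TEMPLATE = Template("""<?xml version="1.0"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://${host}:${port}/apps/hbase/data</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>${host}</value>
  </property>
</configuration>
""")


def parse_host_port(value):
    """Split 'hostname:port' into (hostname, port) and validate the port."""
    host, sep, port = value.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("expected hostname:port, got %r" % value)
    return host, int(port)


def resolve_hbase_site_xml(env=None, out_path="hbase-site.xml"):
    """Return the path of the hbase-site.xml file to use for the tests."""
    env = os.environ if env is None else env
    # Assumption: an explicitly given file wins over the generated one.
    explicit = env.get("HBASE_SITE_XML")
    if explicit:
        return explicit
    # Otherwise fill the template with the values from HADOOP_HOST_PORT.
    host, port = parse_host_port(env["HADOOP_HOST_PORT"])
    with open(out_path, "w") as f:
        f.write(HBASE_SITE_TEMPLATE.substitute(host=host, port=port))
    return out_path
```

Calling `resolve_hbase_site_xml()` without arguments reads the process environment and writes the generated file to the current directory.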
The location of the HBase toolkit has to be specified for testing with the environment variable STREAMS_HBASE_TOOLKIT.
For example:
export STREAMS_HBASE_TOOLKIT=/opt/ibm/InfoSphere_Streams/4.3.0.0/toolkits/com.ibm.streamsx.hbase
pip install streamsx
pip install urllib3
pip install --upgrade pyOpenSSL
Before you begin testing, you have to create a test table in your HBase database.
Log in as the hbase user on your HBase server:
hbase shell
....
create 'streamsSample_lotr','appearance','location'
The first test puts some rows into the table via the HBASEPut operator from the com.ibm.streamsx.hbase toolkit.
In the next test, the HBASEGet operator delivers the selected rows from the table.
Finally, the HBASEScan operator returns all rows of the table.
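The put, get, and scan sequence exercised by these tests can be pictured with a small in-memory model. The `FakeHBaseTable` class below is a hypothetical stand-in written only for this illustration. It is neither the toolkit's Python API nor the HBase client API, and the sample rows are made up.

```python
from collections import defaultdict


class FakeHBaseTable:
    """In-memory stand-in for an HBase table (illustration only)."""

    def __init__(self, name, families):
        self.name = name
        self.families = set(families)
        # row key -> {(column family, qualifier): value}
        self._rows = defaultdict(dict)

    def put(self, row, family, qualifier, value):
        # A put is only valid for a column family defined at table creation.
        if family not in self.families:
            raise KeyError("unknown column family: %s" % family)
        self._rows[row][(family, qualifier)] = value

    def get(self, row, family, qualifier):
        # Returns None when the row or the column does not exist.
        return self._rows.get(row, {}).get((family, qualifier))

    def scan(self):
        # HBase delivers scan results ordered by row key.
        for row in sorted(self._rows):
            yield row, dict(self._rows[row])


# Same table name and column families as in the 'create' command above.
table = FakeHBaseTable("streamsSample_lotr", ["appearance", "location"])

# Step 1: put some rows (hypothetical sample data).
table.put("Gandalf", "appearance", "beard", "grey")
table.put("Frodo", "location", "home", "Shire")

# Step 2: get a selected value.
value = table.get("Frodo", "location", "home")

# Step 3: scan the whole table.
all_rows = [row for row, _ in table.scan()]
```

In the real tests the three steps run as Streams operators against the HBase server; the model only shows what each step is expected to deliver.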
This test does not require any Streams instance.
cd package
python3 -u -m unittest streamsx.hbase.tests.test_hbase.TestParams
This test requires STREAMS_INSTALL to be set and a running Streams instance.
The environment variable STREAMS_HBASE_TOOLKIT is required for the location of the com.ibm.streamsx.hbase toolkit.
cd package
python3 -u -m unittest streamsx.hbase.tests.test_hbase.TestDistributed
or
ant test
For a quick test:
cd package
python3 -u -m unittest streamsx.hbase.tests.test_hbase.TestDistributedPut
This test requires STREAMS_INSTALL to be set and a running Streams instance.
The environment variable STREAMS_HBASE_TOOLKIT is required for the location of the com.ibm.streamsx.hbase toolkit.
The HADOOP_HOST_PORT environment variable (hostname:port) is also required to create the hbase-site.xml file.
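A test class can check these prerequisites up front and skip itself with a clear message instead of failing later. The following is a minimal sketch using the standard `unittest` module; the helper `missing_vars` and the class `ExampleDistributedTest` are hypothetical names and are not part of the package's test suite.

```python
import os
import unittest

# Variables named above as required for this test.
REQUIRED_VARS = ("STREAMS_INSTALL", "STREAMS_HBASE_TOOLKIT", "HADOOP_HOST_PORT")


def missing_vars(env=None):
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Skip the whole class when the environment is incomplete.
@unittest.skipIf(missing_vars(), "missing: " + ", ".join(missing_vars()))
class ExampleDistributedTest(unittest.TestCase):
    def test_environment(self):
        self.assertEqual(missing_vars(), [])
```

With this guard, the test run reports the class as skipped and lists the missing variables.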
- The standard Beacon operator creates 10 rows.
- The HBasePut puts the rows into the test table 'streamsSample_lotr'.
- The standard Beacon operator creates 10 rows as queries for HBaseGet.
- The HBaseGet gets the rows from the test table 'streamsSample_lotr'.
- The HBaseScan scans the test table 'streamsSample_lotr' and delivers the rows in the output stream.
cd package
python3 -u -m unittest streamsx.hbase.tests.test_hbase.TestCompositeClass