Goal


This article provides instructions for customers who want to use H2O-3 but report vulnerabilities flagged by their security scans.


Before sharing these instructions with a customer, confirm with the H2O-3 engineering team that they are still relevant.

The steps must also be reproduced by the Customer Success team before sending.


Instructions


Background


H2O can provide an alternative jar file with modified dependencies to address vulnerabilities caused by third-party libraries bundled in the standard h2o.jar.


The alternative jar provides the same modeling capabilities and, for an end user, is indistinguishable from the standard jar.


Because some third-party libraries have been removed or upgraded to newer versions, the "CVE-free" jar file has some limitations:


  1. It cannot access HDFS running on Hadoop 2.x
  2. It supports only hash-file-based authentication (for example, LDAP is not available)


Steps


1. Uninstall the current version of H2O-3 (for a pip installation of the Python package: pip uninstall h2o)

2. Install a client-only version of the Python package

The client-only version does not bundle the jar, so the jar must be provided externally. This makes it possible to use the "CVE-free" jar in place of the standard one.

Example of installing the Python client:

pip install https://h2o-release.s3.amazonaws.com/h2o/rel-zumbo/2/Python/h2o_client-3.36.1.2-py2.py3-none-any.whl


3. Validate the installation by attempting to run h2o.init()


This step is expected to fail because the h2o.jar has not been provided yet.


To confirm, execute the following steps:


Start Python: python

In the Python console, run:
>>> import h2o
>>> h2o.init()



The expected output is: 

Checking whether there is an H2O instance running at http://localhost:54321 ..... not found.
...
...
h2o.exceptions.H2OServerError: Server process terminated with error code 1: no main manifest attribute, in /home/kurkami/.local/lib/python3.10/site-packages/h2o/backend/bin/h2o.jar

If you see output similar to this, the client-only version of the package has been installed successfully.
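The same check can be scripted. The following is a minimal sketch, not part of the H2O package: the function name client_only_install_confirmed is hypothetical, and it assumes the text of the raised exception contains the "no main manifest attribute" message shown in the output above.

```python
def client_only_install_confirmed(start_server):
    """Return True if start_server() fails with the missing-jar error.

    start_server is any zero-argument callable; in practice it would be
    h2o.init (an assumption of this sketch, passed in by the caller).
    """
    try:
        start_server()
    except Exception as exc:
        # The transcript above shows this text inside the H2OServerError message.
        return "no main manifest attribute" in str(exc)
    # The call succeeded, so a usable jar or an already running server was found.
    return False
```

With the client package installed, the call would be client_only_install_confirmed(h2o.init). A result of False can also mean that an H2O instance was already running at http://localhost:54321, so stop any running instance before relying on it.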



4. Download the "CVE-free" version of the h2o.jar and install it as /opt/h2o/h2o-steam-3.36.1.2.jar

The commands are shown below:

wget https://h2o-release.s3.amazonaws.com/h2o/rel-zumbo/2/h2o-steam-3.36.1.2.jar

sudo mkdir -p /opt/h2o
sudo cp h2o-steam-3.36.1.2.jar /opt/h2o/
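The failure in step 3 ("no main manifest attribute") is what Java reports when a jar's manifest has no Main-Class entry. As an optional sanity check, the copied jar can be inspected for that entry before continuing. This is a sketch using only the Python standard library; the function name jar_main_class is hypothetical, and the sketch assumes the Main-Class value fits on a single manifest line.

```python
import zipfile


def jar_main_class(jar_path):
    """Return the Main-Class value from a jar's manifest, or None if absent."""
    try:
        with zipfile.ZipFile(jar_path) as jar:
            manifest = jar.read("META-INF/MANIFEST.MF").decode("utf-8")
    except (KeyError, zipfile.BadZipFile):
        # KeyError: the archive has no manifest; BadZipFile: not a zip/jar at all.
        return None
    for line in manifest.splitlines():
        # Manifest attribute names are case-insensitive.
        if line.lower().startswith("main-class:"):
            return line.split(":", 1)[1].strip()
    return None
```

For example, jar_main_class("/opt/h2o/h2o-steam-3.36.1.2.jar") returning None would suggest that the file is not a runnable jar and that the download should be repeated.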


5. Define the environment variable H2O_JAR_PATH


The H2O Python package uses this variable to look up the jar file. Define the variable at the system level so that users do not need to set it themselves before running H2O.


Example:

export H2O_JAR_PATH=/opt/h2o/h2o-steam-3.36.1.2.jar
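Before calling h2o.init(), it can help to confirm that the variable is visible to the Python process and points to an existing file. The helper below is a minimal sketch; the name resolve_h2o_jar and its error messages are hypothetical and not part of the H2O package.

```python
import os


def resolve_h2o_jar(environ=None):
    """Return the path in H2O_JAR_PATH, raising RuntimeError if it is unusable."""
    env = os.environ if environ is None else environ
    path = env.get("H2O_JAR_PATH")
    if not path:
        raise RuntimeError("H2O_JAR_PATH is not set")
    if not os.path.isfile(path):
        raise RuntimeError("H2O_JAR_PATH points to a missing file: " + path)
    return path
```

Calling resolve_h2o_jar() with no arguments reads the environment of the current process. If it raises in a new shell session, the variable was probably exported only in the session where the export command was run.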


6. Test the installation


Start Python: python

In the Python console, run:
>>> import h2o
>>> h2o.init()


H2O should start up correctly and produce output similar to this:


Checking whether there is an H2O instance running at http://localhost:54321 ..... not found.
Attempting to start a local H2O server...
  Java Version: openjdk version "1.8.0_312"; OpenJDK Runtime Environment (build 1.8.0_312-8u312-b07-0ubuntu1-b07); OpenJDK 64-Bit Server VM (build 25.312-b07, mixed mode)
  Starting server from /opt/h2o/h2o-steam-3.36.1.2.jar
  Ice root: /tmp/tmpul4op4u8
  JVM stdout: /tmp/tmpul4op4u8/h2o_kurkami_started_from_python.out
  JVM stderr: /tmp/tmpul4op4u8/h2o_kurkami_started_from_python.err
  Server is running at http://127.0.0.1:54321
Connecting to H2O server at http://127.0.0.1:54321 ... successful.
--------------------------  ------------------------------
H2O_cluster_uptime:         01 secs
H2O_cluster_timezone:       America/New_York
H2O_data_parsing_timezone:  UTC
H2O_cluster_version:        3.36.1.2
H2O_cluster_version_age:    1 month and 4 days
H2O_cluster_name:           H2O_from_python_kurkami_j669xv
H2O_cluster_total_nodes:    1
H2O_cluster_free_memory:    6.934 Gb
H2O_cluster_total_cores:    12
H2O_cluster_allowed_cores:  12
H2O_cluster_status:         locked, healthy
H2O_connection_url:         http://127.0.0.1:54321
H2O_connection_proxy:       {"http": null, "https": null}
H2O_internal_security:      False
Python_version:             3.10.4 final
--------------------------  ------------------------------

To verify that H2O is using the right jar, look for this line in the output:

     Starting server from /opt/h2o/h2o-steam-3.36.1.2.jar
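If the startup output has been saved as text (for example, copied from the console), the jar path can be extracted and compared automatically. This is a sketch under that assumption; the function name started_from_jar is hypothetical, and it relies only on the "Starting server from" line format shown above.

```python
def started_from_jar(init_output):
    """Return the jar path from the 'Starting server from' line, or None."""
    marker = "Starting server from "
    for line in init_output.splitlines():
        stripped = line.strip()
        if stripped.startswith(marker):
            return stripped[len(marker):]
    return None
```

The installation uses the intended jar when started_from_jar(saved_output) equals the value of H2O_JAR_PATH, where saved_output is the captured text.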