Tensorflow JNI development

From ElphelWiki
Latest revision as of 09:48, 1 April 2020

Why

Why modify TF JNI?

  • Add TF features that are still missing in TF for Java, such as feeding tensors directly from GPU memory. This saves the time spent on back-and-forth CPU-GPU transfers if you, say, run the data through a custom CUDA kernel first.

About

Notes on how to build TF JNI, where to modify JNI if needed, how to install it to the local Maven repository, and how to set up a project that uses the modified native functions.

Based on the article "Build TensorFlow 2.0 for Java on Windows" and on tensorflow/java/README.md.

These instructions are for Linux and old TensorFlow 1.15.0.

How to:

  • Build TF JNI - libtensorflow.jar, libtensorflow_jni.so and pom.xml
  • Add TF JAR to local Maven which will override the Central Maven Repository
  • Modify TF JNI functions
  • Create Eclipse project

There is also the JavaCPP Presets project, which seems useful.

Install

In Kubuntu:

Based on Feeding Tensorflow from GPU.

Build

Note

  • While running, bazel ate all the RAM (16 GB) a few times and the PC hung. To limit bazel's appetite, try:
~$ bazel build --jobs 4 --local_ram_resources=4096 ...
~$ bazel test --jobs 4 --local_ram_resources=4096 ...
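# Note: local_ram_resources is the total RAM in MB that bazel assumes is available for local actions, not MBs per thread.
# A tighter total limit without capping jobs (this PC has 8 threads):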
~$ bazel build --local_ram_resources=2048 ...
~$ bazel test --local_ram_resources=2048 ...

Quick

~/git/tensorflow-1.15.0/mvn_build.sh:

# Step 1 (Java):

bazel build -c opt //tensorflow/java:tensorflow //tensorflow/java:libtensorflow_jni //tensorflow/java:pom
mvn install:install-file -Dfile=bazel-bin/tensorflow/java/libtensorflow.jar -DpomFile=bazel-bin/tensorflow/java/pom.xml

# Step 2 (JNI):

bazel build -c opt //tensorflow/tools/lib_package:libtensorflow_jni.tar.gz

rm -rf bazel-bin/tensorflow/tools/lib_package/maven
mkdir -p bazel-bin/tensorflow/tools/lib_package/maven/org/tensorflow/native/linux-x86_64

POM="
<project>
  <modelVersion>4.0.0</modelVersion>
  <description>Platform-dependent native code for the TensorFlow Java library. CUDA support depends on the local build.</description>
  <groupId>org.tensorflow</groupId>
  <artifactId>libtensorflow_jni_gpu</artifactId>
  <version>1.15.0</version>
  <packaging>jar</packaging>
  <build>
    <resources>
      <resource>
        <directory>.</directory>
          <excludes>
            <exclude>target/**</exclude>
          </excludes>
      </resource>
    </resources>
  </build>
</project>
"

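# Quote the variable so the XML keeps its line breaks and is not subject to word splitting or globbing
echo "$POM" > bazel-bin/tensorflow/tools/lib_package/maven/pom.xml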

tar -zxvf bazel-bin/tensorflow/tools/lib_package/libtensorflow_jni.tar.gz -C bazel-bin/tensorflow/tools/lib_package/maven/org/tensorflow/native/linux-x86_64
cd bazel-bin/tensorflow/tools/lib_package/maven
mvn package
mvn install
cd ../../../../..

Detailed

cd ~/git/tensorflow-1.15.0
./configure # do not forget CUDA
bazel build -c opt //tensorflow/java:tensorflow //tensorflow/java:libtensorflow_jni //tensorflow/java:pom

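With TF, bazel tends to rebuild everything from scratch, which takes a ton of time. It is unclear whether this happens because the bazel server gets restarted after its idle timeout or for some other reason. A partial remedy might be to keep the server running.

At launch bazel starts its server, which shuts down after an idle timeout. To prevent the shutdown, add to ~/.bazelrc: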
startup --max_idle_secs=0

Artifacts of interest are in bazel-bin/tensorflow/java/:

libtensorflow_jni.so
libtensorflow.jar
pom.xml
  • pom.xml and the jar will be taken care of by the mvn command.
  • The .so has to be in the library path (alternatively, see "Build so package" a little below and skip this linking). Link or copy it to /usr/lib/, or run with "java -Djava.library.path=...".

[option 1] Link so library

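# /usr/lib is in the default java.library.path
# (adjust the source path to your checkout; ~/git/tensorflow-1.15.0 is used elsewhere on this page)
sudo ln -sf ~/git/tensorflow-1.15.0/bazel-bin/tensorflow/java/libtensorflow_jni.so /usr/lib/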
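
To check that the link (or a custom -Djava.library.path) is actually picked up, a small stand-alone Java check can help. This is only a sketch: the class name LibPathCheck is made up for illustration, and it assumes the library is looked up under the name tensorflow_jni (file libtensorflow_jni.so on Linux). It only tests that the file exists in one of the java.library.path directories; it does not load it.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper, not part of TensorFlow.
final class LibPathCheck {
    /** Returns the first directory entry of searchPath that contains the platform file name for libName. */
    static Optional<Path> find(String libName, String searchPath) {
        // On Linux, "tensorflow_jni" maps to "libtensorflow_jni.so"
        String fileName = System.mapLibraryName(libName);
        for (String dir : searchPath.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            Path candidate = Paths.get(dir, fileName);
            // Files.exists follows symlinks, so a dangling link counts as missing
            if (Files.exists(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String searchPath = System.getProperty("java.library.path", "");
        System.out.println("java.library.path = " + searchPath);
        Optional<Path> hit = find("tensorflow_jni", searchPath);
        System.out.println(hit.map(p -> "found: " + p)
                .orElse("tensorflow_jni not found on java.library.path"));
    }
}
```

Running it with the same -Djava.library.path=... value as the real application shows whether that setting would find the library.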

preferred [option 2] Build so package

bazel build -c opt //tensorflow/tools/lib_package:libtensorflow_jni.tar.gz

It puts all libs into a single archive. Now to create a JAR to replace libtensorflow_jni_gpu, do this:

mkdir -p bazel-bin/tensorflow/tools/lib_package/maven/org/tensorflow/native/linux-x86_64
tar -zxvf bazel-bin/tensorflow/tools/lib_package/libtensorflow_jni.tar.gz -C bazel-bin/tensorflow/tools/lib_package/maven/org/tensorflow/native/linux-x86_64

Next create a pom.xml in bazel-bin/tensorflow/tools/lib_package/maven:

<project>
  <modelVersion>4.0.0</modelVersion>
  <description>Platform-dependent native code for the TensorFlow Java library. CUDA support depends on the local build.</description>
  <groupId>org.tensorflow</groupId>
  <artifactId>libtensorflow_jni_gpu</artifactId>
  <version>1.15.0</version>
  <packaging>jar</packaging>
  <build>
    <resources>
      <resource>
        <directory>.</directory>
          <excludes>
            <exclude>target/**</exclude>
          </excludes>
      </resource>
    </resources>
  </build>
</project>

Note: the artifactId libtensorflow_jni_gpu can be any name; just make sure to use the same name in your project's pom.xml. Next:

cd bazel-bin/tensorflow/tools/lib_package/maven
mvn package
mvn install
cd ../../../../..

Install JAR to local Maven Repository

~/git/tensorflow-1.15.0$ mvn install:install-file -Dfile=bazel-bin/tensorflow/java/libtensorflow.jar -DpomFile=bazel-bin/tensorflow/java/pom.xml

To uninstall the artifacts from the local Maven repository and switch back to the official versions from Maven Central, see this link, or remove the unneeded directories from ~/.m2/repository/org/tensorflow

After *_jni.so is linked (or packaged into a JAR) and the JAR is installed, one can resume normal development. See below for what to add to your project's pom.xml

Modify TF JNI functions

For example, suppose one wants to add a new function to the org.tensorflow.TensorFlow class. Then look inside:

tensorflow/java/src/main/java/org/tensorflow/
tensorflow/java/src/main/native/

Three places:

  • add the native method to tensorflow/java/src/main/java/org/tensorflow/TensorFlow.java
  • add the declaration to the header file tensorflow/java/src/main/native/tensorflow_jni.h
  • add the implementation to the C++ source file tensorflow/java/src/main/native/tensorflow_jni.cc

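Rebuild and reinstall.

The native header files seem to be regenerated during the build, but whether the regenerated ones are actually used is untested. In function naming, avoid underscores in the Java method name, because JNI escapes each underscore as _1 in the native symbol. The native function name follows the pattern: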

Java_org_tensorflow_TensorFlow_<Name>
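
To see what the native symbol looks like for a given method name, the naming rule can be sketched in a few lines of Java. This is an illustration only: JniNames is a made-up helper, feed_gpu is a hypothetical method name, and the sketch covers only the short (non-overloaded) form of the symbol as described in the JNI specification.

```java
// Hypothetical helper, not part of TensorFlow or the JDK.
final class JniNames {
    /** Escapes one name following the JNI symbol mangling rules. */
    static String mangle(String name) {
        StringBuilder out = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (c == '.' || c == '/') {
                out.append('_');            // package separator
            } else if (c == '_') {
                out.append("_1");           // literal underscore
            } else if (c == ';') {
                out.append("_2");
            } else if (c == '[') {
                out.append("_3");
            } else if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')) {
                out.append(c);
            } else {
                out.append(String.format("_0%04x", (int) c)); // any other character
            }
        }
        return out.toString();
    }

    /** Short native symbol for a non-overloaded native method. */
    static String symbol(String className, String methodName) {
        return "Java_" + mangle(className) + "_" + mangle(methodName);
    }

    public static void main(String[] args) {
        // prints Java_org_tensorflow_TensorFlow_version
        System.out.println(symbol("org.tensorflow.TensorFlow", "version"));
        // prints Java_org_tensorflow_TensorFlow_feed_1gpu (hypothetical method name)
        System.out.println(symbol("org.tensorflow.TensorFlow", "feed_gpu"));
    }
}
```

The second line shows why underscores are inconvenient: the C++ function would have to be named with feed_1gpu, which is easy to get wrong by hand.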

Java Maven project in Eclipse

Nothing special.

  • Create a new maven project
  • Edit pom.xml:
<project>
  ...
  <dependencies>
    ...
    <dependency>
      <groupId>org.tensorflow</groupId>
      <artifactId>libtensorflow</artifactId>
      <version>1.15.0</version>
    </dependency>
    ...
  </dependencies>
  ...
</project>
  • Write code as usual

Basic example code

tfhello.java:

import org.tensorflow.TensorFlow;

public class tfhello{
	public static void main(String[] args){
		System.out.println(TensorFlow.version());
	}
}

A few words on TF in Maven Central repository

libtensorflow

Record in pom.xml:

<dependency>
  <groupId>org.tensorflow</groupId>
  <artifactId>libtensorflow</artifactId>
  <version>1.15.0</version>
</dependency>

Archive contains Java classes.

libtensorflow_jni_gpu

Record in pom.xml:

<dependency>
  <groupId>org.tensorflow</groupId>
  <artifactId>libtensorflow_jni_gpu</artifactId>
  <version>1.15.0</version>
</dependency>

Archive contains native library:

├── META-INF
│   ├── MANIFEST.MF
│   └── maven
│       └── org.tensorflow
│           └── libtensorflow_jni_gpu
│               ├── pom.properties
│               └── pom.xml
└── org
    └── tensorflow
        └── native
            ├── linux-x86_64
            │   ├── libtensorflow_framework.so.1
            │   ├── libtensorflow_jni.so
            │   ├── LICENSE
            │   └── THIRD_PARTY_TF_JNI_LICENSES
            └── windows-x86_64
                ├── LICENSE
                └── tensorflow_jni.dll
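
A locally built JAR can be compared against this layout by listing its entries under org/tensorflow/native/. The following is a minimal sketch using only java.util.zip; the class name NativeJarListing is made up for illustration, and the JAR path is whatever your build produced (with the pom.xml above, Maven's default would be target/libtensorflow_jni_gpu-1.15.0.jar, which is an assumption about an unmodified Maven setup).

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of TensorFlow.
final class NativeJarListing {
    static final String PREFIX = "org/tensorflow/native/";

    /** Returns the sorted names of all file entries below org/tensorflow/native/ in the given JAR. */
    static List<String> nativeEntries(Path jar) throws IOException {
        try (ZipFile zip = new ZipFile(jar.toFile())) {
            // The stream is consumed before the ZipFile is closed
            return zip.stream()
                    .filter(e -> !e.isDirectory())
                    .map(ZipEntry::getName)
                    .filter(n -> n.startsWith(PREFIX))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java NativeJarListing <path-to-jar>");
            return;
        }
        nativeEntries(Paths.get(args[0])).forEach(System.out::println);
    }
}
```

For a Linux-only local build one would expect only linux-x86_64 entries, without the windows-x86_64 part that the Maven Central archive carries.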