Twister2Flow
Twister2Flow is a dataflow pipeline designed to support high-performance deep learning in Twister2. This is an experimental version of the initial High Performance Deep Learning Connect of Twister2.
Supported Dataflow Operations
- File-system-oriented dataflow operations
- Remote Memory Access (Work In Progress)
Supported Frameworks
- Twister2 (JVM-oriented big data toolkit)
- PyTorch (deep learning library)
- Python3
Prerequisites
- Twister2 Python Installation. See more.
- PyTorch installation from source. See more.
- Install PyArrow
pip3 install pyarrow
- Install requests
pip3 install requests
- Install mlxtend
pip3 install mlxtend
- Add Twister2 to Path
export TWISTER2_HOME=<path-to-twister2-binaries>
export PATH=$TWISTER2_HOME/bin:$PATH
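The prerequisites above can be checked before installing with a short script. This is a minimal sketch, not part of Twister2Flow: the helper names `missing_modules` and `twister2_home_problems` are hypothetical, and `torch` is assumed to be the import name of the PyTorch installation.

```python
import importlib.util
import os

# Import names for the Python prerequisites listed above.
# "torch" is the import name of PyTorch.
REQUIRED_MODULES = ["pyarrow", "requests", "mlxtend", "torch"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def twister2_home_problems(env=None):
    """Return a list of problems with TWISTER2_HOME and PATH (empty if none)."""
    env = os.environ if env is None else env
    home = env.get("TWISTER2_HOME")
    if not home:
        return ["TWISTER2_HOME is not set"]
    bin_dir = os.path.join(home, "bin")
    path_entries = env.get("PATH", "").split(os.pathsep)
    if bin_dir not in path_entries:
        return [bin_dir + " is not on PATH"]
    return []


if __name__ == "__main__":
    for name in missing_modules(REQUIRED_MODULES):
        print("missing Python module:", name)
    for problem in twister2_home_problems():
        print(problem)
```

The script prints nothing when every module is importable and `$TWISTER2_HOME/bin` is on `PATH`.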
Install
python3 -m pip install --index-url https://test.pypi.org/simple/ --no-deps twister2flow-test
Bootstrap Task Executor
python3 bootstrap/PytorchJobSubmitter.py \
  --script /home/vibhatha/github/forks/twister2/deeplearning/pytorch/src/main/python/PytorchMnistDist.py \
  --executor /home/vibhatha/venv/ENV37/bin/python3 \
  --parallelism 4 --hostfile hostfile
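The paths in the command above are specific to one machine. As a sketch, the same command can be assembled from Python so that those paths live in one place. The helper `submitter_argv` is hypothetical and not part of Twister2Flow; it uses only the flags shown above.

```python
import subprocess
import sys


def submitter_argv(script, executor, parallelism, hostfile,
                   submitter="bootstrap/PytorchJobSubmitter.py"):
    """Build the argument list for the bootstrap command shown above."""
    return [
        "python3", submitter,
        "--script", script,
        "--executor", executor,
        "--parallelism", str(parallelism),
        "--hostfile", hostfile,
    ]


if __name__ == "__main__":
    # Placeholder values: replace with your own script path and hostfile.
    argv = submitter_argv(
        script="PytorchMnistDist.py",
        executor=sys.executable,  # the Python interpreter running this file
        parallelism=4,
        hostfile="hostfile",
    )
    print(" ".join(argv))
    # To actually launch the job, uncomment the next line:
    # subprocess.run(argv, check=True)
```

Passing the arguments as a list avoids shell quoting issues when a path contains spaces.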
Example Dataflow
from twister2flow.twister2.pipeline import PipelineGraph
from twister2flow.twister2.task.Twister2Task import Twister2Task
from twister2flow.twister2.task.PytorchTask import PytorchTask
from twister2flow.twister2.task.PythonTask import PythonTask
plg = PipelineGraph.PipelineGraph(name="UNNAMED_TASK")
download_task = PythonTask(name="download_task")
download_task.set_command("python3")
download_task.set_script_path(script_path="MnistDownload.py")
download_task.set_exec_path(exec_path=None)
twister2_task = Twister2Task(name="t2_task")
twister2_task.set_command("twister2 submit standalone python")
twister2_task.set_script_path(script_path="Twister2PytorchMnist.py")
twister2_task.set_exec_path(exec_path=None)
pytorch_task = PytorchTask(name="pytorch_task")
pytorch_task.set_command()
pytorch_task.set_script_path(script_path="PytorchMnistDist.py")
pytorch_task.set_exec_path(exec_path=None)
plg.add_task(download_task)
plg.add_task(twister2_task)
plg.add_task(pytorch_task)
print(str(plg))
plg.execute()
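Each task in the example pairs a command with a script path. The following sketch illustrates one way such a pipeline could run its tasks, assuming they execute one after another in the order they were added and that a failing task stops the rest. Both assumptions are unconfirmed by this README, and `SketchTask` and `run_in_order` are hypothetical names, not the Twister2Flow implementation.

```python
import shlex
import subprocess
from dataclasses import dataclass


@dataclass
class SketchTask:
    """Hypothetical stand-in for a task: a command plus a script to run."""
    name: str
    command: str
    script_path: str

    def argv(self):
        # Split the command into words and append the script path.
        return shlex.split(self.command) + [self.script_path]


def run_in_order(tasks, runner=subprocess.run):
    """Run tasks sequentially; return the names of tasks that succeeded.

    Stops at the first task whose process exits with a non-zero code.
    """
    completed = []
    for task in tasks:
        result = runner(task.argv())
        if result.returncode != 0:
            break  # later tasks are not started
        completed.append(task.name)
    return completed


if __name__ == "__main__":
    t2 = SketchTask("t2_task", "twister2 submit standalone python",
                    "Twister2PytorchMnist.py")
    print(t2.argv())
```

The `runner` parameter defaults to `subprocess.run` and can be swapped for a stub to inspect the commands without launching anything.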
Running the Example
python3 examples/Twister2Flow.py