Name | Default | Description |
twister2.checkpointing.enable | false | Enable or disable checkpointing |
twister2.checkpointing.store | edu.iu.dsc.tws.checkpointing.stores.LocalFileStateStore | The implementation of the store to be used |
twister2.checkpointing.store.fs.dir | ${TWISTER2_HOME}/persistent/ | Root directory of local file system based store |
twister2.checkpointing.store.hdfs.dir | /twister2/persistent/ | Root directory of hdfs based store |
twister2.checkpointing.source.frequency | 1000 | Source triggering frequency |
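Taken together, a minimal sketch of these keys in a YAML config file; only the enable flag is changed from the defaults above, for illustration:

```yaml
# checkpointing: enabled here for illustration (the shipped default is false)
twister2.checkpointing.enable: true
twister2.checkpointing.store: "edu.iu.dsc.tws.checkpointing.stores.LocalFileStateStore"
twister2.checkpointing.store.fs.dir: "${TWISTER2_HOME}/persistent/"
twister2.checkpointing.store.hdfs.dir: "/twister2/persistent/"
twister2.checkpointing.source.frequency: 1000
```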
Name | Default | Description |
twister2.data.hadoop.home | ${HADOOP_HOME} | Hadoop home directory
twister2.data.hdfs.config.directory | ${HADOOP_HOME}/etc/hadoop/core-site.xml | path to the Hadoop core-site.xml configuration file
twister2.data.hdfs.data.directory | ${HADOOP_HOME}/etc/hadoop/hdfs-site.xml | path to the Hadoop hdfs-site.xml configuration file
twister2.data.hdfs.namenode | namenode.domain.name | HDFS NameNode host name
twister2.data.hdfs.namenode.port | 9000 | HDFS NameNode port
twister2.data.fs.root | ${TWISTER2_HOME}/persistent/data | root directory for local file system based data
twister2.data.hdfs.root | /twister2/persistent/data | root directory for HDFS based data
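A sketch of the same keys as YAML, using the defaults above (namenode.domain.name is a placeholder to be replaced with your NameNode host):

```yaml
twister2.data.hadoop.home: "${HADOOP_HOME}"
twister2.data.hdfs.config.directory: "${HADOOP_HOME}/etc/hadoop/core-site.xml"
twister2.data.hdfs.data.directory: "${HADOOP_HOME}/etc/hadoop/hdfs-site.xml"
twister2.data.hdfs.namenode: "namenode.domain.name"
twister2.data.hdfs.namenode.port: 9000
twister2.data.fs.root: "${TWISTER2_HOME}/persistent/data"
twister2.data.hdfs.root: "/twister2/persistent/data"
```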
Name | Default | Description |
twister2.network.buffer.size | 1024000 | the buffer size to be used |
twister2.network.sendBuffer.count | 4 | number of send buffers to be used |
twister2.network.receiveBuffer.count | 4 | number of receive buffers to be used |
twister2.network.channel.pending.size | 2048 | channel pending messages |
twister2.network.send.pending.max | 4 | the send pending messages |
twister2.network.message.group.low_water_mark | 8 | low water mark for message grouping; messages are grouped in batches of roughly 8 to 16
twister2.network.message.group.high_water_mark | 16 | this is the max number of messages to group |
twister2.network.message.grouping.size | 10 | in batch partition operations, this value will be used to create mini batches within partial receivers |
twister2.network.ops.persistent.dirs | ["${TWISTER2_HOME}/persistent/"] | For disk based operations, this directory list will be used to persist incoming messages. This can be used to balance the load between multiple devices, by specifying directory locations from different devices. |
twister2.network.shuffle.memory.bytes.max | 102400000 | the maximum amount of bytes kept in memory for operations that go to disk
twister2.network.shuffle.memory.records.max | 102400000 | the maximum number of records kept in memory for operations that go to disk
twister2.network.shuffle.file.bytes.max | 10000000 | size of the shuffle file (10MB default) |
twister2.network.shuffle.parallel.io | 2 | no of parallel IO operations permitted |
twister2.network.alltoall.algorithm.batch | ring | the partitioning algorithm |
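A sketch of the common network and shuffle keys as YAML; the second spill directory is illustrative, the rest are the defaults above:

```yaml
# common network buffers and disk-based shuffle settings
twister2.network.buffer.size: 1024000
twister2.network.sendBuffer.count: 4
twister2.network.receiveBuffer.count: 4
twister2.network.shuffle.memory.bytes.max: 102400000
twister2.network.shuffle.memory.records.max: 102400000
twister2.network.shuffle.file.bytes.max: 10000000
twister2.network.shuffle.parallel.io: 2
# spreading spill directories across devices balances disk load; the second path is illustrative
twister2.network.ops.persistent.dirs: ["${TWISTER2_HOME}/persistent/", "/scratch/twister2/"]
twister2.network.alltoall.algorithm.batch: "ring"
```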
Name | Default | Description |
twister2.network.buffer.size.stream.reduce | 1024000 | the buffer size to be used |
twister2.network.sendBuffer.count.stream.reduce | 2 | number of send buffers to be used |
twister2.network.receiveBuffer.count.strea.reduce | 2 | number of receive buffers to be used |
twister2.network.buffer.size.stream.gather | 1024000 | the buffer size to be used |
twister2.network.sendBuffer.count.stream.gather | 2 | number of send buffers to be used |
twister2.network.receiveBuffer.count.stream.gather | 2 | number of receive buffers to be used |
twister2.network.buffer.size.stream.bcast | 1024000 | the buffer size to be used |
twister2.network.sendBuffer.count.stream.bcast | 2 | number of send buffers to be used |
twister2.network.receiveBuffer.count.stream.bcast | 2 | number of receive buffers to be used |
twister2.network.alltoall.algorithm.stream.partition | simple | the partitioning algorithm |
twister2.network.buffer.size.stream.partition | 1024000 | the buffer size to be used |
twister2.network.sendBuffer.count.stream.partition | 4 | number of send buffers to be used |
twister2.network.receiveBuffer.count.stream.partition | 4 | number of receive buffers to be used |
Name | Default | Description |
twister2.network.buffer.size.batch.reduce | 1024000 | the buffer size to be used |
twister2.network.sendBuffer.count.batch.reduce | 2 | number of send buffers to be used |
twister2.network.receiveBuffer.count.batch.reduce | 2 | number of receive buffers to be used |
twister2.network.buffer.size.batch.gather | 1024000 | the buffer size to be used |
twister2.network.sendBuffer.count.batch.gather | 2 | number of send buffers to be used |
twister2.network.receiveBuffer.count.batch.gather | 2 | number of receive buffers to be used |
twister2.network.buffer.size.batch.bcast | 1024000 | the buffer size to be used |
twister2.network.sendBuffer.count.batch.bcast | 2 | number of send buffers to be used |
twister2.network.receiveBuffer.count.batch.bcast | 2 | number of receive buffers to be used |
twister2.network.alltoall.algorithm.batch.partition | simple | the partitioning algorithm |
twister2.network.buffer.size.batch.partition | 1024000 | the buffer size to be used |
twister2.network.sendBuffer.count.batch.partition | 4 | number of send buffers to be used |
twister2.network.receiveBuffer.count.batch.partition | 4 | number of receive buffers to be used |
twister2.network.alltoall.algorithm.batch.keyed_gather | simple | the partitioning algorithm |
ttwister2.network.partition.ring.group.workers.batch.keyed_gather | 2 | ring group worker |
twister2.network.buffer.size.batch.keyed_gather | 1024000 | the buffer size to be used |
twister2.network.sendBuffer.count.batch.keyed_gather | 4 | number of send buffers to be used |
twister2.network.receiveBuffer.count.batch.keyed_gather | 4 | number of receive buffers to be used |
twister2.network.message.group.low_water_mark.batch.keyed_gather | 8000 | low water mark for message grouping in batch keyed gather
twister2.network.message.group.high_water_mark.batch.keyed_gather | 16000 | this is the max number of messages to group |
twister2.network.message.grouping.size.batch.keyed_gather | 10000 | in batch partition operations, this value will be used to create mini batches within partial receivers |
twister2.network.alltoall.algorithm.batch.keyed_reduce | simple | the partitioning algorithm |
ttwister2.network.partition.ring.group.workers.batch.keyed_reduce | 2 | ring group worker |
twister2.network.buffer.size.batch.keyed_reduce | 1024000 | the buffer size to be used |
twister2.network.sendBuffer.count.batch.keyed_reduce | 4 | number of send buffers to be used |
twister2.network.receiveBuffer.count.batch.keyed_reduce | 4 | number of receive buffers to be used |
twister2.python.port | 5400 | port offset for the Python-Java connection. port + workerId will be used by each worker to communicate with the Python process; port - 1 will be used by the client process for the initial communication
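The per-operation keys above override the common network settings for a single operation. A sketch for the batch partition operation, assuming the same YAML config file; the doubled buffer size is illustrative:

```yaml
# per-operation overrides for the batch partition operation
twister2.network.alltoall.algorithm.batch.partition: "simple"
twister2.network.buffer.size.batch.partition: 2048000   # illustrative, doubled from the 1024000 default
twister2.network.sendBuffer.count.batch.partition: 4
twister2.network.receiveBuffer.count.batch.partition: 4
```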
Name | Default | Description |
twister2.client.debug | '-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5006' | use this property to debug the client submitting the job |
twister2.resource.systempackage.copy | false | Whether we have a requirement to copy the system package
Name | Default | Description |
twister2.zookeeper.based.group.management | true | ZooKeeper can be used to exchange job status data and for worker discovery. Workers can discover one another through ZooKeeper and update their status on ZooKeeper, and Dashboard can get job events through ZooKeeper. If fault tolerance is enabled, ZooKeeper is used irrespective of this parameter.
#twister2.resource.zookeeper.server.addresses | ip:port | ZooKeeper server address. When conf/kubernetes/deployment/zookeeper-wo-persistence.yaml is used, the service name it creates can be used as the zk address.
twister2.zookeeper.root.node.path | /twister2 | the root node path of this job on ZooKeeper; the default is "/twister2"
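A sketch of the ZooKeeper keys as YAML; the commented-out server address is a placeholder:

```yaml
twister2.zookeeper.based.group.management: true
# uncomment and point at your ZooKeeper ensemble; "zk.host:2181" is a placeholder
# twister2.resource.zookeeper.server.addresses: "zk.host:2181"
twister2.zookeeper.root.node.path: "/twister2"
```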
Name | Default | Description |
twister2.job.master.used | true | |
twister2.job.master.runs.in.client | true | if true, the job master runs in the submitting client; if false, the job master runs as a separate process in the cluster. By default, it is true. When the job master runs in the submitting client, the job has to be submitted from a machine in the cluster.
twister2.worker.to.job.master.response.wait.duration | 100000 | worker to job master response wait time in milliseconds
Name | Default | Description |
twister2.worker.controller.max.wait.time.for.all.workers.to.join | 100000 | maximum time to wait for all workers to join the job, in milliseconds
twister2.worker.controller.max.wait.time.on.barrier | 100000 | maximum time to wait on barriers for all workers to arrive, in milliseconds
Name | Default | Description |
twister2.common.thread.pool.threads | 2 | Maximum number of threads to spawn on demand |
twister2.common.thread.pool.keepalive | 10 | maximum time that excess idle threads will wait for new tasks before terminating |
twister.python.bin | python3 | path to python binary |
Name | Default | Description |
twister2.dashboard.host | http://localhost:8080 | Dashboard server host address and port. If this parameter is not specified, the job master will not try to connect to Dashboard
Name | Default | Description |
twister2.taskscheduler.streaming | roundrobin | Task scheduling mode for streaming jobs: "roundrobin", "firstfit", "datalocalityaware", or "userdefined"
twister2.taskscheduler.streaming.class | edu.iu.dsc.tws.tsched.streaming.roundrobin.RoundRobinTaskScheduler | Task Scheduler class for the round robin streaming task scheduler |
#twister2.taskscheduler.streaming.class | edu.iu.dsc.tws.tsched.streaming.datalocalityaware.DataLocalityStreamingTaskScheduler | Task Scheduler for the Data Locality Aware Streaming Task Scheduler |
#twister2.taskscheduler.streaming.class | edu.iu.dsc.tws.tsched.streaming.firstfit.FirstFitStreamingTaskScheduler | Task Scheduler for the FirstFit Streaming Task Scheduler |
#twister2.taskscheduler.streaming.class | edu.iu.dsc.tws.tsched.userdefined.UserDefinedTaskScheduler | Task Scheduler for the userDefined Streaming Task Scheduler |
twister2.taskscheduler.batch | batchscheduler | Task scheduling mode for batch jobs: "roundrobin", "datalocalityaware", or "userdefined"
#twister2.taskscheduler.batch.class | edu.iu.dsc.tws.tsched.batch.roundrobin.RoundRobinBatchTaskScheduler | Task Scheduler class for the round robin batch task scheduler |
twister2.taskscheduler.batch.class | edu.iu.dsc.tws.tsched.batch.batchscheduler.BatchTaskScheduler | Task Scheduler class for the batch task scheduler |
#twister2.taskscheduler.batch.class | edu.iu.dsc.tws.tsched.batch.datalocalityaware.DataLocalityBatchTaskScheduler | Task Scheduler for the Data Locality Aware Batch Task Scheduler |
#twister2.taskscheduler.batch.class | edu.iu.dsc.tws.tsched.userdefined.UserDefinedTaskScheduler | Task Scheduler for the userDefined Batch Task Scheduler |
twister2.taskscheduler.task.instances | 2 | Number of task instances to be allocated to each worker/container |
twister2.taskscheduler.task.instance.ram | 512.0 | Ram value to be allocated to each task instance |
twister2.taskscheduler.task.instance.disk | 500.0 | Disk value to be allocated to each task instance |
twister2.taskscheduler.task.instance.cpu | 2.0 | CPU value to be allocated to each task instance |
twister2.taskscheduler.container.instance.ram | 4096.0 | Default container instance values: RAM to be allocated to each container
twister2.taskscheduler.container.instance.disk | 8000.0 | Disk value to be allocated to each container |
twister2.taskscheduler.container.instance.cpu | 16.0 | CPU value to be allocated to each container |
twister2.taskscheduler.ram.padding.container | 2.0 | Default container padding values: padding value of the RAM to be allocated to each container
twister2.taskscheduler.disk.padding.container | 12.0 | Default padding value of the disk to be allocated to each container |
twister2.taskscheduler.cpu.padding.container | 1.0 | CPU padding value to be allocated to each container |
twister2.taskscheduler.container.padding.percentage | 2 | Padding percentage value to be applied to each container
twister2.taskscheduler.container.instance.bandwidth | 100 (Mbps) | Static default network parameters: bandwidth to be allocated to each container instance for data locality scheduling
twister2.taskscheduler.container.instance.latency | 0.002 (ms) | Latency value to be allocated to each container instance for data locality scheduling
twister2.taskscheduler.datanode.instance.bandwidth | 200 (Mbps) | Bandwidth to be allocated to each datanode instance for data locality scheduling
twister2.taskscheduler.datanode.instance.latency | 0.01 (ms) | Latency value to be allocated to each datanode instance for data locality scheduling
twister2.taskscheduler.task.parallelism | 2 | Parallelism value for each task instance
twister2.taskscheduler.task.type | streaming | Task type of each submitted job; by default it is a "streaming" job.
twister2.exector.worker.threads | 1 | number of threads per worker |
twister2.executor.batch.name | edu.iu.dsc.tws.executor.threading.BatchSharingExecutor2 | name of the batch executor |
twister2.exector.instance.queue.low.watermark | 10000 | number of tuples executed in a single pass
twister2.executor.stream.name | edu.iu.dsc.tws.executor.threading.StreamingSharingExecutor | name of the streaming executor |
twister2.executor.stream.name | edu.iu.dsc.tws.executor.threading.StreamingAllSharingExecutor | alternative streaming executor (all sharing)
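A sketch that combines the scheduler selection and resource keys above in one YAML fragment, using the defaults from this table:

```yaml
# scheduler selection
twister2.taskscheduler.streaming: "roundrobin"
twister2.taskscheduler.streaming.class: "edu.iu.dsc.tws.tsched.streaming.roundrobin.RoundRobinTaskScheduler"
twister2.taskscheduler.batch: "batchscheduler"
twister2.taskscheduler.batch.class: "edu.iu.dsc.tws.tsched.batch.batchscheduler.BatchTaskScheduler"

# per task instance resources
twister2.taskscheduler.task.instances: 2
twister2.taskscheduler.task.instance.ram: 512.0
twister2.taskscheduler.task.instance.disk: 500.0
twister2.taskscheduler.task.instance.cpu: 2.0

# per container resources and padding
twister2.taskscheduler.container.instance.ram: 4096.0
twister2.taskscheduler.container.instance.disk: 8000.0
twister2.taskscheduler.container.instance.cpu: 16.0
twister2.taskscheduler.container.padding.percentage: 2
```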
Name | Default | Description |
twister2.network.channel.class | edu.iu.dsc.tws.comms.mpi.TWSMPIChannel |
Name | Default | Description |
twister2.resource.scheduler.mpi.working.directory | ${HOME}/.twister2/jobs | working directory |
twsiter2.resource.scheduler.mpi.mode | standalone | mode of the mpi scheduler |
twister2.resource.scheduler.mpi.job.id | | the job id file
twister2.resource.scheduler.mpi.shell.script | mpi.sh | slurm script to run
twister2.resource.scheduler.mpi.home | | the mpirun command location
twister2.resource.system.package.uri | ${TWISTER2_DIST}/twister2-core-0.7.0.tar.gz | the package uri |
twister2.resource.class.launcher | edu.iu.dsc.tws.rsched.schedulers.standalone.MPILauncher | the launcher class |
twister2.resource.scheduler.mpi.mpirun.file | ompi/bin/mpirun | mpirun file; this assumes the mpirun that is shipped with the product. Change this to just mpirun if you are using a system-wide installation of OpenMPI, or to the complete path of OpenMPI if you have a custom installation.
twister2.resource.scheduler.mpi.mapby | node | mpi scheduling policy. Two possible options are node and slot. read more at https://www.open-mpi.org/faq/?category=running#mpirun-scheduling |
twister2.resource.scheduler.mpi.mapby.use-pe | false | use the MPI map-by modifier PE. If this option is enabled, the CPU count of the compute resource specified in the job definition will be taken into consideration
twister2.resource.sharedfs | true | Indicates whether the bootstrap process needs to run to distribute the job file and core package between nodes. If this property is set to true, Twister2 assumes the job file is accessible to all nodes; otherwise, it will run the bootstrap process.
twister2.resource.fs.mount | ${TWISTER2_HOME}/persistent/fs/ | Directory for file system volume mount |
twister2.resource.uploader.directory | ${HOME}/.twister2/repository | the uploader directory |
twister2.resource.class.uploader | edu.iu.dsc.tws.rsched.uploaders.localfs.LocalFileSystemUploader | the uploader class
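A sketch of a standalone deployment tweaked to use a system-wide OpenMPI; apart from the mpirun path, the values are the defaults above:

```yaml
twister2.resource.scheduler.mpi.working.directory: "${HOME}/.twister2/jobs"
# use the system-wide OpenMPI instead of the bundled one (illustrative change)
twister2.resource.scheduler.mpi.mpirun.file: "mpirun"
twister2.resource.scheduler.mpi.mapby: "node"
twister2.resource.scheduler.mpi.mapby.use-pe: false
twister2.resource.sharedfs: true
twister2.resource.uploader.directory: "${HOME}/.twister2/repository"
twister2.resource.class.uploader: "edu.iu.dsc.tws.rsched.uploaders.localfs.LocalFileSystemUploader"
```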
Name | Default | Description |
twister2.network.channel.class | edu.iu.dsc.tws.comms.mpi.TWSMPIChannel |
Name | Default | Description |
twister2.resource.scheduler.mpi.working.directory | ${HOME}/.twister2/jobs | working directory |
twsiter2.resource.scheduler.mpi.mode | slurm | mode of the mpi scheduler |
twister2.resource.scheduler.mpi.job.id | | the job id file
twister2.resource.scheduler.mpi.shell.script | mpi.sh | slurm script to run
twister2.resource.scheduler.slurm.partition | juliet | slurm partition
twister2.resource.scheduler.mpi.home | | the mpirun command location
twister2.resource.system.package.uri | ${TWISTER2_DIST}/twister2-core-0.7.0.tar.gz | the package uri |
twister2.resource.class.launcher | edu.iu.dsc.tws.rsched.schedulers.standalone.MPILauncher | the launcher class |
twister2.resource.scheduler.mpi.mpirun.file | twister2-core/ompi/bin/mpirun | mpirun file; this assumes the mpirun that is shipped with the product. Change this to just mpirun if you are using a system-wide installation of OpenMPI, or to the complete path of OpenMPI if you have a custom installation.
twister2.resource.uploader.directory | ${HOME}/.twister2/repository | the uploader directory |
twister2.resource.class.uploader | edu.iu.dsc.tws.rsched.uploaders.localfs.LocalFileSystemUploader | the uploader class
Name | Default | Description |
twister2.worker.controller.max.wait.time.for.all.workers.to.join | 100000 | maximum time to wait for all workers to join the job, in milliseconds
twister2.worker.controller.max.wait.time.on.barrier | 100000 | maximum time to wait on barriers for all workers to arrive, in milliseconds
Name | Default | Description |
twister2.dashboard.host | http://localhost:8080 | Dashboard server host address and port. If this parameter is not specified, the job master will not try to connect to Dashboard
Name | Default | Description |
twister2.network.channel.class | edu.iu.dsc.tws.comms.tcp.TWSTCPChannel |
Name | Default | Description |
twister2.resource.system.package.uri | ${TWISTER2_DIST}/twister2-core-0.7.0.tar.gz | the package uri |
twister2.resource.class.launcher | edu.iu.dsc.tws.rsched.schedulers.aurora.AuroraLauncher | launcher class for aurora submission |
twister2.resource.class.uploader | edu.iu.dsc.tws.rsched.uploaders.scp.ScpUploader | the uploader class
twister2.resource.job.worker.class | edu.iu.dsc.tws.examples.internal.rsched.BasicAuroraContainer | container class to run in workers |
twister2.resource.class.aurora.worker | edu.iu.dsc.tws.rsched.schedulers.aurora.AuroraWorkerStarter | the Aurora worker class |
Name | Default | Description |
twister2.resource.uploader.directory | /root/.twister2/repository/ | the directory where the file will be uploaded; make sure the user has the necessary permissions to upload the file here.
twister2.resource.uploader.scp.command.options | | The scp command options that will be used by the uploader; these can be used to specify custom options such as the location of ssh keys.
twister2.resource.uploader.scp.command.connection | root@149.165.150.81 | The scp connection string sets the remote user name and host used by the uploader.
twister2.resource.uploader.ssh.command.options | | The ssh command options that will be used when connecting to the uploading host to execute commands such as deleting files or making directories.
twister2.resource.uploader.ssh.command.connection | root@149.165.150.81 | The ssh connection string sets the remote user name and host used by the uploader.
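A sketch of an scp-based uploader setup; the ssh key path is illustrative, the other values are the defaults above:

```yaml
twister2.resource.class.uploader: "edu.iu.dsc.tws.rsched.uploaders.scp.ScpUploader"
twister2.resource.uploader.directory: "/root/.twister2/repository/"
# pointing scp/ssh at a specific private key is illustrative
twister2.resource.uploader.scp.command.options: "-i ~/.ssh/id_rsa"
twister2.resource.uploader.scp.command.connection: "root@149.165.150.81"
twister2.resource.uploader.ssh.command.options: "-i ~/.ssh/id_rsa"
twister2.resource.uploader.ssh.command.connection: "root@149.165.150.81"
```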
Name | Default | Description |
twister2.resource.scheduler.aurora.script | ${TWISTER2_CONF}/twister2.aurora | Aurora Python script used to submit a job to the Aurora scheduler. Its default value is defined in the code and can be reset from this config file if desired.
twister2.resource.scheduler.aurora.cluster | example | name of the cluster the Aurora scheduler runs in
twister2.resource.scheduler.aurora.role | www-data | role in cluster |
twister2.resource.scheduler.aurora.env | devel | environment name |
twister2.resource.job.name | basic-aurora | aurora job name
twister2.resource.worker.cpu | 1.0 | number of cores for each worker. It is a floating point number; each worker can have fractional cores such as 0.5, or multiple cores such as 2. Default value is 1.0 core.
twister2.resource.worker.ram | 200 | amount of memory for each worker in the job, in megabytes, as an integer. Default value is 200 MB.
twister2.resource.worker.disk | 1024 | amount of hard disk space for each worker, in megabytes. This is only used when running Twister2 on Aurora. Default value is 1024 MB.
twister2.resource.worker.instances | 6 | number of worker instances |
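A sketch of the Aurora scheduler and worker sizing keys as YAML, using the defaults above:

```yaml
twister2.resource.scheduler.aurora.cluster: "example"
twister2.resource.scheduler.aurora.role: "www-data"
twister2.resource.scheduler.aurora.env: "devel"
twister2.resource.job.name: "basic-aurora"
# per-worker resources and worker count
twister2.resource.worker.cpu: 1.0
twister2.resource.worker.ram: 200
twister2.resource.worker.disk: 1024
twister2.resource.worker.instances: 6
```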
Name | Default | Description |
twister2.network.channel.class | edu.iu.dsc.tws.comms.tcp.TWSTCPChannel | If the channel is set as TWSMPIChannel, the job is started as an OpenMPI job. Otherwise, it is a regular Twister2 job and OpenMPI is not started.
kubernetes.secret.name | twister2-openmpi-ssh-key | A Secret object must be present in the Kubernetes master. Its name must be specified here.
Name | Default | Description |
kubernetes.worker.base.port | 9000 | the base port number workers will use internally to communicate with each other. When there are multiple workers in a pod, the first worker will get this port number, the second worker will get the next port, and so on. Default value is 9000.
kubernetes.worker.transport.protocol | TCP | transport protocol for the worker: TCP or UDP. By default, it is TCP; set this if it should be UDP.
Name | Default | Description |
kubernetes.node.port.service.requested | true | if the job requests a NodePort service, this must be true. A NodePort service makes the workers accessible from external entities (outside of the cluster). By default, its value is false.
kubernetes.service.node.port | 30003 | if the NodePort value is 0, it is automatically assigned a value. The user can request a specific port value in the NodePort range by setting the value below. By default, Kubernetes uses the range 30000-32767 for NodePorts; Kubernetes admins can change this range.
Name | Default | Description |
twister2.resource.system.package.uri | ${TWISTER2_DIST}/twister2-core-0.7.0.tar.gz | the package uri |
twister2.resource.class.launcher | edu.iu.dsc.tws.rsched.schedulers.k8s.KubernetesLauncher |
Name | Default | Description |
kubernetes.image.pull.policy | Always | image pull policy; by default it is IfNotPresent, it could also be Always
kubernetes.log.in.client | true | stream log messages to the twister2 client and save them in files. It is false by default.
kubernetes.check.pods.reachable | true | before connecting to other pods in the job, check whether all pods are reachable from each pod, and wait until all pods become reachable. When there are networking issues, pods may not be reachable immediately, so this makes sure each pod waits until all pods become reachable. It is false by default.
Name | Default | Description |
twister2.resource.job.name | t2-job | twister2 job name |
# number of workers using this compute resource | instances * workersPerPod | A Twister2 job can have multiple sets of compute resources. 'instances' is the number of compute resources to be started with this specification; 'workersPerPod' is the number of workers on each pod in Kubernetes (may be omitted in other clusters). Default value is 1.
#- cpu | 0.5 | number of cores for each worker; may be fractional such as 0.5 or 2.4
twister2.resource.job.driver.class | edu.iu.dsc.tws.examples.internal.rsched.DriverExample | driver class to run |
twister2.resource.job.worker.class | edu.iu.dsc.tws.examples.basic.HelloWorld | worker class to run
twister2.resource.worker.additional.ports | ["port1", "port2", "port3"] | by default each worker has one port. Additional ports can be requested for all workers in a job; provide the requested port names as a list, as in the default shown.
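A sketch of the job-level keys above as YAML, using the defaults from this table:

```yaml
twister2.resource.job.name: "t2-job"
twister2.resource.job.driver.class: "edu.iu.dsc.tws.examples.internal.rsched.DriverExample"
twister2.resource.job.worker.class: "edu.iu.dsc.tws.examples.basic.HelloWorld"
# request extra named ports for every worker in the job
twister2.resource.worker.additional.ports: ["port1", "port2", "port3"]
```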
Name | Default | Description |
twister2.resource.persistent.volume.per.worker | 0.0 | persistent volume size per worker in GB, as a double. Default value is 0.0. Set this value to zero if you do not have persistent disk support; when this value is zero, twister2 will not try to set up persistent storage for this job.
twister2.resource.kubernetes.persistent.storage.class | twister2-nfs-storage | the cluster admin should provide a storage provisioner; specify the storage class name that is used by the provisioner. Minikube has a default provisioner with the storageClass "standard".
twister2.resource.kubernetes.storage.access.mode | ReadWriteMany | persistent storage access mode. It shows the access mode for workers to access the shared persistent storage. If it is "ReadWriteMany", many workers can read and write. See https://kubernetes.io/docs/concepts/storage/persistent-volumes
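A sketch that requests persistent storage; the 1.0 GB per worker is illustrative (the shipped default of 0.0 disables it):

```yaml
# request 1 GB of persistent storage per worker (illustrative; 0.0 disables persistent storage)
twister2.resource.persistent.volume.per.worker: 1.0
twister2.resource.kubernetes.persistent.storage.class: "twister2-nfs-storage"
twister2.resource.kubernetes.storage.access.mode: "ReadWriteMany"
```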
Name | Default | Description |
twister2.resource.class.uploader | edu.iu.dsc.tws.rsched.uploaders.k8s.K8sUploader |
Name | Default | Description |
(uploader web server address) | http://twister2-uploader.default.svc.cluster.local | uploader web server address. If you are using twister2-uploader-wo-ps.yaml, there is no need to set this parameter; the default is fine.
(uploader web server directory) | /usr/share/nginx/html | uploader web server directory; the job package will be uploaded to this directory. If you are using twister2-uploader-wo-ps.yaml, there is no need to set this parameter; the default is fine.
(uploader web server label) | app=twister2-uploader | uploader web server label; the job package will be uploaded to the pods that have this label. If you are using twister2-uploader-wo-ps.yaml, there is no need to set this parameter; the default is fine.
Name | Default | Description |
twister2.resource.class.uploader | edu.iu.dsc.tws.rsched.uploaders.s3.S3Uploader | |
twister2.s3.bucket.name | s3://[bucket-name] | s3 bucket name to upload the job package; workers will download the job package from this location
twister2.s3.link.expiration.duration.sec | 7200 | the job package link will be available for this much time; by default, it is 2 hours
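A sketch of an S3 uploader setup; the bucket name is a placeholder:

```yaml
twister2.resource.class.uploader: "edu.iu.dsc.tws.rsched.uploaders.s3.S3Uploader"
# the bucket name below is a placeholder
twister2.s3.bucket.name: "s3://my-twister2-bucket"
twister2.s3.link.expiration.duration.sec: 7200
```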
Name | Default | Description |
twister2.resource.kubernetes.node.locations.from.config | false | If this parameter is set to true, Twister2 will use the lists below for node locations: kubernetes.datacenters.list and kubernetes.racks.list. Otherwise, it will try to get this information by querying the Kubernetes master, using the two label keys below. For this to work, the submitting client has to have admin privileges.
twister2.resource.rack.labey.key | rack | rack label key for Kubernetes nodes in a cluster. Each rack should have a unique label, and all nodes in a rack should share this label. Twister2 workers can be scheduled by using these label values, so better data locality can be achieved. No default value is specified.
twister2.resource.datacenter.labey.key | datacenter | data center label key. Each data center should have a unique label, and all nodes in a data center should share this label. Twister2 workers can be scheduled by using these label values, so better data locality can be achieved. No default value is specified.
kubernetes.datacenters.list | - echo: ['blue-rack', 'green-rack'] | Data center list with rack names
kubernetes.racks.list | - green-rack: ['node11.ip', 'node12.ip', 'node13.ip'] | Rack list with node IPs in them
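The two list parameters are nested YAML lists; a sketch reconstructed from the rows above (the exact nesting is an assumption, and the datacenter, rack, and node names are examples):

```yaml
# use the lists below instead of querying the Kubernetes master
twister2.resource.kubernetes.node.locations.from.config: true
kubernetes.datacenters.list:
  - echo: ['blue-rack', 'green-rack']                      # data center "echo" and its racks
kubernetes.racks.list:
  - green-rack: ['node11.ip', 'node12.ip', 'node13.ip']    # nodes in each rack
```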
Name | Default | Description |
kubernetes.bind.worker.to.cpu | true | Statically bind workers to CPUs. Workers do not move from the CPU they are started on during computation. twister2.cpu_per_container has to be an integer. By default, its value is false.
kubernetes.worker.to.node.mapping | true | Kubernetes can map workers to nodes as specified by the user. Default value is false.
twister2.resource.kubernetes.worker.mapping.key | kubernetes.io/hostname | the label key on the nodes that will be used to map workers to nodes |
twister2.resource.kubernetes.worker.mapping.operator | In | operator to use when mapping workers to nodes based on key value. Exists/DoesNotExist checks only the existence of the specified key on the node. Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#node-affinity-beta-feature
twister2.resource.kubernetes.worker.mapping.values | ['e012', 'e013'] | values for the mapping key. When the mapping operator is either Exists or DoesNotExist, the values list must be empty.
(uniform worker mapping) | none | uniform worker mapping; valid values are all-same-node, all-separate-nodes, and none. Default value is none.
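A sketch that binds workers to CPUs and pins them to specific hosts, using the keys and defaults above:

```yaml
kubernetes.bind.worker.to.cpu: true
kubernetes.worker.to.node.mapping: true
# schedule workers only onto the named hosts
twister2.resource.kubernetes.worker.mapping.key: "kubernetes.io/hostname"
twister2.resource.kubernetes.worker.mapping.operator: "In"
twister2.resource.kubernetes.worker.mapping.values: ['e012', 'e013']
```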
Name | Default | Description |
twister2.fault.tolerant | false | A flag to enable/disable fault tolerance in Twister2. By default, it is disabled.
twister2.fault.tolerance.failure.timeout | 10000 | a timeout value to determine whether a worker has failed. If a worker does not send heartbeat messages for this duration (in milliseconds), it is assumed to have failed.
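A sketch with fault tolerance enabled (the shipped default is false):

```yaml
# fault tolerance is enabled here for illustration (the shipped default is false)
twister2.fault.tolerant: true
twister2.fault.tolerance.failure.timeout: 10000
```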
Name | Default | Description |
twister2.job.master.runs.in.client | false | if true, the job master runs in the submitting client; if false, the job master runs as a separate process in the cluster. By default, it is true. When the job master runs in the submitting client, the job has to be submitted from a machine in the cluster, and getLocalHost must return an IP address that is reachable for the job master.
twister2.job.master.port | 11011 | twister2 job master port number. Default value is 11011.
twister2.worker.to.job.master.response.wait.duration | 10000 | worker to job master response wait time in milliseconds. This is for messages that wait for a response from the job master. Default value is 10 seconds (10000 ms).
twister2.job.master.volatile.volume.size | 0.0 | twister2 job master volatile volume size in GB. Default value is 1.0. If this value is 0, a volatile volume is not set up for the job master.
twister2.job.master.persistent.volume.size | 0.0 | twister2 job master persistent volume size in GB. Default value is 1.0. If this value is 0, a persistent volume is not set up for the job master.
twister2.job.master.cpu | 0.2 | twister2 job master CPU request. Default value is 0.2.
twister2.job.master.ram | 1024 | twister2 job master RAM request in MB. Default value is 1024 MB.
Name | Default | Description |
twister2.worker.controller.max.wait.time.for.all.workers.to.join | 100000 | maximum time to wait for all workers to join the job, in milliseconds
twister2.worker.controller.max.wait.time.on.barrier | 100000 | maximum time to wait on barriers for all workers to arrive, in milliseconds
Name | Default | Description |
twister2.job.master.to.dashboard.connections | 3 | the number of HTTP connections from the job master to Twister2 Dashboard. Default value is 3. For jobs with a large number of workers, this can be set to a higher number.
twister2.dashboard.host | http://twister2-dashboard.default.svc.cluster.local | Dashboard server host address and port. If this parameter is not specified, the job master will not try to connect to Dashboard. This default applies when the Dashboard is running as a StatefulSet in the cluster.
Name | Default | Description |
twister2.network.channel.class | edu.iu.dsc.tws.comms.tcp.TWSTCPChannel |
Name | Default | Description |
twister2.resource.mesos.scheduler.working.directory | ~/.twister2/repository | working directory for the topologies
twister2.resource.directory.core-package | /root/.twister2/repository/twister2-core/ | |
twister2.resource.directory.sandbox.java.home | ${JAVA_HOME} | location of java - pick it up from shell environment |
twister2.mesos.master.uri | 149.165.150.81:5050 | The URI of Mesos Master |
twister2.resource.mesos.framework.name | Twister2 framework | mesos framework name |
twister2.resource.mesos.master.uri | zk://localhost:2181/mesos | |
twister2.resource.mesos.framework.staging.timeout.ms | 2000 | The maximum time in milliseconds to wait for the MesosFramework to register with the Mesos Master
twister2.resource.mesos.scheduler.driver.stop.timeout.ms | 5000 | The maximum time in milliseconds to wait for the Mesos Scheduler Driver to complete stop()
twister2.resource.mesos.native.library.path | /usr/lib/mesos/0.28.1/lib/ | the path to load native mesos library |
twister2.resource.system.package.uri | ${TWISTER2_DIST}/twister2-core-0.7.0.tar.gz | the core package uri |
twister2.resource.mesos.overlay.network.name | mesos-overlay | |
twister2.resource.mesos.docker.image | gurhangunduz/twister2-mesos:docker-mpi | |
twister2.resource.system.job.uri | http://localhost:8082/twister2/mesos/twister2-job.tar.gz | the job package URI for the Mesos agent to fetch. For fetching, an HTTP server must be running on the Mesos master
twister2.resource.class.launcher | edu.iu.dsc.tws.rsched.schedulers.mesos.MesosLauncher | launcher class for mesos submission |
twister2.resource.job.worker.class | edu.iu.dsc.tws.examples.internal.comms.BroadcastCommunication | container class to run in workers |
twister2.resource.class.mesos.worker | edu.iu.dsc.tws.rsched.schedulers.mesos.MesosWorker | the Mesos worker class |
twister2.resource.uploader.directory | /var/www/html/twister2/mesos/ | the directory where the file will be uploaded; make sure the user has the necessary permissions to upload the file here.
#twister2.resource.uploader.directory.repository | /var/www/html/twister2/mesos/ | |
twister2.resource.uploader.scp.command.options | --chmod=+rwx | This is the scp command options that will be used by the uploader, this can be used to specify custom options such as the location of ssh keys. |
twister2.resource.uploader.scp.command.connection | root@149.165.150.81 | The scp connection string sets the remote user name and host used by the uploader. |
twister2.resource.uploader.ssh.command.options | | The ssh command options that will be used when connecting to the uploading host to execute commands such as deleting files or making directories.
twister2.resource.uploader.ssh.command.connection | root@149.165.150.81 | The ssh connection string sets the remote user name and host used by the uploader. |
twister2.resource.class.uploader | edu.iu.dsc.tws.rsched.uploaders.scp.ScpUploader | the uploader class
twister2.resource.uploader.download.method | HTTP | this is the method that workers use to download the core and job packages; it could be HTTP, HDFS, etc.
twister2.resource.HTTP.fetch.uri | http://149.165.150.81:8082 | HTTP fetch uri |
Name | Default | Description |
twister2.resource.scheduler.mesos.cluster | example | name of the cluster the Mesos scheduler runs in
twister2.resource.scheduler.mesos.role | www-data | role in cluster |
twister2.resource.scheduler.mesos.env | devel | environment name |
twister2.resource.job.name | basic-mesos | mesos job name |
# workersPerPod | 2 | number of workers on each pod in Kubernetes (may be omitted in other clusters). A Twister2 job can have multiple sets of compute resources; 'instances' is the number of compute resources to be started with this specification. Default value is 1.
instances | 4 | number of compute resource instances with this specification
twister2.resource.worker.additional.ports | ["port1", "port2", "port3"] | by default each worker has one port. Additional ports can be requested for all workers in a job; provide the requested port names as a list.
twister2.resource.job.driver.class | edu.iu.dsc.tws.examples.internal.rsched.DriverExample | driver class to run |
twister2.resource.nfs.server.address | 149.165.150.81 | nfs server address |
twister2.resource.nfs.server.path | /nfs/shared-mesos/twister2 | nfs server path |
twister2.resource.worker_port | 31000 | worker port |
twister2.resource.desired_nodes | all | desired nodes |
twister2.resource.use_docker_container | true | |
twister2.resource.rack.labey.key | rack | rack label key for Mesos nodes in a cluster. Each rack should have a unique label, and all nodes in a rack should share this label. Twister2 workers can be scheduled by using these label values, so better data locality can be achieved. No default value is specified.
twister2.resource.datacenter.labey.key | datacenter | data center label key. Each data center should have a unique label, and all nodes in a data center should share this label. Twister2 workers can be scheduled by using these label values, so better data locality can be achieved. No default value is specified.
(datacenters list) | - echo: ['blue-rack', 'green-rack'] | Data center list with rack names
(racks list) | - blue-rack: ['10.0.0.40', '10.0.0.41', '10.0.0.42', '10.0.0.43', '10.0.0.44'] | Rack list with node IPs in them
Name | Default | Description |
twister2.job.master.runs.in.client | false | if true, the job master runs in the submitting client; if false, the job master runs as a separate process in the cluster. By default, it is true. When the job master runs in the submitting client, the job has to be submitted from a machine in the cluster.
#twister2.job.master.port | 2023 | twister2 job master port number. Default value is 11111.
twister2.worker.to.job.master.response.wait.duration | 10000 | worker to job master response wait time in milliseconds. This is for messages that wait for a response from the job master. Default value is 10 seconds (10000 ms).
twister2.job.master.volatile.volume.size | 1.0 | twister2 job master volatile volume size in GB. Default value is 1.0. If this value is 0, a volatile volume is not set up for the job master.
twister2.job.master.persistent.volume.size | 1.0 | twister2 job master persistent volume size in GB. Default value is 1.0. If this value is 0, a persistent volume is not set up for the job master.
twister2.job.master.cpu | 0.2 | twister2 job master CPU request. Default value is 0.2.
twister2.job.master.ram | 1000 | twister2 job master RAM request in MB.
twister2.job.master.ip | 149.165.150.81 | the job master IP address to be used
Name | Default | Description |
twister2.worker.controller.max.wait.time.for.all.workers.to.join | 100000 | maximum time to wait for all workers to join the job, in milliseconds
twister2.worker.controller.max.wait.time.on.barrier | 100000 | maximum time to wait on barriers for all workers to arrive, in milliseconds
Name | Default | Description |
twister2.dashboard.host | http://localhost:8080 | Dashboard server host address and port. If this parameter is not specified, the job master will not try to connect to Dashboard
Name | Default | Description |
twister2.network.channel.class | edu.iu.dsc.tws.comms.tcp.TWSTCPChannel |
Name | Default | Description |
twister2.resource.scheduler.mpi.working.directory | ${HOME}/.twister2/jobs | working directory |
twister2.resource.job.package.url | http://149.165.xxx.xx:8082/twister2/mesos/twister2-job.tar.gz | |
twister2.resource.core.package.url | http://149.165.xxx.xx:8082/twister2/mesos/twister2-core-0.7.0.tar.gz | |
twister2.resource.class.launcher | edu.iu.dsc.tws.rsched.schedulers.nomad.NomadLauncher | the launcher class |
twister2.resource.nomad.scheduler.uri | http://localhost:4646 | |
twister2.resource.nomad.core.freq.mapping | 2000 | Nomad schedules CPU resources in terms of clock frequency (e.g. MHz), while Twister2 jobs specify CPU requests in terms of cores. This config maps a core to a clock frequency.
twister2.resource.filesystem.shared | true | whether we are on a shared file system. If so, each worker will not download the core package and job package; otherwise they will download those packages
twister2.resource.nomad.shell.script | nomad.sh | name of the script |
twister2.resource.system.package.uri | ${TWISTER2_DIST}/twister2-core-0.7.0.tar.gz | path to the system core package |
twister2.resource.uploader.directory | /root/.twister2/repository/ | the directory where the file will be uploaded; make sure the user has the necessary permissions to upload the file here. Use this value if you want to run on a local machine.
#twister2.resource.uploader.directory | /var/www/html/twister2/mesos/ | use this value if you want to use an HTTP server on echo
twister2.resource.uploader.scp.command.options | --chmod=+rwx | This is the scp command options that will be used by the uploader, this can be used to specify custom options such as the location of ssh keys. |
twister2.resource.uploader.scp.command.connection | root@localhost | The scp connection string sets the remote user name and host used by the uploader. |
twister2.resource.uploader.ssh.command.options | | The ssh command options that will be used when connecting to the uploading host to execute commands such as deleting files or making directories.
twister2.resource.uploader.ssh.command.connection | root@localhost | The ssh connection string sets the remote user name and host used by the uploader. |
twister2.resource.class.uploader | edu.iu.dsc.tws.rsched.uploaders.localfs.LocalFileSystemUploader | file system uploader to be used
twister2.resource.uploader.download.method | LOCAL | this is the method that workers use to download the core and job packages it could be LOCAL, HTTP, HDFS, .. |
Name | Default | Description |
twister2.resource.nfs.server.address | localhost | nfs server address |
twister2.resource.nfs.server.path | /tmp/logs | nfs server path |
twister2.resource.rack.labey.key | rack | rack label key for nodes in a cluster. Each rack should have a unique label, and all nodes in a rack should share this label. Twister2 workers can be scheduled by using these label values, so better data locality can be achieved. No default value is specified.
twister2.resource.datacenter.labey.key | datacenter | data center label key. Each data center should have a unique label, and all nodes in a data center should share this label. Twister2 workers can be scheduled by using these label values, so better data locality can be achieved. No default value is specified.
(datacenters list) | - echo: ['blue-rack', 'green-rack'] | Data center list with rack names
(racks list) | - green-rack: ['node11.ip', 'node12.ip', 'node13.ip'] | Rack list with node IPs in them
# workersPerPod | 2 | number of workers on each pod in Kubernetes (may be omitted in other clusters). A Twister2 job can have multiple sets of compute resources; 'instances' is the number of compute resources to be started with this specification. Default value is 1.
instances | 4 | number of compute resource instances with this specification
twister2.resource.worker.additional.ports | ["port1", "port2", "port3"] | by default each worker has one port. Additional ports can be requested for all workers in a job; provide the requested port names as a list.
twister2.resource.worker_port | 31000 | worker port |
Name | Default | Description |
twister2.job.master.runs.in.client | true | if true, the job master runs in the submitting client; if false, the job master runs as a separate process in the cluster. By default, it is true. When the job master runs in the submitting client, the job has to be submitted from a machine in the cluster.
twister2.job.master.port | 11011 | twister2 job master port number. Default value is 11011.
twister2.worker.to.job.master.response.wait.duration | 10000 | worker to job master response wait time in milliseconds. This is for messages that wait for a response from the job master. Default value is 10 seconds (10000 ms).
twister2.job.master.volatile.volume.size | 1.0 | twister2 job master volatile volume size in GB. Default value is 1.0. If this value is 0, a volatile volume is not set up for the job master.
twister2.job.master.persistent.volume.size | 1.0 | twister2 job master persistent volume size in GB. Default value is 1.0. If this value is 0, a persistent volume is not set up for the job master.
twister2.job.master.cpu | 0.2 | twister2 job master CPU request. Default value is 0.2.
twister2.job.master.ram | 1000 | twister2 job master RAM request in MB.
twister2.job.master.ip | localhost | the job master IP to be used; this is used only in client-based masters
Name | Default | Description |
twister2.worker.controller.max.wait.time.for.all.workers.to.join | 1000000 | maximum time to wait for all workers to join the job, in milliseconds
twister2.worker.controller.max.wait.time.on.barrier | 1000000 | maximum time to wait on barriers for all workers to arrive, in milliseconds
Name | Default | Description |
twister2.dashboard.host | http://localhost:8080 | Dashboard server host address and port. If this parameter is not specified, the job master will not try to connect to Dashboard