Quick answer from Wolfram Support: How do I use Mathematica in a managed high-performance cluster?

Background information

Large computing clusters consist of many nodes, each made up of many CPU cores. Specialized software such as Mathematica is often available on the cluster. Users request these resources by logging in to a head node and submitting batch jobs to the cluster manager. A job executes when its requested resources become available. Two common cluster managers are TORQUE and Slurm.

Parallelization in Mathematica uses the hub-and-spoke model where a controlling kernel manages a number of subordinate kernels (subkernels). In a cluster environment, the client runs the controlling kernel and the hosts provide subkernels. The cluster manager determines which node acts as the client and which nodes act as hosts.
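The hub-and-spoke model can be tried on a single machine before moving to a cluster. The following is a minimal sketch, not cluster-specific: the subkernel count of 2 and the variable name squares are arbitrary choices made for illustration.

```wolfram
(* Start two local subkernels unless some are already running; the count
   of 2 is an arbitrary choice for this illustration. *)
If[$KernelCount == 0, LaunchKernels[2]];

(* The controlling kernel hands out pieces of the work to the subkernels
   and collects the results in the original order. *)
squares = ParallelMap[#^2 &, Range[8]]
(* {1, 4, 9, 16, 25, 36, 49, 64} *)

(* Each subkernel is a separate process with its own kernel ID. *)
ParallelEvaluate[{$KernelID, $ProcessID}]

(* Shut the subkernels down when the parallel work is finished. *)
CloseKernels[];
```

On a cluster the same calls apply; only the way the subkernels are launched changes.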

The benefits of running Mathematica on a cluster are twofold. First, the number of available CPU cores, even on a single node, is usually greater than on a desktop computer. Second, each CPU core is usually faster than those in a desktop computer.

Running a remote Mathematica front end

A remote front end requires the user to remain in control of the job’s resources. This is known as an interactive session.

Although CPU speed is usually better than on a desktop computer, a front end run through an interactive session responds more slowly than a local front end. This is because the front end runs on the cluster while its interface is forwarded to the remote user’s computer.

An interactive session is not intended to run CPU-intensive calculations, but rather it is used to test and diagnose code. It is typical to request only one node’s resources.

Log in to the head node via SSH with X11 forwarding enabled.
Launch an interactive session, requesting all of the resources on a single node.
Start a Mathematica session.

In a Mathematica notebook, LaunchKernels[] and any other parallel functionality will now use subkernels running on the cluster.
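One way to confirm where the subkernels are running is to ask each of them for its machine name. This is a small sketch for use inside the interactive session; it assumes the default kernel configuration is the one you want to launch.

```wolfram
(* Launch the default set of subkernels unless some are already running. *)
If[$KernelCount == 0, LaunchKernels[]];

(* Number of subkernels now attached to the controlling kernel. *)
$KernelCount

(* Tally the subkernels by the machine they run on. In a single-node
   interactive session, one compute node name is expected here rather
   than the head node or your desktop computer. *)
Counts[ParallelEvaluate[$MachineName]]
```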

Running a remote Mathematica script

It is assumed that

you are familiar with launching remote subkernels
the cluster is a Unix environment
the cluster uses a cloned file system such that Mathematica is run by the same executable on all nodes

If any of the above assumptions are not true, then the following will need to be modified, but the general outline remains:

query the system to find the names of the nodes that are assigned to the job, and how many cores per node are available
manually launch remote kernels

For example, on a TORQUE-managed cluster, a Mathematica or Wolfram Language script would contain:

(* Count how many cores the job was given on each node. The node file
   lists each host name once per allocated core. *)
hosts = Counts[ReadList[Environment["PBS_NODEFILE"], "String"]];

(* Short name of the node running this controlling kernel; reserve one
   of its cores for the controlling kernel itself. *)
local = First[StringSplit[Environment["HOSTNAME"], "."]];
hosts[local]--;

(* Launch subkernels on every node that still has free cores and connect
   them to the controlling Wolfram kernel. *)
Needs["SubKernels`RemoteKernels`"];
Scan[If[hosts[#] > 0, LaunchKernels[RemoteMachine[#, hosts[#]]]] &, Keys[hosts]];
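To make the bookkeeping in the script above concrete, here is a worked example that can be run without a cluster. The node names and the list nodeLines are hypothetical stand-ins for the contents of the node file, which repeats a host name once for each core assigned to the job.

```wolfram
(* Hypothetical node file contents: one entry per allocated core. *)
nodeLines = {"node01", "node01", "node01", "node02", "node02"};

(* Cores available on each node. *)
hosts = Counts[nodeLines]
(* <|"node01" -> 3, "node02" -> 2|> *)

(* Suppose the controlling kernel runs on node01; reserve one core for it. *)
local = "node01";
hosts[local]--;
hosts
(* <|"node01" -> 2, "node02" -> 2|> *)

(* Total number of subkernels the launch step would request. *)
Total[hosts]
(* 4 *)
```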

At this point, parallel functions can use the whole set of available resources. When the parallel code is complete, it is good practice to close the subkernels.

CloseKernels[];
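On a Slurm-managed cluster, the same outline applies, but the node names and core counts come from different sources. The following is only a sketch under stated assumptions, not a tested recipe: it assumes the job environment provides SLURM_JOB_NODELIST and SLURM_CPUS_ON_NODE, that the scontrol command is on the path, that every node in the job contributes the same number of cores, and that remote kernels can be launched between nodes as in the example above. The variable names are illustrative.

```wolfram
Needs["SubKernels`RemoteKernels`"];

(* Assumption: Slurm sets these environment variables for the job. *)
nodeList = Environment["SLURM_JOB_NODELIST"];
cores = ToExpression[Environment["SLURM_CPUS_ON_NODE"]];

(* Expand the compressed node list (for example "node[01-03]") into one
   host name per line by calling scontrol. *)
nodes = ReadList["!scontrol show hostnames '" <> nodeList <> "'", String];

(* Assumption: every node contributes the same number of cores. *)
hosts = AssociationMap[cores &, nodes];

(* Reserve one core on the local node for the controlling kernel. *)
local = First[StringSplit[$MachineName, "."]];
If[KeyExistsQ[hosts, local], hosts[local]--];

(* Launch the subkernels exactly as in the TORQUE example. *)
Scan[If[hosts[#] > 0, LaunchKernels[RemoteMachine[#, hosts[#]]]] &, Keys[hosts]];
```

If the cluster assigns different core counts per node, the constant cores value would need to be replaced by a per-node lookup. As before, call CloseKernels[] once the parallel work is done.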

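Contact Support

Feel free to contact us with anything from billing and product activation questions to technical questions.

1-800-WOLFRAM (+1-217-398-0700 for international callers)

Customer Support

Monday - Friday
8am–5pm US Central Time

Product registration and activation
Pre-sales information and ordering
Installation and operation

Advanced Technical Support (for eligible customers)

Monday - Thursday
8am–7pm US Central Time

Friday
8:30–10am & 11am–5pm US Central Time

Priority technical support
Product assistance from Wolfram experts
Wolfram programming
Advanced installation support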
