Engineers who need their programs to run faster or to process larger datasets would prefer that every program use parallelism automatically. However, automatic parallelism is still a subject of basic computer science research. For now, the responsibility for using parallelism to run programs across multiple cores or computers is shared by the designers of programming languages and environments and by their users.
Parallel computing capabilities
Engineers are primarily concerned with solving complex problems within their technical domains. While some are experienced parallel programmers, most would rather not spend time mastering the details of parallel computing technologies.
Scalability and portability are key requirements for a parallel computing environment because most engineers want their parallel applications to use the available resources seamlessly. Engineers use a variety of operating systems and hardware, and they do not want to change code when migrating applications from one operating system to another or from a multicore desktop computer to a large cluster. Needing detailed knowledge of a cluster is a roadblock for an engineer who wants to use remote cluster hardware. Most engineers prefer that the cluster administrator write system-specific scripts, set environment variables, and manage job queues. Separating user and administrator tasks is therefore an important requirement.
Challenges of specialized technologies
A number of parallel computing technologies are available to an engineer. Some, such as Intel TBB and Cilk, enable programmers to write parallel programs that use multicore computers. However, the same programs cannot scale up to use remote resources such as clusters; they often must be rewritten with other technologies such as MPI, which are complex and require specialized knowledge. This workflow violates the requirement that the same parallel program scale from workstations to clusters without any recoding.
Specialized technologies such as MPI have the additional drawback of requiring the parallel program user to have some knowledge of the system on which it will be run. This reduces the portability of code and the number of people who can use it.
Scalable parallel computing
MATLAB offers different levels of control to a programmer who wishes to convert a program to run efficiently in parallel. Some programs require no recoding, while others require the use of low-level programming methods. The most commonly used programming techniques involve adding annotations to code. For example, a “for” loop with independent iterations can be annotated as a “parfor” loop. At runtime, the computing environment will attempt to run the loop iterations in parallel across multiple MATLAB “workers,” a worker being an execution engine that runs in the background on a workstation or cluster.
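As a minimal sketch of this annotation style (the loop body and function name are illustrative, not from any particular application):

    results = zeros(1, 100);
    parfor i = 1:100
        % Iterations are independent, so they can run on different MATLAB workers
        results(i) = expensiveModelEvaluation(i);   % hypothetical user function
    end

Replacing "for" with "parfor" is the only change to the loop; with no pool of workers open, the loop simply runs serially on the local MATLAB session.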
An engineer who writes a parallel MATLAB program does not need to know anything about where the program eventually will be run because the MATLAB programming language is separated from the execution environment; the same parallel program can run on multicore workstations as well as on clusters, grids, and clouds. The portability of MATLAB programs across different hardware and operating systems facilitates sharing and deploying parallel programs. For example, an engineer who develops a program on a Windows workstation can run the same program on a Linux cluster or share it with a colleague who uses a Mac.
Scaling up a parallel MATLAB program from workstation to cluster does not require the user to have knowledge of the cluster because MATLAB allows for the roles of user and cluster administrator to be independent of each other. The administrator stores information about the cluster in a configuration file (e.g., how to submit jobs and transfer data) and sends it to cluster users. A user could receive several configurations, one for each remote resource. The user imports the configurations into the MATLAB user interface and selects one of them as the resource on which to run the parallel program.
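In recent releases of Parallel Computing Toolbox these configurations are called cluster profiles. A minimal sketch of the import step, assuming the administrator has supplied a file named myCluster.mlsettings (the file and profile names are illustrative):

    % Import the profile received from the cluster administrator
    profileName = parallel.importProfile('myCluster.mlsettings');

    % Make the imported profile the default resource for parallel runs
    parallel.defaultClusterProfile(profileName);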
The typical workflow of an engineer who wishes to solve a large technical problem in MATLAB is:
1. The user writes a serial program and then parallelizes it by using constructs such as parfor.
2. The user tests and debugs the program with small inputs on a workstation.
3. The user increases the size of the inputs, imports a configuration for a remote cluster, and reruns the program on that cluster, as sketched below.
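Using current Parallel Computing Toolbox syntax, the workflow might look like the following sketch; the profile name 'remoteCluster', the function solveModel, and the input variables are illustrative assumptions:

    % Steps 1-2: test the parallelized program with small inputs on local workers
    parpool('local', 4);
    resultsSmall = solveModel(smallInputs);   % hypothetical function containing parfor loops
    delete(gcp('nocreate'));

    % Step 3: rerun the unchanged program on the remote cluster with full-size inputs
    parpool('remoteCluster');
    resultsFull = solveModel(largeInputs);
    delete(gcp('nocreate'));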
A real-world story
The Optics group at the University of Bristol performs research on semiconductor vertical-cavity surface-emitting lasers (VCSELs), which are widely used in fiber-optic telecommunication networks. The group develops new generations of VCSELs with photonic crystals (PC-VCSELs). To perform numerical simulations with models of PC-VCSELs, the group used MATLAB solvers for 2-D scalar Helmholtz partial differential equations and ordinary differential laser rate equations.
The approximate time to solve the model equations varied from 10 to 700 minutes for some models and from four to 60 hours for others. Because these models used many input parameters, computing and optimizing the PC-VCSEL characteristics required hundreds of solutions of the equations. Performing the computations on a laboratory workstation would have taken days.
Researchers parallelized the MATLAB program by structuring it as a job that computed parameters of optical modes of PC-VCSELs N times. Therefore, there were N tasks, each of which computed parameters of optical modes. Researchers first tested and debugged the program by using multiple MATLAB workers on a workstation. Once the correctness of the parallel MATLAB program had been established, it was run on a grid system provided by Enabling Grids for E-sciencE (EGEE), the consortium that provides more than 70,000 processor cores to users worldwide. By using a portion of this infrastructure, the time for computation of 300 tasks was reduced from more than five days to just six hours – a speedup of 21 times.
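A hedged sketch of how such a job might be expressed with Parallel Computing Toolbox job and task functions; the cluster profile, the function computeOpticalModes, the parameter array params, and the task count N are illustrative assumptions, not details from the Bristol group's code:

    % Create a job on the selected grid resource and add one task per parameter set
    c = parcluster('gridProfile');
    job = createJob(c);
    for k = 1:N
        % Each task computes one set of optical-mode parameters
        createTask(job, @computeOpticalModes, 1, {params(k)});
    end

    submit(job);                    % send the N tasks to the grid
    wait(job);                      % block until all tasks complete
    results = fetchOutputs(job);    % collect one row of outputs per task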