LS-DYNA on High Performance Servers


As the computational demands of FEM simulation have grown over the past several years, with models continuing to grow in complexity, traditional solution methods have become inadequate. Applying distributed computing techniques, LSTC has developed a version of LS-DYNA that can run today’s large models in reasonable times on a wide range of available hardware.

In essence, the problem to be modeled is split into pieces (domains), and each piece is simulated on a different processor. Coordination between the simulations is required at the domain boundaries. Contact is a particularly difficult problem, requiring cooperation between all the processors as the domains interact.

The communication involved produces overhead, which increases with the number of domains. Consequently, there is a limit to the speed that can be achieved. For a given problem, the simulation time generally goes down as the number of processors increases, but only up to a point. Beyond that point the speedup drops off and, if too many processors are used, the simulation time begins to increase. Here is one example of simulation run time as a function of the number of processors. In this example, the problem is a 450,000-element crash model run on a cluster of PCs.
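The trade-off between divided work and growing communication overhead can be illustrated with a toy model. The sketch below is not LS-DYNA code and does not reproduce the measurements for the 450,000-element model. It assumes, purely for illustration, that the computation divides evenly among processors and that the overhead grows linearly with the number of domains. The function names and the constants `compute` and `comm` are hypothetical.

```python
def run_time(p, compute=1000.0, comm=2.0):
    """Toy run-time model (an assumption, not measured data).

    compute / p    : work that divides evenly among p processors
    comm * (p - 1) : communication overhead, assumed linear in the
                     number of additional domains
    """
    return compute / p + comm * (p - 1)


def best_processor_count(max_p, compute=1000.0, comm=2.0):
    """Return the processor count in 1..max_p with the shortest run time."""
    return min(range(1, max_p + 1), key=lambda p: run_time(p, compute, comm))


if __name__ == "__main__":
    for p in (1, 2, 4, 8, 16, 32, 64, 128):
        print(f"{p:4d} processors: {run_time(p):8.1f} time units")
    print("fastest at", best_processor_count(128), "processors")
```

In this model the run time falls quickly at first, flattens out, and then rises again once the overhead term dominates. Real overhead depends on the model, the contact definitions, and the interconnect, so the location of the minimum has to be found by benchmarking.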

Currently, the largest application areas for the MPP (massively parallel processing) version of LS-DYNA are automotive crash and metal forming. One of LSTC's customers has been running production sheet metal stamping simulations using MPP-DYNA for several years. Their problems routinely have 1 million elements, and they achieve overnight turnaround times on a 30-processor system.

Recent advancements in PC hardware have brought these machines to the attention of large corporations as viable alternatives to vector supercomputers. One auto manufacturer has a rack-mounted PC cluster of 384 processors on which they perform production simulations using MPP-DYNA. They can run 24 simultaneous 16-processor problems, and have overnight turnaround on typical crash models with over 600,000 elements. Another carmaker recently benchmarked a PC cluster using a 700,000-node ODB (offset deformable barrier impact) model. They found that on 7 processors they could match the speed of their current-model vector supercomputer. On 48 processors the cluster was 7 times faster than the supercomputer, and still scaling. This kind of speed is simply unattainable using traditional SMP programming techniques as implemented in LS-DYNA.
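The figures quoted above can be checked with a little arithmetic. The following sketch uses hypothetical helper names and only the numbers stated in the text. It treats the "7 times faster" figure as exact, which is an assumption, since the quoted value is presumably rounded.

```python
def concurrent_jobs(total_processors, processors_per_job):
    """Number of jobs of a given size that fit on the cluster at once."""
    return total_processors // processors_per_job


def relative_efficiency(p_base, p_new, speedup):
    """Observed speedup divided by the ideal speedup p_new / p_base."""
    return speedup / (p_new / p_base)


if __name__ == "__main__":
    # 384-processor cluster running 16-processor problems
    print(concurrent_jobs(384, 16))
    # 7 processors match the supercomputer; 48 processors are 7x faster
    print(round(relative_efficiency(7, 48, 7.0), 2))
```

Going from 7 to 48 processors, ideal scaling would give a speedup of 48/7, or about 6.9. The quoted factor of 7 is therefore consistent with nearly linear scaling over that range, within the rounding of the reported numbers.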

Distributed implicit solution methods are currently under development at LSTC. Implicit methods are better suited to certain kinds of problems than explicit methods. Extending the implicit method to a distributed system makes available the memory and computational power needed to solve large, complex systems. The delay in delivering the implicit methodology in LS-DYNA stems primarily from the effort required to add constitutive matrices to each material model, to extend every constraint option (including contact) to implicit, and to develop stiffness matrices for each explicit element. This work is nearing completion, so our efforts will now center on the MPP implementation.