parallel computers 
Energy consumption of parallel computers has become an obstacle to higher-performance systems.


In this paper, we focus on power optimization of high-performance interconnection networks for MPI applications in high-performance parallel computers.


Fortran-based programming languages for parallel computers are discussed.


The problem of mapping affine loop nests onto parallel computers with distributed memory is considered.


The numerical algorithm is designed to be executed on parallel computers.


This paper presents the principles of the parallel code design and examines its performance on a variety of state-of-the-art parallel computers in China.


Cellular automata (CA) are simple mathematical models of the dynamics of discrete variables in discrete space and time, with applications in nonequilibrium physics, chemical reactions, population dynamics and parallel computers.


These results were calculated using the Schwinger multichannel method as implemented on distributed-memory parallel computers.


In response to the computational challenges of the simulations, a highly efficient interprocessor communications methodology is developed, which greatly reduces the simulation time on parallel computers.


The method can be used directly on parallel computers.


Basic aspects and strategies of running Monte Carlo calculations on parallel computers are studied.


The resulting finite sparse matrix problem is solved by diagonalization on parallel computers.


Polymorphic Torus is a novel interconnection network for SIMD massively parallel computers, able to support effectively both local and global communication.


Finally, we consider the potential of the dual approach for execution on parallel computers.


The cellular automata (CA) model of decentralized computations provides one such approach which is ideally tailored for parallel computers.


A Lax method is used for numerical analysis of a 2D BGK fluid, which results in an easy-to-implement algorithm well suited for massively parallel computers.


With recent improvements in algorithms and with the use of parallel computers, the degree and order for full variance-covariance matrices could be increased to 180.


A numerical example on two different parallel computers shows that the proposed implementation of AMR is effective in reducing the computational time for unsteady flows with shock waves.


Two methods are presented which efficiently solve tridiagonal systems on vector supercomputers and parallel computers with a moderate degree of parallelism.


The construction of the base functions is fully decoupled from element to element; thus the method is perfectly parallel and is naturally adapted to massively parallel computers.

