Research On The Parallel Finite Element Method Based On The Many-Core Processor And Engineering Application

Posted on:2017-03-21

Degree:Doctor

Type:Dissertation

Country:China

Candidate:J Z Zhang

GTID:1368330596456589

Subject:Mechanical Manufacturing and Automation

Abstract/Summary:

In mechanical engineering, designers rely heavily on CAE software to design aircraft structures, automobile chassis, power transmission towers, arms, and truss structures. The finite element method is a central part of CAE design. It deals mainly with complex problems such as material nonlinearity and changes in geometric state, and it often involves very large numerical computations with low computational efficiency, so there is a strong practical demand for parallel computing. The CPU-based algorithms and software in wide use today have low computational efficiency and a poor cost-performance ratio. With the advent of the many-core era, researchers in academia and industry have begun to use various many-core processors (many-core accelerators) to speed up finite element computation and thereby improve the efficiency of CAE software. The Xeon Phi many-core processor, which is widely used in parallel computing, offers far higher peak floating-point performance and memory bandwidth than a conventional CPU. However, porting the finite element method to the Xeon Phi, and designing and developing finite element software that supports it, remains a major challenge.

To address these problems, and guided by engineering applications, this dissertation studies in depth the implementation and optimization of the finite element method on the Xeon Phi many-core processor. The main work and innovations are as follows:

1. A Delaunay triangulation algorithm based on parallel inversion and parallel insertion is proposed, implemented, and optimized on the Xeon Phi. Starting from the definition of the Delaunay triangulation, a combinatorial optimization problem is derived. Two local optimization operations are designed and implemented according to the architectural characteristics of the processor. These two operations solve the combinatorial optimization problem to yield an approximate Delaunay triangular mesh, and a repair method then converts the approximate mesh into a true Delaunay triangular mesh. Numerical experiments show that the proposed algorithm achieves a speedup of about 4 times over the CGAL software package. Thus, most stages of finite element analysis in engineering design can be computed efficiently in parallel on many-core processors.

2. Two parallel algorithms for sparse matrix LU decomposition are proposed. The first uses the Right-looking technique and is optimized with sparse blocking, which effectively improves data reuse and reduces bandwidth requirements. The second is based on the Left-looking method and uses a parallel scheduling strategy based on the elimination graph to balance the load across threads. Experimental results show that both methods achieve higher performance than the sparse matrix LU decomposition used for comparison. For most matrices, the Left-looking method outperforms the Right-looking method.

3. A parallel conjugate gradient algorithm is designed and optimized on the many-core processor. The computation in the conjugate gradient algorithm is dominated by sparse matrix-vector multiplication. Based on the characteristics of finite element matrices and the architecture of the many-core processor, a sparse matrix storage format is designed that stores the matrix suitably and operates efficiently on the Xeon Phi many-core processor. A parallel sparse matrix-vector multiplication algorithm with dynamic scheduling is then designed for this storage format. With this multiplication as its key component, the conjugate gradient algorithm is designed and implemented on the many-core processor. Experimental results show that the conjugate gradient algorithm performs much better on the many-core processor than on the CPU. Thus, designers can understand the behavioral characteristics of the application more clearly during the design process, improve the structural design of various projects, and ultimately make better use of the performance of many-core processors.

The research in this dissertation is significant for both theoretical study and engineering applications, especially for the development of CAE software that supports the Xeon Phi many-core processor.
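
As a rough illustration of the third contribution, the sketch below shows how sparse matrix-vector multiplication sits at the core of each conjugate gradient iteration. It is a minimal serial example and not the dissertation's implementation: it assumes the standard CSR (compressed sparse row) format in place of the custom storage format described above, it omits the dynamic scheduling and all Xeon Phi specific optimization, and the function names (csr_matvec, dot, conjugate_gradient) are hypothetical.

```python
def csr_matvec(indptr, indices, data, x):
    """Multiply a CSR matrix by a dense vector x (assumed format: standard CSR)."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        # Nonzeros of this row live in data[indptr[row]:indptr[row + 1]].
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y


def dot(a, b):
    """Dense dot product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def conjugate_gradient(indptr, indices, data, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite CSR matrix A."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, with x = 0 initially
    p = list(r)          # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        ap = csr_matvec(indptr, indices, data, p)  # the dominant cost per iteration
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Small example: A = [[4, 1], [1, 3]], b = [1, 2].
x = conjugate_gradient([0, 2, 4], [0, 1, 0, 1], [4.0, 1.0, 1.0, 3.0], [1.0, 2.0])
print(x)
```

Each row of the product in csr_matvec is independent of the others, which is why this kernel is the natural target for thread-level parallelism and for a storage format tuned to the hardware.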

Keywords/Search Tags:

Finite Element Method, Intel Xeon Phi many-core processor, Delaunay triangulation, LU decomposition, conjugate gradient algorithm, sparse matrix-vector multiplication

Related items

1	Research On Parallel Program Performance Tuning On Multi-Core Computing Platform
2	Parallel Algorithms And Architectures For Matrix Computations On FPGA
3	Design And Optimization Of Parallel Algorithm Based On MIC Many-core Architecture
4	Grid Generation And Application Of The Delaunay Triangulation Under A New Metric
5	Optimization And Realization For Sparse Matrix-Vector Multiplication On FPGA
6	Research And Application Of Parallel Sparse Diagonal Matrix-vector Multiplication Algorithm On GPU
7	High Efficient Matrix Operations On Vector-SIMDE DSPs
8	Mesh Generation Algorithm And Software To Achieve
9	Research On Triangulation Algorithm
10	Studies On Reconstruction Algorithm For Optical Tomography