There are several libraries in which the necessary subroutines are collected: LAPACK, IMSL, NAG, etc. The Math Kernel Library (MKL) is a package specially optimized for Intel processors (MKL has versions for both Windows and Linux). Great advantages come from its support for multicore and multiprocessor systems and from automatic parallelization, which can increase computing speed significantly. MKL also provides vector implementations of element-wise operations (e.g. the square root of every element of an array) in parallel mode. Special mathematical packages such as Matlab, Maple, and Mathematica use MKL as an underlying library.
A promising way to accelerate the code is general-purpose computing on graphics processing units (GPUs). The Compute Unified Device Architecture (CUDA) is the computing engine in nVIDIA GPUs; it gives access to the instruction set of the computing elements of the video card. Standard LAPACK routines can be accelerated by CUDA as well: instead of reprogramming the LAPACK subroutines, it is advisable to apply the special library CULA.
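As an illustration of delegating dense linear algebra to an optimized LAPACK backend rather than hand-written loops, the sketch below solves a dense system with NumPy, which dispatches to whatever BLAS/LAPACK build it is linked against (MKL, OpenBLAS, etc.); the matrix here is synthetic, not taken from the transfer problem:

```python
import numpy as np

# Synthetic, well-conditioned dense system standing in for one block
# of the discretized transfer problem.
rng = np.random.default_rng(0)
n = 200
A = rng.standard_normal((n, n)) + n * np.eye(n)  # diagonally dominant
b = rng.standard_normal(n)

# np.linalg.solve calls LAPACK's ?gesv; with an MKL-linked build the same
# call runs multithreaded with no change to the calling code.
x = np.linalg.solve(A, b)
residual = float(np.linalg.norm(A @ x - b))
```

The calling code stays identical whichever backend supplies the parallelism, which is exactly why a highly optimized library pays off.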
MATLAB 2010b also supports nVIDIA CUDA-capable GPUs, and no knowledge of or experience with CUDA is required to use its GPU-computing features.
Significant acceleration and memory savings can be achieved by using sparse matrices. The basic idea is the following: when working with a matrix that has many zero elements, it is wise to store only the non-zero elements together with additional information from which the indices of the non-zero elements can be restored, or the indices themselves. There are several formats for storing sparse matrices: the compressed sparse row (CSR) format, the compressed sparse column (CSC) format, and the coordinate format. The choice of format depends on the type of operation; for instance, the CSC format is convenient when the sparse matrix is accessed column by column, since its elements are arranged by column. The use of sparse matrices can be efficient for computing the anisotropic part (49) and for dealing with a modified source function.
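The storage idea can be sketched with SciPy's sparse module (the banded matrix below is only an illustration, not the actual anisotropic-part operator):

```python
import numpy as np
from scipy import sparse

# A K x K matrix with non-zeros only on the main diagonal and the first
# superdiagonal: K*K dense entries, but only 2K - 1 of them are non-zero.
K = 1000
dense = np.diag(np.full(K, 2.0)) + np.diag(np.full(K - 1, -1.0), k=1)

# CSR stores the non-zero values, their column indices, and row pointers;
# CSC is the column-oriented analogue, convenient for column-wise access.
csr = sparse.csr_matrix(dense)
csc = csr.tocsc()

nnz = csr.nnz  # 1999 stored values instead of 1 000 000
```

Converting between the formats is cheap compared with dense arithmetic, so the format can be chosen per operation.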
6. Results and discussion
We analyzed the effect of different hardware and software resources on the efficiency of implementing solution (26) with the anisotropic-part elimination on the basis of MSH, using the codes MDOM and MVDOM as examples. The run-times for different modes of compiling and computing are compared in Table 1 for two tests (test 1: N = 101, K = 500, M = 32; test 2: N = 101, K = 1000, M = 32). The tests were run under Ubuntu 10.04 on an Intel Core 2 Duo 3 GHz with 2 GB RAM, using Intel Fortran Compiler 11.1 with MKL 10.2. Two compilers (gfortran and ifort), optimization, and the sparse matrix technique were used.
Note that MKL uses all available computing cores of the system (2 cores in this case) and delivers the results in half the time. The sparse matrix technique used for computing the anisotropic part reduces the run-time significantly. Owing to the sparse representation, the two-dimensional arrays are reduced to one-dimensional arrays, so the run-time is proportional to K instead of to K². We gained a further acceleration of about 20% by performing the matrix multiplication on an nVIDIA GeForce GTX 480 GPU. GPU computation is advantageous only for large arrays; otherwise CPU computing is preferable. The profiler shows that the calculation of eigenvectors and eigenvalues takes half of the run-time. Unfortunately, the subroutines for these problems have not been implemented in the Matlab GPU tool. The sizes of the matrices are significantly reduced by the elimination of the solution's anisotropic part with MSH; in practice, N is less than 300. In the case of spectral computations, however, the parallelization of the wavelength loop can also be implemented with CUDA technology. This is a possible way to exploit the advantages of CUDA, although not for a single-wavelength problem.
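Since the eigendecomposition dominates the run-time, it is the step that benefits most from an optimized LAPACK. A minimal sketch with NumPy, which dispatches eigenproblems to LAPACK (the matrix is synthetic):

```python
import numpy as np

# Synthetic symmetric matrix standing in for the discretized operator;
# symmetry guarantees real eigenvalues and well-conditioned eigenvectors.
rng = np.random.default_rng(1)
n = 150
B = rng.standard_normal((n, n))
A = 0.5 * (B + B.T)

# np.linalg.eig calls LAPACK's dgeev for a general real matrix; for a
# symmetric one, np.linalg.eigh (dsyev-family) is faster and safer.
w, V = np.linalg.eigh(A)

# Verify the decomposition: A V = V diag(w).
err = float(np.linalg.norm(A @ V - V * w))
```

With an MKL-linked build the same call is automatically multithreaded, which is where the factor-of-two gains in Table 1 largely come from.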
Eurotherm Conference No. 95: Computational Thermal Radiation in Participating Media IV (IOP Publishing)
Journal of Physics: Conference Series 369 (2012) 012021, doi:10.1088/1742-6596/369/1/012021

Table 1. Comparison of computing time for the two tests.

Features                                      Run-time I, sec   Run-time II, sec
gfortran + LAPACK                             240               530
gfortran + LAPACK + optimization              230               505
ifort + LAPACK                                210               490
ifort + MKL                                   115               250
ifort + MKL + optimization                    105               230
ifort + MKL + optimization + sparse matrix    33                44
Matlab 2010b                                  27                45
Matlab 2010b + CUDA                           22                33

Now let us consider the effectiveness of the different methods of eliminating the anisotropic part of the solution by numerically comparing the calculated radiance angular distributions for the same task with different codes: DISORT and MDOM in the scalar case, and Pstar and MVDOM in the vector one. It is worth mentioning that the speed of the DISORT calculation can be increased significantly by using the sparse matrix algorithm in the subroutine ZEROIT, which zeros a given matrix.
In this case the speeds of MDOM and DISORT for fairly smooth phase functions (g < 0.9) are of the same order. For the Henyey-Greenstein phase function with g = 0.98, the MSH method of anisotropic-part elimination turns out to be better. Figure 1 provides a numerical comparison of the reflectance calculations for medium scattering anisotropy, and figure 2 for strong anisotropy.

Figure 1. Comparison of MDOM and DISORT in the case of medium scattering anisotropy.
Figure 2. Comparison of MDOM and DISORT in the case of strong scattering anisotropy.

Note that even a medium degree of scattering anisotropy necessarily requires the use of the TMS method. Isolation of the solution singularities on the basis of MSH works independently of the degree of scattering anisotropy. The running time in the first case is approximately the same, t ~ 0.5 sec, but it differs greatly in the second case: t ~ 2 sec for MDOM versus t ~ 14 sec for DISORT.
Quite similar results were obtained in the vector case. We compared Pstar and MVDOM for the phase matrix of a log-normal distribution with parameters r0 = 5, s = 0.4, Λ = 0.99, and an observation geometry with θ0 = 45˚, τ0 = 1.0, ϕ = 30˚.
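For reference, the Henyey-Greenstein phase function used in the strong-anisotropy test above has the closed form p(mu) = (1 - g^2) / (1 + g^2 - 2*g*mu)^(3/2); a quick numerical check of its normalization (the quadrature grid is arbitrary):

```python
import numpy as np

def henyey_greenstein(mu, g):
    """Henyey-Greenstein phase function of the scattering-angle cosine mu,
    normalized so that (1/2) * integral over mu in [-1, 1] equals 1."""
    return (1.0 - g**2) / (1.0 + g**2 - 2.0 * g * mu) ** 1.5

g = 0.98  # strong forward scattering, as in the comparison above
mu = np.linspace(-1.0, 1.0, 200001)
p = henyey_greenstein(mu, g)

# Trapezoidal rule; the fine grid is needed to resolve the forward peak.
norm = 0.5 * float((0.5 * (p[1:] + p[:-1]) * np.diff(mu)).sum())
forward_peak = henyey_greenstein(1.0, g)  # equals (1 + g) / (1 - g)^2
```

The forward peak (1 + g)/(1 - g)^2 is about 4950 for g = 0.98, which is why such a phase function is a demanding test of anisotropic-part elimination.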
Figure 3 shows a comparison of the calculated angular distributions of the polarization transmitted by the slab, and figure 4 provides a comparison for the reflected radiation.

Figure 3. Polarization of the transmitted radiation.
Figure 4. Polarization of the reflected radiation.

For the best fit we used the parameters N = 71, K = 171, M = 20 in the code MVDOM and 30 streams in the code Pstar.
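Assuming the figures plot the degree of polarization, it is obtained from the Stokes vector (I, Q, U, V) as P = sqrt(Q^2 + U^2 + V^2) / I; a minimal helper (the sample Stokes vector is made up):

```python
import numpy as np

def degree_of_polarization(I, Q, U, V=0.0):
    """Total degree of polarization of a Stokes vector (I, Q, U, V)."""
    return np.sqrt(Q**2 + U**2 + V**2) / I

# Made-up Stokes vector of a partially polarized beam.
P = degree_of_polarization(1.0, 0.3, 0.4)  # sqrt(0.09 + 0.16) / 1
```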
The comparison of the results showed perfect coincidence at the level of computer accuracy, but the computation time was 11 seconds for MVDOM versus more than 180 seconds for Pstar.

7. Conclusions
The discretized VRTE for a slab has a unique analytic solution in matrix form. The high level of optimization of linear algebra packages permits a single computer implementation of the algorithm for this solution. The various implementations differ in the method of eliminating the anisotropic part of the solution. According to our analysis, the analytical anisotropy elimination on the basis of MSH is presumably a more precise algorithm than TMS. In implementing the algorithm it is necessary to use sparse matrices, and on Intel processors the use of the MKL library is essential.