Unfortunately, there is no one standard scheme in general use. Knuth [7] describes one method. The Yale Sparse Matrix Package [6] and ITPACK [8] describe several other methods. For most applications, we favor the storage scheme used by PCGPACK [9], which is almost the same as that described by Bentley [10], and also similar to one of the Yale Sparse Matrix Package methods. The advantage of this scheme, which can be called row-indexed sparse storage mode, is that it requires storage of only about two times the number of nonzero matrix elements.
(Other methods can require as much as three or five times.) For simplicity, we will treat only the case of square matrices, which occurs most frequently in practice. To represent a matrix A of dimension N × N, the row-indexed scheme sets up two one-dimensional arrays, call them sa and ija. The first of these stores matrix element values in single or double precision as desired; the second stores integer values. The storage rules are:

• The first N locations of sa store A's diagonal matrix elements, in order. (Note that diagonal elements are stored even if they are zero; this is at most a slight storage inefficiency, since diagonal elements are nonzero in most realistic applications.)
• Each of the first N locations of ija stores the index of the array sa that contains the first off-diagonal element of the corresponding row of the matrix. (If there are no off-diagonal elements for that row, it is one greater than the index in sa of the most recently stored element of a previous row.)
• Location 1 of ija is always equal to N + 2. (It can be read to determine N.)
• Location N + 1 of ija is one greater than the index in sa of the last off-diagonal element of the last row. (It can be read to determine the number of nonzero elements in the matrix, or the number of elements in the arrays sa and ija.) Location N + 1 of sa is not used and can be set arbitrarily.
• Entries in sa at locations ≥ N + 2 contain A's off-diagonal values, ordered by rows and, within each row, ordered by columns.
• Entries in ija at locations ≥ N + 2 contain the column number of the corresponding element in sa.

While these rules seem arbitrary at first sight, they result in a rather elegant storage scheme. As an example, consider the matrix
    3.  0.  1.  0.  0.
    0.  4.  0.  0.  0.
    0.  7.  5.  9.  0.          (2.7.27)
    0.  0.  0.  0.  2.
    0.  0.  0.  6.  5.

In row-indexed compact storage, matrix (2.7.27) is represented by the two arrays of length 11, as follows:

index k    1    2    3    4    5    6    7    8    9   10   11
ija[k]     7    8    8   10   11   12    3    2    4    5    4          (2.7.28)
sa[k]      3.   4.   5.   0.   5.   x    1.   7.   9.   2.   6.

Here x is an arbitrary value. Notice that, according to the storage rules, the value of N (namely 5) is ija[1]-2, and the length of each array is ija[ija[1]-1]-1, namely 11. The diagonal element in row i is sa[i], and the off-diagonal elements in that row are in sa[k] where k loops from ija[i] to ija[i+1]-1, if the upper limit is greater or equal to the lower one (as in C's for loops).
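These rules make it easy to read any single element of A directly from the two arrays. As a minimal illustration (our own sketch, not one of the routines in this book; the name sprs_lookup is hypothetical), the following function returns element (i, j) of a matrix held in sa and ija, yielding zero for any off-diagonal element that is not stored:

/* Minimal sketch (not an NR routine): return element (i,j) of a matrix stored in
row-indexed sparse format in sa and ija.  Off-diagonal elements that are not stored
are zeros of the matrix. */
float sprs_lookup(float sa[], unsigned long ija[], unsigned long i, unsigned long j)
{
    unsigned long k;

    if (i == j) return sa[i];            /* diagonal elements occupy sa[1..N] */
    for (k=ija[i];k<=ija[i+1]-1;k++)     /* scan the off-diagonal entries of row i */
        if (ija[k] == j) return sa[k];   /* ija[k] holds the column number of sa[k] */
    return 0.0;                          /* not stored, hence a zero of the matrix */
}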
Here is a routine, sprsin, that converts a matrix from full storage mode into row-indexed sparse storage mode, throwing away any elements that are less than a specified threshold. Of course, the principal use of sparse storage mode is for matrices whose full storage mode won't fit into your machine at all; then you have to generate them directly into sparse format. Nevertheless sprsin is useful as a precise algorithmic definition of the storage scheme, for subscale testing of large problems, and for the case where execution time, rather than storage, furnishes the impetus to sparse storage.

#include <math.h>

void sprsin(float **a, int n, float thresh, unsigned long nmax, float sa[],
    unsigned long ija[])
/* Converts a square matrix a[1..n][1..n] into row-indexed sparse storage mode.  Only elements
of a with magnitude >= thresh are retained.  Output is in two linear arrays with dimension nmax
(an input parameter): sa[1..] contains array values, indexed by ija[1..].  The number of elements
filled of sa and ija on output are both ija[ija[1]-1]-1 (see text). */
{
    void nrerror(char error_text[]);
    int i,j;
    unsigned long k;

    for (j=1;j<=n;j++) sa[j]=a[j][j];     /* Store diagonal elements. */
    ija[1]=n+2;                           /* Index to 1st row off-diagonal element, if any. */
    k=n+1;
    for (i=1;i<=n;i++) {                  /* Loop over rows. */
        for (j=1;j<=n;j++) {              /* Loop over columns. */
            if (fabs(a[i][j]) >= thresh && i != j) {
                if (++k > nmax) nrerror("sprsin: nmax too small");
                sa[k]=a[i][j];            /* Store off-diagonal elements and their columns. */
                ija[k]=j;
            }
        }
        ija[i+1]=k+1;                     /* As each row is completed, store index to next. */
    }
}

The single most important use of a matrix in row-indexed sparse storage mode is to multiply a vector to its right.
In fact, the storage mode is optimized for just this purpose. The following routine is thus very simple.

void sprsax(float sa[], unsigned long ija[], float x[], float b[],
    unsigned long n)
/* Multiply a matrix in row-index sparse storage arrays sa and ija by a vector x[1..n], giving
a vector b[1..n]. */
{
    void nrerror(char error_text[]);
    unsigned long i,k;

    if (ija[1] != n+2) nrerror("sprsax: mismatched vector and matrix");
    for (i=1;i<=n;i++) {
        b[i]=sa[i]*x[i];                  /* Start with diagonal term. */
        for (k=ija[i];k<=ija[i+1]-1;k++)  /* Loop over off-diagonal terms. */
            b[i] += sa[k]*x[ija[k]];
    }
}
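To see these routines working together, here is a small driver of our own (not an NR listing; the main program and array setup are illustrative assumptions) that packs matrix (2.7.27) with sprsin and multiplies it by a vector of ones with sprsax. After the call to sprsin, sa and ija contain exactly the values tabulated in (2.7.28), except that sa[6] is never written.

#include <stdio.h>

#define N 5                /* logical size of the example matrix */
#define NMAX 11            /* 11 stored entries, as in (2.7.28) */

void sprsin(float **a, int n, float thresh, unsigned long nmax, float sa[],
    unsigned long ija[]);
void sprsax(float sa[], unsigned long ija[], float x[], float b[],
    unsigned long n);

int main(void)
{
    /* matrix (2.7.27) in NR's unit-offset convention: row and column 0 are unused */
    static float row1[]={0.,3.,0.,1.,0.,0.}, row2[]={0.,0.,4.,0.,0.,0.},
        row3[]={0.,0.,7.,5.,9.,0.}, row4[]={0.,0.,0.,0.,0.,2.},
        row5[]={0.,0.,0.,0.,6.,5.};
    float *a[N+1]={0,row1,row2,row3,row4,row5};
    float sa[NMAX+1],b[N+1],x[N+1]={0.,1.,1.,1.,1.,1.};
    unsigned long ija[NMAX+1],i;

    sprsin(a,N,0.5,NMAX,sa,ija);              /* pack into row-indexed sparse storage */
    sprsax(sa,ija,x,b,N);                     /* b = A x, here the vector of row sums */
    for (i=1;i<=N;i++) printf("%g\n",b[i]);   /* prints 4 4 21 2 11 */
    return 0;
}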
It is also simple to multiply the transpose of a matrix by a vector to its right. (We will use this operation later in this section.) Note that the transpose matrix is not actually constructed.

void sprstx(float sa[], unsigned long ija[], float x[], float b[],
    unsigned long n)
/* Multiply the transpose of a matrix in row-index sparse storage arrays sa and ija by a vector
x[1..n], giving a vector b[1..n]. */
{
    void nrerror(char error_text[]);
    unsigned long i,j,k;

    if (ija[1] != n+2) nrerror("mismatched vector and matrix in sprstx");
    for (i=1;i<=n;i++) b[i]=sa[i]*x[i];   /* Start with diagonal terms. */
    for (i=1;i<=n;i++) {                  /* Loop over off-diagonal terms. */
        for (k=ija[i];k<=ija[i+1]-1;k++) {
            j=ija[k];
            b[j] += sa[k]*x[i];
        }
    }
}

(Double precision versions of sprsax and sprstx, named dsprsax and dsprstx, are used by the routine atimes later in this section.
You can easily make the conversion, or else get the converted routines from the Numerical Recipes diskettes.)
In fact, because the choice of row-indexed storage treats rows and columns quite differently, it is quite an involved operation to construct the transpose of a matrix, given the matrix itself in row-indexed sparse storage mode. When the operation cannot be avoided, it is done as follows: An index of all off-diagonal elements by their columns is constructed (see §8.4). The elements are then written to the output array in column order.
As each element is written, its row is determined and stored. Finally, the elements in each column are sorted by row.

void sprstp(float sa[], unsigned long ija[], float sb[], unsigned long ijb[])
/* Construct the transpose of a sparse square matrix, from row-index sparse storage arrays sa and
ija into arrays sb and ijb. */
{
    void iindexx(unsigned long n, long arr[], unsigned long indx[]);
                                          /* Version of indexx with all float variables changed to long. */
    unsigned long j,jl,jm,jp,ju,k,m,n2,noff,inc,iv;
    float v;

    n2=ija[1];                            /* Linear size of matrix plus 2. */
    for (j=1;j<=n2-2;j++) sb[j]=sa[j];    /* Diagonal elements. */
    iindexx(ija[n2-1]-ija[1],(long *)&ija[n2-1],&ijb[n2-1]);
                                          /* Index all off-diagonal elements by their columns. */
    jp=0;
    for (k=ija[1];k<=ija[n2-1]-1;k++) {   /* Loop over output off-diagonal elements. */
        m=ijb[k]+n2-1;                    /* Use index table to store by (former) columns. */
        sb[k]=sa[m];
        for (j=jp+1;j<=ija[m];j++) ijb[j]=k;  /* Fill in the index to any omitted rows. */
        jp=ija[m];
        jl=1;                             /* Use bisection to find which row element m is in */
        ju=n2-1;                          /* and put that into ijb[k]. */
        while (ju-jl > 1) {
            jm=(ju+jl)/2;
            if (ija[jm] > m) ju=jm; else jl=jm;
        }
        ijb[k]=jl;
    }
    for (j=jp+1;j<n2;j++) ijb[j]=ija[n2-1];
    for (j=1;j<=n2-2;j++) {               /* Make a final pass to sort each row by */
        jl=ijb[j+1]-ijb[j];               /* Shell sort algorithm. */
        noff=ijb[j]-1;
        inc=1;
        do {
            inc *= 3;
            inc++;
        } while (inc <= jl);
        do {
            inc /= 3;
            for (k=noff+inc+1;k<=noff+jl;k++) {
                iv=ijb[k];
                v=sb[k];
                m=k;
                while (ijb[m-inc] > iv) {
                    ijb[m]=ijb[m-inc];
                    sb[m]=sb[m-inc];
                    m -= inc;
                    if (m-noff <= inc) break;
                }
                ijb[m]=iv;
                sb[m]=v;
            }
        } while (inc > 1);
    }
}

The above routine embeds internally a sorting algorithm from §8.1, but calls the external routine iindexx to construct the initial column index.
This routine is identical to indexx, as listed in §8.4, except that the latter's two float declarations should be changed to long. (The Numerical Recipes diskettes include both indexx and iindexx.) In fact, you can often use indexx without making these changes, since many computers have the property that numerical values will sort correctly independently of whether they are interpreted as floating or integer values.
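A simple consistency check, offered here as our own sketch rather than as part of the Numerical Recipes library, is that (A^T)x computed by way of an explicit sprstp transpose followed by sprsax must agree with the result of sprstx applied directly to A; the helper name check_transpose is hypothetical.

#include <math.h>
#include <stdio.h>

void sprstp(float sa[], unsigned long ija[], float sb[], unsigned long ijb[]);
void sprsax(float sa[], unsigned long ija[], float x[], float b[], unsigned long n);
void sprstx(float sa[], unsigned long ija[], float x[], float b[], unsigned long n);

/* Hypothetical consistency check, not an NR routine: (A^T)x computed by an explicit
transpose (sprstp, then sprsax) should agree with sprstx applied directly to A.
sb,ijb are work arrays of the same length as sa,ija; b1,b2 are work vectors of length n. */
void check_transpose(float sa[], unsigned long ija[], float x[], unsigned long n,
    float sb[], unsigned long ijb[], float b1[], float b2[])
{
    unsigned long i;

    sprstp(sa,ija,sb,ijb);            /* explicit transpose into sb,ijb */
    sprsax(sb,ijb,x,b1,n);            /* b1 = (A^T) x via the transposed matrix */
    sprstx(sa,ija,x,b2,n);            /* b2 = (A^T) x without forming the transpose */
    for (i=1;i<=n;i++)
        if (fabs(b1[i]-b2[i]) > 1.0e-6) printf("transpose mismatch at i=%lu\n",i);
}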
As final examples of the manipulation of sparse matrices, we give two routines for the multiplication of two sparse matrices. These are useful for techniques to be described in §13.10.
In general, the product of two sparse matrices is not itself sparse. One therefore wants to limit the size of the product matrix in one of two ways: either compute only those elements of the product that are specified in advance by a known pattern of sparsity, or else compute all nonzero elements, but store only those whose magnitude exceeds some threshold value.
The former technique, when it can be used, is quite efficient. The pattern of sparsity is specified by furnishing an index array in row-index sparse storage format (e.g., ija). The program then constructs a corresponding value array (e.g., sa). The latter technique runs the danger of excessive compute times and unknown output sizes, so it must be used cautiously.
With row-index storage, it is much more natural to multiply a matrix (on the left) by the transpose of a matrix (on the right), so that one is crunching rows on rows, rather than rows on columns. Our routines therefore calculate A · B^T, rather than A · B. This means that you have to run your right-hand matrix through the transpose routine sprstp before sending it to the matrix multiply routine.
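For instance, if what you really want is A · B rather than A · B^T, a minimal wrapper along the following lines (our own sketch, with the hypothetical name sprs_ab; it assumes the threshold-multiply routine sprstm listed below) first transposes B with sprstp and then hands the transpose to sprstm.

void sprstp(float sa[], unsigned long ija[], float sb[], unsigned long ijb[]);
void sprstm(float sa[], unsigned long ija[], float sb[], unsigned long ijb[],
    float thresh, unsigned long nmax, float sc[], unsigned long ijc[]);

/* Hypothetical convenience wrapper, not an NR routine: form C = A.B by threshold
multiply.  sa,ija hold A; sb,ijb hold B; sbt,ijbt are work arrays of the same length
as sb,ijb and receive B^T; the product is returned in sc,ijc (dimension nmax). */
void sprs_ab(float sa[], unsigned long ija[], float sb[], unsigned long ijb[],
    float sbt[], unsigned long ijbt[], float thresh, unsigned long nmax,
    float sc[], unsigned long ijc[])
{
    sprstp(sb,ijb,sbt,ijbt);                      /* B^T in row-indexed storage */
    sprstm(sa,ija,sbt,ijbt,thresh,nmax,sc,ijc);   /* A.(B^T)^T = A.B, thresholded */
}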
The two implementing routines, sprspm for "pattern multiply" and sprstm for "threshold multiply", are quite similar in structure. Both are complicated by the logic of the various combinations of diagonal or off-diagonal elements for the two input streams and output stream.

void sprspm(float sa[], unsigned long ija[], float sb[], unsigned long ijb[],
    float sc[], unsigned long ijc[])
/* Matrix multiply A . B^T where A and B are two sparse matrices in row-index storage mode, and
B^T is the transpose of B.  Here, sa and ija store the matrix A; sb and ijb store the matrix B.
This routine computes only those components of the matrix product that are pre-specified by the
input index array ijc, which is not modified.  On output, the arrays sc and ijc give the product
matrix in row-index storage mode.
For sparse matrix multiplication, this routine will often be preceded by a call to sprstp, so as
to construct the transpose of a known matrix into sb, ijb. */
{
    void nrerror(char error_text[]);
    unsigned long i,ijma,ijmb,j,m,ma,mb,mbb,mn;
    float sum;

    if (ija[1] != ijb[1] || ija[1] != ijc[1])
        nrerror("sprspm: sizes do not match");
    for (i=1;i<=ijc[1]-2;i++) {           /* Loop over rows. */
        j=m=i;                            /* Set up so that first pass through loop does the */
        mn=ijc[i];                        /* diagonal component. */
        sum=sa[i]*sb[i];
        for (;;) {                        /* Main loop over each component to be output. */
            mb=ijb[j];
            for (ma=ija[i];ma<=ija[i+1]-1;ma++) {
                /* Loop through elements in A's row.  Convoluted logic, following, accounts for
                the various combinations of diagonal and off-diagonal elements. */
                ijma=ija[ma];
                if (ijma == j) sum += sa[ma]*sb[j];
                else {
                    while (mb < ijb[j+1]) {
                        ijmb=ijb[mb];
                        if (ijmb == i) {
                            sum += sa[i]*sb[mb++];
                            continue;
                        } else if (ijmb < ijma) {
                            mb++;
                            continue;
                        } else if (ijmb == ijma) {
                            sum += sa[ma]*sb[mb++];
                            continue;
                        }
                        break;
                    }
                }
            }
            for (mbb=mb;mbb<=ijb[j+1]-1;mbb++) {  /* Exhaust the remainder of B's row. */
                if (ijb[mbb] == i) sum += sa[i]*sb[mbb];
            }
            sc[m]=sum;
            sum=0.0;                      /* Reset indices for next pass through loop. */
            if (mn >= ijc[i+1]) break;
            j=ijc[m=mn++];
        }
    }
}

#include <math.h>

void sprstm(float sa[], unsigned long ija[], float sb[], unsigned long ijb[],
    float thresh, unsigned long nmax, float sc[], unsigned long ijc[])
/* Matrix multiply A . B^T where A and B are two sparse matrices in row-index storage mode, and
B^T is the transpose of B. */