roots (1185385), страница 3
Текст из файла (страница 3)
The values of theExample: Trajan Data Set F 15macro variables &max and &N have not changed, so you do not need to generate them a second time. You doneed to capture the value of the negative binomial dispersion parameter in a macro variable, as demonstratedin the following DATA step:data zinbparms;set zinbparms(where=(Parameter="Dispersion"));keep estimate;call symput('k',estimate);run;You compute the maximum likelihood estimates of the marginal probabilities the same way as before exceptthat you now specify a negative binomial distribution in the PDF function; this is where the macro variable&k that contains the negative binomial dispersion parameter is used.data zinb(drop= i);set zinb;lambda=pred/(1-pzero);k=&k;array ep{0:&max} ep0-ep&max;array c{0:&max} c0-c&max;do i = 0 to &max;if i=0 then ep{i}= pzero + (1-pzero)*pdf('NEGBINOMIAL',i,(1/(1+k*lambda)),(1/k));elseep{i}=(1-pzero)*pdf('NEGBINOMIAL',i,(1/(1+k*lambda)),(1/k));c{i}=ifn(roots=i,1,0);end;run;The marginal probabilities are computed the same as before (by computing the means of the conditionalprobabilities) except that the input data set name for the MEANS procedure set has changed from ZIP toZINB.
The SAS statements that reshape the output data sets are the same as before except that the name of thedata set that contains the results is now called Zinbprob, the variable that contains the estimated probabilitiesis called Zinb, and the variable that contains the difference between the observed relative frequencies and theestimated probabilities is named Zinbdiff.proc means data=zinb noprint;var ep0 - ep&max c0-c&max;output out=ep(drop=_TYPE_ _FREQ_) mean(ep0-ep&max)=ep0-ep&max;output out=p(drop=_TYPE_ _FREQ_) mean(c0-c&max)=p0-p&max;run;proc transpose data=ep out=ep(rename=(col1=zinb) drop=_NAME_);run;proc transpose data=p out=p(rename=(col1=p) drop=_NAME_);run;data zinbprob;merge ep p;zinbdiff=p-zinb;16 Froots=_N_ -1;label zinb='ZINB Probabilities'p='Relative Frequencies'zinbdiff='Observed minus Predicted';run;proc sgplot data=zinbprob;scatter x=roots y=p /markerattrs=(symbol=CircleFilled size=5px color=blue);scatter x=roots y=zinb /markerattrs=(symbol=TriangleFilled size=5px color=red);xaxis type=discrete;run;proc sgplot data=zinbprob;series x=roots y=zinbdiff /lineattrs=(pattern=ShortDash color=blue)markers markerattrs=(symbol=CircleFilled size=5px color=blue);refline 0/ axis=y;xaxis type=discrete;run;Figure 6 displays the comparative plots for the ZINB model.Figure 6 Comparison of ZINB Probabilities to Observed Relative FrequenciesZINB Probabilities versus Relative FrequenciesObserved Relative Frequencies Minus ZINB ProbabilitiesYou can also produce a plot that enables you to visually compare the fits of the ZIP and ZINB models.
To dothis, you merge the two data sets Zipprob and Zinbprob and plot the differences between the observed relativefrequencies and the estimated marginal probabilities for both the ZIP and ZINB models.The following DATA step merges the data sets Zipprob and Zinbprob, and then the SGPLOT procedureproduces the comparative plot:data compare;merge zipprob zinbprob;by roots;run;Example: Trajan Data Set F 17proc sgplot data=compare;series x=roots y=zinbdiff /lineattrs=(pattern=Solid color=red)markers markerattrs=(symbol=TriangleFilled color=red)legendlabel="ZINB";series x=roots y=zipdiff /lineattrs=(pattern=ShortDash color=blue)markers markerattrs=(symbol=StarFilled color=blue)legendlabel="ZIP";refline 0/ axis=y;xaxis type=discrete;run;Inspection of Figure 7 does not reveal any clear indication that one model fits better than the other.Figure 7 Comparative Fit of ZIP and ZINB ModelsThe cumulative evidence suggests that the ZINB model provides an adequate fit to the data and that it isat least as good as, or superior to, the ZIP model for these data.
With no evidence of overdispersion, it isreasonable to assume that the standard errors of the ZINB model’s parameter estimates are unbiased and thatthe model’s estimates are suitable for statistical inference.It was clear from the graphical evidence at the outset that Photoperiod has a significant effect, and this issupported by the ZINB model results. The model also indicates that BAP is a significant predictor of thenumber of roots; but with both main and interaction effects, the relationship between the number of roots andthe level of BAP is not readily apparent at first glance.
An effect plot provides a useful graphical summaryof the relationship between a model’s prediction and categorial predictor. For most models that you can fitwith PROC GENMOD, you can request an effect plot by using the EFFECTPLOT statement. However, theEFFECTPLOT statement in PROC GENMOD in SAS/STAT 12.1 is not designed for use with zero-inflatedmodels.
Nevertheless, you can create an effect plot manually by using the following SAS statements:18 Fproc sort data=zinb out=zinb;by photoperiod bap;run;proc means data=zinb;var pred;by photoperiod bap;output out=effects mean(pred)=pred;run;proc sgpanel data=effects;panelby photoperiod;series x=bap y=pred / markersmarkerattrs=(symbol=CircleFilled size=5px color=red);colaxis type=discrete;run;Figure 8 clearly shows that BAP has a negative, linear effect on the expected number of roots whenPhotoperiod = 16. However, the effect of BAP when Photoperiod = 8 is more complex; it appears to benonlinear, first increasing, and then decreasing.Figure 8 Effect of BAP by PhotoperiodReferencesRidout, M.
S., Hinde, J. P., and Demétrio, C. G. B. (1998), “Models for Count Data with Many Zeros,” inProceedings of the 19th International Biometric Conference, 179–192, Cape Town..