2.3.6.5. Data analysis and model validation |
|
First step - plot the calibration data | If the model for the calibration curve is not known from theoretical considerations or experience, it is necessary to identify and validate a model for the calibration curve. To begin this process, the calibration data are plotted as a function of known values of the reference standards; this plot should suggest a candidate model for describing the data. A linear model should always be a consideration. If the responses and their known values are in the same units, a plot of differences between responses and known values is more informative than a plot of the data for exposing structure in the data. |
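As a minimal sketch of the differencing step, assuming the responses and the reference values are in the same units, the snippet below computes the differences that would be plotted against the known values. The function name `differences` and the data values are hypothetical, chosen only for illustration.

```python
def differences(known_values, responses):
    """Return response minus known value for each calibration point."""
    if len(known_values) != len(responses):
        raise ValueError("known_values and responses must have equal length")
    return [y - x for x, y in zip(known_values, responses)]


# Hypothetical calibration data in the same units (not from the handbook).
known = [1.0, 2.0, 3.0, 4.0, 5.0]
measured = [1.02, 2.03, 3.01, 3.96, 4.90]

for x, d in zip(known, differences(known, measured)):
    # Plotting d against x exposes curvature that a raw y-versus-x plot
    # can hide, because the dominant straight-line trend is removed.
    print(f"{x:4.1f}  {d:+.3f}")
```

Any plotting tool can then be used to graph the second column against the first.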
Warning - regarding statistical software | Once an initial model has been chosen, the coefficients in the model are estimated from the data using a statistical software package. It is impossible to over-emphasize the importance of using reliable and documented software for this analysis. |
Output required from a software package |
The software package will use the method of least squares for estimating the coefficients. The software package should also be capable of performing a 'weighted' fit for situations where errors of measurement are non-constant over the calibration interval. The choice of weights is usually the responsibility of the user. The software package should, at the minimum, provide the following information:
|
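The least-squares and weighted-fit ideas described above can be sketched as follows. This is an illustration only, not the algorithm of any particular statistical package: it solves the weighted normal equations directly, which is adequate for a low-degree calibration curve but is less numerically robust than the methods used in documented software. The function names are hypothetical, and the choice of weights (for example, inversely proportional to the measurement variance at each point) is an assumption left to the user.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations act on both sides at once.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("normal equations are singular")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def weighted_polyfit(x, y, degree, weights=None):
    """Weighted least-squares polynomial fit; returns [a, b, c, ...]."""
    if weights is None:
        weights = [1.0] * len(x)  # equal weights: ordinary least squares
    p = degree + 1
    # Normal equations (X' W X) beta = X' W y, where X[i][j] = x_i ** j.
    xtwx = [[sum(w * xi ** (j + k) for xi, w in zip(x, weights))
             for k in range(p)] for j in range(p)]
    xtwy = [sum(w * yi * xi ** j for xi, yi, w in zip(x, y, weights))
            for j in range(p)]
    return solve_linear_system(xtwx, xtwy)


# Hypothetical noise-free data generated from y = 1 + 2*x + 3*x*x.
x_demo = [0.0, 1.0, 2.0, 3.0, 4.0]
y_demo = [1.0, 6.0, 17.0, 34.0, 57.0]
print(weighted_polyfit(x_demo, y_demo, degree=2))
```

Because the demonstration data lie exactly on a quadratic, the fit should recover the generating coefficients to within rounding error, with or without weights.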
Typical analysis of a quadratic fit |
Load cell measurements are modeled as a quadratic function of known loads as shown below. There are three repetitions at each of 11 load levels for a total of 33 measurements.
Parameter estimates for model y = a + b*x + c*x*x + e:

Parameter    Estimate     Std. Error     t-value    Pr(>|t|)
a           -1.840e-05    2.451e-05       -0.751    0.459
b            1.001e-01    4.839e-06    20687.891    <2e-16
c            7.032e-06    2.014e-07       34.922    <2e-16

Residual standard error = 3.764e-05 (30 degrees of freedom)
Multiple R-squared = 1
Adjusted R-squared = 1

Analysis of variance table:

Source of        Degrees of   Sum of        Mean
Variation        Freedom      Squares       Square        F-Ratio    Pr(>F)
Model             2           12.695        6.3475        4.48e+09   <2.2e-16
Residual         30           4.2504e-08    1.4170e-09
(Lack of fit)     8           4.7700e-09    5.9625e-10    0.3477     0.9368
(Pure error)     22           3.7733e-08    1.7151e-09
Total            32           12.695

The analyses shown above can be reproduced using Dataplot code and R code. The reader can download the data as a text file.

Note: Dataplot reports a probability associated with the F-ratio (for example, 6.334 % for the lack-of-fit test), where a probability greater than 95 % indicates an F-ratio that is significant at the 5 % level. R reports a p-value that corresponds to the probability greater than the F-ratio, so a value less than 0.05 would indicate significance at the 5 % level. Other software may report in other ways; therefore, it is necessary to check the interpretation for each package. |
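The two reporting conventions described in the note can be reconciled with a short conversion. The sketch below assumes that the Dataplot figure is a cumulative probability expressed in percent and that the R figure is the upper-tail probability; the function names are hypothetical.

```python
def cumulative_percent_to_p_value(cumulative_percent):
    """Convert a cumulative probability in percent to an upper-tail p-value."""
    return 1.0 - cumulative_percent / 100.0


def significant_from_cumulative_percent(cumulative_percent, level=0.05):
    """Dataplot-style check: significant if the probability exceeds 95 %."""
    return cumulative_percent > 100.0 * (1.0 - level)


def significant_from_p_value(p_value, level=0.05):
    """R-style check: significant if the p-value is below the level."""
    return p_value < level


# Lack-of-fit test for the load cell example, in both conventions.
dataplot_value = 6.334   # percent, as quoted in the note
r_value = 0.9368         # Pr(>F), as shown in the analysis of variance table

print(cumulative_percent_to_p_value(dataplot_value))
print(significant_from_cumulative_percent(dataplot_value))
print(significant_from_p_value(r_value))
```

The converted Dataplot value is close to the p-value reported by R, and both conventions lead to the same conclusion: the lack-of-fit F-ratio is not significant at the 5 % level.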
The F-ratio is used to test the goodness of the fit to the data | The lack-of-fit F-ratio (the lack-of-fit mean square divided by the pure-error mean square) indicates whether the model is an adequate descriptor of the data. The F-ratio is compared with a critical value from the F-table. An F-ratio smaller than the critical value indicates that all significant structure has been captured by the model. |
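When there are repetitions at each reference value, the residual sum of squares can be split into a pure-error part and a lack-of-fit part, and the F-ratio is formed from their mean squares. The following is a minimal sketch of that arithmetic; the function name, the data, and the fitted line are hypothetical, and the fitted values are assumed to come from a least-squares fit with `n_params` coefficients.

```python
from collections import defaultdict


def lack_of_fit_from_data(x, y, fitted, n_params):
    """Return (F-ratio, lack-of-fit df, pure-error df) for replicated data."""
    # Group the responses by reference value to find the repetitions.
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    n = len(y)
    n_levels = len(groups)

    # Pure error: scatter of the repetitions about their own level mean.
    ss_pure_error = 0.0
    for ys in groups.values():
        mean = sum(ys) / len(ys)
        ss_pure_error += sum((v - mean) ** 2 for v in ys)

    # Residual sum of squares about the fitted curve.
    ss_residual = sum((yi - fitted(xi)) ** 2 for xi, yi in zip(x, y))

    # Whatever the repetitions cannot explain is attributed to lack of fit.
    ss_lack_of_fit = ss_residual - ss_pure_error
    df_lack_of_fit = n_levels - n_params
    df_pure_error = n - n_levels
    f_ratio = ((ss_lack_of_fit / df_lack_of_fit)
               / (ss_pure_error / df_pure_error))
    return f_ratio, df_lack_of_fit, df_pure_error


# Hypothetical data: two repetitions at each of four levels. The
# least-squares straight line for these values is y = x, which misses
# the zigzag pattern in the level means.
x_rep = [1, 1, 2, 2, 3, 3, 4, 4]
y_rep = [1.0, 1.2, 1.6, 1.8, 3.2, 3.4, 3.8, 4.0]

result = lack_of_fit_from_data(x_rep, y_rep, fitted=lambda v: float(v),
                               n_params=2)
print(result)
```

A large F-ratio, as in this hypothetical case, signals structure that the model has not captured; the value would then be compared with the critical value for the returned degrees of freedom.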
F-ratio < 1 always indicates a good fit | For the load cell analysis, a plot of the data suggests a linear fit. However, the linear fit gives a very large lack-of-fit F-ratio. For the quadratic fit, the lack-of-fit F-ratio is 0.3477 with v1 = 8 and v2 = 22 degrees of freedom. Because this is well below the critical value F(0.05, 8, 20) = 2.45, the quadratic function is sufficient for describing the data. (The critical value is quoted for 20 rather than 22 denominator degrees of freedom, presumably the nearest tabulated entry; the conclusion is unaffected.) A fact to keep in mind is that an F-ratio < 1 does not need to be checked against a critical value; it always indicates a good fit to the data. |
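The lack-of-fit F-ratio for the quadratic fit can be recomputed from the sums of squares and degrees of freedom in the analysis of variance table. The sketch below does only that arithmetic; the function name is hypothetical, and the critical value is taken from the text rather than computed.

```python
def f_ratio_from_anova(ss_lack_of_fit, df_lack_of_fit,
                       ss_pure_error, df_pure_error):
    """Form the lack-of-fit F-ratio from the two mean squares."""
    ms_lack_of_fit = ss_lack_of_fit / df_lack_of_fit
    ms_pure_error = ss_pure_error / df_pure_error
    return ms_lack_of_fit / ms_pure_error


# Values copied from the analysis of variance table for the quadratic fit.
f_quadratic = f_ratio_from_anova(4.7700e-09, 8, 3.7733e-08, 22)

# Critical value F(0.05, 8, 20) as quoted in the text.
critical_value = 2.45

print(f"F-ratio = {f_quadratic:.4f}")
print("fit adequate" if f_quadratic < critical_value else "lack of fit")
```

The recomputed ratio agrees with the tabulated 0.3477 to the precision of the printed sums of squares.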
The t-values are used to test the significance of individual coefficients | The t-values can be compared with critical values from a t-table. However, for a test at the 5 % significance level, a t-value whose absolute value is < 2 is a good indicator of non-significance. The t-value for the intercept term, a, is -0.751; its absolute value is < 2, indicating that the intercept term is not significantly different from zero. The t-values for the linear and quadratic terms are significant, indicating that these coefficients are needed in the model. If the intercept is dropped from the model, the analysis is repeated to obtain new estimates for the coefficients, b and c. |
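Each t-value is the coefficient estimate divided by its standard error. The sketch below applies the rule of thumb from the text to the printed estimates; the function name is hypothetical, and the threshold of 2 is the approximate 5 % criterion stated above, not an exact critical value.

```python
def t_values(coefficients):
    """Map each coefficient name to estimate / standard error."""
    return {name: est / se for name, (est, se) in coefficients.items()}


# Estimates and standard errors copied from the parameter table.
load_cell_fit = {
    "a": (-1.840e-05, 2.451e-05),
    "b": (1.001e-01, 4.839e-06),
    "c": (7.032e-06, 2.014e-07),
}

for name, t in t_values(load_cell_fit).items():
    # The sign of t is irrelevant for a two-sided test, so use |t|.
    verdict = "not significant" if abs(t) < 2 else "significant"
    print(f"{name}: t = {t:.3f} ({verdict})")
```

The recomputed values for b and c differ slightly from the printed t-values, presumably because the printed estimates and standard errors are rounded.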
Residual standard deviation | The residual standard deviation (3.764e-05 with 30 degrees of freedom in the output above) estimates the standard deviation of a single measurement with the load cell. |
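The residual standard deviation is the square root of the residual mean square, that is, the residual sum of squares divided by its degrees of freedom. A minimal check against the analysis of variance table, with a hypothetical function name, is:

```python
import math


def residual_standard_deviation(ss_residual, df_residual):
    """Square root of the residual mean square."""
    return math.sqrt(ss_residual / df_residual)


# Residual sum of squares and degrees of freedom from the ANOVA table.
s = residual_standard_deviation(4.2504e-08, 30)
print(f"{s:.4e}")
```

The result agrees with the reported residual standard error of 3.764e-05 to the printed precision.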
Further considerations and tests of assumptions |
The residuals (differences between the measurements and their fitted values) from the fit should also be examined for outliers and structure that might invalidate the calibration curve. They are also a good indicator of whether basic assumptions of normality and equal precision for all measurements are valid.
If the initial model proves inappropriate for the data, a strategy for improving the model is followed. |
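A simple first screen of the residuals can be sketched as follows. This is an illustration under stated assumptions, not a prescribed procedure: residuals are scaled by the residual standard deviation without any leverage correction, and the cutoff of 3 is an arbitrary choice made for the example. The function names are hypothetical. Graphical examination of the residuals remains necessary for detecting structure.

```python
import math


def standardized_residuals(y, fitted, n_params):
    """Residuals divided by the residual standard deviation."""
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    df = len(y) - n_params
    s = math.sqrt(sum(r * r for r in residuals) / df)
    return [r / s for r in residuals]


def flag_outliers(std_resid, threshold=3.0):
    """Indices whose standardized residual exceeds the threshold."""
    return [i for i, r in enumerate(std_resid) if abs(r) > threshold]


# Hypothetical fitted values and measurements: small alternating
# residuals at every point except index 7, which is grossly in error.
fitted_demo = [float(i) for i in range(15)]
y_demo_resid = [f + (0.01 if i % 2 == 0 else -0.01)
                for i, f in enumerate(fitted_demo)]
y_demo_resid[7] = fitted_demo[7] + 0.5

z = standardized_residuals(y_demo_resid, fitted_demo, n_params=3)
print(flag_outliers(z))
```

A flagged point should be investigated before it is discarded, since a single large residual also inflates the residual standard deviation used for the scaling.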