In his letter concerning our article "A Method of Visual Interactive Regression" (1), Robert de Levie raises several points about the accuracy of our model and the data generated. Contrary to his claim that our analysis is incorrect, we maintain that it is valid, for the following reasons.

The point at the origin (0,0) given in the first two columns for Figure 1 is not real data. We placed it as a dummy data point for graphing purposes, to show how well the least-squares line through the five real data points extends to intercept the y-axis at x = 0. Otherwise, the software program would not extend the line to x = 0. The dummy data point could have been hidden by making it transparent.

Because the spectrophotometer was zeroed in absorbance (or adjusted to 100% T) at zero concentration at the very beginning of the measurements, one often tends to include the origin point (0,0) in the data. Whether that is the best practice, however, is arguable and depends on the conditions of the measurements. That is why experimenters often do not include the origin point in practice, even though instrument output is usually zeroed at zero concentration (2).

The output signal (S) from an analytical instrument can be written in the following form, which is the most general for any linear system (2):

$$S = mC + S_b$$

where m (the slope) is the sensitivity of the instrument, C is the concentration of analyte, and $S_b$ is the signal from a blank. The output signal S can be a spectrophotometric absorbance, a fluorimetric intensity, an electroanalytical current or charge, or a chromatographic peak area, among other possibilities. The Beer–Lambert law (S = mC) holds only under the assumption that the blank signal $S_b$ is exactly zero.
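The practical consequence of the linear response model above can be sketched in a few lines of Python. This is only an illustration: the slope and blank values are borrowed from method A in Table 1 purely to set a realistic scale, the "true" concentration is a hypothetical number, and treating the fitted intercept as a blank signal is an assumption made for the example.

```python
def signal(conc, m, s_blank=0.0):
    """General linear response S = m*C + Sb."""
    return m * conc + s_blank


def conc_assuming_zero_blank(s, m):
    """Invert the Beer-Lambert form S = m*C (blank assumed to be zero)."""
    return s / m


def conc_with_blank(s, m, s_blank):
    """Invert the general form S = m*C + Sb."""
    return (s - s_blank) / m


# Illustrative values only (assumptions, not measurements).
m = 3.99            # sensitivity, scale taken from method A in Table 1
s_blank = 0.014     # blank signal, scale taken from method A in Table 1
true_conc = 0.050   # hypothetical analyte concentration in mol/L

s = signal(true_conc, m, s_blank)
naive = conc_assuming_zero_blank(s, m)        # ignores the blank signal
corrected = conc_with_blank(s, m, s_blank)    # accounts for the blank signal
bias = naive - true_conc                      # equals s_blank / m

print(f"signal = {s:.4f}")
print(f"concentration assuming Sb = 0: {naive:.5f}")
print(f"concentration with Sb: {corrected:.5f}")
print(f"bias from ignoring Sb: {bias:.5f}")
```

In this sketch, ignoring a nonzero blank shifts every back-calculated concentration by the constant $S_b/m$, which is the kind of systematic error the two-parameter form is meant to absorb.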
Inclusion of an origin point is reliable only when a system (instrument) is absolutely stable and free of interference from other species similar to the analyte. Thus, inclusion of the origin point (0,0) is prone to introduce additional error for a complicated system. In actual practice (rather than theory), blanks are rarely a pure solvent: they often contain species that can interfere with the signal from the analyte. This was not a problem in the Co²⁺ system we used, in which the blank was pure water.

The primary reason we excluded the origin point from the least-squares analysis is that the instrument used in this work, as described in the online portion of our article, was a Spectronic 21 (Milton Roy), a single-beam spectrophotometer. A value of $S_b$ that is zero at the beginning may no longer be zero when a sample is measured, because of possible drift in the background signal. Had we used a double-beam spectrophotometer, in which fluctuations in the background signal are corrected instantly, we could have included the origin point as a reliable data point. Inclusion of the origin point can have adverse effects, especially when the background signal drifts unidirectionally, introducing a systematic error. Thus, we believe that excluding the origin point from the regression for our particular system is the more general and common practice for building a standard curve.

When an experimenter is absolutely sure of the origin point of a system (zero signal at zero concentration), the intercept can be forced to zero, adopting the Beer–Lambert law. We tried this also (method B in Table 1); the results were already presented in Table I of the Web Documentation (as JCE Webware) for comparison with the two-parameter fit, and they show a difference of about 2% in the slope between methods A and B.
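The difference between a two-parameter fit (slope and intercept) and a one-parameter fit forced through the origin can be illustrated with the standard closed-form least-squares expressions. The sketch below is not the spreadsheet procedure of the original article, and the concentration and absorbance values are hypothetical numbers invented for the example, not our measurements.

```python
def fit_line(x, y):
    """Two-parameter least squares for y = slope*x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fit_through_origin(x, y):
    """One-parameter least squares for y = slope*x (intercept fixed at 0)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def ssq(x, y, slope, intercept=0.0):
    """Sum of squared deviations between data and the fitted line."""
    return sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))


# Hypothetical calibration data (assumed for illustration only).
conc = [0.02, 0.04, 0.06, 0.08, 0.10]
absorb = [0.095, 0.172, 0.255, 0.330, 0.414]

slope_a, intercept_a = fit_line(conc, absorb)    # analogous to method A
slope_b = fit_through_origin(conc, absorb)       # analogous to method B

ssq_a = ssq(conc, absorb, slope_a, intercept_a)
ssq_b = ssq(conc, absorb, slope_b)

print(f"two-parameter fit: slope={slope_a:.3f}, intercept={intercept_a:.4f}, SSQ={ssq_a:.6f}")
print(f"one-parameter fit: slope={slope_b:.3f}, SSQ={ssq_b:.6f}")
```

Because the zero-intercept line is a special case of the two-parameter line, the two-parameter SSQ on the same points can never exceed the one-parameter SSQ; whether the extra parameter is justified is a separate question.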
It should be noted that the result from our one-parameter fit with the five data points (method B in Table 1) is practically identical to the result from de Levie's one-parameter fit with the six data points (C in Table 1). If the online portion of the article were more readily accessible (without requiring a subscriber identification number and a password), de Levie might not have raised the question.

Table 2 summarizes the results from various model equations and methods of fitting: the sums of squared deviations (SSQ) and the averages of squared deviations (ASQ = SSQ/n, where n is the number of data points). In the examples fitting five data points, the origin point (0,0) is not treated as a data point; in the examples fitting six data points, the origin point is treated as a data point. When ASQ was used as the criterion of fit, method A yielded the best result among the linear fits (Table 1), with a value of 2.08 × 10⁻⁴. However, as de Levie correctly pointed out, the system clearly deviates from the Beer–Lambert law at higher concentrations of Co²⁺ ions (Figure 1), a point that was omitted from our original article. Thus, a fit to a quadratic equation appears to be better than the linear fit, as evidenced by the much smaller (by about a factor of five) ASQ value. To be more rigorous, however, a constant term (c) can be added for the reasons discussed previously (i.e., $S_b$ is not necessarily zero):

$$A = c + kC + k'C^2$$

The results are summarized in Table 2 for comparison. As far as ASQ is concerned, method E (the three-parameter fit with five data points and without the origin point) appears to be the best fit because it yielded the smallest value: 6.4 × 10⁻⁶.

Finally, we would like to emphasize that our original article was written primarily to demonstrate a new visual way of performing regression using spreadsheet software.
Even though it was not our intention to search for the best model equation for the fit, this letter to the editor took our work one step further: the quadratic models fit better than the linear models over the entire concentration range because of the deviation from the Beer–Lambert law at higher concentrations. The linear model still appears valid for concentrations below 0.1 M (Figure 1).
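The three-parameter quadratic model discussed above, A = c + kC + k'C², can be fitted by ordinary least squares through the normal equations. The following is a minimal sketch in plain Python under stated assumptions: the helper names are hypothetical, the data are invented for illustration, and this is not the visual spreadsheet method of the original article.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix copy
    for col in range(n):
        # Move the row with the largest pivot candidate into position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        partial = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - partial) / aug[i][i]
    return x


def fit_quadratic(conc, absorb):
    """Least-squares fit of A = c + k*C + k2*C**2; returns (c, k, k2)."""
    # Power sums of C (orders 0..4) and moments of A against C (orders 0..2).
    s = [sum(ci ** p for ci in conc) for p in range(5)]
    t = [sum(ai * ci ** p for ci, ai in zip(conc, absorb)) for p in range(3)]
    normal = [[s[i + j] for j in range(3)] for i in range(3)]
    return tuple(solve_linear_system(normal, t))


# Hypothetical data (assumed for illustration only).
conc = [0.02, 0.05, 0.10, 0.15, 0.20]
absorb = [0.12, 0.26, 0.47, 0.65, 0.80]

c, k, k2 = fit_quadratic(conc, absorb)
residuals = [ai - (c + k * ci + k2 * ci ** 2) for ci, ai in zip(conc, absorb)]
ssq_quadratic = sum(r * r for r in residuals)

print(f"c={c:.4f}, k={k:.3f}, k'={k2:.3f}, SSQ={ssq_quadratic:.6f}")
```

Fixing c at zero (as in method D) would reduce this to a two-parameter problem built from the same power sums with the constant row and column dropped.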
Figure 1. Comparison of results from several different fits of data points (see Tables 1 and 2). Only results from methods A, D, and F are shown, to increase graphical clarity.
Table 1. Comparison of Results from Several Linear Fits of Beer–Lambert Data Points, by Method of Fitting

| Method and Parameters | Slope Values | Intercept Values | Sums of Squared Deviations | Averages of Squared Deviations | Work Citation |
| --- | --- | --- | --- | --- | --- |
| A: 5 data points, 2 parameters | 3.99 | 0.014 | 0.001040 | 2.08 × 10⁻⁴ | Ref 1 |
| B: 5 data points, 1 parameter | 4.08 | 0.00 (fixed) | 0.001309 | 2.62 × 10⁻⁴ | Ref 1, online portion of the article |
| C: 6 data points, 1 parameter | 4.084 (±0.059) | 0.00 (fixed) | 0.001308 | 2.18 × 10⁻⁴ | de Levie's letter (preceding page) |
Table 2. Comparison of Results from Several Quadratic Fits of Beer–Lambert Data Points, by Method of Fitting

| Method and Parameters | Slope Values (k) | Curvature Values (k') | Constant Values (c) | Sums of Squared Deviations | Averages of Squared Deviations | Work Citation |
| --- | --- | --- | --- | --- | --- | --- |
| D: 6 data points, 2 parameters | 4.53 | 2.67 | 0.00 (fixed) | 0.000297 | 4.95 × 10⁻⁵ | de Levie's letter (preceding page) |
| E: 5 data points, 3 parameters | 5.008 | 4.579 | 0.0239 | 0.000032 | 6.4 × 10⁻⁶ | This letter |
| F: 6 data points, 3 parameters | 4.678 | 3.274 | 0.007547 | 0.000212 | 2.18 × 10⁻⁵ | This letter |
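The ASQ criterion used in the tables is simply SSQ divided by the number of fitted data points. As a quick arithmetic sketch, the snippet below recomputes ASQ for methods A through E from the tabulated SSQ values; the point count for method C is taken as six, as stated in the text.

```python
def asq(ssq, n_points):
    """Average squared deviation: ASQ = SSQ / n."""
    return ssq / n_points


# (SSQ, number of data points) as reported for each method.
reported = {
    "A": (0.001040, 5),
    "B": (0.001309, 5),
    "C": (0.001308, 6),  # six points, per the text of the letter
    "D": (0.000297, 6),
    "E": (0.000032, 5),
}

asq_values = {method: asq(ssq, n) for method, (ssq, n) in reported.items()}

for method, value in asq_values.items():
    print(f"method {method}: ASQ = {value:.3e}")
```

Because ASQ normalizes by n, it allows fits over five and six points to be compared on the same footing, although it does not account for the number of adjustable parameters.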
Literature Cited
1. Kim, M. S.; Burkart, M.; Kim, M.-H. J. Chem. Educ. 2006, 83, 1884.
2. Skoog, D. A. Principles of Instrumental Analysis, 3rd ed.; Saunders College Publishing: Philadelphia, PA, 1985; p 19, p 23.