7.6 Determining the grids in practice

Based on the theory developed in the previous sections, I present here a practical set of algorithms for determining both the optimal data binning and the model energy grid. This may be helpful when developing software that constructs the response matrix for a particular instrument.

7.6.1 Creating the data bins

Given an observed spectrum obtained by some instrument, the following steps should be performed in order to generate an optimally binned spectrum.

  1. Determine for each original data channel i the nominal energy Ej0, defined as the energy for which the response at channel i reaches its maximum value. In most cases this will be the nominal channel energy.
  2. Determine for each data channel i the limiting points $(i_1, i_2)$ for the FWHM, in such a way that $R_{k,j0} \ge 0.5\,R_{i,j0}$ for all $i_1 \le k \le i_2$ while the range $(i_1, i_2)$ is as broad as possible.
  3. By (linear) interpolation, determine for each data channel the points (fractional channel numbers) c1 and c2 near i1 and i2 where the response is actually half its maximum value. By virtue of the previous step, the absolute differences |c1 - i1| and |c2 - i2| can never exceed 1.
  4. Determine for each data channel i the FWHM in number of channels ci, by calculating c2 - c1. Ensure that ci is at least 1.
  5. Determine for each original data channel i the FWHM in energy units (e.g. in keV). Call this Wi. This and the previous steps may of course also be performed directly using instrument calibration data.
  6. Determine the number of resolution elements R by the following approximation:

     \[ R = \sum_i \frac{1}{c_i}. \tag{7.70} \]

  7. Determine the parameter λk(R) using (7.28).
  8. Determine for each bin the effective number of events Nr from the following expressions:

     \begin{align}
     C_r &= \sum_{k=i_1-1}^{i_2+1} C_k, \tag{7.71} \\
     h_r &= \sum_{k=1}^{N_c} R_{k,j0} \Big/ \sum_{k=i_1-1}^{i_2+1} R_{k,j0}, \tag{7.72} \\
     N_r &= C_r h_r. \tag{7.73}
     \end{align}

    In the above, Ck is the number of observed counts in channel k, and Nc is the total number of channels. Take care that the summation limits i1 - 1 and i2 + 1 stay within the valid range (1,Nc). If for some reason no first-order approximation of the response matrix Rk,j is available, one might simply approximate hr using e.g. the Gaussian approximation, namely hr = 1.314, cf. section 2. This is justified since the optimum bin size is not a strong function of Nr, cf. fig. 7.4: even a factor of two error in Nr in most cases does not affect the optimal binning much.

  9. Calculate $\delta = \lambda_k(R)/\sqrt{N_r}$ for each data channel.
  10. Using (7.33), determine for each data channel the optimum data bin size in units of the FWHM. The true bin size bi in number of data channels is obtained by multiplying this by ci, calculated in step 4. Make bi an integer by discarding the decimals (rounding down), but take care that bi is at least 1.
  11. It is now time to merge the data channels into bins. Loop over the data channels, starting with the first, and call the current channel i. In principle, all channels k from i to i + bi - 1 are taken together. However, check that the bin size does not decrease significantly over the rebinning range: determine the minimum ai of k + bk over all k between i and i + bi - 1, and extend the summation only from channel i to ai - 1. In the next step of the merging, ai becomes the new starting value i. The process is finished when ai exceeds Nc.
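
As an illustration, steps 2-4, 6, 8 and 11 above might be sketched in Python as follows. This is a minimal sketch, not a reference implementation: it assumes NumPy, uses 0-based channel indices (so the valid range is 0 to Nc - 1), and assumes that the column of the response at the nominal energy is available as a one-dimensional array. All function names are hypothetical and chosen for this example only.

```python
import numpy as np

def half_max_range(resp_col):
    """Steps 2-4: limits (i1, i2) and FWHM c_i in channels (0-based indices)."""
    r = np.asarray(resp_col, dtype=float)
    n = len(r)
    peak = int(np.argmax(r))
    half = 0.5 * r[peak]
    # Step 2: widest contiguous range around the peak with response >= half maximum.
    i1 = i2 = peak
    while i1 > 0 and r[i1 - 1] >= half:
        i1 -= 1
    while i2 < n - 1 and r[i2 + 1] >= half:
        i2 += 1
    # Step 3: linear interpolation to the fractional half-maximum crossings.
    c1 = float(i1)
    if i1 > 0:
        c1 = i1 - (r[i1] - half) / (r[i1] - r[i1 - 1])
    c2 = float(i2)
    if i2 < n - 1:
        c2 = i2 + (r[i2] - half) / (r[i2] - r[i2 + 1])
    # Step 4: FWHM in channels, forced to be at least 1.
    return i1, i2, max(c2 - c1, 1.0)

def resolution_elements(c):
    """Step 6, eq. (7.70): R = sum_i 1/c_i."""
    return float(np.sum(1.0 / np.asarray(c, dtype=float)))

def effective_counts(counts, resp_col, i1, i2):
    """Step 8, eqs. (7.71)-(7.73), with summation limits clipped to the valid range."""
    counts = np.asarray(counts, dtype=float)
    r = np.asarray(resp_col, dtype=float)
    lo = max(i1 - 1, 0)
    hi = min(i2 + 1, len(r) - 1)
    c_r = counts[lo:hi + 1].sum()
    h_r = r.sum() / r[lo:hi + 1].sum()
    return c_r * h_r

def merge_channels(b):
    """Step 11: merge channels into bins, returned as (start, stop) pairs.

    b[k] is the integer optimum bin size (>= 1) for channel k; stop is exclusive.
    """
    b = np.asarray(b, dtype=int)
    n = len(b)
    bins = []
    i = 0
    while i < n:
        k = np.arange(i, min(i + b[i], n))
        # a_i: minimum of k + b_k over the tentative range, clipped to the last channel.
        a = int(min((k + b[k]).min(), n))
        bins.append((i, a))
        i = a
    return bins
```

For example, `merge_channels([4, 3, 1, 1, 1])` stops the first bin after three channels instead of four, because the optimum bin size has already dropped to 1 at the third channel.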

7.6.2 Creating the model bins

After having created the data bins, it is possible to generate the model energy bins. Some of the information obtained from the previous steps that created the data bins is needed.

The following steps need to be taken:

  1. Sort the FWHM of the data bins in energy units (Wi) as a function of the corresponding energies Ej0. Use this array later to interpolate the true FWHM at any energy. Also use the corresponding values of Nr derived during that same stage. Alternatively, one may directly use the FWHM as obtained from calibration files.
  2. Choose an appropriate start and end energy, e.g. the nominal lower and upper energy of the first and last data bin, with an offset of a few FWHMs (for a Gaussian, about 3 FWHM is sufficient). In the case of an lsf with broad wings (like the scattering due to the RGS gratings) it may be necessary to take an even broader energy range.
  3. In a loop over all energies as determined in the previous steps, calculate $\delta = \lambda_k(R)/\sqrt{N_r}$. The value of $\lambda_k(R)$ is the same as used in the determination of the data channel grid.
  4. Determine also the effective area factor $d\ln E / d\ln A$ for each energy; one may do that using a linear approximation.
  5. For the same energies, determine the necessary bin width in units of the FWHM using eqn. (7.61). Combining this with the FWHMs determined above gives for these energies the optimum model bin size ΔE in keV.
  6. Now the final energy grid can be created. Start at the lowest energy E1,1, and interpolate in the ΔE table the appropriate ΔE(E1,1) value for the current energy. The upper bin boundary E2,1 of the first bin is then simply E1,1 + ΔE(E1,1).
  7. Using the recursive scheme E1,j = E2,j-1, E2,j = E1,j + ΔE(E1,j), determine all bin boundaries until the maximum energy has been reached. The bin centroids are simply defined as Ej = 0.5(E1,j + E2,j).
  8. Finally, if there are any sharp edges in the effective area of the instrument, it is necessary to add these edges to the list of bin boundaries. All edges should coincide with bin boundaries.
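
Steps 6-8 above might be sketched in Python as follows. This is a minimal sketch under stated assumptions: it uses NumPy, it assumes that the table of optimum bin sizes ΔE from step 5 is already available as the arrays `e_tab` and `de_tab` (hypothetical names, with `e_tab` increasing), and it handles sharp edges by inserting each edge as an extra bin boundary, which is only one possible way to make edges coincide with boundaries. The function name `build_model_grid` is likewise hypothetical.

```python
import numpy as np

def build_model_grid(e_tab, de_tab, e_min, e_max, edges=()):
    """Steps 6-8: bin boundaries and centroids from a table of optimum bin sizes.

    e_tab, de_tab : energies and optimum bin sizes (same units), e_tab increasing
    e_min, e_max  : start and end energy chosen in step 2
    edges         : energies of sharp effective-area edges (assumed input)
    """
    bounds = [float(e_min)]
    # Steps 6-7: E1,j = E2,j-1 and E2,j = E1,j + dE(E1,j), until e_max is reached.
    while bounds[-1] < e_max:
        de = float(np.interp(bounds[-1], e_tab, de_tab))
        if de <= 0.0:
            raise ValueError("bin size must be positive")
        bounds.append(bounds[-1] + de)
    bounds = np.array(bounds)
    # Step 8: add edges inside the grid as extra boundaries (this splits a bin).
    inside = [e for e in edges if bounds[0] < e < bounds[-1]]
    bounds = np.unique(np.concatenate([bounds, inside]))
    # Bin centroids: Ej = 0.5 (E1,j + E2,j).
    centroids = 0.5 * (bounds[:-1] + bounds[1:])
    return bounds, centroids
```

Note that `np.interp` holds the tabulated value constant outside the range of `e_tab`, so the table should cover the full range from `e_min` to `e_max` if that behaviour is not wanted.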