7.6 Determining the grids in practice

Based upon the theory developed in the previous sections, I present here a practical set of algorithms for determining both the optimal data binning and the model energy grid. This may be helpful when developing software that constructs the relevant response matrix for a particular instrument.

7.6.1 Creating the data bins

Given an observed spectrum obtained by some instrument, the following steps should be performed in order to generate an optimally binned spectrum.

Determine for each original data channel i the nominal energy E_j0, defined as the energy for which the response at channel i reaches its maximum value. In most cases this will be the nominal channel energy.
Determine for each data channel i the limiting points (i1,i2) for the FWHM, in such a way that R_k,j0 ≥ 0.5R_i,j0 for all i1 ≤ k ≤ i2 while the range of (i1,i2) is as broad as possible.
By (linear) interpolation, determine for each data channel the points (fractional channel numbers) c1 and c2 near i1 and i2 where the response is actually half its maximum value. By virtue of the previous step, the absolute differences |c1 - i1| and |c2 - i2| can never exceed 1.
Determine for each data channel i the FWHM in number of channels c_i, by calculating c2 - c1. Ensure that c_i is at least 1.
Determine for each original data channel i the FWHM in energy units (e.g. in keV). Call this W_i. This and the previous steps may of course also be performed directly using instrument calibration data.
Determine the number of resolution elements R by the following approximation:
$R = \sum_i \frac{1}{c_i}.$ (7.70)
Determine the parameter λ_k(R) using (7.28).
Determine for each bin the effective number of events N_r from the following expressions:
$C_r = \sum_{k=i1-1}^{i2+1} C_k,$ (7.71)
$h_r = \sum_{k=1}^{N_c} R_{k,j0} \Big/ \sum_{k=i1-1}^{i2+1} R_{k,j0},$ (7.72)
$N_r = C_r h_r.$ (7.73)

In the above, C_k is the number of observed counts in channel k, and N_c is the total number of channels. Take care that in the summations i1 - 1 and i2 + 1 are not out of their valid range (1,N_c). If for some reason there is not a first-order approximation available for the response matrix R_k,j then one might simply approximate h_r from e.g. the Gaussian approximation, namely h_r = 1.314, cf. section 2. This is justified since the optimum bin size is not a strong function of N_r, cf. fig. 7.4. Even a factor of two error in N_r in most cases does not affect the optimal binning too much.
Calculate δ = λ_k(R)/N_r for each data channel.
Using (7.33), determine for each data channel the optimum data bin size in terms of the FWHM. The true bin size b_i in terms of number of data channels is obtained by multiplying this by c_i, the FWHM in channels calculated above. Make b_i an integer by discarding all decimals (rounding down), but take care that b_i is at least 1.
It is now time to merge the data channels into bins. In a loop over all data channels, start with the first data channel. Name the current channel i. Take in principle all channels k from channel i to i + b_i - 1 together. However, check that the bin size does not decrease significantly over the rebinning range. In order to do that check, determine for all k between i and i + b_i - 1 the minimum a_i of k + b_k. Extend the summation only from channel i to a_i - 1. In the next step of the merging, a_i becomes the new starting value i. The process is finished when a_i exceeds N_c.
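
The channel-level steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used by any particular package: the function names are hypothetical, channels are numbered from 0 rather than 1, `resp` is one column of a first-order response matrix (the values R_k,j0), and the integer bin sizes `b` are assumed to have been obtained already from (7.28) and (7.33), which are not reproduced here.

```python
def half_max_range(resp):
    """Return (i1, i2, width) for one response column.

    resp[k] plays the role of R_{k,j0}: the response in channel k at the
    nominal energy of the channel under study. Indices are 0-based here,
    whereas the text counts channels from 1.
    """
    i0 = max(range(len(resp)), key=lambda k: resp[k])  # channel of maximum
    half = 0.5 * resp[i0]
    # Widen (i1, i2) for as long as the response stays at or above half maximum.
    i1 = i2 = i0
    while i1 > 0 and resp[i1 - 1] >= half:
        i1 -= 1
    while i2 < len(resp) - 1 and resp[i2 + 1] >= half:
        i2 += 1
    # Linear interpolation to the fractional channels c1 and c2 where the
    # response equals half its maximum; |c1 - i1| and |c2 - i2| stay below 1.
    c1, c2 = float(i1), float(i2)
    if i1 > 0:
        c1 = i1 - (resp[i1] - half) / (resp[i1] - resp[i1 - 1])
    if i2 < len(resp) - 1:
        c2 = i2 + (resp[i2] - half) / (resp[i2] - resp[i2 + 1])
    return i1, i2, max(c2 - c1, 1.0)  # the FWHM c_i is at least 1 channel


def resolution_elements(widths):
    """Approximate number of resolution elements, R = sum_i 1/c_i (7.70)."""
    return sum(1.0 / c for c in widths)


def effective_counts(counts, resp, i1, i2):
    """Effective number of events N_r = C_r * h_r, following (7.71)-(7.73)."""
    lo = max(i1 - 1, 0)                 # keep i1 - 1 inside the valid range
    hi = min(i2 + 1, len(resp) - 1)     # keep i2 + 1 inside the valid range
    c_r = sum(counts[lo:hi + 1])
    h_r = sum(resp) / sum(resp[lo:hi + 1])
    return c_r * h_r


def merge_channels(b):
    """Merge channels into bins, given the integer bin size b[i] >= 1
    wanted at each channel. Returns inclusive (first, last) channel pairs."""
    n = len(b)
    bins = []
    i = 0
    while i < n:
        stop = min(i + b[i], n)         # tentative range is i .. i + b_i - 1
        # a_i: the minimum of k + b_k over the tentative range, so that the
        # bin is cut short where the wanted bin size drops within the range.
        a = min(min(k + b[k] for k in range(i, stop)), n)
        bins.append((i, a - 1))
        i = a                           # a_i is the next starting channel
    return bins
```

For instance, `merge_channels([4, 4, 1, 4, 4])` returns `[(0, 2), (3, 4)]`: the first bin is cut short because channel 2 asks for a bin size of only 1, which is the check on decreasing bin size described in the last step.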

7.6.2 Creating the model bins

After having created the data bins, it is possible to generate the model energy bins. Some of the information obtained from the previous steps that created the data bins is needed.

The following steps need to be taken:

Sort the FWHM of the data bins in energy units (W_i) as a function of the corresponding energies E_j0. Use this array to interpolate later any true FWHM. Also use the corresponding values of N_r derived during that same stage. Alternatively, one may use directly the FWHM as obtained from calibration files.
Choose an appropriate start and end energy, e.g. the nominal lower and upper energy of the first and last data bin, with an offset of a few FWHMs (for a Gaussian, about 3 FWHM is sufficient). In the case of an LSF with broad wings (like the scattering due to the RGS gratings) it may be necessary to take an even broader energy range.
In a loop over all energies as determined in the previous steps, calculate δ = λ_k(R)/N_r. The value of λ_k(R) is the same as used in the determination of the data channel grid.
Determine also the effective area factor d ln E / d ln A for each energy; one may do that using a linear approximation.
For the same energies, determine the necessary bin width in units of the FWHM using eqn. (7.61). Combining this with the FWHMs determined above gives for these energies the optimum model bin size ΔE in keV.
Now the final energy grid can be created. Start at the lowest energy E_1,1, and interpolate in the ΔE table the appropriate ΔE(E_1,1) value for the current energy. The upper bin boundary E_2,1 of the first bin is then simply E_1,1 + ΔE(E_1,1).
Using the recursive scheme E_1,j = E_2,j-1, E_2,j = E_1,j + ΔE(E_1,j), determine all bin boundaries until the maximum energy has been reached. The bin centroids are simply defined as E_j = 0.5(E_1,j + E_2,j).
Finally, if there are any sharp edges in the effective area of the instrument, it is necessary to add these edges to the list of bin boundaries. All edges should coincide with bin boundaries.
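
The last three steps (the recursive construction of the bin boundaries, the centroids and the insertion of sharp edges) can be sketched as follows. This is a minimal illustration under stated assumptions: `interp` and `build_model_grid` are hypothetical names, the table of optimum bin sizes ΔE (`de_tab`) on the energies `e_tab` is assumed to have been computed already from eqn. (7.61), which is not reproduced here, and ΔE is simply held constant outside the tabulated range.

```python
from bisect import bisect_right


def interp(x, xs, ys):
    """Linear interpolation in a table with strictly ascending xs.
    Outside the table the nearest tabulated value is used (an assumption)."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    j = bisect_right(xs, x)             # xs[j - 1] <= x < xs[j]
    t = (x - xs[j - 1]) / (xs[j] - xs[j - 1])
    return ys[j - 1] + t * (ys[j] - ys[j - 1])


def build_model_grid(e_tab, de_tab, e_min, e_max, edges=()):
    """Return (boundaries, centroids) of the model energy grid.

    e_tab, de_tab : energies and the optimum bin size at those energies
    e_min, e_max  : start and end energy of the grid
    edges         : energies of sharp edges in the effective area
    """
    bounds = [e_min]
    # Recursive scheme: E_1,j = E_2,j-1 and E_2,j = E_1,j + dE(E_1,j).
    while bounds[-1] < e_max:
        e1 = bounds[-1]
        de = interp(e1, e_tab, de_tab)
        if de <= 0.0:
            raise ValueError("bin size must be positive")
        bounds.append(e1 + de)
    # Every sharp edge inside the grid must coincide with a bin boundary.
    for edge in edges:
        if bounds[0] < edge < bounds[-1] and edge not in bounds:
            bounds.append(edge)
    bounds.sort()
    # Bin centroids: E_j = 0.5 (E_1,j + E_2,j).
    centroids = [0.5 * (lo + hi) for lo, hi in zip(bounds, bounds[1:])]
    return bounds, centroids
```

Adding an edge in this way splits one bin into two smaller ones, so the grid never becomes coarser than the optimum spacing. The last boundary may lie slightly above `e_max`, since the loop stops only once the maximum energy has been reached.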
