1. Name
spectralworks
2. Brief description:
"Programs for viewing, editing, and manipulating spectral data"
3. Full description
===================
Spectralworks is a suite of programs for viewing and manipulating
digitized spectroscopic data, i.e. data consisting of an optical
measurement such as light absorbance as a function of wavelength or energy
of the light (or other radiation) involved. It is most useful for
those spectroscopies for which the Lambert-Beer law or an analogous
relation holds.
The manipulations envisioned include import, export, and conversion
of spectral data between several formats including (xxx sourceforge project),
appending different wavelength ranges, smoothing and interpolating
between points to change the wavelength sampling interval,
adding, subtracting, multiplying, and making linear combinations
of spectra, finding the best (least-squares) linear combination
of standard spectra to fit an experimental spectrum,
Fourier transformation to the frequency domain and back,
and singular value decomposition (SVD) of a set of spectra to find
a low-dimensional basis set spanning the same spectral subspace.
Separate programs will be provided for globally fitting whole-spectrum
potentiometric titrations or multiexponential kinetics to obtain the
spectra associated with each phase (midpoint potential or exponential decay).
Targeted Audience-
=====================
The project is aimed at scientists and engineers using spectral
data to analyze the composition of a chemical solution or to understand
the processes taking place within the solution.
Status and future plans-
=========================
Nearly all these functions have been implemented and extensively tested in the
author's lab. They are coded in Microsoft QuickBASIC, a powerful dialect of
BASIC that was used under DOS and on the Macintosh before System 7. The DOS
version runs well under Windows 95, 98, NT 4.0, and 2000, and also on most
installs of Windows XP, but not on the default install of Vista. Drag-and-drop
functionality and the use of DOS variables have been added to make the suite
work well in the Windows point-and-click environment.
The QuickBASIC code will be available on the site as the basis for the
new project, and statically linked executables are available from the author.
(However, since the compiled programs contain proprietary object code
from Microsoft, these executables probably cannot be distributed under
the GPL.)
The goal of the project is to port these algorithms to 100% open source,
multi-platform code, and then to continue to develop the code, making the
user interface more graphical and friendly, and adding new functionality
as demand arises.
This goal can be achieved in the short term using another SourceForge project,
the FreeBASIC Compiler (FBC; http://www.sourceforge.net/projects/fbc),
to compile the programs. FBC produces executables for Windows
or i386 Linux machines (and Macintosh?). At present the programs do not compile under
FBC because of a small number of syntax peculiarities used by the author that are not
supported by FBC; however, these can be found and fixed in short order.
Like Microsoft QuickBASIC for DOS, FBC has little support for advanced graphical user
interfaces. There is an open version of Visual Basic (the BCX project);
however, its syntax is markedly different from that of QuickBASIC and it seems to be
squarely targeted at the Windows platform.
In the long term it is hoped to convert the programs to universally supported
languages (Fortran and C) and link with Athena Widgets and/or OpenMotif for
graphics and GUI support. The BASIC language is very similar to Fortran in
structure and statements, so conversion is straightforward if time-consuming.
The Fortran code can then be compiled using the GCC Fortran compiler (g77).
Applications:
The viewer/editor functions allow one to inspect the spectral curves, to add,
subtract, or multiply by a constant, and to add or subtract different spectra,
so that the user can construct arbitrary linear combinations of given spectra.
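As an illustration only (this is a Python sketch, not the QuickBASIC code of
the suite), a linear combination of spectra sampled on a common wavelength
grid can be formed point by point. The function name and the absorbance
values below are hypothetical.

```python
def linear_combination(spectra, coefficients):
    """Return sum_i coefficients[i] * spectra[i], point by point.

    All spectra must be sampled at the same wavelengths (same length).
    """
    if len(spectra) != len(coefficients):
        raise ValueError("need one coefficient per spectrum")
    n_points = len(spectra[0])
    if any(len(s) != n_points for s in spectra):
        raise ValueError("spectra must share one wavelength grid")
    return [
        sum(c * s[i] for c, s in zip(coefficients, spectra))
        for i in range(n_points)
    ]


# Hypothetical absorbance values at three wavelengths.
oxidized = [0.10, 0.40, 0.20]
reduced = [0.30, 0.10, 0.60]

# Reduced-minus-oxidized difference spectrum: coefficients +1 and -1.
difference = linear_combination([reduced, oxidized], [1.0, -1.0])
```

Adding, subtracting, and scaling by a constant are all special cases of this
one operation with suitable coefficients.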
Least Squares Fitting of experimental spectra to standard basis spectra:
The least-squares fitting routine automates the common task of
analyzing the spectrum of a mixture to determine its composition.
The algorithm was originally described by Sternberg et al. 1963,
but is apparently not implemented in any of the spectrophotometer
software available today.
For example, a common practice in plant physiology or aquatic ecology
is to take an acetone extract of plant tissues or algal cells and
from its spectrum determine the amounts and ratios of chlorophyll a, b,
and the various chlorophyll c molecules, which tells something about the
photosynthetic capability and, in the case of an algal community, the
population distribution. This is currently done by measuring absorbance
at a small number of wavelengths (equal to the number of analytes) and
solving simultaneous equations. This is error-prone both because there is
no warning when additional spectral components are present (invalidating
the simultaneous equations) and because the peak wavelengths used are
not actually the best in terms of noise analysis (the best wavelengths are
the peaks or extrema in the rows of the least-squares inverse matrix). By using
whole spectra, the optimum wavelengths will necessarily be included,
and it can be shown that the less discriminating wavelengths do not
detract from the accuracy. And if an additional spectral component
is present which is not included in the set of standards, the spectrum
will be impossible to fit (or rather the fit will be so bad that
no false results will be used).
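The idea can be sketched in a few lines of Python. This is a minimal
illustration under stated assumptions, not the routine used in the suite:
it solves the normal equations by Gaussian elimination, the function names
are invented for the example, and the "chlorophyll" and contaminant values
are made-up numbers rather than real spectra.

```python
def dot(u, v):
    """Inner product of two equally sampled spectra."""
    return sum(a * b for a, b in zip(u, v))


def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("standards are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def fit_spectrum(standards, observed):
    """Least-squares coefficients and RMS residual for
    observed ~ sum_i c_i * standards[i]."""
    n = len(standards)
    normal = [[dot(standards[i], standards[j]) for j in range(n)]
              for i in range(n)]
    rhs = [dot(s, observed) for s in standards]
    coeffs = solve_linear_system(normal, rhs)
    fitted = [sum(c * s[i] for c, s in zip(coeffs, standards))
              for i in range(len(observed))]
    rms = (sum((o - f) ** 2 for o, f in zip(observed, fitted))
           / len(observed)) ** 0.5
    return coeffs, rms


# Made-up standard spectra at five wavelengths (not real chlorophyll data).
chl_a = [0.9, 0.5, 0.1, 0.2, 0.8]
chl_b = [0.2, 0.7, 0.6, 0.1, 0.3]
mixture = [0.7 * a + 0.3 * b for a, b in zip(chl_a, chl_b)]

coeffs, rms = fit_spectrum([chl_a, chl_b], mixture)

# A component missing from the standards shows up as a large residual.
contaminant = [0.0, 0.0, 0.5, 0.9, 0.0]
contaminated = [m + c for m, c in zip(mixture, contaminant)]
_, rms_bad = fit_spectrum([chl_a, chl_b], contaminated)
```

With the clean mixture the coefficients come back as the proportions used to
build it and the residual is essentially zero; with the contaminated
spectrum the residual is large, which is the warning that the
few-wavelength method cannot give.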
The Windows software presently available will take a series of
experimental spectra, fit them as linear combinations of the
standards, display the fitted spectrum overlaid on the experimental
one, and print out a table with the results. If desired, a figure can
be made showing the spectra of each component in the right proportions
to add up to make the experimental spectrum. A demo/tutorial of
this capability is available at http://sb20.lbl.gov/berry/scanlsf/ .
Global fitting of experimental data to theoretical models (kinetic,
thermodynamic).
Experiments such as potentiometric titrations or kinetics monitored over
whole spectra result in a 3-dimensional dataset: the dependent
variable, absorbance, as a function of both wavelength and spectrum
number, the latter representing redox potential or time. If the spectra of
the pure components are known, then the least-squares procedure above
can be used to convert the data into concentration vs. time data, and
analysis can proceed from there.
If the spectra of the components are
not known, the extinction of each compound in the model at each wavelength
can be included as an additional parameter of the model, and optimizing the
parameters provides the spectra as well as the other parameters. For
full-spectral data this results in a huge increase in the number of
parameters; however, the additional parameters are all linear ones. We
have developed an efficient and robust method of fitting this kind of
problem by alternating linear least-squares optimization of the linear
parameters at the current values of the nonlinear ones with a gradient search
on the nonlinear parameters (the gradient with respect to each linear parameter
now being zero, as it has just been optimized). It is important to do this
optimization globally, i.e. to minimize the residual over all wavelengths and
all time points simultaneously, rather than treating each wavelength as a
different problem and later trying to put the results together, because
the different relative absorbance of the different compounds at different
wavelengths greatly increases the discrimination. The singular value
decomposition (SVD) can be used to reduce the dimension of the problem and
reduce the effect of noise while maintaining the advantages of global
fitting. In this case a final step is required to reconstruct the actual
spectra of the compounds from the eigenspectra and fitting parameters.
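A toy version of this separation of linear and nonlinear parameters can be
sketched in Python. It is not the authors' program: the model (one
exponential decay plus a constant at every wavelength), the function names,
and the synthetic data are assumptions made for the example, and because
there is only one nonlinear parameter a golden-section search stands in for
the gradient search described above.

```python
import math


def linear_step(times, trace, k):
    """Best a and b for trace ~ a*exp(-k*t) + b at fixed k.

    Returns (a, b, sum of squared residuals) from the 2x2 normal equations.
    """
    basis = [math.exp(-k * t) for t in times]
    n = len(times)
    s_e = sum(basis)
    s_ee = sum(e * e for e in basis)
    s_y = sum(trace)
    s_ey = sum(e * y for e, y in zip(basis, trace))
    det = n * s_ee - s_e * s_e
    a = (n * s_ey - s_e * s_y) / det
    b = (s_ee * s_y - s_e * s_ey) / det
    ssr = sum((y - a * e - b) ** 2 for e, y in zip(basis, trace))
    return a, b, ssr


def global_residual(times, traces, k):
    """Squared residual summed over ALL wavelengths for one shared k."""
    return sum(linear_step(times, trace, k)[2] for trace in traces)


def fit_rate_constant(times, traces, k_low, k_high, tol=1e-8):
    """Golden-section search for the k minimizing the global residual.

    Assumes the residual has a single minimum inside [k_low, k_high].
    """
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    c = k_high - ratio * (k_high - k_low)
    d = k_low + ratio * (k_high - k_low)
    f_c = global_residual(times, traces, c)
    f_d = global_residual(times, traces, d)
    while k_high - k_low > tol:
        if f_c < f_d:
            k_high, d, f_d = d, c, f_c
            c = k_high - ratio * (k_high - k_low)
            f_c = global_residual(times, traces, c)
        else:
            k_low, c, f_c = c, d, f_d
            d = k_low + ratio * (k_high - k_low)
            f_d = global_residual(times, traces, d)
    return (k_low + k_high) / 2.0


# Synthetic, noise-free kinetics at two wavelengths sharing one rate constant.
TRUE_K = 0.8
times = [0.5 * i for i in range(11)]
true_params = [(1.0, 0.1), (-0.4, 0.3)]  # (amplitude, offset) per wavelength
traces = [[a * math.exp(-TRUE_K * t) + b for t in times]
          for a, b in true_params]

k_fit = fit_rate_constant(times, traces, 0.1, 3.0)
# The amplitudes at the fitted k form the spectrum of the decaying phase.
decay_spectrum = [linear_step(times, trace, k_fit)[0] for trace in traces]
```

Only the single rate constant is searched; the amplitudes and offsets at
every wavelength are re-solved exactly at each trial value, so adding more
wavelengths adds linear work without enlarging the nonlinear search.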
More background and references.
Background:
The optical absorption (or absorbance) of a sample is the negative logarithm
(conventionally base 10) of the fraction of incident light passing through
the sample. This value depends not only on the properties of the sample,
but also on the wavelength of light used. Thus the absorption spectrum of
a sample is a useful fingerprint for identifying its composition, and very
sensitive and sophisticated instruments have been developed for measuring
this spectrum.
In most circumstances the absorption, at any wavelength, of a solution
of an absorbing compound is proportional to the concentration
of the compound. Furthermore, the absorption of a mixture of different
non-interacting compounds is the sum of the absorptions of the individual
compounds. Applying this at each wavelength, we see that absorption spectra
constitute a linear function space such that the absorption spectrum of
0.3 M A and 0.2 M B is equal to 0.3 times the spectrum of 1 M A plus 0.2
times the spectrum of 1 M B, i.e. the spectrum of a mixture is a linear
combination of the spectra of the individual components with appropriate
multipliers.
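The 0.3 M A plus 0.2 M B example can be checked numerically at a single
wavelength. The sketch below is illustrative Python: the function names and
the two 1 M absorbance values are hypothetical, base-10 logarithms are
assumed, and the path length is taken as fixed.

```python
import math


def absorbance(fraction_transmitted):
    """Negative base-10 logarithm of the transmitted fraction of light."""
    return -math.log10(fraction_transmitted)


def transmitted_fraction(absorbance_at_1_molar, concentration):
    """Fraction of light passed, assuming absorbance proportional to
    concentration at fixed path length."""
    return 10.0 ** (-absorbance_at_1_molar * concentration)


# Hypothetical absorbances of 1 M solutions of A and B at one wavelength.
a_1_molar = 1.5
b_1_molar = 0.8

# Non-interacting absorbers attenuate the beam independently, so the
# transmitted fractions multiply ...
t_mix = (transmitted_fraction(a_1_molar, 0.3)
         * transmitted_fraction(b_1_molar, 0.2))

# ... and the absorbances therefore add.
a_mix = absorbance(t_mix)
expected = 0.3 * a_1_molar + 0.2 * b_1_molar
```

Here a_mix and expected agree to rounding error. Taking the logarithm is
what turns the multiplicative attenuation of light into the additive
quantity on which the linear-combination methods rely.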
Because the wavelength of light varies continuously, an
absorption spectrum is in principle a continuous function. However,
digital spectrophotometers digitize the absorbance at discrete points,
often at equal intervals of wavelength or energy. These discretely
sampled spectra can best be treated as vectors. In principle they are
vectors in an n-dimensional vector space, where n is the number of
points in each spectrum. However, because the spectra are linear
combinations of the pure spectra of a small number of absorbing
compounds, they occupy a much lower-dimensional subspace of that
vector space, at least in the absence of noise (error).
The addition of random noise to all the points expands the dimension,
but for small noise all spectra will be close to the subspace spanned
by the pure spectra. The least-squares method allows us to take the
projection of an observed spectrum onto the subspace spanned by the
known component spectra, and determine the proportions of each that
would have the greatest likelihood of resulting in the observed
spectrum. This is what I call problem 1: determining the concentration
of each of several known compounds in a mixture from the spectrum
of the mixture.
Problem 2 is taking a set of spectra with different linear combinations
of the UNKNOWN basis spectra, finding a low-dimensional subspace
which includes them all within the noise level, and (if enough
experimental information is available) determining the actual spectra
of the components present in the mixtures (where they are not
available in pure form).
license
The GNU General Public License (GPL)