Armadillo: A Program for Non-Real and Real Time Analysis of Musical Sounds
on the Power Macintosh
General Presentation
Timothy Madden and James Beauchamp
University of Illinois at Urbana-Champaign
Urbana, IL 61801 USA
May, 1998
The primary objective of Armadillo is to provide a convenient way to quickly analyze musical sounds on an inexpensive computer without the need for specialized hardware. The development of high speed computers with excellent graphics and high quality audio I/O for the general consumer in the last few years has made this possible. While a number of analysis programs are available on the Intel or Macintosh platforms, there are few that are dedicated to the analysis of musical sounds as Armadillo is.
Armadillo has several features that distinguish it from previous analysis programs:
1) It performs analysis in both real time and non-real time. In real time mode it operates on a signal produced by microphone or CD input. In non-real time mode it operates on either a pre-stored signal file or a pre-computed Analysis File.
2) Graphic display windows are provided for simultaneous viewing of the amplitude spectrum in a 1-D display (amplitude vs. frequency), a Waterfall Display (successive amplitude vs. frequency graphs), a 2-D display (frequency vs. time with amplitude depicted by color brightness), and a 3-D display (amplitude vs. time at several frequencies, with each frequency component having a different color). In addition, a signal display window is provided.
3) The 1-D display can be rendered as a bar graph, which shows the individual harmonics most clearly, or as a series of connected line segments, which delineates the spectral envelope of the sound. In either form, the display is updated over time as the sound changes.
4) The vertical frequency axis of the 2-D display can be put in linear frequency mode (where frequencies are linearly spaced) or in logarithmic mode (where frequencies are logarithmically spaced). The data is adjusted vertically to correspond to the axis type.
5) The orientation of the 3-D display can be altered instantaneously by clicking the mouse on the display; the x, y position of the click determines the new view. The x position changes the horizontal rotation of the 3-D graph. The y position changes the elevation of the view of the 3-D graph. Also, the graph has true 3-D perspective for ease of visualization.
6) The same display windows can be used for real time and non-real time analysis. In real time analysis, the display is somewhat "chunky" because the frame hop size is quite large (typically 100 ms with a 180 MHz 604e CPU). The speed of the window updates is, of course, affected by how many processes are running simultaneously. Windows can be frozen to free up time for other windows. Also, a "more accurate" mode can be chosen at the expense of mouse responsiveness. As faster processors become available, the chunkiness of the analysis display will diminish. With non-real time analysis, however, no data is missed, and the analysis is very smooth.
7) Non-real time analysis can be viewed using static 2-D or 3-D views or by using the same windows as the real time analysis to display "spectrum movies". Either case requires prior storage of a sound file, which can be accomplished within Armadillo. The sound file can be analyzed in non-real time while displaying the analysis windows. Alternatively, analysis can occur "behind the scenes", with no visual display accompanying it. After this analysis is complete, the resulting data is saved to an analysis file for future use. Static views of the 2-D and 3-D analysis can then be selected, which generally show the complete, very accurate analysis in one view. Alternatively, the same windows used for real time analysis can be used to display the analysis file data. In "spectrum movie mode", the data changes in time more slowly than the original time scale, but unlike the real time case, no data is missed.
8) Analysis can be "tuned" or "untuned". In untuned mode, an FFT size is chosen, which must be a power of 2. The analysis frequencies are thus restricted to integer multiples of (sample rate)/(FFT size). In tuned mode, it is assumed that the input signal consists of harmonics, and an arbitrary fundamental frequency for the analysis can be chosen, which generally corresponds to the exact pitch of the signal. In this case, the FFT window is chosen to span two periods of the input signal, and only those frequencies corresponding to harmonics of the signal are kept. To accomplish this, the data at the original sample rate is interpolated to produce a power-of-two number of points within the FFT window.
9) The program works with either monaural or stereo sounds in both real time and non-real time (with AIFF sound files). One can easily switch back and forth between the two channels of a stereo analysis.
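As a rough illustration of items 4 and 8, the following Python sketch lists the frequencies available to an untuned analysis and shows where a frequency would fall on a linear or logarithmic axis. It is not taken from Armadillo; the function names, parameters, and the 0-to-1 axis convention are assumptions made for this example.

```python
import math


def untuned_bin_frequencies(sample_rate, fft_size):
    """Return the untuned analysis frequencies (Hz) from 0 up to Nyquist.

    Hypothetical helper: the frequencies are integer multiples of
    sample_rate / fft_size, and fft_size must be a power of 2.
    """
    if fft_size < 2 or fft_size & (fft_size - 1):
        raise ValueError("fft_size must be a power of 2")
    spacing = sample_rate / fft_size
    return [k * spacing for k in range(fft_size // 2 + 1)]


def axis_position(freq, f_min, f_max, logarithmic=False):
    """Map freq to a 0..1 position along a frequency axis.

    Assumed convention: 0 is the bottom of the axis (f_min) and 1 is
    the top (f_max). Logarithmic mode requires f_min > 0 and freq > 0.
    """
    if logarithmic:
        return math.log(freq / f_min) / math.log(f_max / f_min)
    return (freq - f_min) / (f_max - f_min)
```

For example, with a 44100 Hz sample rate and a 1024-point FFT, the analysis frequencies are spaced about 43.07 Hz apart, so a tone at an arbitrary pitch will generally fall between analysis frequencies. This is the limitation that tuned mode is meant to address.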
The analysis engine is a phase vocoder which uses a sequence of overlapping FFTs on a Hamming-windowed signal. In the case of tuned analysis, the signal is first interpolated to a new sampling frequency so that the signal's period corresponds to a power-of-two number of samples. In this case, the FFT/Hamming window actually corresponds to two periods of the signal, so that the harmonics of the signal correspond to the even components of the analysis. Thus, in the tuned case, the odd components are discarded.
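The tuned analysis of a single frame can be sketched as follows. This is a minimal illustration under stated assumptions, not Armadillo's actual code: the function name and parameters are hypothetical, linear interpolation stands in for whatever interpolator the program uses, a direct DFT sum replaces the FFT for brevity, and the phase-vocoder steps (overlapping frames, phase tracking) are omitted.

```python
import cmath
import math


def tuned_harmonic_amplitudes(signal, sample_rate, f0, n_points=64, n_harmonics=5):
    """Estimate harmonic amplitudes from one two-period frame (sketch).

    signal      -- list of samples, starting at the frame start
    sample_rate -- original sampling frequency in Hz
    f0          -- assumed fundamental frequency in Hz
    n_points    -- power-of-two number of points spanning two periods
    """
    span = 2.0 * sample_rate / f0  # two periods, in original samples
    if len(signal) < int(span) + 2:
        raise ValueError("signal is shorter than two periods")

    # Resample two periods onto n_points samples (linear interpolation
    # is an assumption made here for simplicity).
    frame = []
    for i in range(n_points):
        pos = i * span / n_points
        j = int(pos)
        frac = pos - j
        frame.append((1.0 - frac) * signal[j] + frac * signal[j + 1])

    # Apply a Hamming window (periodic form) across the frame.
    windowed = [
        x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / n_points))
        for i, x in enumerate(frame)
    ]

    # Component k lies at k * f0 / 2, so harmonic h is the even
    # component k = 2h. Odd components are simply not computed.
    amplitudes = []
    for h in range(1, n_harmonics + 1):
        k = 2 * h
        total = sum(
            x * cmath.exp(-2j * math.pi * k * i / n_points)
            for i, x in enumerate(windowed)
        )
        # Dividing by 0.54 * n_points / 2 undoes the window's mean gain.
        amplitudes.append(2.0 * abs(total) / (0.54 * n_points))
    return amplitudes
```

Because the window spans exactly two periods, each harmonic falls on an even component. The Hamming window spreads some of that energy into the neighboring odd components, which carry no independent information about a purely harmonic signal. This is consistent with keeping only the even components.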
Applications. The primary use for Armadillo is analysis of pitched musical tones. This may be used by instructors of musical instruments or voice to help students visualize how they can correct deficiencies in their sound production or by musicians in general as an aid to developing good tone quality. Another possibility is classroom depiction of pre-recorded music as an aid to analysis of the music. Other applications are analysis of noise in recordings, calibration of synthesizer voicing, and analysis of sounds leading to new methods of synthesis.
What Armadillo doesn't do. Armadillo is strictly an analyzer and provides only limited editing capability for sound files, and no editing of analysis files. It also does not do synthesis. It provides composite graphs of the amplitude spectrum but does not provide individual graphs of amplitude vs. time or frequency vs. time.