UNPACKING
SETTING THE DESTINATION DIRECTORY
COMPILING THE BINARIES
INFORMATION PAGE
SETTING FLAG VALUES
CONTROLLING PARAMETERS WITH FUNCTIONS
RUNNING COMMANDS WITH SHELL SCRIPTS
AMPLITUDE WARPING
NOISEFILTER
COMPANDER
SPECTWARPER
SPECTRALEXTRACTOR
ADDITIVE SYNTHESIS
HARMONIZER
INHARMONATOR
CHORDMAPPERPLUS
SUBTRACTIVE SYNTHESIS
FILTER
FREQRESPONSE
CHORDRESPONSEMAKER
FILTRESPONSEMAKER
PVANALYSIS
TVFILTER
CONVOLVER
RESONANCE/REVERB
RING
RINGFILTER
RINGTVFILTER
NONLINEAR FREQUENCY DEVIATION
FILTDEVIATOR
TVFILTDEVIATOR
FEATURE EXTRACTION
ENVELOPE
CENTROID
FLUXOID
PITCHTRACKER
CONTROL FUNCTION PROCESSING
RESHAPE
OVERLAP/ADD METHOD VS. OSCILLATOR BANK METHOD AND RESYNTHESIS THRESHOLDS
SOURCE
MULTIPLE CHANNELS
PLAYBACK DURING PROCESSING
INPUT SOUND FILE
OUTPUT SOUND FILE
FLOATING-POINT AMPLITUDE RESCALING
OUTPUT STATISTICS
FREQUENCY RESPONSE TERMINAL OUTPUT
ANALYSIS FILES
DECIBELS
LOW/HI SHELF EQUALIZATION
WARP INDEX
PITCH TRANSPOSITION
FREQUENCY SHIFT
ENVELOPE RESPONSE TIME
RING DECAY TIME
FFT SIZE
WINDOW SIZE
WINDOW TYPE
FRAMES PER SECOND
TIME EXPANSION/CONTRACTION
BEGIN/END TIMES
GAIN
FILTERING: SOURCE SIGNAL LEVEL
TRANSPOSITION/SHIFT APPLICATION FLAG
FILTER TYPES: PASS OR REJECT
RESPONSE FUNCTION SMOOTHING
ANALYSIS DATA ACCESS MODE
CONVOLVER PANPOT
FREQUENCY RESPONSE ACCUMULATION METHOD
RING ROUTINES: FILTER PLACEMENT
COMPRESSION AND EXPANSION
UTILITIES
SOUND FILE CONVERSION AND INFORMATION: aiffs, aiffd, nexts, nextd, and sfinfo
FUNCTION VIEWING: showme, showspect
SAMPLE SCRIPT: S.PLAINPV
SHELL SCRIPT OUTPUT SOUND FILE CONVERSION
GEN FUNCTION CONTROL OF PARAMETERS
SAMPLE OUTPUT: S.PLAINPV
PVC is a collection of phase vocoder signal processing routines and accompanying shell scripts for use in the transformation and manipulation of sounds. It is written in C and designed to be used in a UNIX environment. It has come about as a result of my path of education and research into phase vocoder technology. It follows in the spirit of the work by Eric Lyon (out of which PVC is built) and Chris Penrose, whose particular DSP research springs from the coding and tutorial work of F.R. Moore and Mark Dolson. Moore's book, Elements of Computer Music, published by Prentice Hall, is consequently a great resource for making sense of the phase vocoder engine, which I am unable to go into here. Curtis Roads's book, The Computer Music Tutorial, published by MIT Press, has sections on the phase vocoder as well; these may better introduce the beginner to the practical concerns of this technology. Short of the explanations these sources provide, I have attempted to offer below some explanations, particularly as needed for control of the parameters in these routines. A manual and tutorial would be great to have; unfortunately time has not yet made it so.
These routines reflect my need for tools that can perform different spectral resynthesis tasks, both simple and experimental. Their refinement has advanced with my growing skills and curiosity, which I expect will continue as long as I have questions about sound. Most of these routines can be viewed in terms of traditional additive or subtractive synthesis tasks, coming about as they did from the desire for greater finesse and control of these two basic types of synthesis. While the speculative nature of some gives them an idiosyncratic character, most should, with practice, reveal the transparency of their names if not the role they can play in the shaping of sound. All require a good ear tuned towards sound and idea, as none of these routines is automatic, although many hold great potential for the diligent.
PVC 4.0 contains modifications and some new routines: chordmapperplus now replaces chordmapper, and spectralextractor and extractor (a work-in-progress made for use in additive resynthesis using SuperCollider) are added, but mostly 4.0 is a port to OSX.
The sound file conversion utility SOX by Lance Norskog has been included along with shell script utilities for converting quickly and easily between NeXT/Sun and AIFF formats. "make all" will compile the PVC and CMUSIC gen routines along with SOX. Each can be compiled separately. "make install" will copy all previously compiled binaries to the destination directory along with the sound file conversion and information utilities.
Paul Koonce
koonce@music.princeton.edu
The PVC.4.0.OSX package contains my PVC routines, along with the CMUSIC gen functions written by F.R. Moore (each in a separate directory), the SOX sound file conversion utility written by Lance Norskog, and my shell scripts: aiffs, aiffd, nexts, and nextd for using SOX, and sfinfo, which cobbles together several OSX utilities to get information on a sound file. Moore's standalone gen functions are useful tools for creating function files that can then be used in the time-varying control of parameters. The gen functions included are: cspline, gen0, gen1, gen2, gen3, gen4, gen5, gen6, and genraw; a one-line summary can be obtained by running the command without any arguments. A detailed explanation of each can be found in the appendix of Moore's Elements of Computer Music.
You can compile and install all of the above routines separately or together following unpacking and setting of the destination directory.
1) Unpack:
First move PVC.4.0.OSX.tar.gz to the directory of your choice. Unzip it with gunzip.
gunzip PVC.4.0.OSX.tar.gz
Then, unarchive it with tar.
tar xvf PVC.4.0.OSX.tar
This will produce a PVC.4.0.OSX master directory in which you will find several other directories.
2) Set the Destination Directory:
You will need to set the destination directory in the Makefile located in the PVC.4.0.OSX directory. This is the master makefile for all routines. The destination is set to ~/bin, which, if left unchanged, will put all routines in the bin folder of your home directory. If you have a ~/bin directory and have put it in the path in your .cshrc login file, you should be fine. (You will have to make a .cshrc file and specify it in the preferences of the Terminal application.) If the system is multi-user, set the destination directory to a common directory.
3) Compile the Binaries:
To compile the PVC, CMUSIC, and SOX routines together, type:
make all
If successful, install the binaries and shell script utilities by typing:
make install
Or both at once as in:
make all install
To compile and install only the PVC routines, type:
make PVC install
Or the gen routines:
make GEN install
Or the SOX routine:
make SOX install
In all cases, the command, make install, moves the compiled routines from the PVC.4.0.OSX/bin directory to your specified destination directory along with the sound file utilities in the UTILITIES directory. If the destination directory is in your .cshrc path you should be able to type any of the routines and see their flag information page. (You must first type, source ~/.cshrc, or open a new shell to get them listed in the search-path tables.) Try typing:
plainpv
for example.
The routines are UNIX, command-line routines in the form of:
routine [flags] input_soundfile output_soundfile
At present the only soundfile formats accepted (both input and output) are NeXT/Sun format files in either 16-bit short samples or 32-bit floating-point samples. In PVC, the float form has a required rescale function, which is the whole reason for using floats in this case. All processing is done in floats. Control of each routine's parameters is done through flags, as in:
plainpv -N1024 -P12 input.snd output.snd
In most cases, the parameter flag inputs allow you to specify either a constant (e.g. -P12) or a function file (e.g. -P/tmp/pitch_change); function files give you time-varying control over a parameter (see below). In most cases, parameters are initialized to the default values listed in brackets on the information page. Parameters can be controlled by function files if (func) appears in the flag explanation on the information page as seen below.
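As a rough illustration of what a constant transposition value means, the sketch below converts semitones to a frequency ratio. It assumes the usual equal-tempered relation (a factor of 2 per 12 semitones); the text here does not spell out the formula PVC uses, and the function name is hypothetical.

```python
def semitones_to_ratio(semitones):
    """Frequency ratio for an equal-tempered transposition (assumed relation)."""
    return 2.0 ** (semitones / 12.0)


# A transposition of 12 semitones doubles every frequency (one octave up);
# -12 halves it; 7 semitones is close to a perfect fifth.
for n in (-12, 0, 7, 12):
    print(n, round(semitones_to_ratio(n), 4))
```

Under that assumption, a flag value of 12 would raise the sound by an octave and a value of -12 would lower it by one.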
Information about any routine can be seen by typing the name of the routine without any arguments/files. Typing:
plainpv
produces the following information about plainpv.
plainpv: generic phase vocoder with dynamic controls
plainpv [flags] [input file (16-bit shorts)] [output file (optional)]
(values in brackets denote defaults)
N: FFT length (must be a power of 2) [1024]
M: window size in samples (must be a power of 2) [2*FFT]
   (0 will automatically set window to 2*FFT size or larger)
w: window type: 0 = hamming, 1 = rectangular 2 = Blackman, 3 = Bartlett triangular [0.]
   4-12 = Kaiser windows for alpha = 4-12, respectively
   (representative sidelobe levels for alpha: 4 = -30dB, 8 = -58 dB, 12 = -90 dB)
D: analysis frames per second [200]
I: time expansion/contraction factor [1.]
   (duration = duration * factor, 1. = original time)
P: pitch transposition in semitones (func) [0]
a: frequency shift factor (bin frequency adder, before -P )(func) [0.]
b: begin time in seconds [0.]
e: end time in seconds ( 0. = end of file) [0.]
C: resynthesis channel (1 -> ?) (0 = all) [0]
SHELF EQ:(post transpose/shift)
H: SHELF EQ: Low shelf gain in dB (func) [0.]
X: SHELF EQ: High shelf gain in dB (func) [0.]
m: SHELF EQ: Low shelf frequency in Hz (func) [200.]
R: SHELF EQ: High shelf frequency in Hz (func) [2000.]
W: warp index for reshaping magnitude response (func) [0.]
   Values > 0 expand the dynamic range, values < 0 compress the dynamic range.
A: gain in decibels (func) [0.]
l: envelope attack time (func) [0.]
L: envelope release time (func) [0.]
T: BRICKWALL FILTER TYPE: 0 = bandpass, not 0 = band reject [0]
f: frequency window: low boundary (before -P and -a) (in Hz) [0.]
F: frequency window: high boundary (before -P and -a)(in Hz) [Nyquist frequency]
p: amplitude reports print mode: 0 = off, 1 = on [0]
i: time interval between amplitude reports [.25]
_: OUTPUT FORMAT: 0 = taken from input file 1 = 16-bit integer, 2 = 32-bit floats [0]
=: PEAK RESCALE LEVEL (float output only) 0 to -96 dB
   Set to 1 to rescale to level of input file. [ 1 ]
TERMINAL DISPLAY AND GRAPH FILE OUTPUT
n: number of frames [0]
u: low bin frequency [-1]
U: high bin frequency (-1 = nyquist) [Nyquist frequency]
S: TERMINAL DISPLAY: display option [0]
   (0 = off, 1 = phase data, 2 = amp data, 3 = both)
c: GRAPH FILE: WRITE ascii to FILE 0 = off, 1 = freq, 2 = decibels [0]
   3 = decibels - waterfall plot
   (When on, this flag writes ascii point pairs (with time frame on x axis) for plotting with gnuplot.)
d: TERMINAL DISPLAY FILE NAME for -c [./ascii.out]
t: oscillator resynthesis threshold in decibels [ -96 ]
If no output file is specified, the name pv.out.snd will be used in the local directory. The bracketed value at the end of each parameter represents the default; it can be changed by specifying the flag letter preceded by a minus sign and followed by the new value, with no spaces on either side. For example, the following:
plainpv -N2048 inputfile outputfile
would change the FFT size to 2048. Some flags require files rather than constants. For these, simply supply the full pathname of the needed file as in:
twarp -F/here/there/everywhere/analysis_file
which supplies twarp with the necessary analysis file.
CONTROLLING PARAMETERS WITH FUNCTIONS:
Parameters with the word (func) just before the default value listed on the info page as in:
W: warp index for reshaping magnitude response (func) [0.]
can be controlled dynamically. This is done by providing the full pathname of a file in place of the constant. The file is assumed to be a headerless series of values representing how the parameter will evolve as a function of time. The values may be either 32-bit floating-point values, or ASCII numbers, arranged one-per-line (the routine deciphers which it is). The function file can have any number of values, as the series is fitted to the specified duration and linearly interpolated to produce the values in between. Function files in 32-bit floating-point form can be created with the CMUSIC gen routines provided with this package. (See INSTALLATION.)
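The fitting of a function file to a duration can be pictured with a small sketch. It writes an ASCII function file (one number per line, no header, as described above) and then evaluates the series at a given time by linear interpolation. The helper names and the file name are hypothetical, and mapping the first and last values exactly onto the start and end of the duration is an assumption about how the routines fit the series, not something taken from the PVC source.

```python
import os
import tempfile


def write_ascii_function(path, values):
    """Write a headerless ASCII function file, one value per line."""
    with open(path, "w") as fh:
        for v in values:
            fh.write(f"{v}\n")


def read_ascii_function(path):
    """Read the values back, ignoring blank lines."""
    with open(path) as fh:
        return [float(line) for line in fh if line.strip()]


def value_at(values, t, duration):
    """Linearly interpolate a series stretched over [0, duration] (assumed fit)."""
    if len(values) == 1:
        return float(values[0])
    # Position within the series, clamped to its ends.
    pos = min(max(t / duration, 0.0), 1.0) * (len(values) - 1)
    i = min(int(pos), len(values) - 2)
    frac = pos - i
    return values[i] + frac * (values[i + 1] - values[i])


# A three-point pitch function: rise an octave, then return.
path = os.path.join(tempfile.gettempdir(), "pitch_change")
write_ascii_function(path, [0.0, 12.0, 0.0])
series = read_ascii_function(path)
for t in (0.0, 0.5, 1.0, 2.0):
    print(t, value_at(series, t, duration=2.0))
```

Because the series is stretched to whatever duration is in force, the same three-value file would describe the same shape over a two-second or a two-minute sound.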
RUNNING THE COMMANDS WITH SHELL SCRIPTS:
While all routines can be run at the command line, they are most easily run using the shell scripts found in the SCRIPTS directory. These scripts are useful for saving and managing the parameters; in many ways they are a poor man's GUI. All scripts contain a top section for setting variables, and a bottom section where those variables are placed into the command-line flag structure and run. Some scripts perform two routines, such as a short analysis routine followed by the main synthesis routine, while others run just one routine. Take note that shell script variable assignments do not allow for spaces. The numerous parameters, numbering as high as 53 in some routines, make these scripts a necessity. They will be your friend if you take care to leave the bottom part alone, and don't corrupt your variable names. Someday I will make a better way to interface with the routines; for now this is the way it is.
To run the scripts, simply type the name of the file (or appropriate pathname when running from outside the directory in which it resides). For example:
S.plainpv
(If this does not work, check to make sure the script is executable, and that the first line contains #!/bin/sh).
If things are working correctly, resynthesis should begin, which you will know from the output streaming to the terminal.
(See the explanation below about using shell scripts.)
Below is a listing of the routines contained in this release along with a short description of what each does.
Plainpv is a basic phase vocoder with control of pitch transposition, frequency shift, time scale, amplitude warp and low/high shelf equalization. It also has some nice controls for looking at the data produced by the phase vocoder. Run this routine with S.plainpv. If you are interested in looking at and/or graphing segments of the data, run the S.plainpv_with_printout_and_graph_files script instead and use the showspect utility. You will need to have gnuplot installed.
Twarp is like plainpv except that it works from an analysis file rather than a soundfile. This allows you to move forwards/backwards through time according to a time function file. Use pvanalysis through the script S.pvanalysis to make the analysis file; then run the script S.twarp.
Noisefilter filters out the noise in a sound by subtracting out a frequency response. The frequency response is analyzed from a short segment in the file where noise alone is found. For sounds that do not have segments of isolated noise, there is a threshold mode. Run with S.noisefilter.
Compander is a classic compressor/expander. What is different here is the use of a peaks response file. The peaks response file is a frequency response, analyzed from a segment of the sound, that is taken to represent the peak bin amplitudes for the sound. Each frequency bin of the peaks frequency response functions as the 0 dB reference point for that frequency bin. The amplitude of the frequency bin is companded relative to this reference. The entire analysis/companding process (including the analysis segment using freqresponse) can be run using the script S.compander.
Spectwarper uses an expanded compansion scheme to highlight either a sound's stronger, resonant components or its weaker noise/residual components. Spectwarper is fairly similar to compander; however, unlike compander, which compands bins against the constant peak of an input response file, spectwarper compands bins using a peak drawn (in the current frame) from a narrow frequency band centered around the value being processed. This causes the compansion or "warping" of the amplitudes to accentuate (expansion) or mask (compression) formants located within the frequency bands, the result being the noise/pitch highlighting mentioned earlier. Part of this comes from the treatment of compression in spectwarper. Unlike compander, which only reduces the amplitude above the threshold when compressing, spectwarper reduces the amplitude of the entire range, becoming, in effect, an expander of the strongest amplitudes that expands them (when the compression level is severe) out of the picture. Spectwarper is one of my favorite routines of late simply because it provides such a simple and powerful control over the noise and pitch characteristics of a sound. I love it, and use it often. Run this routine with S.spectwarper.
SPECTRALEXTRACTOR
Spectralextractor uses frequency variation to discriminate between pitch and noise. The frequency of each bin is tracked and used to create a signal measuring the rate at which the bin is changing frequency; the higher this rate of frequency change, the more the bin is associated with noise rather than pitch components. That is, the bin's frequency instability becomes a correlate for noise. A frequency change threshold is set to extract pitch or noise; when the rate of change falls below the threshold the bin is identified with pitch components, and when above, with noise. A response time control (lowpass filter) slows the rate at which the signal is allowed to cross the threshold, thereby preventing gurgle noise. A lowpass filter (threshold accumulator) is applied to the tracked bin frequency as well to smooth out artifacts from the process. While spectralextractor functions differently than spectwarper, its results are very similar. I prefer spectwarper, which has less grit to it, although both are interesting. Spectralextractor is newer, a work-in-progress, and requires more refinement: its parameters of control are not yet general enough, changing their effect when the FFT size is modified (beware).
ADDITIVE SYNTHESIS—HARMONIZER, CHORDMAPPERPLUS, AND INHARMONATOR:
These routines all allow for a kind of additive synthesis based on the remapping of phase vocoder data according to some model. Each requires an ASCII data file specifying how phase vocoder information will be replicated or mapped. This mapping is constant for the run of the routine.
Harmonizer works much like a commercial harmonizer in that it allows you to create harmony against the source by adding a transposed copy of it. Here the concept is extended by allowing for multiple harmonizations, each taken from a different band of frequencies and output with separate gain. Run this routine using the script S.harmonizer.
CHORDMAPPERPLUS
Chordmapperplus lets you specify how harmonically related groups of partials will be replicated or mapped to produce chords. An input data file organizes the remapping into tone groups, and includes ways to tune or neutralize the frequency deviations of partials. Time-varying control of these features is available as well. You can use this routine to build up thick chords from single tones, or to delicately reorganize a harmonic spectrum. Chordmapperplus is the holy grail of PVC offering exotic fruits for the vigilant. Run this routine using the script S.chordmapperplus.
Inharmonator lets you specify how the partials of one fundamental will be remapped or deviated. While the more recent and developed routine chordmapperplus is probably better for this task, I have decided to leave this routine in for now. (Think chordmapperplus.)
Filter is a very useful routine for filtering a sound by a frequency response. Filtering is achieved by first creating the frequency response through either synthesis or analysis, followed by filtering with filter. Synthetic responses are created using either chordresponsemaker (which synthesizes a spectrum as a collection of harmonic tones), or filtresponsemaker (which synthesizes a frequency response using lines and breakpoints). Analyzed responses can be made with freqresponse (which analyzes a sound file segment and constructs a response representing the peak or average amplitudes). Once made, the magnitudes of the FFT response are multiplied against the time varying magnitudes of the input sound's FFT. Filter allows time-varying control of the response shape (warp), transposition/shift, compansion, smoothing, and source/filter mix, making this a very useful tool for quickly manipulating the spectral characteristics of a sound according to your synthetic or analytic goals. The synthetic forms can be run with the scripts S.filter_with_chord_synthesis or S.filter_with_breakpoint_synthesis; the analysis-based form with S.filter_with_analysis. The analytic form is a powerful tool for bringing the color of one sound into the realm of another.
Freqresponse is a routine used by several others to prepare a spectrum for use with routines that filter, compress or limit. The response can be normalized or not depending on the needs of the routine using the response.
Chordresponsemaker is a routine that uses a collection of harmonic tones, variable in size, to create a synthetic frequency response. It is found in several filtering scripts.
Filtresponsemaker is a routine that uses breakpoints and straight lines to create a synthetic frequency response. It is found in several filtering scripts.
Pvanalysis is the time varying form of freqresponse that creates a phase vocoder analysis for use by other routines. The routines requiring pvanalysis files are twarp, convolver, tvfilter, ringtvfilter, and tvfiltdeviator. Run this routine using the script S.pvanalysis.
Tvfilter is the time-varying (tv) form of filter. Tvfilter uses a pvanalysis file to change the magnitudes of the input sound file. As it is with filter, tvfilter multiplies the magnitudes of the analysis FFT against the magnitudes of the input sound's FFT, while preserving the frequency/phase characteristics of the input sound. Preserving the phase of the input sound file results in a cross-synthesis that sounds like the input sound file covered or suppressed by the shadow of the analysis file. Like filter, tvfilter offers a variety of controls for manipulating the filter characteristic. The use of a phase vocoder analysis to represent the filter characteristic also makes possible the temporal control of the filter file (i.e. backwards/forwards control) as found with twarp. Run this routine using the script S.tvfilter.
In its setup and control, convolver is the same as tvfilter. Its processing, however, is different. In tvfilter, filtering is produced by multiplying the magnitudes from the polar form of the two analyses, leaving the phases (or frequencies) of the source intact while modifying the amplitudes of those frequencies. Convolver goes a bit further by multiplying the two analyses in their Cartesian forms. This produces an intersection of the two spectra. Unlike tvfilter, which produces a shadowlike intersection, shadowing the analysis file characteristic onto the input sound file, convolver creates a true spectral intersection, allowing only that which is common to both sounds to be heard. The effect is a sound that is somewhat garbled as it outputs the more intermittently common spectral components of the two. The form of the multiplication in convolver does not allow some of the filter transposition controls associated with tvfilter. There is, however, a convolution panpot that offers control of the mix between the convolution and source sounds. Run this routine using the script S.convolver.
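The difference between the two multiplications can be seen on a single FFT bin. The sketch below is only an illustration of the arithmetic, not PVC's code: the function names are hypothetical, and a real phase vocoder works on whole frames and converts phase to frequency, which is ignored here.

```python
import cmath


def tvfilter_style(source_bin, analysis_bin):
    """Polar form: scale the source magnitude, keep the source phase."""
    magnitude = abs(source_bin) * abs(analysis_bin)
    return cmath.rect(magnitude, cmath.phase(source_bin))


def convolver_style(source_bin, analysis_bin):
    """Cartesian form: multiply the two bins as complex numbers."""
    return source_bin * analysis_bin


src = cmath.rect(0.5, 0.3)  # magnitude 0.5, phase 0.3 radians
ana = cmath.rect(0.8, 1.1)  # magnitude 0.8, phase 1.1 radians

shadowed = tvfilter_style(src, ana)
convolved = convolver_style(src, ana)

print(abs(shadowed), cmath.phase(shadowed))
print(abs(convolved), cmath.phase(convolved))
```

For one bin the two results have the same magnitude; what differs is the phase, which stays with the source in the polar case and becomes the sum of both phases in the Cartesian case. Over successive frames that phase contribution from the analysis file is what separates the two routines.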
Ring uses the phase vocoder to create an all-pass resonator. It works by structuring the FFT resynthesis as a bank of feedback filters that feed back the sinusoid of each bin in a strength proportional to the amplitude of that bin (after adjustment by global feedback controls). This allows the sound to "ring" in a way something like reverb or comb filter resonance. The difference from comb filtering is that with ring, spectral resonance is created not through a collection of comb filters selected for their ability to resonate various pulse wave spectra, but rather through an array of feedback filters (sized by the FFT) that resonate a sine wave spectrum while dynamically tuning their feedback frequencies to the frequencies of the input sound. In short, it creates a kind of "self resonance". Ring is a nice way of increasing the resonant pitch characteristics of a sound, although it has its weaknesses. Ring works best with larger FFT sizes as it is attempting to synthesize or accentuate the more pitched/harmonic characteristics of the sound; this is something larger FFTs, with their increased frequency resolution, handle better. Use of the Kaiser window, with its low sidelobe amplitudes, helps as well. In addition, there is a threshold for preventing the noise features of a sound from being resonated, plus an EQ that can be positioned to filter either the source input to the feedback loop, or the feedback return. Run this routine using the script S.ring.
Ringfilter marries filter with ring by allowing a frequency response to be imposed on the resonance created with ring. Ringfilter begins to look more like multiple-delay, comb filter resonance since the static frequency response selects which frequencies will feed back. What is unique here is that the frequency response can come from an analysis, allowing the input sound to be resonated by the average spectral characteristic of another sound. A synthesized frequency response can be used as well. Like the EQ in ring, the filter in ringfilter can be positioned to filter either the source input to the feedback loop, or the feedback return, where it will have the effect of introducing the filter characteristic more slowly through the resulting variable rates of decay. Run ringfilter with S.ringfilter_with_chord_synthesis to create a synthetic frequency response, and with S.ringfilter_with_analysis for an analyzed frequency response.
Ringtvfilter is to ringfilter what tvfilter is to filter; that is, it makes the filter in ringfilter time-varying. This is a sophisticated idea: time-varying filtering of the resonance of a time-varying sound. The best characterization would be to say that ringtvfilter imprints the shadow of one sound onto the reverb of another. Ringtvfilter requires some thought and finesse in order to separate and articulate the evolutions of the source, resonance, and filter. The best results are created using dynamic, high-profiled source sounds, rich with transient noise, and more constant, pitch/harmonic sounds for the time-varying filter. Like tvfilter, ringtvfilter requires an analysis file. Run this routine using S.ringtvfilter.
NONLINEAR FREQUENCY DEVIATION:
The idea behind filtdeviator is to use a frequency response function not only to filter a sound (as with filter), but also to create a topology of frequency deviation working in correlation with the filter. Consequently, filtdeviator is filter with added parameters for specifying how the filter frequency response function will be mapped into the deviation of frequency. The added parameters set the base and peak deviation for how the response will be mapped into both pitch transposition and frequency shift, and how the function will be warped within the range set by these limits. There is also a master (0-1) deviation control for globally controlling the deviation. All the controls of filtdeviator allow you to dynamically vary the presence and effect of amplitude filtering and frequency deviation, making filtdeviator an interesting routine for exploring the way filters can be used to impede/transform the resonant signature of a sound. Using small amounts of frequency deviation, with no amplitude filtering, and a sweeping transposition of the filter will produce an effect something akin to the commercial guitar phase shifter; larger amounts of deviation take it into another place entirely. Adding the correlated amplitude filtering conceals the deviation more (positioning it more at the edges of formants), producing a sound something like the floppy resonant behavior of slide whistles. The scripts to run filtdeviator, S.filtdeviator_with_chord_synthesis and S.filtdeviator_with_analysis, are designed with frequency response synthesis/analysis sections like those for filter and ringfilter.
Tvfiltdeviator is to filtdeviator what tvfilter is to filter; i.e. it uses a time-varying filter response in place of the constant one. This routine blows the lid off of what was unusual about filtdeviator. It's great for making wacky sounds out of ones with nice, fixed harmonies. The best use is to have a sound deviate itself. Try taking something like a harpsichord or guitar (pitched stuff with decay) and do an analysis of the sound with pvanalysis. Then use the analysis to deviate the same sound. What happens is that the strength of each of the sound's components becomes a control over the frequency deviation of that component, one that causes the sound to go "sproing" whenever it has any amplitude. It makes tonal music sound really broken. Run this routine with tvfiltdeviator.
Envelope is a routine for tracking the amplitude envelope of a sound. Output can be ASCII, floats or a NeXT soundfile. Selecting floats or ASCII will produce a file suitable for use in the control of a parameter. Run this routine with S.envelope.
Centroid is a routine for tracking the centroid of a sound. The centroid is the average of all the frequencies weighted by their amplitudes. It essentially gives you a kind of center frequency value for your spectrum. The analysis can be restricted to a band of frequencies, allowing the centroid to track a particular frequency component (although pitchtracker can do this as well). Selecting floats or ASCII will produce a file suitable for use in the control of a parameter. Run this routine with S.centroid.
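The centroid definition above (the average of the frequencies weighted by their amplitudes) can be written out for a single frame as follows. This is a minimal sketch: the function name and the optional band limits are hypothetical stand-ins, not the routine's actual flags.

```python
def spectral_centroid(freqs, amps, low=None, high=None):
    """Amplitude-weighted mean frequency, optionally restricted to a band."""
    weighted_sum = 0.0
    amp_sum = 0.0
    for f, a in zip(freqs, amps):
        if low is not None and f < low:
            continue
        if high is not None and f > high:
            continue
        weighted_sum += f * a
        amp_sum += a
    # A silent frame has no meaningful centroid; 0.0 is returned as a placeholder.
    return weighted_sum / amp_sum if amp_sum > 0.0 else 0.0


freqs = [100.0, 200.0, 300.0]
amps = [1.0, 1.0, 2.0]
print(spectral_centroid(freqs, amps))             # whole spectrum
print(spectral_centroid(freqs, amps, low=150.0))  # band-limited
```

Restricting the band pulls the centroid toward whatever component dominates inside it, which is how a band-limited centroid can follow a single partial.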
Fluxoid is a routine for tracking the average frequency change of a sound. The average can be weighted (best) or not by the amplitudes. Selecting floats or ASCII will produce a file suitable for use in the control of a parameter. Run this routine with S.fluxoid.
Pitchtracker is a routine for tracking the fundamental pitch trajectory of a sound. It is an experimental routine that works, I believe, but forever has its quirks. Three detection methods are available for following 1) the fundamental of the harmonic collection, 2) the strongest formant, or 3) the band-limited centroid. Different output formats let you see, hear and eventually use the fruits of your pitch tracking. Run this routine with S.pitchtracker.
Reshape is a routine for transforming function streams to meet the needs of different parameters. It takes a headerless float or ASCII function file as input and outputs a headerless stream of float or ASCII values. With the appropriate flags, it can be used to limit, resample, translate, warp, expand, shrink, invert, quantize, and lowpass filter the input values. The output can be translated into different amp or pitch units depending on your needs. Run reshape at the command line. With the output file omitted, reshape instead gives useful statistics on control files.
Below are various terms, parameters, or ways of doing things that are common to many of the routines.
OVERLAP/ADD VS. OSCILLATOR BANK METHODS AND RESYNTHESIS THRESHOLDS:
The phase vocoder resynthesizes the signal using one of two methods, depending on the type of changes made to the FFT. If the changes are only to the magnitudes (amplitudes), then the faster overlap/add method is used. If however changes in frequency are made, then the FFT integrity is compromised, necessitating use of the oscillator bank method in which each bin is synthesized as a sine wave changing in frequency and amplitude. This method is slower, although a resynthesis threshold is available that can be used to increase the computation speed by turning off bins whose amplitude falls below the threshold. A threshold of -60dB is appropriate, although safety warrants using a lower threshold if the spectrum is thin and its decays exposed; use your ear.
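To give a feel for what a resynthesis threshold does, the sketch below converts a decibel threshold to a linear amplitude and lists the bins that would still be synthesized. It assumes amplitudes normalized so that 1.0 corresponds to 0 dB; the function names are hypothetical and this is not the oscillator bank code itself.

```python
def db_to_amplitude(db):
    """Convert decibels to linear amplitude, with 0 dB = 1.0 (assumed scale)."""
    return 10.0 ** (db / 20.0)


def active_bins(amps, threshold_db=-60.0):
    """Indices of bins at or above the threshold; the rest would be skipped."""
    floor = db_to_amplitude(threshold_db)
    return [i for i, a in enumerate(amps) if a >= floor]


frame = [0.5, 0.0005, 0.002, 0.0]
print(db_to_amplitude(-60.0))              # about 0.001
print(active_bins(frame))                  # bins kept at -60 dB
print(active_bins(frame, threshold_db=-96.0))  # a lower threshold keeps more
```

The saving comes from the bins that are skipped: in a sparse spectrum most bins sit below the threshold and need no oscillator at all.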
The source sound is the original input sound. Some routines allow for the mix of the processed sound with the original source sound.
All routines allow both monophonic and multi-channel input files to be processed. With multi-channel files, you can either select one channel and produce a monophonic output file, or process all the channels. Channels are numbered beginning with 1. Processing of multi-channel files is done one channel at a time beginning with channel 1, with zeros written to channels which have yet to be processed. Processing one channel at a time requires less memory and allows you to audition the output sooner than if you did all channels at once.
The input sound file must be a NeXT/Sun format sound file in either 16-bit short or 32-bit floating point format. It may have one or more channels.
The output sound file is written as a NeXT/Sun format sound file in either 16-bit short or 32-bit floating point format, of one or more channels. The channels are processed one at a time beginning with the first channel. The first pass writes zeros in the channels yet to be processed, replacing them when processing proceeds to those channels.
FLOATING-POINT AMPLITUDE RESCALING
Selection of the floating-point, output-file format invokes an amplitude rescaling feature. Once processing is complete, a second pass through the sound file is made to rescale the values to the decibel level specified. A dB rescale level of 1 causes rescaling to the level of the original input file.
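The second pass can be pictured as a simple peak normalization. The sketch below scales a list of float samples so that the peak lands at a given decibel level, assuming 0 dB corresponds to an amplitude of 1.0. The function name is hypothetical, and the special rescale value of 1 (match the input file's level) is deliberately not modeled.

```python
def rescale_to_peak(samples, target_db):
    """Scale samples so the absolute peak sits at target_db (0 dB = 1.0 assumed)."""
    if not samples:
        return []
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        return list(samples)  # silence cannot be rescaled
    gain = 10.0 ** (target_db / 20.0) / peak
    return [s * gain for s in samples]


# Float processing may overshoot 1.0; the rescale pass brings it back in range.
processed = [0.5, -2.0, 1.0]
rescaled = rescale_to_peak(processed, -6.0)
print(max(abs(s) for s in rescaled))  # about 0.501
```

This is why overshoots during float processing are harmless: nothing is lost before the final gain is applied.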
The header of the output soundfile is updated often, so if your peak amplitude has not exceeded the 16-bit limit of the converters, you may play the float or integer output file before processing has completed.
Two flags are provided for controlling the output amplitude statistics; one turns the statistics on or off, and the other sets how often they will be reported. The statistics provide the peak output level in amplitude and decibels. With integer format output files, output values exceeding the normalized peak amplitude of 1. (0 dB) are clipped to a value of 1.0, and the statistics placed in clip mode; in clip mode reports are made only for frames where clipping occurs. The peak amplitude, its time, and the number of clipped samples are reported at the end of processing. With floating-point format output files, output values exceeding the normalized peak amplitude of 1. are not clipped since they will be rescaled in the second pass; output statistics proceed normally throughout. The levels before and after rescaling are reported at the end of processing.
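The integer-output behavior described above amounts to hard clipping with a count of the affected samples. A minimal sketch follows; the function name is hypothetical, and clipping negative overshoots to -1.0 is an assumption, since the text only mentions the value 1.0.

```python
def clip_and_count(samples, limit=1.0):
    """Hard-clip samples to +/- limit and count how many were clipped."""
    clipped = 0
    out = []
    for s in samples:
        if abs(s) > limit:
            clipped += 1
            s = limit if s > 0 else -limit  # symmetric clipping is assumed
        out.append(s)
    return out, clipped


frame = [0.5, 1.5, -1.2, 1.0]
clipped_frame, count = clip_and_count(frame)
print(clipped_frame, count)
```

A sample exactly at the limit is left alone and is not counted, so only genuine overshoots show up in the total.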
FREQUENCY RESPONSE TERMINAL OUTPUT
In many filtering or companding routines, a crude terminal print of the frequen