hetro — Decomposes an input soundfile into component sinusoids.


Heterodyne filter analysis for the Csound adsyn generator.


csound -U hetro [flags] infilename outfilename
hetro [flags] infilename outfilename


hetro takes an input soundfile, decomposes it into component sinusoids, and outputs a description of the components in the form of breakpoint amplitude and frequency tracks. Analysis is conditioned by the control flags below. A space is optional between flag and value.

-s srate -- sampling rate of the audio input file. This overrides the srate in the soundfile header, which otherwise applies. If neither is present, the default is 10000. Note that for adsyn synthesis the srate of the source file and the generating orchestra need not be the same.

-c channel -- channel number sought. The default is 1.

-b begin -- beginning time (in seconds) of the audio segment to be analyzed. The default is 0.0.

-d duration -- duration (in seconds) of the audio segment to be analyzed. The default of 0.0 means to the end of the file. Maximum length is 32.766 seconds.

-f begfreq -- estimated starting frequency of the fundamental, necessary to initialize the filter analysis. The default is 100 (cps).

-h partials -- number of harmonic partials sought in the audio file. The default is 10; the maximum depends on the memory available.

-M maxamp -- maximum amplitude summed across all concurrent tracks. The default is 32767.

-m minamp -- amplitude threshold below which a single pair of amplitude/frequency tracks is considered dormant and will not contribute to output summation. Typical values: 128 (48 db down from full scale), 64 (54 db down), 32 (60 db down), 0 (no thresholding). The default threshold is 64 (54 db down).

-n brkpts -- initial number of analysis breakpoints in each amplitude and frequency track, prior to thresholding (-m) and linear breakpoint consolidation. The initial points are spread evenly over the duration. The default is 256.

-l cutfreq -- substitute a 3rd order Butterworth low-pass filter with cutoff frequency cutfreq (in Hz), in place of the default averaging comb filter. The default is 0 (don't use).
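
The decibel figures quoted for -m follow from the ratio between the threshold and 16-bit full scale. The following is a minimal Python sketch of that arithmetic, assuming full scale is taken as 32768; the function name is illustrative and not part of Csound.

```python
import math

FULL_SCALE = 32768  # assumption: 16-bit full scale used as the 0 dB reference


def db_below_full_scale(minamp):
    """Return how many dB a -m threshold lies below full scale."""
    if minamp <= 0:
        # -m0 disables thresholding, so no dB figure applies
        raise ValueError("minamp must be positive to have a dB value")
    return 20.0 * math.log10(FULL_SCALE / minamp)


for minamp in (128, 64, 32):
    # prints "128 48", "64 54", "32 60"
    print(minamp, round(db_below_full_scale(minamp)))
```

Each halving of the threshold lowers it by about 6 dB, which is why 128, 64 and 32 land on 48, 54 and 60.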


As of Csound 4.08, hetro can write SDIF output files if the output file name ends with ".sdif" or ".SDIF". See the sdif2ad utility for more information about Csound's SDIF support.

File Format

The output file contains time-sequenced amplitude and frequency values for each partial of an additive complex audio source. The information is in the form of breakpoints (time, value, time, value, ....) using 16-bit integers in the range 0 - 32767. Time is given in milliseconds, and frequency in Hertz (cps). The breakpoint data is exclusively non-negative, and the values -1 and -2 uniquely signify the start of new amplitude and frequency tracks. A track is terminated by the value 32767. Before being written out, each track is data-reduced by amplitude thresholding and linear breakpoint consolidation.
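
The breakpoint stream described above can be split into tracks with a short loop. The Python sketch below is illustrative only: it handles just the stream as described in this section, the little-endian byte order is an assumption, and the function names are hypothetical rather than part of Csound.

```python
import struct

AMP_START, FREQ_START, TRACK_END = -1, -2, 32767


def decode_shorts(data, byteorder="<"):
    """Decode raw bytes into signed 16-bit integers (byte order is an assumption)."""
    count = len(data) // 2
    return list(struct.unpack(f"{byteorder}{count}h", data[:count * 2]))


def split_tracks(values):
    """Split a stream of shorts into (kind, [(time_ms, value), ...]) tracks."""
    tracks = []
    i = 0
    while i < len(values):
        marker = values[i]
        if marker not in (AMP_START, FREQ_START):
            raise ValueError(f"expected -1 or -2 at index {i}, got {marker}")
        kind = "amp" if marker == AMP_START else "freq"
        i += 1
        points = []
        # The terminator is only tested at a time position, so a value of
        # 32767 inside a (time, value) pair is not mistaken for the end.
        while i < len(values) and values[i] != TRACK_END:
            if i + 1 >= len(values):
                raise ValueError("track ends in the middle of a (time, value) pair")
            points.append((values[i], values[i + 1]))
            i += 2
        i += 1  # skip the 32767 terminator (a missing final one is tolerated)
        tracks.append((kind, points))
    return tracks


# Made-up data: one amplitude track and one frequency track, two breakpoints each
stream = [-1, 0, 100, 500, 2000, 32767, -2, 0, 440, 500, 445, 32767]
raw = struct.pack("<12h", *stream)
print(split_tracks(decode_shorts(raw)))
# [('amp', [(0, 100), (500, 2000)]), ('freq', [(0, 440), (500, 445)])]
```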

A component partial is defined by two breakpoint sets: an amplitude set, and a frequency set. Within a composite file these sets may appear in any order (amplitude, frequency, amplitude ....; or amplitude, amplitude..., then frequency, frequency,...). During adsyn resynthesis the sets are automatically paired (amplitude, frequency) from the order in which they were found. There should be an equal number of each.

A legal adsyn control file could have the following format:

-1 time1 value1 ... timeK valueK 32767 ; amplitude breakpoints for partial 1
-2 time1 value1 ... timeL valueL 32767 ; frequency breakpoints for partial 1
-1 time1 value1 ... timeM valueM 32767 ; amplitude breakpoints for partial 2
-2 time1 value1 ... timeN valueN 32767 ; frequency breakpoints for partial 2
-2 time1 value1 ..........
-2 time1 value1 ..........             ; pairable tracks for partials 3 and 4
-1 time1 value1 ..........
-1 time1 value1 ..........
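
One reading of the pairing rule is that the i-th amplitude set found is matched with the i-th frequency set found, regardless of how the two kinds are interleaved. The Python sketch below illustrates that reading on a layout shaped like the listing above; the function name and the breakpoint values are made up for the example, and this is not the adsyn source code.

```python
def pair_tracks(tracks):
    """Pair the i-th amplitude set with the i-th frequency set, in file order."""
    amps = [points for kind, points in tracks if kind == "amp"]
    freqs = [points for kind, points in tracks if kind == "freq"]
    if len(amps) != len(freqs):
        raise ValueError(
            f"{len(amps)} amplitude sets but {len(freqs)} frequency sets"
        )
    return list(zip(amps, freqs))


# Placeholder breakpoints, one (time_ms, value) point per set
layout = [
    ("amp",  [(0, 900)]),   # partial 1
    ("freq", [(0, 100)]),
    ("amp",  [(0, 700)]),   # partial 2
    ("freq", [(0, 200)]),
    ("freq", [(0, 300)]),   # partials 3 and 4: frequency sets first...
    ("freq", [(0, 400)]),
    ("amp",  [(0, 500)]),   # ...then their amplitude sets
    ("amp",  [(0, 300)]),
]

for number, (amp, freq) in enumerate(pair_tracks(layout), start=1):
    print(number, amp, freq)
```

Under this reading, partial 3 receives the amplitude set `[(0, 500)]` and the frequency set `[(0, 300)]`, even though its frequency set appears earlier in the file.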


hetro -s44100 -b.5 -d2.5 -h16 -M24000 audiofile.test adsynfile7

This will analyze 2.5 seconds of channel 1 of a file "audiofile.test", recorded at 44.1 kHz, beginning .5 seconds from the start, and place the result in a file "adsynfile7". We request just the first 16 harmonics of the sound, with 256 initial breakpoint values per amplitude or frequency track, and a peak summation amplitude of 24000. The fundamental is estimated to begin at 100 Hz. Amplitude thresholding is at 54 db down.

The Butterworth LPF is not enabled.
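
When hetro is driven from a script, the same command line can be assembled programmatically. The Python sketch below builds the argument list for the example above; `build_hetro_args` is a hypothetical helper, it accepts only the flags documented on this page, and it relies on the space between flag and value being optional.

```python
import shlex

# Flags documented on this page
KNOWN_FLAGS = set("scbdfhMmnl")


def build_hetro_args(infile, outfile, **flags):
    """Build a hetro argument list from single-letter flag keywords."""
    args = ["hetro"]
    for flag, value in flags.items():
        if flag not in KNOWN_FLAGS:
            raise ValueError(f"unknown hetro flag: -{flag}")
        args.append(f"-{flag}{value}")  # no space between flag and value
    args += [infile, outfile]
    return args


args = build_hetro_args(
    "audiofile.test", "adsynfile7", s=44100, b=0.5, d=2.5, h=16, M=24000
)
print(shlex.join(args))
# hetro -s44100 -b0.5 -d2.5 -h16 -M24000 audiofile.test adsynfile7
```

The resulting list could be passed to `subprocess.run(args, check=True)`, assuming a hetro executable is available on the search path.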

Here is an example of the hetro utility. It uses the file hetro.csd.

Example 1354. Example of the hetro utility.

See the sections Real-time Audio and Command Line Flags for more information on using command line flags.

<CsoundSynthesizer>
<CsOptions>
; Select audio/midi flags here according to platform
-odac   -m0 --limiter=.95 ;;;realtime audio out, with limiter protection
; For non-realtime output leave only the line below:
; -o hetro.wav -W ;;; for file output any platform
</CsOptions>
<CsInstruments>

sr = 44100
ksmps = 32
nchnls = 2
0dbfs  = 1

; by Menno Knevel 2021

gilen  filelen "fox.wav"	    ; get length of soundfile

; analyze sound file and output result to 3 hetro files
ires1 system_i 1,{{ hetro fox.wav fox1.het }}           ; default settings
ires2 system_i 1,{{ hetro -f250 fox.wav fox2.het }}     ; high starting frequency
ires3 system_i 1,{{ hetro -f100 -h180 fox.wav fox3.het }}; up to 18kHz!

instr 1 ; untreated signal
asig    diskin2   "fox.wav", 1
prints "---*duration of soundfile is %f seconds*---\\n",gilen
outs    asig, asig
endin

instr 2 ; adsyn resynthesis from the analysis file named in p4
asig      adsyn     1, 1, 1, p4
outs asig, asig
endin

</CsInstruments>
<CsScore>
i1 0 2.76               ; original sample

i2 5 2.76  "fox1.het"	; whole sentence
i2 10 2.76 "fox2.het"	; whole sentence, but analyzed with different settings
i2 15 2.76 "fox3.het"	; whole sentence, and again analyzed with different settings
</CsScore>
</CsoundSynthesizer>


Author: Tom Sullivan
Author: John ffitch
Author: Richard Dobson

October 2002. Thanks to Rasmus Ekman, added a note about the SDIF format.