Virtual Singer
Voice technical background
Generalities
The human voice is amazingly complex, and the Virtual Singer software
does not claim to replace it, only to approximate it as closely as
possible.
Here we describe the basic concepts needed to better understand how
Virtual Singer works.
A human voice can be characterized as follows:
- The timbre (the voice "fingerprint"), which differentiates one
person from another because it depends on each person's vocal tract.
- The effects, related to the singing technique.
Both can be adjusted in Virtual Singer to approximate a given voice as
closely as possible.
Voice
Singing follows the same rules as speaking: the same fundamental
principles apply to both.
The lungs generate an air stream, which passes through the vocal cords.
The vocal cords (also called vocal folds) are twin infoldings of mucous
membrane, positioned at the base of the larynx, which act as a vibrator
or "reed".
The vibration frequency is controlled by the singer in order to produce
the required note pitch.
This original sound is then shaped by the set of cavities that form
the vocal tract (mouth, nasal cavities...).
The singer controls the opening and capacity of these cavities to
produce resonances and, in doing so, modifies the sound emitted by the
vocal cords.
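The chain described above (a vibrating source whose pitch the singer
sets, followed by resonating cavities) can be illustrated with a small
Python sketch. It is only an illustration under stated assumptions, not
a description of how Virtual Singer is implemented: the sample rate, the
impulse-train source, the two-pole resonators, the function names and
the resonance values are all assumptions chosen for the example.

```python
import math

SAMPLE_RATE = 16000  # samples per second (assumed for this sketch)


def midi_to_hz(note):
    """Equal-tempered pitch: MIDI note 69 is A4 = 440 Hz."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def glottal_source(freq_hz, duration_s):
    """Crude stand-in for the vocal cords: one impulse per vibration period."""
    n = int(SAMPLE_RATE * duration_s)
    period = SAMPLE_RATE / freq_hz  # samples between two impulses
    out = [0.0] * n
    next_pulse = 0.0
    for i in range(n):
        if i >= next_pulse:
            out[i] = 1.0
            next_pulse += period
    return out


def resonator(signal, center_hz, bandwidth_hz):
    """Two-pole resonant filter standing in for one cavity of the vocal tract."""
    r = math.exp(-math.pi * bandwidth_hz / SAMPLE_RATE)
    a1 = 2.0 * r * math.cos(2.0 * math.pi * center_hz / SAMPLE_RATE)
    a2 = -r * r
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def sing_vowel(note, duration_s, resonances):
    """Pass the source through each resonance in turn, then normalize."""
    shaped = glottal_source(midi_to_hz(note), duration_s)
    for center_hz, bandwidth_hz in resonances:
        shaped = resonator(shaped, center_hz, bandwidth_hz)
    peak = max(abs(s) for s in shaped) or 1.0
    return [s / peak for s in shaped]


# Illustrative (center, bandwidth) pairs in Hz, roughly in the range of an
# open "ah" vowel. These numbers are assumptions, not measured data.
AH_RESONANCES = [(700.0, 80.0), (1100.0, 90.0), (2450.0, 120.0)]

samples = sing_vowel(69, 0.25, AH_RESONANCES)
```

Changing the note changes only the source, and changing the resonance
list changes only the "cavities", which mirrors the separation between
pitch and vocal tract described in the text.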
Speech and Language
Speech is an acoustic means of communication. It is a convention
shared by people speaking the same language.
Each language has its own characteristics, and uses a limited number
of sounds (about thirty) called "phonemes". These phonemes are then
grouped into syllables, words and sentences.
Some phonemes are common to several languages, because many spoken
languages share a common origin. In addition, the range of possible
phonemes is limited by the physical constraints of the vocal tract.
Phonemes
We won't be using the standard acoustic classification
of phonemes used by phoneticians. For a more in-depth discussion,
see one of the various specialized texts on the subject.
Here are the basic groups of phonemes as used in
Virtual Singer:
- vowels use the vocal cords, contain very little noise, and can be
stretched ad lib. They are the essential component of the sung voice.
Some languages (like English) use vowel groups called diphthongs, which
"slide" from one vowel sound to another (as in "pie", "though"...).
- voiced consonants are consonants which use the vocal cords. They
are stretchable (Z => Zzzzz). In addition to the vocal cords, these
consonants use either the resonances of the nasal cavities (M, N...)
or a sound generated by the air stream (Z, J, V...).
- unvoiced consonants are stretchable and use only the turbulence
generated by the air stream, not the vocal cords. These consonants
have no pitch (CH, F, S...).
- plosive consonants are brief, unstretchable sounds, voiced
(G, D, B...) or not (K, T, P...).
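The four groups above can be summarized as a small lookup table. The
Python sketch below is illustrative only: the symbols are the examples
quoted in the list, while the names `Group`, `group_of`,
`is_stretchable` and `uses_vocal_cords` are hypothetical and do not come
from Virtual Singer. Treating every unlisted symbol as a vowel is also
an assumption made to keep the example short.

```python
from enum import Enum


class Group(Enum):
    VOWEL = "vowel"
    VOICED = "voiced consonant"
    UNVOICED = "unvoiced consonant"
    PLOSIVE = "plosive consonant"


# Example symbols quoted in the list above. This table is illustrative,
# not Virtual Singer's actual phoneme inventory.
CONSONANT_GROUPS = {
    "M": Group.VOICED, "N": Group.VOICED, "Z": Group.VOICED,
    "J": Group.VOICED, "V": Group.VOICED,
    "CH": Group.UNVOICED, "F": Group.UNVOICED, "S": Group.UNVOICED,
    "G": Group.PLOSIVE, "D": Group.PLOSIVE, "B": Group.PLOSIVE,
    "K": Group.PLOSIVE, "T": Group.PLOSIVE, "P": Group.PLOSIVE,
}
VOICED_PLOSIVES = {"G", "D", "B"}


def group_of(symbol):
    """Anything not listed as a consonant is treated as a vowel (assumption)."""
    return CONSONANT_GROUPS.get(symbol.upper(), Group.VOWEL)


def is_stretchable(symbol):
    """Every group except the plosives can be held over time."""
    return group_of(symbol) is not Group.PLOSIVE


def uses_vocal_cords(symbol):
    """Vowels and voiced consonants do; plosives depend on the phoneme."""
    group = group_of(symbol)
    if group is Group.PLOSIVE:
        return symbol.upper() in VOICED_PLOSIVES
    return group in (Group.VOWEL, Group.VOICED)
```

For instance, `is_stretchable("S")` is true while `is_stretchable("T")`
is false, and `uses_vocal_cords("B")` is true while
`uses_vocal_cords("P")` is false.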
Phoneme pronunciation
Question: What is the difference between singing and speaking?
In speech, the frequency (note pitch) produced by the vocal cords
varies only a little. This variation lets the speaker convey the
intonation (prosody) of the sentence. In singing, the frequency
produced by the vocal cords follows a melody and is no longer related
to the intonation.
The main characteristic of the sung voice is the stretching of some
phonemes over time. Since some syllables must be held longer than
others, the singer stretches the phonemes that extend most easily and
musically, i.e. the vowels, whose sound is closest to that of a musical
instrument.
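The stretching described above can be sketched as a simple rule:
consonants keep a short nominal length and the vowels absorb whatever
remains of the note. The Python example below is a minimal sketch of
that rule, not Virtual Singer's actual timing algorithm. The function
name `spread_syllable`, the group labels and the millisecond values in
`FIXED_MS` are assumptions made for the illustration.

```python
# Nominal lengths in milliseconds for non-vowel phonemes (assumed values).
FIXED_MS = {"voiced": 70, "unvoiced": 70, "plosive": 40}


def spread_syllable(phonemes, note_ms):
    """Split note_ms across (symbol, group) pairs; vowels absorb the slack."""
    vowels = [i for i, (_, group) in enumerate(phonemes) if group == "vowel"]
    if not vowels:
        raise ValueError("this sketch needs at least one vowel to stretch")
    fixed_total = sum(FIXED_MS[g] for _, g in phonemes if g != "vowel")
    # A note too short for its consonants leaves the vowels at zero length.
    slack = max(note_ms - fixed_total, 0)
    share, extra = divmod(slack, len(vowels))
    durations = []
    for i, (symbol, group) in enumerate(phonemes):
        if group == "vowel":
            # Earlier vowels take the leftover milliseconds, one each.
            bonus = 1 if vowels.index(i) < extra else 0
            durations.append((symbol, share + bonus))
        else:
            durations.append((symbol, FIXED_MS[group]))
    return durations


syllable = [("M", "voiced"), ("A", "vowel"), ("T", "plosive")]
short_note = spread_syllable(syllable, 300)
long_note = spread_syllable(syllable, 1200)
```

With these assumed values, the consonants take the same 110 ms in both
cases, so the vowel lasts 190 ms on the 300 ms note and 1090 ms on the
1200 ms note: only the vowel grows with the note.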