PDFtoMusic Pro

RealSinger

Factors to be considered when recording for Real Singer

Noise factors

Environmental noise

If a passing noise interferes with your recording, simply re-record it. If you have companions, suggest that they go elsewhere, since their small movements may escape your attention but still be heard in the recording. Yes, your friend sitting behind you is giggling while you are trying to record...

A.C. hum

Depending on your country, alternating current (AC) hum has a fundamental frequency of 60 Hz or 50 Hz, with overtones throughout the vocal range, particularly at the third harmonic: 180 Hz (150 Hz where the mains frequency is 50 Hz). At right is the spectrum of AC hum from a noisy setup. It is very important to reduce AC hum, since it is difficult to remove this noise without distorting your voice.
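One way to check a test recording for hum is to measure the signal power at the mains frequency and its first few harmonics. The following is a minimal Python sketch, not part of Real Singer. It assumes the recording has already been loaded as a list of floating-point samples, and the names `goertzel_power` and `hum_profile` are illustrative only. It uses the Goertzel algorithm, which evaluates a single frequency bin of the discrete Fourier transform.

```python
import math

def goertzel_power(samples, sample_rate, freq):
    """Return the normalised power of the DFT bin nearest to `freq` (Hz)."""
    n = len(samples)
    if n == 0:
        return 0.0
    k = round(n * freq / sample_rate)        # nearest DFT bin index
    w = 2.0 * math.pi * k / n
    coeff = 2.0 * math.cos(w)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2     # Goertzel recurrence
        s_prev2, s_prev = s_prev, s
    # Squared magnitude of the bin, divided by n^2 so that a sine of
    # amplitude A lying exactly on the bin gives A^2 / 4.
    power = s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2
    return power / (n * n)

def hum_profile(samples, sample_rate, mains_hz=60.0, harmonics=5):
    """Map each harmonic frequency of the mains to its measured power."""
    return {
        h * mains_hz: goertzel_power(samples, sample_rate, h * mains_hz)
        for h in range(1, harmonics + 1)
    }
```

Run this on a few seconds of "silence" recorded with your normal setup, passing `mains_hz=50.0` where the mains frequency is 50 Hz. If the values at the fundamental and the third harmonic drop after you change the wiring, the change has helped.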

If you use a laptop computer, the simplest way to reduce AC hum is to use battery power, and have no peripheral devices connected. If you do use peripherals, the cables should be disconnected at the computer when power is off, not left dangling from the computer.

When a preamplifier (or tape deck) and a computer are connected, and both of them use AC power, the amount of hum depends on how the power cords are plugged in. The loudest hum is produced when the two devices are plugged into different wall sockets. If one device has an AC auxiliary outlet, plug the second device into it, rather than into the wall. Or, plug both devices into a single extension cord. If the power plugs are not polarized (that is, if they can be inserted into the outlet with prongs reversed), try reversing the prongs.

Some microphones will pick up a lot of AC hum when you touch them. If that happens, mount the mike on an insulating stand, instead of holding it. If you do not have a mike stand, try taping the mike to a wooden stick, held vertical by taping it to the back of a chair. Pay careful attention to this. Just because a microphone can be held in the hand does not mean that it should be. If you use a headset microphone, see if AC hum is reduced when you remove the headset.

Be sure that your microphone cable does not run near any power cords. There may be power lines underneath your floor, so try moving the microphone cable. The same applies to the cord between computer and preamp or tape deck, if you are using one. It is especially important to stay away from motor-driven devices, including ceiling fans.

Machine sounds

If you see a noise spectrum like the one above, but the fundamental frequency of your AC is not the first peak, then the source of noise is probably a motor-driven appliance. Machine sounds are common. You have learned to ignore your refrigerator, heating, ventilation, computer fans, and ticking clocks. But if they are present, they will be included in your voice recording. Consider turning off machines - but don't forget to turn them back on again! If you have a lot of noise distributed evenly across the spectrum, it may be caused by air rushing through a ventilation system.

System noise

Some noise is caused by the electrical properties of your system. If this noise is small, Real Singer can analyze it and reduce its effect. But if the system noise is too large, you will have to try a different recording method.

If your computer's sound card is poor, it will detect electrical noise from the surrounding circuitry and include it in your recording. This is especially true if you are using a microphone connected directly to the computer's microphone input. If you have eliminated all other possible noise sources and still have too much unexplained noise, this may be the culprit. Try recording your voice to a tape deck, or using a pre-amplifier, so that you can feed the preamp line-out to your computer's sound card line-in, instead of to the microphone jack. Remember that external recording equipment usually requires a different kind of microphone than the kind used directly by computers.

If you are using a tape deck, it is better to use high-bias or metal tapes and noise reduction. Do not use automatic gain control. Do not use a microphone "built in" to the recorder.

Sound quality factors

Equalization

The human voice contains important frequency components across a broad range. The fundamental pitch of sung notes is below 500 Hz (even lower for the male voice), with important overtones at higher frequencies. The range around 2-6 kHz contains frequencies that add color and definition to the voice, especially during some consonants and transitions.

Be sure that your microphone has a smooth frequency response across this spectrum. If the mike is normally sensitive only to low frequencies, but has an artificial boost for the highest frequencies, then your recorded voice will sound too bright. Some computer mikes intended for speech recognition (conversion of words to text) may have such an artificial frequency response. But as long as the mike responds adequately across the vocal range, it is not necessary to have a level (flat) frequency response, since Real Singer includes an equalizer.

Saturation and clipping

Saturation and clipping occur when an input signal is too large. This can occur at the microphone, or at any stage of signal processing.

If your voice is too loud, the microphone will distort the sound, even if the electrical output from the mike is within the acceptable range. Computer microphones often have a low dynamic range, meaning that there is not much difference between the softest sounds they can detect above the noise, and the loudest sounds they can accept without distortion. When recording to Real Singer, it is important to keep your voice at uniform loudness. This is especially true if you are using a computer microphone.

Professional audio microphones have a much greater range of loudness that they can accept without distortion. But the range of electrical signals produced is also large. This kind of microphone is used with a preamplifier (or tape deck, acting as preamp). Be sure to pay attention to the VU or other signal amplitude meter. It is OK to briefly exceed a limit if the sound is in an unimportant part of a word, far from the phoneme that you are trying to validate.

Do not use automatic gain control (AGC) for recording to Real Singer. The distortions introduced by AGC are likely to be greater than the distortions removed. It is better to move away from the microphone, or manually adjust volume controls. Portable tape recorders, and office-style voice recorders, usually use AGC. Avoid using these devices, if you can.

If you are transferring a signal into your computer from a preamp or tape deck, be sure to use the correct jacks. Never take a signal from a jack intended to directly drive loudspeakers. The best connection is line-out to line-in.

If you are using an audio editor to apply digital filters to a pre-recorded waveform, be sure that the filter does not clip your sound.
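The check can be illustrated in a few lines of Python. This is a sketch under stated assumptions: samples are floating-point values where 1.0 is full scale, and the helper names `peak_headroom_db` and `safe_gain` are hypothetical, not functions of any particular audio editor.

```python
import math

def peak_headroom_db(samples, full_scale=1.0):
    """Headroom in dB between the largest sample and full scale.

    A negative result means the signal exceeds full scale and would
    be clipped when exported.
    """
    peak = max((abs(x) for x in samples), default=0.0)
    if peak == 0.0:
        return float("inf")
    return 20.0 * math.log10(full_scale / peak)

def safe_gain(samples, gain, full_scale=1.0):
    """Apply `gain`, reducing it if needed so no sample exceeds full scale."""
    peak = max((abs(x) for x in samples), default=0.0)
    if peak * gain > full_scale:
        gain = full_scale / peak      # largest gain that does not clip
    return [x * gain for x in samples]
```

Measuring the headroom after each filter pass shows whether the filter has pushed any part of the waveform beyond the limit.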

Special problems

Difficult sounds

Some consonants are difficult to record, because they are soft and create a lot of breath wind. In English, these are f, h, s, and th (thin). You will need to place your mouth close to the microphone, but not allow the breath wind to touch it. It helps to feel the air stream coming from your mouth when you make these sounds, to ensure that the mike is correctly placed.

Some other consonants are difficult to record, because they are abrupt. In English, these are b, d, hard g (go), k, and p. These sounds have a moment of high intensity that quickly tapers to a short sound. If spoken too loudly, the intense part will saturate or clip. If spoken too softly, the tapered part will not be detected. Or, if you naturally speak these consonants softly, Real Singer may decide that your voice is "too loud" based on the part of the recorded word leading up to the consonant. Resist the temptation to speak these in an unnatural manner, to "help" Real Singer find them. If you do that, Real Singer will find an unnatural sound!

If you are having difficulty producing a satisfactory recording of these consonants, or if you would generally like to change what Real Singer hears from you, then pre-record your voice and use an audio editor. You can reduce the amplitude of an unnecessary part of a word that is "too loud," so that a necessary, softer part can be accepted. But it is usually not advisable to edit the volume in the portion of sound that contains the desired phoneme, because that will interfere with noise-removal processing.
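In an audio editor this is normally done by selecting the loud region and lowering its volume. The Python sketch below shows the same operation on a list of samples. The function name `attenuate_segment` and its parameters are illustrative assumptions, and the optional linear ramp at each edge is one simple way to avoid an audible step.

```python
def attenuate_segment(samples, start, end, gain, fade=0):
    """Return a copy of `samples` with samples[start:end] scaled by `gain`.

    If `fade` > 0, the gain is ramped linearly over `fade` samples at
    each edge of the segment instead of changing abruptly.
    """
    out = list(samples)
    for i in range(start, end):
        edge_distance = min(i - start, end - 1 - i)
        if fade > 0 and edge_distance < fade:
            t = (edge_distance + 1) / (fade + 1)   # 0 < t < 1 inside the ramp
            g = 1.0 + (gain - 1.0) * t
        else:
            g = gain
        out[i] = samples[i] * g
    return out
```

Choose `start` and `end` so that the segment stays clear of the phoneme you want Real Singer to use, for the reason given above.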

Using an audio editor

An audio editor is a program that will open an audio file, change its contents, and export the result to a new audio file. One such program is the free Audacity (Windows or Macintosh) available from sourceforge.net. In addition to opening and exporting WAV files, it can open and export Vorbis OGG files. These files can be used by Real Singer in place of a live voice.

With an audio editor, you can: (1) Import a lengthy recording of several words, and slice it into individual words. (2) Adjust the volume or equalization. (3) Inspect sounds for the presence of sudden noise events. (4) Apply special effects (not recommended for Real Singer).
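Slicing a long recording into words is usually done by hand, but the idea can be sketched in Python as a search for quiet gaps. This is only an illustration: the function name `split_on_silence`, the amplitude threshold, and the minimum gap length are assumptions that would need tuning for a real recording and its noise level.

```python
def split_on_silence(samples, sample_rate, threshold=0.02, min_silence_s=0.2):
    """Return (start, end) index pairs for the loud stretches of a recording.

    Two loud samples belong to different stretches when they are more
    than `min_silence_s` seconds apart.
    """
    min_gap = round(min_silence_s * sample_rate)
    segments = []
    start = None          # first loud sample of the current stretch
    last_loud = None      # most recent loud sample seen
    for i, x in enumerate(samples):
        if abs(x) >= threshold:
            if start is None:
                start = i
            elif i - last_loud > min_gap:
                segments.append((start, last_loud + 1))
                start = i
            last_loud = i
    if start is not None:
        segments.append((start, last_loud + 1))
    return segments
```

Each returned pair can be used as a slice, `samples[start:end]`, and exported as a separate file. Soft consonants at the edge of a word may fall below the threshold, so check the cut points by ear.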

With an audio editor, you can help find sources of noise by looking at noise amplitude and spectra. The most valuable use is to inspect the recording waveforms for the presence of saturation and clipping. For this reason, it is a good idea to pre-test your method of recording, inspect its results with an audio editor, and make any necessary changes to your setup. Then Real Singer will have good quality sound to use for your voice.

Saturation occurs when an increase in sound power produces less than the proportional increase in recorded signal power. Saturation is often desirable; it is certainly better than clipping. But in Real Singer, it is better to avoid saturation, because the recorded tone will be used in soft and loud passages. If you look at sample recordings of your voice with an audio editor, and see that the recorded amplitude is always about the same during both loud and soft parts of your speech, then you may have saturation. (Or, you may be a master at keeping your voice at an even level!) Try recording at lower volume, or move the microphone slightly farther from your mouth. Be sure that automatic gain control is not in use. Avoid saturation in, or near, any part of the word that will be used for the phoneme.
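The visual check described above can be approximated numerically by comparing the level of short windows of the recording. The following Python sketch assumes floating-point samples. The names `window_rms` and `envelope_spread_db` and the `floor` value used to skip silent windows are illustrative choices, not something Real Singer defines.

```python
import math

def window_rms(samples, window):
    """RMS level of each consecutive, non-overlapping window."""
    return [
        math.sqrt(sum(x * x for x in samples[i:i + window]) / window)
        for i in range(0, len(samples) - window + 1, window)
    ]

def envelope_spread_db(samples, window, floor=0.01):
    """Difference in dB between the loudest and softest non-silent windows."""
    levels = [r for r in window_rms(samples, window) if r > floor]
    if len(levels) < 2:
        return 0.0
    return 20.0 * math.log10(max(levels) / min(levels))
```

A small spread between passages you spoke loudly and passages you spoke softly is a hint of saturation or automatic gain control, not proof: as noted above, it may simply mean that your voice was very even.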

At right are some images (at reduced size) from an audio editor. The top image shows a waveform that has been properly recorded. Even though Virtual Singer plays a sample word with very uniform amplitude, the live human voice varies in amplitude. These irregularities can be seen in the envelope of the waveform.

The second image is the same sound, recorded with saturation. Notice how the irregularities of the envelope have been smoothed. Examining the spectra would show that certain frequencies are more prominent in the waveform with saturation than in the unsaturated wave.

The third image shows clipping, in this case caused by too large an electrical signal at the sound card input. Notice how the envelope has been flattened (flattening may be symmetrical or asymmetrical).

The fourth image also shows clipping, even though the recorded waveform has lower amplitude than before. In this case, the clipping occurred at the microphone, because the sound was too loud. The electrical signal was reduced by the sound card volume control. However, once a wave is clipped, it cannot be un-clipped.
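Because clipping that occurred before a volume control leaves flat tops below full scale, a simple test against the maximum sample value is not enough. The Python sketch below looks instead for runs of consecutive samples that sit at the recording's own peak level. The function name `clipped_runs`, the run length, and the tolerance are assumptions chosen for illustration.

```python
def clipped_runs(samples, min_run=3, tolerance=1e-3):
    """Return (start, end) index pairs where the waveform is flat at its peak.

    A run is `min_run` or more consecutive samples whose absolute value
    lies within `tolerance` (as a fraction) of the largest absolute value
    in the recording, whatever that value is.
    """
    peak = max((abs(x) for x in samples), default=0.0)
    runs = []
    if peak == 0.0:
        return runs
    run_start = None
    for i, x in enumerate(samples):
        at_peak = abs(x) >= peak * (1.0 - tolerance)
        if at_peak and run_start is None:
            run_start = i
        elif not at_peak and run_start is not None:
            if i - run_start >= min_run:
                runs.append((run_start, i))
            run_start = None
    # Close a run that reaches the end of the recording
    if run_start is not None and len(samples) - run_start >= min_run:
        runs.append((run_start, len(samples)))
    return runs
```

A properly recorded voice touches its peak only for isolated samples, so an empty result is expected. Several runs in a short recording suggest clipping somewhere in the chain, even when the overall amplitude looks modest.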

At left is a composite image of two spectra, for the same word recorded by two different microphones. Areas of concern are marked with an asterisk. One of the microphones (purple spectrum) shows excessive response in the second overtone (third harmonic), which is one characteristic of saturation. Also, that microphone shows excessive response in the high frequency range - probably due to artificial enhancement - which makes the sound bright and harsh. This microphone was intended for computer speech recognition.

The other microphone was a pre-amplified dynamic type, normally used for audio recording. It had a more satisfactory sound quality (green spectrum).