Yamaha Pocket Miku User Manual Download

with the same fundamental idea. This is the idea
that by analyzing the structure and tone quality of
the human voice, we can then attempt to simulate
it. As a representative example, let’s look at
“formant synthesis.” Formants are the spectral
peaks of the sound spectrum (the distribution of
the volume of each frequency band) of the voice.
The idea is that you can simulate human
pronunciation (the vocal cords and the movement
of the mouth) by supplying these peak
movements to a basic sound source.
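
To make the idea concrete, here is a minimal sketch of formant synthesis in Python. It is an illustration only and not how Pocket Miku or any Yamaha product is implemented. All function names are hypothetical, and the three formant frequencies and bandwidths are approximate textbook-style values for an "ah" vowel chosen as an assumption for this example. A pulse train stands in for the basic sound source, and each formant peak is supplied by a simple two-pole resonator.

```python
import math

def pulse_train(pitch_hz, duration_s, sample_rate):
    """Basic sound source: one click per pitch period, like vocal cord pulses."""
    period = int(round(sample_rate / pitch_hz))
    count = int(duration_s * sample_rate)
    return [1.0 if i % period == 0 else 0.0 for i in range(count)]

def resonator(signal, freq_hz, bandwidth_hz, sample_rate):
    """Two-pole filter that boosts energy around freq_hz (one formant peak)."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius < 1, so stable
    theta = 2.0 * math.pi * freq_hz / sample_rate
    a1 = 2.0 * r * math.cos(theta)
    a2 = -r * r
    gain = 1.0 - r  # rough level normalisation, an assumption of this sketch
    out = []
    y1 = y2 = 0.0
    for x in signal:
        y = gain * x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def synthesize_vowel(formants, pitch_hz=110.0, duration_s=0.25, sample_rate=16000):
    """Feed the source through one resonator per formant and sum the results."""
    source = pulse_train(pitch_hz, duration_s, sample_rate)
    voiced = [0.0] * len(source)
    for freq_hz, bandwidth_hz in formants:
        band = resonator(source, freq_hz, bandwidth_hz, sample_rate)
        voiced = [v + b for v, b in zip(voiced, band)]
    return voiced

def level_at(signal, freq_hz, sample_rate):
    """Magnitude of a single frequency component (a one-bin DFT)."""
    step = 2.0 * math.pi * freq_hz / sample_rate
    re = sum(x * math.cos(step * i) for i, x in enumerate(signal))
    im = sum(x * math.sin(step * i) for i, x in enumerate(signal))
    return math.hypot(re, im) / len(signal)

# Assumed (frequency, bandwidth) pairs in Hz for an "ah"-like vowel.
VOWEL_A = [(730.0, 90.0), (1090.0, 110.0), (2440.0, 170.0)]
samples = synthesize_vowel(VOWEL_A)
```

Calling `level_at(samples, freq_hz, 16000)` at a harmonic near the first formant and again at a harmonic far from every formant should show the spectral peak the text describes. Changing the values in `VOWEL_A` over time would correspond to the "peak movements" of the mouth.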

“Concatenative synthesis” is another method that
spread quickly as the cost of digital technology
fell. This method links fragments of recorded
(sampled) voices to synthesize vocals. Vocaloid’s
system is basically a type of concatenative
synthesis tuned to produce more musical results.
It achieves this by connecting vocal fragments in
the same way while also adjusting each frequency
band.
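
The linking step can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Vocaloid's actual algorithm: the function name `crossfade_concat` and the `overlap` parameter are hypothetical, the fragments are plain lists of samples, and the per-band frequency adjustments mentioned above are left out. The sketch only shows how recorded fragments can be joined with a short crossfade so that the seams do not click.

```python
def crossfade_concat(fragments, overlap):
    """Join sample lists end to end, blending `overlap` samples at each seam."""
    if not fragments:
        return []
    out = list(fragments[0])
    for frag in fragments[1:]:
        # Never blend more samples than either side actually has.
        n = min(overlap, len(out), len(frag))
        start = len(out) - n
        for i in range(n):
            w = (i + 1) / (n + 1)  # weight ramps from the old fragment to the new one
            out[start + i] = (1.0 - w) * out[start + i] + w * frag[i]
        out.extend(frag[n:])
    return out

# Two stand-in "recordings": a constant 1.0 fragment and a constant 0.0 fragment.
joined = crossfade_concat([[1.0] * 10, [0.0] * 10], overlap=4)
```

With these inputs the result is 16 samples long, and the four samples at the seam step down gradually from the first fragment's level to the second's instead of jumping.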

Formant synthesis and the robot voice

An example of a device that is closer to the concept
of “formant synthesis” is the “Vocoder,” which is
familiar to many in the music world. The idea for
this device was originally formed in the late 1920s
at Bell Labs. At the time, it was used as a voice
compression technology for sending a clear voice
transmission through a telegraph cable’s small
bandwidth. Because the technology of the day made
it difficult to reduce costs, and because this period
encompassed World War II, it was used mainly for
military communication. Production costs fell as
semiconductor technology advanced in the late
1960s, and instruments and effect processors that
gave the human singing voice a robot-like effect
grew popular. Vocoder technology as a means of
voice compression was later used to improve voice
clarity in cellphones. This technology is still being
developed today.

Similarly, a type of effects processor called a talk
box, which uses the structure of the human mouth
itself as a physical filter, has become very popular
in musical genres such as rock and funk. These
devices, however, are simply effects processors
that process sound by using the movements of the
human mouth. They don’t quite belong in the
same category of “vocal synthesizers” as Vocaloid
does, because they do not generate singing voices
on their own.

The birth of Vocaloid

Vocal synthesizers had been sold as instruments
before, starting with the Yamaha PLG100-SG in 1997,
a plug-in board that added a formant singing sound
source to a desktop music sound module. However,
in 2000, a project called “Daisy,” which paid homage
to “Daisy Bell,” started. In 2003, the project team
released sound generating software called
“Vocaloid” and everything changed. They adopted a
unique concatenative synthesis system created by
breaking down data of recorded voices into
fragments (phonemes), then adjusting and editing
these fragments to compile a database. In this way,
they were able to achieve smooth vocal synthesis.
Vocaloid was praised for its natural vocal
expression and its user-friendly software. It
became widely acknowledged, particularly by
users dedicated to desktop music. In 2007,
Vocaloid 2 was announced. In the same year, the
more character-oriented “Hatsune Miku” was
developed by Crypton Future Media.

Pocket Miku’s built-in “eVocaloid” technology

VOCALOID 3 was released in 2011. Its concatenative
vocal synthesis engine made even more natural
vocal expressions possible, and many character
voices appeared as libraries of singing voices
stocked with vocal fragments. Meanwhile, sound
chips used in hardware, such as those that produce
ring tones in cell phones, have become widespread
and continue to develop. Pocket Miku is equipped
with the newest of such chips, the Yamaha NSX-1.
In addition to functioning as a sound chip, the
NSX-1 is equipped with an “eVocaloid” sound
generator. This sound generator puts to use
Vocaloid technology, which was previously
available only as sound generating software for
personal computers and similar devices. Pocket
Miku adds one further modification to the NSX-1.
Whereas previous Vocaloid systems required songs
to be programmed beforehand in software called
“score editors,” with this modification Pocket Miku
is the first product in the world that enables you
to perform with Vocaloid in real time. Pocket Miku
is battery operated and has a built-in speaker.
Simply slide your stylus across the carbon
keyboard, and Hatsune Miku will sing for you
anywhere. Go ahead and try it out!