Maker Pro

Detecting the original pitch of the human voice

C3

Jan 1, 1970
Suppose you have a recording of a human voice singing a song. The recording
has been sped up or slowed down, so that both the tempo and the pitch have
changed. The aim is to determine, as closely as possible, how much you need
to compress or expand the waveform (to speed it up or slow it down) in order
to restore it to the pitch at which it was originally recorded.

The human voice has a limited range, so you can easily narrow the correction
down to that range, since most people cannot sing outside it. You also know
that the song is sung in tune, in equal temperament, so the corrected pitches
must line up exactly with a fixed set of notes.
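
The equal-temperament constraint can be turned into a number. A minimal
sketch in Python, assuming the note frequencies have already been measured
somehow and that the song was tuned to A4 = 440 Hz (the reference pitch, the
function names and the example notes are all assumptions, not something
stated in the post). It measures how far each detected note sits from the
nearest equal-tempered note, averages those offsets, and converts the result
into a playback-speed factor.

```python
import math

A4_HZ = 440.0  # assumed tuning reference; the post does not state one


def cents_off_grid(freq_hz, ref_hz=A4_HZ):
    """Signed distance in cents from freq_hz to the nearest equal-tempered note."""
    semitones = 12.0 * math.log2(freq_hz / ref_hz)
    return 100.0 * (semitones - round(semitones))


def speed_correction(note_freqs_hz, ref_hz=A4_HZ):
    """Playback-speed factor that moves the detected notes back onto the grid.

    Offsets wrap around every 100 cents, so they are averaged on a circle.
    """
    angles = [2.0 * math.pi * cents_off_grid(f, ref_hz) / 100.0
              for f in note_freqs_hz]
    mean_angle = math.atan2(sum(math.sin(a) for a in angles),
                            sum(math.cos(a) for a in angles))
    offset_cents = 100.0 * mean_angle / (2.0 * math.pi)
    return 2.0 ** (-offset_cents / 1200.0)


# Example: A4, C5 and E5 played back 2% too fast.
detected = [440.0 * 2 ** (k / 12) * 1.02 for k in (0, 3, 7)]
factor = speed_correction(detected)  # about 0.980, i.e. slow down by about 2%
```

Because the grid repeats every semitone, this only pins the correction down
to within plus or minus 50 cents. Choosing between whole-semitone candidates
(extra factors of 2^(k/12)) is left to the vocal-range argument.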

If you knew anything about the singing ability of the singer, you might also
be able to infer something based on how strained the singing of each note
is, but assume you only have the recording, and no prior information about
the singer.

How would you do it?
 
Poxy

Jan 1, 1970

Dunno, but you might want to go all Google on "vocoder" - might give you
some ideas.
 
Franc Zabkar

Jan 1, 1970

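Depending on the quality of the recording, you may be able to detect
50 Hz or 60 Hz mains hum. If the hum was recorded along with the voice,
its shift away from 50 Hz or 60 Hz gives you the speed error directly.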

- Franc Zabkar
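
A rough sketch of the hum idea in Python, under some loud assumptions: the
hum was picked up at recording time (not added later during transfer), you
know or can guess whether the mains was 50 Hz or 60 Hz, and the hum is the
strongest component between 40 Hz and 70 Hz. The function names and the scan
range are made up for illustration. It scans a band of test frequencies,
picks the strongest one, and divides by the nominal mains frequency.

```python
import cmath
import math


def tone_strength(samples, sample_rate, freq_hz):
    """Magnitude of the correlation between the signal and one test frequency."""
    step = -2j * math.pi * freq_hz / sample_rate
    return abs(sum(s * cmath.exp(step * n) for n, s in enumerate(samples)))


def find_hum(samples, sample_rate, lo_hz=40.0, hi_hz=70.0, step_hz=0.1):
    """Scan lo_hz..hi_hz and return the frequency with the strongest response."""
    count = int(round((hi_hz - lo_hz) / step_hz)) + 1
    candidates = [lo_hz + i * step_hz for i in range(count)]
    return max(candidates, key=lambda f: tone_strength(samples, sample_rate, f))


def speed_ratio_from_hum(hum_hz, mains_hz):
    """How fast the recording runs relative to the original (1.0 = correct)."""
    return hum_hz / mains_hz
```

For example, hum found at 52 Hz on a recording made in a 50 Hz country would
give a ratio of 1.04, so the waveform needs stretching by about 4%. A longer
excerpt gives a sharper peak, at the cost of a slower scan.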
 
Mitchell

Jan 1, 1970

If you know the key of the song, you can fairly easily adjust the pitch and
tempo digitally.

Regards,
Mitch...
(I have done this when remastering some cassette tapes onto CD.)
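
Once a correction factor is known (for instance 2^(n/12) for a shift of n
semitones), the adjustment itself is a resampling step. Below is a minimal
sketch in Python using plain linear interpolation. The function name and the
`stretch` parameter are invented for illustration, and a real audio editor
would use a better interpolator with anti-alias filtering, so treat this as
the idea only.

```python
def stretch_waveform(samples, stretch):
    """Resample by linear interpolation.

    stretch > 1 lengthens the waveform (slower, lower pitch);
    stretch < 1 shortens it (faster, higher pitch).
    """
    if stretch <= 0:
        raise ValueError("stretch must be positive")
    if len(samples) < 2:
        return list(samples)
    out_len = int((len(samples) - 1) * stretch) + 1
    out = []
    for i in range(out_len):
        pos = i / stretch  # position in the input, measured in samples
        left = min(int(pos), len(samples) - 2)
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[left + 1] * frac)
    return out


# A tape running one semitone sharp would be stretched by 2 ** (1 / 12).
restored = stretch_waveform([0.0, 1.0, 0.0, -1.0, 0.0], 2 ** (1 / 12))
```

Tempo and pitch move together here, which matches the original problem: the
recording was sped up or slowed down as a whole.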
 
Dac

Jan 1, 1970

If it was digital I would convert it straight to the frequency domain,
band-pass filter it digitally around the most common frequencies of the human
voice, then have a look at all the harmonics. Do this for heaps of standard
noises and you should see that you get almost the same thing every time.
Then compare this to a file that has been sped up or slowed down and you
should find that the harmonic peaks are no longer at the same frequencies.
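
One possible reading of this, sketched in Python: look at the spectrum
inside the vocal band and find the fundamental whose harmonics line up best,
then compare that against the same measurement on a reference at the correct
speed. The harmonic-sum scoring used here is a swapped-in technique, not
necessarily what the post had in mind, and the function names, the
80 to 400 Hz search band and the number of harmonics are all assumptions.

```python
import cmath
import math


def magnitude_at(samples, sample_rate, freq_hz):
    """Spectrum magnitude at one frequency (a single DFT-style bin)."""
    step = -2j * math.pi * freq_hz / sample_rate
    return abs(sum(s * cmath.exp(step * n) for n, s in enumerate(samples)))


def estimate_fundamental(samples, sample_rate, lo_hz=80, hi_hz=400, harmonics=4):
    """Return the candidate (in whole Hz) whose first few harmonics are strongest.

    Keep harmonics * hi_hz below half the sample rate.
    """
    def score(f0):
        return sum(magnitude_at(samples, sample_rate, k * f0)
                   for k in range(1, harmonics + 1))
    return max(range(lo_hz, hi_hz + 1), key=score)
```

If a short sung note measures 229 Hz where the reference version measures
220 Hz, the ratio 229 / 220 is the speed error. This brute-force scan is slow
on long excerpts, so a short frame of a steady note is the sensible input.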
 