OK, here's a circuit that should do what you want.
It's fairly straightforward, though it probably looks complicated.
If you're interested in understanding the circuit better, you can look up any unfamiliar words using Google or Wikipedia.
The microphone on the left is an electret microphone - a small cylindrical microphone as used in dictation machines. You will need to put this close to the sound source on the existing unit.
R1 provides operating current that's needed by the microphone. C1 couples the signal from the microphone into the amplifier stage built around Q1. This is a conventional common emitter amplifier with variable gain and high frequency boost.
R6 adjusts the gain, and should be set so the circuit reliably detects the sound from the unit but is not false-triggered by other sounds.
The amplified signal appears on Q1's collector (the top terminal) and is coupled into a charge pump circuit consisting of D1, D2 and C4. A voltage is developed across C4 that is related to the amplitude of the signal picked up by the microphone. When this voltage exceeds about 0.7V, Q2 turns on and pulls its collector high.
This point in the circuit is labelled "TONE: H, SILENCE: L". These states are high (+9V) and low (0V) respectively.
This control signal feeds a circuit built around U1, a CD4093B quad NAND gate with Schmitt trigger inputs. That's a bit of a mouthful, but it means that the device contains four identical two-input "gates" (shown as U1B, U1A, U1D and U1C). Each gate's output is high unless both of its inputs are high, in which case the output is low.
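If it helps, here is the NAND rule described above written out as a few lines of Python. It is only an illustration of the logic (the function names are mine), not a model of the actual chip.

```python
def nand(a: bool, b: bool) -> bool:
    """Two-input NAND: the output is low only when both inputs are high."""
    return not (a and b)


def level(x: bool) -> str:
    """Show a logic state the way the schematic labels it: H or L."""
    return "H" if x else "L"


# Print the full truth table for one gate.
for a in (False, True):
    for b in (False, True):
        print(f"A={level(a)}  B={level(b)}  ->  OUT={level(nand(a, b))}")
```

Only the last row (both inputs high) gives a low output.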
These gates have a special kind of input circuit (a Schmitt trigger, which has hysteresis) that means a single gate (U1B in this case), together with one resistor and one capacitor, can be used as an oscillator - a circuit that generates a continuous alternating signal at a particular frequency.
The CD4093 has two power pins - pin 14 (positive supply) and pin 7 (0V). C6 is a "decoupling" capacitor, required for reliable operation of the IC, and should be connected as close as possible to those two pins.
When Q2's collector is high, U1B is enabled and produces a signal at its output. The frequency of this signal is determined by R10 and C5. R10 is a variable resistor or "trimpot" that you should adjust to obtain the frequency (pitch) that you want in the speaker.
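If you want a rough idea of the frequency before you start turning the trimpot, the usual textbook estimate for a single-gate Schmitt trigger oscillator can be sketched in Python as below. Treat it as a ballpark only: the two threshold voltages are my assumptions for a 9V supply (they vary a lot from chip to chip, which is why R10 is adjustable), and the R10 and C5 values in the loop are hypothetical, not taken from the schematic.

```python
import math


def schmitt_osc_frequency(r_ohms, c_farads, vdd=9.0, vt_plus=5.3, vt_minus=3.5):
    """Estimate the frequency of a one-gate Schmitt trigger RC oscillator.

    The capacitor charges from vt_minus towards vdd until it reaches vt_plus,
    then discharges from vt_plus towards 0 V until it reaches vt_minus.
    vt_plus and vt_minus are ASSUMED thresholds, not measured values.
    """
    t_charge = r_ohms * c_farads * math.log((vdd - vt_minus) / (vdd - vt_plus))
    t_discharge = r_ohms * c_farads * math.log(vt_plus / vt_minus)
    return 1.0 / (t_charge + t_discharge)


# Hypothetical component values, NOT from the schematic.
C5 = 10e-9  # 10 nF
for r10 in (50e3, 100e3, 200e3):
    f = schmitt_osc_frequency(r10, C5)
    print(f"R10 = {r10 / 1e3:.0f}k  ->  roughly {f:.0f} Hz")
```

The point to take away is that the frequency is inversely proportional to both R10 and C5, so doubling the trimpot resistance halves the pitch.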
U1A is used to invert (reverse) the signal, and U1C and U1D are used to gate the signal with the tone-present indication from Q2. Their outputs are fed to two simple complementary bipolar buffer stages which drive opposite ends of the speaker. When U1B's output is high, the Q3/Q4 output is high and the Q5/Q6 output is low. When U1B's output is low, the Q3/Q4 output is low and the Q5/Q6 output is high. This allows a voltage of up to about 15V peak-to-peak to be applied to the series combination of R11 and the speaker, so the speaker can be quite loud.
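The gating and the push-pull speaker drive described above can be sketched as a small logic model. I'm making some assumptions here that you should check against the schematic: that U1A is wired as a plain inverter, that U1C (fed from U1A) drives the Q3/Q4 buffer and U1D (fed from U1B) drives the Q5/Q6 buffer, that the buffers don't invert, and that each buffer swings about 7.5V (half of the roughly 15V peak-to-peak figure).

```python
def nand(a: bool, b: bool) -> bool:
    """Two-input NAND: the output is low only when both inputs are high."""
    return not (a and b)


def bridge_outputs(tone_present: bool, osc_out: bool):
    """Return the (Q3/Q4, Q5/Q6) buffer states for the given inputs.

    ASSUMED wiring: U1A inverts U1B's output; U1C and U1D each NAND one
    phase of the oscillator with the tone-present signal from Q2.
    """
    inverted = nand(osc_out, osc_out)      # U1A as an inverter
    q34 = nand(tone_present, inverted)     # U1C -> Q3/Q4 (assumed)
    q56 = nand(tone_present, osc_out)      # U1D -> Q5/Q6 (assumed)
    return q34, q56


def load_voltage(tone_present: bool, osc_out: bool, swing=7.5):
    """Voltage across R11 plus speaker; swing is an assumed buffer swing in volts."""
    q34, q56 = bridge_outputs(tone_present, osc_out)
    return (int(q34) - int(q56)) * swing


for tone in (False, True):
    for osc in (False, True):
        print(f"tone={tone!s:5}  U1B={osc!s:5}  load = {load_voltage(tone, osc):+.1f} V")
```

With the tone present the load voltage flips polarity each half cycle, which is where the doubled peak-to-peak swing comes from. With no tone both buffers sit at the same level, so nothing flows through the speaker.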
R11 is a "select on test" resistor whose value determines the speaker volume. There are many variables involved here, such as speaker impedance, speaker sensitivity, speaker mounting, enclosure resonance, enclosure location, your ear's response, and so on, so it's best for you to try different values for R11. Start with around 100 ohms. Lower values make the speaker louder and vice versa. R11 may need to be a 1 watt resistor.
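To see why R11 may need a 1 watt rating, here is a rough power estimate in Python. It assumes a square wave of about 7.5V in each direction across R11 and the speaker (half of the roughly 15V peak-to-peak figure), and it treats the speaker as a plain 8 ohm resistor, which is a simplification and an assumption about your speaker.

```python
def r11_power(r11_ohms, speaker_ohms=8.0, drive_volts=7.5):
    """Approximate power dissipated in R11, in watts.

    For a square wave the RMS voltage equals the amplitude, so the
    series current is simply drive_volts / (r11 + speaker).
    speaker_ohms and drive_volts are ASSUMED values.
    """
    current = drive_volts / (r11_ohms + speaker_ohms)
    return current ** 2 * r11_ohms


def speaker_power(r11_ohms, speaker_ohms=8.0, drive_volts=7.5):
    """Approximate power delivered to the speaker, in watts (same assumptions)."""
    current = drive_volts / (r11_ohms + speaker_ohms)
    return current ** 2 * speaker_ohms


for r11 in (220.0, 100.0, 47.0, 22.0):
    print(f"R11 = {r11:5.0f} ohms: R11 dissipates {r11_power(r11):.2f} W, "
          f"speaker gets {speaker_power(r11):.3f} W")
```

Under these assumptions a 100 ohm resistor dissipates roughly half a watt, and the figure climbs quickly as you go lower, so a small quarter-watt part would run hot.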
The speaker needs to be located away from the microphone, otherwise you will get a situation where the circuit never turns off, because it "hears itself". This can be minimised by using a speaker with a large or heavy cone that cannot reproduce high frequencies well, mounting the speaker firmly so nothing rattles, and tightly coupling the microphone to the unit so you can reduce the circuit's sensitivity.
If this proves to be an ongoing problem, you can add a capacitor directly across the speaker to reduce the high frequencies in its output. Initially try a few microfarads. Higher values will attenuate the high frequencies more and vice versa. The capacitor must be non-polarised.
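For a starting point on the capacitor value, you can estimate where the high frequency roll-off begins. The sketch below treats the speaker as a plain resistor (8 ohms is my assumption) fed through R11, so the capacitor sees the two in parallel. Real speakers are inductive and resonant, so take the result as a rough guide only. The 4.7 microfarad value is just an example.

```python
import math


def rolloff_frequency(c_farads, r11_ohms=100.0, speaker_ohms=8.0):
    """Approximate -3 dB frequency, in hertz, with a capacitor across the speaker.

    The capacitor sees R11 in parallel with the speaker (source resistance
    of the buffers is ignored). speaker_ohms is an ASSUMED value.
    """
    r_parallel = (r11_ohms * speaker_ohms) / (r11_ohms + speaker_ohms)
    return 1.0 / (2.0 * math.pi * r_parallel * c_farads)


for c_uf in (1.0, 2.2, 4.7, 10.0):
    f = rolloff_frequency(c_uf * 1e-6)
    print(f"C = {c_uf:4.1f} uF  ->  roll-off starts near {f:.0f} Hz")
```

Doubling the capacitance halves the roll-off frequency, which matches the advice above that higher values attenuate the high frequencies more.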
U2 is a voltage regulator that provides a constant 9V supply to the circuit. It is powered from the automotive +12V supply with R13, R12, and D3 added to protect it against high voltages during automotive load dump.
I suggest you put the circuit in a strong plastic box with the speaker, and connect the microphone through a length of screened audio cable.
The circuit can be constructed on stripboard, also called Veroboard in the UK. If you plan to build it, or get it built (by an electronics student, perhaps), let me know and I will write up a full parts list for Digikey. The component cost is low - it should be less than USD 25, excluding the stripboard and the loudspeaker.
If you're not in the U.S., please enter your location in your profile and I will try to find an appropriate supplier instead of Digikey.