To try to be more helpful than my first post:
First, Davenn is right: the amount of energy available in sound is tiny. Their little speaker sounds loud, but they are probably putting only a few watts (1-5?) of electrical power into it. Maybe 1% of that is coming out as sound. And how much of that ends up in your device?
So, how much of that sound is captured by your transducers (be they piezo or magnetic)? Not much. If you look at the video with their little speaker blasting away, it sounds quite noisy. Then there are two piezo elements absorbing some of that sound: can you hear the difference? Well you can't tell, because they didn't do a comparison, but my guess is no, not one jot. Perhaps if you had hundreds of them in a big wall of transducers, the sound would be just a little quieter when they were there?
The point is, even the tiny amount of energy available in sound is difficult to capture. It's going off in all directions. You're trying to capture just the little bit that reaches the few square cm of your device and missing what goes to the many sq m of the rest of the room. So perhaps their speaker could put out 10 mW and 1/10,000 of that (1 microwatt) could reach your device. (Very much guesstimate figures btw.)
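To put rough numbers on that guesstimate, here is a small sketch. It assumes the sound spreads evenly over a sphere (a real speaker in a real room won't behave so neatly), and the 1 m distance, the 10 sq cm device area and the function name are my own illustrative choices, not anything from the video.

```python
import math

def captured_power_w(source_w, distance_m, device_area_m2):
    """Acoustic power landing on a small device, assuming the source
    radiates evenly over a sphere (an idealisation, ignoring reflections)."""
    sphere_area_m2 = 4 * math.pi * distance_m ** 2
    intensity_w_per_m2 = source_w / sphere_area_m2
    return intensity_w_per_m2 * device_area_m2

# Assumed figures: 10 mW of sound, device 1 m away, 10 sq cm of transducer.
p = captured_power_w(10e-3, 1.0, 10e-4)
print(f"{p * 1e6:.2f} microwatts reach the device")
```

With those assumed figures it comes out a little under 1 microwatt, and that is before any conversion losses in the transducer itself.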
If I were trying to capture energy from sound, I would want to design a transducer with a large diaphragm (like in a loudspeaker) which could be moved by the small vibrations of a large amount of air, then couple that movement to my piezo (or magnetic) transducer so that a small movement over a large area with low force became a larger movement or larger force on the small transducer.
Probably my first attempt would be to use a speaker backwards, because they are designed to match the electrical transducer to a large mass of air. Then, since they probably sacrifice efficiency for linearity of response over a wide frequency range, look for ways to optimise that for energy conversion. Reciprocity is often the case: what works best one way also works best in reverse.
Fitting some sort of horn might help capture sound from a large area and concentrate it into a small area near your transducer. (I have read that horns can increase speaker efficiency from 1% to 10%, i.e. with a good horn you waste only 90% of the electrical energy, instead of 99%! That's the sort of problem you are up against.)
With your piezo transducers, which may be designed to couple with vibrations in solids rather than sounds in air, you could fix them to a sheet of thin plywood or something like that? Perhaps you could research piezo speakers and see how they use them to generate sound from electricity.
Maybe at this point you need to consider the specific sound source you would use. Is its energy concentrated at certain frequencies, so that you could make your system resonant at those frequencies?
A hundred years of research has gone into converting electricity into sound, so I expect you could find a lot of useful ideas if you look. In the early days transducers would have been less efficient, so people probably tried hard to get good coupling to the air. Now, deterrent alarms (often piezo devices I think) aim to produce maximum sound levels from small devices, so maybe they use some good ideas. (Though when I think of the literally deafening sounds they produce with not much power, I am reminded that you may be on a hiding to nothing!)
As a final thought about green/free/cheap energy, consider the various sources. Water is very heavy, so 1 cubic m dropping 1 m can provide about 10 kJ. When you get on to sea moving with tides and waves, you're quickly into MW territory.
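The water figure is just mass x g x height. A quick check, assuming fresh water at 1000 kg per cubic m and g = 9.81 m/s^2 (the function name is only for illustration):

```python
G = 9.81                # m/s^2, standard gravity (rounded)
WATER_DENSITY = 1000.0  # kg per cubic m, fresh water (assumed)

def falling_water_energy_j(volume_m3, drop_m):
    """Potential energy released when a volume of water drops a given height."""
    mass_kg = WATER_DENSITY * volume_m3
    return mass_kg * G * drop_m

e = falling_water_energy_j(1.0, 1.0)
print(f"{e / 1000:.1f} kJ")  # prints "9.8 kJ"
```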
Wind is much lighter, but much heavier than people often realise. (When checking my facts I came across a new example which I like: a cylinder of air containing the Eiffel tower would weigh 10,000 tonnes, more than all the iron in the tower, 7,300 tonnes.) Anyhow, the power available in wind depends a lot on the speed (proportional to the cube of v), which is the snag for many home wind generators, but a 1 sq m turbine in a good breeze is seeing a few hundred watts of wind power, of which it might collect about half, say 200 W. Scaled up to commercial sizes, a 30 m turbine sees 1000x that power and a 100 m turbine 10,000x, and these produce up to 200 kW and 2 MW.
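The cube law is easy to play with. The usual formula for the power carried by wind through an area is 0.5 x density x area x v^3. In this sketch the air density (1.225 kg per cubic m, roughly sea level) and the wind speeds are my assumptions, with 8 m/s standing in for "a good breeze":

```python
AIR_DENSITY = 1.225  # kg per cubic m, roughly sea level (assumed)

def wind_power_w(area_m2, speed_m_s):
    """Kinetic power of wind passing through an area: 0.5 * rho * A * v^3.
    This is what the turbine 'sees', not what it can actually extract."""
    return 0.5 * AIR_DENSITY * area_m2 * speed_m_s ** 3

# Assumed speeds for a light, good and strong breeze over 1 sq m.
for v in (4.0, 8.0, 12.0):
    print(f"{v:4.1f} m/s -> {wind_power_w(1.0, v):7.1f} W per sq m")
```

At 8 m/s that gives about 314 W through 1 sq m, but halve the wind speed and you are down to an eighth of it, which is the snag for a small turbine in a sheltered garden.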
Sunlight also clocks in around 200 W/sq m when it is shining (rather more near the equator, and rather less for us northerners, and we have very cloudy skies).
On a small scale even air and sun are marginal sources, but when you compare them to sound, with power levels of microwatts per sq m (at most, maybe mW?), you are starting off at a minimum of a 10,000x disadvantage, before you even consider the difficulty of capturing it.
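If you want to check the sound side of that comparison, decibel levels convert to W per sq m using the standard reference intensity of 1e-12 W per sq m. This sketch treats a sound pressure level in air as if it were an intensity level, which is only approximately true, and the three levels are just illustrative picks of mine (quiet-ish, loud, painfully loud):

```python
REFERENCE_INTENSITY = 1e-12  # W per sq m, standard reference for intensity level
SUNLIGHT_W_PER_M2 = 200.0    # the rough sunlight figure used above

def sound_intensity_w_per_m2(level_db):
    """Convert a sound level in dB to intensity in W per sq m."""
    return REFERENCE_INTENSITY * 10 ** (level_db / 10)

# Illustrative levels only (assumed, not measured from the video).
for db in (60, 90, 120):
    i = sound_intensity_w_per_m2(db)
    print(f"{db:3d} dB -> {i:.0e} W/sq m, sunlight is {SUNLIGHT_W_PER_M2 / i:.0e}x more")
```

So 60 dB is about a microwatt per sq m and 90 dB about a milliwatt per sq m. You need something like 120 dB, which is painful to stand next to, before sound gets to even 1 W per sq m.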
I don't have any special knowledge or understanding in this area. My comments are based on basic physics, common sense and a few hours checking info on the web. If I were the chaps in the video, these are the things I would want to think about before committing my time to such a project. (I'm not sure how many marks you get for saying, "We had this great idea. Here is all the research we did and our explanation of why it wouldn't work", but in my eyes that might be worth more than building a hopeless project based on an unresearched idea.)