Tuesday, June 11, 2013

Camera captures voices without a microphone

Yasuhiro Oikawa of Waseda University in Tokyo pointed a high-speed camera at the throat of a volunteer with one goal in mind: to capture the volunteer’s voice without the use of a microphone.

Yes, you read that correctly. Oikawa and his team announced at the International Congress on Acoustics on June 3 that they used cameras to take thousands of images per second and record the motions of a person’s neck and voice box as they spoke. A computer program then turned the recorded vibrations into sound waves.

Why did they do this, you ask? Some lip-reading software programs are sophisticated enough to recognize different languages, but the end result doesn’t usually involve much more than a transcript, according to a ScienceNews article. In addition, microphones often record too much background noise, so Oikawa and his colleagues, looking for a new method of capturing vocal tones, came up with this idea.

The article explains that the researchers pointed the camera at the throats of two volunteers and had them say the Japanese word tawara, which means straw bale or bag. The team recorded them at 10,000 fps and, at the same time, recorded the volunteers’ words with a standard microphone and a vibrometer for comparison. The vibrations derived from the camera images were similar to the ones from the microphone and vibrometer, Oikawa said in the article.

After running the images through a computer program, the team reconstructed the volunteers’ voices well enough to hear and understand them saying tawara. Mechanical engineer Weikang Jiang of Shanghai Jiao Tong University in China noted that Oikawa did not play audio of the reconstructed voices, but instead showed comparison images of the sound waves and vibrations.

Like Jiang, I am interested to hear what the audio sounds like.
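
For readers curious how vibrations measured frame by frame could become something you can play back, here is a minimal Python sketch. It is not Oikawa's method, which the article does not detail. It assumes the hard part is already done, namely that each video frame has been reduced to a single number describing how far the skin moved, and it simply treats those numbers as audio samples at the camera's frame rate. The function name and the synthetic test tone are made up for illustration.

```python
import io
import math
import struct
import wave


def displacement_to_wav_bytes(displacement, frame_rate=10_000):
    """Pack one displacement value per video frame into 16-bit mono WAV bytes.

    Hypothetical helper: assumes the image analysis that produces
    `displacement` has already happened elsewhere.
    """
    if not displacement:
        raise ValueError("need at least one sample")

    # Remove the constant offset so the signal swings around zero.
    mean = sum(displacement) / len(displacement)
    centered = [d - mean for d in displacement]

    # Scale so the largest swing fills the 16-bit range.
    peak = max(abs(c) for c in centered) or 1.0
    pcm = [int(round(32767 * c / peak)) for c in centered]

    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 2 bytes = 16 bits per sample
        wav.setframerate(frame_rate)   # one audio sample per video frame
        wav.writeframes(struct.pack("<%dh" % len(pcm), *pcm))
    return buf.getvalue()


# Stand-in data: a 200 Hz wobble "filmed" for 0.1 seconds at 10,000 fps.
fake_frames = [math.sin(2 * math.pi * 200 * n / 10_000) for n in range(1000)]
wav_bytes = displacement_to_wav_bytes(fake_frames)
```

Writing `wav_bytes` to a file ending in .wav gives something a media player can open. One consequence of this one-sample-per-frame assumption is that a 10,000 fps recording can only represent tones up to 5,000 Hz, half the frame rate.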

Friday, June 7, 2013

Personalized advertising with facial detection

“Cara” is new facial detection software from IMRSV that uses a standard webcam to scan faces up to 25 feet away and determine age and gender. It’s currently being used on a wall of shoes in the back of a Reebok store on Fifth Avenue in New York, where it is helping the store see which customers are spending more time at the shoe wall, quickly walking away, or actually buying something.

If this experiment goes well, Reebok could install an advertising display that would intelligently react to different customers. For instance, if I were to walk into a store and pick up a pair of size 10 running shoes, a video might pop up on the screen to tell me about these shoes.

No, really.

According to IMRSV, Cara collects data with 93% detection accuracy. The demographics it reports include gender (92% accuracy) and age group (child, young adult, adult, or senior, with 80% accuracy). It detects faces at a distance of up to 25 feet and can scan multiple people at the same time. In addition to customized marketing, Cara could be used to watch audiences during live performances and monitor whether drivers are looking at the road, says IMRSV.

While this is all quite fascinating, I can’t help but think of a scene in Minority Report, where Tom Cruise is walking down a hallway rather quickly and the digital billboards are bombarding him with personalized ads. (Check it out here.) While the technology isn’t nearly as intrusive—it’s certainly not scanning your retina and immediately placing exactly who you are and where you’re from—it is eerily reminiscent of the futuristic adverts portrayed in the film.