In this episode of Device Squad, the podcast for the Mobile Enterprise from Propelics, Steve gets futuristic with MERL Senior Principal Research Scientist John Hershey. The conversation centers on the current state of neural networks and artificial intelligence as John brings us the news from the recent NIPS (Neural Information Processing Systems) conference in Barcelona.
We discuss voice recognition and replication strategies and the role they’ll play in our everyday lives, along with John’s current work on deep learning for signal separation, speech recognition, language processing, and multi-modal semantic representation learning.
In other words, John has tackled the problem of isolating a single voice in a crowd, using a technique known as Deep Clustering.
Specifically, we discuss:
The Universe Project – a software platform for measuring and training an AI’s general intelligence across the world’s supply of games, websites and other applications.
WaveNet – a deep generative model of raw audio waveforms that can generate speech which mimics the human voice and sounds more natural than the best existing text-to-speech systems.
Google’s DeepDream – a computer vision program that uses a convolutional neural network to find and enhance patterns in images via algorithmic pareidolia, giving the deliberately over-processed images a dreamlike, hallucinogenic appearance.
MERL Deep Clustering – Training deep discriminative embeddings to solve the cocktail party problem. The human auditory system gives us the extraordinary ability to converse in the midst of a noisy throng of partygoers. Solving this so-called cocktail party problem has proven extremely challenging for computers, and separating and recognizing speech in such conditions has been the holy grail of speech processing for more than 50 years. Deep clustering is a recently introduced deep learning architecture that uses discriminatively trained embeddings as the basis for clustering, producing unprecedented speaker-independent single-channel separation performance on two-speaker and three-speaker mixtures.
John also predicts when our robot overlords will finally take over, and whether or not the revolution will take the form of an army of seemingly benevolent toys. He also estimates how long it will be before Alexa and other voice-controlled devices begin targeting content based on our emotional states.
Lastly, our two heroes engage in an exciting game of BLIP. Tune in and find out what this 1970s TOMY game has to do with artificial intelligence and analog processing!
It’s a long episode but a great one, so be sure to tune in!
Oh, and by the way, Mind Flex is a scam.
Content Strategy Lead at Anexinet