In-Depth

Gemini AI Creates Research Report and 2-Person Podcast for Free

With a new Audio Overviews feature announced Monday, Google's Gemini AI app can now create a professional-looking Deep Research report and turn it into a professional-sounding two-person AI podcast discussion in minutes, all for free.

Google this week announced that Audio Overviews was available in the Gemini app, which runs on the web or on mobile devices. It leverages NotebookLM tech from Google DeepMind, the cutting-edge AI research division of the company, and it represents an evolution of Google's work on large language models (LLMs) and knowledge management systems.

NotebookLM emerged in 2023 as a way to harness DeepMind's advances in natural language understanding and knowledge management. The product aims to help users organize, search, and interact with their documents and notes in an intelligent, conversational manner. Its primary focus is on turning long-form documents (like research papers, business reports, or study notes) into interactive experiences (e.g., multi-person dialogues or podcasts).

"Leveraging the same technology that powers NotebookLM's Audio Overviews, Gemini app users can now generate podcast-style conversations based on documents, slides, and Deep Research reports," Google said Monday. "Upload files about topics you want to explore and enjoy dynamic discussions between two AI hosts with unique perspectives."

Hands-On: It Works!
And that's exactly what we did in a proof-of-concept (PoC) experiment that tasked Gemini with using its Deep Research functionality to create a report on cloud AI and then turn that report into a polished, professional podcast between two engaging co-hosts -- or rather cloned AI voices for a man and a woman.

The experiment almost fully duplicated workflows found in NotebookLM and other products that many companies are now using to turn their content into other forms. Alternative products in this rapidly expanding space include Wondercraft, Descript, Murf.ai, Play.ht, Resemble AI and many more. Note, however, that this Gemini tech is still in its very early stages and falls far short of all the bells and whistles that come with the full experience provided by NotebookLM, Wondercraft and other offerings. That experience provides for all kinds of customization in voices, music and other settings. With Gemini, you get the bare-bones product. But it's free -- as of now.

Deep Research
The experiment started simply enough: I asked Gemini (on desktop web) to create a Deep Research report on cloud AI.

Create a Report (source: Ramel).

Gemini's Deep Research goes beyond simple web searches, aiming to provide comprehensive answers to prompts by analyzing information from multiple sources. It automates the research process by browsing the web on your behalf, analyzing information in real time, and synthesizing findings into detailed reports.

It thought things over for a minute and developed a plan of attack, which it shared with me.

Here's the Plan (source: Ramel).

After I approved the plan (which can be tweaked), it churned away for a while and spat out a 25-page, 3,331-word, 145-footnote report titled "Cloud AI: A Comprehensive Analysis of Current State and Future Prospects."

The Report (source: Ramel).

"This report provides a comprehensive analysis of cloud AI, examining its definition, key advantages for businesses and individuals, associated challenges and risks, real-world applications across various sectors, the competitive landscape of major cloud service providers, future trends and advancements, as well as ethical considerations and societal impact," it said.

Audio Overviews
Audio Overviews works with a variety of document types, including text (.txt), Word (.doc), PowerPoint and spreadsheet files, among many others. It also supports Deep Research reports generated within Gemini, which was the test case for our PoC experiment.

First, from the Gemini app, you click on the + symbol in the prompt box to upload a document, after which the Generate Audio Overview button appears. Or, as in our case, just generate a Deep Research report and the button appears.

It grinds away for a while and then politely offers up the Audio Overview, with options to download it or immediately share it on social media.

Audio Overview (source: Ramel).

The conversation between two AI-cloned voices, complete with disfluencies like "you know," started off like this:

Woman: Welcome to the deep dive. Today we're jumping into cloud AI. We've gone through insights from Gartner, CloudRaft, HPE, Salesforce, quite a bit. Think of this as your quick guide to, you know, how AI -- using cloud infrastructure -- is really shaking things up. Our goal: extract the key takeaways for you.

Man: Absolutely. And what's really interesting across all these sources is this focus not just on AI being in the cloud, but what the cloud enables for AI. It's changing who can access it and how.

Use Cases
Unlike 99 percent of our PoC experiments, this one worked flawlessly.

Google's announcement said the new Audio Overviews feature can be used to enhance learning in a delightful and productive way, enabling users to upload class notes, lesson plans, research papers, lengthy email threads, or reports generated by Deep Research and receive an Audio Overview to help summarize the files.

Gemini suggests that potential use cases for the combination of Deep Research and Audio Overviews include:

  • Market Intelligence and Competitive Analysis: Quickly get up to speed on complex market trends and competitor activities by having Deep Research generate a comprehensive report and then listening to an audio overview while commuting.
  • Due Diligence and Investment Analysis: Efficiently review due diligence findings for potential investments by having Deep Research compile all necessary information and then listening to a summarized audio overview with key stakeholders.
  • Employee Training and Onboarding: Streamline employee onboarding by using Deep Research to create comprehensive training materials on various topics and then providing new hires with easily digestible audio overviews for on-the-go learning.
  • Internal Knowledge Sharing and Collaboration: Facilitate internal knowledge sharing across teams by using Deep Research to synthesize information from multiple sources and then sharing audio overviews of key project updates or research findings.
  • Executive Briefings and Strategic Planning: Enable executives to stay informed on critical strategic issues by having Deep Research compile relevant data and then providing them with time-efficient audio briefings summarizing the key takeaways.
  • Sales and Business Development: Improve sales team preparedness by using Deep Research to gather detailed information on potential clients and industries, allowing sales representatives to listen to key insights before meetings.

Stay tuned to see further advances in this rapidly expanding space.

About the Author

David Ramel is an editor and writer at Converge 360.
