AI Bytes Newsletter Issue #3

Hey everyone! Welcome back to another exciting edition of AI Bytes, where we jump head-first into the latest and greatest in AI. This week, we're highlighting some incredible advancements and ethical quandaries that have got the AI world buzzing. From ElevenLabs' game-changing "Dubbing" feature to the ethical tightrope of AI in politics, we're covering it all. Plus, I'll be sharing my latest fascinations in AI art and how it's reshaping creative expression. So, buckle up, and let's get started on this AI adventure! Remember, we're here to amplify the wonders of AI, not replace the human spirit. Let's explore the future together!

The Latest in AI

A Look into the Heart of AI

For this week’s Featured Innovation, we’re going with ElevenLabs’ new “Dubbing” feature.

When you create a new dubbing project, Dubbing Studio automatically transcribes your content, translates it into the new language, and generates a new audio track in that language. Each speaker’s original voice is isolated and cloned before generating the translation to make sure they sound the same in every language.
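To make that workflow concrete, here’s a rough conceptual sketch of the pipeline in Python. The function names are hypothetical placeholders, not ElevenLabs’ actual API; Dubbing Studio handles all of these steps for you behind the scenes.

```python
# Conceptual sketch only: these are hypothetical stand-in functions, not ElevenLabs' API.
def transcribe(audio_path: str) -> str:
    return "original transcript"        # placeholder: speech-to-text on the source audio

def clone_voice(audio_path: str) -> str:
    return "cloned-voice-id"            # placeholder: isolate and clone the speaker's voice

def translate(text: str, language: str) -> str:
    return f"{text} [{language}]"       # placeholder: translate the transcript

def synthesize(text: str, voice_id: str) -> bytes:
    return b"dubbed audio"              # placeholder: generate speech in the cloned voice

def dub(audio_path: str, target_language: str) -> bytes:
    transcript = transcribe(audio_path)
    voice_id = clone_voice(audio_path)  # cloned first, so the speaker sounds the same in every language
    translated = translate(transcript, target_language)
    return synthesize(translated, voice_id)

dub("speaker_01.wav", "es")
```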

Once your project is ready, you can start editing:

  • Dub projects across 29 languages

  • Generate, edit, and re-generate transcripts and translations to improve localization

  • Re-generate translated audio segments to adjust delivery style

  • Assign audio tracks to select voices

  • Adjust voice stability, similarity and style for each audio track

  • Adjust timecodes to synchronize dialogue with on-screen action

  • Inject audio clips into your sequence

  • Isolate voices from selected audio tracks

One of the biggest surprises for us was that ElevenLabs’ translation cost Mike 10 times less than comparable products!

Check out Mike’s video on ElevenLabs Dubbing below:

Ethical Considerations & Real-World Impact

This week, we're touching upon a troubling AI incident in New Hampshire. A deepfake of President Biden's voice was used in a robocall, falsely urging Democrats not to vote in the primary, claiming it would help Republicans. This wasn't a harmless AI prank – it was a serious misuse with worrying ethical implications. The New Hampshire attorney general is investigating it as a potential election disruption and voter suppression attempt.

This case isn't just a quirky tech mishap; it's a stark reminder of AI's darker potential, especially in voice cloning. While AI can be a force for good, saving companies money and improving customer service, this example highlights its potential for harmful misuse.

We're excited about AI's capabilities, but we must stay vigilant about its unethical applications. This brings up significant ethical questions about the responsibilities of AI developers and users, and protecting those unfamiliar with AI's rapid evolution.

We advocate for AI's free expression and development but are equally concerned about its negative impacts if unchecked. It's crucial to balance preventing misuse with not stifling AI's growth. Criminals and malicious actors often bypass laws and regulations, making this a complex issue.

What's your take on safeguarding against AI misuse while fostering its positive development? We'd love to hear your thoughts on navigating these ethical challenges and striking the right balance.

Tools

The Toolbox for Navigating the AI Landscape

AI Tool of the Week - Deepgram

✨ Hey all, Mike here! I’m excited to share a tool I’ve been using for quite some time for voice transcription and AI summarization. Deepgram is a leader in speech-to-text, sentiment analysis, and audio summarization. What’s more, Deepgram recently released custom models tuned for specialized purposes like the following:

meeting: Optimized for conference room settings, which include multiple speakers with a single microphone.
phonecall: Optimized for low-bandwidth audio phone calls.
voicemail: Optimized for low-bandwidth audio clips with a single speaker. Derived from the phonecall model.
finance: Optimized for multiple speakers with varying audio quality, such as might be found on a typical earnings call. Vocabulary is heavily finance oriented.
conversationalai: Optimized for use cases in which a human is talking to an automated bot, such as IVR, a voice assistant, or an automated kiosk.
video: Optimized for audio sourced from videos.
medical: Optimized for audio with medical-oriented vocabulary.
drivethru: Optimized for audio sourced from drive-thrus.
automotive: Optimized for audio with automotive-oriented vocabulary.
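If you want to try one of these models yourself, here’s a minimal sketch of calling Deepgram’s pre-recorded transcription endpoint and picking a specialized model via the model parameter. The API key and audio URL are placeholders, and parameter names can change, so double-check Deepgram’s docs before relying on this.

```python
# Minimal sketch of Deepgram's pre-recorded transcription REST endpoint (verify against current docs).
import requests

DEEPGRAM_API_KEY = "YOUR_API_KEY"  # placeholder

response = requests.post(
    "https://api.deepgram.com/v1/listen",
    params={"model": "phonecall"},  # swap in meeting, finance, medical, etc. to match your audio
    headers={
        "Authorization": f"Token {DEEPGRAM_API_KEY}",
        "Content-Type": "application/json",
    },
    json={"url": "https://example.com/call-recording.wav"},  # placeholder audio URL
)
transcript = response.json()["results"]["channels"][0]["alternatives"][0]["transcript"]
print(transcript)
```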

Some other features Deepgram offers that we think are great:

Punctuation
What It Does: Adds punctuation and capitalization to the transcript.
Why It's Useful: Essential for conveying the correct tone and meaning, and breaks up text into digestible pieces.

Redaction
What It Does: Redacts sensitive information in the transcript.
Why It's Useful: Protects privacy and confidentiality, crucial for sensitive or private recordings.

Diarization
What It Does: Recognizes speaker changes in the audio.
Why It's Useful: Identifies different speakers, making the transcript clearer and more organized.

Summarization
What It Does: Provides summaries for sections of content.
Why It's Useful: Offers quick, concise overviews of longer sections for faster comprehension.

Topic Detection
What It Does: Identifies and extracts key topics from sections of content.
Why It's Useful: Highlights main ideas and themes, aiding in understanding and recall.
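Each of these features maps to a query parameter on the same /v1/listen call from the sketch above. Here’s a rough example of turning several of them on at once; the exact parameter names and accepted values (for example, what redact can target) are worth confirming in Deepgram’s docs.

```python
# Continuing the sketch above: enable multiple features on one request (verify parameter names in Deepgram's docs).
import requests

DEEPGRAM_API_KEY = "YOUR_API_KEY"  # placeholder

params = {
    "model": "meeting",
    "punctuate": "true",      # punctuation and capitalization
    "diarize": "true",        # label speaker changes
    "redact": "pci",          # redact sensitive info (e.g., payment card data)
    "summarize": "true",      # section summaries
    "detect_topics": "true",  # key topic extraction
}

response = requests.post(
    "https://api.deepgram.com/v1/listen",
    params=params,
    headers={"Authorization": f"Token {DEEPGRAM_API_KEY}", "Content-Type": "application/json"},
    json={"url": "https://example.com/team-meeting.wav"},  # placeholder audio URL
)
print(response.json()["results"]["channels"][0]["alternatives"][0]["transcript"])
```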

If you’ve got a suggestion on tools we should check out, email us at [email protected] and let us know.

Rico's Roundup

Critical Insights and Curated Content from Rico

Skeptic's Corner:

Hey everyone! Rico here. This week I’m feeling a little less skeptical and wanted to share my excitement about Generative AI and AI Art.

This week has been a true highlight. I've connected with numerous AI Artists on X.com (formerly known as Twitter), and the joy this brings is immense. The chance to create art, explore spontaneous ideas, and actualize long-held concepts is exhilarating.

In this week's Rico's Roundup, I'm buzzing to see #GenerativeAI, #GenerativeArt, and #GPTs trending on X.com. You might think, "Well Rico, anyone can type a few words into a prompt and make a picture, what's the big deal?" But the scope of this technology goes far beyond that simple view. As it evolves, users' contributions of prompts and refinements create not just advanced outcomes but ones that transcend mere two-dimensional imagery.

This week marked a turning point for me - I got my own Midjourney subscription, starting with the Basic Plan. My engagement with ChatGPT's Custom GPTs and DALL-E 3 creations reached a peak when I hit my usage limit amidst a creative surge. My curiosity about Midjourney Version 6, particularly its distinct art style compared to DALL-E 3, has me thinking about an upgrade for even greater creative exploration.

My journey into Generative Art led me to both create and admire many AI Artists' works on X.com. I was captivated by moving images and short films, clearly products of expertly crafted AI images and prompts. This discovery led me to explore various platforms and services, each adding dynamic, animated life to static renderings.

In our Pilot episode, we introduced RunwayML's 'Infinite Background' tool, fantastic for generative art or expanding images. Recently in AI Bytes, we highlighted RunwayML's 'Motion Brush', allowing static photos to gain the illusion of motion. My experiments with images from DALL-E 3 and Midjourney Version 6.0 on RunwayML were astonishing. These platforms, intricate yet user-friendly, ended up producing captivating 4-second videos that bring static images to animated life, offering a new realm for video and movie enthusiasts.

During this exploration, I stumbled upon FinalFrame.ai, an indie platform specializing in AI-generated video creation and extension. The results, akin to those from RunwayML, were astounding, adding incredible depth to my creations. With their announcement of a major upgrade, Version 2, coming soon, the X.com AI Artist community is abuzz, many eager to compare it with Adobe’s FireFly, which I am keen to explore.

To perfect my 4-second video creations, I used Descript, a tool Mike and I have been relying on for producing and editing our podcast and other projects. Descript revolutionizes video and podcast production, offering everything from text-based video editing to AI voice cloning. I uploaded my video to Descript, easily added audio tracks, applied our Antics.tv logo, and shared it on X.com. The immediate response, comments, and impressions after posting were thrilling.

What’s the point of all this? It's to highlight the rapid, transformative artistic capabilities enabled by AI tools, free and affordable. Even as someone who still holds skepticism and apprehension in various areas of AI, I'm now more excited about realizing projects I've dreamed of since my teen years. Notes and ideas for movies, stories, and short-form media are now within reach, achievable in hours or a weekend. It's an incredible time for Generative Art and AI, and I hope you share in this excitement as developers vie to bring our creative visions to life in ways we could only dream of.

We’re also excited to premiere the short AI thriller film, “Scourge”. For this, I used Midjourney Version 6.0 to create the static images, fed them into RunwayML’s Image to Video tool, and finally brought everything together and added sound in Descript. Hope you enjoy!

Have an idea for a creation or a project needing advice on realization? We'd love to hear from you. Contact us on our X.com page or email us at [email protected].

Must-Read Articles

OpenAI's Latest Innovation: Embedding Models Transforming Text into Meaningful Vectors

OpenAI has recently introduced new embedding models, text-embedding-3-small and text-embedding-3-large, offering advanced features for transforming text into numerical vectors, known as embeddings. These models excel in various applications like search, clustering, recommendations, anomaly detection, diversity measurement, and classification. Embeddings function by measuring the relatedness of text strings through vectors, where the distance between vectors indicates their similarity. These tools are pivotal for tasks requiring the analysis of text similarity and relatedness, making them invaluable in today's data-driven landscape. This article delves into the specifics of these embedding models, their applications, and how to effectively utilize them.
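As a quick illustration of what “distance between vectors” means in practice, here’s a minimal sketch using the new text-embedding-3-small model through OpenAI’s Python SDK (v1+) to compare two strings with cosine similarity. Treat it as a sketch; the example strings are ours.

```python
# Minimal sketch: embed two strings and compare them with cosine similarity.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

texts = ["How do I reset my password?", "Steps to recover account access"]
result = client.embeddings.create(model="text-embedding-3-small", input=texts)
a, b = (item.embedding for item in result.data)

def norm(v):
    return sum(x * x for x in v) ** 0.5

similarity = sum(x * y for x, y in zip(a, b)) / (norm(a) * norm(b))
print(similarity)  # closer to 1.0 means the strings are more related
```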

Google Unleashes Gemini: The Future of AI-Powered Advertising in Google Ads

Google has recently unveiled Gemini, a new conversational tool within its Google Ads platform, revolutionizing how advertisers create Search ad campaigns. Leveraging large language models, Gemini simplifies campaign creation through a chat-based interface, utilizing website URLs to generate relevant ad content, including keywords and AI-generated images. Currently available in beta to English language advertisers in the U.S. and U.K., with plans for global and multilingual expansion, this tool promises to enhance campaign quality with minimal effort. This innovation is part of Google's broader strategy to infuse AI into its product suite, as seen with their AI-driven features in Chrome and the Product Studio for AI-generated product imagery.

Listener's Voice

One of our listeners, Laurie, writes, "I am very worried about media literacy and the general population. Not everyone is curious. People have lazy tendencies, and AI doesn’t have a conscience. It will be interesting to see how this plays out."

Hi Laurie, we agree with you, and we believe the only way to combat the lack of media literacy is to raise awareness! That doesn’t have to mean posting on LinkedIn or sending out emails; one of the most effective ways to spread the word is to reach regular, everyday folks. This could look like letting your mom and dad know about deepfakes and how they could be targeted, or sharing some basic tips on social media for thinking critically about unsolicited calls or emails.

This is the same kind of awareness building we already do around phishing or Ponzi schemes. We know the risks, so it’s our responsibility to let folks know!

Thank you, Laurie, for your quote and taking the time to write in!

Mike's Musings

What’s up everyone, Mike here! Can’t wait to “delve” into an exciting segment here in week 3.

Tech Deep Dive

Mike breaks down a complex AI concept into understandable terms.

Generative Adversarial Network (GAN)

Hey all, Mike here! Today we’re going to break down the acronym GAN. In plain English, here’s how a GAN works…

In a GAN, there are two key players: the generator and the discriminator. The generator's job is to create new content, like images or stories. It's like an artist trying to create a masterpiece.

The discriminator acts as a critic. It looks at the generator's work and real-world examples, trying to tell the difference between the two. Its job is to figure out if what the generator made is fake or real.

Both the generator and discriminator are learning at the same time. The generator keeps improving its creations to trick the discriminator, while the discriminator gets better at spotting fakes.

Over time, this competition leads to the generator making very realistic content, often so good that it's hard to tell it's artificial.

GANs are useful for many things, like creating realistic images, enhancing photos, or even generating new product designs. They are a powerful tool in AI, where two parts work against each other to get better results.
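For the code-curious, here’s a minimal PyTorch sketch of that artist-vs-critic loop on toy 2-D points rather than images. It isn’t a production GAN, just the two-player training step in miniature.

```python
# Minimal GAN sketch: the generator learns to produce points near (3, 3); the discriminator learns to spot fakes.
import torch
import torch.nn as nn

generator = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))      # noise -> fake point
discriminator = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))  # point -> real/fake score

g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for step in range(200):
    real = torch.randn(64, 2) + 3.0        # "real" data: points clustered around (3, 3)
    fake = generator(torch.randn(64, 8))   # the artist's attempt

    # Train the critic (discriminator) to tell real from fake.
    d_loss = loss_fn(discriminator(real), torch.ones(64, 1)) \
           + loss_fn(discriminator(fake.detach()), torch.zeros(64, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Train the artist (generator) to fool the critic.
    g_loss = loss_fn(discriminator(fake), torch.ones(64, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()

print(generator(torch.randn(5, 8)))  # samples should drift toward the real cluster
```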

For a longer breakdown of GANs, check out my article below:

Mike's Favorites

Sharing personal recommendations for AI books, podcasts, or documentaries.

AI Powered Toys

One aspect of AI toys that I find particularly intriguing is their potential to accelerate learning. A colleague of mine observed a significant reduction in their learning time, stating, "Something that would have taken me 2-4 days to learn, I learned well in about 2 hours. ChatGPT allowed me to use my experience to dig into what I was looking for MUCH faster than Google or the documentation."

This efficiency in learning can be a game-changer, especially when applied to AI toys for kids. These toys have the potential to provide tailored, interactive learning experiences, transforming the way our children play and learn.

Another exciting development in AI toys is the customization feature. With some toys, like Curio's, parents can program in their family values and anything else they want to steer the AI with. This means the AI toy can align with a family's unique cultural, ethical, or religious perspectives. Whether your family is religious or vegetarian, these toys can be programmed to reflect those values. It's a step towards preserving individuality in a world that often leans towards homogenization.

However, it's crucial to understand that AI toys, no matter how aligned with family values, can never replace the engagement kids have with their parents. Parental interaction is irreplaceable for emotional bonding and value transmission. Yet, self-play, facilitated by AI toys, is also vital for a child's independent learning and development. These toys can be excellent tools for times when direct parental involvement isn't possible, offering limitless potential for self-guided learning.

Check out the Forbes video about AI-powered toys below:

IBM Explainer Video for RAG

You may remember that in last week’s edition, I broke down the term RAG (Retrieval-Augmented Generation). Since then, I’ve found a fantastic video that explains RAG visually, with some great examples.

Contact Us

Got a product, service, or innovation in the AI and tech world you're itching to share? Or perhaps you have a strange, hilarious, or uniquely entertaining experience with AI tools or in the AI space? We at Artificial Antics are always on the lookout for exciting content to feature on our podcast and in our newsletter. If you're ready to share your creation or story with an enthusiastic audience, we're ready to listen! Please reach out with a direct message on X.com or send us an email. Let's explore the possibility of a thrilling collaboration together!

Closing

Elevate your Artificial Antics experience! Subscribe to our YouTube for exclusive AI Bytes, follow us on LinkedIn for the latest AI insights, and share our journey with friends. Don't forget to keep up with us on all major streaming platforms for every podcast episode!

Thank You

We're truly grateful for the incredible support from the Artificial Antics family. Your active engagement on our social media, dedication in tuning into our episodes, avid readership of our newsletter, and steadfast following on X.com are the lifeblood of our podcast. It's your involvement that fuels our passion and transforms each episode into a shared adventure into AI's intriguing world. A huge shoutout to our families and friends for their boundless love and backing. Also, a big thanks to our followers on X.com and the AI Art and Generative Art enthusiasts we've interacted with recently. Your creative spirit and efforts in sharing your amazing ideas and artworks inspire us all. Keep creating and enlightening the world with your unique contributions!

 THANK YOU!

AI: Amplify, not replace.