You Don't Say (Overwhelmed Again and Again)

The simple-sounding idea was to overlay my study texts with voice-overs.

However, as the subtitle above suggests, I was quickly overwhelmed by too much information, too much confusion, and too many details.

First, the back story. One project I'm working on uses Python to display a sequence of colorized review text lines about a topic I'm trying to learn, say the 47 non-dunder methods of the string object (str) in Python. (See, for example, here.)
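
A minimal sketch of that colorized review-line idea, assuming a terminal that understands ANSI color codes (older Windows consoles may not). The helper names are made up for illustration and are not from the actual project, and the count it prints depends on the Python version.

```python
# ANSI escape codes for terminal colors (assumed to be supported by the terminal).
CYAN = "\033[36m"
RESET = "\033[0m"


def non_dunder_str_methods():
    """Return the names of str attributes that are not dunder (__x__) names."""
    return [name for name in dir(str) if not name.startswith("__")]


def colorize(text, color=CYAN):
    """Wrap text in an ANSI color code so it prints in color."""
    return f"{color}{text}{RESET}"


if __name__ == "__main__":
    methods = non_dunder_str_methods()
    print(f"str has {len(methods)} non-dunder methods in this Python version")
    for name in methods:
        print(colorize(f"str.{name}()"))
```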

It turns out that for me, at the first writing of this post, playing sounds via Python was a non-trivial endeavor. There seemed to be hundreds of different approaches to try out. [Post Script: Ultimately I did find a combination of approaches that worked. Yay.]

I spent a lot of time researching questions such as these:

(1) Should I use Audacity to dub with my own voice, or use a TTS (Text To Speech) program like SuperWhisper? (Note: the Windows version of SuperWhisper was missing a Vulkan DLL, so I had to download the Vulkan SDK package.) [Post Script: Audacity is complicated. I need to study a whole bunch of videos to start figuring it out, for example: by Mike Russel here, by Skills Factory here, and more by Jay Mayor about audio "enhancement" here (using Adobe Enhancer here).]

(2) Another question: Do I use the os module to play the sound, or some other approach (Pygame?) [p.s. Later-discovered answer: PYGAME]

(3) Pre-compute option: Converting text to a speech file takes time. One option is to run such conversions ahead of time, in idle moments, for example while the user is supposed to be reading some study text or watching a study video.

(4) Scarcity of free voice types. pyttsx3 has only two voices (male and female) and apparently no way to add more. [p.s. But by controlling the voice rate, pyttsx3 gave acceptable results.] [Post Script: Later I learned that I can use Audacity and/or other audio enhancers to alter the initial WAV audio produced by pyttsx3.]

(5) ....
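
Questions (3) and (4) can be sketched together: convert each review line once, ahead of time, and set the pyttsx3 speaking rate while doing so. This is only a sketch under assumptions. The cache folder name, the function names, and the rate of 150 are all made up for illustration. The pyttsx3 calls used (init, setProperty, save_to_file, runAndWait) are from its documented API, and pyttsx3 is imported inside the function so the caching logic can be tried without it installed.

```python
import hashlib
from pathlib import Path


def cached_wav_path(text, cache_dir="tts_cache"):
    """Map a line of text to a stable WAV file name inside cache_dir."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]
    return Path(cache_dir) / f"{digest}.wav"


def synthesize_with_pyttsx3(text, wav_path, rate=150):
    """Write `text` as speech to `wav_path` at the given rate (words per minute)."""
    import pyttsx3  # third-party; imported lazily so the cache logic runs without it

    engine = pyttsx3.init()
    engine.setProperty("rate", rate)
    engine.save_to_file(text, str(wav_path))
    engine.runAndWait()  # processes the queued save_to_file command


def precompute(lines, synthesize=synthesize_with_pyttsx3, cache_dir="tts_cache"):
    """Convert each line once; skip lines whose audio file already exists."""
    Path(cache_dir).mkdir(parents=True, exist_ok=True)
    paths = []
    for text in lines:
        wav_path = cached_wav_path(text, cache_dir)
        if not wav_path.exists():
            synthesize(text, wav_path)
        paths.append(wav_path)
    return paths
```

Calling precompute(["str.upper() returns an uppercase copy"]) during an idle moment would leave a WAV file ready for later playback. Note that the file name depends only on the text, so changing the rate means clearing the cache folder first.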

MY INITIAL RESEARCH RESULTS:

(links_01):
Too advanced for me just yet, but worthy of note: According to the AI Engineer at "All About AI" (here) there is a package called ChatTTS on GitHub (here) that is really good.

(links_02):
Also way too advanced for me at the moment, but worthy of note: According to Jarod's Journeys (here) there is a package called TortoiseTTS on GitHub (here) that is really good (but needs 22GB plus an Nvidia chip? ugh)

(links_03):
The simpler and preferred TTS is pyttsx3 (search results here), which has some flexibility but apparently only two voices and no known way to add more. See for example this tutorial (Brilliant Rodgers here). The online book Automate the Boring Stuff has a Chapter 24 explaining pyttsx3 (here).

[p.s. I missed a vital warning in studying the above. That miss caused me grief. The warning is that pyttsx3 outputs WAV audio files, NOT mp3! ]
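
One way to avoid that kind of grief is to check what a file actually contains instead of trusting its extension. The sketch below uses only the standard library. The function names are made up for illustration, and it assumes a plain PCM WAV file, which is what the wave module reads.

```python
import wave


def looks_like_wav(path):
    """Return True if the file starts with the RIFF/WAVE header of a WAV file."""
    with open(path, "rb") as f:
        header = f.read(12)
    return len(header) == 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE"


def wav_duration_seconds(path):
    """Return the playing time of a WAV file using the standard-library wave module."""
    with wave.open(str(path), "rb") as w:
        return w.getnframes() / w.getframerate()
```

A file named something.mp3 that passes looks_like_wav is really WAV audio with a misleading name.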

(links_04):
One Reddit post (here) provides some TTS suggestions including yapper-tts? (here) which supposedly has many different voices.

(links_05):
The PyGame module is favored by many YouTubers (here), including a long tutorial by Codemy (here and more here) which I have not watched in full yet. In an older post (5 years old, here), he uses Tkinter to create an MP3 player.

(links_06):
Bro Code has a detailed tutorial for an MP3 player using Pygame (here)
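
For reference, playing a single file with Pygame can be much smaller than a full MP3 player. This is a minimal sketch, not taken from any of the tutorials above. It assumes pygame is installed (pip install pygame) and that the machine has a working audio device. The function name and the polling interval are made up for illustration.

```python
import time


def play_audio(path, poll_seconds=0.1):
    """Play one audio file with pygame.mixer and block until it finishes."""
    import pygame  # third-party; imported lazily so this file loads without it

    pygame.mixer.init()
    try:
        pygame.mixer.music.load(str(path))
        pygame.mixer.music.play()
        # get_busy() stays True while the music stream is still playing.
        while pygame.mixer.music.get_busy():
            time.sleep(poll_seconds)
    finally:
        pygame.mixer.quit()
```

Calling play_audio with the path to a WAV file produced by pyttsx3 should speak that line aloud and return when playback ends.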

(links_07):
Search results for more YouTubes on "converting text to voice in Python" can be found (here). Included in these results is one fast-paced tutorial by BrainWave for edge-tts (here).

(links_08):
Fluffy's full Pygame tutorial: (here)
Fluffy's videos home (here)

ULTIMATELY, I found an unlikely tutorial at CircuitPython School (here) with a simple script for generating speech, AND IT WORKED!!! If only I had found it earlier. 20/20 hindsight is not much of a lesson.

more details to come ....



MORE TO EXPLORE
Google: search "chill music mp3 download" (here)
