October 4, 2024

How The iPod Shuffle Talks

Posted March 11, 2009 at 5:56pm by iClarified · 5386 views
The new iPod Shuffle will tell you what song is playing and who’s performing it, but how does it do it?

The secret is built into the soon-to-be-released iTunes 8.1. iTunes will add extra voice data to your music files that speaks the name of the band and the title of the song. The rendering is done on the computer and uses the text-to-speech voices already installed with your OS. This is why PC users will hear a female voice and Mac users will hear a male one (Alex).
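
To make the idea concrete, here is a minimal Python sketch of rendering an announcement on the computer with an OS voice. It is not Apple's implementation: the function names, the announcement wording, and the output path are assumptions made for illustration. The macOS `say` tool and its `-v` and `-o` options are real, but nothing in the source says iTunes calls it. The sketch only builds the command; it does not run it.

```python
import platform


def announcement_text(title, artist):
    """Build the phrase to speak for one track (the wording is an assumption)."""
    return f"{title}, by {artist}"


def default_voice(system=None):
    """Pick a voice as the article describes: Alex on a Mac.

    On any other system this returns None, meaning "use whatever voice
    the OS provides"; the article does not name the PC voice.
    """
    system = system or platform.system()
    return "Alex" if system == "Darwin" else None


def say_command(title, artist, out_path, voice="Alex"):
    """Return the macOS `say` command that would render the phrase to a file."""
    return ["say", "-v", voice, "-o", out_path, announcement_text(title, artist)]
```

On a Mac, passing the returned list to `subprocess.run` would write the spoken phrase to `out_path`; how iTunes then attaches that audio to the track is not described in the source.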

Though the additional audio data will be small, it will still increase the size of your iTunes Library. Eventually, we may be able to download tracks from iTunes with featured voices already included; for example, artists could announce their own tracks.

9to5Mac via Hrmph has posted some information from the original patent application for this technology. You can find some of it below.

Patent Application Details
Audio user interface for computing devices
-----
In order to achieve portability, many hand-held devices use user interfaces that present various display screens to the user for interaction that is predominantly visual. Users can interact with the user interfaces to manipulate a scroll wheel and/or a set of buttons to navigate display screens to thereby access functions of the hand-held devices. However, these user interfaces can be difficult to use at times for various reasons. One reason is that the display screens tend to be small in size and form factor and therefore difficult to see. Another reason is that a user may have poor reading vision or otherwise be visually impaired. Even if the display screens can be perceived, a user will have difficulty navigating the user interface in “eyes-busy” situations when a user cannot shift visual focus away from an important activity and towards the user interface. Such activities include, for example, driving an automobile, exercising, and crossing a street.

It is noted that text strings that correspond to standard text strings can have pre-recorded audio files. Such text strings may correspond to common user interface controls, such as “play”, “stop”, “previous”, etc., and to common menu items such as “Music”, “Extras”, “Backlight.” These audio files can be created using a voice talent or speech synthesized from the voice talent’s recordings. The other text displayed as part of the media player user interface that is usually user specific, such as contacts and customized playlist names can all be synthesized by building a voice from the voice talent recordings. This provides consistency by having the same voice for all textual data to be presented to the user.
-----
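
The split the patent describes, pre-recorded clips for standard strings and synthesis for user-specific text, can be sketched as a simple lookup with a fallback. This is a hedged illustration in Python, not the patented method: the `PRERECORDED` table, its file paths, and the `audio_source` function are hypothetical names.

```python
# Hypothetical table mapping standard UI strings to pre-recorded clips.
# The strings come from the patent excerpt; the file paths are made up.
PRERECORDED = {
    "play": "clips/play.aiff",
    "stop": "clips/stop.aiff",
    "previous": "clips/previous.aiff",
    "music": "clips/music.aiff",
    "extras": "clips/extras.aiff",
    "backlight": "clips/backlight.aiff",
}


def audio_source(text, prerecorded=PRERECORDED):
    """Decide how a piece of UI text would be spoken.

    Standard strings resolve to a pre-recorded clip; anything else
    (playlist names, contacts) is marked for speech synthesis.
    """
    key = text.strip().lower()
    if key in prerecorded:
        return ("prerecorded", prerecorded[key])
    return ("synthesized", text)
```

For example, `audio_source("Play")` resolves to the stored clip, while a custom playlist name such as `"Road Trip Mix"` falls through to synthesis. The patent's point is that both paths use the same voice, so the listener hears one consistent speaker.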