Acoustics final ch 2 and 3

  1. What are the levels of speech?
    • Linguistic for speaker
    • Physiological for speaker
    • Acoustic- sound waves
    • Physiological for listener
    • Linguistic for listener
  2. Which level is the easiest to study?
    acoustic level
  3. Linear source-filter theory
    • expresses articulatory-acoustic relationships
    • *one of the most important/best theories in our field
  4. What is involved in speech production?
    • need a power source (breath support)
    • we get a complex periodic signal from the vocal folds (they vibrate in 3 different ways and come together)
    • speech is changed by changing the shapes of your cavities
  5. What is the source of sound for speech?
    • vocal folds (vibration)
    • *for some consonants, the source is more complex (can be in the vocal tract or a combination of both- voiceless sounds)
  6. What makes you sound like you?
    shapes of pharynx, nasal, and oral/mouth cavities
  7. What is the filter for speech?
    • vocal tract (frequency dependent like all filters)
    • resonator (air filled cavity)
  8. What does the resonator do for you?
    • the resonator's natural frequencies change as its shape changes (the ear does everything for you so you can perceive the differences)
    • 3-6 syllables per second
  9. How are the source and filter related?
    • they are assumed to be independent of each other (an assumption made for convenience)
    • this implies that you can change the output of the vocal folds without changing the vocal tract and vice-versa
  10. What do the vocal folds and vocal tract give you?
    • vocal folds- fundamental frequency, harmonics, and amplitude changes
    • vocal tract- articulation
  11. How are vowels modeled?
    as a tube closed at one end and open at the other
  12. What is the formula to calculate where the resonant frequencies will be?
    • Fn = (2n-1)c/4l
    • Fn = resonant frequency
    • n = integer (if looking for 1 you put 1, if looking for 2 you put 2, etc)
    • c = speed of sound
    • l = length of the tube
  13. What is the first resonant frequency with a tube length of 17 cm and speed of sound is 34,000 cm/s?
    • Fn = (2n-1)c/4l
    • (2-1)*34000/(4*17) = 500 Hz
    • *the longer the tube the lower the resonant frequencies, the shorter the tube the higher the resonant frequencies
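The worked example above can be checked with a short Python sketch. It assumes the quarter-wavelength formula for a tube closed at one end and open at the other, Fn = (2n-1)c/(4 × tube length); the function name `tube_resonance` is only illustrative.

```python
def tube_resonance(n, length_cm, c_cm_per_s=34000.0):
    """Resonance number n (1, 2, 3, ...) of a tube closed at one end, open at the other."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return (2 * n - 1) * c_cm_per_s / (4.0 * length_cm)

# First three resonances of the 17 cm tube from card 13.
first_three = [tube_resonance(n, 17.0) for n in (1, 2, 3)]
print(first_three)  # [500.0, 1500.0, 2500.0]
```

Halving the tube length in the call doubles every value, which is the "shorter tube, higher resonances" rule in numbers.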
  14. How many resonances are there for a tube?
    • infinite
    • we only need to consider the first 3 or 4 (the model is valid to only about 5 kHz)
  15. What happens when the shape of the tube changes going from one vowel to another?
    resonant frequencies change
  16. Why doesn't changing the frequency/energy of the source of vibration change the resonant frequencies of the pipe/vocal tract?
    the source and filter are independent of one another
  17. What are formant frequencies?
    • resonant frequencies of vowels
    • *do NOT confuse with fundamental frequency!
  18. How do a curved tube (vocal tract) and a straight tube (model) behave out to 5 kHz?
    • acoustically identical
    • the curve begins to affect acoustic signals with a short wavelength
  19. What happens if the tube has uniform cross sectional area?
    the resonances are equally spaced
  20. Does all of the energy come from the source or filter?
    • source
    • vocal fold vibration for vowels
  21. What does changing the length of the tube do?
    • changes the resonance frequencies
    • influenced by age and sex
    • l = 14.5 cm for females
    • l = 8.75 cm for children
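To see how much tube length matters, the sketch below lists the first three resonances for the lengths on this card next to the 17 cm tube from card 13. It assumes a uniform tube closed at one end and c = 34,000 cm/s; real vocal tracts are not uniform, so treat the numbers as neutral-vowel estimates only.

```python
SPEED_OF_SOUND_CM_S = 34000.0  # value used on card 13

def first_resonances(length_cm, count=3):
    """First `count` resonances (Hz) of a uniform tube closed at one end."""
    return [(2 * n - 1) * SPEED_OF_SOUND_CM_S / (4.0 * length_cm)
            for n in range(1, count + 1)]

tube_lengths_cm = [
    ("17 cm (card 13 example)", 17.0),
    ("14.5 cm (females)", 14.5),
    ("8.75 cm (children)", 8.75),
]
for label, length_cm in tube_lengths_cm:
    print(label, [round(f) for f in first_resonances(length_cm)])
```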
  22. What does every formant/resonant/natural frequency have?
    its own frequency, amplitude, and bandwidth
  23. How are different vowels modeled?
    acoustically by different vocal tract shapes
  24. Phonetically, how are vowels distinguished?
    position of the tongue
  25. What happens if a constriction is placed on the tube/vocal tract?
    the resonances change
  26. What happens if you change the articulation?
    you change the vocal tract shape, and the resonance frequencies, amplitudes, and bandwidths
  27. *Ch. 2 slide 14
  28. The output energy of a vowel is the product of:
    • the source energy
    • the size and shape of the resonator
    • the radiation characteristics (adds 6 dB/octave)
    • the source spectrum falls by 12 dB/octave and radiation adds back 6 dB/octave, so the output slope is actually -6 dB/octave
  29. What are glottal source characteristics for vowels?
    • vocal fold vibration is periodic
    • fo or F0 is used to indicate the vocal fundamental frequency
    • the amplitude of the harmonics decreases by 12 dB/octave
  30. What gives you amplitude changes?
    • source
    • only changing source and not filter makes resonant frequencies stay the same
  31. *Ch. 2 slide 19
  32. What are filter characteristics for vowels?
    • the vocal tract is a dynamic filter (changes constantly)
    • it is frequency dependent
    • it has, theoretically, an infinite number of resonances (only care about 1st 3 or 4 for vowels)
    • each resonance has a center frequency, an amplitude, and a bandwidth
    • for speech, these resonances are called formants
    • formants are numbered in succession from the lowest (F1, F2, F3, etc)
    • the formants together form the transfer function (input-output relationship; formants become physically evident only when energized)
  33. Which harmonic has the highest amplitude?
    the one closest to a formant (resonant) frequency of the vowel
  34. What is radiation characteristic?
    • acoustic effect when a sound leaves a small area and enters a large one (like speaker)
    • the effect is to raise the slope of the spectrum by +6 dB/octave
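Taken together with the 12 dB/octave fall of the glottal source (card 29), the +6 dB/octave radiation characteristic leaves a net tilt of -6 dB/octave before the formant peaks are added. The sketch below does only that slope arithmetic; it leaves out the vocal tract filter, and the function name is illustrative.

```python
import math

SOURCE_SLOPE_DB_PER_OCTAVE = -12.0    # glottal source (card 29)
RADIATION_SLOPE_DB_PER_OCTAVE = 6.0   # radiation characteristic (card 34)

def level_change_db(f_low_hz, f_high_hz, slope_db_per_octave):
    """Level change between two frequencies for a constant dB/octave slope."""
    octaves = math.log2(f_high_hz / f_low_hz)
    return slope_db_per_octave * octaves

net_slope = SOURCE_SLOPE_DB_PER_OCTAVE + RADIATION_SLOPE_DB_PER_OCTAVE
print(net_slope)                                 # -6.0
print(level_change_db(100.0, 800.0, net_slope))  # -18.0 (three octaves)
```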
  35. *What are the acoustic phonetic relationships for vowels?
    • F1 is inversely related to tongue height (raise tongue, low F1 and vice versa)
    • F2 is directly related to tongue advancement (back vowels have low F2, front vowels have high F2)
    • lip rounding lowers all formant frequencies (because you're making the vocal tract longer)
    • you can calculate how close a person is to the sound they are trying to make
  36. What does perturbation mean?
    constriction
  37. What is the perturbation theory?
    • volume velocity variations reflect the way air particles vibrate at a particular point in the vocal tract (how the air is passing through vocal folds)
    • at some points, vibration is minimal (node); at others, maximal (antinodes)
    • for F1, the antinode is at the open end of the tube (mouth) and the node is at the closed end (vocal folds)
    • for F2, there are 2 antinodes and 2 nodes, etc
  38. Where is there always an antinode?
    lips
  39. Where is there always a node?
    vocal folds
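For a uniform tube the node and antinode positions can be worked out directly. This sketch assumes the idealized closed-open tube, where the volume velocity for formant n varies as sin((2n-1)·pi·x/2L), with x measured from the vocal folds; the 17 cm default and the function name are illustrative.

```python
def velocity_nodes_and_antinodes(n, length_cm=17.0):
    """Volume-velocity node and antinode positions (cm from the vocal folds)
    for formant n of a uniform tube closed at the folds and open at the lips."""
    m = 2 * n - 1
    nodes = [2 * k * length_cm / m for k in range(n)]            # minimal vibration
    antinodes = [(2 * k + 1) * length_cm / m for k in range(n)]  # maximal vibration
    return nodes, antinodes

for formant in (1, 2, 3):
    nodes, antinodes = velocity_nodes_and_antinodes(formant)
    print(f"F{formant}: nodes at", [round(x, 2) for x in nodes],
          "antinodes at", [round(x, 2) for x in antinodes])
```

Every formant has a node at 0 cm (vocal folds) and an antinode at the full tube length (lips), and formant n has n of each.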
  40. What happens when there is a constriction near a node?
    formant frequency will increase
  41. What happens when there is a constriction near an antinode?
    formant frequency will decrease
  42. Perturbation theory, if a change in cross sectional area is applied (a perturbation):
    • the acoustic effect depends on proximity to a node or an antinode (antinode = lower freq.; node = higher freq.)
    • lip constrictions lower all formant frequencies
    • laryngeal constrictions raise all formant frequencies
  43. What do amplitudes depend on?
    formant frequencies
  44. If F1 is lowered (raised), what happens to A1?
    it lowers (rises)
  45. If 2 formant frequencies move closer together:
    both peaks increase in amplitude
  46. How do you raise or lower formant frequencies?
    change articulators (3-6 syllables per second)
  47. What are source-filter interactions?
    • independent of one another
    • BUT some vocal tract shapes may affect vocal fold vibration:
    • singers' formant (to be heard over background noise)
    • high impedance constrictions require greater subglottal air pressure
    • vocal tract - vocal fold coupling during open phase of vibratory cycle
  48. What can the linear source-filter theory be used to describe?
    the acoustics of consonants as well as vowels
  49. Why, for consonants, is the source not always at the level of the vocal folds?
    • some sources are in the vocal tract
    • these sources are aperiodic
    • durations and amplitudes also are different from vowels
  50. What does the source-filter theory give us?
    a series of expectations for the acoustic characteristics for consonants
  51. How are fricatives modeled?
    as a tube with a very severe constriction
  52. What are characteristics of fricatives?
    • the air exiting the constriction is turbulent
    • zeros or antiformants can be found in the spectrum
    • because of the turbulence, there is no periodicity unless accompanied by voicing
  53. What are characteristics of nasal consonants?
    • velopharyngeal port is open and the oral cavity is completely blocked at some point
    • the side-branch resonator produces antiformants (zeros)
    • the overall vocal tract is longer than for vowels
    • oral formants, nasal formants, nasal antiformants
    • nasal murmur
  54. What are characteristics of stops?
    • the tube model is not altered very much
    • time domain is critical
    • there is a complete closure of the vocal tract somewhere
    • pressure builds up behind the closure
    • rapid release
    • articulation results in a burst and transitions
  55. What does analog mean?
    storing ALL the information on a wave
  56. What does digital mean?
    samples the wave at specific times and stores only those points (it connects the dots for you and doesn't record the amplitudes in between)
  57. What is a spectrograph?
    • an instrument that can capture the dynamics of speech
    • acoustic signals vary only in frequency, amplitude and time; the sound spectrograph captures all of these
  58. What is a spectrogram?
    the output (usually a hardcopy) of a spectrograph
  59. What is a wide-band filter good for?
    looking at formant frequencies
  60. What is a narrow-band filter good for?
    looking at harmonics and fundamental frequency
  61. What is abscissa?
    • time
    • x axis
  62. What is ordinate?
    • frequency
    • y axis
  63. What do black areas of a spectrogram indicate?
    highest amplitudes
  64. What do white areas on a spectrogram indicate?
    the noise floor
  65. What do shades of gray in a spectrogram indicate?
    • intensity
    • amplitudes between highest amplitude and noise floor
    • the more intense the signal is at a particular frequency and time, the darker the trace
  66. What is the Nyquist theorem?
    • in order to represent a signal faithfully, it must be sampled at a rate of at least twice its highest frequency
    • if you don't pick the right sampling rate, you don't get accurate output (if you get the wrong output, all your measurements are wrong)
  67. What is presampling or brickwall filtering?
    • removes all of the energy above the Nyquist frequency
    • the clinician/researcher determines the Nyquist frequency
    • some knowledge of speech and of speech and language disorders is required
  68. What is aliasing?
    • when the output doesn't match the input
    • when you don't follow Nyquist rule
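A small numeric sketch of the Nyquist rule and of aliasing, assuming a pure tone and no presampling filter. The function names and the example frequencies are illustrative and do not come from the cards.

```python
def minimum_sampling_rate(highest_frequency_hz):
    """Nyquist rule: sample at no less than twice the highest frequency."""
    return 2.0 * highest_frequency_hz

def alias_frequency(signal_hz, sampling_rate_hz):
    """Apparent frequency of a pure tone after sampling
    (the tone folds around the nearest multiple of the sampling rate)."""
    nearest_multiple = round(signal_hz / sampling_rate_hz)
    return abs(signal_hz - nearest_multiple * sampling_rate_hz)

print(minimum_sampling_rate(5000.0))     # 10000.0
print(alias_frequency(3000.0, 10000.0))  # 3000.0 (below Nyquist, unchanged)
print(alias_frequency(7000.0, 10000.0))  # 3000.0 (above Nyquist, aliased)
```

In the last call a 7000 Hz tone sampled at 10,000 Hz comes out looking like a 3000 Hz tone, so the output no longer matches the input.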
  69. What are discrete numbers?
    dots along wave (not continuous measurement)
  70. What is sampling rate?
    how many times you take a discrete number
  71. What is sampling?
    how many times per second the amplitude will be recorded
  72. What does sampling for digital signal processing do?
    • analog-to-digital conversion
    • signal must be sampled at the Nyquist rate
    • sampling rate decides the times at which the signal will be sampled
    • sampling converts the acoustic signal into a series of numbers
    • instead of amplitudes at all instances of time, no matter how small the time interval, amplitudes in the digital world exist only at the sampling interval
  73. What happens to the samples determined by the sample rate?
    they are chopped into discrete numbers (converting amplitude variations into discrete numbers)
  74. What is quantization?
    • discrete number of amplitude levels
    • the more quantizer levels available, the more closely the discrete signal represents the original analog signal (higher the rate, smaller the interval)
    • in our applications, 16-bit quantizers over a 20-volt range are typical (this yields an amplitude resolution of about 300 microvolts and a signal to noise ratio of 96 dB)
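The figures on this card can be reproduced with a few lines of arithmetic. This sketch assumes the usual approximation that signal-to-noise ratio is 20·log10 of the number of levels (about 6 dB per bit); the function name is illustrative.

```python
import math

def quantizer_summary(bits=16, voltage_range_v=20.0):
    """Number of levels, step size in microvolts, and approximate SNR in dB."""
    levels = 2 ** bits
    step_microvolts = voltage_range_v / levels * 1e6
    snr_db = 20.0 * math.log10(levels)
    return levels, step_microvolts, snr_db

levels, step_uv, snr_db = quantizer_summary()
print(levels)             # 65536
print(round(step_uv, 1))  # 305.2, i.e. roughly 300 microvolts
print(round(snr_db, 1))   # 96.3
```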
  75. What happens after A/D (analog to digital) conversion?
    • the signal is stored as a stream of numbers
    • time is related by the index to the sampling rate
    • the amplitude is the stored number (quantization process)
    • in this form, many operations can be performed (you can do anything you want)
  76. What is involved in a waveform display?
    • duration measurements (speech changes gradually)
    • signal editing
    • amplitude measurements (rms is most common)
    • vocal fundamental frequency
    • *some consistent rules need to be adopted for duration and signal editing
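Once a signal is stored as numbers, duration and rms amplitude are simple calculations. This sketch uses a synthetic 100 Hz tone sampled at 10,000 Hz in place of a recording; both values and the function names are assumptions for illustration.

```python
import math

def duration_s(num_samples, sampling_rate_hz):
    """Duration in seconds: the sample index is tied to time by the sampling rate."""
    return num_samples / sampling_rate_hz

def rms(samples):
    """Root-mean-square amplitude of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

sampling_rate_hz = 10000
# 1000 samples of a 100 Hz sine with peak amplitude 1.0 (ten full cycles).
tone = [math.sin(2 * math.pi * 100 * i / sampling_rate_hz) for i in range(1000)]

print(duration_s(len(tone), sampling_rate_hz))  # 0.1
print(round(rms(tone), 4))                      # 0.7071 (peak divided by sqrt 2)
```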
  77. What is a digital spectrograph?
    a series of spectra based on the FFT (fast Fourier transform) or LPC (linear predictive coding)
  78. How is amplitude depicted in a digital spectrograph?
    as shades of gray
  79. What is an example of a digital spectrograph?
    • Praat
    • does the work for us
  80. What is linear predictive coding (LPC)?
    • you can predict where the next dot (amplitude) will be based on previous cycles (as few as 10 to 15 previous samples is all that is required)
    • speech does not generally vary wildly from sample to sample (highly predictable)
  81. What is the equation for LPC?
    • y = a0 + a1*x(n-1) + a2*x(n-2) + ...
    • y = amplitude of the next sample
    • x(n-1), x(n-2), ... = the previous samples (1 back, 2 back, etc)
    • a = estimates of the resonances of vocal tract (can represent sections of vocal tract)
    • allows you to talk on the phone (can guess what speech will be so it only has to transfer so many numbers)
    • individuals with voice/hearing problems have problems with being understood on the phone
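The idea of predicting the next sample from a weighted sum of previous samples can be sketched as follows. This is not how an LPC analysis estimates its coefficients from speech; the two weights are chosen by hand for a pure tone (which obeys x(n) = 2cos(w)·x(n-1) - x(n-2) exactly), and `predict_next` is a hypothetical helper.

```python
import math

def predict_next(previous_samples, coefficients, a0=0.0):
    """Weighted sum of previous samples.
    coefficients[0] weights the most recent sample, coefficients[1] the one before, etc."""
    prediction = a0
    for k, a_k in enumerate(coefficients, start=1):
        prediction += a_k * previous_samples[-k]
    return prediction

# A 100 Hz tone sampled at 10,000 Hz (assumed values).
omega = 2 * math.pi * 100 / 10000
signal = [math.sin(omega * i) for i in range(50)]

coefficients = [2 * math.cos(omega), -1.0]  # hand-picked for a pure tone
predicted = predict_next(signal[:49], coefficients)
error = abs(predicted - signal[49])
print(error < 1e-9)  # True: two previous samples are enough for a pure tone
```

Speech is more complex than a single tone, which is why the card mentions 10 to 15 previous samples rather than 2.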
  82. What is a wideband spectrogram?
    • short time window (.005, .007, .009 s)
    • good for measuring formant frequencies (of vowels)
  83. What is a narrowband spectrogram?
    • long time window (.1, .05 s)
    • good for showing and measuring harmonics
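The wideband and narrowband names follow from the time-frequency trade-off: a short window gives a wide analysis bandwidth and a long window gives a narrow one. The sketch below uses the rough rule that bandwidth is about 1/window length; the exact figure depends on the window shape. It assumes the window lengths on the cards are in seconds and, for the sample counts, a 10,000 Hz sampling rate.

```python
def approx_bandwidth_hz(window_s):
    """Rough analysis bandwidth for a window of the given length (reciprocal rule)."""
    return 1.0 / window_s

def samples_in_window(window_s, sampling_rate_hz):
    """Number of samples the analysis window spans."""
    return round(window_s * sampling_rate_hz)

for label, window_s in [("wideband", 0.005), ("narrowband", 0.05)]:
    print(label, window_s, "s:",
          round(approx_bandwidth_hz(window_s)), "Hz,",
          samples_in_window(window_s, 10000), "samples")
```

A bandwidth near 200 Hz is wider than the spacing between harmonics for most voices, so harmonics blur together and formants stand out; a bandwidth near 20 Hz is narrow enough to separate individual harmonics.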
Author: elz125
ID: 119668
Card Set: Acoustics final ch 2 and 3
Description: Acoustics of speech and hearing