The well-intentioned people repeating "give me more words" and "I want clear diction" at you are not in your position: they are not the ones who have to sing the role day in and day out.
Most of the time they can't demonstrate, and when they can, they are only singing for a phrase or two. They are not performing the whole role: you are.
It's your job to meet the demands of diction as well as possible while preserving your instrument so you can repeatedly sing the whole show.
Good diction has two parts: a clear vowel and clear consonants.
Not loud or forceful - clear.
The first way to clear up diction is to make sure the consonant is present and held long enough to be perceived.
Often singers simply don't sustain a consonant long enough for a listener to perceive it. Fricatives and voiced consonants (L, M, N, TH, V, F, S, Sh, soft Ch) need to have duration.
They typically don't need more air (higher air flow), they just need to last longer.
Plosives (b, p) need just enough breath to happen, and not a jot more.
No matter how much we wish it were, or how often or how authoritatively someone says it, singing is not speaking on pitch.
Speech doesn't require a free and comfortable mechanism.
Speech doesn't require much variety of pitch or long phrases. A spoken phrase averages five seconds. A sung phrase averages ten seconds - twice as long.
Except for very loud speech, sub-glottal pressure is absent (and should be). Sub-glottal pressure should always be present in singing.
If you make the choice to sing the way you speak, you need to accept the limitations that come with that: less volume, shorter phrases, difficulties with high notes and generally less comfort in singing.
One trap in trying to have good diction is to constrict the resonator to make a speech vowel, which tightens the throat so that the pitch changing mechanism can't move.
This does two things: a) sticks you in one mechanism (heavy or light) and b) forces you to increase compression to change pitch, putting more and more pressure on the vocal folds.
The vocal folds can only take so much pressure before they give up.
You have to keep the larynx-throat space as intact as possible while changing the shape of the mouth-throat space so that the vowel sounds right.
The higher you sing, the more important it is to protect the lower formant.
This allows you to use the stretchers, which lets you change pitch without killing yourself.
Visual Vowel Generator: Pink Trombone
One trap in trying to have good diction is to increase the airflow to the speed of speech airflow.
This can collapse the mechanism itself by causing the vocal folds to shorten. Then you start to require higher and higher airflow speeds to change pitch. Eventually you can't blow enough air to increase pitch.
Keep your compression intact by actively resisting the exhale while you sing.
It's not really two vowels - it's the two components of a vowel.
There is a mouth-throat component and a larynx-throat component.
The second component is usually just created by the stretcher as you ascend - as the thyroid cartilage rocks backward, the throat gets a bit longer and it lowers that component.
This is not the same thing as changing laryngeal depth for un-miked singing, where you actively lower the larynx* as part of increasing perceived volume.
*Pull down with the sterno-thyroids; don't push down with the muscles of the tongue.