You're fighting a losing battle: having a character realistically lip-sync on the fly from text extracted from a database is damn difficult. Believe me, there are companies that do this.
What it boils down to is extracting phonemes (the basic fragments of spoken pronunciation) from the text, and then getting the character to respond to those phonemes by moving its lips accordingly while speaking.
Recommended program flow:
Plain text -----> Phonemic and letter-based verbal differences -----> Speech engine
                                                              |_____> Lip sync engine
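To make the flow above a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the toy lexicon, the ARPAbet-style phoneme symbols, the mouth-shape names and the function names are all made up, and the speech engine is just a stub. A real system would need a full pronunciation lexicon and a proper synthesiser.

```python
# Hypothetical toy pronunciation dictionary (ARPAbet-style symbols).
# A real system would use a full lexicon plus rules for unknown words.
TOY_LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}

# Hypothetical, heavily simplified phoneme -> mouth shape table.
PHONEME_TO_MOUTH = {
    "HH": "open", "AH": "open", "L": "tongue_up", "OW": "round",
    "W": "round", "ER": "round", "D": "tongue_up",
}


def text_to_phonemes(text):
    """Stage 1: split the text into words and look each one up."""
    phonemes = []
    for word in text.lower().split():
        word = word.strip(".,!?")
        # Unknown words are marked rather than guessed.
        phonemes.extend(TOY_LEXICON.get(word, ["<unk>"]))
    return phonemes


def speech_engine(phonemes):
    """Stage 2a (stub): a real engine would synthesise audio here."""
    return "speak: " + " ".join(phonemes)


def lip_sync_engine(phonemes):
    """Stage 2b: map each phoneme to a mouth shape, 'rest' if unknown."""
    return [PHONEME_TO_MOUTH.get(p, "rest") for p in phonemes]


def run_pipeline(text):
    """Feed the same phoneme list to both engines so they stay in step."""
    phonemes = text_to_phonemes(text)
    return speech_engine(phonemes), lip_sync_engine(phonemes)
```

The point of the design is that both engines consume the same phoneme list, so the audio and the lip movement cannot drift apart at the source.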
The difficulties you may face are:
1) Smooth transitions, both visual and spoken, between the hundreds of two-phoneme combinations.
2) Distinguishing phonemes from the text. Humans do not actually read text that way, by looking at letters and phonemes and pronouncing them. The way letters are pronounced differs in every spoken sentence, and machines find this excruciatingly difficult to reproduce. These variations need to be detected in the phonemic and letter-based verbal difference engine.
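For the first difficulty, one common starting point is to store a mouth pose per shape and interpolate between poses instead of hand-authoring every pair. The sketch below assumes a made-up two-value pose (jaw opening and lip rounding) and plain linear blending; the pose values and names are hypothetical, and a convincing result would need something smarter than a straight line between two poses.

```python
# Hypothetical mouth poses as (jaw_open, lip_round), each in the range 0..1.
MOUTH_POSE = {
    "rest": (0.0, 0.0),
    "open": (0.8, 0.1),
    "round": (0.3, 0.9),
}


def blend(pose_a, pose_b, t):
    """Linear interpolation: t=0 gives pose_a, t=1 gives pose_b."""
    return tuple(a + (b - a) * t for a, b in zip(pose_a, pose_b))


def transition_frames(from_shape, to_shape, steps):
    """Return steps + 1 poses moving evenly from one mouth shape to the other."""
    start, end = MOUTH_POSE[from_shape], MOUTH_POSE[to_shape]
    return [blend(start, end, i / steps) for i in range(steps + 1)]
```

Calling transition_frames("rest", "open", 4) gives five poses, starting at the rest pose and ending at the open pose, which the animation can play back one per frame.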
Recommended reading:
Scientific American
June 2005 Conversational computers page 40
Half-Life 2 also uses a rudimentary (though excellent) text-to-phoneme engine. The major difference is that it only saves labour on the character lip animation; the voices themselves are pre-recorded.
Have fun