Vocalware Text To Speech REST API C#

I recently needed to integrate the Vocalware real-time text-to-speech REST API into one of my websites using C#, and I struggled to find any examples online of how to do this. The Vocalware documentation only covers calling the API client-side, but I wanted to call it server-side from C# and save the resulting MP3 file to disk.

Each request to the API has to include an MD5 hash, so I started from the standard example in Microsoft's documentation for the MD5 class. This is my implementation:

using System.Security.Cryptography;
using System.Text;
 
namespace Jungle.BL
{
    /// <summary>
    /// Hashing helper (MD5 is a one-way hash, not encryption)
    /// </summary>
    public class EncryptionHelper
    {
        /// <summary>
        /// Get Md5 Hash
        /// </summary>
        /// <param name="md5Hash">Md5Hash Obj</param>
        /// <param name="input">Input String</param>
        /// <returns>MD5 Hash</returns>
        public static string GetMd5Hash(MD5 md5Hash, string input)
        {
 
            // Convert the input string to a byte array and compute the hash.
            byte[] data = md5Hash.ComputeHash(Encoding.UTF8.GetBytes(input));
 
            // Create a new StringBuilder to collect the hex digits
            // and build the final string.
            StringBuilder sBuilder = new StringBuilder();
 
            // Loop through each byte of the hashed data
            // and format each one as a hexadecimal string.
            for (int i = 0; i < data.Length; i++)
            {
                sBuilder.Append(data[i].ToString("x2"));
            }
 
            // Return the hexadecimal string.
            return sBuilder.ToString();
        }
    }
}
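
A quick way to check a helper like this is to hash a string whose MD5 digest is widely published. The sketch below repeats the hashing logic so that it stands alone; the class name Md5Check is made up for this example and is not part of my project.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical sanity check; Md5Check is an illustrative name only.
public static class Md5Check
{
    // Same logic as EncryptionHelper.GetMd5Hash, repeated so the snippet is self-contained.
    public static string GetMd5Hash(MD5 md5Hash, string input)
    {
        byte[] data = md5Hash.ComputeHash(Encoding.UTF8.GetBytes(input));
        StringBuilder sb = new StringBuilder(data.Length * 2);
        foreach (byte b in data)
        {
            sb.Append(b.ToString("x2")); // two lower-case hex digits per byte
        }
        return sb.ToString();
    }

    // Returns true when the helper reproduces the well-known MD5 digest of "hello".
    public static bool MatchesKnownDigest()
    {
        using (MD5 md5 = MD5.Create())
        {
            return GetMd5Hash(md5, "hello") == "5d41402abc4b2a76b9719d911017c592";
        }
    }
}
```

If MatchesKnownDigest() returns false, the most likely culprits are the text encoding or the hex format string.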

The next step was to create a class that converts my text to speech and saves it as an MP3 file. For the filename I chose an MD5 hash of the text combined with the Engine, Language and Voice IDs, so that I can recognise the file later if I need to. In the C# example below, these IDs are passed in within a 'Guide' class, but you could just as easily pass them in as separate ints:

using Jungle.Entities;
using System;
using System.Net;
using System.Security.Cryptography;
using System.Web;
 
namespace Jungle.BL
{
    /// <summary>
    /// Speech Helper
    /// </summary>
    public class SpeechHelper
    {
        /// <summary>
        /// Account ID
        /// </summary>
        private const string AccountID = "YourAccountID";
 
        /// <summary>
        /// API ID
        /// </summary>
        private const string APIID = "YourAPIID";
 
        /// <summary>
        /// File Type Reqd
        /// </summary>
        private const string FileTypeReqd = "mp3";
 
        /// <summary>
        /// Secret Phrase
        /// </summary>
        private const string SecretPhrase = "YourSecretPhrase";
 
        /// <summary>
        /// Speech MP3 Save Location
        /// </summary>
        private string _speechSaveLocation = string.Format("{0}uploads\\speech\\", HttpContext.Current.Request.PhysicalApplicationPath);
 
        /// <summary>
        /// Get Speech MP3
        /// </summary>
        /// <param name="textToSpeak">Text To Speak</param>
        /// <param name="guide">Guide Details (Voice)</param>
        public void GetSpeechMP3(string textToSpeak, Guide guide)
        {
            textToSpeak = FormatTextToSpeak(textToSpeak);
 
            using (MD5 md5 = MD5.Create())
            {
                string dataToHash = string.Format("{0}{1}{2}{3}{4}{5}{6}{7}",
                    guide.EID, // Engine ID
                    guide.LID, // Language ID
                    guide.VID, // Voice ID
                    textToSpeak,
                    FileTypeReqd,
                    AccountID,
                    APIID,
                    SecretPhrase);
 
                // Create an MD5 hash from combined data
                string dataMD5Hash = EncryptionHelper.GetMd5Hash(md5, dataToHash);
               
                Uri url = new Uri(string.Format("{0}EID={1}&LID={2}&VID={3}&TXT={4}&EXT={5}&ACC={6}&API={7}&CS={8}",
                    "http://www.vocalware.com/tts/gen.php?",
                    guide.EID,
                    guide.LID,
                    guide.VID,
                    System.Web.HttpUtility.UrlEncode(textToSpeak),
                    FileTypeReqd,
                    AccountID,
                    APIID,
                    dataMD5Hash));
 
                // Create an MD5 hash from text to use as a filename
                string textToSpeakMD5Hash = EncryptionHelper.GetMd5Hash(md5, textToSpeak);
                string textToSpeakMP3FilePath = string.Format(@"{0}{1}_{2}_{3}_{4}.mp3",_speechSaveLocation, textToSpeakMD5Hash, guide.EID, guide.LID, guide.VID);
 
                using (var webClient = new WebClient())
                {
                    webClient.DownloadFile(url, textToSpeakMP3FilePath);
                }
            }
        }
 
        /// <summary>
        /// Format Text To Speak
        /// </summary>
        /// <param name="textToSpeak">Text to Speak</param>
        /// <returns>Formatted Text To Speak</returns>
        private string FormatTextToSpeak(string textToSpeak)
        {
            return textToSpeak.Trim().Replace("!", "");
        }
    }
}
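
If you want to inspect the request before anything is sent, the URL-building step can be pulled out into a function of its own. This is only a rough sketch: VocalwareRequest and BuildUrl are names I have invented here, the IDs are assumed to be ints, and Uri.EscapeDataString stands in for HttpUtility.UrlEncode so the snippet does not need System.Web (it encodes a space as %20 rather than +, which I am assuming the server treats the same way).

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper; not part of the original SpeechHelper class.
public static class VocalwareRequest
{
    // Endpoint as used in SpeechHelper above.
    private const string Endpoint = "http://www.vocalware.com/tts/gen.php";

    // Lower-case hex MD5 of a UTF-8 string (same output as EncryptionHelper.GetMd5Hash).
    private static string Md5Hex(string input)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(input));
            StringBuilder sb = new StringBuilder(hash.Length * 2);
            foreach (byte b in hash) sb.Append(b.ToString("x2"));
            return sb.ToString();
        }
    }

    // Builds the request URL without sending anything.
    public static Uri BuildUrl(int eid, int lid, int vid, string text,
        string ext, string accountId, string apiId, string secret)
    {
        // The checksum is taken over the raw (not URL-encoded) text,
        // with the fields in the same order as in SpeechHelper.
        string checksum = Md5Hex(string.Format("{0}{1}{2}{3}{4}{5}{6}{7}",
            eid, lid, vid, text, ext, accountId, apiId, secret));

        // Only the text is escaped; the other values are assumed to be URL-safe.
        return new Uri(string.Format(
            "{0}?EID={1}&LID={2}&VID={3}&TXT={4}&EXT={5}&ACC={6}&API={7}&CS={8}",
            Endpoint, eid, lid, vid, Uri.EscapeDataString(text),
            ext, accountId, apiId, checksum));
    }
}
```

Keeping this step free of network and file access makes it easy to log the URL or compare the checksum against one worked out by hand.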

So EncryptionHelper and SpeechHelper are the two key C# classes you need to call Vocalware's REST API. All that remains is to call the method to create the MP3, like so:

new SpeechHelper().GetSpeechMP3("Hi, my name is big steve", new Guide() { VID = 4, LID = 1, EID = 3 });
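
Because the filename is derived from the text and the voice IDs, the same phrase always maps to the same file, so you could skip the API call when that file is already on disk. The sketch below is one possible way to do that and is not something my SpeechHelper class does; SpeechCache and DownloadIfMissing are hypothetical names, and the download itself is passed in as a delegate so the check can be tried without touching the network.

```csharp
using System;
using System.IO;

// Hypothetical caching wrapper; not part of the original code.
public static class SpeechCache
{
    // Runs the download only when filePath does not exist yet.
    // Returns true if the download delegate was invoked, false if the file was already there.
    public static bool DownloadIfMissing(Uri url, string filePath, Action<Uri, string> download)
    {
        if (File.Exists(filePath))
        {
            return false;
        }

        // Make sure the target folder exists before anything is written to it.
        string folder = Path.GetDirectoryName(filePath);
        if (!string.IsNullOrEmpty(folder))
        {
            Directory.CreateDirectory(folder);
        }

        download(url, filePath);
        return true;
    }
}

// Possible use with WebClient (requires System.Net), as in GetSpeechMP3:
//   SpeechCache.DownloadIfMissing(url, textToSpeakMP3FilePath,
//       (u, p) => { using (var wc = new System.Net.WebClient()) { wc.DownloadFile(u, p); } });
```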

And that is all there is to it! I hope my code helps if you are trying to call the Vocalware REST API server-side using C#.


SG Digital
