Spring AI: SpeechClient

Introduction to SpeechClient

The SpeechClient in Spring AI is an abstraction for sending text to an AI model and receiving synthesized speech audio in return. This tutorial guides you through setting up a Spring Boot application and shows how to use SpeechClient to generate speech with OpenAI and serve it over HTTP.

1. Setting Up the Project

Step 1: Create a New Spring Boot Project

You can create a new Spring Boot project using Spring Initializr or your preferred IDE. Ensure you include the necessary dependencies for Spring Web and Spring AI.

Using Spring Initializr:

  • Go to start.spring.io
  • Select:
    • Project: Maven Project
    • Language: Java
    • Spring Boot: 3.0.0 (or latest)
    • Dependencies: Spring Web, Spring AI
  • Generate the project and unzip it.

Step 2: Add spring-ai-openai-spring-boot-starter Dependency

In your project's pom.xml, add the following dependency:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
    <version>1.0.0</version>
</dependency>

2. Configuring the Spring Boot Starter

Step 1: Add API Key to Configuration

Open the application.properties file in your src/main/resources directory (or create an application.yml file there) and add your OpenAI API key.

For application.properties:

spring.ai.openai.api-key=your_openai_api_key

For application.yml:

spring:
  ai:
    openai:
      api-key: your_openai_api_key
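
A key written directly into a configuration file is easy to commit by accident, so a common alternative is to keep it in an environment variable and fail fast when it is missing. The following is a minimal plain-Java sketch of such a check. The variable name OPENAI_API_KEY and the class ApiKeyCheck are assumptions made for illustration and are not part of Spring AI.

```java
import java.util.Map;

public class ApiKeyCheck {

    // Hypothetical environment variable name; use whatever name your deployment defines.
    static final String ENV_NAME = "OPENAI_API_KEY";

    // Returns the trimmed key, or throws if it is missing, blank,
    // or still set to the placeholder value used in this tutorial.
    static String requireApiKey(Map<String, String> env) {
        String raw = env.get(ENV_NAME);
        String key = (raw == null) ? "" : raw.trim();
        if (key.isEmpty() || key.equals("your_openai_api_key")) {
            throw new IllegalStateException(ENV_NAME + " is not set to a real key");
        }
        return key;
    }

    public static void main(String[] args) {
        try {
            requireApiKey(System.getenv());
            System.out.println("API key found");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Passing the environment in as a Map keeps the check easy to exercise in a unit test without touching the real process environment.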

Step 2: Create a Configuration Class

Create a new configuration class to set up the OpenAI client and the SpeechClient abstraction.

package com.example.demo.config;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.ai.openai.OpenAiClient;
import org.springframework.ai.openai.SpeechClient;
import org.springframework.ai.openai.OpenAiSpeechClient;

@Configuration
public class OpenAiConfig {

    @Bean
    public OpenAiClient openAiClient() {
        return new OpenAiClient();
    }

    @Bean
    public SpeechClient speechClient(OpenAiClient openAiClient) {
        return new OpenAiSpeechClient(openAiClient);
    }
}

3. Implementing the SpeechClient

Step 1: Create a Service for Speech Operations

Create a service class that will handle interactions with the SpeechClient abstraction.

package com.example.demo.service;

import org.springframework.ai.openai.SpeechClient;
import org.springframework.ai.openai.model.SpeechRequest;
import org.springframework.ai.openai.model.SpeechResponse;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class SpeechService {

    @Autowired
    private SpeechClient speechClient;

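    public byte[] generateSpeech(String text) {
        // Wrap the input text in a request object for the speech model.
        SpeechRequest request = new SpeechRequest();
        request.setText(text);

        // Send the request and return the raw audio bytes from the response.
        SpeechResponse response = speechClient.generateSpeech(request);
        return response.getAudioData();
    }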
}
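
The service above forwards whatever text it receives. A small guard in front of it can reject empty or oversized input before a paid API call is made. The sketch below is one way to do that in plain Java. The class name SpeechInputValidator is hypothetical, and the 4096-character limit is an assumption based on the input limit OpenAI documents for its speech endpoint, so check the limit for the model you use.

```java
public class SpeechInputValidator {

    // Assumed maximum input length; verify against the provider's current documentation.
    static final int MAX_CHARS = 4096;

    // Returns the trimmed text, or throws if it is empty or too long.
    static String validate(String text) {
        if (text == null || text.isBlank()) {
            throw new IllegalArgumentException("Text must not be empty");
        }
        String trimmed = text.trim();
        if (trimmed.length() > MAX_CHARS) {
            throw new IllegalArgumentException("Text exceeds " + MAX_CHARS + " characters");
        }
        return trimmed;
    }

    public static void main(String[] args) {
        System.out.println(validate("  Hello, world!  "));
    }
}
```

In the service, the call would sit at the top of generateSpeech, so that only validated text reaches the request object.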

Step 2: Create a Controller for the Service

Create a controller to expose an endpoint for generating speech.

package com.example.demo.controller;

import com.example.demo.service.SpeechService;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

import jakarta.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.io.OutputStream;

@RestController
public class SpeechController {

    @Autowired
    private SpeechService speechService;

    @GetMapping("/generateSpeech")
    public void generateSpeech(@RequestParam String text, HttpServletResponse response) throws IOException {
        byte[] audioData = speechService.generateSpeech(text);

        response.setContentType("audio/mpeg");
        response.setContentLength(audioData.length);

        // try-with-resources closes the stream even if the write fails.
        try (OutputStream os = response.getOutputStream()) {
            os.write(audioData);
        }
    }
}
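
The controller hard-codes audio/mpeg, which fits MP3 output. If the audio format ever becomes configurable, the content type has to follow it. The following is a minimal sketch of such a lookup. The class name AudioContentTypes and the set of format names are assumptions for illustration and do not come from Spring AI.

```java
import java.util.Locale;
import java.util.Map;

public class AudioContentTypes {

    // Illustrative mapping from a format name to its MIME type.
    private static final Map<String, String> TYPES = Map.of(
            "mp3", "audio/mpeg",
            "wav", "audio/wav",
            "flac", "audio/flac",
            "aac", "audio/aac");

    // Returns the MIME type for a format name (case-insensitive),
    // falling back to a generic binary type for unknown formats.
    static String forFormat(String format) {
        if (format == null) {
            return "application/octet-stream";
        }
        String key = format.trim().toLowerCase(Locale.ROOT);
        return TYPES.getOrDefault(key, "application/octet-stream");
    }

    public static void main(String[] args) {
        System.out.println(forFormat("mp3"));
        System.out.println(forFormat("unknown"));
    }
}
```

The controller could then call response.setContentType(AudioContentTypes.forFormat(format)) instead of using a fixed string.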

4. Creating a Simple Frontend

For demonstration purposes, we will create a simple HTML page that sends text to the /generateSpeech endpoint and plays the audio it returns.

Step 1: Create an HTML File

Create an index.html file in the src/main/resources/static directory.

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>AI Speech Generator</title>
</head>
<body>
    <h1>AI Speech Generator</h1>
    <div>
        <textarea id="text" rows="4" cols="50" placeholder="Type your text here..."></textarea><br>
        <button onclick="generateSpeech()">Generate</button>
    </div>
    <div id="audioResult"></div>

    <script>
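        function generateSpeech() {
            const text = document.getElementById('text').value;
            fetch(`/generateSpeech?text=${encodeURIComponent(text)}`)
                .then(response => {
                    // Surface HTTP errors instead of trying to play an error body as audio.
                    if (!response.ok) {
                        throw new Error(`Request failed with status ${response.status}`);
                    }
                    return response.blob();
                })
                .then(data => {
                    // Wrap the returned audio bytes in a playable <audio> element.
                    const audio = document.createElement('audio');
                    audio.src = URL.createObjectURL(data);
                    audio.controls = true;
                    document.getElementById('audioResult').appendChild(audio);
                })
                .catch(error => console.error(error));
        }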
    </script>
</body>
</html>

5. Testing the Integration

Step 1: Run the Application

Run your Spring Boot application. Ensure the application starts without errors.

Step 2: Access the Speech Generator

Open your browser and navigate to http://localhost:8080. You should see the simple speech generator interface. Type some text and click "Generate" to hear the AI-generated speech.
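
The endpoint can also be exercised without a browser. The sketch below uses the JDK's built-in java.net.http.HttpClient (Java 11 or later) to request speech and save it to a file. It assumes the application is running on http://localhost:8080 as described above. The class name SpeechEndpointClient and the output file speech.mp3 are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SpeechEndpointClient {

    // Builds the URI for the /generateSpeech endpoint, URL-encoding the text parameter.
    static URI buildUri(String baseUrl, String text) {
        String encoded = URLEncoder.encode(text, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/generateSpeech?text=" + encoded);
    }

    // Calls the endpoint and writes the response body to the target file.
    static Path download(String baseUrl, String text, Path target)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(buildUri(baseUrl, text)).GET().build();
        HttpResponse<Path> response = client.send(request,
                HttpResponse.BodyHandlers.ofFile(target,
                        StandardOpenOption.CREATE,
                        StandardOpenOption.WRITE,
                        StandardOpenOption.TRUNCATE_EXISTING));
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected status: " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        String baseUrl = "http://localhost:8080";
        // Without arguments, only show the URI that would be requested.
        System.out.println(buildUri(baseUrl, "Hello from Spring AI"));
        // With an argument, call the running application and save the audio.
        if (args.length > 0) {
            Path saved = download(baseUrl, args[0], Path.of("speech.mp3"));
            System.out.println("Saved audio to " + saved.toAbsolutePath());
        }
    }
}
```

Running the class with a sentence as its first argument should leave a playable speech.mp3 in the working directory, provided the application is up and the API key is valid.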

Conclusion

In this tutorial, you learned how to set up and use the SpeechClient feature in a Spring Boot application with Spring AI. You created a service to handle speech generation, a controller to expose an endpoint, and a simple frontend for user interaction. This setup provides a foundation for building more complex and feature-rich AI speech applications. 

From here, you can extend the application with input validation, error handling, and other customizations to build a more robust speech client.

