Add missing pipelines to API #552
Comments
Thank you for the issue. The plan moving forward was to run pipelines through workflows rather than directly when using the API.
Upon further review, only a few pipelines aren't in the API, and it makes sense to have routers for them. I've been pushing things more toward workflows, but it doesn't hurt to have pipelines, especially in the case of an LLM pipeline.
Another thing I've faced: in my setup, txtai is hosted in a separate remote environment with a powerful GPU, and my custom software needs to use it remotely through the API. Some pipelines, like Textraction and Transcription, need a file name as an argument. Textraction from remote sources works well, but Transcription doesn't. Could it be fixed?
The pipelines are focused on a single task by design. That's where workflows come in. There are workflow steps for reading from URLs and cloud object storage.
Hi David,

Thank you for pointing that out. The retrieve task helped me, and transcription works well.

docker-compose file

version: '3.4'

txtai-api.Dockerfile

# Set base image
ARG BASE_IMAGE=neuml/txtai-gpu:latest

# Start server and listen on all interfaces
ENTRYPOINT ["uvicorn", "--host", "0.0.0.0", "txtai.api:app"]

app.yml

# Index file path
path: /tmp/index

# Allow indexing of documents
writable: True

# Embeddings index
embeddings:

# Extractive QA
extractor:

# Zero-shot labeling
labels:

# Similarity
similarity:

# Text segmentation
segmentation:

# Text summarization
summary:

# Text extraction
textractor:

# Transcribe audio to text
transcription:

#Text To Speech

# Translate text between languages
translation:

# Workflow definitions
workflow:

Here is my call in C# (sorry, not Python, but it shows the context):

public async Task<TextToSpeechResponse> Handle(TextToSpeechCommand request, CancellationToken cancellationToken)
{
var wf = new Workflow(_settings.BaseUrl);
var elements = new List<string>()
{
{ request.Text }
};
var data = await wf.WorkflowActionAsync("tts", elements);
var result = new TextToSpeechResponse
{
Binary = (byte[])data.FirstOrDefault()
};
return result;
}
}

Logs from the container

root@debian-AI:/opt/docker/txtai# docker compose up

Could you help me figure out the problem, please? I feel that something is missing. Thank you in advance,
I get the same result with curl:

curl -X POST "http://localhost:8000/workflow" -H "Content-Type: application/json" -d '{"name":"tts", "elements":["Say something here"]}'

I figured out that the problem occurs when the server fills in the response for the tts workflow.
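For anyone reproducing this from Python rather than curl or C#, here is a minimal sketch of the same POST to the `/workflow` endpoint using only the standard library. The helper names `build_workflow_request` and `run_workflow` and the `timeout` default are hypothetical, chosen for illustration; only the URL path and the `name`/`elements` payload come from the curl call in this thread.

```python
import json
import urllib.request


def build_workflow_request(base_url, name, elements):
    """Build (but do not send) a POST request for the /workflow endpoint."""
    # Same JSON body as the curl call: workflow name plus input elements
    payload = json.dumps({"name": name, "elements": elements}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/workflow",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_workflow(base_url, name, elements, timeout=60):
    """Send the request and decode the JSON response body."""
    request = build_workflow_request(base_url, name, elements)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `run_workflow("http://localhost:8000", "tts", ["Say something here"])` should reproduce the behaviour described here against a running server, since the request is the same one curl sends.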
I'll have to look at this more closely, but it seems like it might be an issue with returning binary data as JSON.
Yes, I have the same suspicion.
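As a rough illustration of that suspicion, the sketch below checks what Python's standard `json` module will accept. The helper `is_json_serializable` is hypothetical and is not part of txtai; it only shows that raw bytes are rejected while plain lists of floats are fine.

```python
import json


def is_json_serializable(value):
    """Return True if json.dumps accepts the value, False otherwise."""
    try:
        json.dumps(value)
        return True
    except TypeError:
        # json.dumps raises TypeError for unsupported types such as bytes
        return False
```

With this helper, `is_json_serializable(b"\x00\x01")` is false and `is_json_serializable([0.0, 0.25])` is true, which is why a response carrying non-JSON types can fail on the server side.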
Well, instead of binary, I should say NumPy arrays, which are what is returned. You can add your own custom pipeline that converts the waveforms to Python floats, which are JSON serializable.

class Converter:
    def __call__(self, inputs):
        return [x.tolist() for x in inputs]

Or perhaps something that writes the audio to a WAV file and then base64 encodes that data, like what's in this notebook: https://github.com/neuml/txtai/blob/master/examples/40_Text_to_Speech_Generation.ipynb

Ultimately, I think options to write to WAV and base64 encode the result would be good additions to the TTS pipeline.
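The WAV plus base64 idea could look something like the sketch below, which uses only the standard library and takes the plain float lists that a converter like the one above would produce. The function name `floats_to_wav_base64`, the 22050 Hz default sample rate, and the assumption that samples are mono floats in [-1.0, 1.0] are all illustrative guesses, not the linked notebook's code or the TTS pipeline's actual behaviour.

```python
import base64
import io
import struct
import wave


def floats_to_wav_base64(samples, sample_rate=22050):
    """Encode mono float samples in [-1.0, 1.0] as a base64 WAV string."""
    # Clip each sample and scale it to a signed 16-bit integer
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]

    # Write a 16-bit mono WAV file into an in-memory buffer
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack("<%dh" % len(pcm), *pcm))

    # base64 text is safe to return inside a JSON response
    return base64.b64encode(buffer.getvalue()).decode("ascii")
```

A client would then base64 decode the string and treat the bytes as a regular WAV file, which fits the `byte[]` result the C# caller in this thread expects.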
This could be the best solution, IMHO. Also, it could be a Task, I guess.
Hi guys,
First of all, thank you for the amazing job you do.
I couldn't find an API endpoint for Text-To-Speech. I think a workflow can be used for this, but are there any plans to implement it in the API?
Kind regards,
/Andriy