Streaming tokens

Endpoints like Continue from conversation history or Continue from raw prompt return tokens in a streaming fashion.

They use POST requests that return chunked stream of serialized JSON object (should not be confused with Server-Sent Events).

Example usage

This is an example of how to use them through JavaScript's fetch API, but the general pattern applies to most of the languages and libraries:

const response = await fetch('http://127.0.0.1:8061/api/v1/continue_from_raw_prompt', {
    method: 'POST',
    headers: {
        'Content-Type': 'application/json',
    },
    body: JSON.stringify({
        max_tokens: 100,
        raw_prompt: "Tell me a story about"
    })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();

while (true) {
    const { done, value } = await reader.read();
    
    if (done) {
        break;
    }
    
    const chunk = decoder.decode(value, { stream: true });
    const lines = chunk.split('\n').filter(line => line.trim());
    
    for (const line of lines) {
        try {
            const message = JSON.parse(line);

            console.log('Received:', message.Response.response.GeneratedToken.Token);
        } catch (err) {
            console.error('Error:', err);
        }
    }
}