Module spin_sdk.wit.imports.llm
A WASI interface dedicated to performing inferencing for Large Language Models.
Global variables
var Error
The set of errors which may be raised by functions in this interface
Functions
def generate_embeddings(model: str, text: List[str]) ‑> EmbeddingsResult
def generate_embeddings(model: str, text: List[str]) -> EmbeddingsResult:
    """
    Generate embeddings for the supplied list of text

    Raises: `spin_sdk.wit.types.Err(spin_sdk.wit.imports.llm.Error)`
    """
    raise NotImplementedError
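A minimal usage sketch from inside a Spin component. The model name "all-minilm-l6-v2" is an assumption; use whichever embedding model your application's runtime is configured to serve:

from spin_sdk.wit.imports import llm
from spin_sdk.wit.types import Err

try:
    # One embedding vector is produced per input string.
    result = llm.generate_embeddings(
        "all-minilm-l6-v2",  # assumed model name; configure per your runtime
        ["The first sentence", "The second sentence"],
    )
    print(len(result.embeddings))           # 2 vectors
    print(result.usage.prompt_token_count)  # tokens consumed by the request
except Err as e:
    # assumes Err carries the llm.Error variant on its `value` attribute
    print(f"embeddings request failed: {e.value}")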
def infer(model: str, prompt: str, params: InferencingParams | None) ‑> InferencingResult
def infer(model: str, prompt: str, params: Optional[InferencingParams]) -> InferencingResult:
    """
    Perform inferencing using the provided model and prompt with the given optional params

    Raises: `spin_sdk.wit.types.Err(spin_sdk.wit.imports.llm.Error)`
    """
    raise NotImplementedError
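A hedged sketch of a call; "llama2-chat" is an assumed model name, and passing None for params selects the host's default inferencing parameters:

from spin_sdk.wit.imports import llm
from spin_sdk.wit.types import Err

try:
    # None lets the host apply its default inferencing parameters.
    result = llm.infer("llama2-chat", "Write a haiku about WebAssembly.", None)
    print(result.text)
    print(result.usage.generated_token_count)
except Err as e:
    # assumes Err carries the llm.Error variant on its `value` attribute
    print(f"inferencing failed: {e.value}")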
Classes
class EmbeddingsResult (embeddings: List[List[float]], usage: EmbeddingsUsage)
@dataclass
class EmbeddingsResult:
    """
    Result of generating embeddings
    """
    embeddings: List[List[float]]
    usage: EmbeddingsUsage
Result of generating embeddings
Instance variables
var embeddings : List[List[float]]
var usage : EmbeddingsUsage
class EmbeddingsUsage (prompt_token_count: int)
@dataclass
class EmbeddingsUsage:
    """
    Usage related to an embeddings generation request
    """
    prompt_token_count: int
Usage related to an embeddings generation request
Instance variables
var prompt_token_count : int
class Error_InvalidInput (value: str)
@dataclass
class Error_InvalidInput:
    value: str
Raised when the input to a request is invalid; value carries the detail message.
Instance variables
var value : str
class Error_ModelNotSupported
@dataclass
class Error_ModelNotSupported:
    pass
Raised when the requested model is not supported by the host.
class Error_RuntimeError (value: str)
@dataclass
class Error_RuntimeError:
    value: str
Raised when the host encounters an error while serving the request; value carries the detail message.
Instance variables
var value : str
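The three variants above make up the Error union that infer and generate_embeddings raise wrapped in spin_sdk.wit.types.Err. A sketch of distinguishing them, assuming Err exposes the variant on its value attribute (requires Python 3.10+ for structural pattern matching):

from spin_sdk.wit.imports import llm
from spin_sdk.wit.types import Err

def describe_error(e: Err) -> str:
    # Match on the wrapped variant to produce a human-readable message.
    match e.value:
        case llm.Error_ModelNotSupported():
            return "the requested model is not supported"
        case llm.Error_InvalidInput(value=detail):
            return f"invalid input: {detail}"
        case llm.Error_RuntimeError(value=detail):
            return f"runtime error: {detail}"
        case _:
            return "unknown error"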
class InferencingParams (max_tokens: int,
                         repeat_penalty: float,
                         repeat_penalty_last_n_token_count: int,
                         temperature: float,
                         top_k: int,
                         top_p: float)
@dataclass
class InferencingParams:
    """
    Inference request parameters
    """
    max_tokens: int
    repeat_penalty: float
    repeat_penalty_last_n_token_count: int
    temperature: float
    top_k: int
    top_p: float
Inference request parameters
Instance variables
var max_tokens : int
var repeat_penalty : float
var repeat_penalty_last_n_token_count : int
var temperature : float
var top_k : int
var top_p : float
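A sketch of constructing explicit parameters and passing them to infer; the numeric values and the "llama2-chat" model name are illustrative assumptions, not recommended defaults:

from spin_sdk.wit.imports import llm

params = llm.InferencingParams(
    max_tokens=128,                        # cap on the number of generated tokens
    repeat_penalty=1.1,                    # >1.0 discourages repetition
    repeat_penalty_last_n_token_count=64,  # window the penalty looks back over
    temperature=0.8,                       # higher values sample more randomly
    top_k=40,                              # sample from the 40 most likely tokens
    top_p=0.9,                             # nucleus sampling cutoff
)
result = llm.infer("llama2-chat", "Summarize WASI in one sentence.", params)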
class InferencingResult (text: str, usage: InferencingUsage)
@dataclass
class InferencingResult:
    """
    An inferencing result
    """
    text: str
    usage: InferencingUsage
An inferencing result
Instance variables
var text : str
var usage : InferencingUsage
class InferencingUsage (prompt_token_count: int, generated_token_count: int)
@dataclass
class InferencingUsage:
    """
    Usage information related to the inferencing result
    """
    prompt_token_count: int
    generated_token_count: int
Usage information related to the inferencing result
Instance variables
var generated_token_count : int
var prompt_token_count : int