
ModelScopeChatEndpoint

ModelScope (Home | GitHub) is built upon the notion of "Model-as-a-Service" (MaaS). It seeks to bring together the most advanced machine learning models from the AI community and to streamline the process of leveraging AI models in real-world applications. The core ModelScope library, open-sourced on GitHub, provides the interfaces and implementations that allow developers to perform model inference, training, and evaluation.

This guide will help you get started with the ModelScope chat endpoint.

Overview

Integration details

| Provider | Class | Package | Local | Serializable | Package downloads | Package latest |
| --- | --- | --- | --- | --- | --- | --- |
| ModelScope | ModelScopeChatEndpoint | langchain-modelscope-integration | | | PyPI - Downloads | PyPI - Version |

Setup

To access the ModelScope chat endpoint, you'll need to create a ModelScope account, generate an SDK token, and install the langchain-modelscope-integration package.

Credentials

Head to ModelScope to sign up and generate an SDK token. Once you've done this, set the MODELSCOPE_SDK_TOKEN environment variable:

import getpass
import os

if not os.getenv("MODELSCOPE_SDK_TOKEN"):
    os.environ["MODELSCOPE_SDK_TOKEN"] = getpass.getpass(
        "Enter your ModelScope SDK token: "
    )

Installation

The LangChain ModelScope integration lives in the langchain-modelscope-integration package:

%pip install -qU langchain-modelscope-integration

Instantiation

Now we can instantiate our model object and generate chat completions:

from langchain_modelscope import ModelScopeChatEndpoint

llm = ModelScopeChatEndpoint(
    model="Qwen/Qwen2.5-Coder-32B-Instruct",
    temperature=0,
    max_tokens=1024,
    timeout=60,
    max_retries=2,
    # other params...
)
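
Note that ModelScopeChatEndpoint picks up the MODELSCOPE_SDK_TOKEN environment variable set above, so the token does not need to be passed to the constructor explicitly.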

Invocation

messages = [
    (
        "system",
        "You are a helpful assistant that translates English to Chinese. Translate the user sentence.",
    ),
    ("human", "I love programming."),
]
ai_msg = llm.invoke(messages)
ai_msg
AIMessage(content='我喜欢编程。', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 3, 'prompt_tokens': 33, 'total_tokens': 36, 'completion_tokens_details': None, 'prompt_tokens_details': None}, 'model_name': 'qwen2.5-coder-32b-instruct', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-60bb3461-60ae-4c0b-8997-ab55ef77fcd6-0', usage_metadata={'input_tokens': 33, 'output_tokens': 3, 'total_tokens': 36, 'input_token_details': {}, 'output_token_details': {}})
print(ai_msg.content)
我喜欢编程。
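
ModelScopeChatEndpoint follows the standard LangChain chat model interface, so you can also stream the response token by token. A minimal sketch, assuming the endpoint supports streaming like other chat integrations:

for chunk in llm.stream(messages):
    # Each chunk is an AIMessageChunk carrying a fragment of the reply
    print(chunk.content, end="", flush=True)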

Chaining

We can chain our model with a prompt template like so:

from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate(
    [
        (
            "system",
            "You are a helpful assistant that translates {input_language} to {output_language}.",
        ),
        ("human", "{input}"),
    ]
)

chain = prompt | llm
chain.invoke(
    {
        "input_language": "English",
        "output_language": "Chinese",
        "input": "I love programming.",
    }
)
API Reference: ChatPromptTemplate
AIMessage(content='我喜欢编程。', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 3, 'prompt_tokens': 28, 'total_tokens': 31, 'completion_tokens_details': None, 'prompt_tokens_details': None}, 'model_name': 'qwen2.5-coder-32b-instruct', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-9f011a3a-9a11-4759-8d16-5b1843a78862-0', usage_metadata={'input_tokens': 28, 'output_tokens': 3, 'total_tokens': 31, 'input_token_details': {}, 'output_token_details': {}})
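
Since this is ordinary LangChain composition, you can extend the chain with an output parser to get a plain string instead of an AIMessage:

from langchain_core.output_parsers import StrOutputParser

# Piping a parser after the model extracts message.content as a str
string_chain = prompt | llm | StrOutputParser()
string_chain.invoke(
    {
        "input_language": "English",
        "output_language": "Chinese",
        "input": "I love programming.",
    }
)
# -> '我喜欢编程。'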

API reference

For detailed documentation of all ModelScopeChatEndpoint features and configurations, head to the reference: https://modelscope.cn/docs/model-service/API-Inference/intro

