Interconnection with Third-party Communities
openMind Hub Client can connect to multiple communities so that you can upload and download files in different communities. Currently, the following communities are interconnected:
- Modelers: This community is connected by default. All open interfaces are supported.
- OpenI: Only some interfaces are supported, covering repository creation and model upload and download.
- GitCode: Only some interfaces are supported, covering file upload and download, and file acquisition in a repository.
- Gitee AI: Only download interfaces are supported.
- Hugging Face: Only some interfaces are supported, covering file download and upload, and repository and branch creation.
Installation
pip install openmind_hub[openi]
For details, see Installation Guide.
OpenI Community
Adapted Interfaces and Differences
The biggest difference between this community and the default community is the concept of a "project" in the OpenI community. A project is identified as user/project and contains models and datasets. Keep the following differences in mind when setting the repo_id parameter. When you upload or download a dataset, you must also specify repo_type.
Model: repo_id="user/project" indicates the model repository named "project" in a project, and repo_id="user/project/model" indicates the model repository named "model" in a project.
Dataset: A dataset is a compressed package in a project. repo_id="user/project", filename="test.zip" indicates the dataset with the specified name in a project.
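The two model repo_id forms described above can be illustrated with a small parser. This is only a sketch of the naming rule as stated in this section. The helper parse_openi_model_repo_id and the OpenIModelRef type are hypothetical names introduced for illustration and are not part of openmind_hub.

```python
from typing import NamedTuple


class OpenIModelRef(NamedTuple):
    """Hypothetical container for the parts of an OpenI model repo_id."""
    user: str
    project: str
    model: str


def parse_openi_model_repo_id(repo_id: str) -> OpenIModelRef:
    """Split a model repo_id following the rule described in this section.

    "user/project"       -> the model named "project" in the project
    "user/project/model" -> the model named "model" in the project
    """
    parts = repo_id.split("/")
    if not all(parts):
        raise ValueError(f"repo_id contains an empty segment: {repo_id!r}")
    if len(parts) == 2:
        user, project = parts
        # Two segments: the model shares its name with the project.
        return OpenIModelRef(user, project, project)
    if len(parts) == 3:
        return OpenIModelRef(*parts)
    raise ValueError(f"expected 'user/project' or 'user/project/model', got {repo_id!r}")


print(parse_openi_model_repo_id("user/project"))
print(parse_openi_model_repo_id("user/project/model"))
```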
http_get
There is no obvious difference in common parameters.
def http_get(
url: str,
temp_file: BinaryIO,
resume_size: float = 0,
headers: Optional[Dict[str, str]] = None,
displayed_filename: Optional[str] = None,
**kwargs,
) -> None
om_hub_url
When obtaining the link for downloading private repository files, you need to add the token parameter.
def om_hub_url(
repo_id: str,
filename: str,
token: Optional[str] = None,
**kwargs,
) -> str
create_repo
This interface is used to create a project and create a model under the project. There is no obvious difference in common parameters.
def create_repo(
repo_id: str,
token: Optional[str] = None,
private: bool = False,
exist_ok: bool = False,
desc: Optional[str] = None,
license: Optional[str] = None,
**kwargs,
) -> str
create_commit
There is no obvious difference in common parameters.
def create_commit(
repo_id: str,
operations: List[CommitOperationAdd],
token: Optional[str] = None,
**kwargs,
) -> str
create_branch
This is an empty method. There is no "branch" concept in the OpenI community.
upload_folder
When a directory is uploaded to a model, there is no obvious difference. When a directory is uploaded to a dataset, specify repo_type="dataset". The directory uploaded to a dataset is automatically compressed into a package.
def upload_folder(
repo_id: str,
folder_path: Union[str, Path],
token: Optional[str] = None,
repo_type: Literal["dataset", "model"] = "model",
**kwargs,
) -> Optional[str]
om_hub_download
If local_dir is not specified, the file is downloaded to the current directory by default. If the file is downloaded repeatedly, a file with a number suffix is created. To download a dataset, specify repo_type="dataset" and cluster="gpu" (default) or cluster="npu".
def om_hub_download(
repo_id: str,
filename: str,
token: Optional[str] = None,
local_dir: Optional[Union[str, Path]] = None,
force_download: Optional[bool] = False,
repo_type: Literal["dataset", "model"] = "model",
**kwargs,
) -> Optional[Path]
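As a sketch of the dataset case described above, the following helper collects the keyword arguments for a dataset download and rejects unknown cluster values before any request is made. The helper name openi_dataset_download_kwargs is hypothetical. Only repo_type and cluster and their documented values come from this section.

```python
from typing import Any, Dict

# Cluster values named in this section; "gpu" is the documented default.
VALID_CLUSTERS = ("gpu", "npu")


def openi_dataset_download_kwargs(
    repo_id: str, filename: str, cluster: str = "gpu"
) -> Dict[str, Any]:
    """Build keyword arguments for downloading an OpenI dataset package."""
    if cluster not in VALID_CLUSTERS:
        raise ValueError(f"cluster must be one of {VALID_CLUSTERS}, got {cluster!r}")
    return {
        "repo_id": repo_id,        # "user/project"
        "filename": filename,      # name of the compressed package, e.g. "test.zip"
        "repo_type": "dataset",    # required for datasets
        "cluster": cluster,
    }


kwargs = openi_dataset_download_kwargs("user/project", "test.zip", cluster="npu")
print(kwargs)

# Possible usage once openmind_hub is installed (not executed here):
# from openmind_hub import om_hub_download
# om_hub_download(platform="openi", local_dir=".", **kwargs)
```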
snapshot_download
When all files of a model in a project are downloaded, the model name directory is automatically added to the local path. A compressed package in a dataset represents a complete dataset. Therefore, this method cannot be used to download a dataset.
def snapshot_download(
repo_id: str,
token: Optional[str] = None,
local_dir: Optional[Union[str, Path]] = None,
force_download: Optional[bool] = False,
**kwargs,
) -> Optional[Path]
build_om_headers
There is no obvious difference in common parameters.
def build_om_headers(token: str, headers: Optional[dict] = None, **kwargs) -> dict
CommitOperationAdd
path_in_repo has no specific meaning and does not take effect when it is passed.
@dataclass
class CommitOperationAdd:
path_in_repo: str
path_or_fileobj: Union[str, Path, bytes, BinaryIO]
om_raise_for_status
There is no obvious difference in common parameters, but the way exceptions are thrown differs from the native method.
def om_raise_for_status(response: requests.Response, **kwargs) -> None
try_to_load_from_cache
If cache_dir is not specified, the downloaded file is searched for in the current directory by default.
def try_to_load_from_cache(
filename: str,
cache_dir: Union[str, Path, None] = None,
**kwargs,
) -> Union[str, None]
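A common way to combine a cache lookup with a download is to try the cache first and download only on a miss. The sketch below shows that pattern with injected callables so that it runs without network access. The load_or_download helper and the stand-in callables are hypothetical, and treating a None return value as "not found" is an assumption based on the return type shown above.

```python
from typing import Callable, Dict, Optional


def load_or_download(
    filename: str,
    lookup: Callable[[str], Optional[str]],
    download: Callable[[str], str],
) -> str:
    """Return a local path, preferring an existing copy over a new download."""
    cached = lookup(filename)
    if cached is not None:
        return cached
    return download(filename)


# Stand-ins for the real functions, used only for this demonstration.
fake_cache: Dict[str, str] = {"config.json": "./config.json"}
downloaded = []


def fake_lookup(name: str) -> Optional[str]:
    return fake_cache.get(name)


def fake_download(name: str) -> str:
    downloaded.append(name)
    return f"./{name}"


print(load_or_download("config.json", fake_lookup, fake_download))  # served from the cache
print(load_or_download("vocab.txt", fake_lookup, fake_download))    # triggers a download

# Possible wiring with openmind_hub (not executed here):
# lookup = lambda name: try_to_load_from_cache(name, cache_dir=".")
# download = lambda name: str(om_hub_download(repo_id="user/project", filename=name, local_dir="."))
```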
Specifying the Community to Be Accessed
Using set_platform():
from openmind_hub import *
set_platform("openi")
om_hub_download(repo_id="FoundationModel/ChatGLM2-6B", filename="config.json", local_dir=".")
Using the platform parameter:
from openmind_hub import *
om_hub_download(repo_id="FoundationModel/ChatGLM2-6B", filename="config.json", local_dir=".", platform="openi")
Using environment variables:
import os
# Environment variables must be set before openmind_hub is imported.
os.environ["platform"] = "openi"
from openmind_hub import *
om_hub_download(repo_id="FoundationModel/ChatGLM2-6B", filename="config.json", local_dir=".")
Example:
from openmind_hub import set_platform, create_repo, upload_folder, snapshot_download
token = "token_in_openi"
# Download the PyTorch-NPU/t5_small model in the Modelers community to the ./t5_small directory.
snapshot_download(repo_id="PyTorch-NPU/t5_small", local_dir="./t5_small")
# Download the FoundationModel/ChatGLM2-6B model in the specified community to the ./ChatGLM2-6B directory.
snapshot_download(repo_id="FoundationModel/ChatGLM2-6B", local_dir="./ChatGLM2-6B", platform="openi")
# Set the default community.
set_platform("openi")
# Create the owner/cool-model project and a model named cool-model in it in the default community set above. (Use the actual user name and repository name.)
create_repo(repo_id="owner/cool-model", token=token)
# Upload files to the cool-model model of the owner/cool-model project.
upload_folder(repo_id="owner/cool-model", folder_path="./t5_small", token=token)
GitCode Community
Adapted Interfaces and Differences
The following methods are adapted for the GitCode community. Their parameters and behaviors are basically the same as those of the native methods.
- upload_file
- upload_folder
- create_commit
- om_hub_download: revision supports only the branch name. An access token is required for downloading files from a public repository.
- snapshot_download: revision supports only the branch name. An access token is required for downloading files from a public repository.
- list_repo_tree: A maximum of 100 files can be listed at each level.
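Because at most 100 files are listed at each level, a listing that reaches that size may be incomplete. The following sketch flags such a case. The helper listing_may_be_truncated is hypothetical, and it only assumes that the entries returned for one level can be collected into a list.

```python
from typing import Sequence

# Per-level limit stated for list_repo_tree in the GitCode community.
GITCODE_LEVEL_LIMIT = 100


def listing_may_be_truncated(
    entries: Sequence[object], limit: int = GITCODE_LEVEL_LIMIT
) -> bool:
    """Return True when a single level holds as many entries as the limit allows."""
    return len(entries) >= limit


small_level = [f"file_{i}.bin" for i in range(12)]
full_level = [f"file_{i}.bin" for i in range(100)]

print(listing_may_be_truncated(small_level))
print(listing_may_be_truncated(full_level))
```

A level that holds exactly 100 files is also flagged, so a positive result means "check manually" rather than "files are definitely missing".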
Specifying the Community to Be Accessed
import os
os.environ["OPENMIND_HUB_ENDPOINT"] = "https://api.gitcode.com"
from openmind_hub import snapshot_download
token = "xxx"
snapshot_download("owner/repo", token=token)
Gitee AI Community
Adapted Interfaces and Differences
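The Gitee AI community supports only the download interfaces listed below. The revision parameter must be specified explicitly (for example, revision="master"), because "master" is the default branch of repositories in this community. The repo_type parameter does not take effect.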
- om_hub_download
- snapshot_download: Additionally generates a .cache folder in the download directory.
Specifying the Community to Be Accessed
import os
os.environ["HF_ENDPOINT"] = "https://hf-api.gitee.com"
from openmind_hub import snapshot_download, set_platform
set_platform("gitee")
token = "xxx"
snapshot_download("owner/repo", token=token, revision="master")
The Gitee AI community supports reading access tokens from environment variables or local files. If you use this method, you are responsible for guarding against the security risks it introduces.
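One way to reduce the risk when a token is kept in a local file is to confirm that only the file owner can access it. The sketch below performs that check on POSIX systems with the standard library. The function name is_owner_only is hypothetical, and the temporary file stands in for a token file whose real location is not specified in this document.

```python
import os
import stat
import tempfile


def is_owner_only(path: str) -> bool:
    """Return True if group and other users have no permission bits on the file (POSIX)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def demo() -> tuple:
    """Check a temporary stand-in file under two permission settings."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        os.chmod(path, 0o600)          # readable and writable by the owner only
        private = is_owner_only(path)
        os.chmod(path, 0o644)          # also readable by group and others
        shared = is_owner_only(path)
    finally:
        os.remove(path)
    return private, shared


print(demo())
```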
Hugging Face Community
Adapted Interfaces and Differences
You can upload or download files, and create repositories or branches in the Hugging Face community. There is no obvious usage difference.
- upload_file
- upload_folder
- om_hub_download
- snapshot_download: Additionally generates a .cache folder in the download directory.
- create_branch
- create_repo: The "fullname", "desc", and "license" parameters are not supported.
from openmind_hub import snapshot_download, set_platform
set_platform("huggingface")
token = "xxx"
snapshot_download("owner/repo", token=token)
The Hugging Face community supports reading access tokens from environment variables or local files. If you use this method, you are responsible for guarding against the security risks it introduces.