Optional Tokenizer Dependency to Improve Embedded Compatibility #209

Open
@ykhrustalev

What behavior of the library made you think about the improvement?

The library currently depends on TLS libraries indirectly through the tokenizers crate, which in turn pulls in hf-hub. This dependency is not needed in every environment.

In many cases, the application using this library already has a tokenizer instance and can pass in the vocabulary as a constraint. Requiring hf-hub, and with it the TLS stack, causes build problems in embedded environments such as iOS or Mac Catalyst because of the heavy dependency graph.

Proposal
Make the hf-hub dependency optional by placing it behind a feature flag that is enabled by default. Environments that do not need it can then strip the dependency by building with --no-default-features.
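To make the proposal concrete, here is a minimal sketch of how the gating could look on the Rust side. Everything in it is an assumption for illustration: the `Vocabulary` type, its methods, the `vocabulary_from_pairs` helper, and the feature name `hub` are hypothetical and are not taken from the library or from #200. The idea is that building a vocabulary from tokens the application already holds is always available, while the download path only compiles when the feature is enabled.

```rust
use std::collections::HashMap;

/// Hypothetical vocabulary type for illustration; not the library's actual API.
#[derive(Debug, Default)]
pub struct Vocabulary {
    eos_token_id: u32,
    // One token may map to several ids, so keep a list per token.
    tokens: HashMap<Vec<u8>, Vec<u32>>,
}

impl Vocabulary {
    /// Always available: needs no network access and no TLS stack.
    pub fn new(eos_token_id: u32) -> Self {
        Self { eos_token_id, tokens: HashMap::new() }
    }

    /// Register a token the application already knows about.
    pub fn insert(&mut self, token: impl Into<Vec<u8>>, id: u32) {
        self.tokens.entry(token.into()).or_default().push(id);
    }

    /// Look up all ids registered for a token.
    pub fn token_ids(&self, token: &[u8]) -> Option<&[u32]> {
        self.tokens.get(token).map(|ids| ids.as_slice())
    }

    pub fn eos_token_id(&self) -> u32 {
        self.eos_token_id
    }

    /// Number of distinct tokens.
    pub fn len(&self) -> usize {
        self.tokens.len()
    }

    /// Only compiled when the hypothetical `hub` feature is on. A real
    /// version would fetch tokenizer files through hf-hub; this sketch
    /// deliberately leaves the download out and returns an error.
    #[cfg(feature = "hub")]
    pub fn from_pretrained(model: &str) -> Result<Self, String> {
        Err(format!("fetching {model} is left out of this sketch"))
    }
}

/// Hypothetical helper: build a vocabulary from (token, id) pairs that
/// the host application obtained from its own tokenizer.
pub fn vocabulary_from_pairs(eos_token_id: u32, pairs: &[(&str, u32)]) -> Vocabulary {
    let mut vocabulary = Vocabulary::new(eos_token_id);
    for (token, id) in pairs {
        vocabulary.insert(*token, *id);
    }
    vocabulary
}
```

With a layout like this, a build that disables the feature never references the hub code path, so the hf-hub crate and its TLS dependencies could be declared optional in the manifest and left out of the dependency graph entirely.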

Benefits
This change enables building for embedded targets such as:

aarch64-apple-ios
aarch64-apple-ios-macabi

with the following commands:

cargo build --release --target aarch64-apple-ios --no-default-features
cargo build --release --target aarch64-apple-ios-macabi --no-default-features

Related change showing how this could work (it worked for our case): #200

How would you like it to behave?

A feature flag that makes hf-hub an optional dependency, so that disabling it no longer pulls in the TLS stack.
