Tokenization

Status: Supported runtime extension owner

Module: artifex.generative_models.extensions.nlp.tokenization

Source: src/artifex/generative_models/extensions/nlp/tokenization.py

Tokenization owner for the NLP extension family.

Top-Level Module Exports

  • AdvancedTokenization

Class APIs

AdvancedTokenization

  • tokenize()
  • detokenize()
  • encode_batch()
  • decode_batch()
  • create_attention_mask()
  • add_special_tokens()
  • compute_token_frequencies()
  • apply_masking()
  • create_position_ids()
  • truncate_sequences()
  • get_vocabulary_info()
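
This page lists method names only, without signatures. The sketch below is therefore not the AdvancedTokenization API. It is a minimal, standalone illustration of what truncation, attention-mask creation, and position-id creation conventionally mean for batches of token ids. The helper names (truncate, pad, attention_mask, position_ids), their parameters, and the padding id of 0 are all assumptions made for illustration.

```python
from typing import List

PAD_ID = 0  # assumed padding token id; the real value is not documented on this page


def truncate(seqs: List[List[int]], max_length: int) -> List[List[int]]:
    """Keep at most max_length leading token ids of each sequence."""
    return [s[:max_length] for s in seqs]


def pad(seqs: List[List[int]], pad_id: int = PAD_ID) -> List[List[int]]:
    """Right-pad every sequence to the length of the longest one."""
    width = max((len(s) for s in seqs), default=0)
    return [s + [pad_id] * (width - len(s)) for s in seqs]


def attention_mask(batch: List[List[int]], pad_id: int = PAD_ID) -> List[List[int]]:
    """Mark real tokens with 1 and padding positions with 0."""
    return [[0 if tok == pad_id else 1 for tok in row] for row in batch]


def position_ids(batch: List[List[int]]) -> List[List[int]]:
    """Assign each token its zero-based index within its row."""
    return [list(range(len(row))) for row in batch]


# Hypothetical token ids for two sequences of different lengths.
raw = [[5, 6, 7, 8, 9], [3, 4]]
padded = pad(truncate(raw, max_length=4))
mask = attention_mask(padded)
positions = position_ids(padded)
```

Consult the source file named above for the actual method signatures and return types before relying on any of these conventions.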