Training Systems
Status: Supported runtime training reference
The shared package artifex.generative_models.training is deliberately narrow. It owns Trainer, typed optimizer and scheduler factories, callback modules, gradient accumulation helpers, distributed utilities, staged and streaming loop helpers, and typed RL trainer contracts. Family-specific trainer implementations live under artifex.generative_models.training.trainers.
Shared Trainer
Use Trainer when you want an explicit objective boundary and a callback-aware training loop:
from artifex.generative_models.core.configuration import (
    OptimizerConfig,
    SchedulerConfig,
    TrainingConfig,
)
from artifex.generative_models.training import Trainer, create_optimizer, create_scheduler
from artifex.generative_models.training.callbacks import (
    CallbackList,
    ProgressBarCallback,
    ProgressBarConfig,
)

# Typed configs describe the optimizer, the learning-rate schedule, and the run.
optimizer_config = OptimizerConfig(
    name="adamw",
    optimizer_type="adamw",
    learning_rate=1e-3,
    weight_decay=0.01,
)
scheduler_config = SchedulerConfig(
    name="cosine",
    scheduler_type="cosine",
    warmup_steps=1_000,
    cycle_length=100_000,
    min_lr_ratio=0.1,
)
training_config = TrainingConfig(
    name="baseline-training",
    optimizer=optimizer_config,
    scheduler=scheduler_config,
    batch_size=64,
    num_epochs=20,
)

# Build the schedule first, then hand it to the optimizer factory.
# The configs defined above are reused rather than constructed a second time.
schedule = create_scheduler(
    scheduler_config,
    base_lr=optimizer_config.learning_rate,
)
optimizer = create_optimizer(
    optimizer_config,
    schedule=schedule,
)

# `model` and `loss_fn` are assumed to be defined elsewhere by the caller.
trainer = Trainer(
    model=model,
    training_config=training_config,
    optimizer=optimizer,
    loss_fn=loss_fn,
    callbacks=CallbackList([
        ProgressBarCallback(ProgressBarConfig(show_metrics=True)),
    ]),
)
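To make the scheduler fields above concrete, the following is a minimal, standalone sketch of a linear-warmup plus cosine-decay schedule. It is not the Artifex implementation: the function name `warmup_cosine_lr` is hypothetical, and it assumes that `cycle_length` counts total steps including warmup and that `min_lr_ratio` scales the base learning rate to get the floor. The behavior of `create_scheduler` may differ.

```python
import math


def warmup_cosine_lr(step, base_lr=1e-3, warmup_steps=1_000,
                     cycle_length=100_000, min_lr_ratio=0.1):
    """Illustrative schedule only (hypothetical helper, not part of Artifex)."""
    min_lr = base_lr * min_lr_ratio
    if step < warmup_steps:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Assumption: the cosine phase spans the steps left after warmup.
    decay_steps = max(cycle_length - warmup_steps, 1)
    # Fraction of the cosine phase completed, clamped at 1 once the cycle ends.
    progress = min((step - warmup_steps) / decay_steps, 1.0)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    # Decay from base_lr down to the floor min_lr.
    return min_lr + (base_lr - min_lr) * cosine


for step in (0, 500, 1_000, 50_500, 100_000):
    print(step, warmup_cosine_lr(step))
```

Under these assumptions the rate climbs linearly during the first 1,000 steps, peaks at the base learning rate, and then decays to 10% of it by step 100,000.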
Family Trainers
The shared package does not hide model-specific objectives behind one universal trainer class. Use the trainer family that matches the model runtime you are actually training:
- VAE Trainer
- GAN Trainer
- Diffusion Trainer
- Flow Trainer, using FlowTrainingConfig(time_sampling="logit_normal") when you want the retained shared flow-matching configuration surface
- Energy Trainer
- Autoregressive Trainer
- REINFORCE Trainer
- PPO Trainer
- GRPO Trainer
- DPO Trainer
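As background for the `time_sampling="logit_normal"` option mentioned for the Flow Trainer, a logit-normal sample is a Gaussian draw passed through a sigmoid, which concentrates time values away from the endpoints 0 and 1. The sketch below illustrates that idea with the standard library only. The function name `logit_normal_times` and its `mean`, `std`, and `seed` parameters are hypothetical, and the Artifex helper may use a different signature and a JAX random key.

```python
import math
import random


def logit_normal_times(n, mean=0.0, std=1.0, seed=0):
    """Draw n time values in (0, 1) from a logit-normal distribution.

    Hypothetical illustration; not the Artifex helper.
    """
    rng = random.Random(seed)
    times = []
    for _ in range(n):
        z = rng.gauss(mean, std)                   # Gaussian draw in logit space
        times.append(1.0 / (1.0 + math.exp(-z)))   # sigmoid maps it into (0, 1)
    return times


samples = logit_normal_times(10_000)
print(min(samples), max(samples), sum(samples) / len(samples))
```

With `mean=0.0` the samples are symmetric around 0.5. Shifting `mean` biases training toward earlier or later times.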
Distributed Utilities
Artifex ships distributed helpers as utilities, not as trainer subclasses. The retained owners are:
- DeviceMeshManager in mesh.md
- DataParallel in data_parallel.md
- DevicePlacement in device_placement.md
- DistributedMetrics in distributed_metrics.md
Advanced Shared Utilities
- GradientAccumulator and DynamicLossScaler live in gradient_accumulation.md
- shared helper functions such as sample_logit_normal live in utils.md
- callback surfaces live in base.md, checkpoint.md, early_stopping.md, logging.md, and profiling.md
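For readers new to gradient accumulation, the general technique is to sum gradients over several micro-batches and apply one averaged update, which emulates a larger batch without the memory cost. The sketch below shows that idea on plain Python lists. The class `MeanGradientAccumulator` and its `add` method are hypothetical stand-ins and do not reflect the interface of the Artifex GradientAccumulator.

```python
class MeanGradientAccumulator:
    """Hypothetical illustration of gradient accumulation (not the Artifex class)."""

    def __init__(self, accumulation_steps):
        if accumulation_steps < 1:
            raise ValueError("accumulation_steps must be >= 1")
        self.accumulation_steps = accumulation_steps
        self._sum = None
        self._count = 0

    def add(self, grads):
        """Add one micro-batch gradient.

        Returns the averaged gradient once enough micro-batches have been
        collected, otherwise None.
        """
        if self._sum is None:
            self._sum = list(grads)
        else:
            self._sum = [s + g for s, g in zip(self._sum, grads)]
        self._count += 1
        if self._count < self.accumulation_steps:
            return None
        mean = [s / self._count for s in self._sum]
        # Reset so the next call starts a fresh accumulation window.
        self._sum, self._count = None, 0
        return mean


acc = MeanGradientAccumulator(accumulation_steps=2)
print(acc.add([1.0, 2.0]))  # None: still accumulating
print(acc.add([3.0, 4.0]))  # [2.0, 3.0]: mean over the two micro-batches
```

An optimizer step would run only when `add` returns a gradient, so the effective batch size is the micro-batch size times `accumulation_steps`.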
Current Training Pages
- Callbacks: base, checkpoint, early_stopping, logging, profiling
- Factories and helpers: factory, gradient_accumulation, utils
- Distributed utilities: data_parallel, device_placement, distributed_metrics, mesh
- Family trainers: vae_trainer, gan_trainer, diffusion_trainer, flow_trainer, energy_trainer, autoregressive_trainer
- RL trainers: reinforce, ppo, grpo, dpo
Coming Soon
Standalone optimizer and scheduler module pages remain roadmap-only until real modules exist. Use the current factory owners instead.
- Planned-only or future pages: adamw, adafactor, lion, scheduler, optax_wrappers, exponential, linear, cosine, mixed_precision, tracking, visualization, model_parallel