The various mechanisms aren't named EXACTLY what I described, and their purposes may have been tweaked a bit. However, the trainer_v4 is running now.
https://huggingface.co/AbstractPhil/tiny-flux-deep/resolve/main/scripts/trainer_v4_testing.py
After converting the model, I've reinitialized the EMA, since the last EMA had degraded into essentially complete garbage noise.
https://huggingface.co/AbstractPhil/tiny-flux-deep/tree/main/checkpoint_runs/v4_init
This EMA will be more closely monitored to ensure it doesn't collapse or implode.
In parallel, the old EMA will keep being updated, to keep hope alive that it eventually learns as well.
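For context, this is roughly the EMA update and the kind of sanity check I mean - a minimal sketch assuming a plain parameter-wise EMA; the decay value and the norm check are illustrative, not the exact trainer_v4 settings.

```python
import torch

@torch.no_grad()
def update_ema(ema_model, model, decay=0.999):
    # Blend current weights into the EMA copy: ema = decay * ema + (1 - decay) * online.
    for ema_p, p in zip(ema_model.parameters(), model.parameters()):
        ema_p.mul_(decay).add_(p, alpha=1.0 - decay)

@torch.no_grad()
def ema_global_norm(ema_model):
    # Cheap collapse check: a near-zero or exploding global weight norm flags a dead/imploding EMA.
    return sum(p.norm().item() ** 2 for p in ema_model.parameters()) ** 0.5
```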
I'll format a safetensors variant of the Sol UNet today, and ensure the experts exist in a model repo for ease of use.
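The safetensors conversion is just a re-serialization - a minimal sketch, assuming a plain PyTorch checkpoint; the file names here are placeholders.

```python
import torch
from safetensors.torch import save_file

# Load the existing checkpoint (path is a placeholder) and re-save the weights as safetensors.
ckpt = torch.load("sol_unet.ckpt", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)                        # unwrap Lightning-style checkpoints if present
state_dict = {k: v.contiguous() for k, v in state_dict.items()}  # safetensors requires contiguous tensors
save_file(state_dict, "sd15_flow_sol_unet.safetensors")
```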
Talk to Lune here; should be absolutely stunning.
https://huggingface.co/AbstractPhil/tinyflux-experts/blob/main/inference_sd15_flow_lune.py
Talk to Sol here; should encapsulate the entirety of flat output geometric structure in a shape.
https://huggingface.co/AbstractPhil/tinyflux-experts/blob/main/inference_sd15_flow_sol.py
It seems the Sol training NEVER advanced far enough to become full flow matching, but it definitely aligns with velocity prediction. This may provide a more useful representation than Lune in many avenues. Lune is most definitely a full rectified-flow model.
Introducing the "blot" expert, sd15-flow-sol. Together with sd15-flow-lune, these are the twin sister flow-matching experts for tinyflux-lailah, and they will be used in tandem to train it. sd15-flow-sol never managed to reach full flow-matching prediction, so an epsilon/v-pred conversion is required. All experts will live in the tinyflux-experts repo, including all the critical checkpoint sets.
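For the epsilon/v-pred conversion, this is the standard relationship I'm referring to - a minimal sketch assuming the usual DDPM parameterization x_t = alpha_t * x0 + sigma_t * eps, with v defined as alpha_t * eps - sigma_t * x0; nothing here is Sol-specific.

```python
def eps_to_v(eps_pred, x_t, alpha_t, sigma_t):
    # Recover the implied clean latent, then re-express the prediction as velocity.
    x0_pred = (x_t - sigma_t * eps_pred) / alpha_t
    return alpha_t * eps_pred - sigma_t * x0_pred
```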
Lune was heavily fine-tuned in the SD3 style with an adapted shift-timestep system, after David's interpolation converted sd15 into a geometric basis.
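By "shift timestep system" I mean the SD3-style warp of the sampled timestep toward the noisier end - a minimal sketch; shift=3.0 is illustrative, not the value Lune was trained with.

```python
def shift_timestep(t: float, shift: float = 3.0) -> float:
    # SD3-style shift: identity at shift=1.0; pushes t in [0, 1] toward 1 (more noise) for shift > 1.
    return shift * t / (1.0 + (shift - 1.0) * t)
```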
Sol was left abandoned after 50 epochs with David and was considered overcooked and rigid, until I noticed the geometric structure today. Lune doesn't produce geometric structure as solid as Sol's, not even close. Lune produces improved fidelity and detail, but Sol produces something very different: aligned to sd15's behavior, and fully representative of the 5-point 4-simplex structure that David brought to the table.
Sol is essentially a nearly perfect blob-forming geometric blotter. Sol is SD15, and yet Sol was trained using a specific pattern-recognizing, timestep-aligned David model. David was tasked with classifying timesteps and patterns using deep-recognition structural analysis, layer by layer, forming full-scale opinions after watching the entirety of sd15's structure during training.
Even though sd15-flow-sol was left abandoned, Sol's structure is HIGHLY effective at understanding TIMESTEP blotting interpolation. I didn't realize how crucially important this was until Lailah started to show rigidity and compartmentalized behavior with sequence - which likely happens to ALL flow-matching models.
AbstractPhil/sd15-flow-matching
AbstractPhil/geo-david-collective-sd15-distilled
AbstractPhil/geo-david-collective-sd15-base-e40
Alright, I've decided: I'll experimentally train for some epochs using the expertise afforded by sd15-flow-lune's timestep and trajectory knowledge as the guidance-distillation mechanism for training. How accurately this matches tinyflux's interpolation requirement is to be determined.
Flow-Lune is an acceptable distillation that converted sd15 into a useful image synthesizer, trained on entirely synthetic data derived from sd15 and Schnell.
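Roughly, the training step would look like this - a hypothetical sketch of teacher-guided flow-matching distillation, assuming Lune predicts velocity and both models share a (x_t, t, cond) call signature; none of these names come from trainer_v4.

```python
import torch
import torch.nn.functional as F

def distill_step(student, teacher, x0, cond):
    # Sample a flow-matching timestep and the corresponding noisy latent (t=1 is pure noise).
    t = torch.rand(x0.shape[0], device=x0.device)
    t_ = t.view(-1, 1, 1, 1)
    x_t = (1.0 - t_) * x0 + t_ * torch.randn_like(x0)
    # The student matches the teacher's (Lune's) velocity prediction at that timestep.
    with torch.no_grad():
        v_teacher = teacher(x_t, t, cond)
    v_student = student(x_t, t, cond)
    return F.mse_loss(v_student, v_teacher)
```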
The pretraining has hit an impasse.
Currently it's a linear timestep schedule based on shift, plus a random guidance value between 1 and 5. I have narrowed the possibilities down to two that can be implemented today to solve this problem, CFG or TIMESTEP; which expert is required, and which is the best candidate?
- The model WILL require a timestep expert manifold. This allows the timestep manifold's expertise to be managed during training by something much better trained and more intelligent, which will require CFG guidance training controlled either by learning or by pure random chance, e.g. standard conditioning dropout to encourage CFG (see the sketch after this list).
- OR the model WILL require a CFG expert to distill the guidance embeds. This model is simply too small. The embeds CAN learn useful information, yes, if they are distilled from an expert so the CFG is baked into the model by default. This will likely require a third expert that can be modularly snapped off for inference; that expert will likely need to be present during training, otherwise the model will heavily drift due to its small size.
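For the dropout option mentioned above, this is the standard classifier-free-guidance conditioning dropout - a minimal sketch; the 10% drop rate and the null-embedding handling are assumptions, not what trainer_v4 actually does.

```python
import torch

def drop_condition(cond_embeds, null_embeds, p_drop=0.1):
    # With probability p_drop per sample, swap the conditioning for a null embedding
    # so the model also learns the unconditional branch needed for CFG at inference.
    keep = (torch.rand(cond_embeds.shape[0], device=cond_embeds.device) > p_drop).float()
    keep = keep.view(-1, *([1] * (cond_embeds.dim() - 1)))
    return keep * cond_embeds + (1.0 - keep) * null_embeds
```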
I have trained a multitude of v-pred SDXL models and a flow-matching shift sd15 model that can represent this necessary implication. This raises the question: which expert should be used, and should I just make a very specific tinyflux expert distilled from ALL SD15 and SDXL timestep variants using David?
This leads to one CORE and IMPORTANT question: CAN THIS BE REPRESENTED WITHOUT AN EXPERT!? I think it's possible; I've run ViT experiments that used raw sinusoidal encodings with a surprisingly fair representation of encoding capacity.
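The "raw sinusoidal" encodings from those ViT experiments are just the standard transformer-style embedding - a minimal sketch, assuming an even embedding dimension; dim=256 is illustrative.

```python
import math
import torch

def sinusoidal_embedding(t, dim=256, max_period=10000.0):
    # t: [B] tensor of timesteps -> [B, dim] embedding of cos/sin at geometrically spaced frequencies.
    half = dim // 2
    freqs = torch.exp(-math.log(max_period) * torch.arange(half, device=t.device, dtype=torch.float32) / half)
    args = t.float()[:, None] * freqs[None, :]
    return torch.cat([torch.cos(args), torch.sin(args)], dim=-1)
```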
The model is ALREADY responsive to CFG, but only in part. The current CFG guidance is only getting in the way at many points, and I assume it's just injecting noise, so I'll need to either disable it or use it correctly. The further training gets, the more retraining such a component will require, so the decision needs to happen sooner rather than later.
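To test whether guidance is helping or just injecting noise, the inference-side combination is the usual one - a minimal sketch; sweeping guidance_scale (including 1.0, i.e. guidance effectively off) is one way to check.

```python
def apply_cfg(pred_uncond, pred_cond, guidance_scale):
    # Standard classifier-free guidance: a scale of 1.0 reduces to the conditional prediction alone.
    return pred_uncond + guidance_scale * (pred_cond - pred_uncond)
```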
Lailah uses flan-t5-base, clip-vit-l-14, and BlackForestLabs Flux1s VAE.
SEQ limit 128, images 512x512 for now. Lailah's early form is based on three variants. TinyFlux's weights were carefully planted into a deeper structure and trained yet again - dubbed TinyFlux-Deep. This variant has 15 dual-stream blocks and 25 single-stream blocks, with a nearly identical weight layout to Flux and a similar attention mechanism - but intentionally deviant and compacted, with careful consideration of scaling and the purpose of each mechanism.
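For reference, the numbers above collected in one place - a hypothetical config sketch; the field names and the specific Hub repo IDs are my assumptions for the components named, and only the quoted values (128, 512, 15, 25) come from the post.

```python
from dataclasses import dataclass

@dataclass
class LailahConfig:
    text_encoder: str = "google/flan-t5-base"             # flan-t5-base
    clip_encoder: str = "openai/clip-vit-large-patch14"   # clip-vit-l-14
    vae: str = "black-forest-labs/FLUX.1-schnell"         # Flux1s VAE (the "vae" subfolder of this repo)
    max_seq_len: int = 128                                 # SEQ limit
    image_size: int = 512                                  # 512x512 for now
    num_dual_stream_blocks: int = 15
    num_single_stream_blocks: int = 25
```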
She went through quite a few growing pains with her earlier attention mechanism, which required a reimagining today and careful consideration of the consequences. Now I present to you a preliminary look into Lailah.
The preliminary training is still heavily under way, the mechanisms are still being augmented, and her stability is currently being measured. The potential for fidelity, depth, and quality is still being assessed - so I will shift attention and pivot utility based on needs over time.