---
license: apache-2.0
---

# FLUX.1 [schnell] -- Flumina Server App (FP8 Version)

This repository contains an implementation of the FLUX.1 [schnell] [FP8 version](https://github.com/aredden/flux-fp8-api), which utilizes float8 numerics instead of bfloat16. This update allows for a 2x performance improvement, significantly speeding up inference tasks when deployed via Fireworks AI’s Flumina Server App toolkit.

![Example output](example.png)

## Getting Started -- Serverless deployment on Fireworks

This FP8 Server App is deployed to Fireworks as-is in a "serverless" deployment, offering high-speed, hassle-free performance.

Grab an [API Key](https://fireworks.ai/account/api-keys) from Fireworks and set it in your environment variables:

```bash
export API_KEY=YOUR_API_KEY_HERE
```

### Text-to-Image Example Call

```bash
curl -X POST 'https://api.fireworks.ai/inference/v1/workflows/accounts/fireworks/models/flux-1-schnell-fp8/text_to_image' \
    -H "Authorization: Bearer $API_KEY" \
    -H "Content-Type: application/json" \
    -H "Accept: image/jpeg" \
    -d '{
        "prompt": "Woman laying in the grass",
        "aspect_ratio": "16:9",
        "guidance_scale": 3.5,
        "num_inference_steps": 4,
        "seed": 0
    }' \
    --output output.jpg
```

![Output of text-to-image](t2i_output.jpg)

## What is Flumina?

Flumina is Fireworks.ai’s innovative platform for hosting Server Apps that lets users deploy deep learning inference to production environments in just minutes.

## What does Flumina offer for FLUX models?

Flumina provides the following advantages for FLUX models:

* Clear, precise definitions of server-side workloads by reviewing the server app implementation (right here).
* Extensibility interface for dynamic loading and dispatching of add-ons server-side. For FLUX, this includes:
    * ControlNet (Union) adapters
    * LoRA adapters
* Off-the-shelf support for on-demand capacity scaling with Server Apps on Fireworks.
* Customization of the deployment logic through modifications to the Server App, with easy redeployment.
* Support for FP8 numerics, unlocking faster, more efficient inference capabilities.

## Deploying FLUX.1 [schnell] FP8 Version to Fireworks On-Demand

## Deploying Custom FLUX.1 [schnell] FP8 Apps to Fireworks On-Demand

Coming soon!
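
## Calling the Serverless Endpoint from Python

The curl call under "Text-to-Image Example Call" can also be sketched in Python. The snippet below is a minimal illustration using only the standard library, not an official client: the endpoint URL, headers, and JSON fields are copied from the curl example, while the helper names `build_request` and `generate_image` and the 120-second timeout are assumptions made for this sketch.

```python
import json
import urllib.request

# Endpoint copied from the curl example in this README.
ENDPOINT = (
    "https://api.fireworks.ai/inference/v1/workflows/accounts/fireworks/"
    "models/flux-1-schnell-fp8/text_to_image"
)


def build_request(prompt, api_key, aspect_ratio="16:9", guidance_scale=3.5,
                  num_inference_steps=4, seed=0):
    """Build the same POST request that the curl example sends.

    Hypothetical helper: the default values mirror the curl example.
    """
    payload = {
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
        "guidance_scale": guidance_scale,
        "num_inference_steps": num_inference_steps,
        "seed": seed,
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "Accept": "image/jpeg",
        },
        method="POST",
    )


def generate_image(prompt, api_key, output_path="output.jpg", **params):
    """Send the request and write the returned image bytes to output_path.

    Hypothetical helper: the timeout is an arbitrary choice for this sketch.
    """
    request = build_request(prompt, api_key, **params)
    # urlopen raises urllib.error.HTTPError for non-2xx responses.
    with urllib.request.urlopen(request, timeout=120) as response:
        image_bytes = response.read()
    with open(output_path, "wb") as f:
        f.write(image_bytes)
    return output_path
```

With `API_KEY` exported as shown earlier, `generate_image("Woman laying in the grass", os.environ["API_KEY"])` (after `import os`) should write the response to `output.jpg`, mirroring the `--output` flag of the curl call. Building the request separately from sending it keeps the payload easy to inspect without making a network call.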