botirk/tiny-prompt-task-complexity-classifier

Text Classification


botirk committed on Jun 12

Commit 269f6c8 · verified · 1 Parent(s): d2ab339

Upload quantized ONNX model

Files changed (2)

README.md +11 -4
model_quantized.onnx +2 -2

README.md CHANGED

@@ -13,17 +13,24 @@ pipeline_tag: text-classification
 # Quantized ONNX model for botirk/tiny-prompt-task-complexity-classifier
-This repository contains the quantized ONNX version of the [nvidia/prompt-task-and-complexity-classifier](https://huggingface.co/nvidia/prompt-task-and-complexity-classifier) model.
 ## Model Description
-This is a multi-headed model which classifies English text prompts across task types and complexity dimensions. This version has been quantized to `INT8` using dynamic quantization with the [🤗 Optimum](https://github.com/huggingface/optimum) library, resulting in a smaller footprint and faster CPU inference.
-For more details on the model architecture, tasks, and complexity dimensions, please refer to the [original model card](https://huggingface.co/nvidia/prompt-task-and-complexity-classifier).
 ## How to Use
-You can use this model directly with `optimum.onnxruntime` for accelerated inference.
 First, install the required libraries:
 ```bash

 # Quantized ONNX model for botirk/tiny-prompt-task-complexity-classifier
+This repository contains the quantized ONNX version of the \
+[nvidia/prompt-task-and-complexity-classifier](https://huggingface.co/nvidia/prompt-task-and-complexity-classifier) model.
 ## Model Description
+This is a multi-headed model which classifies English text prompts across task \
+types and complexity dimensions. This version has been quantized to `INT8` \
+using dynamic quantization with the [🤗 Optimum](https://github.com/huggingface/optimum) \
+library, resulting in a smaller footprint and faster CPU inference.
+For more details on the model architecture, tasks, and complexity dimensions, \
+please refer to the [original model card]\
+(https://huggingface.co/nvidia/prompt-task-and-complexity-classifier).
 ## How to Use
+You can use this model directly with `optimum.onnxruntime` for accelerated \
+inference.
 First, install the required libraries:
 ```bash

model_quantized.onnx CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:36c58a6b89d72d22c9a67caebab6356673f13e6a3c743e54552878cf1557c3e0
-size 243965613

 version https://git-lfs.github.com/spec/v1
+oid sha256:6822e95319064c37f205a315480d1c3754f670f560c058726312445e46fc01b4
+size 187497950
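
To make the size change in the `model_quantized.onnx` pointer easier to read, here is a minimal sketch that parses the two Git LFS pointer versions shown above and compares their `size` fields. The helper name `parse_lfs_pointer` is hypothetical and is not part of this repository or of any Git LFS tooling. The sketch only reads the pointer text, not the ONNX file itself.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict.

    Hypothetical helper for illustration; it assumes the simple
    three-line pointer layout shown in the diff above.
    """
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])  # size is recorded in bytes
    return fields


# Pointer contents copied from the removed (-) and added (+) lines of the diff.
OLD_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:36c58a6b89d72d22c9a67caebab6356673f13e6a3c743e54552878cf1557c3e0
size 243965613
"""

NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:6822e95319064c37f205a315480d1c3754f670f560c058726312445e46fc01b4
size 187497950
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)

saved = old["size"] - new["size"]   # bytes saved by this commit
reduction = saved / old["size"]     # fraction of the previous size

print(f"{old['size']} -> {new['size']} bytes ({reduction:.1%} smaller)")
```

By the pointer sizes alone, the new file is roughly 23% smaller than the one it replaces. The diff does not say how the previous file was produced, so this figure compares the two uploads only and is not a measurement against the original model.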