---
license: apache-2.0
language:
- en
- code
tags:
- security
- code-repair
- oss-bench
- moe
- chain-of-thought
- nlp
- c
- cpp
- php
library_name: transformers
pipeline_tag: text-generation
---
<div align="center">
# 🛡️ NxCode-SafeCoder-30B
**The Next-Generation Mixture-of-Experts Model for Secure Code Intelligence**
[License: Apache 2.0](https://opensource.org/licenses/Apache-2.0) · [Built with 🤗 Transformers](https://huggingface.co/docs/transformers/index)
</div>
---
## 📖 Model Overview
**NxCode-SafeCoder-30B** is a state-of-the-art code generation model engineered specifically for **software security auditing and automated vulnerability remediation**.
Built upon a highly efficient **Mixture-of-Experts (MoE)** architecture, it delivers the knowledge density of a 30B parameter model while maintaining the inference latency of a much smaller model (only ~3B active parameters per token).
Unlike general-purpose coding assistants, NxCode-SafeCoder is aligned using a **Security-First Chain-of-Thought (CoT)** methodology. It effectively mimics the workflow of a senior security researcher: **Analyze -> Reason -> Fix**.
## ✨ Key Capabilities
* **🛡️ Surgical Vulnerability Patching**: Excels at fixing complex memory-safety issues (buffer overflows, use-after-free, double free) in C/C++ and PHP.
* **🧠 Dual-Phase Generation**: The model is trained to output a detailed **Security Analysis** (`### Analysis`) before generating the **Fixed Code**, ensuring the fix is logically sound and side-effect free.
* **⚡ High-Throughput Inference**: Fully optimized for **vLLM**, achieving **>600 tokens/s** on NVIDIA A100 GPUs, making it suitable for large-scale codebase scanning.
* **📉 Minimal False Positives**: Triggers drastically fewer sanitizer alerts than GPT-4o and Llama-3-70B in fuzzing benchmarks (OSS-Bench).
## 📊 Performance
*Evaluation based on the OSS-Bench framework (random split, PHP-src & SQLite targets).*
| Model | Architecture | Compilation Rate | Test Pass Rate | **Sanitizer Alerts** (Lower is Better) |
| :--- | :--- | :---: | :---: | :---: |
| **NxCode-SafeCoder-30B** | **MoE (30B)** | **High** | **High** | **Lowest** 🏆 |
| GPT-4o | Dense | High | High | Medium |
| Llama-3-70B-Instruct | Dense | Medium | Medium | High |
| DeepSeek-Coder-33B | Dense | High | Medium | Medium |
> **Note**: While general-purpose models often generate code that compiles, they frequently miss subtle boundary checks or introduce new logic errors. NxCode-SafeCoder prioritizes memory safety above all else.
## 💻 Usage
### 1. Using Transformers
````python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
model_name = "NxcodeOfficial/NxCode-SafeCoder-30B"
# Load with Flash Attention 2 for best performance
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
model_name,
device_map="auto",
trust_remote_code=True,
torch_dtype=torch.bfloat16,
attn_implementation="flash_attention_2"
)
# Standard Security Prompt Template
# Note: the model expects the target function to be wrapped in a fenced C code block
prompt = """You are a Linux Kernel security expert. Fix the vulnerabilities in the following C function.
Repository: linux
File: mm/mmap.c
Function:
```c
void *simple_mmap(void *addr, size_t len) {
// Vulnerable: No checks
return mmap(addr, len, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}
```
"""
messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate (the model outputs the Analysis first, then the fixed code)
outputs = model.generate(inputs, max_new_tokens=1024, do_sample=True, temperature=0.2)

# Decode only the newly generated tokens, skipping the echoed prompt
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
````
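Since the model emits its `### Analysis` section before the fixed code, downstream tooling typically needs to split a completion into the two phases. Below is a minimal, hypothetical helper for doing that; the `split_response` function and the assumption that the fix arrives in the first fenced code block are illustrative, not part of any official tooling shipped with the model.

````python
import re

def split_response(completion: str):
    """Split a dual-phase completion into (analysis, fixed_code).

    Assumes the trained output format: a '### Analysis' section
    followed by the repaired function in a fenced code block.
    """
    # Text between the '### Analysis' header and the first code fence
    analysis_match = re.search(r"### Analysis\s*(.*?)```", completion, re.DOTALL)
    analysis = analysis_match.group(1).strip() if analysis_match else ""

    # Contents of the first fenced code block (e.g. ```c ... ```)
    code_match = re.search(r"```[\w+]*\n(.*?)```", completion, re.DOTALL)
    code = code_match.group(1).strip() if code_match else ""
    return analysis, code

# 'response' is the decoded string from the example above
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
analysis, fixed_code = split_response(response)
print(analysis)
print(fixed_code)
````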
### 2. Using vLLM (Recommended for Production)
For maximum throughput (e.g., scanning entire repositories), use vLLM.
```python
from vllm import LLM, SamplingParams
llm = LLM(
model="NxcodeOfficial/NxCode-SafeCoder-30B",
trust_remote_code=True,
tensor_parallel_size=1, # Fits on a single A100 80GB
gpu_memory_utilization=0.95,
max_model_len=8192 # Recommended limit to avoid OOM
)
# ... (inference code)
```
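The elided inference step might look like the following. This is a minimal sketch assuming the standard vLLM offline API (`SamplingParams` plus `LLM.generate`) and reusing the tokenizer's chat template to format prompts; the one-element batch is purely illustrative.

````python
from transformers import AutoTokenizer
from vllm import SamplingParams

tokenizer = AutoTokenizer.from_pretrained(
    "NxcodeOfficial/NxCode-SafeCoder-30B", trust_remote_code=True
)

# Illustrative batch: one chat-formatted prompt per function to audit
functions_to_audit = [
    "void *simple_mmap(void *addr, size_t len) { /* ... */ }",
]
prompts = [
    tokenizer.apply_chat_template(
        [{"role": "user",
          "content": f"Fix the vulnerabilities in the following C function.\n```c\n{fn}\n```"}],
        tokenize=False,
        add_generation_prompt=True,
    )
    for fn in functions_to_audit
]

sampling_params = SamplingParams(temperature=0.2, max_tokens=1024)

# 'llm' is the LLM instance constructed above
for output in llm.generate(prompts, sampling_params):
    print(output.outputs[0].text)
````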
## 🔬 Methodology
The model was fine-tuned on a proprietary dataset containing **10k+ high-quality security patches** distilled from advanced reasoning engines. The training process utilized:
1. **Expert Routing Optimization**: Tuning the MoE router to specialize specific experts for code analysis vs. code generation.
2. **Conservative Alignment**: Reinforcing the preference for safer standard libraries (e.g., `strncpy`, `snprintf`) and explicit null-pointer checks.
## 📜 Citation
If you use this model in your research or product, please cite:
```bibtex
@misc{nxcode2025safecoder,
  title        = {NxCode-SafeCoder: Automating Secure Code Repair with MoE},
  author       = {NxCode Team},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/NxcodeOfficial/NxCode-SafeCoder-30B}}
}
```
## ⚖️ License
Apache 2.0