# 🚀 Object Detection with Transformer Models This project provides a robust object detection system leveraging state-of-the-art transformer models, including **DETR (DEtection TRansformer)** and **YOLOS (You Only Look One-level Series)**. The system supports object detection and panoptic segmentation from uploaded images or image URLs. It features a user-friendly **Gradio** web interface for interactive use and a **FastAPI** endpoint for programmatic access. Try the online demo on Hugging Face Spaces: [Object Detection Demo](https://huggingface.co/spaces/NeerajCodz/ObjectDetection). ## Models Supported The application supports the following models, each tailored for specific detection or segmentation tasks: - **DETR (DEtection TRansformer)**: - `facebook/detr-resnet-50`: Fast and accurate object detection with a ResNet-50 backbone. - `facebook/detr-resnet-101`: Higher accuracy object detection with a ResNet-101 backbone, slower than ResNet-50. - `facebook/detr-resnet-50-panoptic`: Panoptic segmentation with ResNet-50 (note: may have stability issues). - `facebook/detr-resnet-101-panoptic`: Panoptic segmentation with ResNet-101 (note: may have stability issues). - **YOLOS (You Only Look One-level Series)**: - `hustvl/yolos-tiny`: Lightweight and fast, ideal for resource-constrained environments. - `hustvl/yolos-base`: Balances speed and accuracy for object detection. ## Features - **Image Upload**: Upload images via the Gradio interface for object detection. - **URL Input**: Provide image URLs for detection through the Gradio interface or API. - **Model Selection**: Choose between DETR and YOLOS models for detection or panoptic segmentation. - **Object Detection**: Highlights detected objects with bounding boxes and confidence scores. - **Panoptic Segmentation**: Supports scene segmentation with colored masks (DETR panoptic models). - **Image Properties**: Displays metadata like format, size, aspect ratio, file size, and color statistics. - **API Access**: Programmatically process images via the FastAPI `/detect` endpoint. - **Flexible Deployment**: Run locally, in Docker, or in cloud environments like Google Colab. ## How to Use ### 1. **Local Setup (Git Clone)** Follow these steps to set up the application locally: #### Prerequisites - Python 3.8 or higher - `pip` for installing dependencies - Git for cloning the repository #### Clone the Repository ```bash git clone https://github.com/NeerajCodz/ObjectDetection cd ObjectDetection ``` #### Install Dependencies Install required packages from `requirements.txt`: ```bash pip install -r requirements.txt ``` #### Run the Application Launch the Gradio interface: ```bash python app.py ``` To enable the FastAPI server: ```bash python app.py --enable-fastapi ``` #### Access the Application - **Gradio**: Open the URL displayed in the console (typically `http://127.0.0.1:7860`). - **FastAPI**: Navigate to `http://localhost:8000` for the API or Swagger UI (if enabled). ### 2. **Running with Docker** Use Docker for a containerized setup. #### Prerequisites - Docker installed on your machine. Download from [Docker's official site](https://www.docker.com/get-started). #### Pull the Docker Image Pull the pre-built image from Docker Hub: ```bash docker pull neerajcodz/objectdetection:latest ``` #### Run the Docker Container Run the application on port 8080: ```bash docker run -d -p 8080:80 neerajcodz/objectdetection:latest ``` Access the interface at `http://localhost:8080`. #### Build and Run the Docker Image To build the Docker image locally: 1. Ensure you have a `Dockerfile` in the repository root (example provided in the repository). 2. Build the image: ```bash docker build -t objectdetection:local . ``` 3. Run the container: ```bash docker run -d -p 8080:80 objectdetection:local ``` Access the interface at `http://localhost:8080`. ### 3. **Demo** Try the demo on Hugging Face Spaces: [Object Detection Demo](https://huggingface.co/spaces/NeerajCodz/ObjectDetection) ## Command-Line Arguments The `app.py` script supports the following command-line arguments: - `--gradio-port `: Specify the port for the Gradio UI (default: 7860). - Example: `python app.py --gradio-port 7870` - `--enable-fastapi`: Enable the FastAPI server (disabled by default). - Example: `python app.py --enable-fastapi` - `--fastapi-port `: Specify the port for the FastAPI server (default: 8000). - Example: `python app.py --enable-fastapi --fastapi-port 8001` You can combine arguments: ```bash python app.py --gradio-port 7870 --enable-fastapi --fastapi-port 8001 ``` Alternatively, set the `GRADIO_SERVER_PORT` environment variable: ```bash export GRADIO_SERVER_PORT=7870 python app.py ``` ## Using the API **Note**: The FastAPI API is currently unstable and may require additional configuration for production use. The `/detect` endpoint allows programmatic image processing. ### Running the FastAPI Server Enable FastAPI when launching the script: ```bash python app.py --enable-fastapi ``` Or run FastAPI separately with Uvicorn: ```bash uvicorn objectdetection:app --host 0.0.0.0 --port 8000 ``` Access the Swagger UI at `http://localhost:8000/docs` for interactive testing. ### Endpoint Details - **Endpoint**: `POST /detect` - **Parameters**: - `file`: (optional) Image file (must be `image/*` type). - `image_url`: (optional) URL of the image. - `model_name`: (optional) Model name (e.g., `facebook/detr-resnet-50`, `hustvl/yolos-tiny`). - **Content-Type**: `multipart/form-data` for file uploads, `application/json` for URL inputs. ### Example Requests #### Using `curl` with an Image URL ```bash curl -X POST "http://localhost:8000/detect" \ -H "Content-Type: application/json" \ -d '{"image_url": "https://example.com/image.jpg", "model_name": "facebook/detr-resnet-50"}' ``` #### Using `curl` with an Image File ```bash curl -X POST "http://localhost:8000/detect" \ -F "file=@/path/to/image.jpg" \ -F "model_name=facebook/detr-resnet-50" ``` ### Response Format The response includes a base64-encoded image with detections and detection details: ```json { "image_url": "data:image/png;base64,...", "detected_objects": ["person", "car"], "confidence_scores": [0.95, 0.87], "unique_objects": ["person", "car"], "unique_confidence_scores": [0.95, 0.87] } ``` ### Notes - Ensure only one of `file` or `image_url` is provided. - The API may experience instability with panoptic models; use object detection models for reliability. - Test the API using the Swagger UI for easier debugging. ## Development Setup To contribute or modify the application: 1. Clone the repository: ```bash git clone https://github.com/NeerajCodz/ObjectDetection cd ObjectDetection ``` 2. Install dependencies: ```bash pip install -r requirements.txt ``` 3. Run the application: ```bash python app.py ``` Or run FastAPI: ```bash uvicorn objectdetection:app --host 0.0.0.0 --port 8000 ``` 4. Access at `http://localhost:7860` (Gradio) or `http://localhost:8000` (FastAPI). ## Contributing Contributions are welcome! To contribute: 1. Fork the repository. 2. Create a feature or bugfix branch (`git checkout -b feature/your-feature`). 3. Commit changes (`git commit -m "Add your feature"`). 4. Push to the branch (`git push origin feature/your-feature`). 5. Open a pull request on the [GitHub repository](https://github.com/NeerajCodz/ObjectDetection). Please include tests and documentation for new features. Report issues via GitHub Issues. ## Troubleshooting - **Port Conflicts**: If port 7860 is in use, specify a different port with `--gradio-port` or set `GRADIO_SERVER_PORT`. - **Colab Issues**: Use the `--gradio-port` argument or environment variable to avoid port conflicts in Google Colab. - **Panoptic Model Bugs**: Avoid `detr-resnet-*-panoptic` models until stability issues are resolved. - **API Instability**: Test with smaller images and object detection models first. For further assistance, open an issue on the [GitHub repository](https://github.com/NeerajCodz/ObjectDetection).