Multi-Model Orchestrator

A multi-model orchestration system that manages parent-child LLM relationships, integrating a CLIP-GPT2 image captioner and a Flickr30k text-to-image model as child models.

🚀 Features

Parent Orchestrator

  • Intelligent Task Routing: Automatically routes tasks to appropriate child models
  • Model Management: Handles loading, caching, and lifecycle of child models
  • Error Handling: Robust error handling and recovery mechanisms
  • Task History: Comprehensive logging and monitoring of all operations
  • Async Support: Both synchronous and asynchronous processing modes
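The routing idea above can be sketched as a dispatcher that maps a task type to the child model that handles it. The orchestrator's internals are not shown in this card, so the names here (`TASK_HANDLERS`, `route_task`, the two handler functions) are illustrative assumptions, not the package's actual API:

```python
def caption_handler(payload):
    # Stand-in for the CLIP-GPT2 image-to-text child model
    return f"caption for {payload}"

def image_handler(payload):
    # Stand-in for the Flickr30k text-to-image child model
    return f"image generated from '{payload}'"

# Registry mapping task types to child-model handlers
TASK_HANDLERS = {
    "image_to_text": caption_handler,
    "text_to_image": image_handler,
}

def route_task(task_type, payload):
    # Route a task to the registered child model, or fail loudly
    handler = TASK_HANDLERS.get(task_type)
    if handler is None:
        raise ValueError(f"No child model registered for task: {task_type}")
    return handler(payload)

print(route_task("image_to_text", "sample_image.jpg"))
```

Keeping the registry as a plain mapping is what makes error handling and logging straightforward: unknown task types fail at the single routing point rather than deep inside a model call.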

Child Models

  • CLIP-GPT2 Image Captioner: Converts images to descriptive text captions
  • Flickr30k Text-to-Image: Generates images from text descriptions
  • Extensible Architecture: Easy to add new child models
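One common way to keep child models pluggable is a small abstract base class plus a registry. The package's real extension API is not documented here, so `ChildModel`, `register`, and the example `EchoSummarizer` model below are assumptions meant only to illustrate the pattern:

```python
from abc import ABC, abstractmethod

class ChildModel(ABC):
    # Each child model declares which task type it serves
    task_type: str

    @abstractmethod
    def run(self, payload):
        ...

class EchoSummarizer(ChildModel):
    """Hypothetical new child model: trivially 'summarizes' text."""
    task_type = "text_summary"

    def run(self, payload):
        return payload[:20]

registry = {}

def register(model: ChildModel):
    # Adding a new capability is just one registry entry
    registry[model.task_type] = model

register(EchoSummarizer())
print(registry["text_summary"].run("A long piece of text to summarize"))
```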

📦 Installation

pip install git+https://huggingface.co/kunaliitkgp09/multi-model-orchestrator

🎯 Quick Start

from multi_model_orchestrator import SimpleMultiModelOrchestrator

# Initialize orchestrator
orchestrator = SimpleMultiModelOrchestrator()
orchestrator.initialize_models()

# Generate caption from image
caption = orchestrator.generate_caption("sample_image.jpg")
print(f"Caption: {caption}")

# Generate image from text
image_path = orchestrator.generate_image("A beautiful sunset over mountains")
print(f"Generated image: {image_path}")

🔗 Model Integration

Child Model 1: CLIP-GPT2 Image Captioner

  • Model: kunaliitkgp09/clip-gpt2-image-captioner
  • Task: Image-to-text captioning
  • Performance: ~40% accuracy on test samples

Child Model 2: Flickr30k Text-to-Image

  • Model: kunaliitkgp09/flickr30k-text-to-image
  • Task: Text-to-image generation
  • Training: Fine-tuned on the Flickr30k dataset

📊 Usage Examples

Multimodal Processing

# Process both image and text together
results = orchestrator.process_multimodal_task(
    image_path="sample_image.jpg",
    text_prompt="A serene landscape with mountains"
)

print("Caption:", results["caption"])
print("Generated Image:", results["generated_image"])

Async Processing

from multi_model_orchestrator import AsyncMultiModelOrchestrator
import asyncio

async def async_example():
    orchestrator = AsyncMultiModelOrchestrator()
    orchestrator.initialize_models()
    
    results = await orchestrator.process_multimodal_async(
        image_path="sample_image.jpg",
        text_prompt="A futuristic cityscape"
    )
    return results

results = asyncio.run(async_example())
print(results)
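Because the async API is awaitable, several multimodal tasks can be overlapped with asyncio.gather. The stand-in coroutine below mimics process_multimodal_async so the pattern runs without the package installed; substitute the real orchestrator call in practice:

```python
import asyncio

async def fake_process(prompt):
    # Stand-in for orchestrator.process_multimodal_async(...)
    await asyncio.sleep(0)  # yield control, as real model I/O would
    return {"caption": f"caption:{prompt}"}

async def run_batch(prompts):
    # Launch all tasks concurrently and collect results in order
    return await asyncio.gather(*(fake_process(p) for p in prompts))

results = asyncio.run(run_batch(["sunset", "cityscape"]))
print(len(results))
```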

🎯 Use Cases

  • Content Creation: Generate captions and images for social media
  • Research and Development: Model performance comparison and prototyping
  • Production Systems: Automated content generation pipelines
  • Educational Applications: AI model demonstration and learning

📈 Performance Metrics

  • Processing Time: Optimized for real-time applications
  • Memory Usage: Efficient GPU/CPU memory management
  • Success Rate: Robust error handling and recovery
  • Extensibility: Easy integration of new child models

🤝 Contributing

Contributions are welcome! Please feel free to submit pull requests or open issues for:

  • New child model integrations
  • Performance improvements
  • Bug fixes
  • Documentation enhancements

📄 License

This project is licensed under the MIT License.

🙏 Acknowledgments


Happy Orchestrating! 🚀
