A web application built with Streamlit, LangChain, OpenAI, and ElevenLabs that generates a short story, a matching image, and an audio narration based on a user's creative prompt.
- AI-generated story using OpenAI's language models
- Image creation via OpenAI's image generation API (DALL·E)
- Voice narration using ElevenLabs text-to-speech
- Interactive UI for selecting genre, tone, image style, and narration voice
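The features above suggest a three-stage flow: generate the story, then produce an image and a narration to go with it. The following is a minimal sketch of that flow, not the code in app.py. The `GeneratedContent` container and the `story_fn`, `image_fn`, and `audio_fn` callables are hypothetical names standing in for whatever wraps the OpenAI and ElevenLabs calls, and driving the image from the prompt rather than from the finished story is an assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GeneratedContent:
    """Hypothetical container for the three outputs of one run."""
    story: str
    image_url: str
    audio: bytes


def generate_content(
    prompt: str,
    story_fn: Callable[[str], str],
    image_fn: Callable[[str], str],
    audio_fn: Callable[[str], bytes],
) -> GeneratedContent:
    """Run the three stages in order; each *_fn wraps one provider call."""
    story = story_fn(prompt)      # e.g. an OpenAI language model via LangChain
    image_url = image_fn(prompt)  # e.g. the OpenAI image generation API
    audio = audio_fn(story)       # e.g. ElevenLabs text-to-speech on the story
    return GeneratedContent(story=story, image_url=image_url, audio=audio)


if __name__ == "__main__":
    # Stub backends so the sketch runs without any API keys.
    result = generate_content(
        "A dragon who learns to dance",
        story_fn=lambda p: f"Once upon a time: {p}.",
        image_fn=lambda p: "https://example.com/image.png",
        audio_fn=lambda text: text.encode("utf-8"),
    )
    print(result.story)
```

Passing the provider calls in as functions keeps the orchestration testable with stubs, so the flow can be exercised without spending API credits.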
- Python 3.8+
- OpenAI API key
- ElevenLabs API key
- Clone this repository:
    git clone <your-repo-link-here>
    cd multimodal-content-generator
- Create and activate a virtual environment (optional but recommended):
    python -m venv venv
    source venv/bin/activate   # On Windows: venv\Scripts\activate
- Install the dependencies:
    pip install -r requirements.txt
- Set your environment variables by creating a .env file in the project root and adding:
    OPENAI_API_KEY=your_openai_api_key
    ELEVENLABS_API_KEY=your_elevenlabs_api_key
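Before launching the app, it can help to confirm that both keys are actually available. The sketch below uses only the standard library and is not part of the project. `parse_env_file` and `missing_keys` are hypothetical helpers, and the app itself may load the .env file differently (for example through a dotenv library).

```python
import os
from pathlib import Path

# The two keys named in the setup step above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ELEVENLABS_API_KEY")


def parse_env_file(path):
    """Parse simple KEY=VALUE lines; skip blank lines and # comments."""
    values = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env):
    """Return the required keys that are absent or empty in env."""
    return [k for k in REQUIRED_KEYS if not env.get(k)]


if __name__ == "__main__":
    env = dict(os.environ)
    if Path(".env").exists():
        # Real environment variables take precedence over the file.
        env = {**parse_env_file(".env"), **env}
    missing = missing_keys(env)
    print("Missing keys:", ", ".join(missing) if missing else "none")
```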
Run the app:
    streamlit run app.py
- Enter a creative prompt (e.g., "A dragon who learns to dance").
- Choose your preferred genre, tone, image style, and narration voice.
- Click Generate to create a story, an illustration, and a voiceover.
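To illustrate how the selections in the steps above could feed the story model, here is a minimal sketch that folds the prompt, genre, and tone into a single instruction string. `build_story_prompt`, its `max_words` parameter, and the example values "Fantasy" and "Whimsical" are all hypothetical; the template the app really uses is not shown in this README.

```python
def build_story_prompt(idea, genre, tone, max_words=300):
    """Combine the user's idea and selections into one instruction string."""
    idea = idea.strip()
    if not idea:
        raise ValueError("The creative prompt must not be empty.")
    return (
        f"Write a {tone.lower()} {genre.lower()} short story of at most "
        f"{max_words} words based on this idea: {idea}"
    )


if __name__ == "__main__":
    # Example values are illustrative, not the app's actual menu options.
    print(build_story_prompt("A dragon who learns to dance", "Fantasy", "Whimsical"))
```

Rejecting an empty prompt up front avoids sending a request that cannot produce a meaningful story.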
MIT License
- OpenAI for the GPT and image generation APIs
- ElevenLabs for their speech synthesis API
- Streamlit for its easy and elegant web UI
Built with ❤️ using AI technologies