A modern AI assistant featuring voice interaction, camera analysis, code generation, and device management. Built with Python and PyQt6, it provides an intuitive interface for AI-powered tasks.
AI Assistant in Action
Experience seamless interaction through voice commands, real-time responses, and intelligent assistance.
- Voice Interaction: Natural conversation with wake word activation
- AI Processing: Real-time responses powered by Gemini AI
- Smart Interface: Clean, modern UI with intuitive controls
- Quick Actions: Efficient command processing and execution
Feature | Description |
---|---|
Voice Commands | Wake word activation and voice response |
AI Processing | Real-time AI-powered interactions |
graph TD
A["AI Assistant"] --> B["Voice"]
A --> C["Camera"]
A --> D["Code"]
A --> E["Config"]
A --> F["Devices"]
A --> G["Apps"]
B --> B1(["Recognition"])
B --> B2(["TTS"])
B --> B3(["STT"])
C --> C1(["Analysis"])
C --> C2(["Upload"])
C --> C3(["Description"])
D --> D1(["Complete"])
D --> D2(["Highlight"])
E --> E1(["Settings"])
E --> E2(["API Keys"])
F --> F1(["Bluetooth"])
F --> F2(["Serial"])
F --> F3(["Wi-Fi"])
G --> G1(["Launch"])
G --> G2(["Commands"])
%% Color definitions
classDef root fill:#2c3e50,stroke:#2c3e50,color:#fff
classDef main fill:#3498db,stroke:#2980b9,color:#fff
classDef sub fill:#e8f4f8,stroke:#3498db,color:#2c3e50
%% Apply colors
class A root
class B,C,D,E,F,G main
class B1,B2,B3,C1,C2,C3,D1,D2,E1,E2,F1,F2,F3,G1,G2 sub
Category | Features |
---|---|
Voice | Wake word detection, Speech recognition, Text-to-speech, Noise reduction |
Camera | Real-time analysis, Image upload, Custom prompts, Gemini Vision |
Code | Smart completion, Syntax highlighting, Editor integration |
Config | API setup, Voice settings, UI preferences, Storage |
Devices | Bluetooth control, Port detection, Serial communication |
Apps | Custom commands, App launching, Command sequences |
- Wake Word Detection: Activate with "computer" using Picovoice Porcupine
- Speech Recognition: Accurate voice-to-text with noise reduction
- Text-to-Speech: Natural voice responses with multiple voice options
- Noise Reduction: WebRTC-based voice activity detection
- Real-time Analysis: Live camera feed processing
- Image Upload: Support for image file analysis
- Custom Prompts: Tailored image analysis queries
- Gemini Vision: Powered by Google's Gemini AI for image understanding
- Smart Completion: Context-aware code suggestions
- Syntax Highlighting: Clear code visualization
- Editor Integration: Custom editor configuration
- Code Simulation: Typing simulation for demonstrations
- API Configuration: Gemini and Picovoice API key management
- Voice Settings: Language and voice customization
- UI Preferences: Theme and display options
- Persistent Storage: Settings auto-save and recovery
- Bluetooth Control: Serial communication with devices
- Device Discovery: Automatic port detection
- Connection Management: Connect/disconnect functionality
- Custom Commands: Device-specific command handling
- Custom Commands: User-defined command sequences
- App Launching: Quick access to favorite applications
- Command Sequences: Multi-step automation
- Settings Persistence: Saved configurations across sessions
Wake Word (Picovoice) | Get Key | Free: Default wake words & basic usage |
AI Features (Gemini) | Get Key | Free: Gemini Pro with generous limits |
Add keys to .env:
PICOVOICE_API_KEY=xxxxx
GEMINI_API_KEY=xxxxx
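One way to read these keys at startup is sketched below using only the standard library. The `load_env` and `missing_keys` helpers and their parsing rules are assumptions for illustration; the project may load `.env` differently (for example, through a dedicated dotenv package).

```python
from pathlib import Path


def load_env(path=".env"):
    """Parse KEY=VALUE lines from a .env file into a dict (hypothetical helper)."""
    values = {}
    env_file = Path(path)
    if not env_file.exists():
        return values
    for raw_line in env_file.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=("PICOVOICE_API_KEY", "GEMINI_API_KEY")):
    """Return the required key names that are absent or empty."""
    return [name for name in required if not values.get(name)]
```

Calling `missing_keys(load_env())` before starting the voice or AI features gives a list of keys that still need to be filled in.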
- Python 3.8 or higher
- pip (Python package installer)
- Microphone (for voice features)
- Camera (for image analysis)
- Clone the repository:
git clone https://github.com/moego0/ai-assistant.git
cd ai-assistant
- Create and activate virtual environment:
# Windows
python -m venv venv
venv\Scripts\activate
# Linux/Mac
python3 -m venv venv
source venv/bin/activate
- Install dependencies:
pip install -r requirements.txt
- Configure API Keys:
  - Launch the application
  - Navigate to Settings
  - Add required API keys:
    - Gemini API key (AI features)
    - Porcupine key (wake word)
sequenceDiagram
participant User
participant Assistant
participant API
User->>Assistant: Say "Computer"
Assistant->>User: "Yes?"
User->>Assistant: Speak Command
Assistant->>API: Process Command
API->>Assistant: Response
Assistant->>User: Voice & Text Response
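The sequence above can be sketched as a single function. Every argument here is a hypothetical stand-in: `wait_for_wake_word`, `listen`, `ask_model`, `speak`, and `show` are passed in as callables so the flow can be shown without assuming how the application actually calls Porcupine, the speech recognizer, or Gemini.

```python
def run_interaction(wait_for_wake_word, listen, ask_model, speak, show):
    """Run one wake word -> command -> response cycle.

    All five arguments are hypothetical callables supplied by the caller.
    """
    wait_for_wake_word()        # blocks until the wake word is detected
    speak("Yes?")               # acknowledge the wake word
    command = listen()          # speech-to-text for the spoken command
    if not command:
        speak("Sorry, I didn't catch that.")
        return None
    reply = ask_model(command)  # send the command to the AI backend
    show(reply)                 # text response
    speak(reply)                # voice response
    return reply
```

Passing the steps in as callables keeps the flow testable with simple stubs, for example `listen=lambda: "what's the time?"` and `speak=print`.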
- Activate: Say "computer"
- Speak your command/question
- Receive voice and text response
- Access camera via camera icon
- Options:
  - Real-time analysis
  - Image upload
  - Custom analysis prompts
- Describe your code needs
- Get formatted, syntax-highlighted code
- Copy or save generated code
Category | Options |
---|---|
Voice | Gender (Male/Female) |
Language | English/Arabic/Bilingual |
API Keys | Gemini, Porcupine |
Model | Multiple Gemini models |
{
"voice_gender": "male",
"speech_language": "en-US",
"vad_aggressiveness": 3
}
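A minimal sketch of how such a settings object might be loaded and checked follows. The `DEFAULTS` dictionary mirrors the example above, and the accepted gender values come from the options table; the `load_settings` helper and its validation rules are assumptions, not the application's own code. The 0 to 3 range is the set of aggressiveness modes that WebRTC VAD defines.

```python
import json

# Defaults mirror the example settings shown above.
DEFAULTS = {
    "voice_gender": "male",
    "speech_language": "en-US",
    "vad_aggressiveness": 3,
}


def load_settings(text):
    """Merge a JSON settings string over DEFAULTS and validate the result."""
    settings = {**DEFAULTS, **json.loads(text)}
    if settings["voice_gender"] not in ("male", "female"):
        raise ValueError("voice_gender must be 'male' or 'female'")
    level = settings["vad_aggressiveness"]
    # WebRTC VAD modes run from 0 (least aggressive) to 3 (most aggressive).
    if isinstance(level, bool) or not isinstance(level, int) or not 0 <= level <= 3:
        raise ValueError("vad_aggressiveness must be an integer from 0 to 3")
    return settings
```

Merging over defaults means a partial file such as `{"voice_gender": "female"}` still yields a complete settings dictionary.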
Check requirements.txt for the full dependency list:
- PyQt6 >= 6.4.2
- google-generativeai >= 0.3.0
- SpeechRecognition >= 3.10.0
- And more...
ai-assistant/
├── AI_Assistant.py # Main application
├── requirements.txt # Dependencies
├── README.md # Documentation
├── LICENSE # MIT License
└── .gitignore # Git ignore rules
Contributions welcome! Please:
- Fork the repository
- Create feature branch
- Commit changes
- Push to branch
- Open pull request
This project is licensed under the MIT License. See the LICENSE file.
- Google Gemini - AI capabilities
- PyQt6 - UI framework
- Picovoice - Wake word detection
- Open source community
- Smart Lights (Philips Hue, LIFX, etc.)
- Smart Thermostats (Nest, Ecobee)
- Security Cameras
- Media Players (Smart TVs, Speakers)
- Smart Plugs and Switches
- Enable device discovery in Settings
- Connect to your home network
- Authorize devices
- Create device groups (optional)
- Custom Script Creation
- Scheduled Tasks
- Event-Based Triggers
- App Integration
- Voice Command Macros
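# Morning Routine Automation
# Note: automation, smart_lights, thermostat, and smart_plug are assumed
# to be objects provided by the application; they are not defined here.
@automation.schedule("07:00")
def morning_routine():
    # Turn on lights gradually
    smart_lights.fade_in(duration=300)
    # Set temperature
    thermostat.set_temperature(22)
    # Start coffee maker
    smart_plug.turn_on("coffee_maker")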
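The `automation.schedule` decorator used in the example above is not defined in this document. One way such a decorator could work is sketched below: a hypothetical `Automation` class that registers functions under an "HH:MM" time and runs whichever ones are due. The class, its `run_due` method, and the `log` list are all illustrative assumptions, not the project's implementation.

```python
from datetime import datetime


class Automation:
    """Hypothetical registry mapping "HH:MM" strings to scheduled functions."""

    def __init__(self):
        self._tasks = {}

    def schedule(self, at):
        # Validate and normalize the time; raises ValueError if malformed.
        key = datetime.strptime(at, "%H:%M").strftime("%H:%M")

        def decorator(func):
            self._tasks.setdefault(key, []).append(func)
            return func

        return decorator

    def run_due(self, now):
        """Call every function registered for the minute in `now`; return their names."""
        due = self._tasks.get(now.strftime("%H:%M"), [])
        for func in due:
            func()
        return [func.__name__ for func in due]


automation = Automation()
log = []


@automation.schedule("07:00")
def morning_routine():
    log.append("lights on")  # placeholder for real device calls


# A real application would call run_due periodically, e.g. once a minute.
automation.run_due(datetime(2024, 1, 1, 7, 0))
```

Keeping registration separate from execution lets the routine be triggered with any `datetime`, without waiting for the actual time of day.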
- Python 3.8 or higher
- Git
- Microphone (for voice features)
- Camera (optional, for image analysis)
git clone https://github.com/moego0/ai-assistant.git
cd ai-assistant
# Windows
python -m venv venv
venv\Scripts\activate
# Linux/macOS
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
- Picovoice (Wake Word Detection)
  - Visit Picovoice Console
  - Create a free account
  - Get your API key
  - Free tier includes:
    - Default wake words
    - Basic usage limits
- Google Gemini (AI Features)
  - Visit Google AI Studio
  - Sign in with your Google account
  - Get your API key
  - Free tier includes:
    - Access to Gemini Pro
    - Generous usage limits
- Create a .env file in the project root:
PICOVOICE_API_KEY=your_picovoice_key_here
GEMINI_API_KEY=your_gemini_key_here
- (Optional) Customize settings in editor_config.json
python AI_Assistant.py
- Say "computer" to activate
- Wait for the activation sound
- Speak your command
- Examples:
  - "What's the weather like?"
  - "Generate some Python code"
  - "Analyze this image"
- Click the camera icon to start
- Use "Analyze" for real-time analysis
- Upload images for detailed analysis
- Request code in natural language
- Use the code editor for modifications
- Save generated code to files
- Fork the repository
- Create your feature branch:
git checkout -b feature/AmazingFeature
- Commit your changes:
git commit -m 'Add some AmazingFeature'
- Push to the branch:
git push origin feature/AmazingFeature
- Open a Pull Request
- Microphone not working
  - Check microphone permissions
  - Select the correct input device in Settings
- Camera issues
  - Ensure camera permissions are granted
  - Check the camera connection
- API key errors
  - Get valid API keys from Picovoice Console and Google AI Studio
  - Check the API key format
  - Ensure free tier limits are not exceeded
- Open an issue on GitHub
- Check existing issues
- Include error messages and system info
To activate the assistant, simply say: "Computer"
Command | Description | Example |
---|---|---|
Open [application] | Launch applications | "Open Chrome", "Open Spotify" |
Search for [query] | Search the web | "Search for gold prices today" |
What's the time? | Get current time/date | "What's the time?" |
Type [text] | Type text via voice | "Type Hello World" |
Generate code | Generate code | "Generate code for a Python calculator" |
Control devices | Turn lights on and off | "open red light" |
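As an illustration of how recognized text could be routed to these commands, here is a small regex-based sketch. The intent names, the `parse_command` function, the `device_names` parameter, and the `ask_ai` fallback for unmatched input are all assumptions made for this example; the application's real dispatch logic may differ.

```python
import re

# Patterns follow the command table; matching is case-insensitive.
COMMAND_PATTERNS = [
    ("open_app", re.compile(r"^open\s+(?P<arg>.+)$", re.IGNORECASE)),
    ("search", re.compile(r"^search\s+for\s+(?P<arg>.+)$", re.IGNORECASE)),
    ("type_text", re.compile(r"^type\s+(?P<arg>.+)$", re.IGNORECASE)),
    ("generate_code", re.compile(r"^generate\s+code(?:\s+(?P<arg>.+))?$", re.IGNORECASE)),
    ("time", re.compile(r"^what'?s\s+the\s+time\??$", re.IGNORECASE)),
]
DEVICE_PATTERN = re.compile(r"^(?P<action>open|close)\s+(?P<arg>.+)$", re.IGNORECASE)


def parse_command(text, device_names=()):
    """Return an (intent, argument) pair for a recognized utterance.

    device_names is a hypothetical collection of lowercase device names;
    "open"/"close" followed by one of them is treated as a device command.
    """
    text = text.strip()
    device = DEVICE_PATTERN.match(text)
    if device and device.group("arg").lower() in device_names:
        return ("device_" + device.group("action").lower(), device.group("arg"))
    for intent, pattern in COMMAND_PATTERNS:
        match = pattern.match(text)
        if match:
            return (intent, match.groupdict().get("arg"))
    # Assumption: anything unmatched is passed to the AI model as a question.
    return ("ask_ai", text)
```

Checking device names before the generic "open" pattern is one way to keep "open red light" from being mistaken for an application launch.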
- Speak clearly and at a normal pace
- Wait for the wake word acknowledgment before giving a command
- Commands are case-insensitive
- For application names, use common names (e.g., "chrome" instead of "google chrome")
- Don't forget to initialize apps and devices before using them
- Use the Generate code command to generate code
- Use the Control devices command to turn lights on and off
- Use the Open [application] command to launch applications
- You can integrate this app with Arduino or Home Assistant to control your home and devices