An AI-powered browser automation tool built with Next.js and Gemini 2.0 Vision AI. Transform natural language into browser automation with visual understanding.
- Convert natural language to browser automation steps
- Advanced screenshot analysis and visual understanding
- Intelligent element detection with fallback strategies
- Automatic screenshot capture with metadata
- Real-time progress tracking
- Clean, minimalist interface
- Secure API key management
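
The "fallback strategies" idea in the feature list can be illustrated with a minimal sketch. It assumes each lookup strategy is an async function that returns an element or `null`. The names `Strategy`, `Match`, and `findWithFallback` are hypothetical and are not taken from the project source.

```typescript
// Hypothetical sketch: try several element-lookup strategies in order.
// `Finder` stands in for whatever lookup the real code performs
// (a CSS selector query, a text match, a vision-based guess, ...).
type Finder<T> = () => Promise<T | null>;

interface Strategy<T> {
  name: string; // label used for logging and debugging
  find: Finder<T>;
}

interface Match<T> {
  strategy: string; // name of the strategy that succeeded
  element: T;
}

async function findWithFallback<T>(
  strategies: Strategy<T>[],
): Promise<Match<T> | null> {
  for (const { name, find } of strategies) {
    try {
      const element = await find();
      if (element !== null) {
        return { strategy: name, element };
      }
    } catch {
      // A strategy that throws is treated like a miss; try the next one.
    }
  }
  return null; // every strategy missed
}
```

With Playwright, one plausible finder would wrap `page.locator(selector)` and return the locator only when `count()` is greater than zero. That wiring is an assumption here, not a description of how the project does it.
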
- Gemini API Key from Google AI Studio
- Node.js 18 or higher
- Clone the repository:

      git clone https://github.com/razee4315/browser-automation-agent.git
      cd browser-automation-agent

- Install dependencies:

      npm install

- (Optional) Create .env.local with your API key:

      GEMINI_API_KEY=your_gemini_api_key_here

  Note: The app will prompt for your API key on first use if it is not set in the environment.

- Start the development server:

      npm run dev

  Open http://localhost:3000 to view the app.

- For a production build:

      npm run build
      npm start

- Get your API key from Google AI Studio
- Enter it in the app when prompted
Variable | Description | Required |
---|---|---|
GEMINI_API_KEY | Your Gemini API key | Optional* |
*API key can be entered in the app interface
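
Since the key may come from the environment or from the in-app prompt, the lookup order can be sketched as below. This is a minimal illustration that assumes the environment variable wins when both are present. `resolveApiKey` and its return shape are hypothetical, not the project's actual API.

```typescript
// Hypothetical helper: not taken from the project source.
// Assumption: GEMINI_API_KEY from the environment takes precedence,
// and a key entered in the app is used only when it is absent.
type KeySource = "environment" | "user" | "none";

interface ResolvedKey {
  key: string | null;
  source: KeySource;
}

function resolveApiKey(
  env: Record<string, string | undefined>,
  userKey?: string | null,
): ResolvedKey {
  // Blank or whitespace-only values count as "not set".
  const fromEnv = env["GEMINI_API_KEY"]?.trim();
  if (fromEnv) return { key: fromEnv, source: "environment" };

  const fromUser = userKey?.trim();
  if (fromUser) return { key: fromUser, source: "user" };

  // Nothing available: the caller should prompt the user.
  return { key: null, source: "none" };
}
```

On the server this could be called as `resolveApiKey(process.env, keyFromRequest)`, where `keyFromRequest` is a placeholder for whatever value the UI sends.
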
src/
├── app/                        # Next.js App Router
│   ├── api/                    # API routes
│   └── page.tsx                # Main page
├── components/                 # React components
│   ├── ApiKeySetup.tsx         # API key auth
│   ├── AutomationForm.tsx      # User input
│   ├── AutomationStatus.tsx    # Progress
│   └── AutomationResults.tsx   # Results
└── lib/                        # Utilities
    ├── browser.ts              # Playwright
    ├── gemini.ts               # Gemini AI
    └── debug-helpers.ts        # Debugging
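
The layout above suggests a hand-off from the Gemini module to the Playwright module. One way such a hand-off could work is for the model to return a JSON array of steps that is validated before anything touches the browser. The sketch below assumes that design. The `AutomationStep` shape and the action names are hypothetical, and the project's real step format may differ.

```typescript
// Hypothetical step format; not taken from the project source.
type AutomationStep =
  | { action: "navigate"; url: string }
  | { action: "click"; target: string }
  | { action: "type"; target: string; text: string }
  | { action: "screenshot" };

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Validate one untrusted value and narrow it to a known step.
function parseStep(raw: unknown): AutomationStep {
  if (!isRecord(raw)) throw new Error("step must be an object");
  const { action, url, target, text } = raw;
  switch (action) {
    case "navigate":
      if (typeof url === "string") return { action, url };
      break;
    case "click":
      if (typeof target === "string") return { action, target };
      break;
    case "type":
      if (typeof target === "string" && typeof text === "string") {
        return { action, target, text };
      }
      break;
    case "screenshot":
      return { action };
  }
  throw new Error(`invalid step: ${JSON.stringify(raw)}`);
}

// Parse the model's raw text output into validated steps.
function parseSteps(modelOutput: string): AutomationStep[] {
  const data: unknown = JSON.parse(modelOutput);
  if (!Array.isArray(data)) throw new Error("expected a JSON array of steps");
  return data.map((item) => parseStep(item));
}
```

Rejecting malformed steps up front keeps unexpected model output from reaching the browser layer.
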
- API keys stored in browser localStorage
- No data collection
- HTTPS required
- Client-side processing
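
Storing the key in `localStorage` could look roughly like the sketch below. It is written against a small interface so it can also run outside a browser. The storage key name `gemini_api_key` and the helper names are hypothetical, not taken from the project source.

```typescript
// Minimal subset of the browser Storage interface that the helpers need.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// Hypothetical storage key; the project may use a different name.
const STORAGE_KEY = "gemini_api_key";

function saveApiKey(store: KeyValueStore, apiKey: string): void {
  const trimmed = apiKey.trim();
  if (!trimmed) throw new Error("API key must not be empty");
  store.setItem(STORAGE_KEY, trimmed);
}

function loadApiKey(store: KeyValueStore): string | null {
  return store.getItem(STORAGE_KEY);
}

function clearApiKey(store: KeyValueStore): void {
  store.removeItem(STORAGE_KEY);
}
```

In the browser, `window.localStorage` satisfies this interface, so `saveApiKey(window.localStorage, key)` would work as written.
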
- Chrome/Chromium
- Firefox
- Safari
- Edge
Contributions are welcome! Please read our Contributing Guidelines.
- Fork the repo
- Create a feature branch
- Commit your changes
- Push and open a Pull Request
Saqlain Abbas
Developer
GitHub | Email
Aleena Tahir
Developer
GitHub | Email
This project is licensed under the MIT License - see the LICENSE file for details.
For support, email us at saqlainrazee@gmail.com
⭐ Star this repo if you find it useful!