A modular pipeline for searching and analyzing GitHub issues and providing recommendations for issue classification
This project provides an intelligent, multi-agent system for processing, analyzing, and managing GitHub issues at scale. It leverages LLMs, vector databases, and cloud-native Kubernetes infrastructure to automate search, triage, and enrichment of GitHub issues, supporting advanced workflows and integration using LangGraph agents.
The detailed implementation of the project can be found in the following blog.
- Multi-agent orchestration for issue processing
- Integration with GitHub, PostgreSQL, and Qdrant vector store
- Modular architecture for agents, guardrails, tools, and pipelines
- Infrastructure-as-code with AWS CDK and Kubernetes support
- GitHub Issues Multiagent Intelligence
├── LICENSE
├── Makefile
├── README.md
├── SETUP.md
├── alembic.ini
├── aws_cdk_infra
│ ├── README.md
│ ├── app.py
│ ├── aws_eks_rds
│ │ ├── __init__.py
│ │ ├── eks_stack.py
│ │ ├── rds_stack.py
│ │ └── vpc_stack.py
│ ├── requirements.txt
│ └── source.bat
├── docker
│ ├── dev.Dockerfile
│ ├── docker-compose.yml
│ └── prod.Dockerfile
├── env.example
├── kubernetes
│ ├── fastapi-deployment.yaml
│ ├── fastapi-service.yaml
│ ├── iam_policy.json
│ └── test-pod.yaml
├── langgraph.json
├── migrations
│ ├── README
│ ├── env.py
│ ├── script.py.mako
│ └── versions
│ └── 77e4d0a13aa8_create_comments_and_issues_table.py
├── pyproject.toml
├── scripts
│ └── lint-makefile.sh
├── src
│ ├── __init__.py
│ ├── agents
│ │ ├── __init__.py
│ │ ├── agents.py
│ │ ├── graph.py
│ │ └── graph_service.py
│ ├── api
│ │ ├── __init__.py
│ │ └── main.py
│ ├── config
│ │ ├── guardrails.yaml
│ │ └── repos.yaml
│ ├── data_pipeline
│ │ ├── __init__.py
│ │ ├── ingest_embeddings.py
│ │ └── ingest_raw_data.py
│ ├── database
│ │ ├── __init__.py
│ │ ├── drop_tables.py
│ │ ├── init_db.py
│ │ └── session.py
│ ├── models
│ │ ├── __init__.py
│ │ ├── agent_models.py
│ │ ├── api_model.py
│ │ ├── db_models.py
│ │ ├── github_models.py
│ │ ├── guardrails_models.py
│ │ └── repo_models.py
│ ├── utils
│ │ ├── __init__.py
│ │ ├── config.py
│ │ ├── error_handler.py
│ │ ├── guardrails.py
│ │ └── promps.py
│ └── vectorstore
│ ├── __init__.py
│ ├── create_collection.py
│ ├── create_index.py
│ ├── delete_collection.py
│ ├── payload_builder.py
│ ├── qdrant_store.py
│ └── qdrant_store_sync.py
├── tests
│ ├── integration
│ │ ├── test_api_process_issue.py
│ │ ├── test_full_graph_output_guardrails.py
│ │ └── test_query_search.py
│ └── unit
│ ├── test_db_ingest_qdrant.py
│ ├── test_input_guardrail_agent.py
│ ├── test_output_guardrail_agent.py
│ └── test_qdrant_collection.py
└── uv.lock
- Python 3.12+
- uv
- Docker & Docker Compose
- PostgreSQL
- Qdrant
- AWS CLI (for CDK)
- Node.js (for AWS CDK)
- Kubernetes CLI (kubectl)
- OpenAI API Key
- Guardrails AI API Key
- GitHub Token
git clone https://github.com/benitomartin/github-issues-multiagent-intelligence.git
cd github-issues-multiagent-intelligence
uv sync --all-groups
source ./.venv/bin/activate
Create two environment files, one for development and one for production:
cp env.example .env.dev
cp env.example .env.prod
The development mode runs on localhost. The production mode runs with RDS as the database and AWS EKS with Fargate for FastAPI.
Follow SETUP.md before running the commands below.
Start the database and supporting services, either in development or production mode:
make docker-build APP_ENV=dev
Access Adminer at http://localhost:8080.
Update the database schema:
alembic upgrade head
The src/config/repos.yaml file defines which repositories to pull issues from, how many issues to pull, and in which state (open, closed, or all).
- owner: scikit-learn
  repo: scikit-learn
  state: all
  per_page: 100
  max_pages: 1
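To make the paging fields concrete, here is a minimal sketch of how one entry could translate into requests against GitHub's REST "list repository issues" endpoint. The helper name `issue_page_urls` is hypothetical and the entry is written as a plain dict rather than loaded from YAML; the project's own ingestion code may build its requests differently.

```python
from urllib.parse import urlencode

# One entry shaped like the repos.yaml example (written inline for illustration).
entry = {
    "owner": "scikit-learn",
    "repo": "scikit-learn",
    "state": "all",
    "per_page": 100,
    "max_pages": 1,
}


def issue_page_urls(entry: dict) -> list[str]:
    """Build one GitHub REST API URL per page (hypothetical helper, not project code)."""
    base = f"https://api.github.com/repos/{entry['owner']}/{entry['repo']}/issues"
    urls = []
    # Pages are 1-indexed; max_pages caps how many pages are requested.
    for page in range(1, entry["max_pages"] + 1):
        query = urlencode(
            {"state": entry["state"], "per_page": entry["per_page"], "page": page}
        )
        urls.append(f"{base}?{query}")
    return urls


urls = issue_page_urls(entry)
for url in urls:
    print(url)
```

With these values a single request is made, so at most per_page * max_pages = 100 issues are fetched from the repository.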
The src/config/guardrails.yaml file configures the thresholds and failure actions for the Guardrails agents: jailbreak, toxicity, and secrets detection.
jailbreak:
  threshold: 0.8
  on_fail: "filter"
toxicity:
  threshold: 0.5
  validation_method: "full"
  on_fail: "filter"
secrets:
  on_fail: "filter"
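The sketch below shows one way to read these settings. It assumes that a check fails when its score is at or above the configured threshold, and that "filter" means the text is dropped. Both are assumptions made for illustration, `should_filter` is a hypothetical helper, and none of this is the Guardrails AI API.

```python
# The guardrails.yaml example mirrored as a plain dict (no YAML parsing here).
config = {
    "jailbreak": {"threshold": 0.8, "on_fail": "filter"},
    "toxicity": {"threshold": 0.5, "validation_method": "full", "on_fail": "filter"},
    "secrets": {"on_fail": "filter"},
}


def should_filter(check: str, score: float) -> bool:
    """Return True if a scored check would drop the text (assumed semantics)."""
    settings = config[check]
    threshold = settings.get("threshold")
    if threshold is None:
        # Checks without a threshold (secrets here) are not score-based in this sketch.
        raise ValueError(f"{check} has no threshold to compare against")
    # Assumption: a score at or above the threshold counts as a failure.
    return settings["on_fail"] == "filter" and score >= threshold


print(should_filter("toxicity", 0.6))
print(should_filter("jailbreak", 0.6))
```

Under these assumptions, lowering a threshold makes the corresponding check stricter: the same score of 0.6 is filtered by the toxicity setting but passes the jailbreak setting.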
Install dependencies in a separate virtual environment:
pip install -r aws_cdk_infra/requirements.txt
Deploy infrastructure:
- AWS EKS with Fargate and Load Balancer
- AWS RDS
- VPC
cd aws_cdk_infra
cdk bootstrap
cdk deploy
Run all tests (unit and integration):
make all-tests
Or run individual test suites.
The FastAPI server is defined in src/api/main.py.
Start the API server (example):
uvicorn src.api.main:app --reload
{
"title": "Test Issue",
"body": "Test Issue"
}
API docs are available at /docs when the server is running.
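As a rough illustration, the request body above can be sent from Python using only the standard library. The sketch assumes the server listens on uvicorn's default port 8000, uses the /process-issue route from the curl example later in this README, and assumes the response body is JSON. The helper names are hypothetical.

```python
import json
import urllib.request


def build_process_issue_request(base_url: str, title: str, body: str) -> urllib.request.Request:
    """Prepare a POST to /process-issue with a JSON payload (nothing is sent yet)."""
    payload = json.dumps({"title": title, "body": body}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/process-issue",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the response, assuming it is JSON."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))


# Assumes the API was started locally with the uvicorn command above.
request = build_process_issue_request("http://localhost:8000", "Test Issue", "Test Issue")
print(request.get_method(), request.full_url)
```

Calling `send(request)` performs the actual network call; it is kept separate so the request can be inspected before anything is sent.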
Kubernetes manifests are in the kubernetes directory. Once the CDK stack has been deployed, update the environment variables to match the deployed resources. Sensitive values can be found in AWS Secrets Manager.
Update your cluster configuration and add a new namespace:
aws eks --region <aws-region> update-kubeconfig --name <cluster-name>
kubectl create namespace my-app
Add the environment variables to the cluster as a ConfigMap and a Secret:
kubectl create configmap app-config \
--from-literal=APP_ENV=prod \
--from-literal=AWS_REGION= \
--from-literal=POSTGRES_DB= \
--from-literal=POSTGRES_PORT= \
--from-literal=ADMINER_PORT= \
--from-literal=ISSUES_TABLE_NAME= \
--from-literal=COMMENTS_TABLE_NAME= \
--from-literal=DENSE_MODEL_NAME= \
--from-literal=SPARSE_MODEL_NAME= \
--from-literal=LEN_EMBEDDINGS= \
--from-literal=COLLECTION_NAME= \
--from-literal=CHUNK_SIZE= \
--from-literal=BATCH_SIZE= \
--from-literal=CONCURRENT_COMMENTS= \
--from-literal=LLM_MODEL_NAME= \
--from-literal=TEMPERATURE= \
--from-literal=REPOS_CONFIG=src/config/repos.yaml \
--from-literal=GUARDRAILS_CONFIG=src/config/guardrails.yaml \
-n my-app
kubectl create secret generic app-secrets \
--from-literal=GH_TOKEN= \
--from-literal=POSTGRES_USER= \
--from-literal=POSTGRES_HOST= \
--from-literal=POSTGRES_PASSWORD= \
--from-literal=QDRANT_API_KEY= \
--from-literal=QDRANT_URL= \
--from-literal=LANGSMITH_API_KEY= \
--from-literal=OPENAI_API_KEY= \
--from-literal=GUARDRAILS_API_KEY= \
--from-literal=SECRET_NAME= \
-n my-app
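Because every value in the commands above is left blank, it is easy to deploy with a variable still empty. The following is a minimal pre-flight sketch, not part of the project, that reports which names are unset or blank in a given environment. `missing_variables` is a hypothetical helper, and the list only covers the secret keys shown above.

```python
import os
from collections.abc import Mapping

# Secret keys taken from the kubectl create secret command above.
REQUIRED_SECRETS = [
    "GH_TOKEN",
    "POSTGRES_USER",
    "POSTGRES_HOST",
    "POSTGRES_PASSWORD",
    "QDRANT_API_KEY",
    "QDRANT_URL",
    "LANGSMITH_API_KEY",
    "OPENAI_API_KEY",
    "GUARDRAILS_API_KEY",
    "SECRET_NAME",
]


def missing_variables(names: list[str], environ: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names that are absent or blank in the given environment."""
    return [name for name in names if not environ.get(name, "").strip()]


missing = missing_variables(REQUIRED_SECRETS)
if missing:
    print("Unset or empty:", ", ".join(missing))
else:
    print("All required secrets are set.")
```

Passing a plain dict as `environ` allows the same check to run against values parsed from a file such as .env.prod before they are sent to the cluster.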
Build the production image, then tag it and push it to AWS ECR:
aws ecr get-login-password --region <aws-region> | docker login --username AWS --password-stdin <aws-account-id>.dkr.ecr.<aws-region>.amazonaws.com
aws ecr create-repository --repository-name fastapi-app --region <aws-region>
docker tag myapp-prod-image:latest <aws-account-id>.dkr.ecr.<aws-region>.amazonaws.com/fastapi-app:latest
docker push <aws-account-id>.dkr.ecr.<aws-region>.amazonaws.com/fastapi-app:latest
Then adapt the image name in the deployment manifest and apply:
kubectl apply -f kubernetes/fastapi-deployment.yaml
Because the VPC is private, you cannot send requests to the service from your local machine. You can forward the port, create an EC2 instance in the same network, or add a load balancer to your Kubernetes cluster following these instructions.
If you choose the load balancer, apply its manifest:
kubectl apply -f kubernetes/fastapi-service.yaml
This exposes an external address that can be used to make requests:
curl -X POST "http://k8s-myapp-fastapie-96d739e92d-4d28b27c27683b40.elb.eu-central-1.amazonaws.com/process-issue" \
-H "Content-Type: application/json" \
-d '{
"title": "Test Issue",
"body": "Test Issue"
}'
This project is licensed under the MIT License. See the LICENSE file.