backend, api, devops, live

Compliant Internship Outreach Automation

A compliance-first, multi-agent Python pipeline that discovers Nigerian tech companies, finds public HR emails, and sends personalized internship outreach — controlled entirely via a Telegram bot.

The Problem

Manually researching companies, hunting down HR contacts, and sending individual outreach emails at scale is time-consuming and legally risky. Without compliance guardrails, bulk outreach can violate CAN-SPAM, GDPR, and Nigeria's NDPA — exposing the sender to legal liability and damaging deliverability.

My Approach

Built on a Directive-Orchestration-Execution (DOE) architecture: Directives are markdown SOPs defining agent behavior, an Orchestrator coordinates agent execution order and validates output contracts, and deterministic Python Execution scripts handle the actual work. Compliance is enforced at every layer — robots.txt is respected, rate limits are hard-coded, and a do-not-contact list is checked before every send.
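Two of the guardrails named above, the robots.txt check and the do-not-contact check, can be sketched with the standard library alone. This is a minimal illustration and not the project's actual code: the function names, the `USER_AGENT` string, and the choice to block whole domains as well as single addresses are all assumptions.

```python
from typing import Set
from urllib.robotparser import RobotFileParser

USER_AGENT = "outreach-research-bot"  # hypothetical user-agent string


def is_fetch_allowed(robots_txt: str, url: str, user_agent: str = USER_AGENT) -> bool:
    """Return True if the given robots.txt body permits fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def is_contact_allowed(email: str, do_not_contact: Set[str]) -> bool:
    """Return False if the address, or its domain, is on the do-not-contact list.

    Blocking by domain is an assumption made for this sketch.
    """
    address = email.strip().lower()
    domain = address.rsplit("@", 1)[-1]
    return address not in do_not_contact and domain not in do_not_contact
```

Passing the robots.txt body in as a string keeps the check deterministic and easy to test; fetching the file is left to the caller.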

Challenges & Solutions

The two hardest problems were (1) building a scraper that respects robots.txt and avoids login-walled data across dozens of company sites with inconsistent structures, and (2) implementing multi-tier rate limiting (15 emails/day, 5 emails/hour, 120-second minimum delay between sends) without blocking the main thread or the Telegram bot's async event loop.
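One way to combine the three limits without blocking the event loop is a sliding-window limiter that waits with `asyncio.sleep`. The sketch below uses the limits quoted above (15 per day, 5 per hour, 120 seconds between sends); the class name, method names, and in-memory timestamp storage are assumptions, and a real pipeline would probably reload past send times from its audit log after a restart.

```python
import asyncio
import time
from collections import deque


class SendRateLimiter:
    """Sliding-window limiter: daily cap, hourly cap, and a minimum gap between sends."""

    def __init__(self, per_day=15, per_hour=5, min_gap=120.0, clock=time.monotonic):
        self.per_day = per_day
        self.per_hour = per_hour
        self.min_gap = min_gap
        self.clock = clock      # injectable so tests can use a fake clock
        self.sent = deque()     # timestamps of sends within the last 24 hours

    def seconds_until_allowed(self) -> float:
        """Return how long to wait before the next send satisfies all three limits."""
        now = self.clock()
        while self.sent and now - self.sent[0] >= 86400:
            self.sent.popleft()  # forget sends older than a day
        waits = [0.0]
        if self.sent:
            waits.append(self.sent[-1] + self.min_gap - now)
        last_hour = [t for t in self.sent if now - t < 3600]
        if len(last_hour) >= self.per_hour:
            # The per_hour-th most recent send must age out of the hourly window.
            waits.append(last_hour[-self.per_hour] + 3600 - now)
        if len(self.sent) >= self.per_day:
            waits.append(self.sent[-self.per_day] + 86400 - now)
        return max(waits)

    def record_send(self) -> None:
        self.sent.append(self.clock())

    async def acquire(self) -> None:
        """Suspend this coroutine (not the event loop) until a send is permitted."""
        while (wait := self.seconds_until_allowed()) > 0:
            await asyncio.sleep(wait)
        self.record_send()
```

A send task would call `await limiter.acquire()` before each email. Because the wait is an `await`, other coroutines, such as a bot's command handlers, keep running in the meantime.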

Results & Impact

The initial send delivered 5 of 5 emails (100% delivery rate, 0% bounce rate). The database holds 89 companies, with 34 contacts extracted across industries. No compliance violations were logged.

Architecture Overview

Three-layer DOE architecture: Directives (markdown SOPs) define what each agent does, an Orchestrator sequences the pipeline (research → validate → contact discovery → personalization → outreach → logging), and Python Execution scripts carry out each step. A Compliance Layer runs checks before every send and halts the pipeline on any violation.
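The orchestration step described above can be illustrated with a small runner that executes stages in order, checks each stage's output against a required set of keys, and stops as soon as a compliance check raises. All names here (`run_pipeline`, `ContractError`, `ComplianceViolation`, `compliance_check`, and the state keys) are hypothetical, and the sketch assumes an output contract is just a list of required dictionary keys.

```python
from typing import Callable, Dict, List, Tuple


class ComplianceViolation(Exception):
    """Raised by a compliance check to halt the pipeline."""


class ContractError(Exception):
    """Raised when a stage's output is missing a required key."""


# (stage name, stage function, keys the stage must return)
Stage = Tuple[str, Callable[[Dict], Dict], List[str]]


def run_pipeline(stages: List[Stage], state: Dict) -> Tuple[Dict, List[str]]:
    """Run stages in order, validating each output before the next stage starts."""
    completed: List[str] = []
    for name, run, required_keys in stages:
        output = run(dict(state))  # shallow copy: a stage cannot rebind shared keys
        missing = [key for key in required_keys if key not in output]
        if missing:
            raise ContractError(f"{name} did not return {missing}")
        state = {**state, **output}
        completed.append(name)
    return state, completed


def compliance_check(state: Dict) -> Dict:
    """Hypothetical pre-send check: block any contact on the do-not-contact list."""
    blocked = set(state["contacts"]) & set(state["do_not_contact"])
    if blocked:
        raise ComplianceViolation(f"blocked contacts: {sorted(blocked)}")
    return {"cleared": True}
```

Because a violation is raised as an exception, no later stage (including the send) runs unless the caller explicitly catches it.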

Tech Stack

Python, SQLite, Gmail SMTP, Google Sheets API, Telegram Bot API, BeautifulSoup4, requests, DuckDuckGo Search, gspread, python-dotenv

API Showcase

BOT_COMMAND /start | /find | /extract | /send | /stop (auth)

Primary control interface for the pipeline. Authenticated via ALLOWED_USER_ID — unauthorized users are silently rejected.

Outreach Bot Ready
 Authorized

Commands:
/find - Discover companies
/extract - Extract contacts
/send - Send outreach emails
/stop - Emergency halt
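The silent-rejection behavior can be sketched independently of any Telegram library: compare the sender's ID with `ALLOWED_USER_ID` and return no reply at all on a mismatch. The function names and the placeholder replies are assumptions; only the variable name `ALLOWED_USER_ID` and the command list come from the description above.

```python
import os
from typing import Mapping, Optional

# Command descriptions as listed in the bot's help text.
COMMANDS = {
    "/find": "Discover companies",
    "/extract": "Extract contacts",
    "/send": "Send outreach emails",
    "/stop": "Emergency halt",
}


def load_allowed_user_id(env: Mapping[str, str] = os.environ) -> Optional[int]:
    """Read ALLOWED_USER_ID from the environment; None if unset or malformed."""
    raw = env.get("ALLOWED_USER_ID", "").strip()
    return int(raw) if raw.isdigit() else None


def handle_command(sender_id: int, command: str, allowed_user_id: Optional[int]) -> Optional[str]:
    """Return reply text, or None to stay silent for unauthorized senders."""
    if allowed_user_id is None or sender_id != allowed_user_id:
        return None  # silent rejection: no error message is sent back
    if command == "/start":
        lines = ["Outreach Bot Ready", "Authorized", "", "Commands:"]
        lines += [f"{name} - {text}" for name, text in COMMANDS.items()]
        return "\n".join(lines)
    # Placeholder replies for illustration; the real handlers would start pipeline work.
    return COMMANDS.get(command, "Unknown command")
```

Failing closed when `ALLOWED_USER_ID` is missing means a misconfigured deployment answers nobody instead of everybody.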

CLI python main.py pipeline --industry <industry>

Runs the full research → contact extraction → email send sequence for a given industry.
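A command line of that shape could be parsed with `argparse` roughly as follows. This is a sketch, not the project's `main.py`: the `build_parser` helper and the help strings are assumptions, and only the `pipeline` subcommand and `--industry` flag come from the command shown above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for: python main.py pipeline --industry <industry>"""
    parser = argparse.ArgumentParser(prog="main.py")
    subcommands = parser.add_subparsers(dest="command", required=True)
    pipeline = subcommands.add_parser(
        "pipeline", help="run research, contact extraction, and email send"
    )
    pipeline.add_argument("--industry", required=True, help="industry to target")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"would run the {args.command} sequence for industry: {args.industry}")
```

Making `--industry` required is a design assumption; it prevents an accidental run across every industry in the database.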

POST /internal/sync-sheets

Syncs the sent_log.json audit trail to the Outreach Log and Summary tabs in Google Sheets.
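The sync can be pictured as two steps: turn the log entries into rows, then write them to each tab. The sketch below assumes `sent_log.json` is a JSON array of objects with `company`, `email`, `status`, and `sent_at` keys, which is a hypothetical schema. The worksheet arguments are expected to behave like gspread `Worksheet` objects, of which only the `clear()` and `append_rows()` methods are used.

```python
import json
from collections import Counter
from typing import Dict, List

LOG_COLUMNS = ["company", "email", "status", "sent_at"]  # hypothetical schema


def load_log(path: str = "sent_log.json") -> List[Dict]:
    """Load the audit trail, assumed here to be a JSON array of objects."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def log_rows(entries: List[Dict]) -> List[List]:
    """Header row plus one row per log entry, for the Outreach Log tab."""
    return [LOG_COLUMNS] + [[str(e.get(c, "")) for c in LOG_COLUMNS] for e in entries]


def summary_rows(entries: List[Dict]) -> List[List]:
    """Count entries per status, for the Summary tab."""
    counts = Counter(e.get("status", "unknown") for e in entries)
    return [["status", "count"]] + [[s, n] for s, n in sorted(counts.items())]


def sync_sheets(entries: List[Dict], log_ws, summary_ws) -> None:
    """Rewrite both tabs from the log so the sheet always mirrors the file."""
    for worksheet, rows in ((log_ws, log_rows(entries)), (summary_ws, summary_rows(entries))):
        worksheet.clear()
        worksheet.append_rows(rows)
```

Clearing and rewriting each tab keeps the sync idempotent, at the cost of re-uploading the whole log on every call.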

Deployment

Platform: Local / VPS (DigitalOcean or Hetzner)

Docker: No