All I wanted was to speak to my computer: to tell it "read my screen and reply to this message," or "I can't find this, use my mouse cursor and click it."
AI & I did it (Opus). That dream ate a year of my life.
I have dyslexia and ADHD; every email, every Slack message, every document can be a fight against my own brain. I needed something that could hear me think out loud 24/7, and I wanted it to be private. Nothing existed that did this, so I started building. This is the way, as I keep hearing these days.
I went with the name CODEC and secured opencodec.org for $7 a year.
I have open-sourced this project to share my approach with fellow developers, along with my vision of what AI can do.
CODEC is an intelligent framework that turns your Mac into a voice-controlled AI workstation. You give it a brain (any LLM, local or cloud; I run Qwen 3.5 35B-A3B 4-bit via MLX on a Mac Studio M1 Ultra with 64 GB RAM), ears (Whisper), a voice (Kokoro), and eyes (a vision model). Four ingredients. The rest is Python.
From there, it listens, it sees your screen, it speaks back, it controls your apps, it writes code, it drafts your messages, it researches topics, it manages your Google Workspace, and when it doesn't know how to do something, it writes its own plugin and learns it. BYOB vibe.
I aimed for advanced privacy and security while navigating the learning curve of what is achievable.
No cloud. No subscription. No data leaving your machine, none. MIT licensed.
Your voice. Your computer. Your rules. No limit.
There are seven product frames in total.
CODEC Core — The Command Layer
You can have it always on: just say "Hey Codec." F13 turns it on. F18: press down, speak, release, and your voice note is sent. F16 for direct text.
I wanted CODEC to stand out, so I went after a feature with layered context and direct action. It goes like this: hands free, you speak up: "Hey Codec, look at my screen and reply to ***, let's say ***."
CODEC reads your screen, sees the conversation, writes a contextual response, and pastes it into the text field. Once I succeeded in doing so, I knew that with today's tools, all you need is the idea and the time to bring it to life.
CODEC Core is connected to 50+ skills that fire instantly without touching the LLM. Calculator, weather, timers, Spotify, volume, Apple Notes, Apple Reminders, Google Calendar, Gmail, Drive, Docs, Sheets, Slides, Tasks, Chrome automation, web search, clipboard, app switching, and more.
CODEC Dictate — Hold, Speak, Paste
Hold the right Cmd key. Say what you mean. Release. Text appears wherever your cursor is. If CODEC detects you're drafting a message, it runs the text through the LLM first: grammar fixed, tone polished, meaning preserved. Works in every app on macOS. A free, open-source SuperWhisper replacement that runs entirely on your machine.
CODEC Instant — One Right-Click
Select any text, anywhere. Right-click. Proofread. Elevate. Explain. Prompt. Translate. Reply. Save & Read Aloud. Eight services, system-wide, powered by your own LLM. Every manipulation reduced to a single click.
CODEC Chat — 250K Context + 12 Agent Crews
Full conversational reasoning AI running on your hardware. Long context. File uploads. Image analysis via vision model. Conversation history. Web browsing.
CODEC Agents is a multi-agent framework in less than 800 lines. Zero dependencies. No CrewAI. No LangChain. Twelve specialized crews that go out, research, and come back with results:
Each crew hands back an extensive, well-presented report, illustrated with multiple images. Deep Research. Daily Briefing. Trip Planner. Competitor Analysis. Email Handler. Social Media Manager. Code Reviewer. Data Analyst. Content Writer. Meeting Summarizer. Invoice Generator. Project Manager.
You say "research the latest AI agent frameworks and write a report." A handful of minutes later there's a formatted Google Doc in your Drive with sources, analysis, and recommendations. Local inference. Zero cloud costs.
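An agent framework really can fit in a few hundred dependency-free lines: an agent is a role plus a step function, and a crew is a pipeline that threads accumulated context through them. A toy sketch under those assumptions (not CODEC's actual classes):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Agent:
    role: str
    step: Callable[[str], str]  # context in, contribution out

class Crew:
    def __init__(self, *agents: Agent, max_steps: int = 8):
        # max_steps mirrors the 8-step execution cap mentioned above
        self.agents, self.max_steps = agents, max_steps

    def run(self, task: str) -> str:
        context = task
        for agent in self.agents[: self.max_steps]:
            context = f"{context}\n[{agent.role}] {agent.step(context)}"
        return context

# Stub steps; in practice each would call the local LLM with a role prompt.
researcher = Agent("researcher", lambda c: "found 3 frameworks")
writer = Agent("writer", lambda c: "drafted report with sources")
report = Crew(researcher, writer).run("research AI agent frameworks")
print(report)
```

Each agent sees everything produced so far, so a writer naturally builds on the researcher's findings without any message-bus machinery, which is how the line count stays small.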
CODEC Vibe — AI Coding IDE + Skill Forge
Split-screen in your browser. Monaco editor on the left (same engine as VS Code). AI chat on the right. Describe what you want and CODEC writes it; you click Apply, run it, and, if you want, save it as a new CODEC skill with one click. Point your cursor at what needs fixing, and errors reload automatically.
Skill Forge takes it further. Bring skills on board from OpenClaw or Claude Code, or just describe what you want in plain English. CODEC converts it into a working plugin. The framework writes its own extensions.
CODEC Voice — Live Voice Calls
Real-time voice-to-voice conversations with your AI. A custom WebSocket pipeline replacing tools like Pipecat. No external dependencies. You call CODEC from your phone, talk naturally, and mid-call you say "check my screen, can you see ***". A screenshot is taken and shared live; CODEC actually runs the query and speaks the result back. Try that with Siri.
Full transcript saved to memory. Every conversation becomes searchable context for future sessions.
CODEC Overview — Your Mac in Your Pocket
Private dashboard accessible from your phone, anywhere in the world. Cloudflare Tunnel with Zero Trust authentication. Send commands, upload files, view your screen, launch voice calls — all from a browser on your phone. No VPN app. No port forwarding. No third-party relay.
Five Security Layers
This isn't a toy. It has system access. So security isn't optional.
- Cloudflare Zero Trust — email whitelist
- PIN code login
- Touch ID biometric authentication
- Two-factor authentication (2FA)
- AES-256 end-to-end encryption: every byte encrypted in the browser before it touches the network. Cloudflare sees noise. You can also deploy over Tailscale, without needing a domain.
On top of that: command preview enforcement (Allow/Deny before every bash command), a dangerous-command blocker (30+ patterns), a full audit log, an 8-step execution cap on agents, a wake-word noise filter, and a code sandbox with timeout.
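The dangerous-command blocker is essentially regex screening before anything reaches bash, combined with the Allow/Deny preview. A minimal sketch (these five patterns are illustrative stand-ins, not CODEC's actual 30+ list):

```python
import re

# A few classic destructive patterns; a real list would be much longer.
DANGEROUS = [
    r"rm\s+-rf\s+/(?:\s|$)",        # wipe the filesystem root
    r"\bmkfs(\.\w+)?\b",            # reformat a disk
    r":\(\)\s*\{.*\};\s*:",         # fork bomb
    r"\bdd\b.*of=/dev/",            # raw-write to a device
    r"curl[^|]*\|\s*(sudo\s+)?sh",  # pipe the internet into a shell
]

def is_blocked(cmd: str) -> bool:
    """Return True if the command matches a known-dangerous pattern."""
    return any(re.search(p, cmd) for p in DANGEROUS)

def preview(cmd: str) -> str:
    """Allow/Deny gate: block outright, otherwise surface for confirmation."""
    if is_blocked(cmd):
        return f"DENIED: {cmd}"
    return f"CONFIRM? {cmd}"  # user still sees Allow/Deny before execution

print(preview("rm -rf /"))  # -> DENIED: rm -rf /
print(preview("ls -la"))    # -> CONFIRM? ls -la
```

Pattern lists are a blunt instrument (they can be evaded with quoting or variables), which is exactly why the interactive Allow/Deny preview sits in front of every command rather than trusting the blocker alone.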
The Privacy Argument
Hey Siri. Hey Alexa. Where do those commands go? Through someone else's servers, hardware, and databases. Go figure how they trained their models ;)
CODEC's data stays right here, in your local database. FTS5 full-text search over every conversation you've ever had with it: searchable, readable, private, yours. That's not a feature. That's the point.
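FTS5 ships with the sqlite3 module in most standard CPython builds, so searchable conversation memory needs no vector database at all. A minimal sketch (the table schema and names are my assumption, not CODEC's actual one):

```python
import sqlite3

db = sqlite3.connect(":memory:")  # CODEC would use a file on disk
db.execute("CREATE VIRTUAL TABLE memory USING fts5(role, text)")
db.executemany(
    "INSERT INTO memory VALUES (?, ?)",
    [
        ("user", "remind me about the dentist appointment friday"),
        ("assistant", "reminder set for friday 9am"),
        ("user", "draft a slack reply about the release"),
    ],
)

def recall(query: str) -> list[str]:
    """Rank past conversation turns by BM25 relevance to the query."""
    rows = db.execute(
        "SELECT text FROM memory WHERE memory MATCH ? ORDER BY rank", (query,)
    )
    return [text for (text,) in rows]

print(recall("dentist"))
```

FTS5's built-in BM25 ranking (the hidden `rank` column) gives relevance ordering for free, which is the "simpler, faster" trade against embeddings: no index to build, no model to run, just SQL.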
Every idea and new feature I kept adding to CODEC originally started out depending on an established tool, progressively replaced with native code:
- Pipecat → CODEC Voice (own WebSocket pipeline)
- CrewAI + LangChain → CODEC Agents (795 lines, zero dependencies)
- SuperWhisper → CODEC Dictate (free, open source)
- Cursor / Windsurf → CODEC Vibe (Monaco + AI + Skill Forge)
- Google Assistant / Siri → CODEC Core (actually controls your computer)
- Grammarly → CODEC Assist (right-click services via your own LLM)
- ChatGPT → CODEC Chat (250K context, fully local)
- Cloud LLM APIs → local stack (Qwen + Whisper + Kokoro + Vision)
- Vector databases → FTS5 SQLite (simpler, faster)
- Telegram bot relay → direct webhook (no middleman)
External services: DuckDuckGo for web search and the Cloudflare free tier for the tunnel (or Tailscale). Everything else is your hardware, your models, your code.
What You Need
- A Mac (Ventura or later)
- Python 3.10+
- An LLM (Ollama, LM Studio, MLX, OpenAI, Anthropic, Gemini — anything OpenAI-compatible)
- Whisper for voice input, Kokoro for voice output, a vision model for screen reading

```shell
git clone https://github.com/AVADSA25/codec.git
cd codec
pip3 install pynput sounddevice soundfile numpy requests simple-term-menu
brew install sox
python3 setup_codec.py
python3 codec.py
```
The setup wizard handles everything in 8 steps.
The Numbers
- 7 product frames
- 50+ skills
- 12 agent crews
- 250K token context
- 5 security layers
- 70+ GitHub stars in 5 days
GitHub: https://github.com/AVADSA25/codec
Site: https://opencodec.org
Enterprise setup: https://avadigital.ai
Star it. Clone it. Rip it. Make it yours.
Mickael Farina