I use AI tools daily, both through user interfaces and APIs, to assist with development, learning, content creation, and more. But every now and then, I get curious:
How would other language models answer the same question?
Would Claude be more concise than GPT? Would Gemini surface unique insights? Is DeepSeek any good?
That curiosity led me to build a small tool – just enough to scratch my itch without overcomplicating things.
The Idea
I wanted a script that could:
- Query multiple LLMs in parallel (OpenAI GPT, Claude, Gemini, Perplexity, DeepSeek),
- Use whichever models I have API keys for (configurable),
- Read my prompt from a file (or ask me if it’s missing),
- Save the answers in markdown inside my Obsidian notes,
- Be lightweight and flexible enough to expand later.
And that’s exactly what I built.
How It Works
Here’s how my Python script (ai_comparison_tool.py) works:
- It reads the question from a `question.txt` file. If it doesn't exist, it simply asks me for a prompt in the terminal.
- It then sends that prompt in parallel to all configured AI models (using `asyncio` and `aiohttp`).
- When all responses are collected, it creates a timestamped markdown file saved in my Obsidian vault.
- That markdown includes the question, the model names, responses, and a section for personal comparison notes.
The whole thing runs in the terminal. No UI, no browser tabs. Just clean, focused results.
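The fan-out step above can be sketched roughly like this. This is a minimal illustration of the pattern, not the actual script: the real version would make `aiohttp` POST requests to each provider, while here a stub coroutine stands in for the network call so the flow can run offline.

```python
# Sketch of querying several models in parallel with asyncio.
# query_model is a stand-in: the real implementation would POST to each
# provider's API with aiohttp and parse the JSON response.
import asyncio

async def query_model(name: str, prompt: str) -> tuple[str, str]:
    # Placeholder for an aiohttp request to the provider's endpoint.
    await asyncio.sleep(0)  # simulates network latency
    return name, f"[{name}] answer to: {prompt}"

async def ask_all(prompt: str, models: list[str]) -> dict[str, str]:
    # Fire all requests concurrently and collect (model, answer) pairs.
    results = await asyncio.gather(*(query_model(m, prompt) for m in models))
    return dict(results)

answers = asyncio.run(ask_all("What is onboarding?", ["Claude", "GPT-4o"]))
```

Because `asyncio.gather` runs all the coroutines concurrently, total wall time is roughly that of the slowest provider rather than the sum of all of them.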

Sample Use Case
Say I’m working on a tricky piece of copy or concept and want multiple perspectives. I drop my prompt in question.txt, run the script, and voilà – I get responses from Claude, GPT-4o, Gemini, and others neatly packaged in markdown.
For example:
```markdown
# AI Models Comparison

**Question:** How can a small SaaS team improve onboarding without hiring a full-time UX researcher?
**Date:** 2025-07-29 15:30:02

---

## Claude 3.7 Sonnet
...

## GPT-4o
...

## Google Gemini
...

...

## Notes and Comparison

### Main differences:
- Claude focused more on customer interviews.
- GPT-4o mentioned analytics tools like Hotjar.
- Gemini emphasized using AI-driven surveys.

### Best response:
- GPT-4o provided the most actionable insights.

### Conclusions:
- Combine all three strategies.
```
And here's how it looks in Obsidian:
Why Not Just Open Multiple Tabs?
Sure, I could open a few tabs in the browser and paste my question into each chat UI… but:
- I prefer terminal-based workflows.
- This integrates into my Obsidian-first knowledge system.
- It’s repeatable and fast (parallel async requests save time).
- I can version and compare markdown files over time.
That said, I may eventually build a browser-based comparison tool. And if I do – I’ll share it here!
Tech Stack
- Language: Python 3.11+
- Async: `asyncio`, `aiohttp`
- Environment: `.env` file with API keys
- Storage: Markdown in local Obsidian vault
- Models: GPT-4o, Claude 3.7, Gemini 1.5, Perplexity Sonar Pro, DeepSeek Chat
All models and endpoints are configurable in one dictionary, and you can easily enable/disable them.
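A configuration dictionary like that could look as follows. To be clear, the keys, endpoint URLs, and environment-variable names here are my assumptions for illustration, not the author's actual values:

```python
# Hypothetical shape of the single configuration dictionary: each entry
# names an endpoint, the env var holding its API key, and an on/off flag.
import os

MODELS = {
    "gpt-4o": {
        "endpoint": "https://api.openai.com/v1/chat/completions",
        "api_key_env": "OPENAI_API_KEY",
        "enabled": True,
    },
    "claude-3-7-sonnet": {
        "endpoint": "https://api.anthropic.com/v1/messages",
        "api_key_env": "ANTHROPIC_API_KEY",
        "enabled": False,  # flip to True to enable
    },
}

def active_models() -> list[str]:
    # Keep only entries that are enabled AND have their API key set,
    # so missing keys silently drop a provider instead of crashing.
    return [
        name for name, cfg in MODELS.items()
        if cfg["enabled"] and os.getenv(cfg["api_key_env"])
    ]
```

Gating on both the `enabled` flag and the presence of the key means the script only ever queries providers you have actually configured.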
Want to Try It?
If you’d like a copy of the script, drop me a message via email. It’s easy to adapt for your own note-taking setup or AI API use.
And if you’re already doing something similar – let’s compare workflows!


