The Story Behind Aru AI
The Beginning
It's been a year since I decided to open new horizons for myself and dive into the world of development and real software architecture. I can't say I had never written code before or made utilities for myself and my team, but I looked at where the whole world was heading and started learning Python. By the end of the month, I had made bots using aiogram, and in March I built a project with an admin panel for creating entertainment bots with quotes and stories. By the way, it successfully went off to a buyer somewhere in Europe. It wasn't big money, but it motivated me well.
March 2025 brought a move to Almaty and a bunch of custom bot orders through Reddit. It was interesting, especially when students from India ordered quiz bots so they could study together.
While making various bots in March and April, I started paying attention to the job vacancies published here in Kazakhstan. Many of them involved automation through messenger bots.
I started looking into this and realized that while in Europe and the USA there is an unspoken division where WhatsApp is for family and friends and Telegram is for work and business, in Kazakhstan (and probably in some other countries) it's exactly the opposite.
Why is that? The answer turned out to be quite simple: there is a large supply of services from companies whose employees don't write code at all, but build bot logic and connect AI in node editors that look like block diagrams. And since the bot creators themselves do not understand the audience, businesses don't even know about the advantages of Telegram or how its audience differs from WhatsApp's.
Why do I dedicate so much time to bots and messengers in this article, instead of moving on to Aru Ai?
I realized that I needed to give people a tool that would clearly show the advantages of Telegram and at the same time be as easy as possible to set up, ideally with nothing but text descriptions. That's how the Promptly project was born. At the time of writing this post, there is information about it in the projects section, but it may be incomplete when you visit, because I'm completely rebuilding the project on a different stack and got stuck on the frontend.
During development, I came to the conclusion that the bot should not only be smart, but also understand the context of the conversation, as strange as it may sound. Besides, it needs to recognize commands to show products or draw inline buttons right in the chat. The solution? Semantic models.
When the first version of my project came out, it worked almost immediately. I made several custom bots, and now my original project is spinning somewhere on servers in India, Europe, and the USA. Then employers from Almaty noticed it and offered me the Myna project (there is also information about it on the projects page).
But there was one 'but'...
Proving ground for semantics
The project used Gemini as its brain. The choice was obvious: a wrapper was easy to write and, most importantly, there was a FREE TIER :)
With semantics, everything was complicated. I don't remember exactly, but for about two weeks I dug into the structure of semantic models deeply enough to compute cosine similarity between embeddings, and I tried various algorithms. At some point, running tests inside the project became inconvenient, and I decided to assemble a small test stand in the form of a separate chatbot.
From here on, all screenshots will unfortunately be in Russian. I apologize if you are reading this post in English or Kazakh, but the interface should be clear anyway.
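For readers who have not met cosine similarity before, here is a minimal sketch of how it can be used to match a message against known commands. This is not the code from the test stand: the function names, the threshold of 0.6, and the tiny hand-written vectors are all assumptions for illustration. In a real setup the vectors would come from a semantic (embedding) model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: a zero vector matches nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def best_intent(message_vec, intent_vecs, threshold=0.6):
    """Return the name of the closest intent, or None if nothing is close enough.

    `threshold` is a hypothetical cut-off; a real value has to be tuned.
    """
    best_name, best_score = None, threshold
    for name, vec in intent_vecs.items():
        score = cosine_similarity(message_vec, vec)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Toy 3-dimensional "embeddings" (hypothetical; real models use hundreds of dimensions).
INTENTS = {
    "show_products": [1.0, 0.0, 0.0],
    "small_talk": [0.0, 0.0, 1.0],
}

if __name__ == "__main__":
    print(best_intent([0.9, 0.1, 0.0], INTENTS))
```

The useful property is that cosine similarity ignores vector length and compares only direction, so a short and a long message about the same thing can still land close together.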

It didn't look like this right away, but you can already recognize the features of Aru as she is now. I don't know why I called it Promptly Local. Probably because I always had localhost in the terminal before my eyes.
At first everything was very simple, then came the SQLite connection and chat history storage. This way I could track progress between chats.
Eventually, I realized that I was spending much more time in this stand than in regular Gemini or ChatGPT. I asked everything there, even food recipes or a training plan for my running.
Another two weeks later, I made a decision: this would be a separate project. Whether anyone would need it or not, I didn't know. I just got carried away and set myself specific tasks.
Project awareness and tasks
Spitting on all the rules of architecture and product creation, I compiled the following list of tasks:
- Must run everywhere and work fast
- Must work on all devices
- Must have good semantics support
- Needs a good image and stylish design
- Needs translations into several languages
- Must have file handling
- Mandatory generation of artifacts
- Voice and sound - the assistant must speak and understand human speech
- The project must be secure and private
- Free, for everyone
I didn't have a clear plan for any of these points. There was only excitement and a desire to make a dream project.

The first thing I tackled was artifacts. This was very far from what Aru can do now: for some reason it seemed like a great idea to display content right in the chat, not on a canvas.
It worked poorly and looked terrible. But it seemed to me that the interface and user convenience were things that could be worked on later.

And since convenience could be postponed until "someday", I turned to something else: I noticed that, because of the cute and pleasant system prompts, the assistant sometimes answered using feminine grammatical forms.
I asked her to come up with a name for herself, something that would sound very beautiful and be connected to the Kazakh language and culture. It seems she took me too literally and came up with the name Aru. In Kazakh, it literally means beauty.
In the screenshot above, you can see that her real name is already in the header and there is a little fox. I decided it would be good to have a graphical representation right on the screen. I asked ChatGPT to generate a fox, but the result seemed too fairy-tale-like to me, so I drew a sketch myself on paper, then processed it with a neural network and got almost the current images. Back then the fox was simply displayed in the chat; you could turn it on with a paw icon button and drag it around the screen.

Then a library of saved artifacts appeared.
Later I added more images and wrote a very simple heuristic module (much, much simpler than the current one) that determined the mood of a message by keywords and showed the matching picture.
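A keyword heuristic like that can be sketched in a few lines. The keyword lists, mood names, and image file names below are hypothetical, since the post does not list the real ones; this only illustrates the idea of counting keyword hits per mood and falling back to a neutral picture.

```python
import re

# Hypothetical keyword lists and mood names, invented for this sketch.
MOOD_KEYWORDS = {
    "happy": {"great", "thanks", "awesome", "love"},
    "sad": {"sorry", "unfortunately", "failed", "sad"},
    "thinking": {"maybe", "perhaps", "consider", "hmm"},
}
DEFAULT_MOOD = "neutral"


def detect_mood(message):
    """Pick the mood whose keywords occur most often in the message."""
    words = re.findall(r"[a-z']+", message.lower())
    counts = {mood: 0 for mood in MOOD_KEYWORDS}
    for word in words:
        for mood, keywords in MOOD_KEYWORDS.items():
            if word in keywords:
                counts[mood] += 1
    # On a tie, max() keeps the mood that appears first in MOOD_KEYWORDS.
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else DEFAULT_MOOD


def sticker_for(message):
    """Map a message to an image file name (the naming scheme is an assumption)."""
    return f"fox_{detect_mood(message)}.png"


if __name__ == "__main__":
    print(sticker_for("Thanks, that was awesome!"))
```

The weakness of this approach is obvious: it knows nothing about negation or context, which is presumably why a more serious module was needed later.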

A voice also appeared, using PiperTTS, along with a player for listening to messages from Aru.

Interface translations appeared. Artifacts were still created right in the chat, and it all still looked terrible.
It was time to sum up: how many of the 10 tasks had I completed? I was very scared, because in reality the answer was zero...
The project was written in Python with Flask, which meant I needed my own server for Aru to work for everyone. I calculated the expenses and realized that it would cost serious money. The idea was that users would enter their own Gemini keys, but how would I store the databases? How would I encrypt them? How could people use their databases without Aru?

The terrible thing is that I had sunk two months into the project and hit a wall with ideas that turned out to be catastrophic. The screenshot shows that I tried to organize authorization through Google. But accumulated errors and incorrect architectural and engineering decisions led me to a painful but necessary thought: I would have to start from scratch.
If only I had known that I would have to do this at least five more times...
I studied all my mistakes and realized that the best solution would be to connect each user's own SQLite database at the beginning of the session and store the path to the database in the browser, thereby relieving myself of the responsibility for storing data. Nothing would be left on the server.
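Roughly, the idea looks like the sketch below: the application only opens whatever database file the user points it at when the session starts, and keeps chat history there. This is an illustration with Python's standard sqlite3 module, not the code of that version; the table layout and function names are assumptions.

```python
import sqlite3

# Hypothetical schema for chat history; the real layout is not described in the post.
SCHEMA = """
CREATE TABLE IF NOT EXISTS messages (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    chat_id TEXT NOT NULL,
    role TEXT NOT NULL,
    content TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
)
"""


def open_user_db(path):
    """Open (or create) the user's own database file at session start."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    conn.commit()
    return conn


def save_message(conn, chat_id, role, content):
    """Append one message to a chat; the `with` block commits on success."""
    with conn:
        conn.execute(
            "INSERT INTO messages (chat_id, role, content) VALUES (?, ?, ?)",
            (chat_id, role, content),
        )


def load_history(conn, chat_id):
    """Return the chat's messages in insertion order as (role, content) tuples."""
    cursor = conn.execute(
        "SELECT role, content FROM messages WHERE chat_id = ? ORDER BY id",
        (chat_id,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    # ":memory:" keeps the demo self-contained; a real session would use the user's file path.
    db = open_user_db(":memory:")
    save_message(db, "chat-1", "user", "Plan my running week")
    save_message(db, "chat-1", "assistant", "Sure, how many days can you train?")
    print(load_history(db, "chat-1"))
```

Because the file is an ordinary SQLite database, the user can open it with any SQLite tool, which is exactly the "use their databases without Aru" requirement.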

The project started over. I reworked the interface, and this time I wrote it myself instead of asking AI. Chasing interface trends led me to the thought that a glass interface was a great idea!
Now I look at it and realize what a fool I was...
This attempt had the shortest life cycle of all. Literally two weeks.
Many pitfalls appeared very quickly, along with even more bad engineering decisions. Without going too much into detail, the semantic model became the most serious stumbling block. In the setup I had come up with, it simply couldn't process multiple users simultaneously. Besides, I understood that the SaaS format most likely wasn't going to happen for me: voice, semantics, the backend, and file processing took too many resources on the server.

So I rolled back again to a clean project in VS Code. But there was a plus: the idea of a startup wizard with mode selection turned out to be the best idea to come out of all these bad implementations.
The biggest failure
I am surprised by my own stubbornness and persistence. In general, this is a trait of my temperament, which is probably why I became a track and field athlete with good (even excellent) results for an amateur. Once a goal is set, it must be achieved.
There were a few more attempts to start the project, but I quickly closed and deleted everything because I had too many doubts about each implementation.
Then I decided: since I can't take on the heaviest calculations myself, why not shift absolutely everything to the user? And in that case, why skimp on anything?
- A semantic model weighing 1.2 GB? Bring it here.
- A bunch of libraries, as if I were a beginner developer at the "pip install everything" level? Right here!
- TTS right on the user's side with the heaviest voices? Of course, bring it here!
And how would people run all this? At the time I thought: I'll open-source the code, and they'll figure out how to run python main.py. For those who can't, I'll make binaries for three operating systems and an installer. Let's go!

Maximum minimalism in the interface, minimum colors, everything very flat. A three-step wizard for creating a database, and all processing done on the user's device. No servers. Even the loading of the semantic model into the cache was shown on first launch. Oh, I was so glad back then at how well everything was turning out.
I was stupid and too reckless, forgive me for that. I don't do this with commercial projects; it was only with Aru. One person even said that I have "Aru of the brain".
From September 2025 to November 2025, I worked on this attempt to get the project running.

And actually, some of it even worked out.
You could choose any database, change settings, the correct stickers appeared under the messages, the artifacts library was divided into two categories - documents and apps.
I worked on Aru when I had time and didn't really sweat it. I thought it would be a success.
Disappointment didn't come right away...
Another mistake was that, since it was all Python code, some of the not-so-great algorithms simply migrated from one implementation to the next. When I decided that the project was nearing completion, I started on the packaging and was horrified.
Distribution weight... drumroll... 500 megabytes. With the semantic model added, the size reached almost 2 GB.
When packaging with PyInstaller, the best I achieved was a 250 megabyte installer. And it only worked well on Windows. It wouldn't work on Mac under any circumstances, because there you need to sign the application and the packaging is different. I barely stabilized the build for Linux and decided that I would do the build for Apple someday later. Postponed matters again...
For some reason, I didn't want to believe in failure. I wrote articles in three languages on Medium and even tried to make a page on Product Hunt.
10 minutes after launch, I got banned. A few months later I found out it was a mistake, but at that moment I was beside myself. I didn't try to figure it out right away because I was in despair. And honestly, I now understand that this accidental ban became the salvation of the project.
But then I fell into depression, literally. My financial situation was already not in the best shape, I was spending my time on Aru, and this version had again turned out to be a failure.

I was still trying to do at least something: I rewrote several modules and built a plugin system with its own manifests. Each plugin was worse than the last. For example, one of them allowed you... um... to watch TV with Aru. What for? Why did I think this was a good idea?
Don't write code or take on projects when you are struggling mentally :)
At some point I decided that the project would either be abandoned or deleted completely. I spent my days doing nothing. December came, and I just went to my hometown for a change of scenery.
The north of Kazakhstan. Frost near -40 degrees (this is the same in Celsius and Fahrenheit). A frozen river and knee-deep snowdrifts.
I walked the streets in that weather and thought: what do I do next?
The best idea is to start completely from scratch and abandon all old solutions
I looked at my list of tasks once again and realized that, again, I had not completed a single one. It should also be noted that in all versions I focused on Gemini in the hope that, if the project turned out to be interesting, I would add other providers later. And this very thought gave me no peace. Why Gemini? Why the same thing every time? Why Python at all?
My cry at realizing how stupid I had been was probably heard by the whole city, plus the other side of the river :)
I returned home, opened a new project in VS Code, opened my paper notebook, tore out all the pages where I had written about previous attempts to implement Aru, and really started from a blank slate.
New list of tasks:
- The project must be a real service, not a 2 GB installer.
- The project must run as fast as possible on any hardware, even an old phone.
- Semantics must be lightweight and load invisibly.
- The heuristic module must become a real part of the project, not just a sticker output.
- The design needs to be completely redone with a focus on user convenience, not on whatever happens to come out.
- Full and real support for at least three languages is needed, ideally - as many as possible.
- Working with databases should be as easy as possible.
- The user must choose which APIs to use, even if these are their personal local models run via Ollama.
- No Python and no complex libraries.
- Nothing should go to my server at all, not even files.
- Mandatory PWA application so that it works and launches everywhere.
- Maximum security for any age and clear settings for everyone.
- Free, without restrictions, even if my server goes down - the project must work without interruptions.
- The weight of the application and the entire project should be as small as possible.
- The user must have access to their data and database regardless of whether they use Aru or not.
What technical solutions could combine all these tasks?
Aru was completely, truly rewritten from scratch in vanilla JS. Working with SQLite is complex under the hood but still convenient for users. Transformers.js runs right in the browser. Users can choose any provider to connect to language models. There are convenient settings and sensible solutions so that people can use the project whether they are sitting on a browser tab, have installed the application, or have even launched Aru on their computer from the source code.
To this you can add adaptability to any device, a full-fledged PWA, and a pile of interface options that I went through to find truly convenient solutions. Canvas organization, beautiful design... and much more that you most likely know about, since you are reading this post. Full information is on the project page.
December, January, February 2026. It took me three months to study technologies and implementations that used to be a dark forest for me.
If earlier I worked on Aru when I had time, now I worked on Aru literally ALL THE TIME I HAD. I slept for 3-4 hours, read articles, watched tutorials, and pestered every AI that could somehow teach me and tell me what I could do and how.
I looked at all the projects I had done for other people and companies and asked myself a question: why was none of this ever in Aru?
Even such elementary things as a guide or information page, a license agreement, and a notice that Aru can make mistakes had never been there before, because I simply hadn't thought about them.

If you compare what it was with what it has become, it becomes clear: it's good that all the bad implementations turned out to be bad and never got publicity or popularity. It's good that it didn't work out then. The main thing is that it has worked out now.
Almost all the tasks I set are completed. Aru works anywhere, on anything, can connect to any provider, and if you install the PWA, it can literally work through localhost with Ollama on the user's computer.
Sensible separation of artifacts, added features like news and weather, beautiful design. And in the end, the weight of the application, or of the project as source code, is only 4 megabytes.
Good progress, from 2 GB to 4 MB? I think so. The semantic model is downloaded once into the cache and loaded into RAM from there on subsequent launches; it takes only 60 MB of memory and is almost invisible.
Finale
Aru is not perfect. There are downsides and bugs, and there is still work to do. There is a post about the roadmap in the blog, and almost every point on it already has a start and first tests.
Everything will definitely work out. Yes, there is still a lot of work ahead. First of all, we need to figure out how to make Aru work on the iPhone, because its support for the File System Access API is very limited. But there are already thoughts on how to beat this.
A stack that is difficult for me (but convenient for the user) allowed me to bring the dream project into shape. It may still be at the concept or prototype stage, but everything works exactly as I intended. The steps and tasks are completed, and those that are not will definitely be brought to a proper finish.
My advice: if you have come up with a dream project, do not expect it to work out right away. Making mistakes and starting from scratch is normal. Choosing solutions that are not the best and then redoing them is also normal.
The main thing is: don't give up. Strive, try, and achieve.
I am incredibly happy that I walked this path, almost a year long.
Hugs to everyone. Thanks for reading to the end.