Mercury13 / unicodia

Encyclopedia of Unicode characters

Repository Overview (README excerpt)

# What is Unicodia?

It is a simple Unicode encyclopedia and the most comprehensive character map ever. Right now Windows only.

**Lifecycle phase: 5** (production/stable). Minor troubles with sustainability, but generally survived five releases of Unicode, 14.0 to 17.0.

**I’m in Ukraine torn with war, so I’ll release often.** See “war release” tag in Issues.

## Note about Egyptian font

It has been moved to a separate repo. Visit https://github.com/Mercury13/unicodia-sesh

## How to get portable

I was asked several times, but by that time it had already been portable. Open Unicodia.xml, it’s documented.

## Privacy policy

Unicodia does not collect data at all, but uses GitHub API for updating.

# How to translate?

- Ask the programmer to add localized buttons if needed. One button is international for now, A-Z, and it already has Cyrillic, Katakana and Chinese versions. The rest are unchangeable for now… until needed.
- Download Lang-src/en.uorig from this repo.
- If you are able to use Git, better use it. We’ll be able to work together on one translation.
- Put Unicodia in a writeable location.
- Create a language directory, edit locale.xml for that language.
- Download UTranslator. New → Translation of *.uorig.
- If you don’t know English, use another \*.utran file as a reference translation.
- After saving, UTranslator creates lang.xml. Put it into the language directory. Or use a symlinking tool to tie these files forever and avoid handwork.
- Press F12 in Unicodia to reload the translation without reopening the entire program.
- Warning, it reloads strings only; all locales are loaded on startup.
- When a new original arrives: File → Update data, Go → Find warnings → All.
- nspk template parameters: 1=language name (or script name, non-localizable), 2=pre-comment (e.g. synonym, localizable).
- If there’s no {{nspk}} in languages and there’s language data, default {{nspk}} is added automatically. So: {{nspk}} at the end → delete, it’ll be added! Need e.g. synonym → add {{nspk||=Klingon}}. Synonym is the SECOND parameter. See _Script.Mroo_ in English/Russian.
- To test alphabetic sorting, especially in troublesome languages like Japanese: press Ctrl+Shift+W and look into the Blocks drop-down list (does not work in Sort by tech name). There’s only one telltale, [1], shown when the 1st character does not belong to the sorting alphabet. These [1]’s are often mistakes and always deserve attention.

## Language policy

**Common.** No war jargon. Describe the 2022 war as neutrally as possible.

Every _lingua franca_ (English, Russian, French) in its international form.

Make examples as patriotic as possible for the language we’re writing in: the same letter is Russian and Ukrainian in the respective L10n’s. And English if the same phenomenon exists in the English language.

Apostrophe is U+2019.

Is **Old** in the front or in the back? It depends.

1. In Scripts: as convenient. In Blocks…
2. Old is the main word (Ancient symbols) → better front.
3. Auxiliary block (Old Sogdian, Ancient Greek) → no matter, we’ll find it anyway by looking around Greek.
4. Old is an adjective to something more important (Italic old, Mongolian old, Permic old) → better back.

It’s just about ease of finding a block in the long list of 300 blocks.

**English.** International: truck > lorry, petrol > gas. Prefer the British form if both are good. Punctuation around quotes is British/international: it’s inside quotes if it’s a part of “phrase being quoted”.

**Russian.** Ё is mandatory. No grammatical concessions to Ukrainian.

_(May apply to new languages as well.)_ Adjectives like _Georgian_ may agree with _script_ (_письменность_, feminine in Russian), or with _language_ (_язык_, masculine). The rules are…

- BLOCKS: strongly connected to language → agree with language _(грузинский=Georgian [language])_. Otherwise with script _(батакская=Batak [script])_.
- SCRIPTS: of course agree with script _(грузинская=Georgian [script])_.

**Ukrainian.** See Lang-src/Ukrainian.md.
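The common rule above, that the apostrophe is U+2019, can be checked mechanically before a translation is committed. Below is a minimal Python sketch; it is not part of Unicodia or UTranslator, and the function names are hypothetical. It assumes every U+0027 in a string is meant as an apostrophe, which will misfire wherever U+0027 is intentional (for example, in a sample of the character itself).

```python
# Hypothetical helper for translators; not taken from the Unicodia sources.
ASCII_APOSTROPHE = "\u0027"        # typewriter apostrophe
TYPOGRAPHIC_APOSTROPHE = "\u2019"  # right single quotation mark


def find_ascii_apostrophes(text: str) -> list[int]:
    """Return the index of every U+0027 found in text."""
    return [i for i, ch in enumerate(text) if ch == ASCII_APOSTROPHE]


def use_typographic_apostrophe(text: str) -> str:
    """Return text with every U+0027 replaced by U+2019."""
    return text.replace(ASCII_APOSTROPHE, TYPOGRAPHIC_APOSTROPHE)


if __name__ == "__main__":
    sample = "it doesn't matter"
    print(find_ascii_apostrophes(sample))      # positions to review by hand
    print(use_typographic_apostrophe(sample))  # suggested replacement
```

Reporting positions first and replacing second keeps the human in the loop: the list of indexes can be reviewed before any string is rewritten.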
**New languages.**

- As English uses lots of capital letters, translations to other languages may use small letters where English has capitals. Refer to Russian/Ukrainian for letter case.
- See the Russian script/language rule.

_About war jargon._ Open-source software with a neutral license and without special purpose (e.g. censorship circumvention) should be neutral. Period.

# How to build?

- Some C++20 and std::filesystem are used here, so you need either MSYS or a recent Qt with MinGW 11.
- You also need cURL (present in W10 18H2+), 7-zip, UTransCon, SvgCleaner.
- Configure and run the tape.bat file.
- Configure and run the rel.bat file.

# How to develop?

See develop.md.

# Compatibility and policies

## Platforms

**Win7/10/11 x64 only.** Rationale:

- WXP, WVista and W8 are completely abandoned by all imaginable software. Though I did some improvements specially for W8.
- No obstacles for x86, just untested because no one compiled Qt for x86.
- Recently checked Windows 11, and it works.

## Tofu/misrenderings

- **W10/11 should support everything possible, W7 just runs somehow.** At the time of testing still no BMP tofu, per old policy.
- Previously W7 supported the entire base plane and three important plane 1 scripts. I dropped that guarantee, though I did nothing against it, just did not test.
- Small misrenderings in descriptions are tolerable, I’ll fix them only if samples are bad, or if the font has other problems.

## Update Unicode

- Wartime: as soon as the base arrives and the release date is frozen, even at the alpha review stage.
- Peacetime (probably): stable release + some big font covering a major set arrives. Han too if the coverage is really high.
- Emergency releases of a few characters (e.g. currency, Japanese era): instantly, even if they are tofu.

## Fonts

Fonts are always updated to release versions. A font is updated to alpha/beta if it fixes a major misrender, and/or professionally implements a new character.

Naming: Noto if tables and existing glyphs are surely untouched; Uto otherwise.

These fonts are taken to Unicodia without author’s consent:

- Craggy font with missing/trivial tables. Example…