huridocs / uwazi
Uwazi is a web-based, open-source solution for building and sharing document collections
Repository Overview (README excerpt)
Uwazi is a flexible database application to capture and organise collections of information with a particular focus on document management. HURIDOCS started Uwazi and is supporting dozens of human rights organisations globally to use the tool.

Uwazi | HURIDOCS

Read the user guide

Installation guide
• Dependencies
• Production
• Development

Dependencies
Before anything else you will need to install the application dependencies:
• **NodeJs 20.19.6** For ease of update, use nvm.
• **ElasticSearch 8.18.0** Please note that ElasticSearch requires Java. Follow the instructions to install the package manually. You will probably also need to disable the ml module in the ElasticSearch config file:
• **ICU Analysis Plugin (recommended)** Adds support for number sorting in texts and solves other language sorting nuances. This option is activated by setting the env var USE_ELASTIC_ICU=true before running the server (defaults to false/unset).
• **MongoDB 7.0.24** The MongoDB installation needs to be configured as a Replica Set. It can be a single-node replica set, but the Replica Set must be initialized. If you have a previous version installed, please follow the instructions on how to upgrade here.
• **mongosh** The new mongosh dependency needs to be added.
• **Yarn 4+** The project uses Yarn 4.13.0 (see in package.json). Run before your first so Node uses the correct version. Alternatively, npm 10+ is supported if you prefer to switch.
• **pdftotext (Poppler)** Tested to work on version 22.12, but it's recommended to use the latest available for your platform. Make sure to **install libjpeg-dev** if you build from source.

Production
Install/upgrade procedure

Development
If you want to use the latest development code:
There may be an issue with pngquant not running correctly. If you encounter this issue, you are probably missing the library **libpng-dev**.
Please run:

Docker
Infrastructure dependencies (ElasticSearch, ICU Analysis Plugin, MongoDB, Redis and Minio (S3 storage)) can be installed and run via Docker Compose. The ElasticSearch container will claim 2 GB of memory, so be sure your Docker Engine is allotted at least 3 GB of memory (for Mac and Windows users).

Development Run
This will launch a webpack server and nodemon app server for hot reloading any changes you make.

Webpack server
This will launch a webpack server. You can also pass to get detailed info on the webpack build.

Storybook
For component development and documentation:
Cypress component tests use Storybook stories.

Testing

Unit and Integration tests
We test using the Jest framework (built on top of Jasmine). To run the unit and integration tests, execute:
This will run the entire test suite, both on server and client apps.
Some suites need MongoDB configured in Replica Set mode to run properly. The provided Docker Compose file runs MongoDB in Replica Set mode and initializes the cluster automatically. If you are using your own mongo installation, refer to MongoDB's documentation for more information.
There are also Cypress component tests. It's recommended that Cypress tests are run with Chrome or Chrome-based browsers. You can run individual tests with the Cypress UI:
or you can run tests in headless mode:

End-to-End testing (e2e)
Running end-to-end tests requires a running Uwazi app. For End-to-End testing, we have a full set of fixtures that test the overall functionality. **It's not advised to run these tests on production environments**, since an incorrectly configured run can have unwanted effects on the production database.
Note that if you already have an instance running, this will likely throw an error about ports already being in use. Only one instance of Uwazi may be run on the same port at the same time.
The Uwazi app needs to run on a specific database and with a specific ElasticSearch index.
This is configured via environment variables when starting the application.

Running tests with Puppeteer (legacy)
Start Uwazi:
On a different console tab, run:
This will trigger a run of all the Puppeteer tests. You can run tests individually:

Running tests with Cypress
Start Uwazi:
On a different console tab, run:
This will open the Cypress interface where you can select the tests to run. It's recommended that Cypress tests are run with Chrome or Chrome-based browsers. You can run tests in headless mode, and run individual suites via:
Cypress tests that use our Information Extraction features need to run Uwazi together with a dummy service that mimics the external services needed for the features. To run these tests you also need to add the following environment variables when running Uwazi:

Default login
The application's default login is admin / change this password now
Note the subtle nudge ;)

System Requirements
• For big files with a small database footprint (such as video, audio and images) you'll need more disk space than CPU or RAM
• For text documents you should plan for a decent amount of RAM, as ElasticSearch is pretty greedy on memory for full text search

The bare minimum you need to be able to run Uwazi on-prem without bottlenecks is:
• 4 GB of RAM (reserve 2 for Elastic and 2 for everything else)
• 2 CPU cores
• 20 GB of disk space

For development:
• 8 GB of RAM (depending on whether the services are running)
• 4 CPU cores
• 20 GB of disk space
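
The minimum and development figures above can be turned into a quick pre-flight check before installing. The following is a minimal sketch in Python and is not part of Uwazi: the names `Requirements`, `shortfalls` and `detect_host` are hypothetical, the disk figure is interpreted as free space on the current volume (an assumption), and RAM detection relies on POSIX `os.sysconf`, so it reports "could not be determined" on platforms such as Windows.

```python
import os
import shutil
from dataclasses import dataclass
from typing import List, Optional, Tuple

GIB = 1024 ** 3


@dataclass(frozen=True)
class Requirements:
    ram_gb: float
    cpu_cores: int
    disk_gb: float


# Figures copied from the System Requirements lists above.
MINIMUM = Requirements(ram_gb=4, cpu_cores=2, disk_gb=20)
DEVELOPMENT = Requirements(ram_gb=8, cpu_cores=4, disk_gb=20)


def shortfalls(ram_gb: Optional[float], cpu_cores: int, disk_gb: float,
               required: Requirements) -> List[str]:
    """Return one message per resource that falls below `required`."""
    problems = []
    if ram_gb is None:
        problems.append("RAM: could not be determined on this platform")
    elif ram_gb < required.ram_gb:
        problems.append(f"RAM: {ram_gb:.1f} GB available, {required.ram_gb} GB required")
    if cpu_cores < required.cpu_cores:
        problems.append(f"CPU: {cpu_cores} cores available, {required.cpu_cores} required")
    if disk_gb < required.disk_gb:
        problems.append(f"Disk: {disk_gb:.1f} GB free, {required.disk_gb} GB required")
    return problems


def detect_host(path: str = ".") -> Tuple[Optional[float], int, float]:
    """Best-effort detection of RAM, CPU cores and free disk space."""
    try:
        # Note: a machine sold as "4 GB" may report slightly less usable memory.
        ram_gb = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / GIB
    except (AttributeError, ValueError, OSError):
        ram_gb = None  # e.g. Windows, where os.sysconf is unavailable
    cpu_cores = os.cpu_count() or 0
    disk_gb = shutil.disk_usage(path).free / GIB  # assumption: free space on `path`
    return ram_gb, cpu_cores, disk_gb


if __name__ == "__main__":
    ram, cpus, disk = detect_host()
    for name, profile in (("minimum", MINIMUM), ("development", DEVELOPMENT)):
        issues = shortfalls(ram, cpus, disk, profile)
        print(f"{name}: {'OK' if not issues else '; '.join(issues)}")
```

Keeping the comparison in `shortfalls` separate from `detect_host` means the thresholds can be checked against figures from a remote server or a hosting plan, not only the local machine.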