cjhutto / vaderSentiment

VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains.

4,953 stars
1,062 forks
56 issues
Python

Repository Overview (README excerpt)

====================================
VADER-Sentiment-Analysis
====================================

VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is *specifically attuned to sentiments expressed in social media*. It is fully open-sourced under the MIT License (we sincerely appreciate all attributions and readily accept most contributions, but please don't hold us liable).

Features and Updates
------------------------------------

Many thanks to George Berry, Ewan Klein, and Pierpaolo Pantone for key contributions to make VADER better. The new updates include capabilities regarding:

#. Refactoring for Python 3 compatibility, improved modularity, and incorporation into NLTK ...many thanks to Ewan & Pierpaolo.
#. Restructuring for much improved speed/performance, reducing the time complexity from something like O(N^4) to O(N)...many thanks to George.
#. Simplified pip install and better support for vaderSentiment module and component import. (Dependency on vader_lexicon.txt file now uses automated file location discovery so you don't need to manually designate its location in the code, or copy the file into your executing code's directory.)
#. More complete demo in the __main__ for vaderSentiment.py.
   The demo has:

   * examples of typical use cases for sentiment analysis, including proper handling of sentences with:

     - typical negations (e.g., "*not* good")
     - use of contractions as negations (e.g., "*wasn't* very good")
     - conventional use of **punctuation** to signal increased sentiment intensity (e.g., "Good!!!")
     - conventional use of **word-shape** to signal emphasis (e.g., using ALL CAPS for words/phrases)
     - using **degree modifiers** to alter sentiment intensity (e.g., intensity *boosters* such as "very" and intensity *dampeners* such as "kind of")
     - understanding many **sentiment-laden slang** words (e.g., 'sux')
     - understanding many sentiment-laden **slang words as modifiers** such as 'uber' or 'friggin' or 'kinda'
     - understanding many sentiment-laden **emoticons** such as :) and :D
     - translating **utf-8 encoded emojis** such as 💘 and 💋 and 😁
     - understanding sentiment-laden **initialisms and acronyms** (for example: 'lol')

   * more examples of **tricky sentences** that confuse other sentiment analysis tools
   * example for how VADER can work in conjunction with NLTK to do **sentiment analysis on longer texts**...i.e., decomposing paragraphs, articles/reports/publications, or novels into sentence-level analyses
   * examples of a concept for assessing the sentiment of images, video, or other tagged **multimedia content**
   * if you have access to the Internet, the demo has an example of how VADER can work with analyzing sentiment of **texts in other languages** (non-English text sentences).

====================================
Introduction
====================================

This README file describes the dataset of the paper:

  | **VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text**
  | (by C.J. Hutto and Eric Gilbert)
  | Eighth International Conference on Weblogs and Social Media (ICWSM-14). Ann Arbor, MI, June 2014.

| For questions, please contact:
| C.J. Hutto
| Georgia Institute of Technology, Atlanta, GA 30032
| cjhutto [at] gatech [dot] edu

Citation Information
------------------------------------

If you use either the dataset or any of the VADER sentiment analysis tools (VADER sentiment lexicon or Python code for rule-based sentiment analysis engine) in your research, please cite the above paper. For example:

  **Hutto, C.J. & Gilbert, E.E. (2014). VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text. Eighth International Conference on Weblogs and Social Media (ICWSM-14). Ann Arbor, MI, June 2014.**

====================================
Installation
====================================

There are a couple of ways to install and use VADER sentiment:

#. The simplest is to use the command line to do an installation from PyPI using pip, e.g., ``> pip install vaderSentiment``
#. Or, you might already have VADER and simply need to upgrade to the latest version, e.g., ``> pip install --upgrade vaderSentiment``
#. You could also clone this GitHub repository
#. You could download and unzip the repository's zip file

In addition to the VADER sentiment analysis Python module, options 3 or 4 will also download all the additional resources and datasets (described below).

====================================
Resources and Dataset Descriptions
====================================

The package here includes **PRIMARY RESOURCES** (items 1-3) as well as additional **DATASETS AND TESTING RESOURCES** (items 4-12):

#. vader_icwsm2014_final.pdf

   The original paper for the data set, see citation information (above).

#. vader_lexicon.txt

   FORMAT: the file is tab delimited with TOKEN, MEAN-SENTIMENT-RATING, STANDARD DEVIATION, and RAW-HUMAN-SENTIMENT-RATINGS

   NOTE: The current algorithm makes immediate use of the first two elements (token and mean valence). The final two elements (SD and raw ratings) are provided for rigor.
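   The four-field, tab-delimited format described above could be read with a few lines of standard-library Python. The following is only a minimal sketch, not the code VADER itself uses: ``LexiconEntry``, ``parse_lexicon_line``, and ``is_consistent`` are hypothetical names, the sample line is made up for illustration, and two details are assumptions (that the raw ratings are stored as a bracketed, comma-separated list of integers, and that the stored standard deviation is the population SD).

```python
import statistics
from typing import List, NamedTuple


class LexiconEntry(NamedTuple):
    """One lexicon line: the four tab-delimited fields (hypothetical container)."""
    token: str
    mean_valence: float
    std_dev: float
    raw_ratings: List[int]


def parse_lexicon_line(line: str) -> LexiconEntry:
    """Split one tab-delimited lexicon line into its four fields."""
    token, mean, sd, raw = line.rstrip("\n").split("\t")
    # Assumption: the raw ratings field looks like "[2, 3, 2]".
    ratings = [int(r) for r in raw.strip("[] ").split(",")]
    return LexiconEntry(token, float(mean), float(sd), ratings)


def is_consistent(entry: LexiconEntry, tol: float = 1e-4) -> bool:
    """Check that the stored mean and SD agree with the raw ratings."""
    mean = statistics.fmean(entry.raw_ratings)
    # Assumption: the file stores the population standard deviation.
    sd = statistics.pstdev(entry.raw_ratings)
    return (abs(mean - entry.mean_valence) <= tol
            and abs(sd - entry.std_dev) <= tol)


# Hypothetical line for illustration only (not taken from vader_lexicon.txt).
sample = "exampleword\t2.5\t0.5\t[2, 3, 2, 3, 2, 3, 2, 3, 2, 3]"
entry = parse_lexicon_line(sample)
print(entry.token, entry.mean_valence, is_consistent(entry))
```

   Since only the token and mean valence are used for scoring, a reader that needs nothing else could keep just the first two fields. Parsing the SD and raw ratings is mainly useful for checking that an edited file is still internally consistent.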
   To follow the same rigorous process that we used for the study, find 10 independent human raters to evaluate each new token you want to add to the lexicon, make sure the standard deviation of their ratings doesn't exceed 2.5, and take the average rating as the valence. This will keep the file consistent.

   DESCRIPTION: Empirically validated by multiple independent human judges, VADER incorporates a "gold-standard" sentiment lexicon that is especially attuned to microblog-like contexts. The VADER sentiment lexicon is sensitive to both the **polarity** and the **intensity** of sentiments expressed in social media contexts, and is also generally applicable to sentiment analysis in other domains. Sentiment ratings were obtained from 10 independent human raters (all pre-screened, trained, and quality checked for optimal inter-rater reliability). Ove…