CustomerInsight is an interactive analytics platform that uses NLP and machine learning to extract actionable insights from customer reviews. Built with Streamlit, it provides real-time sentiment analysis, keyword extraction, topic modeling, and anomaly detection — optimized for both Chinese and English text.
- Sentiment Analysis — BERT-based sentiment classification with confidence scores. Uses `roberta-base-finetuned-jd-binary-chinese` for Chinese and `bert-base-multilingual-uncased-sentiment` for English.
- Keyword Extraction — TF-IDF keyword extraction with word cloud visualization, trend analysis, and rating-based comparison.
- Topic Modeling — LDA and K-Means clustering with topic network graphs, heatmaps, and trend tracking.
- Anomaly Detection — Isolation Forest-based outlier detection with multi-feature analysis (rating, text length, sentiment).
- Interactive Filtering — Filter by date range, rating, text length, and keywords with real-time updates.
- Data Export — Download filtered data and analysis results as CSV.
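
To make the keyword extraction feature more concrete, here is a minimal from-scratch sketch of TF-IDF scoring using only the standard library. It is an illustration, not the project's implementation: the function names `tokenize` and `top_keywords` are hypothetical, the tokenizer handles English only (Chinese text would need a segmenter such as jieba), and the smoothed IDF formula is one common variant chosen here as an assumption.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep runs of ASCII letters (English-only simplification)."""
    return re.findall(r"[a-z]+", text.lower())


def top_keywords(reviews, k=3):
    """Return the k highest-scoring TF-IDF terms for each review (hypothetical helper)."""
    docs = [tokenize(review) for review in reviews]
    n_docs = len(docs)
    # Document frequency: the number of reviews that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    results = []
    for doc in docs:
        tf = Counter(doc)
        # Term frequency times a smoothed inverse document frequency.
        scores = {
            term: (count / len(doc)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        }
        # Sort by descending score, breaking ties alphabetically for stable output.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        results.append([term for term, _ in ranked[:k]])
    return results


reviews = [
    "battery battery life is great",
    "screen is great",
    "delivery is slow",
]
print(top_keywords(reviews, k=2))
```

Terms that appear in every review (such as "is") receive the lowest IDF weight, so words that are distinctive to a single review rank first.
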

```bash
# Clone the repository
git clone https://github.com/ChanMeng666/customer-insight.git
cd customer-insight
# Install dependencies
pip install -r requirements.txt
# Run the application
streamlit run app.py
```

Then open http://localhost:8501 and upload the sample dataset from `data/example_dataset.csv`.

| Column | Type | Required | Description |
|---|---|---|---|
| `timestamp` | datetime | Yes | Review timestamp |
| `content` | string | Yes | Review text |
| `rating` | float | No | Rating (1-5 scale) |
| `category` | string | No | Review category |
| `user_id` | string | No | User identifier |

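
Before uploading a file, it can help to check it against the schema above. The following is a rough sketch of such a check using only the standard library. The function name `validate_reviews` is hypothetical, and the sketch assumes ISO 8601 timestamps, which the schema does not specify; the application's own loader may accept other formats.

```python
import csv
import io
from datetime import datetime

REQUIRED_COLUMNS = {"timestamp", "content"}


def validate_reviews(csv_text):
    """Return (valid_rows, errors) for CSV text that follows the schema above."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing required columns: {sorted(missing)}")

    rows, errors = [], []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            # Assumption: timestamps are ISO 8601, e.g. "2024-01-05 10:30:00".
            row["timestamp"] = datetime.fromisoformat(row["timestamp"] or "")
            if not (row["content"] or "").strip():
                raise ValueError("empty content")
            rating = row.get("rating")
            if rating:
                row["rating"] = float(rating)
                if not 1 <= row["rating"] <= 5:
                    raise ValueError("rating outside the 1-5 scale")
            else:
                row["rating"] = None  # rating is optional
        except ValueError as exc:
            errors.append((line_no, str(exc)))
            continue
        rows.append(row)
    return rows, errors


sample = (
    "timestamp,content,rating\n"
    "2024-01-05 10:30:00,Great phone,5\n"
    "not-a-date,Bad,2\n"
    "2024-01-06 09:00:00,Okay,9\n"
    "2024-01-07 08:00:00,No rating given,\n"
)
rows, errors = validate_reviews(sample)
print(len(rows), "valid rows;", "rejected lines:", [line for line, _ in errors])
```

Rows that fail a check are reported with their line number rather than silently dropped, which makes a malformed export easier to fix.
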

```mermaid
graph TD
A[CSV/Excel Upload] --> B[Data Processor]
B --> C[Text Cleaning]
C --> D{Analysis Engine}
D --> E[Sentiment Analyzer]
D --> F[Keyword Analyzer]
D --> G[Topic Analyzer]
D --> H[Insight Analyzer]
E --> I[Visualization Layer]
F --> I
G --> I
H --> I
I --> J[Streamlit Dashboard]
```

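
The fan-out shape of the diagram (clean once, then pass the same cleaned text to several analyzers and collect their results) can be sketched as follows. Everything here is hypothetical: `clean_text`, `run_pipeline`, and the two stub analyzers are illustrative stand-ins, not the project's actual classes or cleaning rules.

```python
import re


def clean_text(text):
    """Strip URLs and collapse whitespace (a stand-in for the Text Cleaning step)."""
    text = re.sub(r"https?://\S+", "", text)
    return re.sub(r"\s+", " ", text).strip()


# Hypothetical analyzers: each takes the cleaned reviews and returns a result dict.
def count_stub(reviews):
    return {"count": len(reviews)}


def length_stub(reviews):
    total = sum(len(review) for review in reviews)
    return {"avg_length": total / len(reviews) if reviews else 0.0}


def run_pipeline(raw_reviews, analyzers):
    """Clean every review once, then hand the same list to each analyzer."""
    cleaned = [text for text in (clean_text(raw) for raw in raw_reviews) if text]
    # Results are keyed by analyzer name so a display layer can pick what it needs.
    return {name: analyze(cleaned) for name, analyze in analyzers.items()}


results = run_pipeline(
    ["  Great   phone http://x.co/a ", "", "Slow\ndelivery"],
    {"count": count_stub, "length": length_stub},
)
print(results)
```

Keeping the analyzers behind a common "list of reviews in, dict out" interface means a new analysis can be added without touching the cleaning step or the other analyzers.
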
- Framework: Streamlit · Language: Python 3.9+
- NLP & ML: Transformers · PyTorch · jieba · NLTK · scikit-learn · Sentence Transformers
- Visualization: Plotly · Matplotlib · WordCloud · NetworkX
- Data: Pandas · NumPy · SciPy

See CONTRIBUTING.md for the development guide.

