Extras
Optional helpers for data pipelines: comment validation, course code normalization, and sentiment analysis.
```typescript
import { normalizeComment, isValidComment, cleanCourseLabel, buildCourseMapping, analyzeSentiment } from "ratemyprofessors-client/extras";
```
Helpers
normalizeComment
Normalizes a comment for comparison or deduplication. Trims whitespace, strips HTML tags (opt-out), lowercases, and collapses runs of whitespace. Optionally strips punctuation for looser matching.
| Parameter | Type | Default | Description |
|---|---|---|---|
| text | string | — | Comment text |
| options.stripHtml | boolean | true | Remove HTML tags |
| options.stripPunctuation | boolean | false | Remove all punctuation |
```typescript
const a = normalizeComment(" Great Professor! ");
const b = normalizeComment("great professor!");
console.log(a === b); // true

normalizeComment("<b>Loved</b> this class"); // "loved this class"
normalizeComment("Hello, world!", { stripPunctuation: true }); // "hello world"
```
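To make the normalization steps concrete, here is a minimal standalone sketch of the behavior described above, followed by a small deduplication helper built on it. This is not the library's source: `sketchNormalizeComment` and `dedupeComments` are hypothetical names, the tag-stripping regex is deliberately naive, and the punctuation filter is ASCII-only for simplicity.

```typescript
// Illustrative sketch only; the real normalizeComment may differ in detail.
interface NormalizeOptions {
  stripHtml?: boolean;        // default true, as in the parameter table
  stripPunctuation?: boolean; // default false, as in the parameter table
}

function sketchNormalizeComment(text: string, options: NormalizeOptions = {}): string {
  const { stripHtml = true, stripPunctuation = false } = options;
  let out = text;
  if (stripHtml) {
    // Naive tag removal: replace anything shaped like <...> with a space
    // so that "line<br>break" does not fuse into one word.
    out = out.replace(/<[^>]*>/g, " ");
  }
  out = out.toLowerCase();
  if (stripPunctuation) {
    // Assumption: keep only ASCII letters, digits, and whitespace.
    out = out.replace(/[^a-z0-9\s]/g, "");
  }
  // Collapse runs of whitespace and trim both ends.
  return out.replace(/\s+/g, " ").trim();
}

// Keep the first occurrence of each comment, comparing normalized forms.
// Quadratic in the number of comments, which is fine for a small batch.
function dedupeComments(comments: string[]): string[] {
  const keys = comments.map((c) => sketchNormalizeComment(c));
  return comments.filter((_, i) => keys.indexOf(keys[i]) === i);
}
```

In a real pipeline, the library's own `normalizeComment` would take the place of the sketch inside `dedupeComments`.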
isValidComment
Validates a comment and returns detailed diagnostics. Checks for empty text, insufficient length, all-caps, excessive repeated characters, and absence of alphabetic characters.
| Parameter | Type | Default | Description |
|---|---|---|---|
| text | string | — | Comment text |
| minLen | number | 10 | Minimum character length |
Returns: ValidationResult with valid (boolean) and issues (array of CommentIssue).
Each issue has a code ("empty", "too_short", "all_caps", "excessive_repeats", "no_alpha") and a human-readable message.
```typescript
isValidComment("Good");
// { valid: false, issues: [{ code: "too_short", message: "Comment is 4 chars (minimum 10)" }] }

isValidComment("Great class, learned a lot");
// { valid: true, issues: [] }

isValidComment("WORST PROF EVER!!!");
// { valid: false, issues: [{ code: "all_caps", message: "Comment is all uppercase" }] }
```
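The checks listed above could be combined roughly as follows. This is a sketch under stated assumptions, not the library's implementation: `sketchIsValidComment` is a hypothetical name, the "excessive repeats" threshold (five identical characters in a row) and the ASCII-only letter test are guesses, and only the `too_short` and `all_caps` messages are taken from the examples above.

```typescript
type IssueCode = "empty" | "too_short" | "all_caps" | "excessive_repeats" | "no_alpha";

interface CommentIssue {
  code: IssueCode;
  message: string;
}

interface ValidationResult {
  valid: boolean;
  issues: CommentIssue[];
}

function sketchIsValidComment(text: string, minLen = 10): ValidationResult {
  const issues: CommentIssue[] = [];
  const trimmed = text.trim();

  // An empty comment cannot fail any other check meaningfully, so stop here.
  if (trimmed.length === 0) {
    issues.push({ code: "empty", message: "Comment is empty" });
    return { valid: false, issues };
  }

  if (trimmed.length < minLen) {
    issues.push({
      code: "too_short",
      message: `Comment is ${trimmed.length} chars (minimum ${minLen})`,
    });
  }

  // Assumption: only ASCII letters count as alphabetic.
  const letters = trimmed.replace(/[^a-zA-Z]/g, "");
  if (letters.length === 0) {
    issues.push({ code: "no_alpha", message: "Comment has no alphabetic characters" });
  } else if (letters === letters.toUpperCase()) {
    issues.push({ code: "all_caps", message: "Comment is all uppercase" });
  }

  // Assumption: five or more identical characters in a row is "excessive".
  if (/(.)\1{4,}/.test(trimmed)) {
    issues.push({ code: "excessive_repeats", message: "Comment has excessive repeated characters" });
  }

  return { valid: issues.length === 0, issues };
}
```

Returning every issue rather than stopping at the first one lets a pipeline log all the reasons a comment was rejected.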
Course Codes
cleanCourseLabel
Cleans a scraped course label by removing parenthesized counts and normalizing whitespace.
```typescript
cleanCourseLabel("CS 101 (42)"); // "CS 101"
```
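As a rough illustration of the cleaning step, the sketch below removes parenthesized numeric counts with a regular expression and then collapses whitespace. `sketchCleanCourseLabel` is a hypothetical stand-in; whether the library also strips non-numeric parentheticals is not stated here, so this version only targets digits.

```typescript
// Illustrative sketch; assumes counts appear as digits inside parentheses.
function sketchCleanCourseLabel(label: string): string {
  return label
    .replace(/\(\s*\d+\s*\)/g, " ") // drop "(42)"-style counts
    .replace(/\s+/g, " ")           // collapse whitespace runs
    .trim();
}
```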
buildCourseMapping
Maps scraped course labels to valid course codes.
| Parameter | Type | Description |
|---|---|---|
| scrapedLabels | Iterable<string> | Labels scraped from RMP |
| validCourses | Iterable<string> | Known valid course codes |
```typescript
const mapping = buildCourseMapping(["CS 101 (42)", "MATH200"], ["CS101", "CS102", "MATH200"]);
```
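The matching strategy and return shape of `buildCourseMapping` are not specified above, so the following is only one plausible approach, with every detail an assumption: labels and codes are compared after removing counts and whitespace and uppercasing, the result is a plain object keyed by the original scraped label, and unmatched labels map to `null`. The sketch takes arrays for simplicity, and `courseKey` and `sketchBuildCourseMapping` are hypothetical names.

```typescript
// Assumed comparison key: strip "(42)"-style counts, remove all whitespace, uppercase.
function courseKey(label: string): string {
  return label
    .replace(/\(\s*\d+\s*\)/g, "")
    .replace(/\s+/g, "")
    .toUpperCase();
}

// Hypothetical mapping: original scraped label -> matching valid code, or null.
function sketchBuildCourseMapping(
  scrapedLabels: string[],
  validCourses: string[],
): Record<string, string | null> {
  const mapping: Record<string, string | null> = {};
  for (const label of scrapedLabels) {
    const key = courseKey(label);
    let match: string | null = null;
    for (const course of validCourses) {
      if (courseKey(course) === key) {
        match = course;
        break;
      }
    }
    mapping[label] = match;
  }
  return mapping;
}
```

Under these assumptions, `"CS 101 (42)"` resolves to `"CS101"` and `"MATH200"` resolves to itself; the library's actual output may be shaped differently.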
Sentiment
analyzeSentiment
Analyzes the sentiment of a comment using the AFINN-165 word list and Emoji Sentiment Ranking. Runs entirely locally with no external API calls.
| Parameter | Type | Description |
|---|---|---|
| text | string | The text to analyze |
Returns: SentimentResult with:
- score (number): Raw aggregate AFINN score
- comparative (number): Normalized score per word, typically in [-1, 1]
- label (string): "very positive", "positive", "neutral", "negative", or "very negative"
```typescript
const result = analyzeSentiment("Amazing professor, really clear lectures!");
console.log(result.label); // "positive"
console.log(result.comparative); // 0.5
console.log(result.score); // 3
```
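To show how `score`, `comparative`, and `label` relate, here is a toy lexicon-based scorer in the same style. It is a sketch, not the library's algorithm: the four-word lexicon and its values are made up for illustration rather than copied from AFINN-165, emoji are ignored, the tokenizer is a simple ASCII regex, and the label thresholds are assumptions.

```typescript
type SentimentLabel = "very positive" | "positive" | "neutral" | "negative" | "very negative";

interface SentimentResult {
  score: number;
  comparative: number;
  label: SentimentLabel;
}

// Hypothetical lexicon with made-up valences; a real word list is far larger.
const TOY_LEXICON: Record<string, number> = {
  great: 3,
  helpful: 2,
  boring: -2,
  terrible: -3,
};

// Assumed thresholds on the per-word score; the library's cutoffs are not documented here.
function toLabel(comparative: number): SentimentLabel {
  if (comparative > 0.5) return "very positive";
  if (comparative > 0.05) return "positive";
  if (comparative < -0.5) return "very negative";
  if (comparative < -0.05) return "negative";
  return "neutral";
}

function sketchAnalyzeSentiment(text: string): SentimentResult {
  // Tokenize into lowercase ASCII words (apostrophes kept).
  const tokens: string[] = text.toLowerCase().match(/[a-z']+/g) || [];
  let score = 0;
  for (const token of tokens) {
    // hasOwnProperty guards against inherited keys such as "constructor".
    if (Object.prototype.hasOwnProperty.call(TOY_LEXICON, token)) {
      score += TOY_LEXICON[token];
    }
  }
  // comparative is the raw score divided by the number of tokens.
  const comparative = tokens.length > 0 ? score / tokens.length : 0;
  return { score, comparative, label: toLabel(comparative) };
}
```

Dividing by the token count is what keeps long comments from dominating: a single strong word in a long review yields a small `comparative` even when `score` is large.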