Extras

Optional helpers for data pipelines: comment normalization and validation, course code normalization, and sentiment analysis.

import { normalizeComment, isValidComment, cleanCourseLabel, buildCourseMapping, analyzeSentiment } from "ratemyprofessors-client/extras";

Helpers

normalizeComment

normalizeComment(text: string, options?: NormalizeOptions): string

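Normalizes a comment for comparison or deduplication: trims leading and trailing whitespace, strips HTML tags (on by default; pass `stripHtml: false` to keep them), lowercases, and collapses runs of whitespace into a single space. Pass `stripPunctuation: true` to also remove punctuation for looser matching.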

Parameter	Type	Default	Description
`text`	`string`	—	Comment text
`options.stripHtml`	`boolean`	`true`	Remove HTML tags
`options.stripPunctuation`	`boolean`	`false`	Remove all punctuation

const a = normalizeComment("  Great  Professor!  ");
const b = normalizeComment("great professor!");
console.log(a === b); // true

normalizeComment("<b>Loved</b> this class"); // "loved this class"
normalizeComment("Hello, world!", { stripPunctuation: true }); // "hello world"
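The steps described above can be sketched as a small standalone function. This is an illustrative approximation, not the library's source: `normalizeSketch`, its options type, the naive tag-stripping regex, and the set of characters treated as punctuation are all assumptions.

```typescript
// Hypothetical stand-in for illustration; not the library's implementation.
interface NormalizeSketchOptions {
  stripHtml?: boolean;        // defaults to true, mirroring the table above
  stripPunctuation?: boolean; // defaults to false
}

function normalizeSketch(text: string, options: NormalizeSketchOptions = {}): string {
  const { stripHtml = true, stripPunctuation = false } = options;
  let out = text;
  if (stripHtml) {
    // Naive tag removal: drops anything that looks like <...>.
    out = out.replace(/<[^>]*>/g, "");
  }
  if (stripPunctuation) {
    // Assumption: a small set of common ASCII punctuation marks.
    out = out.replace(/[.,!?;:'"()\[\]{}\-]/g, "");
  }
  // Lowercase, collapse whitespace runs to one space, trim both ends.
  return out.toLowerCase().replace(/\s+/g, " ").trim();
}
```

Two comments that differ only in case, spacing, or markup should map to the same string, so the result can serve as a deduplication key.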

isValidComment

isValidComment(text: string, minLen?: number): ValidationResult

Validates a comment and returns detailed diagnostics. Checks for empty text, insufficient length, all-caps, excessive repeated characters, and absence of alphabetic characters.

Parameter	Type	Default	Description
`text`	`string`	—	Comment text
`minLen`	`number`	`10`	Minimum character length

Returns: ValidationResult with valid (boolean) and issues (array of CommentIssue).

Each issue has a code ("empty", "too_short", "all_caps", "excessive_repeats", "no_alpha") and a human-readable message.

isValidComment("Good");
// { valid: false, issues: [{ code: "too_short", message: "Comment is 4 chars (minimum 10)" }] }

isValidComment("Great class, learned a lot");
// { valid: true, issues: [] }

isValidComment("WORST PROF EVER!!!");
// { valid: false, issues: [{ code: "all_caps", message: "Comment is all uppercase" }] }
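Because every issue carries a machine-readable code, results can be aggregated to see why comments are being rejected. The sketch below declares local types that mirror the shapes described above (the library's exported types may differ in detail) and a hypothetical helper, `countIssueCodes`, that is not part of the library.

```typescript
// Local mirrors of the documented shapes; assumed, not imported from the library.
type IssueCode = "empty" | "too_short" | "all_caps" | "excessive_repeats" | "no_alpha";
interface CommentIssue { code: IssueCode; message: string; }
interface ValidationResult { valid: boolean; issues: CommentIssue[]; }

// Hypothetical helper: tally how often each issue code appears.
function countIssueCodes(results: ValidationResult[]): Partial<Record<IssueCode, number>> {
  const counts: Partial<Record<IssueCode, number>> = {};
  for (const result of results) {
    for (const issue of result.issues) {
      counts[issue.code] = (counts[issue.code] ?? 0) + 1;
    }
  }
  return counts;
}

// Sample data copied from the examples above.
const sampleResults: ValidationResult[] = [
  { valid: false, issues: [{ code: "too_short", message: "Comment is 4 chars (minimum 10)" }] },
  { valid: true, issues: [] },
  { valid: false, issues: [{ code: "all_caps", message: "Comment is all uppercase" }] },
];
const issueCounts = countIssueCodes(sampleResults);
```

In a real pipeline the array would be built by calling `isValidComment` on each comment.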

Course Codes

cleanCourseLabel

cleanCourseLabel(raw: string): string

Cleans a scraped course label by removing parenthesized counts and normalizing whitespace.

cleanCourseLabel("CS 101 (42)"); // "CS 101"
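A minimal sketch of this kind of cleanup, assuming the parenthesized count is always a run of digits. The function name `cleanLabelSketch` and the regular expressions are illustrative; the library may handle more cases.

```typescript
// Hypothetical stand-in for illustration; not the library's implementation.
function cleanLabelSketch(raw: string): string {
  return raw
    .replace(/\(\s*\d+\s*\)/g, " ") // drop parenthesized counts such as "(42)"
    .replace(/\s+/g, " ")           // collapse whitespace runs to one space
    .trim();                        // remove leading and trailing whitespace
}
```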

buildCourseMapping

buildCourseMapping(scrapedLabels: Iterable<string>, validCourses: Iterable<string>): Map<string, Set<string> | null>

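Maps scraped course labels to valid course codes.

Returns: a Map keyed by label. Going by the return type, each value is either a Set of valid course codes or null (presumably for a label with no match).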

Parameter	Type	Description
`scrapedLabels`	`Iterable<string>`	Labels scraped from RMP
`validCourses`	`Iterable<string>`	Known valid course codes

const mapping = buildCourseMapping(["CS 101 (42)", "MATH200"], ["CS101", "CS102", "MATH200"]);
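Since the value type allows null, callers need to handle that case when consuming the result. The sketch below separates labels that have at least one code from those that do not. `splitMapping` is a hypothetical helper, and the sample Map is hand-built placeholder data, not real output from `buildCourseMapping`.

```typescript
type CourseMapping = Map<string, Set<string> | null>;

// Hypothetical helper: partition labels by whether any valid code was found.
function splitMapping(mapping: CourseMapping): { matched: string[]; unmatched: string[] } {
  const matched: string[] = [];
  const unmatched: string[] = [];
  mapping.forEach((codes, label) => {
    if (codes !== null && codes.size > 0) {
      matched.push(label);
    } else {
      unmatched.push(label); // null or empty set: nothing to map to
    }
  });
  return { matched, unmatched };
}

// Placeholder data shaped like the documented return type.
const exampleMapping: CourseMapping = new Map<string, Set<string> | null>([
  ["CS 101", new Set(["CS101"])],
  ["BIO 999", null],
]);
const split = splitMapping(exampleMapping);
```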

Sentiment

analyzeSentiment

analyzeSentiment(text: string): SentimentResult

Analyzes the sentiment of a comment using the AFINN-165 word list and Emoji Sentiment Ranking. Runs entirely locally with no external API calls.

Parameter	Type	Description
`text`	`string`	The text to analyze

Returns: SentimentResult with:

score (number) — Raw aggregate AFINN score
comparative (number) — Normalized score per word, typically in [-1, 1]
label (string) — "very positive", "positive", "neutral", "negative", or "very negative"

const result = analyzeSentiment("Amazing professor, really clear lectures!");
console.log(result.label);       // "positive"
console.log(result.comparative); // 0.5
console.log(result.score);       // 3
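To make the relationship between `score` and `comparative` concrete, here is a rough sketch of lexicon-based scoring. It assumes `comparative` is the raw score divided by the token count, which is what "normalized score per word" suggests. The tiny lexicon, its weights, and the tokenizer are placeholders, not AFINN-165 data or the library's logic, and emoji handling and label thresholds are left out.

```typescript
// Toy lexicon for illustration only; weights are placeholders, not AFINN-165 values.
const toyLexicon = new Map<string, number>([
  ["great", 3],
  ["clear", 1],
  ["boring", -2],
  ["awful", -3],
]);

// Hypothetical scorer: sums per-word weights, then divides by the word count.
function scoreSketch(text: string): { score: number; comparative: number } {
  // Assumption: tokens are runs of ASCII letters and apostrophes.
  const tokens: string[] = text.toLowerCase().match(/[a-z']+/g) ?? [];
  let score = 0;
  for (const token of tokens) {
    score += toyLexicon.get(token) ?? 0; // words outside the lexicon contribute 0
  }
  const comparative = tokens.length > 0 ? score / tokens.length : 0;
  return { score, comparative };
}
```

Dividing by the token count keeps long comments from outscoring short ones simply because they contain more words.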