Extras

Optional helpers for data pipelines: comment normalization and validation, course code normalization, and sentiment analysis.

import { normalizeComment, isValidComment, cleanCourseLabel, buildCourseMapping, analyzeSentiment } from "ratemyprofessors-client/extras";

Helpers

normalizeComment

normalizeComment(text: string, options?: NormalizeOptions): string

Normalizes a comment for comparison or deduplication. Trims leading and trailing whitespace, strips HTML tags (on by default; pass stripHtml: false to keep them), lowercases the text, and collapses runs of whitespace into a single space. Optionally strips punctuation for looser matching.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text | string | | Comment text |
| options.stripHtml | boolean | true | Remove HTML tags |
| options.stripPunctuation | boolean | false | Remove all punctuation |

const a = normalizeComment("  Great  Professor!  ");
const b = normalizeComment("great professor!");
console.log(a === b); // true

normalizeComment("<b>Loved</b> this class"); // "loved this class"
normalizeComment("Hello, world!", { stripPunctuation: true }); // "hello world"
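
The steps described above can be approximated in a few lines. The following is a minimal sketch, not the library's source: `sketchNormalize`, `SketchNormalizeOptions`, and `dedupeComments` are hypothetical names, the HTML stripping is a naive regex, and "punctuation" is assumed to mean any character that is not a word character or whitespace (an ASCII-centric choice). It also shows how a normalized form might serve as a deduplication key.

```typescript
// Hypothetical option shape mirroring the parameter table; not the library's exported type.
interface SketchNormalizeOptions {
  stripHtml?: boolean;        // assumed default: true
  stripPunctuation?: boolean; // assumed default: false
}

// Illustrative re-implementation of the documented steps.
function sketchNormalize(text: string, options: SketchNormalizeOptions = {}): string {
  const { stripHtml = true, stripPunctuation = false } = options;
  let out = text;
  if (stripHtml) {
    // Naive tag removal; real-world HTML may need a proper parser.
    out = out.replace(/<[^>]*>/g, "");
  }
  out = out.toLowerCase();
  if (stripPunctuation) {
    // Assumption: anything that is not a word character or whitespace is punctuation.
    out = out.replace(/[^\w\s]/g, "");
  }
  // Collapse whitespace runs to one space, then trim both ends.
  return out.replace(/\s+/g, " ").trim();
}

// Keep the first original comment for each distinct normalized form.
function dedupeComments(comments: string[]): string[] {
  const seen = new Set<string>();
  const unique: string[] = [];
  for (const comment of comments) {
    const key = sketchNormalize(comment);
    if (!seen.has(key)) {
      seen.add(key);
      unique.push(comment);
    }
  }
  return unique;
}

console.log(dedupeComments(["Great  Professor!", "great professor!", "Tough grader"]));
// [ "Great  Professor!", "Tough grader" ]
```

In a real pipeline the call to `sketchNormalize` would be replaced by the library's `normalizeComment`; the deduplication loop stays the same.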

isValidComment

isValidComment(text: string, minLen?: number): ValidationResult

Validates a comment and returns detailed diagnostics. Checks for empty text, insufficient length, all-caps, excessive repeated characters, and absence of alphabetic characters.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text | string | | Comment text |
| minLen | number | 10 | Minimum character length |

Returns: ValidationResult with valid (boolean) and issues (array of CommentIssue).

Each issue has a code ("empty", "too_short", "all_caps", "excessive_repeats", "no_alpha") and a human-readable message.

isValidComment("Good");
// { valid: false, issues: [{ code: "too_short", message: "Comment is 4 chars (minimum 10)" }] }

isValidComment("Great class, learned a lot");
// { valid: true, issues: [] }

isValidComment("WORST PROF EVER!!!");
// { valid: false, issues: [{ code: "all_caps", message: "Comment is all uppercase" }] }
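
When validating comments in bulk, it can help to tally which issue codes occur most often. The sketch below assumes type shapes inferred from the description above (`IssueCode`, `CommentIssue`, and `ValidationResult` are written out by hand here and may differ from the library's exported types), and `countIssueCodes` is a hypothetical helper, not part of the package. The sample data is copied from the three examples above.

```typescript
// Shapes inferred from the documentation; the library's real types may differ.
type IssueCode = "empty" | "too_short" | "all_caps" | "excessive_repeats" | "no_alpha";

interface CommentIssue {
  code: IssueCode;
  message: string;
}

interface ValidationResult {
  valid: boolean;
  issues: CommentIssue[];
}

// Hypothetical helper: count how often each issue code appears across a batch.
function countIssueCodes(results: ValidationResult[]): Map<IssueCode, number> {
  const counts = new Map<IssueCode, number>();
  for (const result of results) {
    for (const issue of result.issues) {
      counts.set(issue.code, (counts.get(issue.code) ?? 0) + 1);
    }
  }
  return counts;
}

// Results copied from the documented examples.
const sampleResults: ValidationResult[] = [
  { valid: false, issues: [{ code: "too_short", message: "Comment is 4 chars (minimum 10)" }] },
  { valid: true, issues: [] },
  { valid: false, issues: [{ code: "all_caps", message: "Comment is all uppercase" }] },
];

const issueCounts = countIssueCodes(sampleResults);
console.log(issueCounts.get("too_short")); // 1
console.log(sampleResults.filter((r) => r.valid).length); // 1
```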

Course Codes

cleanCourseLabel

cleanCourseLabel(raw: string): string

Cleans a scraped course label by removing parenthesized counts and normalizing whitespace.

cleanCourseLabel("CS 101 (42)"); // "CS 101"
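
The cleaning rule can be pictured as two regular-expression passes. This is a rough sketch under the assumption that the counts are digits inside parentheses; `sketchCleanCourseLabel` is a hypothetical name, and the library may handle more cases than this.

```typescript
// Illustrative sketch only; the library's actual rules may differ.
function sketchCleanCourseLabel(raw: string): string {
  return raw
    .replace(/\(\s*\d+\s*\)/g, " ") // drop parenthesized counts such as "(42)"
    .replace(/\s+/g, " ")           // collapse whitespace runs
    .trim();                        // remove leading and trailing spaces
}

console.log(sketchCleanCourseLabel("CS 101 (42)"));       // "CS 101"
console.log(sketchCleanCourseLabel("  MATH   200 (7) ")); // "MATH 200"
```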

buildCourseMapping

buildCourseMapping(scrapedLabels: Iterable<string>, validCourses: Iterable<string>): Map<string, Set<string> | null>

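Maps each scraped course label to the valid course codes it corresponds to. Per the signature, the result is a Map keyed by scraped label whose values are either a Set of matching codes or null.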

| Parameter | Type | Description |
| --- | --- | --- |
| scrapedLabels | Iterable<string> | Labels scraped from RMP |
| validCourses | Iterable<string> | Known valid course codes |

const mapping = buildCourseMapping(["CS 101 (42)", "MATH200"], ["CS101", "CS102", "MATH200"]);
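
Because each value may be either a Set or null, callers need to handle both cases. The sketch below shows one way to split a mapping of this type into resolved and unresolved labels. `splitMapping` is a hypothetical helper, and `exampleMapping` is a hand-built stand-in with the documented return type; its entries are illustrative and are not the actual output of `buildCourseMapping`. Treating both null and an empty Set as "unresolved" is an assumption.

```typescript
type CourseMapping = Map<string, Set<string> | null>;

// Hypothetical helper: separate labels that resolved to codes from those that did not.
function splitMapping(mapping: CourseMapping): {
  resolved: Array<[string, string[]]>;
  unresolved: string[];
} {
  const resolved: Array<[string, string[]]> = [];
  const unresolved: string[] = [];
  mapping.forEach((codes, label) => {
    if (codes === null || codes.size === 0) {
      // Assumption: null (or an empty set) means no valid code was found.
      unresolved.push(label);
    } else {
      resolved.push([label, Array.from(codes).sort()]);
    }
  });
  return { resolved, unresolved };
}

// Hand-built illustrative data, NOT real library output.
const exampleMapping: CourseMapping = new Map<string, Set<string> | null>([
  ["CS 101 (42)", new Set(["CS101"])],
  ["MATH200", new Set(["MATH200"])],
  ["UNKNOWN 999", null],
]);

const split = splitMapping(exampleMapping);
console.log(split.resolved.length); // 2
console.log(split.unresolved);      // [ "UNKNOWN 999" ]
```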

Sentiment

analyzeSentiment

analyzeSentiment(text: string): SentimentResult

Analyzes the sentiment of a comment using the AFINN-165 word list and Emoji Sentiment Ranking. Runs entirely locally with no external API calls.

| Parameter | Type | Description |
| --- | --- | --- |
| text | string | The text to analyze |

Returns: SentimentResult with label (a sentiment label such as "positive"), score (number), and comparative (number).

const result = analyzeSentiment("Amazing professor, really clear lectures!");
console.log(result.label);       // "positive"
console.log(result.comparative); // 0.5
console.log(result.score);       // 3
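
For intuition about how word-list scoring works, here is a toy sketch. It rests on several assumptions that may not match the library: the lexicon is a tiny hand-picked map with illustrative valences (not the real AFINN-165 data, and with no emoji handling), `score` is taken to be the sum of matched word valences, `comparative` is taken to be the score divided by the token count (a common convention in AFINN-based tools), and the label is derived from the sign of the score. All names are hypothetical, and the numbers it produces will not match `analyzeSentiment`.

```typescript
// Tiny illustrative lexicon; NOT the real AFINN-165 word list.
const toyLexicon = new Map<string, number>([
  ["amazing", 4],
  ["clear", 1],
  ["boring", -3],
  ["terrible", -3],
]);

interface SketchSentiment {
  score: number;       // sum of matched word valences
  comparative: number; // assumed: score divided by token count
  label: "positive" | "negative" | "neutral";
}

function sketchSentiment(text: string): SketchSentiment {
  // Simple ASCII tokenizer: lowercase, keep runs of letters and apostrophes.
  const tokens = text.toLowerCase().match(/[a-z']+/g) ?? [];
  let score = 0;
  for (const token of tokens) {
    score += toyLexicon.get(token) ?? 0;
  }
  const comparative = tokens.length > 0 ? score / tokens.length : 0;
  // Assumption: the label follows the sign of the score.
  const label = score > 0 ? "positive" : score < 0 ? "negative" : "neutral";
  return { score, comparative, label };
}

const toyResult = sketchSentiment("Boring lectures, terrible exams");
console.log(toyResult.score);       // -6
console.log(toyResult.comparative); // -1.5
console.log(toyResult.label);       // "negative"
```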