v0.17.0 - Text Classification, Script Detection & Security Sanitization

Zheruel released this 27 Sep 11:14 · commit 73eb115 · 30 commits to main since this release

🎉 What's New

This release introduces three powerful new utility functions for text analysis and security:

🔍 classifyText(text: string): Classification

Intelligently classifies text content with confidence scoring:

  • Detects URLs, emails, phone numbers
  • Identifies JSON, code, markdown, HTML
  • Recognizes questions and numeric content
  • Returns type and confidence score (0-1)
  • Bundle size: ~2.3KB
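
One way the type and confidence score could be consumed is to route input to different handlers. The sketch below is illustrative only: `pickHandler`, the handler names, and the 0.8 threshold are assumptions, not part of the library. The `'url'` and `'question'` type names are taken from the usage examples in these notes.

```javascript
// Hypothetical helper (not part of nano-string-utils): map a
// { type, confidence } result to an application-level handler name.
// The 0.8 threshold and the handler names are illustrative assumptions.
function pickHandler(classification, threshold = 0.8) {
  const { type, confidence } = classification;
  // Fall back to plain text when the classifier is not confident enough.
  if (confidence < threshold) return 'plain-text';
  switch (type) {
    case 'url':
      return 'link-preview';
    case 'question':
      return 'search';
    default:
      return 'plain-text';
  }
}

// In an application the argument would be the value returned by classifyText(input).
console.log(pickHandler({ type: 'url', confidence: 1 })); // link-preview
console.log(pickHandler({ type: 'question', confidence: 0.4 })); // plain-text
```

Keeping the threshold as a parameter lets callers decide how much uncertainty they tolerate before falling back to the default handler.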

🌍 detectScript(text: string): Script

Detects the dominant writing system in text:

  • Supports: Latin, CJK, Arabic, Cyrillic, Hebrew, Devanagari, Greek, Thai
  • Useful for internationalization and language detection
  • Bundle size: ~2.4KB
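
As an internationalization example, the detected script could drive the text direction of a UI element. This is a minimal sketch under stated assumptions: `directionFor` is a hypothetical helper, `'arabic'` matches the return value shown in the usage examples in these notes, and the lowercase name `'hebrew'` is assumed by analogy.

```javascript
// Hypothetical helper (not part of nano-string-utils): choose a text
// direction from a script name. 'arabic' matches the usage example in these
// notes; the lowercase name 'hebrew' is an assumption.
const RTL_SCRIPTS = new Set(['arabic', 'hebrew']);

function directionFor(script) {
  return RTL_SCRIPTS.has(script) ? 'rtl' : 'ltr';
}

// In an application: directionFor(detectScript(text))
console.log(directionFor('arabic')); // rtl
console.log(directionFor('latin')); // ltr
```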

🔒 sanitize(str: string, options?: SanitizeOptions): string

Security-focused string sanitization for web applications:

  • Removes XSS vectors (script tags, event handlers, dangerous URIs)
  • Configurable HTML tag/attribute allowlisting
  • Control character removal
  • Whitespace normalization
  • Custom pattern removal
  • Bundle size: ~2.5KB
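
In a web application, sanitization is often applied to several user-supplied fields at once. The sketch below shows one possible wrapper. `sanitizeFields` is a hypothetical helper, and the sanitizer is passed in as an argument so the snippet makes no assumptions about option names beyond what these notes show.

```javascript
// Hypothetical helper (not part of nano-string-utils): return a copy of
// `record` with the named string fields passed through `sanitizeFn`.
// Non-string fields and fields not listed are left unchanged.
function sanitizeFields(record, fields, sanitizeFn) {
  const clean = { ...record };
  for (const field of fields) {
    if (typeof clean[field] === 'string') {
      clean[field] = sanitizeFn(clean[field]);
    }
  }
  return clean;
}
```

With the library imported, a call might look like `sanitizeFields(comment, ['author', 'body'], sanitize)`, where `comment` is a hypothetical object holding user input.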

📊 Performance

Approximate throughput of the three new functions:

  • classifyText: ~400K-2M operations/second
  • detectScript: ~500K-10M operations/second
  • sanitize: ~365K-2.2M operations/second
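
Throughput figures like these depend on hardware, runtime, and input. A rough way to measure operations per second locally is sketched below. `opsPerSecond` and the iteration count are assumptions for illustration, not the benchmark harness used for the numbers above.

```javascript
// Hypothetical micro-benchmark helper: call `fn` repeatedly and report
// calls per second. The iteration count is an arbitrary assumption.
function opsPerSecond(fn, iterations = 100000) {
  const start = performance.now();
  for (let i = 0; i < iterations; i++) {
    fn();
  }
  // Guard against a zero reading on coarse timers.
  const elapsedMs = Math.max(performance.now() - start, 0.001);
  return Math.round(iterations / (elapsedMs / 1000));
}

// In an application: opsPerSecond(() => detectScript('Hello World'))
console.log(opsPerSecond(() => 'Hello World'.toUpperCase()) > 0); // true
```

A single loop like this gives only a rough estimate; JIT warm-up and garbage collection can skew the result, so repeated runs are advisable.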

📦 Bundle Size

Total library size remains under 9KB gzipped with all 47 functions.

🧪 Quality Assurance

  • ✅ 579 new tests added (1,153 total tests)
  • ✅ 100% code coverage for new functions
  • ✅ TypeScript strict mode compliant
  • ✅ Zero dependencies maintained

📚 Documentation

  • Updated README with comprehensive API documentation
  • Added all functions to interactive playground
  • Enhanced bundle size tracking and comparisons

Installation

```bash
npm install nano-string-utils@0.17.0

# or
yarn add nano-string-utils@0.17.0

# or
pnpm add nano-string-utils@0.17.0
```

Usage Examples

```javascript
import { classifyText, detectScript, sanitize } from 'nano-string-utils';

// Text classification
classifyText('https://example.com'); // { type: 'url', confidence: 1 }
classifyText('What is TypeScript?'); // { type: 'question', confidence: 1 }

// Script detection
detectScript('Hello World'); // 'latin'
detectScript('你好世界'); // 'cjk'
detectScript('مرحبا'); // 'arabic'

// Security sanitization
sanitize("<script>alert('xss')</script>Hello"); // "Hello"
sanitize("<b>Bold</b> text", { allowedTags: ["b"] }); // "<b>Bold</b> text"
```


Full Changelog: v0.16.0...v0.17.0