UTF-8 Cleaner Help
Quick Start
1. Paste your text with encoding issues into the Input Text area
2. Click Fix Encoding to automatically fix common issues
3. Review the fixed text in the Output area
4. Use Export to save the corrected text
Encoding Detection
The tool automatically detects the encoding of your text using:
- BOM (Byte Order Mark) analysis
- Character frequency analysis
- Invalid UTF-8 sequence detection
- Common mojibake pattern recognition
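As a rough illustration of the first and third checks, the sketch below looks for a BOM and then tries a strict UTF-8 decode. The helper name `guess_encoding` and the Latin-1 fallback are assumptions made for this example; they are not taken from the tool, which also uses frequency analysis and mojibake patterns.

```python
import codecs

# BOM signatures, longest first so UTF-32 LE is not mistaken for UTF-16 LE
# (both start with the bytes FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def guess_encoding(data: bytes) -> str:
    """Hypothetical helper: BOM check first, then strict UTF-8 validation."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    try:
        data.decode("utf-8")  # strict mode raises on any invalid sequence
        return "utf-8"
    except UnicodeDecodeError:
        # Assumed fallback for this sketch only; a real detector would
        # apply character frequency analysis at this point.
        return "latin-1"
```

A strict decode is a cheap validity test: valid UTF-8 has enough internal structure that text in a legacy single-byte encoding rarely passes it by accident once it contains non-ASCII bytes.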
Fix Options Explained
- Target Encoding: The encoding you want to convert to (usually UTF-8)
- Source Encoding: The encoding of your input text (auto-detect if unsure)
- Mojibake Pattern: Select the type of garbled text you're seeing
- BOM Handling: Add, remove, or keep the Byte Order Mark
- Fix HTML Entities: Convert entities such as &nbsp; to the characters they represent (actual spaces, etc.)
- Fix URL Encoding: Decode %20, %2F, etc.
- Normalize Unicode: Combine accented characters into one precomposed character (é as U+00E9 instead of e followed by the combining accent U+0301)
- Remove Control Characters: Strip invisible control characters
- Batch Mode: Process each line independently
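Several of these options correspond to standard-library calls in Python. The sketch below chains four of them. The function name `clean`, the order of the steps, and the choice to keep tab, newline, and carriage return are assumptions for illustration, not the tool's actual behavior.

```python
import html
import unicodedata
from urllib.parse import unquote

def clean(text: str) -> str:
    """Hypothetical pipeline mirroring four of the fix options."""
    text = html.unescape(text)                 # "&amp;" -> "&", "&nbsp;" -> U+00A0
    text = unquote(text)                       # "%20" -> " ", "%2F" -> "/"
    text = unicodedata.normalize("NFC", text)  # e + U+0301 -> U+00E9
    # Drop control characters (Unicode category Cc), keeping common whitespace.
    return "".join(
        ch for ch in text
        if ch in "\t\n\r" or unicodedata.category(ch) != "Cc"
    )
```

Note that `html.unescape` turns `&nbsp;` into a no-break space (U+00A0), not an ASCII space, so an extra replacement step would be needed if plain spaces are wanted.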
Common Issues & Solutions
Garbled characters like "Ã©" instead of "é": Your text was encoded as UTF-8, then decoded as Latin-1. Select the "Latin-1 interpreted as UTF-8" pattern.
Double-encoded text: The text was encoded as UTF-8, misread as a single-byte encoding, and then encoded as UTF-8 again. Select the "Double-encoded UTF-8" pattern.
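Both mojibake cases above can be undone by reversing the wrong decode: re-encode the garbled text as Latin-1 to recover the original bytes, then decode those bytes as UTF-8. The sketch below repeats that step to peel off more than one layer. The name `undo_layers` and the round limit are assumptions for this example. If the wrong decode was really Windows-1252, that codec would be needed in place of Latin-1.

```python
def undo_layers(text: str, max_rounds: int = 3) -> str:
    """Hypothetical helper: reverse 'UTF-8 read as Latin-1', possibly repeated."""
    for _ in range(max_rounds):
        try:
            repaired = text.encode("latin-1").decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            break  # text is not (or no longer) this kind of mojibake
        if repaired == text:
            break  # pure ASCII: nothing left to change
        text = repaired
    return text
```

The loop stops as soon as a round trip fails, which is what protects text that is already correct. A string that legitimately contains a sequence such as "Ã©" would still be altered, so the result is worth reviewing.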
Question marks or squares: The text contains characters that the target encoding does not support. Try UTF-8 as the target.
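A short demonstration of where the question marks come from, using ASCII as an assumed example of a target encoding that is too narrow: characters it cannot represent are replaced, while UTF-8 round-trips them unchanged.

```python
text = "na\u00efve caf\u00e9"  # "naïve café"

# A target encoding that lacks these characters substitutes "?" for them.
lossy = text.encode("ascii", errors="replace").decode("ascii")

# UTF-8 can represent every Unicode character, so nothing is lost.
safe = text.encode("utf-8").decode("utf-8")
```

Once the substitution has happened the original characters are gone, so the fix has to be applied to the source text rather than to the output that already contains the question marks.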
BOM issues in files: Use BOM handling to add/remove the BOM as needed.
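For UTF-8, the BOM is the three bytes EF BB BF at the start of the file. A minimal sketch of adding and removing it at the byte level follows. The helper names are made up for this example, and it covers only the UTF-8 BOM.

```python
import codecs

def strip_bom(data: bytes) -> bytes:
    """Remove a leading UTF-8 BOM if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

def add_bom(data: bytes) -> bytes:
    """Prepend a UTF-8 BOM unless the data already starts with one."""
    if data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data
```

When working with text rather than bytes, Python's `utf-8-sig` codec does the same job: it skips a BOM when decoding and writes one when encoding.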
Encoding Profiles
Save your frequently used settings as profiles for quick access. Click "Save Current Settings" to create a profile.
API Access
Click the API button to see how to integrate this tool into your applications or text editors.
Tips
- Use the Compare view to see exactly what changed
- The stats bar shows real-time encoding information
- Batch mode is useful for CSV files with mixed encodings
- Export with BOM if you need compatibility with Windows apps
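To show why per-line processing helps with mixed-encoding CSV files, here is a sketch that decodes each line on its own: UTF-8 first, then a fallback. The function name `decode_lines` and the Windows-1252 fallback are assumptions for this example, and splitting on the newline byte only works for ASCII-compatible encodings (not UTF-16 or UTF-32).

```python
def decode_lines(data: bytes, fallback: str = "cp1252") -> list[str]:
    """Hypothetical batch decoder: each line is handled independently."""
    lines = []
    for raw in data.split(b"\n"):
        try:
            lines.append(raw.decode("utf-8"))
        except UnicodeDecodeError:
            # This line is not valid UTF-8; assume the legacy encoding.
            lines.append(raw.decode(fallback, errors="replace"))
    return lines
```

Decoding the whole file with one encoding would either fail or garble some rows. Handling lines separately confines a wrong guess to a single row.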