* feat: add unicode normalization & ascii_only mode for DocumentCleaner.
* feat: add unicode_normalization parameter validation to DocumentCleaner.
* test: fix the unit test to work after code linting.
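
For context, a minimal sketch of what the changes listed above might look like, using only Python's standard `unicodedata` module. The helper name `normalize_text`, the `VALID_FORMS` tuple, and the exact behaviour of `ascii_only` are assumptions made for illustration; they are not taken from the DocumentCleaner source.

```python
import unicodedata

# Normalization forms accepted by unicodedata.normalize.
VALID_FORMS = ("NFC", "NFKC", "NFD", "NFKD")


def normalize_text(text, unicode_normalization=None, ascii_only=False):
    """Hypothetical helper: normalize text and optionally reduce it to ASCII."""
    # Parameter validation: reject anything that is not a known form.
    if unicode_normalization is not None and unicode_normalization not in VALID_FORMS:
        raise ValueError(
            f"unicode_normalization must be one of {VALID_FORMS}, "
            f"got {unicode_normalization!r}"
        )

    if unicode_normalization is not None:
        text = unicodedata.normalize(unicode_normalization, text)

    if ascii_only:
        # Assumption: decompose accented characters first so the base letter
        # survives, then drop every remaining non-ASCII code point.
        decomposed = unicodedata.normalize("NFKD", text)
        text = decomposed.encode("ascii", "ignore").decode("ascii")

    return text
```

Under these assumptions, `normalize_text("café", ascii_only=True)` keeps the base letter and drops the accent, while an unsupported value such as `"NFX"` raises `ValueError` before any text is processed.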