mlprimitives.custom.text module
class mlprimitives.custom.text.TextCleaner(column=None, language='multi', lower=True, accents=True, stopwords=True, non_alpha=True, single_chars=True)
Bases: object
RE_ACCENTS = {'a': re.compile('[àâáäåã]'), 'e': re.compile('[èêéë]'), 'i': re.compile('[ìîíï]'), 'o': re.compile('[òôóö]'), 'u': re.compile('[ùûúü]')}

RE_NON_ALNUM = re.compile('[^\\w\\d]')

RE_NON_ALPHA = re.compile('[^a-z]+')

RE_SYMBOLS = re.compile('[-]')

STOPWORDS = {}
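The class attributes listed above are plain compiled regular expressions, so their effect can be illustrated outside the class. The following is a minimal sketch, not the TextCleaner implementation: the helper names `strip_accents` and `clean` are hypothetical, and the order of the steps is an assumption. It uses only the RE_ACCENTS and RE_NON_ALPHA patterns, copied from the listing.

```python
import re

# Patterns copied from the attribute listing above.
RE_ACCENTS = {
    'a': re.compile('[àâáäåã]'),
    'e': re.compile('[èêéë]'),
    'i': re.compile('[ìîíï]'),
    'o': re.compile('[òôóö]'),
    'u': re.compile('[ùûúü]'),
}
RE_NON_ALPHA = re.compile('[^a-z]+')


def strip_accents(text):
    """Hypothetical helper: map each accented vowel to its base letter."""
    for base, pattern in RE_ACCENTS.items():
        text = pattern.sub(base, text)
    return text


def clean(text):
    """Hypothetical pipeline (assumed order): lowercase, strip accents,
    then collapse every run of characters outside a-z into one space."""
    text = text.lower()          # the patterns only match lowercase letters
    text = strip_accents(text)
    text = RE_NON_ALPHA.sub(' ', text)
    return text.strip()


print(clean('Crème Brûlée, déjà-vu 2x!'))  # creme brulee deja vu x
```

Lowercasing first matters in this sketch because both RE_ACCENTS and RE_NON_ALPHA match lowercase characters only. An uppercase or unlisted accented letter (for example `ñ`) would be treated as non-alphabetic and replaced by a space.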