Quick Start
Define Extractor
def extract_domain_name(sample):
    domain_name = sample.split("//")[-1].split('/')[0]
    return domain_name
Initialize components
dataset = Dataset()
url_domain = FeatureDefinition(extract_domain_name, [dataset])
normalizer = Normalizer(PropertiesNormalizer(), [url_domain], delimiter={'delimiter': ["."]})
Dataset Preparation
sample_data = ['https://www.google.com/', 'https://www.youtube.com/']
prepared_data = dataset.add_samples(sample_data)
print(prepared_data)
Output
0
1
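To see what the extractor itself returns, it can be run on the sample URLs without any of the library components. The sketch below uses plain Python only. The final step, splitting each domain on ".", is an assumption about what the "." delimiter is meant to do; it mimics the idea with str.split and does not call Normalizer or PropertiesNormalizer.

```python
def extract_domain_name(sample):
    # Drop the scheme ("https://") if present, then keep everything
    # up to the first "/" that follows the host.
    domain_name = sample.split("//")[-1].split("/")[0]
    return domain_name


sample_data = ['https://www.google.com/', 'https://www.youtube.com/']

# Apply the extractor to each raw sample.
domains = [extract_domain_name(url) for url in sample_data]
print(domains)  # ['www.google.com', 'www.youtube.com']

# Assumption: a "." delimiter breaks each domain into its labels.
# This is plain str.split, not the library's Normalizer.
labels = [domain.split(".") for domain in domains]
print(labels)  # [['www', 'google', 'com'], ['www', 'youtube', 'com']]
```

Because the extractor relies on simple string splitting, a port or user-info component (for example "user@host:8080") stays in the returned string. If that matters, urllib.parse.urlsplit from the standard library is one alternative.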