Artifician
Defining Custom Extractors

Introduction

Extractors are user-defined functions that pull a feature out of a raw data sample. They are the starting point for feature extraction in Artifician. This page explains how to write your own custom extractors and how to attach them to a FeatureDefinition.

Why Custom Extractors?

Your data may contain features that the pre-defined extractors do not cover. In those cases, you can write a custom extractor that returns exactly the value you need.

How Extractors Work

An extractor takes a single sample of raw data and returns the feature extracted from it. The returned value is then passed on to the FeatureDefinition for further processing.

Example of a Simple Extractor

Here is an example that extracts domain names from URLs.

def extract_domain_name(sample):
    # Drop the scheme (e.g. "https://"), then keep everything before the first "/"
    domain_name = sample.split("//")[-1].split("/")[0]
    return domain_name
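
To see what the extractor returns, it can be called directly on a few sample URLs, with no Artifician objects involved. The sketch below repeats the function so that it runs on its own and compares it with `extract_hostname`, a hypothetical alternative (not part of Artifician) built on the standard library's `urllib.parse`. The sample URLs are made up for illustration.

```python
from urllib.parse import urlsplit


def extract_domain_name(sample):
    # Same logic as the extractor above: drop the scheme, keep the host part.
    return sample.split("//")[-1].split("/")[0]


def extract_hostname(sample):
    """Hypothetical alternative: let the standard library parse the URL."""
    # urlsplit only recognises the host when the URL has a scheme or starts with "//".
    url = sample if "//" in sample else "//" + sample
    # hostname is lower-cased and excludes any port or credentials.
    return urlsplit(url).hostname or ""


samples = [
    "https://www.example.com/docs/index.html",
    "http://example.org",
    "example.net/path",
]

for sample in samples:
    print(extract_domain_name(sample), extract_hostname(sample))
```

Both functions agree on these samples. They differ on URLs that carry a port or credentials, such as `https://user@Example.com:8080/x`, where the string-splitting version returns `user@Example.com:8080` and the `urlsplit` version returns `example.com`. Which behaviour is right depends on the feature you want.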

Integrating Custom Extractors

To use your custom extractor, pass it as an argument when initializing a FeatureDefinition object.

url_domain = FeatureDefinition(extract_domain_name)

Advanced Usage

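If your feature extraction involves multiple steps or needs configuration, you can write a class-based extractor instead. The instance stores the parameters, and its extract method is passed to FeatureDefinition in place of a plain function, so the same logic can be reused with different settings.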

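class AdvancedExtractor:
    def __init__(self, param1, param2):
        # Configuration shared by every call to extract()
        self.param1 = param1
        self.param2 = param2

    def extract(self, sample):
        # Complex extraction logic goes here; this placeholder returns the sample unchanged
        extracted_feature = sample
        return extracted_feature

# param1_value and param2_value stand for whatever configuration your extractor needs
advanced_extractor = AdvancedExtractor(param1_value, param2_value)
advanced_feature = FeatureDefinition(advanced_extractor.extract)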
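
As a more concrete illustration of the class-based pattern, here is a minimal sketch of a configurable extractor. `PathSegmentExtractor`, its `index` and `default` parameters, and the sample URLs are all hypothetical and not part of Artifician. The sketch only shows that a bound `extract` method behaves like a plain one-argument function, which is the form handed to FeatureDefinition above.

```python
from urllib.parse import urlsplit


class PathSegmentExtractor:
    """Hypothetical extractor: returns one segment of a URL path."""

    def __init__(self, index, default=""):
        self.index = index      # which path segment to return (0-based)
        self.default = default  # value returned when the segment is missing

    def extract(self, sample):
        # Split "/docs/api/index.html" into ["docs", "api", "index.html"].
        segments = [part for part in urlsplit(sample).path.split("/") if part]
        if 0 <= self.index < len(segments):
            return segments[self.index]
        return self.default


first_segment = PathSegmentExtractor(index=0, default="<none>")
extract = first_segment.extract  # a bound method is a callable taking one sample

print(extract("https://example.com/docs/api/index.html"))  # docs
print(extract("https://example.com"))                      # <none>
```

Because the configuration lives on the instance, a second extractor such as `PathSegmentExtractor(index=1)` can reuse the same logic for a different feature without touching the class.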

Conclusion

Custom extractors let you handle feature extraction that is specific to your data. Whether written as a function or as a method on a class, a custom extractor is passed to a FeatureDefinition in the same way, so it works with the other Artifician components without further changes.


Last updated 1 year ago
