Generating a data set

Empirically assessing the accuracy and computation time of the labelling algorithms requires a data set. The repository provides several options for generating one.

Generating an argumentation system

Instead of manually designing an argumentation system and loading it with an ArgumentationSystemXLSXReader, we also provide functionality to generate one automatically. The repository currently holds two types of generators: a random and a layered argumentation system generator.

Random argumentation system generator

class RandomArgumentationSystemGenerator(argumentation_system_generation_parameters)

Bases: modules.dataset_generator.argumentation_system_generator.argumentation_system_generator_interface.ArgumentationSystemGeneratorInterface

generate()

Randomly generate a new ArgumentationSystem based on the RandomArgumentationSystemGeneratorParameters.

Return type: ArgumentationSystem
Returns: The generated ArgumentationSystem.

class RandomArgumentationSystemGeneratorParameters(language_size, rule_size, rule_antecedent_distribution, queryable_size=None, queryable_ratio=None, allow_rules_for_queryables=True, allow_conclusion_in_antecedents=True, allow_inconsistent_antecedents=True)

Bases: object

__init__(language_size, rule_size, rule_antecedent_distribution, queryable_size=None, queryable_ratio=None, allow_rules_for_queryables=True, allow_conclusion_in_antecedents=True, allow_inconsistent_antecedents=True)

Parameters for randomly generating an ArgumentationSystem.

Parameters

language_size (int) – Number of Literals (including negations)

rule_size (Optional[int]) – Number of Rules

rule_antecedent_distribution (Dict[int, float]) – For each number of antecedents (key), the number of Rules with that many antecedents (value).

queryable_size (Optional[int]) – Number of Queryables.

queryable_ratio (Optional[float]) – Number of Queryables as a fraction of the number of Literals.

allow_rules_for_queryables (bool) – Boolean indicating if there can be Rules for Queryables.

allow_conclusion_in_antecedents (bool) – Boolean indicating if a Rule can have its conclusion as an antecedent.

allow_inconsistent_antecedents (bool) – Boolean indicating if a Rule can have inconsistent antecedents.
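
As a rough illustration of how queryable_size and queryable_ratio relate to language_size, the sketch below resolves a Queryable count from either of them. The helper name resolve_queryable_size, the rounding rule, and the choice to require exactly one of the two arguments are assumptions made for this example; they are not taken from the repository.

```python
from typing import Optional


def resolve_queryable_size(language_size: int,
                           queryable_size: Optional[int] = None,
                           queryable_ratio: Optional[float] = None) -> int:
    """Hypothetical helper: derive the number of Queryables.

    Assumption: exactly one of queryable_size and queryable_ratio is given,
    and a ratio is rounded to the nearest integer number of Literals.
    """
    if (queryable_size is None) == (queryable_ratio is None):
        raise ValueError("Give exactly one of queryable_size and queryable_ratio.")
    if queryable_size is None:
        queryable_size = round(queryable_ratio * language_size)
    if not 0 <= queryable_size <= language_size:
        raise ValueError("The number of Queryables must lie between 0 and language_size.")
    return queryable_size


print(resolve_queryable_size(language_size=20, queryable_ratio=0.5))  # 10
print(resolve_queryable_size(language_size=20, queryable_size=6))     # 6
```

The actual parameter class may resolve the two options differently, for example by giving one of them precedence.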

Layered argumentation system generator

class LayeredArgumentationSystemGenerator(argumentation_system_generation_parameters)

Bases: modules.dataset_generator.argumentation_system_generator.argumentation_system_generator_interface.ArgumentationSystemGeneratorInterface

generate()

Generate an ArgumentationSystem with a layered structure according to the LayeredArgumentationSystemGeneratorParameters.

Return type: ArgumentationSystem
Returns: The generated ArgumentationSystem.

class LayeredArgumentationSystemGeneratorParameters(language_size, rule_size, rule_antecedent_distribution, literal_layer_distribution)

Bases: object

__init__(language_size, rule_size, rule_antecedent_distribution, literal_layer_distribution)

Parameters for randomly generating an ArgumentationSystem with a layered structure.

Parameters

language_size (int) – The number of Literals (including negations).

rule_size (int) – The number of Rules.

rule_antecedent_distribution (Dict[int, int]) – For each number of antecedents (key), the number of Rules with that many antecedents (value).

literal_layer_distribution (Dict[int, int]) – For each layer (key), the number of Literals in that layer (value).
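
The two distribution dictionaries are easiest to read next to the totals they describe. The sketch below is a hypothetical sanity check, not part of the repository: it assumes that the values of rule_antecedent_distribution add up to rule_size and that the values of literal_layer_distribution add up to language_size. The function name and the example values are made up for illustration.

```python
from typing import Dict


def check_layered_parameters(language_size: int,
                             rule_size: int,
                             rule_antecedent_distribution: Dict[int, int],
                             literal_layer_distribution: Dict[int, int]) -> None:
    """Hypothetical sanity check; the real parameter class may validate differently."""
    # Assumption: the Rule counts per number of antecedents add up to rule_size.
    if sum(rule_antecedent_distribution.values()) != rule_size:
        raise ValueError("rule_antecedent_distribution does not sum to rule_size.")
    # Assumption: the Literal counts per layer add up to language_size.
    if sum(literal_layer_distribution.values()) != language_size:
        raise ValueError("literal_layer_distribution does not sum to language_size.")


# Illustrative values: 10 Literals spread over three layers,
# and 6 Rules of which 4 have one antecedent and 2 have two antecedents.
check_layered_parameters(
    language_size=10,
    rule_size=6,
    rule_antecedent_distribution={1: 4, 2: 2},
    literal_layer_distribution={0: 5, 1: 3, 2: 2},
)
print("parameters are consistent")
```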

Computing properties for an ArgumentationTheory or ArgumentationSystem

compute_argumentation_theory_properties(argumentation_theory, verbose=False)

Compute some properties of the given ArgumentationTheory, such as the corresponding incomplete argumentation framework or the number of future ArgumentationTheories.

Parameters

argumentation_theory (ArgumentationTheory) – ArgumentationTheory for which properties are needed.
verbose (bool) – Boolean indicating if information should be printed.

Return type

ArgumentationTheoryProperties

Returns

ArgumentationTheoryProperties of the ArgumentationTheory.

enumerate_future_argumentation_theories(argumentation_theory, verbose=False)

Enumerate all future ArgumentationTheories of this ArgumentationTheory.

Parameters

argumentation_theory (ArgumentationTheory) – ArgumentationTheory for which future ArgumentationTheories should be enumerated.
verbose (bool) – Boolean indicating if information should be printed.

Return type

List[ArgumentationTheory]

Returns

All future ArgumentationTheories of this ArgumentationTheory.
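
To give a feel for how quickly such an enumeration can grow, here is a toy sketch that does not use the repository's classes. It assumes that a future ArgumentationTheory keeps the current knowledge base and may add further Queryables, as long as the knowledge base never contains a literal together with its negation. That reading, the string representation with a leading "-" for negation, and the function name are assumptions for illustration only.

```python
from itertools import product
from typing import FrozenSet, List, Set


def enumerate_future_knowledge_bases(queryables: Set[str],
                                     knowledge_base: Set[str]) -> List[FrozenSet[str]]:
    """Toy enumeration of consistent supersets of knowledge_base (assumed semantics)."""
    # Atoms that already have a literal in the knowledge base are settled.
    decided_atoms = {literal.lstrip("-") for literal in knowledge_base}
    open_atoms = sorted({q.lstrip("-") for q in queryables} - decided_atoms)

    # For every open atom: add nothing, the atom, or its negation (if queryable).
    choices = []
    for atom in open_atoms:
        options = [None] + [lit for lit in (atom, "-" + atom) if lit in queryables]
        choices.append(options)

    futures = []
    for combination in product(*choices):
        additions = {lit for lit in combination if lit is not None}
        futures.append(frozenset(knowledge_base | additions))
    return futures


queryables = {"a", "-a", "b", "-b"}
print(len(enumerate_future_knowledge_bases(queryables, set())))  # 9
print(len(enumerate_future_knowledge_bases(queryables, {"a"})))  # 3
```

Under these assumptions every undecided pair of a literal and its negation triples the number of futures, so the list grows exponentially with the number of open Queryables.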

compute_argumentation_system_properties(argumentation_system)

Compute some properties of the given ArgumentationSystem, such as the number of literals or rule antecedents.

Parameters: argumentation_system (ArgumentationSystem) – ArgumentationSystem for which properties are needed.
Return type: ArgumentationSystemProperties
Returns: ArgumentationSystemProperties of the ArgumentationSystem.

Generating a Dataset for a specific ArgumentationTheory

class DatasetGenerator(argumentation_system, argumentation_system_custom_name=None)

Bases: object

classmethod from_file(argumentation_system_file_name)

Generate a Dataset for an ArgumentationSystem that must still be read from a file.

Parameters: argumentation_system_file_name (str) – Name of the file containing the ArgumentationSystem for which a Dataset should be generated.
Returns: Dataset for specified ArgumentationSystem.

generate_dataset(custom_dataset_name=None, include_ground_truth=True, verbose=True)

Generate a Dataset containing all possible ArgumentationTheories for the given ArgumentationSystem. Note: for ArgumentationSystems with many Queryables, this can take a long time.

Parameters

custom_dataset_name (Optional[str]) – Optional, name of the Dataset. Otherwise a name based on the timestamp is chosen.
include_ground_truth (bool) – Boolean indicating if the ground truth should be computed. Note: this takes time!
verbose (bool) – Boolean indicating if information should be printed.

Return type

Dataset

Returns

The resulting Dataset.
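
A call to generate_dataset might be wrapped as in the sketch below. It only uses the constructor and method signatures listed in this section. The DatasetGenerator class is passed in as an argument because its import path is not stated here; the wrapper name generate_full_dataset and its defaults are assumptions for illustration.

```python
def generate_full_dataset(dataset_generator_cls, argumentation_system,
                          dataset_name=None, include_ground_truth=False):
    """Sketch: build a DatasetGenerator and generate every ArgumentationTheory.

    dataset_generator_cls is expected to be the DatasetGenerator class documented
    above; it is a parameter because the import path is not given in this section.
    """
    generator = dataset_generator_cls(argumentation_system)
    return generator.generate_dataset(
        custom_dataset_name=dataset_name,
        # The ground truth is skipped by default here because computing it takes time.
        include_ground_truth=include_ground_truth,
        verbose=False,
    )
```

Passing dataset_name=None leaves the naming to the generator, which then picks a name based on the timestamp.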

generate_dataset_sample(custom_dataset_name=None, include_ground_truth=True, sample_size=1000, verbose=True)

Generate a Dataset that contains sample_size DatasetItems for each possible number of items in the knowledge base. For example, if there are 4 Queryables in the ArgumentationSystem, then a knowledge base can contain either 0, 1, or 2 (=4/2) items. If, for example, sample_size = 10, then 10 ArgumentationTheories are generated for each knowledge base size, so the total number of DatasetItems is 30.

Parameters

custom_dataset_name (Optional[str]) – Optional, name of the Dataset. Otherwise a name based on the timestamp is chosen.
include_ground_truth (bool) – Boolean indicating if the ground truth should be computed. Note: this takes time!
sample_size (int) – Number of DatasetItems for each number of items in the knowledge base.
verbose (bool) – Boolean indicating if information should be printed.

Return type

Dataset

Returns

The resulting Dataset.
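
The arithmetic in the example for generate_dataset_sample can be written out as a small sketch. The function name is hypothetical, and the use of floor division for an odd number of Queryables is an assumption; the description only covers the even case.

```python
def expected_sample_item_count(number_of_queryables: int, sample_size: int) -> int:
    """Hypothetical helper mirroring the worked example in the description."""
    # Knowledge base sizes run from 0 up to half the number of Queryables.
    knowledge_base_sizes = range(number_of_queryables // 2 + 1)
    # sample_size ArgumentationTheories are generated for each of those sizes.
    return sample_size * len(knowledge_base_sizes)


print(expected_sample_item_count(number_of_queryables=4, sample_size=10))  # 30
```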

Dataset classes

class Dataset(name, argumentation_system_name, dataset_items)

Bases: object

__init__(name, argumentation_system_name, dataset_items)

A Dataset has a name, the name of its ArgumentationSystem and a list of DatasetItems.

Parameters

name (str) – Name of the Dataset.

argumentation_system_name (str) – Name of the ArgumentationSystem on which the Dataset is based.

dataset_items (List[DatasetItem]) – Items in the Dataset.

class DatasetItem(argumentation_system, argumentation_system_name, knowledge_base)

Bases: object

__init__(argumentation_system, argumentation_system_name, knowledge_base)

A DatasetItem has an ArgumentationSystem, its name and a knowledge base.

Parameters

argumentation_system (ArgumentationSystem) – The ArgumentationSystem on which the DatasetItem is based.

argumentation_system_name (str) – The name of the ArgumentationSystem.

knowledge_base (List[Queryable]) – The knowledge base (list of Queryables).

classmethod from_str(dataset_item_str)

Read the DatasetItem from a string.

Parameters: dataset_item_str (str) – String representation of the DatasetItem.
Returns: DatasetItem represented by the input string.

class AnnotatedDatasetItem(argumentation_system, argumentation_system_name, knowledge_base, topic_literal, gt_acceptability_label, gt_stability_label)

Bases: stability_label_algorithm.modules.dataset_generator.dataset_item.DatasetItem

An AnnotatedDatasetItem is a specific type of DatasetItem that also has a ground truth acceptability and stability label for a topic literal.

__init__(argumentation_system, argumentation_system_name, knowledge_base, topic_literal, gt_acceptability_label, gt_stability_label)

Create an AnnotatedDatasetItem.

Parameters

argumentation_system (ArgumentationSystem) – The ArgumentationSystem on which the DatasetItem is based.

argumentation_system_name (str) – The name of the ArgumentationSystem.

knowledge_base (List[Queryable]) – The knowledge base (list of Queryables).

topic_literal (Literal) – The Literal for which the ground truth is given.

gt_acceptability_label (StabilityLabel) – Ground truth acceptability label for the topic Literal.

gt_stability_label (StabilityLabel) – Ground truth stability label for the topic Literal.

classmethod from_str(dataset_item_str)

Read the AnnotatedDatasetItem from a string.

Parameters: dataset_item_str (str) – String representation of the AnnotatedDatasetItem.
Returns: AnnotatedDatasetItem represented by the input string.