qhana_plugin_runner.plugin_utils.entity_marshalling module

Module containing helpers to marshal and unmarshal entities to and from CSV or JSON files.

class qhana_plugin_runner.plugin_utils.entity_marshalling.ArrayEntity(ID: str, href: str | None, values: Sequence[int | float | None])

Bases: NamedTuple

An entity containing array data in a values attribute.

ID: str

Alias for field number 0

href: str | None

Alias for field number 1

values: Sequence[int | float | None]

Alias for field number 2
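
As a minimal sketch of how the three fields behave, the following defines a local stand-in with the same field names and types as documented above. It is an illustration only and does not import the library class.

```python
from typing import NamedTuple, Optional, Sequence, Union


class ArrayEntity(NamedTuple):
    """Local stand-in mirroring the documented fields (not the library class)."""

    ID: str
    href: Optional[str]
    values: Sequence[Union[int, float, None]]


entity = ArrayEntity(ID="e1", href=None, values=[1, 2.5, None])

# Fields are reachable by name or through the positional aliases 0, 1 and 2.
assert entity.ID == entity[0]
assert entity.values == entity[2]

# Tuple unpacking follows the documented field order.
entity_id, href, values = entity
```

Because the class is a NamedTuple, instances are immutable and can be unpacked like any other tuple.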

class qhana_plugin_runner.plugin_utils.entity_marshalling.DefaultDialect

Bases: Dialect

The default csv dialect used to serialize entities.

Quotes all fields to correctly escape special unicode characters.

Registered under the name "default".

delimiter = ','
doublequote = True
escapechar = None
lineterminator = '\r\n'
quotechar = '"'
quoting = 1
skipinitialspace = False
strict = False
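
The attribute values above can be tried out with the standard library alone. The sketch below copies them into a hypothetical dialect class (SketchDialect, registered under the made-up name "sketch-default") and writes one row containing quotes, a comma and a non-ASCII character. It is not the library's own class.

```python
import csv
import io


class SketchDialect(csv.Dialect):
    """Hypothetical dialect copying the attribute values listed above."""

    delimiter = ","
    doublequote = True
    escapechar = None
    lineterminator = "\r\n"
    quotechar = '"'
    quoting = csv.QUOTE_ALL  # csv.QUOTE_ALL has the integer value 1
    skipinitialspace = False
    strict = False


# Registered under a made-up name so it cannot clash with "default".
csv.register_dialect("sketch-default", SketchDialect)

buffer = io.StringIO(newline="")
writer = csv.writer(buffer, dialect="sketch-default")
writer.writerow(["ID", "href", "label"])
writer.writerow(["e1", "", 'say "hi", ü'])
csv_text = buffer.getvalue()

# Reading the text back with the same dialect restores the original values.
rows = list(csv.reader(io.StringIO(csv_text, newline=""), dialect="sketch-default"))
```

With quoting = 1 every field is wrapped in quotes, including empty ones, and embedded quote characters are doubled because doublequote is True.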

class qhana_plugin_runner.plugin_utils.entity_marshalling.EntityTupleMixin(*args, **kwargs)

Bases: object

A mixin class to provide entity metadata (e.g. attribute names) and some helper functions to a namedtuple class.

Use the helper method get_entity_tuple_class to create a new entity tuple class.

as_dict() Dict[str, Any]

Convert the entity tuple to a dict.

entity_attributes: ClassVar[Sequence[str]]

The list of attribute names.

classmethod from_dict(**kwargs)

Create an entity tuple from key=value mapping (using keyword arguments).

classmethod from_iter(iterable: Sequence)

Create an entity tuple from any iterable.

get(key: str | int, default=None)

class qhana_plugin_runner.plugin_utils.entity_marshalling.ResponseLike(*args, **kwargs)

Bases: Protocol

A protocol of the minimal interface of Response that can be used to read responses.

iter_lines(chunk_size: int = 512, decode_unicode: bool = False, delimiter: str | bytes | None = None) Iterator[Any]
json(*, cls: Type[JSONDecoder] | None = None, object_hook: Callable[[Dict[Any, Any]], Any] | None = None, parse_float: Callable[[str], Any] | None = None, parse_int: Callable[[str], Any] | None = None, parse_constant: Callable[[str], Any] | None = None, object_pairs_hook: Callable[[List[Tuple[Any, Any]]], Any] | None = None, **kwds: Any) Any
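
Since the protocol only requires iter_lines() and json(), a small in-memory object can stand in for a real HTTP response, for example in tests. The class below (InMemoryResponse) is hypothetical and only a sketch. It ignores chunk_size and delimiter, and it assumes that lines are returned as bytes unless decode_unicode is set.

```python
import json
from typing import Any, Iterator


class InMemoryResponse:
    """Hypothetical test double exposing the two documented methods."""

    def __init__(self, text: str) -> None:
        self._text = text

    def iter_lines(
        self, chunk_size: int = 512, decode_unicode: bool = False, delimiter=None
    ) -> Iterator[Any]:
        # chunk_size and delimiter are ignored: the whole body is in memory.
        for line in self._text.splitlines():
            yield line if decode_unicode else line.encode("utf-8")

    def json(self, **kwds: Any) -> Any:
        # Keyword arguments such as object_hook are passed on to json.loads.
        return json.loads(self._text, **kwds)


response = InMemoryResponse('[{"ID": "e1", "href": null}]')
entities = response.json()
text_lines = list(response.iter_lines(decode_unicode=True))
byte_lines = list(response.iter_lines())
```

An object like this matches the protocol structurally, so no inheritance from ResponseLike is needed.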

qhana_plugin_runner.plugin_utils.entity_marshalling.array_to_entity(items: Iterable[ArrayEntity], prefix: str = 'dim', suffix: str = '', tuple_: Callable[[Iterable[Any]], T | NamedTuple] | None = None) Generator[NamedTuple | T, None, None]

Convert entities from array entities to standard tuple based entity.

If tuple_ is not set, this method creates a new NamedTuple class with attributes shaped like this: ["ID", "href", f"{prefix}{index}{suffix}", "dim01", "dim02", …]

Parameters:
  • items (Iterable[ArrayEntity]) – the input array entity stream/iterable

  • prefix (str; default "dim") – the prefix for the array attribute column names

  • suffix (str; default "") – the suffix for the array attribute column names

  • tuple_ (Callable[[Iterable], Tuple|NamedTuple], optional) – if set, this function is used to build the result tuples

Yields:

Generator[NamedTuple | T, None, None] – the output iterable

qhana_plugin_runner.plugin_utils.entity_marshalling.ensure_array(items: Iterable[Dict[str, Any] | NamedTuple], strict: bool = False) Generator[ArrayEntity, None, None]

Convert entities from an “entity/vector” or “entity/numeric” format into array entities.

This method tries to convert all string values to numbers. Missing values (None) are left as is by default. String values that cannot be converted to numbers will instead become missing values None.

With strict behaviour, missing values will result in exceptions (ValueError).

Parameters:
  • items (Iterable[Dict[str, Any]|NamedTuple]) – the input entity stream/iterable

  • strict (bool, optional) – if True any value that cannot be converted to a number raises an exception. Defaults to False.

Yields:

Generator[ArrayEntity, None, None] – the output iterable
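
The conversion rule described above can be sketched for a single value. The helper below (to_number) is hypothetical and is not the library's implementation. It assumes that integers are tried before floats and that, in strict mode, both missing and unconvertible values raise ValueError.

```python
from typing import Any, Optional, Union


def to_number(value: Any, strict: bool = False) -> Optional[Union[int, float]]:
    """Hypothetical helper illustrating the documented conversion rule."""
    if value is None:
        if strict:
            raise ValueError("missing value in strict mode")
        return None  # missing values are left as is by default
    if isinstance(value, (int, float)):
        return value  # already numeric, nothing to convert
    text = str(value).strip()
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            continue
    if strict:
        raise ValueError(f"cannot convert {value!r} to a number")
    return None  # unconvertible strings become missing values


converted = [to_number(v) for v in ["1", "2.5", "abc", None, 3]]
```

In the non-strict case the row keeps its length, with None marking every position that could not be read as a number.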

qhana_plugin_runner.plugin_utils.entity_marshalling.ensure_dict(items: Iterable[Dict[str, Any] | NamedTuple]) Generator[Dict[str, Any], None, None]

Ensure that all entities in an iterable are dicts.

Parameters:

items (Iterable[Union[Dict[str, Any], NamedTuple]]) – the input iterable

Yields:

Generator[Dict[str, Any], None, None] – the output iterable

qhana_plugin_runner.plugin_utils.entity_marshalling.ensure_tuple(items: Iterable[Dict[str, Any] | NamedTuple], tuple_: Callable[[...], T]) Generator[NamedTuple | T, None, None]

Ensure that all entities in an iterable are namedtuples.

Parameters:
  • items (Iterable[Union[Dict[str, Any], NamedTuple]]) – the input iterable

  • tuple_ (Callable[..., T]) – The namedtuple class to construct tuples from dicts with

Yields:

Generator[NamedTuple, None, None] – the output iterable
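
The two normalisation generators can be pictured with the following sketch for a mixed stream of dicts and namedtuples. The functions to_dicts and to_tuples are hypothetical counterparts written for illustration. In particular, calling tuple_ with keyword arguments is an assumption about how dicts are turned into tuples.

```python
from collections import namedtuple
from typing import Any, Callable, Dict, Iterable, Iterator

Entity = namedtuple("Entity", ["ID", "href", "size"])


def to_dicts(items: Iterable[Any]) -> Iterator[Dict[str, Any]]:
    """Hypothetical counterpart to ensure_dict()."""
    for item in items:
        yield item if isinstance(item, dict) else item._asdict()


def to_tuples(items: Iterable[Any], tuple_: Callable[..., Any]) -> Iterator[Any]:
    """Hypothetical counterpart to ensure_tuple()."""
    for item in items:
        # Assumption: tuple_ accepts the dict entries as keyword arguments.
        yield tuple_(**item) if isinstance(item, dict) else item


mixed = [Entity("e1", None, 3), {"ID": "e2", "href": None, "size": 5}]
as_dicts = list(to_dicts(mixed))
as_tuples = list(to_tuples(mixed, Entity))
```

Both helpers are generators, so they can be chained onto a stream of entities without loading it into memory first.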

qhana_plugin_runner.plugin_utils.entity_marshalling.entity_attribute_sort_key(attribute_name: str)

A sort key function that can be used to sort keys from a dictionary before passing them to save_entities() or creating a NamedTuple entity class.
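
A sort key of this kind might look like the sketch below. The function sketch_sort_key is hypothetical. Placing "ID" first and "href" second follows the attribute order expected for CSV output, while sorting the remaining names alphabetically is an assumption.

```python
from typing import Tuple


def sketch_sort_key(attribute_name: str) -> Tuple[int, str]:
    """Hypothetical key: "ID" first, "href" second, the rest alphabetically."""
    priority = {"ID": 0, "href": 1}
    return (priority.get(attribute_name, 2), attribute_name)


entity = {"size": 3, "href": None, "color": "red", "ID": "e1"}
attributes = sorted(entity.keys(), key=sketch_sort_key)
```

The resulting list can then serve as the attribute order for a CSV header or for a NamedTuple entity class.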

qhana_plugin_runner.plugin_utils.entity_marshalling.get_entity_tuple_class(attributes: Sequence[str], name: str = 'Entity') Type[NamedTuple]

Get an entity tuple class.

Caches the classes based on attributes and provided class name.

Parameters:
  • attributes (Sequence[str]) – the list of entity attributes

  • name (str, optional) – the name to use for creating the new type. Defaults to “Entity”.

Returns:

the created entity tuple class

Return type:

Type[NamedTuple]
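
Caching on the attributes and the class name can be sketched with functools.lru_cache. The factory below (sketch_entity_class) is hypothetical and builds plain namedtuples. It does not add the helper methods of EntityTupleMixin.

```python
from collections import namedtuple
from functools import lru_cache
from typing import Sequence, Tuple


@lru_cache(maxsize=None)
def _cached_class(attributes: Tuple[str, ...], name: str):
    # One class is created per distinct (attributes, name) pair.
    return namedtuple(name, attributes)


def sketch_entity_class(attributes: Sequence[str], name: str = "Entity"):
    """Hypothetical cached factory keyed on (attributes, name)."""
    # Lists are unhashable, so normalise to a tuple before the cache lookup.
    return _cached_class(tuple(attributes), name)


first = sketch_entity_class(["ID", "href", "size"])
second = sketch_entity_class(("ID", "href", "size"))
other = sketch_entity_class(["ID", "href", "size"], name="Sample")
row = first("e1", None, 3)
```

Returning the same class object for repeated requests keeps type comparisons stable across calls with the same attributes and name.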

qhana_plugin_runner.plugin_utils.entity_marshalling.iskeyword()

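x.__contains__(y) <==> y in x.

This entry appears to be the standard library function keyword.iskeyword, imported into this module and picked up by the documentation generator. The docstring shown is that of the underlying __contains__ method.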

qhana_plugin_runner.plugin_utils.entity_marshalling.load_entities(file_: ResponseLike, mimetype: str, csv_dialect: str = 'default', tuple_: Callable[[Iterable[Any]], T] | None = None, process_csv_header: Callable[[Sequence[str]], Sequence[str]] | None = None) Generator[Dict[str, Any] | T, None, None]

Load entities from a Response like object.

Attributes of entities are either deserialized as json or as strings (csv).

If the mimetype is “text/csv”, this method yields namedtuples; for the JSON mimetypes it yields dicts. Use the generator functions ensure_dict() and ensure_tuple() to convert all items in the result stream to dicts or tuples.

For the namedtuples to work, all CSV column names must be valid Python identifiers, so they are normalized with normalize_attribute_name(). This behaviour can be overridden with the process_csv_header callback. If the callback is set, the header names are not normalized with normalize_attribute_name()!

Parameters:
  • file_ (ResponseLike) – the object to load the entities from

  • mimetype (str) – the mime type to use for deserialization (supported mimetypes: “application/json”, “application/X-lines+json” and “text/csv”)

  • csv_dialect (str, optional) – the csv dialect to use (only used with csv mimetype). Defaults to “default”.

  • tuple_ (Optional[Type[NamedTuple]], optional) – the namedtuple class to use (only used with csv mimetype). Defaults to None.

  • process_csv_header (Optional[Callable[[Sequence[str]], Sequence[str]]]) – a callback used to process the csv header. Defaults to None.

Raises:

ValueError – For unknown mimetypes

Yields:

Generator[Union[Dict[str, Any], NamedTuple], None, None] – a stream of deserialized entities (dicts for json and tuples for csv)
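
The CSV behaviour described above can be illustrated with the standard library alone. This is only a sketch of the idea and not the library's code path: the header row becomes the field names of a namedtuple, every cell stays a string, and a column name that is not a valid Python identifier is rejected when the class is created.

```python
import csv
from collections import namedtuple

lines = ['"ID","href","size"', '"e1","","3"']

reader = csv.reader(lines)
header = next(reader)

# The header row supplies the attribute names of the tuple class.
Row = namedtuple("Row", header)
rows = [Row._make(cells) for cells in reader]

# A header containing a space is not a valid identifier and is rejected.
try:
    namedtuple("Broken", ["entity id"])
    header_rejected = False
except ValueError:
    header_rejected = True
```

This is why header names need to be normalized, or handled by a custom callback, before a tuple class can be built from them.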

qhana_plugin_runner.plugin_utils.entity_marshalling.save_entities(entities: Iterable[Dict[str, Any] | NamedTuple], file_: TextIO, mimetype: str, attributes: Sequence[str] | None = None, csv_dialect: str = 'default', tuple_: Callable[[...], Tuple] | None = None)

Write entities to a file.

CSV files require an attribute order that can be specified by the attributes parameter. The first and second attribute should be "ID" and "href". The function entity_attribute_sort_key() can be used to achieve that order.

Parameters:
  • entities (Iterable[Union[Dict[str, Any], NamedTuple]]) – an iterable of entities as returned by load_entities()

  • file_ (TextIO) – the file to write the entities into

  • mimetype (str) – the mime type to use for serialization (supported mimetypes: “application/json”, “application/X-lines+json” and “text/csv”)

  • attributes (Optional[Sequence[str]], optional) – A list of attributes in the order they should appear in the csv file. MUST be valid python identifiers! All entities must have all attributes specified here! Defaults to None.

  • csv_dialect (str, optional) – the csv dialect to use. Defaults to “default”.

  • tuple_ (Optional[Type[NamedTuple]], optional) – the namedtuple class to use (only used with csv mimetype, passed to ensure_tuple). Defaults to None.

Raises:
  • ValueError – if mimetype=="text/csv" and attributes is None

  • ValueError – For unknown mimetypes