Universe

Get and find functions

esgvoc.api.universe.find_data_descriptors_in_universe(expression: str, only_id: bool = False, limit: int | None = None, offset: int | None = None) → list[tuple[str, dict]]

Find data descriptors in the universe based on a full-text search defined by the given expression. The expression uses the query syntax of the SQLite FTS extension and corresponds to the expression of the MATCH operator. It can be composed of one or more keywords combined with boolean operators (NOT, AND, ^, etc.; the default is OR). Keywords can define prefixes or postfixes with the wildcard *. The function returns a list of data descriptor ids and contexts, sorted according to the bm25 ranking metric (list index 0 has the highest rank). If the provided expression does not match any data descriptor, the function returns an empty list. By default, the function searches for the expression in the data descriptor specifications. If only_id is True (default is False), the search is restricted to the ids of the data descriptors. At the moment, only_id is set to True, because the data descriptors do not yet have any description.

Parameters:
  • expression (str) – The full text search expression.

  • only_id (bool) – Performs the search only on ids, otherwise on all the specifications.

  • limit (int | None) – Limits the number of items returned. Returns all items found if limit is None, zero, or negative.

  • offset (int | None) – Skips offset number of items found. Ignored if offset is either None, zero or negative.

Returns:

A list of data descriptor ids and contexts. Returns an empty list if no matches are found.

Return type:

list[tuple[str, dict]]

Raises:

EsgvocValueError – If the expression cannot be interpreted.
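The search behaviour described above (a MATCH expression, boolean operators, the * wildcard, bm25 ordering) can be illustrated with Python's standard sqlite3 module. This is only a minimal sketch, not the esgvoc implementation: the table name docs, its columns, and the rows are all hypothetical, and it assumes the local SQLite build includes the FTS5 extension.

```python
import sqlite3

# Hypothetical rows (id, description); not taken from the real universe.
ROWS = [
    ("atmos", "atmosphere realm"),
    ("ocean", "ocean realm"),
    ("land", "land surface realm"),
]


def build_index(rows):
    """Create an in-memory FTS5 table (assumed available) and index the rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE docs USING fts5(id, description)")
    conn.executemany("INSERT INTO docs (id, description) VALUES (?, ?)", rows)
    return conn


def search_ids(conn, expression):
    """Return the ids of the rows matching the expression, best bm25 rank first."""
    cursor = conn.execute(
        "SELECT id FROM docs WHERE docs MATCH ? ORDER BY bm25(docs)",
        (expression,),
    )
    return [row[0] for row in cursor]


conn = build_index(ROWS)
print(search_ids(conn, "atmos*"))           # prefix query with the * wildcard
print(search_ids(conn, "realm NOT ocean"))  # keywords combined with NOT
print(search_ids(conn, "cryosphere"))       # no hit: an empty list
```

The expression is passed as a bound parameter, so it is interpreted only by the MATCH operator and never spliced into the SQL text.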

esgvoc.api.universe.find_items_in_universe(expression: str, only_id: bool = False, limit: int | None = None, offset: int | None = None) → list[Item]

Find items (at the moment, terms and data descriptors) in the universe based on a full-text search defined by the given expression. The expression uses the query syntax of the SQLite FTS extension and corresponds to the expression of the MATCH operator. It can be composed of one or more keywords combined with boolean operators (NOT, AND, ^, etc.; the default is OR). Keywords can define prefixes or postfixes with the wildcard *. The function returns a list of item instances sorted according to the bm25 ranking metric (list index 0 has the highest rank). If the provided expression does not match any item, the function returns an empty list. By default, the function searches for the expression in the term and data descriptor specifications. If only_id is True (default is False), the search is restricted to the ids of the terms and data descriptors. At the moment, only_id is set to True for the data descriptors, because they do not yet have any description.

Parameters:
  • expression (str) – The full text search expression.

  • only_id (bool) – Performs the search only on ids, otherwise on all the specifications.

  • limit (int | None) – Limits the number of items returned. Returns all items found if limit is None, zero, or negative.

  • offset (int | None) – Skips offset number of items found. Ignored if offset is either None, zero or negative.

Returns:

A list of item instances. Returns an empty list if no matches are found.

Return type:

list[Item]

Raises:

EsgvocValueError – If the expression cannot be interpreted.
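The limit and offset parameters lend themselves to paging through results. The sketch below is an illustration under stated assumptions, not part of the library: iter_pages is a hypothetical helper, and fake_find is a stand-in that only mimics the documented limit and offset rules on made-up ids so the example runs without a universe database.

```python
from typing import Callable, Iterator

# Hypothetical ids standing in for search hits.
FAKE_HITS = ["ipsl", "ipsl-cm5a", "ipsl-cm6a", "ipsl-cm7", "ipsl-esm"]


def fake_find(expression, only_id=False, limit=None, offset=None):
    """Stand-in for a find function, following the documented limit/offset rules."""
    hits = [h for h in FAKE_HITS if h.startswith(expression.rstrip("*"))]
    if offset is not None and offset > 0:  # None, zero or negative: ignored
        hits = hits[offset:]
    if limit is not None and limit > 0:    # None, zero or negative: no limit
        hits = hits[:limit]
    return hits


def iter_pages(find: Callable[..., list], expression: str, page_size: int) -> Iterator[list]:
    """Yield successive pages of results until the search is exhausted."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    offset = 0
    while True:
        page = find(expression, limit=page_size, offset=offset)
        if not page:
            return
        yield page
        if len(page) < page_size:  # a short page is the last one
            return
        offset += page_size


pages = list(iter_pages(fake_find, "ipsl*", page_size=2))
print(pages)
```

With esgvoc installed, the same helper could presumably be called with esgvoc.api.universe.find_items_in_universe in place of fake_find, since both take limit and offset as keyword arguments.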

esgvoc.api.universe.find_terms_in_data_descriptor(expression: str, data_descriptor_id: str, only_id: bool = False, limit: int | None = None, offset: int | None = None, selected_term_fields: Iterable[str] | None = None) → list[DataDescriptor]

Find terms in the given data descriptor based on a full-text search defined by the given expression. The expression uses the query syntax of the SQLite FTS extension and corresponds to the expression of the MATCH operator. It can be composed of one or more keywords combined with boolean operators (NOT, AND, ^, etc.; the default is OR). Keywords can define prefixes or postfixes with the wildcard *. The function returns a list of term instances sorted according to the bm25 ranking metric (list index 0 has the highest rank). This function performs an exact match on the data_descriptor_id and does not search for similar or related data descriptors. If the provided expression does not match any term, or the given data_descriptor_id does not exactly match the id of a data descriptor, the function returns an empty list. By default, the function searches for the expression in the term specifications. If only_id is True (default is False), the search is restricted to the ids of the terms.

Parameters:
  • expression (str) – The full text search expression.

  • data_descriptor_id (str) – A data descriptor id.

  • only_id (bool) – Performs the search only on ids, otherwise on all the specifications.

  • limit (int | None) – Limits the number of items returned. Returns all items found if limit is None, zero, or negative.

  • offset (int | None) – Skips offset number of items found. Ignored if offset is either None, zero or negative.

  • selected_term_fields (Iterable[str] | None) – A list of term fields to select or None. If None, all the fields of the terms are returned. If empty, selects the id and type fields.

Returns:

A list of term instances. Returns an empty list if no matches are found.

Return type:

list[DataDescriptor]

Raises:

EsgvocValueError – If the expression cannot be interpreted.

esgvoc.api.universe.find_terms_in_universe(expression: str, only_id: bool = False, limit: int | None = None, offset: int | None = None, selected_term_fields: Iterable[str] | None = None) → list[DataDescriptor]

Find terms in the universe based on a full-text search defined by the given expression. The expression uses the query syntax of the SQLite FTS extension and corresponds to the expression of the MATCH operator. It can be composed of one or more keywords combined with boolean operators (NOT, AND, ^, etc.; the default is OR). Keywords can define prefixes or postfixes with the wildcard *. The function returns a list of term instances sorted according to the bm25 ranking metric (list index 0 has the highest rank). If the provided expression does not match any term, the function returns an empty list. By default, the function searches for the expression in the term specifications. If only_id is True (default is False), the search is restricted to the ids of the terms.

Parameters:
  • expression (str) – The full text search expression.

  • only_id (bool) – Performs the search only on ids, otherwise on all the specifications.

  • limit (int | None) – Limits the number of items returned. Returns all items found if limit is None, zero, or negative.

  • offset (int | None) – Skips offset number of items found. Ignored if offset is either None, zero or negative.

  • selected_term_fields (Iterable[str] | None) – A list of term fields to select or None. If None, all the fields of the terms are returned. If empty, selects the id and type fields.

Returns:

A list of term instances. Returns an empty list if no matches are found.

Return type:

list[DataDescriptor]

Raises:

EsgvocValueError – If the expression cannot be interpreted.
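The three cases of selected_term_fields (None, empty, or a list of names) can be pictured as a projection over a term's fields. The sketch below works on a plain dict rather than a real term instance. The term content is made up, select_fields is a hypothetical helper, and keeping id and type alongside a non-empty selection is an assumption: the documentation only states that an empty selection yields those two fields.

```python
from typing import Iterable, Optional

# Hypothetical term; the values and the fields other than id and type are made up.
TERM = {"id": "ipsl", "type": "institution", "name": "IPSL", "location": "Paris"}


def select_fields(term: dict, selected_term_fields: Optional[Iterable[str]] = None) -> dict:
    """Project a term onto the selected fields, following the documented cases."""
    if selected_term_fields is None:
        # None: all the fields of the term are returned.
        return dict(term)
    # Assumption: id and type are always kept, so an empty selection
    # yields exactly those two fields.
    wanted = {"id", "type", *selected_term_fields}
    return {key: value for key, value in term.items() if key in wanted}


print(select_fields(TERM))            # every field
print(select_fields(TERM, []))        # only id and type
print(select_fields(TERM, ["name"]))  # id, type and name
```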

esgvoc.api.universe.get_all_data_descriptors_in_universe() → list[str]

Gets all the data descriptors of the universe.

Returns:

A list of data descriptor ids.

Return type:

list[str]

esgvoc.api.universe.get_all_terms_in_data_descriptor(data_descriptor_id: str, selected_term_fields: Iterable[str] | None = None) → list[DataDescriptor]

Gets all the terms of the given data descriptor. This function performs an exact match on the data_descriptor_id and does not search for similar or related descriptors. If the provided data_descriptor_id is not found, the function returns an empty list.

Parameters:
  • data_descriptor_id (str) – A data descriptor id.

  • selected_term_fields (Iterable[str] | None) – A list of term fields to select or None. If None, all the fields of the terms are returned. If empty, selects the id and type fields.

Returns:

A list of term instances. Returns an empty list if no matches are found.

Return type:

list[DataDescriptor]
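A common pattern is to list the data descriptor ids first and then fetch the terms of each one. The sketch below shows that traversal with the two getters injected as arguments, so it runs without a universe database. The fake functions and their content are hypothetical stand-ins that copy only the documented behaviour (a list of ids, and an empty list for an unknown id), and count_terms is a hypothetical helper.

```python
from typing import Callable

# Hypothetical universe content: data descriptor id -> term ids.
FAKE_UNIVERSE = {
    "institution": ["ipsl", "llnl"],
    "realm": ["atmos", "ocean", "land"],
}


def fake_get_all_data_descriptors() -> list[str]:
    """Stand-in returning the data descriptor ids."""
    return list(FAKE_UNIVERSE)


def fake_get_all_terms(data_descriptor_id: str, selected_term_fields=None) -> list:
    """Stand-in returning the terms; an unknown id gives an empty list."""
    return list(FAKE_UNIVERSE.get(data_descriptor_id, []))


def count_terms(get_descriptors: Callable[[], list], get_terms: Callable[..., list]) -> dict[str, int]:
    """Map each data descriptor id to its number of terms."""
    return {dd_id: len(get_terms(dd_id)) for dd_id in get_descriptors()}


counts = count_terms(fake_get_all_data_descriptors, fake_get_all_terms)
print(counts)
```

With esgvoc installed, get_all_data_descriptors_in_universe and get_all_terms_in_data_descriptor could presumably be passed in place of the two stand-ins.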

esgvoc.api.universe.get_all_terms_in_universe(selected_term_fields: Iterable[str] | None = None) → list[DataDescriptor]

Gets all the terms of the universe. Terms are unique within a data descriptor but may have some synonyms in the universe.

Parameters:

selected_term_fields (Iterable[str] | None) – A list of term fields to select or None. If None, all the fields of the terms are returned. If empty, selects the id and type fields.

Returns:

A list of term instances.

Return type:

list[DataDescriptor]

esgvoc.api.universe.get_data_descriptor_in_universe(data_descriptor_id: str) → tuple[str, dict] | None

Returns the id and the context of the data descriptor in the universe whose id corresponds exactly to the given data descriptor id. This function performs an exact match on the data_descriptor_id and does not search for similar or related data descriptors. If the provided data_descriptor_id is not found, the function returns None.

Parameters:

data_descriptor_id (str) – An id of a data descriptor to be found.

Returns:

The data descriptor id and context. Returns None if no match is found.

Return type:

tuple[str, dict] | None

esgvoc.api.universe.get_term_in_data_descriptor(data_descriptor_id: str, term_id: str, selected_term_fields: Iterable[str] | None = None) → DataDescriptor | None

Returns the term in the given data descriptor whose id corresponds exactly to the given term id. This function performs an exact match on both the term_id and the data_descriptor_id and does not search for similar or related terms or data descriptors. If the provided term_id is not found, the function returns None.

Parameters:
  • data_descriptor_id (str) – The id of the given data descriptor.

  • term_id (str) – The id of a term to be found.

  • selected_term_fields (Iterable[str] | None) – A list of term fields to select or None. If None, all the fields of the terms are returned. If empty, selects the id and type fields.

Returns:

A term instance. Returns None if no match is found.

Return type:

DataDescriptor | None
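Because the lookup returns None rather than raising when nothing matches, callers may want to turn a missing term into an explicit error. The sketch below shows one way to do so. require_term is a hypothetical helper, and fake_get_term is a stand-in over made-up data that copies only the documented contract (exact match on both ids, None otherwise).

```python
# Hypothetical content: (data descriptor id, term id) -> term.
FAKE_TERMS = {
    ("realm", "atmos"): {"id": "atmos", "type": "realm"},
    ("realm", "ocean"): {"id": "ocean", "type": "realm"},
}


def fake_get_term(data_descriptor_id, term_id, selected_term_fields=None):
    """Stand-in: exact match on both ids, None if nothing is found."""
    return FAKE_TERMS.get((data_descriptor_id, term_id))


def require_term(get_term, data_descriptor_id, term_id):
    """Return the term, or raise LookupError instead of returning None."""
    term = get_term(data_descriptor_id, term_id)
    if term is None:
        raise LookupError(
            f"no term {term_id!r} in data descriptor {data_descriptor_id!r}"
        )
    return term


print(require_term(fake_get_term, "realm", "atmos"))
try:
    require_term(fake_get_term, "realm", "atmo")  # near miss: no fuzzy matching
except LookupError as error:
    print(error)
```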

esgvoc.api.universe.get_term_in_universe(term_id: str, selected_term_fields: Iterable[str] | None = None) → DataDescriptor | None

Returns the first occurrence of the term in the universe whose id corresponds exactly to the given term id. Terms are unique within a data descriptor but may have some synonyms in the universe. This function performs an exact match on the term_id and does not search for similar or related terms. If the provided term_id is not found, the function returns None.

Parameters:
  • term_id (str) – The id of a term to be found.

  • selected_term_fields (Iterable[str] | None) – A list of term fields to select or None. If None, all the fields of the terms are returned. If empty, selects the id and type fields.

Returns:

A term instance. Returns None if no match is found.

Return type:

DataDescriptor | None