academic_observatory_api.server.utils

Module Contents

Functions

parse_args() → Tuple[List[str], List[str], str, str, str, int, str, str, bool]

Parse the arguments coming in from the request.

create_es_connection() → Union[elasticsearch.Elasticsearch, str]

Create an Elasticsearch connection.

list_available_index_dates(es: elasticsearch.Elasticsearch, alias: str) → List[str]

For a given index name (e.g. journals-institution), list which dates are available.

create_search_body(agg_field: str, agg_ids: Optional[List[str]], subagg_field: Optional[str], subagg_ids: Optional[List[str]], from_year: Optional[str], to_year: Optional[str], size: int, search_after: str = None, pit_id: str = None) → dict

Create a search body that is passed on to the elasticsearch 'search' method.

process_response(res: dict) → Tuple[Optional[str], Optional[str], list, Optional[str]]

Get the search_after id and hits from the response of an elasticsearch search query.

academic_observatory_api.server.utils.parse_args() → Tuple[List[str], List[str], str, str, str, int, str, str, bool]

Parse the arguments coming in from the request.

  • alias – concatenation of ‘subset’ and ‘agg’

  • index_date – taken directly from requests.args; None allowed

  • from_date – from_date + ‘-12-31’; None allowed

  • to_date – to_date + ‘-12-31’; None allowed

  • filter_fields – taken directly from requests.args for each item in ‘query_filter_parameters’; empty dict allowed

  • size – set to ‘limit’ if it is given, capped at 10000; defaults to 10000 if no ‘limit’ is given

  • scroll_id – taken directly from requests.args

Returns

alias, index_date, from_date, to_date, filter_fields, size, scroll_id
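The date and size rules listed above can be sketched as two small helpers. This is a minimal illustration, not the module's actual code: the names `year_to_date`, `clamp_size` and `MAX_SIZE` are hypothetical, and the sketch assumes the year arrives as a string and the limit as an integer.

```python
from typing import Optional

MAX_SIZE = 10000  # upper bound on the number of results, per the description above


def year_to_date(year: Optional[str]) -> Optional[str]:
    """Append '-12-31' to a year string; None is passed through unchanged."""
    return f"{year}-12-31" if year is not None else None


def clamp_size(limit: Optional[int]) -> int:
    """Use 'limit' when it is given, capped at MAX_SIZE; otherwise MAX_SIZE."""
    if limit is None:
        return MAX_SIZE
    return min(limit, MAX_SIZE)
```

With these helpers, a request without a limit would be served with a size of 10000, and a from_date of ‘2018’ would become ‘2018-12-31’.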

academic_observatory_api.server.utils.create_es_connection() → Union[elasticsearch.Elasticsearch, str]

Create an Elasticsearch connection.

Returns

elasticsearch connection

academic_observatory_api.server.utils.list_available_index_dates(es: elasticsearch.Elasticsearch, alias: str) → List[str]

For a given index name (e.g. journals-institution), list which dates are available.

Parameters
  • es – elasticsearch connection

  • alias – index alias

Returns

list of available dates for given index
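One way such a listing could work is sketched below. It assumes, purely for illustration, that each concrete index is named as the alias followed by a hyphen and an 8-digit date (for example journals-institution-20210101); the documentation does not confirm this naming scheme. The helper `dates_from_index_names` is hypothetical and takes plain index names, so it runs without an Elasticsearch connection.

```python
import re
from typing import Iterable, List


def dates_from_index_names(alias: str, index_names: Iterable[str]) -> List[str]:
    """Collect the date suffixes of all index names that belong to 'alias'.

    Assumes names of the form '<alias>-<YYYYMMDD>' (an assumption, see above).
    Returns the dates sorted from newest to oldest.
    """
    pattern = re.compile(rf"^{re.escape(alias)}-(\d{{8}})$")
    dates = []
    for name in index_names:
        match = pattern.match(name)
        if match:
            dates.append(match.group(1))
    return sorted(dates, reverse=True)
```

In the real function the index names would presumably come from the connection passed in as es; here they are supplied directly.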

academic_observatory_api.server.utils.create_search_body(agg_field: str, agg_ids: Optional[List[str]], subagg_field: Optional[str], subagg_ids: Optional[List[str]], from_year: Optional[str], to_year: Optional[str], size: int, search_after: str = None, pit_id: str = None) → dict

Create a search body that is passed on to the elasticsearch ‘search’ method.

Parameters
  • agg_field – The aggregate that is queried

  • agg_ids – List of aggregate values to filter on

  • subagg_field – The subaggregate that is queried

  • subagg_ids – List of subaggregate values to filter on

  • from_year – Refers to published year, add to ‘range’. Include results where published year >= from_year

  • to_year – Refers to published year, add to ‘range’. Include results where published year < to_year

  • size – The returned size (number of hits)

  • search_after – Return results from after this unique id (used for pagination)

  • pit_id – The unique point in time ID (used for pagination)

Returns

search body
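The parameters above map naturally onto an Elasticsearch query body with terms filters, a range filter and the pagination keys. The following is a rough sketch of what such a body might look like, not the module's implementation: the function name `sketch_search_body`, the field name in `YEAR_FIELD`, the use of the aggregate names as field names, the sort clause and the keep_alive value are all assumptions.

```python
from typing import List, Optional

YEAR_FIELD = "published_year"  # assumed field name, not confirmed by the docs


def sketch_search_body(
    agg_field: str,
    agg_ids: Optional[List[str]],
    subagg_field: Optional[str],
    subagg_ids: Optional[List[str]],
    from_year: Optional[str],
    to_year: Optional[str],
    size: int,
    search_after: Optional[str] = None,
    pit_id: Optional[str] = None,
) -> dict:
    filters = []
    # Restrict to the requested aggregate / subaggregate values, if any.
    if agg_ids:
        filters.append({"terms": {agg_field: agg_ids}})
    if subagg_field and subagg_ids:
        filters.append({"terms": {subagg_field: subagg_ids}})

    # published year >= from_year and published year < to_year
    year_range = {}
    if from_year is not None:
        year_range["gte"] = from_year
    if to_year is not None:
        year_range["lt"] = to_year
    if year_range:
        filters.append({"range": {YEAR_FIELD: year_range}})

    body = {
        "size": size,
        "query": {"bool": {"filter": filters}},
        "sort": [{YEAR_FIELD: "asc"}],  # search_after needs a sort; field is assumed
    }
    # Pagination: point in time ID and the sort value of the last hit seen.
    if pit_id is not None:
        body["pit"] = {"id": pit_id, "keep_alive": "1m"}
    if search_after is not None:
        body["search_after"] = [search_after]
    return body
```

The pagination keys are only added when a value is supplied, so the first request of a series carries neither ‘pit’ nor ‘search_after’.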

academic_observatory_api.server.utils.process_response(res: dict) → Tuple[Optional[str], Optional[str], list, Optional[str]]

Get the search_after id and hits from the response of an elasticsearch search query.

Parameters

res – The response.

Returns

pit id, search after and hits
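As an illustration of the three documented return values, the sketch below pulls them out of a response dictionary shaped like a standard Elasticsearch search response (‘pit_id’ at the top level, hits under ‘hits.hits’, and a ‘sort’ list on each hit). The helper name `extract_page` is hypothetical, and the sketch deliberately leaves out the fourth element of the real return tuple, which the documentation does not describe.

```python
from typing import Any, List, Optional, Tuple


def extract_page(res: dict) -> Tuple[Optional[str], Optional[list], List[Any]]:
    """Return (pit_id, search_after, hits) from a search response dict."""
    pit_id = res.get("pit_id")
    hits = res.get("hits", {}).get("hits", [])
    # The sort values of the last hit are what the next request passes
    # as 'search_after'; there is nothing to continue from on an empty page.
    search_after = hits[-1].get("sort") if hits else None
    return pit_id, search_after, hits
```

An empty page yields a search_after of None, which a caller could use as the signal to stop paginating.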

exception academic_observatory_api.server.utils.APIError(error: Dict[str, str], status_code: int)

Bases: Exception

Common base class for all non-exit exceptions.
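Based only on the signature, an exception of this shape could be defined and used as sketched below. The class name `SketchAPIError`, the stored attribute names and the keys in the error dictionary are assumptions for illustration; the actual class may differ.

```python
from typing import Dict


class SketchAPIError(Exception):
    """Illustrative stand-in that carries an error payload and an HTTP status code."""

    def __init__(self, error: Dict[str, str], status_code: int):
        super().__init__(error)
        self.error = error
        self.status_code = status_code


try:
    # The keys 'code' and 'description' are hypothetical.
    raise SketchAPIError({"code": "not_found", "description": "Index not found"}, 404)
except SketchAPIError as exc:
    payload, status = exc.error, exc.status_code
```

Keeping the payload and status code on the exception lets a single error handler turn any such exception into an HTTP response.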