`cdm`

Subpackages

Submodules

cdm.properties

Package Contents

Functions

`map_model`(imodel, data, data_atts, cdm_subset=None, log_level='INFO')	Calls the main mapping function _map().
`cdm_to_ascii`(cdm, delimiter='|', null_label='null', cdm_complete=True, extension='psv', out_dir=None, suffix=None, prefix=None, log_level='INFO')	Exports a complete CDM file with multiple tables to ascii files.
`table_to_ascii`(table, table_atts, delimiter='|', null_label='null', cdm_complete=True, filename=None, full_table=True, log_level='INFO')	Exports a CDM table to an ascii file.
`read_tables`(tb_path, tb_id, cdm_subset=None, delimiter='|', extension='psv', col_subset=None, log_level='INFO', na_values=[])	Reads CDM-table-like files from the file system into a pandas DataFrame.

map_model(imodel, data, data_atts, cdm_subset=None, log_level='INFO')

Calls the main mapping function _map()

Parameters

imodel (a data model that can be of several types.) –
1. A generic mapping from a defined data model, like IMMA1’s core and attachments, e.g. ~/cdm-mapper/lib/mappings/icoads_r3000

2. A specific mapping from a generic data model to CDM, such as mapping a SID-DCK from IMMA1’s core and attachments to CDM in a specific way, e.g. ~/cdm-mapper/lib/mappings/icoads_r3000_d704

data (input data to map.) – e.g. a pandas.DataFrame, a pandas.io.parsers.TextFileReader object, or an in-memory text stream (io.StringIO object).
data_atts (dictionary with the {element_name: element_attributes} of the data.) – Type: dict.
cdm_subset (subset of CDM model tables to map.) – Defaults to the full set of CDM tables defined for the imodel. Type: list.
log_level (level of logging information to save.) – Defaults to ‘INFO’. Type: string.

Returns

cdm_tables – a Python dictionary with the {cdm_table_name: cdm_table_object} pairs.
For more information, see the _map function.

cdm_to_ascii(cdm, delimiter='|', null_label='null', cdm_complete=True, extension='psv', out_dir=None, suffix=None, prefix=None, log_level='INFO')

Exports a complete CDM file with multiple tables, written in the C3S Climate Data Store Common Data Model (CDM) format, to ascii files. The format of the tables is contained in a Python dictionary, stored as an attribute of the pandas.DataFrame (or pandas.io.parsers.TextFileReader).

Parameters

cdm – common data model tables to export
delimiter – default is ‘|’
null_label – specifies how NaN values are represented
cdm_complete – whether to extract the entire CDM file
extension – default is ‘psv’
out_dir – where to store the ascii files
suffix – file suffix
prefix – file prefix
log_level – level of logging information

Returns

Saves the CDM tables as ascii files in the given directory, with a psv extension by default.
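As a rough illustration of the one-file-per-table export described above, the sketch below writes each table to its own delimited file using only the standard library. It does not call `cdm_to_ascii`; the `prefix-table-suffix.extension` naming scheme, the helper names `table_filename` and `export_tables`, and the table names are all assumptions made for the example, not documented behaviour of the package.

```python
import tempfile
from pathlib import Path


def table_filename(table_name, prefix=None, suffix=None, extension="psv"):
    """Join the non-empty name parts with '-' (a hypothetical naming scheme)."""
    parts = [part for part in (prefix, table_name, suffix) if part]
    return "-".join(parts) + "." + extension


def export_tables(tables, out_dir, delimiter="|", **name_options):
    """Write each table (a column list plus row lists) to its own file."""
    written = []
    for table_name, (columns, rows) in tables.items():
        path = Path(out_dir) / table_filename(table_name, **name_options)
        lines = [delimiter.join(columns)]
        lines += [delimiter.join(str(value) for value in row) for row in rows]
        path.write_text("\n".join(lines) + "\n")
        written.append(path.name)
    return written


# Made-up tables, only to exercise the helpers.
tables = {
    "header": (["report_id", "latitude"], [["r1", 10.5]]),
    "observations-sst": (["report_id", "observation_value"], [["r1", 290.1]]),
}
with tempfile.TemporaryDirectory() as out_dir:
    names = export_tables(tables, out_dir, prefix="demo", suffix="2020-01")
    print(sorted(names))
    # ['demo-header-2020-01.psv', 'demo-observations-sst-2020-01.psv']
```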

table_to_ascii(table, table_atts, delimiter='|', null_label='null', cdm_complete=True, filename=None, full_table=True, log_level='INFO')

Exports a table written in the C3S Climate Data Store Common Data Model (CDM) format to an ascii file. The format of the table is contained in a Python dictionary, stored as an attribute of the pandas.DataFrame (or pandas.io.parsers.TextFileReader).

Parameters

table – pandas.DataFrame to export
table_atts (attributes of the pandas.DataFrame stored as a Python dictionary.) – This contains all element names, characteristics and type encodings, as well as other characteristics, e.g. decimal places.
delimiter – default is ‘|’
null_label – specifies how NaN values are represented
cdm_complete (if we export the entire set of tables.) – default is True
filename – the name of the file in which to store the data
full_table – if we export a single table
log_level – level of logging information to be saved

Returns

Saves the CDM tables as ascii files.
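To make the roles of `delimiter` and `null_label` concrete, here is a minimal standard-library sketch of the kind of output such an export produces. It is not the package's implementation: `write_delimited` is a hypothetical helper, and the rows are invented for the example.

```python
import csv
import io
import math


def write_delimited(rows, columns, delimiter="|", null_label="null"):
    """Render rows (a list of dicts) as delimited text, labelling missing values."""

    def render(value):
        missing = value is None or (isinstance(value, float) and math.isnan(value))
        return null_label if missing else str(value)

    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow(columns)
    for row in rows:
        writer.writerow([render(row.get(column)) for column in columns])
    return buffer.getvalue()


# Made-up rows: the second one has a NaN latitude.
rows = [
    {"report_id": "r1", "latitude": 10.5},
    {"report_id": "r2", "latitude": float("nan")},
]
text = write_delimited(rows, ["report_id", "latitude"])
print(text)
# report_id|latitude
# r1|10.5
# r2|null
```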

read_tables(tb_path, tb_id, cdm_subset=None, delimiter='|', extension='psv', col_subset=None, log_level='INFO', na_values=[])

Reads CDM-table-like files from the file system into a pandas DataFrame.

Parameters

tb_path – path to the file
tb_id – any identifier, including wildcards if required
cdm_subset (specifies a subset of tables or a single table.) –
- For multiple tables: the function returns a pandas.DataFrame that is multi-indexed at the columns, with (table-name, field) as column names. Tables are merged via the report_id field.
- For a single table: the function returns a pandas.DataFrame with simple indexing for the columns.
delimiter – default is ‘|’
extension – default is ‘psv’
col_subset (a Python dictionary specifying the section or sections of the file to read) –
- For multiple sections of the tables:
  e.g. col_subset = {table0:[columns],...tablen:[columns]}
- For a single section:
  e.g. a list-type object, col_subset = [columns]
This variable assumes that the column names all conform to the CDM field names in lib.tables/*.json
log_level – level of logging messages to save
na_values – specifies the format of NaN values

Returns

pandas.DataFrame (either the entire file or a subset of it.)
logger.error (logs specific messages if there is any error.)
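The multi-table case can be pictured with a small standard-library sketch: each pipe-delimited table is parsed, NaN labels are mapped to missing values, and rows are combined by report_id under (table-name, field) keys. This only mimics the shape of the result with plain dictionaries; it does not call `read_tables` or pandas, and the helper names, table names and data are assumptions made for the example.

```python
import csv
import io

# Two tiny pipe-delimited tables; contents and table names are made up.
HEADER_PSV = "report_id|latitude|longitude\nr1|10.5|-20.0\nr2|null|5.0\n"
SST_PSV = "report_id|observation_value\nr1|290.1\nr2|291.4\n"


def read_psv(text, delimiter="|", na_values=("null",)):
    """Parse delimited text into a list of dicts, mapping NaN labels to None."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [
        {key: (None if value in na_values else value) for key, value in row.items()}
        for row in reader
    ]


def merge_on_report_id(tables):
    """Combine tables into {report_id: {(table_name, field): value}}."""
    merged = {}
    for table_name, rows in tables.items():
        for row in rows:
            record = merged.setdefault(row["report_id"], {})
            for field, value in row.items():
                record[(table_name, field)] = value
    return merged


tables = {
    "header": read_psv(HEADER_PSV),
    "observations-sst": read_psv(SST_PSV),
}
merged = merge_on_report_id(tables)
print(merged["r1"][("observations-sst", "observation_value")])  # 290.1
print(merged["r2"][("header", "latitude")])  # None
```

Values stay as strings here because the `csv` module does no type conversion; a pandas-based reader would normally infer numeric types.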
