cdm
Submodules
Package Contents
Functions
map_model | Calls the main mapping function _map().
cdm_to_ascii | Exports a complete cdm file with multiple tables to an ascii file.
table_to_ascii | Exports a cdm table to an ascii file.
read_tables | Reads CDM table-like files from the file system into a pandas data frame.
-
map_model(imodel, data, data_atts, cdm_subset=None, log_level='INFO')
Calls the main mapping function _map().
- Parameters
imodel (a data model that can be of several types) –
1. A generic mapping from a defined data model, like IMMA1’s core and attachments, e.g. ~/cdm-mapper/lib/mappings/icoads_r3000
2. A specific mapping from a generic data model to the CDM, like mapping a SID-DCK from IMMA1’s core and attachments to the CDM in a specific way, e.g. ~/cdm-mapper/lib/mappings/icoads_r3000_d704
data (input data to map) – e.g. a pandas.DataFrame, pd.io.parsers.TextFileReader objects, or in-memory text streams (io.StringIO object).
data_atts (dictionary with the {element_name: element_attributes} of the data) – Type: dict.
cdm_subset (subset of CDM model tables to map) – Defaults to the full set of CDM tables defined for the imodel. Type: list.
log_level (level of logging information to save) – Defaults to 'INFO'. Type: string.
- Returns
cdm_tables – a python dictionary with the {cdm_table_name: cdm_table_object} pairs.
For more information, see the _map function.
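The three kinds of input named for data can be sketched with plain pandas. This snippet does not call map_model; the two-column layout and the variable names are assumptions made only for illustration.

```python
import io

import pandas as pd

# Hypothetical two-column input; real input would follow the imodel layout.
raw = "id,temp\nA,1.5\nB,2.5\nC,3.5\n"

# 1. An in-memory text stream.
stream = io.StringIO(raw)

# 2. The whole data set as a single DataFrame.
frame = pd.read_csv(io.StringIO(raw))

# 3. A chunked reader: read_csv with chunksize returns a TextFileReader
#    that yields one DataFrame per chunk when iterated.
reader = pd.read_csv(io.StringIO(raw), chunksize=2)
is_reader = isinstance(reader, pd.io.parsers.TextFileReader)
chunk_sizes = [len(chunk) for chunk in reader]
```

A TextFileReader is consumed as it is iterated, so it has to be recreated before it can be read a second time.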
-
cdm_to_ascii(cdm, delimiter='|', null_label='null', cdm_complete=True, extension='psv', out_dir=None, suffix=None, prefix=None, log_level='INFO')
Exports a complete cdm file with multiple tables, written in the C3S Climate Data Store Common Data Model (CDM) format, to ascii files. The table format is contained in a python dictionary, stored as an attribute in a pandas.DataFrame (or pd.io.parsers.TextFileReader).
- Parameters
cdm – common data model tables to export
delimiter – field delimiter; default '|'
null_label – specifies how NaN values are represented; default 'null'
cdm_complete – whether to extract the entire cdm file; default True
extension – file extension; default 'psv'
out_dir – where to store the ascii files
suffix – file suffix
prefix – file prefix
log_level – level of logging information; default 'INFO'
- Returns
Saves the cdm tables as ascii files in the given directory, with the given extension (psv by default).
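As a rough illustration of the kind of export described above, the sketch below writes a dictionary of tables to one delimited file per table using plain pandas. It does not call cdm_to_ascii; the helper write_tables, the file naming scheme, and the table and column names are assumptions made for the example.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd


def write_tables(tables, out_dir, delimiter="|", null_label="null", extension="psv"):
    """Write each table of a {name: DataFrame} dict to its own delimited file."""
    out_dir = Path(out_dir)
    paths = {}
    for name, table in tables.items():
        # Hypothetical naming scheme: <table name>.<extension>
        path = out_dir / f"{name}.{extension}"
        # na_rep controls how missing values are written out.
        table.to_csv(path, sep=delimiter, na_rep=null_label, index=False)
        paths[name] = path
    return paths


# Placeholder tables; the names and fields are illustrative only.
tables = {
    "header": pd.DataFrame({"report_id": ["r1", "r2"], "latitude": [10.5, np.nan]}),
    "observations": pd.DataFrame(
        {"report_id": ["r1", "r2"], "observation_value": [280.1, 281.3]}
    ),
}

with tempfile.TemporaryDirectory() as tmp:
    written = write_tables(tables, tmp)
    header_text = written["header"].read_text()
```

Here the missing latitude is written as the null label rather than as an empty field.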
-
table_to_ascii(table, table_atts, delimiter='|', null_label='null', cdm_complete=True, filename=None, full_table=True, log_level='INFO')
Exports a cdm table, written in the C3S Climate Data Store Common Data Model (CDM) format, to an ascii file. The table format is contained in a python dictionary, stored as an attribute in a pandas.DataFrame (or pd.io.parsers.TextFileReader).
- Parameters
table – pandas.DataFrame to export
table_atts (attributes of the pandas.DataFrame stored as a python dictionary) – Contains all element names, characteristics and type encodings, as well as other characteristics, e.g. decimal places.
delimiter – field delimiter; default '|'
null_label – specifies how NaN values are represented; default 'null'
cdm_complete (whether to export the entire set of tables) – default is True
filename – name of the file in which to store the data
full_table – whether to export a single table; default True
log_level – level of logging information to be saved; default 'INFO'
- Returns
Saves cdm tables as ascii files.
-
read_tables(tb_path, tb_id, cdm_subset=None, delimiter='|', extension='psv', col_subset=None, log_level='INFO', na_values=[])
Reads CDM table-like files from the file system into a pandas data frame.
- Parameters
tb_path – path to the file
tb_id – any identifier, including wildcards if required; the file extension is given separately and defaults to 'psv'
cdm_subset (specifies a subset of tables or a single table) –
- For multiple tables: returns a pandas.DataFrame that is multi-indexed at the columns, with (table-name, field) as column names. Tables are merged via the report_id field.
- For a single table: returns a pandas.DataFrame with simple indexing for the columns.
delimiter – default is '|'
extension – default is 'psv'
col_subset (a python dictionary specifying the section or sections of the file to read) –
- For multiple sections of the tables, e.g. col_subset = {table0:[columns],...tablen:[columns]}
- For a single section, a list type object, e.g. col_subset = [columns]
This variable assumes that the column names all conform to the cdm field names in lib.tables/*.json
log_level (level of logging messages to save) – default is 'INFO'
na_values (specifies the format of NaN values) – default is []
- Returns
pandas.DataFrame – either the entire file or a subset of it.
logger.error – logs specific messages if there is any error.
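The multi-table case, with (table-name, field) column names and tables joined on report_id, can be approximated with plain pandas as follows. This is a sketch of the resulting shape, not of how read_tables is implemented; the table contents and every name other than report_id are placeholders.

```python
import io

import pandas as pd

# Placeholder pipe-delimited tables, standing in for files on disk.
files = {
    "header": "report_id|latitude\nr1|10.5\nr2|null\n",
    "observations": "report_id|observation_value\nr1|280.1\nr2|281.3\n",
}

# Dictionary form of a column subset: {table: [columns]}.
col_subset = {
    "header": ["report_id", "latitude"],
    "observations": ["report_id", "observation_value"],
}

frames = {}
for name, text in files.items():
    df = pd.read_csv(
        io.StringIO(text),
        sep="|",
        usecols=col_subset[name],
        na_values=["null"],        # treat the null label as NaN
        dtype={"report_id": str},
    )
    # Index each table by report_id so the tables align on it.
    frames[name] = df.set_index("report_id")

# Concatenating a dict along the columns yields (table, field) column labels.
merged = pd.concat(frames, axis=1, join="inner")
```

A single field is then addressed with a tuple, for example merged[("header", "latitude")].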