The data parser
Parsing the database
In the database, experimental measurements are logically grouped in the following ways:
- Sources: A source is a unique publication in which the data was produced
- Sets: A set is a group of measurements taken under similar conditions with one variable changed. For instance, a set might be all ignition delay times at a particular pressure and composition that vary with temperature, or all laminar flame speeds at a particular pressure and temperature.
- Primary fuel: The primary fuel in a combustion experiment is the substance that is the principal component being oxidized. It is usually designated by the researcher.
Measurements are named according to the convention source_setidXX, where ‘source’ is a string that uniquely identifies the origin of the data (e.g. paper, online database, etc.), ‘setid’ is a unique identifier for each set within that source, and ‘XX’ is a unique identifier for each measurement within a set. For instance, the name ‘ecl90_b01’ refers to the data source named ecl90, set b, measurement 01. Likewise, ‘lif77_1a04’ refers to source lif77, set 1a, measurement 04. In addition, each Excel data file in the database contains the measurements for one primary fuel, so a source that reports data for more than one fuel appears in several data files (for instance, ezl90).
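The naming convention can be illustrated with a short Python sketch. It is not part of data_parser: the function name parse_measurement_name and the regular expression are hypothetical, and they assume that the source string contains no underscore and that XX is always two digits.

```python
import re

# Hypothetical helper, not part of data_parser. Assumes the source string
# contains no underscore and the measurement identifier XX is two digits.
_NAME_PATTERN = re.compile(
    r"^(?P<source>[^_]+)_(?P<setid>.+)(?P<measurement>\d{2})$"
)


def parse_measurement_name(name):
    """Split a name of the form source_setidXX into its three parts."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"{name!r} does not follow the source_setidXX convention")
    return match.group("source"), match.group("setid"), match.group("measurement")


print(parse_measurement_name("ecl90_b01"))   # ('ecl90', 'b', '01')
print(parse_measurement_name("lif77_1a04"))  # ('lif77', '1a', '04')
```

Names that do not match the assumed pattern raise a ValueError rather than being split silently.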
The functions in the data_parser module will scan the database and break it down into individual sets. A MUM-PCE Measurement is created for each measurement in the database, and then all the measurements identified as belonging to a particular set are grouped into a Project. Each project will then be run and the resulting values saved.
- find_data_sets(df): Takes a pandas.DataFrame object containing mumpce_py.cantera_utils compatible measurements and returns the data sources and data sets in the DataFrame.
- load_projects(df): Takes a pandas.DataFrame object containing mumpce_py.cantera_utils compatible measurements and returns a list of mumpce.Project objects built from measurements in the DataFrame.
- run_project(project_to_run): Checks whether the Measurements in a Project have been evaluated and, if they haven't, evaluates them.
- run_project_parallel(project_to_run): Parallel version of run_project. Checks whether the Measurements in a Project have been evaluated and, if they haven't, evaluates them.
- run_project_sensitivity(project_to_run): Checks whether the Measurements in a Project have been evaluated for sensitivity analysis and, if they haven't, evaluates them.
Creating documentation
Once the projects have been run, the data parser will automatically generate the web documentation for the database. For ignition delay, each set will be given its own page. For laminar flame speed, the sets are grouped by nominal unburned gas temperature and pressure, and then every set with that condition will be plotted together.
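The grouping of laminar flame speed sets by nominal condition can be pictured with the following sketch. It is an illustration only: the record layout, the set names, the units, and the function group_by_nominal_condition are all hypothetical and do not reflect how data_parser stores or reads these values.

```python
from collections import defaultdict

# Hypothetical records: (set name, nominal unburned gas temperature,
# nominal pressure). Names, values, and units are made up for illustration.
flame_speed_sets = [
    ("src1_a", 298.0, 1.0),
    ("src2_a", 298.0, 1.0),
    ("src1_b", 298.0, 2.0),
]


def group_by_nominal_condition(records):
    """Collect set names that share a (temperature, pressure) pair."""
    groups = defaultdict(list)
    for set_name, temperature, pressure in records:
        groups[(temperature, pressure)].append(set_name)
    return dict(groups)


groups = group_by_nominal_condition(flame_speed_sets)
for (temperature, pressure), set_names in groups.items():
    # Each group would correspond to one page on which its sets are plotted together.
    print(f"T = {temperature}, P = {pressure}: {set_names}")
```

Keying on the exact (temperature, pressure) pair only works if nominal values are stored consistently; any rounding applied before grouping is not described here.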
- make_fls_docs(primary_fuel_name, ...): Creates the documentation for the laminar flame speeds in the database.
- make_main_fls_page(primary_fuel_name, ...): Creates the top-level index page for a set of laminar flame speed projects.
- make_main_page(primary_fuel_name, ...): Creates the table of contents for the top-level index page.
- make_fls_subpage(primary_fuel, project_dict): Creates the documentation page for a single data source.
- make_fls_doc_page(primary_fuel, ...): Creates the documentation page for laminar flame speeds at a particular nominal condition associated with a list of Project objects.
- make_ign_docs(primary_fuel_name, sources, ...): Creates the documentation for the ignition delay times in the database.
- make_main_ign_page(primary_fuel_name, ...): Creates the top-level index page for a set of ignition delay time projects.
- make_ign_subpage(source, primary_fuel_name, ...): Creates the documentation page for a single data source.
- make_ign_doc_page(project): Creates the documentation page for a single ignition delay set associated with a Project.
Full function documentation
- data_parser.find_data_sets(df)
Takes a pandas.DataFrame object containing mumpce_py.cantera_utils compatible measurements and returns the data sources and data sets in the DataFrame.
In the Small Hydrocarbon Database, measurements are named according to the convention source_setidXX, where ‘source’ is a string that uniquely identifies the origin of the data (e.g. paper, online database, etc.), setid is a unique identifier for each set within that source, and XX is a unique identifier for each measurement within a set. This function returns all of the unique source strings and source_set strings.
Parameters: df (pandas.DataFrame) – The pandas DataFrame to be parsed
Returns: sources, sets, the data sources and data sets contained within the DataFrame.
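As a rough picture of what "unique source strings and source_set strings" means, the sketch below works on a plain list of measurement names. It is not the implementation of find_data_sets: the function unique_sources_and_sets is hypothetical, it does not read a DataFrame, and it assumes that the source contains no underscore and that XX is exactly two characters. The sorted return order is also an assumption.

```python
def unique_sources_and_sets(names):
    """Return sorted unique source strings and source_set strings."""
    sources = set()
    sets = set()
    for name in names:
        # Everything before the first underscore is taken as the source.
        source, _, rest = name.partition("_")
        sources.add(source)
        # Drop the two-character measurement identifier XX to get the set id.
        sets.add(f"{source}_{rest[:-2]}")
    return sorted(sources), sorted(sets)


names = ["ecl90_b01", "ecl90_b02", "ecl90_c01", "lif77_1a04"]
sources, sets = unique_sources_and_sets(names)
print(sources)  # ['ecl90', 'lif77']
print(sets)     # ['ecl90_b', 'ecl90_c', 'lif77_1a']
```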
- data_parser.load_projects(df)
Takes a pandas.DataFrame object containing mumpce_py.cantera_utils compatible measurements and returns a list of mumpce.Project objects built from measurements in the DataFrame.
This function uses find_data_sets() to break the DataFrame into sources and sets. Each set is then built into its own Project, and each Measurement in the Project is a measurement from that data set.
In the Small Hydrocarbon Database, measurements are named according to the convention source_setidXX, where ‘source’ is a string that uniquely identifies the origin of the data (e.g. paper, online database, etc.), setid is a unique identifier for each set within that source, and XX is a unique identifier for each measurement within a set.
Parameters: df (pandas.DataFrame) – The pandas DataFrame to be parsed
Returns: project_list, the list of Projects contained within the DataFrame.
- data_parser.run_project(project_to_run)
Checks whether the Measurements in a Project have been evaluated and, if they haven't, evaluates them. Returns the Project.
- data_parser.run_project_parallel(project_to_run)
Checks whether the Measurements in a Project have been evaluated and, if they haven't, evaluates them. Returns the Project. This is the parallel version.
- data_parser.run_project_sensitivity(project_to_run)
Checks whether the Measurements in a Project have been evaluated for sensitivity analysis and, if they haven't, evaluates them. Returns the Project.
- data_parser.make_fls_docs(primary_fuel_name, primary_fuel, project_dict)
Creates the documentation for the laminar flame speeds in the database.
This function takes a primary fuel name and creates the main documentation page for that fuel. It will also make the subpages for each of the experimental sets described in sources. In addition, the function will return the table of contents for the primary fuel’s main documentation page.
Parameters:
- primary_fuel_name – The English name of the fuel for which the documentation page is being created
- primary_fuel – The Cantera name of the fuel
- project_dict – The dictionary of mumpce Projects containing laminar flame speed information
Returns: main_contents, the table of contents for the primary fuel main page
- data_parser.make_main_fls_page(primary_fuel_name, data_source_table)
Creates the top-level index page for a set of laminar flame speed projects.
This function takes a primary fuel name and a table of sources and creates the top-level index page for that fuel.
Parameters:
- data_parser.make_main_page(primary_fuel_name, data_source_table, main_rstcontents)
Creates the table of contents for the top-level index page.
This function accepts a boilerplate reStructuredText string and inserts the primary fuel name as the title and the data source table as the table of contents. It returns the properly formatted reStructuredText.
Parameters:
- primary_fuel_name – The English name of the primary fuel. This will be the title of the documentation page
- data_source_table – The table of contents that will go into the index file. This will be inserted into the documentation page as its table of contents
- main_rstcontents – A string representing the index page boilerplate.
Returns: main_source_contents, the information that will be written to the documentation file
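The idea of filling a boilerplate string with a title and a table of contents can be sketched as follows. This is an assumption-laden illustration, not the code of make_main_page: the real main_rstcontents string is not shown in this documentation, so the template, its {title}, {underline}, and {toc} placeholders, the function fill_main_page, and the example fuel name are all hypothetical.

```python
# Hypothetical boilerplate; the real index page boilerplate is not shown here.
MAIN_RST_TEMPLATE = """{title}
{underline}

.. toctree::
   :maxdepth: 1

{toc}
"""


def fill_main_page(primary_fuel_name, data_source_table, main_rstcontents):
    """Insert the fuel name as the title and the table as the toctree body."""
    # A reStructuredText title underline must be at least as long as the title.
    underline = "=" * len(primary_fuel_name)
    return main_rstcontents.format(
        title=primary_fuel_name,
        underline=underline,
        toc=data_source_table,
    )


# Example inputs are made up; toctree entries are indented to sit inside the directive.
page = fill_main_page("Methane", "   ecl90\n   lif77", MAIN_RST_TEMPLATE)
print(page)
```

Using str.format keeps the sketch short, but it would fail on a template containing literal braces; a real boilerplate string may use a different substitution scheme.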
- data_parser.make_fls_subpage(primary_fuel, project_dict)
Creates the documentation page for a single data source.
This function takes a data source and creates the main documentation page for that source. It also creates the individual subpages for each data set associated with that source.
Parameters:
- primary_fuel – The Cantera name of the fuel
- project_dict – The dictionary of mumpce Projects containing laminar flame speed information
Returns: contents, the table of contents for the data source main page
- data_parser.make_fls_doc_page(primary_fuel, nominal_pres, nominal_T, project_list)
Creates the documentation page for laminar flame speeds at a particular nominal condition associated with a list of Project objects.
- data_parser.make_ign_docs(primary_fuel_name, sources, project_list)
Creates the documentation for the ignition delay times in the database.
This function takes a primary fuel name and creates the main documentation page for that fuel. It will also make the subpages for each of the experimental sets described in sources. In addition, the function will return the table of contents for the primary fuel’s main documentation page.
Parameters:
- primary_fuel_name – The English name of the fuel for which the documentation page is being created
- sources – The list of sources associated with this fuel
- project_list – The list of mumpce Projects containing ignition delay information
Returns: main_contents, the table of contents for the primary fuel main page
- data_parser.make_main_ign_page(primary_fuel_name, data_source_table)
Creates the top-level index page for a set of ignition delay time projects.
This function takes a primary fuel name and a table of sources and creates the top-level index page for that fuel.
Parameters:
- data_parser.make_ign_subpage(source, primary_fuel_name, project_list)
Creates the documentation page for a single data source.
This function takes a data source and creates the main documentation page for that source. It also creates the individual subpages for each data set associated with that source.
Parameters:
- source – The source for which the main page and subpages will be created.
- primary_fuel_name – The English name of the fuel for which the documentation page is being created
- project_list – The list of mumpce Projects containing ignition delay information
Returns: contents, the table of contents for the data source main page