nucleon_elastic_ff.data.scripts.concat
Script for concatenating correlator data
nucleon_elastic_ff.data.scripts.concat.concat_dsets(files: List[str], out_file: str, axis: int = 0, dset_replace_patterns: Optional[Dict[str, str]] = None, ignore_containers: Optional[List[str]] = None, write_unpaired_dsets: bool = False, overwrite: bool = False)
Reads h5 files and exports the concatenation of datasets across files.
Each group in the file list is concatenated over files. Files are concatenated in the order in which they are specified.
The concatenation meta info is also stored in the resulting output file in the meta attribute of local_current.
Note
Suppose you pass two h5 files, files = ["file1.h5", "file2.h5"], to write to the output file out_file = "out.h5". Let's assume the dset structure is as follows:
file1.h5
    /x1y1
    /x1y2

and also

file2.h5
    /x1y1
    /x1y2

Then the output file has the structure

out.h5
    /x1y1
    /x1y2

where the dset x1y1 is np.concatenate([file1.h5/x1y1, file2.h5/x1y1], axis=axis) and so on.
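For illustration, a minimal sketch (outside this package, assuming h5py and numpy are available) of the equivalence stated above:

    import h5py
    import numpy as np

    # Concatenate the per-file dsets by hand along the chosen axis ...
    with h5py.File("file1.h5", "r") as f1, h5py.File("file2.h5", "r") as f2:
        expected = np.concatenate([f1["x1y1"][()], f2["x1y1"][()]], axis=0)

    # ... and compare against the dset written to the output file.
    with h5py.File("out.h5", "r") as out:
        assert np.array_equal(out["x1y1"][()], expected)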
- Arguments
- files: List[str]
- List of h5 file addresses which will be read into memory and concatenated.
- out_file: str
- The name of the file which will contain the concatenated datasets.
- axis: int = 0
- The axis to concatenate over.
- dset_replace_patterns: Optional[Dict[str, str]] = None
- A map specifying how dsets in the input files are renamed when written to the output file. The routine only concatenates dsets which match all keys of dset_replace_patterns.
- ignore_containers: Optional[List[str]] = None
- Ignores the listed h5 containers (groups or dsets) when concatenating (they are not written at all).
- write_unpaired_dsets: bool = False
- Also write groups of datasets where the number of datasets is fewer or more than the number of input files. A warning is printed to stdout whenever the numbers do not match.
- overwrite: bool = False
- Overwrite existing files.
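A hedged usage sketch of concat_dsets, reusing the hypothetical file names from the note above:

    from nucleon_elastic_ff.data.scripts.concat import concat_dsets

    concat_dsets(
        files=["file1.h5", "file2.h5"],  # hypothetical inputs, concatenated in this order
        out_file="out.h5",               # hypothetical output file
        axis=0,                          # concatenate along the first axis
        overwrite=True,                  # replace out.h5 if it already exists
    )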
nucleon_elastic_ff.data.scripts.concat.concatenate(root: str, concatenation_pattern: Dict[str, str], axis: int = 0, file_match_patterns: Optional[List[str]] = None, dset_replace_patterns: Optional[Dict[str, str]] = None, expected_file_patterns: Optional[List[str]] = None, ignore_containers: Optional[List[str]] = None, overwrite: bool = False)
Recursively scans a directory for files and concatenates them.
Finds all files which will be considered for grouping and feeds them to concat_dsets.
The concatenated dsets are ordered according to the file names.
- Arguments
- root: str
- Root directory to recursively scan for files to concatenate.
- concatenation_pattern: Dict[str, str]
- The regex patterns which determine the concatenation. The input files must match the key, which will be replaced by the value in the output file name; only files with a similar pattern are concatenated together.
- axis: int = 0
- The axis to concatenate over.
- file_match_patterns: Optional[List[str]] = None
- The regex patterns which files must match in order to be found. This list is extended by the concatenation pattern keys.
- dset_replace_patterns: Optional[Dict[str, str]] = None
- The patterns for dsets to be replaced after concatenation.
- expected_file_patterns: Optional[List[str]] = None
- Adds the expected regex patterns to the file filter patterns. After files have been filtered and grouped, checks whether all strings in this list are present in the file group. If not exactly all sources are found in the group, an AssertionError is raised.
- ignore_containers: Optional[List[str]] = None
- Ignore certain h5 groups and dsets and do not concatenate them (they are not written at all).
- overwrite: bool = False
- Overwrite existing files.
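A hedged usage sketch of concatenate; the root directory and regex pattern below are hypothetical and only illustrate that the pattern key is matched in the input file names and replaced by its value:

    from nucleon_elastic_ff.data.scripts.concat import concatenate

    concatenate(
        root="/data/correlators",                       # hypothetical root directory
        concatenation_pattern={r"cfg_\d+": "cfg_all"},  # hypothetical regex -> replacement
        axis=0,
        overwrite=True,
    )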