nucleon_elastic_ff.data.scripts.concat

Script for concatenating correlator data

nucleon_elastic_ff.data.scripts.concat.concat_dsets(files: List[str], out_file: str, axis: int = 0, dset_replace_patterns: Optional[Dict[str, str]] = None, ignore_containers: Optional[List[str]] = None, write_unpaired_dsets: bool = False, overwrite: bool = False)[source]

Reads h5 files and exports the concatenation of datasets across files.

Each group in the file list is concatenated across files. Files are concatenated in the order they are specified.

The concatenation meta info is also stored in the resulting output file, in the meta attribute of local_current.

Note

Suppose you pass two h5 files, files = [“file1.h5”, “file2.h5”], to be written to the output file out_file = “out.h5”. Let’s assume the dset structure is as follows

file1.h5
/x1y1
/x1y2

and also

file2.h5
/x1y1
/x1y2
then the resulting output file will be

out.h5
/x1y1
/x1y2

where the dset x1y1 is np.concatenate([file1.h5/x1y1, file2.h5/x1y1], axis=axis) and so on.
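For concreteness, the relation from the note can be checked directly with h5py and numpy. This is an illustrative sketch only; the file and dset names are the placeholders from the note above, and axis=0 is assumed.

import numpy as np
import h5py

# Sketch: read the input dsets and build the expected concatenation,
# assuming file1.h5, file2.h5 and out.h5 exist with the layout shown above.
with h5py.File("file1.h5", "r") as f1, h5py.File("file2.h5", "r") as f2:
    expected = np.concatenate([f1["x1y1"][()], f2["x1y1"][()]], axis=0)

with h5py.File("out.h5", "r") as out:
    # The dset in the output file equals the concatenation of the inputs.
    assert np.array_equal(out["x1y1"][()], expected)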

Arguments
files: List[str]
List of h5 file addresses which will be read into memory and concatenated.
out_file: str
The name of the file which will contain the concatenated datasets.
axis: int = 0
The axis to concatenate over.
dset_replace_patterns: Optional[Dict[str, str]] = None
A map specifying how dset names in the input files translate to dset names in the output file. The routine only concatenates dsets which match all keys of dset_replace_patterns.
ignore_containers: Optional[List[str]] = None
Ignores the listed h5 containers (groups or dsets) when concatenating (they are not written to the output file at all).
write_unpaired_dsets: bool = False
Also write groups of data sets where the number of data sets is fewer or greater than the number of input files. In either case, a warning is printed to stdout if the numbers don’t match.
overwrite: bool = False
Overwrite existing files.
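A minimal call might look like the following sketch. The file names and the regex key/value in dset_replace_patterns are illustrative assumptions, not prescribed by the API.

from nucleon_elastic_ff.data.scripts.concat import concat_dsets

# Hypothetical input files; dsets matching the regex key are renamed using the
# value when written, and only dsets matching all keys are concatenated.
concat_dsets(
    files=["corr_cfg1000.h5", "corr_cfg1010.h5"],
    out_file="corr_concat.h5",
    axis=0,
    dset_replace_patterns={r"cfg[0-9]+": "cfg_concat"},
    overwrite=True,
)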
nucleon_elastic_ff.data.scripts.concat.concatenate(root: str, concatenation_pattern: Dict[str, str], axis: int = 0, file_match_patterns: Optional[List[str]] = None, dset_replace_patterns: Optional[Dict[str, str]] = None, expected_file_patterns: Optional[List[str]] = None, ignore_containers: Optional[List[str]] = None, overwrite: bool = False)[source]

Recursively scans a directory for files and concatenates them.

Finds all files which will be considered for grouping and feeds them to concat_dsets.

The concatenated dsets will be ordered according to the file names.

Arguments
root: str
Root directory to recursively scan for files to concatenate.
concatenation_pattern: Dict[str, str]
The regex patterns used to select files for concatenation. Input files must match the key, which will be replaced by the value. Only files sharing the same pattern will be concatenated.
axis: int = 0
The axis to concatenate over.
file_match_patterns: Optional[List[str]] = None
The regex patterns which files must match in order to be found. This list is extended by the concatenation pattern keys.
dset_replace_patterns: Optional[Dict[str, str]] = None
The patterns for dsets to be replaced after concatenation.
expected_file_patterns: Optional[List[str]] = None
Adds the expected regex patterns to the file filter patterns. After files have been filtered and grouped, checks whether all strings in this list are present in the file group. If not exactly all expected sources are found in the group, an AssertionError is raised.
ignore_containers: Optional[List[str]] = None
Ignores the listed h5 groups and dsets and does not concatenate them (they are not written to the output file at all).
overwrite: bool = False
Overwrite existing output files.
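As a sketch (the directory name, file pattern, and regex replacement below are illustrative assumptions), a call that groups files differing only in a configuration number could look like:

from nucleon_elastic_ff.data.scripts.concat import concatenate

# Hypothetical example: scan `data/` for h5 files whose names differ only in
# the configuration number and concatenate each such group along axis 0.
concatenate(
    root="data",
    concatenation_pattern={r"cfg[0-9]+": "cfg_concat"},
    axis=0,
    file_match_patterns=[r".*\.h5$"],
    overwrite=True,
)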
nucleon_elastic_ff.data.scripts.concat.main()[source]

Command line interface for concatenating a list of h5 files.