MSIGen API Reference

Submodules

MSIGen.msigen module

This module provides a function to subclass the base MSIGen object for different file formats.

class MSIGen.msigen.msigen(*args, **kwargs)[source]

Bases: object

This function subclasses the base MSIGen object for different file formats.

Parameters:
  • example_file (str or list) – The file path or list of file paths to be processed.

  • All other parameters are passed to MSIGen.base_class.MSIGen_base.

Returns:

An instance of the appropriate class based on the file format.
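The dispatch described above can be sketched as a class whose __new__ returns an instance of a format-specific subclass. This is only an illustration of the pattern: the names ReaderBase, ReaderD, ReaderMzml, READERS, and reader_factory are hypothetical and are not MSIGen's actual implementation.

```python
import os


class ReaderBase:
    """Hypothetical stand-in for a format-independent base class."""

    def __init__(self, example_file=None, **kwargs):
        self.example_file = example_file
        self.params = kwargs


class ReaderD(ReaderBase):
    """Hypothetical reader for .d files."""


class ReaderMzml(ReaderBase):
    """Hypothetical reader for .mzml files."""


# Assumed mapping from lowercase file extension to reader class.
READERS = {".d": ReaderD, ".mzml": ReaderMzml}


class reader_factory:
    """Returns an instance of the reader that matches the file extension."""

    def __new__(cls, example_file=None, **kwargs):
        if example_file is None:
            return ReaderBase(None, **kwargs)
        # Use the first path when a list or tuple of paths is given.
        if isinstance(example_file, (list, tuple)):
            first = example_file[0]
        else:
            first = example_file
        ext = os.path.splitext(first)[1].lower()
        if ext not in READERS:
            raise ValueError(f"Unsupported file extension: {ext!r}")
        # Remaining keyword arguments are forwarded to the reader class.
        return READERS[ext](example_file, **kwargs)


obj = reader_factory("line_01.d", tol_MS1=5)
print(type(obj).__name__)
```

Because __new__ returns an object that is not an instance of reader_factory, Python does not call reader_factory.__init__ afterwards, so the returned reader is used as is.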

classmethod get_metadata_and_params(*args, **kwargs)[source]

This is an alias for the __new__ method, kept for compatibility with older versions of the MSIGen_jupyter .ipynb files.

static load_pixels(path=None)[source]

This function loads pixel data from the specified file without initializing the class beforehand.

Parameters:

path (str) – The file path to load pixel data from. If path is None, the current directory will be searched for a file named pixels.npy, pixels.npz, or pixels.csv and this file will be loaded.

Returns:

Pixel data loaded from the file.

MSIGen.base_class module

This module provides a base class for MSIGen which can be subclassed to handle different file formats.

class MSIGen.base_class.HiddenPrints[source]

Bases: object

Allows code to be run without displaying messages.
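A context manager of this kind is commonly written by redirecting sys.stdout for the duration of the block. The sketch below shows that general technique under the hypothetical name QuietPrints; it is not taken from the MSIGen source.

```python
import os
import sys


class QuietPrints:
    """Hypothetical context manager that silences print() inside its block."""

    def __enter__(self):
        self._original_stdout = sys.stdout
        self._devnull = open(os.devnull, "w")
        sys.stdout = self._devnull
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Restore stdout even if the block raised an exception.
        sys.stdout = self._original_stdout
        self._devnull.close()
        return False  # do not swallow exceptions


with QuietPrints():
    print("this message is not displayed")
print("this message is displayed")
```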

class MSIGen.base_class.MSIGen_base(example_file=None, mass_list_dir=None, tol_MS1=10, tol_MS1_u='ppm', tol_prec=1, tol_prec_u='mz', tol_frag=10, tol_frag_u='ppm', tol_mob=0.1, tol_mob_u='μs', h=10, w=10, hw_units='mm', is_MS2=False, is_mobility=False, normalize_img_sizes=True, pixels_per_line='mean', output_file_loc=None, in_jupyter=True, testing=False, gui=False, save_file_format='npy', ask_confirmation=True)[source]

Bases: object

Base class for MSIGen. This class is not generally to be used directly.

It is intended to be subclassed for specific file formats (e.g., D, mzml, raw).

Parameters:
  • example_file (str, list, or tuple) – The file path or list of file paths to be processed. (default None) If type is str, it should be a single file path and all other files with the same name apart from the line number will be used. If the type is list or tuple, the provided files will be the only ones processed. None initializes the base class without data files.

  • mass_list_dir (str) – The directory containing the mass list file. (default None)

  • tol_MS1 (float) – Tolerance for MS1 mass selection. (default 10.)

  • tol_MS1_u (str) – Units for MS1 tolerance (‘ppm’ or ‘mz’). (default ‘ppm’)

  • tol_prec (float) – Tolerance for precursor mass selection in MS2 entries of the mass list. (default 1.)

  • tol_prec_u (str) – Units for precursor tolerance (‘ppm’ or ‘mz’). (default ‘mz’)

  • tol_frag (float) – Tolerance for fragment mass selection in MS2 entries of the mass list. (default 10.)

  • tol_frag_u (str) – Units for fragment tolerance (‘ppm’ or ‘mz’). (default ‘ppm’)

  • tol_mob (float) – Tolerance for mobility selection. Ignored if no mobility data is present. (default 0.1)

  • tol_mob_u (str) – Units for mobility tolerance (‘μs’ or ‘1/K0’). (default ‘μs’)

  • h (float) – Height of the image in specified units. (default 10.)

  • w (float) – Width of the image in specified units. (default 10.)

  • hw_units (str) – Units for height and width. (default “mm”)

  • is_MS2 (bool) – Flag indicating if the data files contain MS2 information. (default False)

  • is_mobility (bool) – Flag indicating if the data files contain mobility information. (default False)

  • normalize_img_sizes (bool) – Flag indicating if image sizes should be normalized. (default True) If True, all images will be resized to the same size and the data can be saved as an .npy or .csv file. If False, images will be saved in their original sizes and the data will be saved as an .npz file.

  • pixels_per_line (str) – Number of pixels per line (‘mean’, ‘min’, ‘max’, or a specific integer). (default “mean”) If “mean”, the mean number of pixels per line will be used. If “min”, the minimum number of pixels per line will be used. If “max”, the maximum number of pixels per line will be used. If an integer, that number of pixels will be used.

  • output_file_loc (str) – Location to save the output files. (default None)

  • in_jupyter (bool) – Flag indicating if the code is running in a Jupyter notebook. (default True)

  • testing (bool) – Flag indicating if the code is in testing mode. (default False)

  • gui (bool) – Flag indicating if the GUI is being used. (default False)

  • save_file_format (str) – Format to save the output files (‘npy’, ‘npz’, or ‘csv’). (default “npy”) If “npy”, the data will be saved as a .npy file, or as an npz file if normalize_img_sizes is False and MS2 is True. If “npz”, the data will be saved as a .npz file. If “csv”, the data will be saved as a .csv file. normalize_img_sizes will be set to True in this case.

  • ask_confirmation (bool) – Flag indicating if the user should be asked for confirmation before overwriting existing files. (default True)
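The pixels_per_line options listed above can be illustrated with a small helper that turns the option into a concrete width. The function name resolve_pixels_per_line, its spectra_per_line argument, and the rounding of the mean are assumptions made for this sketch.

```python
def resolve_pixels_per_line(option, spectra_per_line):
    """Turn a pixels_per_line option into a concrete integer width.

    spectra_per_line is a hypothetical list holding the number of spectra
    acquired in each line scan.
    """
    if isinstance(option, int) and not isinstance(option, bool):
        return option
    if option == "mean":
        # Rounding to the nearest integer is an assumption of this sketch.
        return int(round(sum(spectra_per_line) / len(spectra_per_line)))
    if option == "min":
        return min(spectra_per_line)
    if option == "max":
        return max(spectra_per_line)
    raise ValueError("pixels_per_line must be 'mean', 'min', 'max', or an integer")


counts = [98, 100, 103]
print(resolve_pixels_per_line("mean", counts))
print(resolve_pixels_per_line("max", counts))
print(resolve_pixels_per_line(120, counts))
```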

Other Attributes:
self.results (dict):

Dictionary to store results and metadata for the GUI. (default {})

self.HiddenPrints (HiddenPrints):

A context manager to suppress print statements.

self.data_format (str, None):

The data format of the .d files to be read. (default None) Unnecessary unless using .d files.

self.NpEncoder (NpEncoder):

A custom JSON encoder for numpy data types.

self.get_confirmation_dialogue (get_confirmation):

A dialog to confirm overwriting existing files.

self.numba_present (bool):

Flag indicating if numba is available. (default False)

self.verbose (int):

Verbosity level for print statements. (default 0)

self.tkinter_widgets (list):

List of tkinter widgets for the GUI. (default [None, None, None])

assign_values_to_pixel_njit(intensities, idxs_to_sum)[source]

Assigns values to pixels array based on the provided intensities and indices. Uses numba njit to speed up the process. Will run the same function without njit if numba is not present.

Parameters:
  • intensities (np.ndarray) – The intensity values to be assigned to the pixel.

  • idxs_to_sum (np.ndarray) – The indices of the intensity values to sum for each pixel.

Returns:

The summed intensity values for each pixel.

Return type:

np.array

check_dim(*args, **kwargs)[source]

Implemented in subclasses.

check_for_existing_files(json_path, pixels_path)[source]

Checks if the specified JSON and pixels files already exist.

column_header_check(raw_mass_list=None)[source]

Ensures that the column headers include a valid combination of mass, precursor, fragment, mobility, and polarity columns.

Parameters:

raw_mass_list (DataFrame) – The DataFrame containing the raw mass list. If None, self.raw_mass_list will be used.

Returns:
  • raw_mass_list_col_type (list) – List of column types for the mass list.

  • raw_mass_list_col_idxs (list) – List of indices for the selected mass list columns.

column_type_check(raw_mass_list=None)[source]

Checks the column headers of the mass list and returns a list of column types. (m/z, precursor, fragment, mobility, or polarity)

Parameters:

raw_mass_list (DataFrame) – The DataFrame containing the raw mass list. If None, self.raw_mass_list will be used.

Returns:

col_type (list) – List of column classifications in the mass list based on the column header.

confirm_overwrite_file(file_list)[source]

Prompts the user to confirm overwriting existing files.

consolidate_filter_list(filters_info, mzsPerFilter, scans_per_filter, mzsPerFilter_lb, mzsPerFilter_ub, mzIndicesPerFilter)[source]

Groups together MS2 filters that are present in the same scans. This is necessary to deal with the case where MS2 filters do not have matching mass ranges, which is common with Agilent data.

display_mass_list()[source]

Displays the mass list as a DataFrame with appropriate formatting.

extract_masses_no_mob(mz, lb, ub, intensity_points)[source]

Finds all values of mz within the mass windows defined by the lower bounds lb and upper bounds ub and returns the summed intensities of those m/z values. The provided data must be sorted from lowest to greatest m/z and must not contain mobility information. The lengths of lb and ub must be the same.

Parameters:
  • mz (np.ndarray) – The m/z values to search through.

  • lb (list) – The lower bounds of the mass windows.

  • ub (float) – The upper bounds of the mass windows.

  • intensity_points (np.ndarray) – The intensity values corresponding to each m/z value.

Returns:

pixel (np.ndarray) – The summed intensity values for each mass window.
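The windowed summation described for extract_masses_no_mob relies on the m/z values being sorted, which allows each window to be located by binary search. Below is a minimal pure-Python sketch of that idea. The function name sum_in_windows and the inclusive treatment of the bounds are assumptions, and the library itself works on NumPy arrays rather than lists.

```python
from bisect import bisect_left, bisect_right


def sum_in_windows(mz, lb, ub, intensities):
    """Sum the intensities whose m/z falls inside each [lb, ub] window.

    mz must be sorted in increasing order; lb and ub must have equal length.
    Inclusive bounds are an assumption of this sketch.
    """
    if len(lb) != len(ub):
        raise ValueError("lb and ub must have the same length")
    sums = []
    for low, high in zip(lb, ub):
        start = bisect_left(mz, low)    # first index with mz >= low
        stop = bisect_right(mz, high)   # first index with mz > high
        sums.append(sum(intensities[start:stop]))
    return sums


mz = [100.0, 150.0, 150.001, 200.0, 250.0]
intensities = [1.0, 2.0, 3.0, 4.0, 5.0]
pixel = sum_in_windows(mz, [149.99, 199.0, 300.0], [150.01, 251.0, 301.0], intensities)
print(pixel)
```

A window that contains no m/z values yields an empty slice and therefore a sum of zero.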

static flatten_list(l)[source]

Flattens a nested list into a single list.
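The description does not say whether only one level of nesting or every level is flattened. The sketch below assumes full recursive flattening and uses the hypothetical name flatten.

```python
def flatten(nested):
    """Recursively flatten nested lists and tuples into one flat list."""
    flat = []
    for item in nested:
        if isinstance(item, (list, tuple)):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat


print(flatten([[1, 2], [3, [4, 5]], 6]))
```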

get_CountsPerFilter(filters_info)[source]

Gets information about the peaks present in each ms2 filter.

get_ScansPerFilter(*args, **kwargs)[source]

get_all_ms_and_mobility_windows(mass_lists=None, tolerances=None, tolerance_units=None)[source]

Determines the upper and lower bounds for each mass or mobility window based on the provided mass lists, tolerances, and units.

Parameters:
  • mass_lists (list) – List of mass lists to use for the selection windows. If None, self.mass_list will be used.

  • tolerances (list) – List of tolerances to use for the selection windows. If None, self.tolerances will be used.

  • tolerance_units (list) – List of units for the tolerances. If None, self.tolerance_units will be used.

Returns:
  • lower_lims (list of arrays) – The lower limit of each selection window.

  • upper_lims (list of arrays) – The upper limit of each selection window.

get_attr_values(metadata, source, attr_list, save_names=None, metadata_dicts=None)[source]

Gets the values of the specified attributes from the metadata dictionary.

get_basic_instrument_metadata(*args, **kwargs)[source]

get_default_load_path()[source]

Returns the default path for loading pixel data. Searches the working directory for a file named ‘pixels.npz’, ‘pixels.npy’, or ‘pixels.csv’.
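A directory search of this kind can be sketched as follows. The function name find_default_pixels_file is hypothetical, and the precedence among the three file names when more than one exists is an assumption based on the order in which they are listed.

```python
import os
import tempfile

# Candidate names in the order they are listed in the description.
CANDIDATES = ("pixels.npz", "pixels.npy", "pixels.csv")


def find_default_pixels_file(directory="."):
    """Return the path of the first candidate found in directory, or None."""
    for name in CANDIDATES:
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            return path
    return None


# Demonstrate with a temporary directory that contains only pixels.csv.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "pixels.csv"), "w").close()
    found = find_default_pixels_file(tmp)
    print(os.path.basename(found))
```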

static get_file_extension(example_file)[source]

Returns the file extension of the provided example file.

get_filter_idx(*args, **kwargs)[source]

get_filters_info(all_filters_list)[source]

Collects information that would be present in Thermo filters.

get_image_data(**kwargs)[source]

Processes the image data for the specified mass list and line list. Saves and returns the processed image data. Requires a subclass to run successfully. All arguments are optional and will update the corresponding class attribute values. Accepted arguments are: verbose, in_jupyter, testing, gui, results, pixels_per_line, tkinter_widgets

Returns:
  • metadata (dict) – Dictionary containing metadata about the image data.

  • pixels (np.array or list) – A 3D array or list of pixel image data extracted from the image.

get_line_list(example_file=None, display=False)[source]

Returns a list of file names for each line scan in the experiment in increasing order of line number.

Parameters:
  • example_file (str) – The example file name to use for determining the line list. All files in the same directory that match the naming scheme and file extension will be included. If None, self.example_file will be used.

  • display (bool) – If True, the line list will be printed to the console. (default False)

Returns:

line_list (list) – List of file names for each line scan in the experiment in increasing order of line number.

get_mass_list(mass_list_dir=None, header=0, sheet_name=0)[source]

Reads the mass list file and returns a DataFrame containing the mass list.

Parameters:
  • mass_list_dir (str) – The directory containing the mass list file. If None, self.mass_list_dir will be used.

  • header (int) – The row number to use as the header in the spreadsheet. (default 0)

  • sheet_name (str or int) – The name (if str) or index (if int) of the sheet to read. (default 0)

Returns:

mass_list (list of lists) – List containing mass/mobility lists split based on MS level. Each sublist contains the mass, mobility, or polarity values for that MS level.

get_mass_mobility_lists(raw_mass_list=None, col_type=None, col_idxs=None)[source]

Returns a list containing mass and mobility values, and any other values such as polarity, from the raw mass list DataFrame based on col_type and col_idxs.

Parameters:
  • raw_mass_list (DataFrame) – The DataFrame containing the raw mass list. If None, self.raw_mass_list will be used.

  • col_type (list) – List of column types to use. If None, self.raw_mass_list_col_type will be used.

  • col_idxs (list) – List of indices for the selected mass list columns. If None, self.raw_mass_list_col_idxs will be used.

Returns:

self.mass_list (list) – List containing mass/mobility lists split based on MS level. Each sublist contains the mass, mobility, or polarity values for that MS level.

get_mass_or_mobility_window(val_list, tol, unit)[source]

Determines the upper and lower bounds for a selection window based on the provided values, tolerance, and unit. Defines the lower_lims and upper_lims attributes of the class. Treats entries of 0, which are usually blanks in the mass list, as having a window of 0 to infinity.

Parameters:
  • val_list (list) – List of values for which to determine the selection window. These values define the center of the selection window.

  • tol (float) – Tolerance value for the selection window. The window will be +/- this value.

  • unit (str) – Units for the tolerance (ex. ‘ppm’, ‘mz’, etc).
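The window calculation can be sketched as below. The ppm formula is the conventional one (half-width = value × tol / 10^6) and is an assumption here, as is treating every other unit as an absolute tolerance; the function name selection_window is hypothetical.

```python
import math


def selection_window(values, tol, unit):
    """Return (lower, upper) bound lists for each value in values.

    Assumptions of this sketch: 'ppm' is a relative tolerance and any
    other unit is an absolute +/- tolerance.
    """
    lower, upper = [], []
    for value in values:
        if value == 0:
            # Blank entries select everything: a window of 0 to infinity.
            lower.append(0.0)
            upper.append(math.inf)
            continue
        half_width = value * tol / 1e6 if unit == "ppm" else tol
        lower.append(value - half_width)
        upper.append(value + half_width)
    return lower, upper


lows, highs = selection_window([500.0, 0.0], tol=10, unit="ppm")
print(lows, highs)
```

With a 10 ppm tolerance, an entry at m/z 500 gets a half-width of 0.005, while the blank entry matches any value.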

get_metadata_and_params(**kwargs)[source]

Initializes or resets the metadata and parameters for the class. The valid keys and their default values are:

‘tol_MS1’: 10, ‘tol_MS1_u’: ‘ppm’, ‘tol_prec’: 1, ‘tol_prec_u’: ‘mz’, ‘tol_frag’: 10, ‘tol_frag_u’: ‘ppm’, ‘tol_mob’: 0.1, ‘tol_mob_u’: ‘μs’, ‘h’: 10, ‘w’: 10, ‘hw_units’: ‘mm’, ‘is_MS2’: False, ‘is_mobility’: False, ‘normalize_img_sizes’: True, ‘output_file_loc’: None, ‘in_jupyter’: True, ‘testing’: False, ‘gui’: False, ‘pixels_per_line’: ‘mean’

get_num_spe_per_group_aligned(scans_per_filter_grp, normalize_img_sizes=None, pixels_per_line=None)[source]

Determines the number of spectra per filter group. If normalize_img_sizes is True, all images will be resized to the same size, being the maximum number of spectra per filter group. If normalize_img_sizes is False, each filter group is independently resized, resulting in images of varying size.

get_raw_files(name_body=None, name_post=None)[source]

Returns a list of raw files in the directory that match the naming scheme of the example file.

Parameters:
  • name_body (str) – The body of the file name to match (the absolute path except for numbers at the end of the file name). If None, self.name_body will be used.

  • name_post (str) – The file extension to match (including the dot). If None, self.name_post will be used.

Returns:

raw_files (list) – List of raw files in the directory that match the naming scheme of the example file.

get_scan_without_zeros(*args, **kwargs)[source]

Implemented in subclasses.

load_files(*args, **kwargs)[source]

Implemented in subclasses.

load_pixels(path=None)[source]

Loads pixel data from the specified file without initializing the class beforehand. These files can be in the .npz, .npy, or .csv format and must have a corresponding metadata file in .json format. If path is None, it uses the default load path.

make_metadata_dict()[source]

Creates a metadata dictionary containing information about the mass list, tolerances, and other parameters.

ms1_interp(pixels, rts=None, mass_list=None, pixels_per_line=None)[source]

Interpolates MS1 data to create a 2D image for each entry in the mass list. Interpolation is done by normalizing retention times of each line to be between 0 and 1. A 2D grid is created with the specified height (number of lines) and width (pixels per line) and the data is interpolated onto this grid using nearest-neighbor interpolation.

Parameters:
  • pixels (np.ndarray) – List of arrays of shape (pixels_per_line, m/z) containing intensity data. Each entry represents a single line scan.

  • rts (list of np.array) – (optional) List of retention times for each line. If None, uses the class attribute self.rts.

  • mass_list (np.ndarray) – (optional) 2D array of shape (m/z, 1) containing the mass list. If None, uses self.MS1_list.

  • pixels_per_line (str or int) – (optional) Number of pixels per line for the output image. If None, uses self.pixels_per_line. Valid options are “min”, “max”, “mean”, or an integer.

Returns:

pixels_aligned (np.ndarray) – 3D array of shape (m/z+1, lines, pixels_per_line) containing the interpolated data. The last dimension contains the mass list and the intensity data for each pixel.
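The interpolation step can be illustrated for a single line scan and a single m/z channel. This sketch normalizes the retention times to the range 0 to 1 and picks, for each output pixel, the spectrum whose normalized time is nearest. The evenly spaced target grid and the name interpolate_line are assumptions; the library operates on whole arrays rather than Python lists.

```python
def interpolate_line(rts, intensities, pixels_per_line):
    """Resample one line scan onto an evenly spaced grid by nearest neighbor.

    rts must be increasing and contain at least two distinct values.
    The evenly spaced grid over [0, 1] is an assumption of this sketch.
    """
    if pixels_per_line < 2:
        raise ValueError("pixels_per_line must be at least 2 in this sketch")
    first, last = rts[0], rts[-1]
    # Normalize retention times so the line runs from 0 to 1.
    normalized = [(rt - first) / (last - first) for rt in rts]
    resampled = []
    for k in range(pixels_per_line):
        target = k / (pixels_per_line - 1)
        # Index of the spectrum whose normalized time is closest to target.
        nearest = min(range(len(normalized)),
                      key=lambda i: abs(normalized[i] - target))
        resampled.append(intensities[nearest])
    return resampled


line = interpolate_line([0.0, 1.0, 2.0, 5.0], [10, 20, 30, 40], pixels_per_line=5)
print(line)
```

Normalizing each line independently lets lines with different acquisition durations or spectrum counts be placed on the same grid.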

ms1_mob(*args, **kwargs)[source]

Implemented in subclasses.

ms1_no_mob(*args, **kwargs)[source]

Implemented in subclasses.

ms2_interp(pixels_metas, all_TimeStamps, acq_times, scans_per_filter_grp, mzs_per_filter_grp, normalize_img_sizes=None, pixels_per_line=None)[source]

Interpolates MS2 data to create a 2D image for each entry in the mass list. Interpolation is done by normalizing retention times of each line to be between 0 and 1. If normalize_img_sizes is True, the interpolation is the same as in ms1_interp. If normalize_img_sizes is False, each filter group is independently interpolated, resulting in images of varying size stored in a list.

Parameters:
  • pixels_metas (list) – A list of lists of 2D arrays of shape (pixels_per_line, # of m/z in the group) containing intensity data. Each entry represents a single line scan that is made up of a list representing each group of transitions.

  • all_TimeStamps (list of np.array) – A nested list containing retention times for each pixel in pixels_metas.

  • acq_times (list of np.array) – A nested list containing the acquisition times for all spectra in each line scan, whether used or not.

  • scans_per_filter_grp (list) – A list of lists containing the number of spectra per filter group for each line scan.

  • mzs_per_filter_grp (list) – A list of lists containing the m/z values for each filter group for each line scan.

  • normalize_img_sizes (bool) – (optional) If True, all images will be resized to the same size, being the maximum number of spectra per filter group.

  • pixels_per_line (str or int) – (optional) Number of pixels per line for the output images. If None, uses self.pixels_per_line. Valid options are “min”, “max”, “mean”, or an integer.

ms2_mob(*args, **kwargs)[source]

Implemented in subclasses.

ms2_no_mob(*args, **kwargs)[source]

Implemented in subclasses.

normalize_ms2_timestamps(all_TimeStamps, acq_times)[source]

Normalizes the retention times of each line to be between 0 and 1 for MS2 data.

pixels_list_to_array(pixels, all_TimeStamps_aligned)[source]

Converts a list of pixels to a numpy array. Only to be used when all images are the same size.

progressbar_start_extraction()[source]

Displays the progress bar showing completion of data processing.

progressbar_start_preprocessing()[source]

Displays the progress bar while preprocessing the data.

progressbar_update_progress(num_spe, i, j)[source]

Updates the progress bar with the current progress.

Parameters:
  • num_spe (int) – The number of spectra in the current line scan.

  • i (int) – The current line number being processed.

  • j (int) – The current spectrum number being processed.

reorder_pixels(pixels, consolidated_filter_list, mz_idxs_per_filter_grp, mass_list_idxs)[source]

Reorders the pixels to match the order of the mass list.

resize_images_to_same_size(pixels)[source]

Resizes all images in the pixels list to the same size.

static run_GUI()[source]

Runs the MSIGen GUI.

save_pixels(metadata=None, pixels=None, MSI_data_output=None, file_format=None, ask_confirmation=True)[source]

Saves the pixels and metadata to a file in the specified format. The file format can be .npy, .npz, or .csv. .npy and .csv are used for saving images of the same size, whereas .npz is used for saving images of different sizes. If ask_confirmation is True, the user will be prompted to confirm overwriting existing files, otherwise it will overwrite them without asking. If an error occurs here, it will just be a warning and the program will continue.

segment_filename(file_name=None)[source]

Segments file_name into the body (everything before the number preceding the file extension) and post (extension) parts. If file_name is None, it uses the example_file attribute.

Parameters:

file_name (str) – The file name to segment. If None, self.example_file will be used.

Returns:
  • name_body (str) – The body of the file name (everything before the number preceding the file extension).

  • name_post (str) – The file extension (including the dot).
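The split into body and extension can be sketched with a regular expression. The pattern and the name split_line_filename are assumptions made for illustration; the library may parse file names differently.

```python
import re


def split_line_filename(file_name):
    """Split 'body<digits>.ext' into (body, extension including the dot)."""
    match = re.match(r"^(.*?)(\d+)(\.[^.\\/]+)$", file_name)
    if match is None:
        raise ValueError(f"No line number found before the extension: {file_name!r}")
    name_body, _line_number, name_post = match.groups()
    return name_body, name_post


print(split_line_filename("data/run2_line12.raw"))
```

Digits that appear earlier in the name, such as the 2 in run2, stay in the body because only the digits directly before the extension are treated as the line number.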

select_mass_list_cols_to_use(col_types)[source]

Filters the column types to include only the valid ones and returns the filtered list and their indices.

Parameters:

col_types (list) – List of column types to filter.

Returns:
  • col_types_filtered (list) – List of column types with duplicates and columns with NoneType classifications removed.

  • col_idxs (list) – List of indices corresponding to the columns in col_types_filtered.

sort_raw_files(raw_files, name_body=None, name_post=None)[source]

Sorts the raw files in ascending order based on their line numbers. The line numbers are extracted from the file names by removing the name_body and name_post parts.

Parameters:
  • raw_files (list) – List of raw files to sort.

  • name_body (str) – The body of the file name to match (the absolute path except for numbers at the end of the file name). If None, self.name_body will be used.

  • name_post (str) – The file extension to match. If None, self.name_post will be used.

Returns:

sorted_raw_files (list) – List of raw files sorted in ascending order based on their line numbers.
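Sorting by the extracted line number rather than by the file name matters because a plain string sort places line10 before line2. A minimal sketch, using the hypothetical name sort_by_line_number:

```python
def sort_by_line_number(raw_files, name_body, name_post):
    """Sort file names by the integer between name_body and name_post."""

    def line_number(path):
        # Strip the shared prefix and the extension to leave only the digits.
        return int(path[len(name_body):len(path) - len(name_post)])

    return sorted(raw_files, key=line_number)


files = ["scan_line10.raw", "scan_line2.raw", "scan_line1.raw"]
print(sort_by_line_number(files, "scan_line", ".raw"))
```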

static sorted_slice(a, l, r)[source]

Outputs the indices where the values of numpy array (a) are within a given lower (l) and upper (r) bound. Array (a) must be in order of increasing value for this to be used.

Parameters:
  • a (np.ndarray) – The array to search through.

  • l (float) – The lower bound of the mass window.

  • r (float) – The upper bound of the mass window.

Returns:

The indices of the values in the array that are within the given bounds.

Return type:

np.array

static vectorized_sorted_slice(a, l, r)[source]

Outputs a list of indices where the values of the numpy array (a) are within the lower and upper bounds for each entry in the vectors of lower (l) and upper (r) bounds. Array (a) must be in order of increasing value for this to be used. The lengths of l and r must be the same.

Parameters:
  • a (np.ndarray) – The array to search through.

  • l (float) – The lower bounds of the mass windows.

  • r (float) – The upper bounds of the mass windows.

Returns:

A 2D array of indices where the values of the array are within the given bounds. Each row corresponds to a mass window defined by the lower and upper bounds. Each column corresponds to an index in the array that is within the bounds.

Return type:

np.array

vectorized_sorted_slice_njit(a, l, r)[source]

Outputs a list of indices where the values of the numpy array (a) are within the lower and upper bounds for each entry in the vectors of lower (l) and upper (r) bounds. Array (a) must be in order of increasing value for this to be used. The lengths of l and r must be the same. If numba is not present, this will run vectorized_sorted_slice instead.

Parameters:
  • a (np.ndarray) – The array to search through.

  • l (float) – The lower bounds of the mass windows.

  • r (float) – The upper bounds of the mass windows.

Returns:

A 2D array of indices where the values of the array are within the given bounds. Each row corresponds to a mass window defined by the lower and upper bounds. Each column corresponds to an index in the array that is within the bounds.

Return type:

np.array

static vectorized_unsorted_slice(mz, lbs, ubs)[source]

Outputs a list of indices where the values of the numpy array (mz) are within the lower and upper bounds for each entry in the vectors of lower (lbs) and upper (ubs) bounds. Works with unsorted arrays.

Parameters:
  • mz (np.ndarray) – The array to search through.

  • lbs (float) – The lower bounds of the mass windows.

  • ubs (float) – The upper bounds of the mass windows.

Returns:

A 2D array of indices where the values of the array are within the given bounds. Each row corresponds to a mass window defined by the lower and upper bounds. Each column corresponds to an index in the array that is within the bounds.

Return type:

np.array

static vectorized_unsorted_slice_mob(mz, mob, lbs, ubs, mob_lbs, mob_ubs)[source]

Outputs a list of indices where the values of the m/z array (mz) are within the lower and upper bounds for each entry in the vectors of lower (lbs) and upper (ubs) bounds, and where the values of the mobility array (mob) are within the lower and upper bounds for each entry in the vectors of lower (mob_lbs) and upper (mob_ubs) bounds. Works with unsorted arrays.

Parameters:
  • mz (np.ndarray) – The array to search through.

  • mob (np.ndarray) – The mobility array to search through.

  • lbs (np.ndarray) – The lower bounds of the mass windows.

  • ubs (np.ndarray) – The upper bounds of the mass windows.

  • mob_lbs (np.ndarray) – The lower bounds of the mobility windows.

  • mob_ubs (np.ndarray) – The upper bounds of the mobility windows.

Returns:

A 2D array of indices where the values of the array are within the given bounds. Each row corresponds to a mass/mobility window defined by the lower and upper bounds. Each column corresponds to an index in the array that is within the bounds.

Return type:

np.array
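The combined m/z and mobility selection can be sketched in pure Python as a per-window filter over all points. The function name indices_in_mz_and_mobility_windows and the inclusive bounds are assumptions, and this version returns a list of index lists rather than the 2D array described above.

```python
def indices_in_mz_and_mobility_windows(mz, mob, lbs, ubs, mob_lbs, mob_ubs):
    """Return, for each window, the indices that satisfy both sets of bounds.

    No sorting of mz or mob is required. Inclusive bounds are an
    assumption of this sketch.
    """
    rows = []
    for low, high, mob_low, mob_high in zip(lbs, ubs, mob_lbs, mob_ubs):
        rows.append([
            i for i, (m, k) in enumerate(zip(mz, mob))
            if low <= m <= high and mob_low <= k <= mob_high
        ])
    return rows


mz = [300.2, 150.1, 300.2, 150.1]
mob = [0.9, 1.1, 1.4, 0.7]
rows = indices_in_mz_and_mobility_windows(
    mz, mob,
    lbs=[150.0, 300.0], ubs=[150.2, 300.4],
    mob_lbs=[1.0, 0.8], mob_ubs=[1.2, 1.0],
)
print(rows)
```

A point is selected only when it falls inside both the mass window and the mobility window of the same entry, so two ions with the same m/z but different mobility are separated.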

class MSIGen.base_class.NpEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]

Bases: JSONEncoder

Custom JSON encoder for numpy data types.

default(obj)[source]

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return super().default(o)

MSIGen.base_class.custom_warning(msg, err=None, category=<class 'UserWarning'>, stacklevel=2)[source]

Custom warning function that formats warnings without showing the source line.

Includes filename:line + category + message, but omits the repeated source line.
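One common way to get output of this shape is a formatter with the same signature as the standard library's warnings.formatwarning. The sketch below shows such a formatter under the hypothetical name format_without_source_line; how MSIGen implements custom_warning internally is not shown in this reference.

```python
def format_without_source_line(message, category, filename, lineno, line=None):
    """Format a warning as 'file:line: Category: message' with no source echo."""
    # The line argument (the source text) is accepted but deliberately unused.
    return f"{filename}:{lineno}: {category.__name__}: {message}\n"


text = format_without_source_line(
    "pixels could not be saved", UserWarning, "example.py", 42
)
print(text, end="")
```

A function with this signature can be assigned to warnings.formatwarning to change how the standard warnings module renders every warning it displays.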

class MSIGen.base_class.get_confirmation(file_list)[source]

Bases: object

Dialog to confirm overwriting existing files.

callback(value)[source]

Get the user entry and exit the overwrite confirmation window.

MSIGen.D module

This module contains a subclass of the base MSIGen class for handling files with the .d file extension. This includes Bruker .tsf, .baf, and .tdf formats and Agilent formats that do not contain ion mobility data.

class MSIGen.D.MSIGen_D(*args, **kwargs)[source]

Bases: MSIGen_base

MSIGen_D class for processing mass spectrometry data from Agilent and Bruker formats.

Inherits from the base MSIGen_base class and implements methods for loading and processing data.

This class is designed to handle different file formats, including Agilent .d files and Bruker .tsf/.baf/.tdf files.

data_format

The format of the data file. Can be “bruker_tsf”, “bruker_baf”, “bruker_tdf”, or “agilent”.

Type:

str, None

scanTypeDict

Dictionary mapping scan types to their descriptions for Agilent files.

Type:

dict

scanLevelDict

Dictionary mapping scan levels to their descriptions for Agilent files.

Type:

dict

ionModeDict

Dictionary mapping ion modes to their descriptions for Agilent files.

Type:

dict

scanModeDict

Dictionary mapping scan modes to their descriptions for Agilent files.

Type:

dict

deviceTypeDict

Dictionary mapping device types to their descriptions for Agilent files.

Type:

dict

ionPolarityDict

Dictionary mapping ion polarities to their descriptions for Agilent files.

Type:

dict

desiredModeDict

Dictionary mapping desired modes to their corresponding values for Agilent files.

Type:

dict

agilent_d_ms1_no_mob(metadata=None, in_jupyter=None, testing=None, gui=None, pixels_per_line=None, tkinter_widgets=None)[source]

Data processing for Agilent .d files with only MS1 data.

Parameters:
  • metadata (dict) – Metadata dictionary to store instrument information. Overwrites self.metadata if provided.

  • in_jupyter (bool) – Flag indicating if the code is running in a Jupyter notebook. Overwrites self.in_jupyter if provided.

  • testing (bool) – Flag for testing mode. Overwrites self.testing if provided.

  • gui (bool) – Flag for GUI mode. Overwrites self.gui if provided.

  • pixels_per_line (int) – Number of pixels per line for the output image. Overwrites self.pixels_per_line if provided.

  • tkinter_widgets – Tkinter widgets for GUI progress bar. Overwrites self.tkinter_widgets if provided.

Returns:
  • metadata (dict) – Updated metadata dictionary with instrument information.

  • pixels_aligned (np.ndarray) – 3D array of intensity data of shape (m/z+1, lines, pixels_per_line).

agilent_d_ms2_no_mob(metadata=None, normalize_img_sizes=None, in_jupyter=None, testing=None, gui=None, pixels_per_line=None, tkinter_widgets=None)[source]

Data processing for Agilent .d files that contain MS2 data.

Parameters:
  • metadata (dict) – Metadata dictionary to store instrument information. Overwrites self.metadata if provided.

  • normalize_img_sizes (bool) – Flag indicating if image sizes should be normalized. Overwrites self.normalize_img_sizes if provided.

  • in_jupyter (bool) – Flag indicating if the code is running in a Jupyter notebook. Overwrites self.in_jupyter if provided.

  • testing (bool) – Flag for testing mode. Overwrites self.testing if provided.

  • gui (bool) – Flag for GUI mode. Overwrites self.gui if provided.

  • pixels_per_line (int) – Number of pixels per line for the output image. Overwrites self.pixels_per_line if provided.

  • tkinter_widgets – Tkinter widgets for GUI progress bar. Overwrites self.tkinter_widgets if provided.

Returns:
  • metadata (dict) – Updated metadata dictionary with instrument information.

  • pixels_aligned (np.ndarray) – 3D array of intensity data of shape (m/z+1, lines, pixels_per_line) or list of ion image arrays of shape (height, width).

bruker_d_ms1_no_mob(metadata=None, in_jupyter=None, testing=None, gui=None, pixels_per_line=None, tkinter_widgets=None)[source]

Data processing from Bruker .tsf/.baf files containing only MS1 data.

Parameters:
  • metadata (dict) – Metadata dictionary to store instrument information. Overwrites self.metadata if provided.

  • in_jupyter (bool) – Flag indicating if the code is running in a Jupyter notebook. Overwrites self.in_jupyter if provided.

  • testing (bool) – Flag for testing mode. Overwrites self.testing if provided.

  • gui (bool) – Flag for GUI mode. Overwrites self.gui if provided.

  • pixels_per_line (int) – Number of pixels per line for the output image. Overwrites self.pixels_per_line if provided.

  • tkinter_widgets – Tkinter widgets for GUI progress bar. Overwrites self.tkinter_widgets if provided.

Returns:
  • metadata (dict) – Updated metadata dictionary with instrument information.

  • pixels_aligned (np.ndarray) – 3D array of intensity data of shape (m/z+1, lines, pixels_per_line).

bruker_d_ms2_no_mob(metadata=None, normalize_img_sizes=None, in_jupyter=None, testing=None, gui=None, pixels_per_line=None, tkinter_widgets=None)[source]

Data processing for Bruker .tsf/.baf files that contain MS2 data.

Parameters:
  • metadata (dict) – Metadata dictionary to store instrument information. Overwrites self.metadata if provided.

  • normalize_img_sizes (bool) – Flag indicating if image sizes should be normalized. Overwrites self.normalize_img_sizes if provided.

  • in_jupyter (bool) – Flag indicating if the code is running in a Jupyter notebook. Overwrites self.in_jupyter if provided.

  • testing (bool) – Flag for testing mode. Overwrites self.testing if provided.

  • gui (bool) – Flag for GUI mode. Overwrites self.gui if provided.

  • pixels_per_line (int) – Number of pixels per line for the output image. Overwrites self.pixels_per_line if provided.

  • tkinter_widgets – Tkinter widgets for GUI progress bar. Overwrites self.tkinter_widgets if provided.

Returns:
  • metadata (dict) – Updated metadata dictionary with instrument information.

  • pixels_aligned (np.ndarray) – 3D array of intensity data of shape (m/z+1, lines, pixels_per_line) or list of ion image arrays of shape (height, width).

check_dim(ShowNumLineSpe=False)[source]

Gets the acquisition times and other information about each scan to decide what mass list entries can be obtained from each scan.

Returns:
  • acq_times (list) – A list of acquisition times for each line.

  • filter_list (list) – A list of information that would be included in Thermo-style filter strings for each line.

determine_file_format(example_file=None)[source]

Determines the file format and MS level of the provided example file. If the data contains any MS2 scans, the MS level is “MS2”, otherwise it is “MS1”.

Returns:
  • data_format (str) – The format of the data file. Can be “agilent”, “bruker_tsf”, “bruker_baf”, or “bruker_tdf”.

  • MS_level (str) – The MS level of the data file. Can be “MS1” or “MS2”.
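As a rough illustration of how the four format names could be told apart, the sketch below inspects the contents of a .d directory. It rests on the assumption that Bruker .d folders contain analysis.tsf, analysis.baf, or analysis.tdf and that Agilent .d folders contain an AcqData subfolder. The helper guess_d_format is hypothetical and is not MSIGen's implementation; it also does not attempt to determine the MS level.

```python
import os
import tempfile


def guess_d_format(d_path):
    """Guess the vendor format of a .d directory from its contents.

    Assumption: Bruker .d folders hold analysis.tsf, analysis.baf, or
    analysis.tdf, and Agilent .d folders hold an AcqData subfolder.
    Hypothetical helper, not part of MSIGen.
    """
    contents = {name.lower() for name in os.listdir(d_path)}
    for ext in ("tsf", "baf", "tdf"):
        if f"analysis.{ext}" in contents:
            return f"bruker_{ext}"
    if "acqdata" in contents:
        return "agilent"
    raise ValueError(f"Unrecognized .d layout: {d_path}")


# Demonstration on a throwaway directory that mimics a Bruker .tsf dataset.
with tempfile.TemporaryDirectory() as tmp:
    d_dir = os.path.join(tmp, "line_01.d")
    os.makedirs(d_dir)
    open(os.path.join(d_dir, "analysis.tsf"), "w").close()
    detected = guess_d_format(d_dir)
```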

get_ScansPerFilter(filters_info, all_filters_list, filter_inverse, display_tqdm=False)[source]

Determines the number of scans that use a specific filter.

static get_agilent_scan(data, index)[source]

A faster implementation of the scan() method from multiplierz’s mzFile package for Agilent files.

Returns:
  • mz (np.ndarray) – The m/z values of the scan.

  • intensity_points (np.ndarray) – The intensity values of the scan.

get_basic_instrument_metadata(data, metadata=None)[source]

Gets some of the instrument metadata from the data file depending on the file format.

get_basic_instrument_metadata_agilent(data, metadata=None)[source]

Obtains basic instrument metadata from Agilent data.

get_basic_instrument_metadata_bruker_d_tsf_no_mob(data, metadata={})[source]

Obtains basic instrument metadata from Bruker .tsf data.

get_filter_idx(Filter, acq_types, acq_polars, mz_ranges, precursors)[source]

Gets the index of the filter that corresponds to the given filter information. This is unused in the current implementation.

load_files(*args, **kwargs)[source]

Processes the data files based on the specified data format, MS level, and whether ion mobility data are present.

static parse_bruker_scan_level(scanmode)[source]

Obtains a descriptive scan mode string from the scan mode integer of a Bruker scan.

tdf_d_ms1_mob(metadata=None, in_jupyter=None, testing=None, gui=None, pixels_per_line=None, tkinter_widgets=None)[source]

Data processing from Bruker .tdf files with only MS1 data and ion mobility data.

Parameters:
  • metadata (dict) – Metadata dictionary to store instrument information. Overwrites self.metadata if provided.

  • in_jupyter (bool) – Flag indicating if the code is running in a Jupyter notebook. Overwrites self.in_jupyter if provided.

  • testing (bool) – Flag for testing mode. Overwrites self.testing if provided.

  • gui (bool) – Flag for GUI mode. Overwrites self.gui if provided.

  • pixels_per_line (int) – Number of pixels per line for the output image. Overwrites self.pixels_per_line if provided.

  • tkinter_widgets – Tkinter widgets for GUI progress bar. Overwrites self.tkinter_widgets if provided.

Returns:
  • metadata (dict) – Updated metadata dictionary with instrument information.

  • pixels_aligned (np.ndarray) – 3D array of intensity data of shape (m/z+1, lines, pixels_per_line).

tdf_d_ms2_mob(metadata=None, normalize_img_sizes=None, in_jupyter=None, testing=None, gui=None, pixels_per_line=None, tkinter_widgets=None, **kwargs)[source]

Data processing from Bruker .tdf files that contain MS2 data and ion mobility data.

Parameters:
  • metadata (dict) – Metadata dictionary to store instrument information. Overwrites self.metadata if provided.

  • normalize_img_sizes (bool) – Flag indicating if image sizes should be normalized. Overwrites self.normalize_img_sizes if provided.

  • in_jupyter (bool) – Flag indicating if the code is running in a Jupyter notebook. Overwrites self.in_jupyter if provided.

  • testing (bool) – Flag for testing mode. Overwrites self.testing if provided.

  • gui (bool) – Flag for GUI mode. Overwrites self.gui if provided.

  • pixels_per_line (int) – Number of pixels per line for the output image. Overwrites self.pixels_per_line if provided.

  • tkinter_widgets – Tkinter widgets for GUI progress bar. Overwrites self.tkinter_widgets if provided.

Returns:
  • metadata (dict) – Updated metadata dictionary with instrument information.

  • pixels_aligned (np.ndarray) – 3D array of intensity data of shape (m/z+1, lines, pixels_per_line) or list of ion image arrays of shape (height, width).

MSIGen.mzml module

This module contains a subclass of the base MSIGen class for handling files with the .mzml file extension. It can handle files with or without ion mobility data, and with or without MS2 data. It has been tested on the following file formats converted using MSConvert:

  • Thermo .raw files that contain MS1 or MS2 data and do not contain ion mobility data.

  • Agilent .d files containing MS1 or MS2 data with or without ion mobility data.

  • Bruker .d files of .tsf format containing MS1 or MS2 data.

  • Bruker .d files of .baf format containing MS1 data.

  • Bruker .d files of .tdf format containing ion mobility data and MS1 or MS2 data.

.mzml files from other sources may or may not be processed as expected.

class MSIGen.mzml.MSIGen_mzml(*args, **kwargs)[source]

Bases: MSIGen_base

check_dim(ShowNumLineSpe=False)[source]

Gets the acquisition times and other information about each scan to decide what mass list entries can be obtained from each scan.

Returns:
  • acq_times (list) – A list of acquisition times for each line.

  • filter_list (list) – A list of information that would be included in Thermo-style filter strings for each line.

getUserParam(spectrum, param_name)[source]

Obtains the value of a parameter based on its parameter name from the spectrum object.

get_ScansPerFilter(filters_info, all_filters_list, filter_inverse, display_tqdm=False)[source]

Determines the number of scans that use a specific filter group.

get_mobility_range_from_mzml_spectrum(spectrum)[source]

Determines the lower and upper bounds of the mobility range from the spectrum object.

load_files(*args, **kwargs)[source]

Processes the data files based on the MS level and whether ion mobility data are present.

ms1_no_mob(metadata=None, in_jupyter=None, testing=None, gui=None, pixels_per_line=None, tkinter_widgets=None, **kwargs)[source]

Data processing for .mzml files with only MS1 data.

Parameters:
  • metadata (dict) – Metadata dictionary to store instrument information. Overwrites self.metadata if provided.

  • in_jupyter (bool) – Flag indicating if the code is running in a Jupyter notebook. Overwrites self.in_jupyter if provided.

  • testing (bool) – Flag for testing mode. Overwrites self.testing if provided.

  • gui (bool) – Flag for GUI mode. Overwrites self.gui if provided.

  • pixels_per_line (int) – Number of pixels per line for the output image. Overwrites self.pixels_per_line if provided.

  • tkinter_widgets – Tkinter widgets for GUI progress bar. Overwrites self.tkinter_widgets if provided.

Returns:
  • metadata (dict) – Updated metadata dictionary with instrument information.

  • pixels_aligned (np.ndarray) – 3D array of intensity data of shape (m/z+1, lines, pixels_per_line).

mzml_ms1_mob(metadata=None, in_jupyter=None, testing=None, gui=None, pixels_per_line=None, tkinter_widgets=None, **kwargs)[source]

Data processing from .mzml files with only MS1 data and ion mobility data. When using MSConvert to create this .mzml file, the option “combine ion mobility scans” must be checked for MSIGen to read the data properly.

Parameters:
  • metadata (dict) – Metadata dictionary to store instrument information. Overwrites self.metadata if provided.

  • in_jupyter (bool) – Flag indicating if the code is running in a Jupyter notebook. Overwrites self.in_jupyter if provided.

  • testing (bool) – Flag for testing mode. Overwrites self.testing if provided.

  • gui (bool) – Flag for GUI mode. Overwrites self.gui if provided.

  • pixels_per_line (int) – Number of pixels per line for the output image. Overwrites self.pixels_per_line if provided.

  • tkinter_widgets – Tkinter widgets for GUI progress bar. Overwrites self.tkinter_widgets if provided.

Returns:
  • metadata (dict) – Updated metadata dictionary with instrument information.

  • pixels_aligned (np.ndarray) – 3D array of intensity data of shape (m/z+1, lines, pixels_per_line).

mzml_ms2_mob(metadata=None, normalize_img_sizes=None, in_jupyter=None, testing=None, gui=None, pixels_per_line=None, tkinter_widgets=None, **kwargs)[source]

Data processing from .mzml files that contain MS2 data and ion mobility data.

Parameters:
  • metadata (dict) – Metadata dictionary to store instrument information. Overwrites self.metadata if provided.

  • normalize_img_sizes (bool) – Flag indicating if image sizes should be normalized. Overwrites self.normalize_img_sizes if provided.

  • in_jupyter (bool) – Flag indicating if the code is running in a Jupyter notebook. Overwrites self.in_jupyter if provided.

  • testing (bool) – Flag for testing mode. Overwrites self.testing if provided.

  • gui (bool) – Flag for GUI mode. Overwrites self.gui if provided.

  • pixels_per_line (int) – Number of pixels per line for the output image. Overwrites self.pixels_per_line if provided.

  • tkinter_widgets – Tkinter widgets for GUI progress bar. Overwrites self.tkinter_widgets if provided.

Returns:
  • metadata (dict) – Updated metadata dictionary with instrument information.

  • pixels_aligned (np.ndarray) – 3D array of intensity data of shape (m/z+1, lines, pixels_per_line) or list of ion image arrays of shape (height, width).

mzml_ms2_no_mob(metadata=None, normalize_img_sizes=None, in_jupyter=None, testing=None, gui=None, pixels_per_line=None, tkinter_widgets=None, **kwargs)[source]

Data processing for .mzml files that contain MS2 data.

Parameters:
  • metadata (dict) – Metadata dictionary to store instrument information. Overwrites self.metadata if provided.

  • normalize_img_sizes (bool) – Flag indicating if image sizes should be normalized. Overwrites self.normalize_img_sizes if provided.

  • in_jupyter (bool) – Flag indicating if the code is running in a Jupyter notebook. Overwrites self.in_jupyter if provided.

  • testing (bool) – Flag for testing mode. Overwrites self.testing if provided.

  • gui (bool) – Flag for GUI mode. Overwrites self.gui if provided.

  • pixels_per_line (int) – Number of pixels per line for the output image. Overwrites self.pixels_per_line if provided.

  • tkinter_widgets – Tkinter widgets for GUI progress bar. Overwrites self.tkinter_widgets if provided.

Returns:
  • metadata (dict) – Updated metadata dictionary with instrument information.

  • pixels_aligned (np.ndarray) – 3D array of intensity data of shape (m/z+1, lines, pixels_per_line) or list of ion image arrays of shape (height, width).

MSIGen.raw module

This module contains a subclass of the base MSIGen class for handling Thermo files with the .raw file extension. This works for MS1 or MS2 data files without ion mobility.

class MSIGen.raw.MSIGen_raw(*args, **kwargs)[source]

Bases: MSIGen_base

check_dim(ShowNumLineSpe=False)[source]

Gets the acquisition times and other information about each scan to decide what mass list entries can be obtained from each scan.

Returns:
  • acq_times (list) – A list of acquisition times for each line.

  • filter_list (list) – A list of information from the filter strings for each spectrum in each line.

get_ScansPerFilter(filters_info, all_filters_list, display_tqdm=False)[source]

Determines the number of scans that use a specific filter group.

get_basic_instrument_metadata(data, metadata={})[source]

Gets some of the instrument metadata from the data file.

get_filters_info(filter_list)[source]

Gets information about all filters present in the experiment.

Returns:
  • filters_info (list) – A list of filter information, including filter names, polarities, MS levels, precursors, and mass ranges.

  • polar_loc (int) – The index of the polarity in the filter string.

  • types_loc (list) – A list of indices for the acquisition types in the filter string.

  • filter_inverse (np.ndarray) – An array of indices for the filters.

get_scan_without_zeros(data, scannum, centroid=False)[source]

A faster implementation of the multiplierz scan method for .raw files.

Parameters:
  • data – The mzFile object containing the raw data.

  • scannum – The scan number to retrieve.

  • centroid – Boolean indicating whether to use centroid data (True) or profile data (False). Default is False.

Returns:
  • mz (np.ndarray) – The m/z values of the scan.

  • intensity_points (np.ndarray) – The intensity values of the scan.

load_files(*args, **kwargs)[source]

Processes the data files based on the MS level and whether ion mobility data are present.

make_filter_string_from_filter_dict(filter_dict)[source]

Constructs a Thermo filter string from a dictionary of filter components.

Parameters:
  • filter_dict (dict) – Dictionary containing filter components such as analyzer, polarity, dataType, source, scanType, msMode, precursorMz, activationType, activationEnergy, scanRangeStart, and scanRangeEnd.

Returns:

A formatted filter string.

Return type:

str

ms1_no_mob(metadata={}, in_jupyter=None, testing=None, gui=None, pixels_per_line=None, tkinter_widgets=None, **kwargs)[source]

Data processing for Thermo .raw files with only MS1 data.

Parameters:
  • metadata (dict) – Metadata dictionary to store instrument information. Overwrites self.metadata if provided.

  • in_jupyter (bool) – Flag indicating if the code is running in a Jupyter notebook. Overwrites self.in_jupyter if provided.

  • testing (bool) – Flag for testing mode. Overwrites self.testing if provided.

  • gui (bool) – Flag for GUI mode. Overwrites self.gui if provided.

  • pixels_per_line (int) – Number of pixels per line for the output image. Overwrites self.pixels_per_line if provided.

  • tkinter_widgets – Tkinter widgets for GUI progress bar. Overwrites self.tkinter_widgets if provided.

Returns:
  • metadata (dict) – Updated metadata dictionary with instrument information.

  • pixels_aligned (np.ndarray) – 3D array of intensity data of shape (m/z+1, lines, pixels_per_line).

ms2_no_mob(metadata={}, normalize_img_sizes=None, in_jupyter=None, testing=None, gui=None, pixels_per_line=None, tkinter_widgets=None, **kwargs)[source]

Data processing for Thermo .raw files that contain MS2 data.

Parameters:
  • metadata (dict) – Metadata dictionary to store instrument information. Overwrites self.metadata if provided.

  • normalize_img_sizes (bool) – Flag indicating if image sizes should be normalized. Overwrites self.normalize_img_sizes if provided.

  • in_jupyter (bool) – Flag indicating if the code is running in a Jupyter notebook. Overwrites self.in_jupyter if provided.

  • testing (bool) – Flag for testing mode. Overwrites self.testing if provided.

  • gui (bool) – Flag for GUI mode. Overwrites self.gui if provided.

  • pixels_per_line (int) – Number of pixels per line for the output image. Overwrites self.pixels_per_line if provided.

  • tkinter_widgets – Tkinter widgets for GUI progress bar. Overwrites self.tkinter_widgets if provided.

Returns:
  • metadata (dict) – Updated metadata dictionary with instrument information.

  • pixels_aligned (np.ndarray) – 3D array of intensity data of shape (m/z+1, lines, pixels_per_line) or list of ion image arrays of shape (height, width).

parse_filter_string(string)[source]

Parses a Thermo filter string into a dictionary of its components.

Parameters:
  • string (str) – The filter string to parse.

Returns:

A dictionary containing the parsed components of the filter string.

Return type:

dict
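To make the idea of filter string components concrete, here is a minimal sketch of a parser for two common Thermo filter string layouts. It is not MSIGen's parser: the regular expression, the function name parse_filter_sketch, and the choice to reuse the key names listed under make_filter_string_from_filter_dict are all assumptions, and real filter strings can contain additional tokens that this pattern does not handle.

```python
import re

# Simplified pattern (an assumption) for strings such as
#   "FTMS + p NSI Full ms [100.0000-1000.0000]"
#   "FTMS + p NSI Full ms2 500.3000@hcd30.00 [100.0000-600.0000]"
FILTER_RE = re.compile(
    r"^(?P<analyzer>\S+)\s+(?P<polarity>[+-])\s+(?P<dataType>[pc])\s+"
    r"(?P<source>\S+)\s+(?P<scanType>\S+)\s+(?P<msMode>ms\d*)"
    r"(?:\s+(?P<precursorMz>\d+(?:\.\d+)?)@(?P<activationType>[a-z]+)"
    r"(?P<activationEnergy>\d+(?:\.\d+)?))?"
    r"\s+\[(?P<scanRangeStart>\d+(?:\.\d+)?)-(?P<scanRangeEnd>\d+(?:\.\d+)?)\]$"
)

NUMERIC_KEYS = ("precursorMz", "activationEnergy", "scanRangeStart", "scanRangeEnd")


def parse_filter_sketch(filter_string):
    """Split a simple Thermo-style filter string into named components."""
    match = FILTER_RE.match(filter_string.strip())
    if match is None:
        raise ValueError(f"Unsupported filter string: {filter_string!r}")
    parsed = match.groupdict()
    # Convert numeric fields; MS1 strings leave the precursor fields as None.
    for key in NUMERIC_KEYS:
        if parsed[key] is not None:
            parsed[key] = float(parsed[key])
    return parsed


ms1 = parse_filter_sketch("FTMS + p NSI Full ms [100.0000-1000.0000]")
ms2 = parse_filter_sketch("FTMS + p NSI Full ms2 500.3000@hcd30.00 [100.0000-600.0000]")
```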

reorder_pixels(pixels, filters_grp_info, mz_idxs_per_filter, mass_list_idxs, filters_info=None)[source]

Reorders the pixels to match the order of the mass list.

MSIGen.GUI module

This module provides a GUI for the MSI Generator (MSIGen) software.

class MSIGen.GUI.FileExplorerWindow(callback)[source]

Bases: Tk

File explorer that allows .d data to be treated as files rather than folders.

add_listboxes()[source]
Makes a box on the left side of the window that contains commonly used directories, such as drive letters, Downloads, and Desktop, for easier navigation.

close_raw_file_selection_window()[source]
fill_listbox_driveletters()[source]
get_current_directory_contents()[source]

Gets the current files and folders in the selected directory for display.

get_selected_drive_values(event)[source]
get_selected_values(event=None)[source]
move_to_parent_dir(event=None)[source]
on_double_click(event)[source]

Opens folder or selects files.

on_double_click_drives(event)[source]

Opens a drive when it is double-clicked.

on_dropdown_change(*args)[source]

Allows the user to hide unselectable files.

on_return(event=None)[source]

Allows for navigation with Return instead of the mouse.

on_textbox_resize(event)[source]

Adjusts the width of the Text widget dynamically with the Listbox so that the box displaying the currently selected directory resizes properly.

on_textbox_return(event)[source]

Goes to the directory typed into the textbox, or selects it if it is a file.

class MSIGen.GUI.MasterWindow[source]

Bases: Tk

The main window of MSIGen. Files and parameters are input here before running the data extraction workflow.

delete_selected_rawfiles()[source]
destroy_all_windows()[source]
display_mass_list()[source]
fill_param_box(event=None)[source]
generate_images()[source]

Exports images based on the active tab and the input parameters.

get_input_vars()[source]
get_scale_threshold_values(dropdown_menu_var, scale_stringvar, threshold_stringvar)[source]

Gets the appropriate threshold or percentile to scale the image intensity to for later use.

initialize_param_box()[source]

Sets up the box containing tolerances and image dimension inputs.

monitor_progressbar()[source]
open_file_explorer()[source]
open_image_maker()[source]

Opens the window that contains all parameters needed to export images. It includes 3 tabs:

1: For creating ion images
2: For creating fractional images
3: For creating ratio images

Images can be saved as figures (containing a title and colorbar), images, or arrays and can be saved using a selection of colormaps. The brightness of the image can be scaled by a percentile or an absolute threshold. The mass list can be viewed to obtain mass list entry indices.

open_images_were_saved_dialog()[source]

A window that contains a hyperlink to the folder the images were exported to.

open_progessbar_window()[source]
receive_raw_files(raw_files)[source]
reselect_raw_files()[source]

Goes back to file selection screen. All progress will be lost.

run_workflow()[source]
scale_or_threshold_display(selection, scale_label, scale_entry, threshold_label, threshold_entry, row)[source]

Toggles the display between percentile and threshold depending on the currently selected dropdown value.

select_mass_file()[source]
select_output_file_path()[source]

Opens a dialog box to select the directory to save files to.

show_or_hide_std_idx_entry(*args)[source]

Hides the std_idx entry box when intl_std normalization is not selected.

toggle_checkbox(checkbutton, event=None)[source]
class MSIGen.GUI.MyButton(master=None, **kwargs)[source]

Bases: Button

Button that can be selected with Tab and pressed with Return.

MSIGen.GUI.get_download_path()[source]

Returns the default downloads path on Linux or Windows.

MSIGen.GUI.get_final_mass_list_gui(metadata)[source]

Gets the mass list in displayable form for the GUI.

MSIGen.GUI.run_GUI()[source]

Runs the MSIGen GUI.

MSIGen.GUI.verify_rawfile_names_gui(rawfile_paths)[source]

Ensures that all file names selected in the GUI:

1: Have the same path
2: Have the same file extension
3: Have the same file name, apart from a final number
4: Contain a unique number at the end of the file name

Parameters:
  • rawfile_paths (list of str) – The file paths to check.

Returns:
  • rawfile_paths – A single file path as a string if only one path is given. Otherwise, this is the same as the input.

  • filenames_checked (bool)
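The four naming checks can be sketched with the standard library as follows. This is only an illustration of the rules described above: the function name check_line_file_names, the regular expression, and the decision to compare trailing numbers as integers are assumptions, not MSIGen's code.

```python
import os
import re


def check_line_file_names(paths):
    """Return True if the paths look like one numbered series of line scans.

    Hypothetical helper mirroring the four checks: same directory, same
    extension, same name apart from a final number, and unique numbers.
    """
    dirs, exts, stems, numbers = set(), set(), set(), []
    for path in paths:
        directory, filename = os.path.split(path)
        base, ext = os.path.splitext(filename)
        match = re.match(r"^(.*?)(\d+)$", base)
        if match is None:  # no number at the end of the file name
            return False
        dirs.add(directory)
        exts.add(ext.lower())
        stems.add(match.group(1))
        numbers.append(int(match.group(2)))
    return (
        len(dirs) == 1
        and len(exts) == 1
        and len(stems) == 1
        and len(set(numbers)) == len(numbers)
    )


consistent = check_line_file_names(["data/scan_1.raw", "data/scan_2.raw"])
mixed_names = check_line_file_names(["data/scan_1.raw", "data/other_2.raw"])
duplicate_numbers = check_line_file_names(["data/scan_1.raw", "data/scan_01.raw"])
```

Comparing the trailing numbers as integers treats scan_1 and scan_01 as the same line, which is a design choice of this sketch.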

MSIGen.tsf module

Python bindings for reading Bruker .tsf files.

MSIGen.visualization module

Functions used for visualizing images from data processed by MSIGen. This includes functions for saving and displaying normalized or raw ion images, fractional abundance images, and ratio images.

MSIGen.visualization.base_peak_normalize_pixels(pixels)[source]

Normalizes each image to the highest pixel intensity in that image.
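A minimal sketch of base peak normalization, assuming that pixels is an iterable of 2D images and that an all-zero image should stay all zeros. The function name base_peak_normalize_sketch is hypothetical and the handling of empty images is an assumption.

```python
import numpy as np


def base_peak_normalize_sketch(pixels):
    """Divide each image by its own maximum intensity."""
    normalized = []
    for img in pixels:
        img = np.asarray(img, dtype=float)
        peak = img.max()
        # Assumption: leave images without signal as zeros instead of dividing by 0.
        normalized.append(img / peak if peak > 0 else np.zeros_like(img))
    return normalized


images = [np.array([[1.0, 2.0], [4.0, 0.0]]), np.zeros((2, 2))]
normed = base_peak_normalize_sketch(images)
```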

MSIGen.visualization.despike_images(pixels, threshold=1.5, num_pixels_on_each_side=2, axis='x')[source]

Despikes the images by comparing the pixel value to the mean of the surrounding pixels. If the pixel value is greater than the mean of the surrounding pixels by a certain threshold, it is replaced with the mean. Despiking is done only along the x-axis by default.
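The description above can be read in more than one way, so the following sketch fixes one interpretation: a pixel counts as a spike when it exceeds threshold times the mean of its neighbors along the x-axis, with the pixel itself excluded from the mean. That interpretation, the edge handling, and the function name despike_rows_sketch are assumptions and may differ from MSIGen's behavior.

```python
import numpy as np


def despike_rows_sketch(img, threshold=1.5, num_pixels_on_each_side=2):
    """Replace spikes along each row (x-axis) with the local neighbor mean."""
    img = np.asarray(img, dtype=float)
    out = img.copy()
    n = num_pixels_on_each_side
    n_cols = img.shape[1]
    for col in range(n_cols):
        lo, hi = max(0, col - n), min(n_cols, col + n + 1)
        # Neighbors within the window, excluding the pixel under test.
        neighbours = np.delete(img[:, lo:hi], col - lo, axis=1)
        if neighbours.shape[1] == 0:
            continue
        local_mean = neighbours.mean(axis=1)
        # Assumption: "greater by a threshold" means a multiplicative factor.
        spikes = img[:, col] > threshold * local_mean
        out[spikes, col] = local_mean[spikes]
    return out


row = np.array([[1.0, 1.0, 10.0, 1.0, 1.0]])
despiked = despike_rows_sketch(row)
```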

MSIGen.visualization.determine_titles(mass_list, idxs=None, fract_abund=False, ratio_img=False)[source]

Function for determining the default titles for the images.

Parameters:
  • mass_list (list) – The mass list for the images.

  • idxs (list) – The indices of the images to generate titles for. If None, titles will be generated for all images.

  • fract_abund (bool) – If True, the titles will be for fractional abundance images.

  • ratio_img (bool) – If True, the titles will be for ratio images.

MSIGen.visualization.display_fractional_images(fract_imgs, metadata=None, titles=None, aspect=None, save_imgs=False, MSI_data_output=None, cmap='viridis', title_fontsize=10, idxs=[1, 2], image_savetype='figure', scale=1.0, threshold=None, axis_tick_marks=False, interpolation='none', h=6, w=6, transparent_background=True, colorbar_height='match_image', pad_inches=0.0)[source]

Displays the fractional abundance images in the fract_imgs array.

MSIGen.visualization.display_images(pixels_normed, metadata=None, aspect=None, scale=0.999, how_many_images_to_display='all', save_imgs=False, MSI_data_output=None, cmap='viridis', titles=None, threshold=None, title_fontsize=10, image_savetype='figure', axis_tick_marks=False, interpolation='none', h=6, w=6, transparent_background=True, colorbar_height='match_image', pad_inches=0.0)[source]

Displays the images in the pixels array. Normalization must be performed prior to calling this.

MSIGen.visualization.display_ratio_images(ratio_imgs, metadata=None, titles=None, aspect=None, scale=0.999, save_imgs=False, MSI_data_output=None, cmap='viridis', log_scale=False, threshold=None, title_fontsize=10, idxs=[1, 2], image_savetype='figure', axis_tick_marks=False, interpolation='none', h=6, w=6, transparent_background=True, colorbar_height='match_image', pad_inches=0.0)[source]

Displays the ratio images in the ratio_imgs array.

MSIGen.visualization.fractional_abundance_images(pixels, metadata=None, idxs=[1, 2], normalize=None, titles=None, aspect=None, save_imgs=False, MSI_data_output=None, cmap='viridis', title_fontsize=10, image_savetype='figure', scale=1.0, threshold=None, axis_tick_marks=False, interpolation='none', h=6, w=6, transparent_background=True, colorbar_height='match_image', pad_inches=0.0)[source]

Generates fractional abundance images from the given pixels, metadata, and indices. The images are divided by the sum of the images to get the fractional abundance.

Parameters:
  • pixels (list or array) – The images to be displayed.

  • metadata (dict) – The metadata for the images. This should include the mass list and the image dimensions.

  • idxs (list) – The indices of the images to be used.

  • normalize (str) – The normalization method. Options are ‘None’ or ‘base_peak’. ‘base_peak’ will normalize the images to the base peak intensity before division.

  • titles (list) – The titles for the images. If None, the titles will be determined based on the mass list.

  • aspect (float) – The aspect ratio of each pixel for display. If None, the aspect ratio will be calculated based on the image dimensions.

  • save_imgs (bool) – If True, the images will be saved to the MSI_data_output directory.

  • MSI_data_output (str) – The directory to save the images to. If None, the images will be saved to the current working directory.

  • cmap (str) – The colormap to use for the images. Default is ‘viridis’.

  • title_fontsize (int) – The font size of the titles. Default is 10.

  • image_savetype (str) – The type of image to save. Options are ‘figure’, ‘image’, or ‘array’. ‘figure’ will save the image as a figure with a colorbar and title. ‘image’ will save the image as an image without a colorbar or title. ‘array’ will save the image as an array in csv format.

  • scale (float) – The quantile at which intensity values are capped. Default is 1.0. Any pixel with an intensity greater than the intensity at this quantile will be decreased to that intensity. This is done to prevent saturation of the color scale. Ignored if threshold is given.

  • threshold (float) – The threshold for the images. Any pixel with an intensity greater than the threshold will be decreased to the threshold. If None, the threshold will be determined based on the scale.

  • axis_tick_marks (bool) – If True, the axis tick marks will be shown. Default is False.

  • interpolation (str) – The interpolation method to use for displaying the images. Default is ‘none’. Using ‘nearest’ or ‘none’ will make the images look pixelated, while ‘bilinear’ will make them look smoother/blurrier. See https://matplotlib.org/stable/gallery/images_contours_and_fields/interpolation_methods.html for more options.

  • h (int) – The height of the displayed figures in inches. Default is 6.

  • w (int) – The width of the displayed figures in inches. Default is 6.

  • transparent_background (bool) – If True, the background of the saved image will be transparent. Only used if image_savetype is ‘figure’.

  • colorbar_height (str) –

    The height of the colorbar. Default is ‘match_image’. Only used if image_savetype is ‘figure’. More options may be added in the future.

    If ‘match_image’, the colorbar will be the same height as the image. Otherwise the colorbar will be the default height determined by matplotlib.

  • pad_inches (float, None) – The amount of padding around the saved image in inches. Only used if image_savetype is ‘figure’. None will use the default padding determined by matplotlib. Default is 0.0, which means no padding.

MSIGen.visualization.get_and_display_images(pixels, metadata=None, normalize=None, std_idx=None, std_precursor=None, std_mass=None, std_fragment=None, std_mobility=None, std_charge=None, aspect=None, scale=0.999, how_many_images_to_display='all', save_imgs=False, MSI_data_output=None, cmap='viridis', titles=None, threshold=None, title_fontsize=10, image_savetype='figure', axis_tick_marks=False, interpolation='none', h=6, w=6, handle_infinity='zero', transparent_background=True, colorbar_height='match_image', pad_inches=0.0)[source]

Displays the images in the pixels array. The images are normalized to the standard image or to the TIC image.

Parameters:
  • pixels (list or array) – The images to be displayed.

  • metadata (dict) – The metadata for the images. This should include the mass list and the image dimensions.

  • normalize (str) – The normalization method. Options are ‘None’, ‘TIC’, or ‘intl_std’.

  • std_idx (int) – The index of the standard image. Ignored unless normalize is ‘intl_std’. If None, the std_idx will be determined based on std_precursor, std_mass, std_fragment, std_mobility, and std_charge. 0 indicates the TIC image.

  • std_precursor (float) – The precursor mass of the standard. Ignored if std_idx is given.

  • std_mass (float) – The mass of the standard. Ignored if std_idx is given.

  • std_fragment (float) – The fragment mass of the standard. Ignored if std_idx is given.

  • std_mobility (float) – The mobility of the standard. Ignored if std_idx is given.

  • std_charge (int) – The charge of the standard. Ignored if std_idx is given.

  • aspect (float) – The aspect ratio of each pixel for display. If None, the aspect ratio will be calculated based on the image dimensions.

  • scale (float) – The quantile at which intensity values are capped. Default is 0.999. Any pixel with an intensity greater than the intensity at this quantile will be decreased to that intensity. This is done to prevent saturation of the color scale. Ignored if threshold is given.

  • how_many_images_to_display (int, list, or str) – The number of images to display if this is an int. If this is a list, the images at the indices in the list will be displayed. If this is a string, it must be ‘all’, and all images will be displayed.

  • save_imgs (bool) – If True, the images will be saved to the MSI_data_output directory.

  • MSI_data_output (str) – The directory to save the images to. If None, the images will be saved to the current working directory.

  • cmap (str) – The colormap to use for the images. Default is ‘viridis’.

  • titles (list) – The titles for the images. If None, the titles will be determined based on the mass list.

  • threshold (float) – The threshold for the images. Any pixel with an intensity greater than the threshold will be decreased to the threshold. If None, the threshold will be determined based on the scale.

  • title_fontsize (int) – The font size of the titles. Default is 10.

  • image_savetype (str) – The type of image to save. Options are ‘figure’, ‘image’, or ‘array’. ‘figure’ will save the image as a figure with a colorbar and title. ‘image’ will save the image as an image without a colorbar or title. ‘array’ will save the image as an array in csv format.

  • axis_tick_marks (bool) – If True, the axis tick marks will be shown. Default is False.

  • interpolation (str) – The interpolation method to use for displaying the images. Default is ‘none’. Using ‘nearest’ or ‘none’ will make the images look pixelated, while ‘bilinear’ will make them look smoother/blurrier. See https://matplotlib.org/stable/gallery/images_contours_and_fields/interpolation_methods.html for more options.

  • h (int) – The height of the displayed figures in inches. Default is 6.

  • w (int) – The width of the displayed figures in inches. Default is 6.

  • handle_infinity (str) – The method for handling infinity values that arise from normalization when the standard image has a pixel with intensity 0. Options are ‘zero’, ‘maximum’, or ‘infinity’. Default is ‘zero’. ‘zero’ will set the normalized pixel value to 0. ‘maximum’ will set the normalized pixel value to the maximum value in the image. ‘infinity’ will set the normalized pixel value to infinity, which will cause it to be colored as the maximum value in the colormap.

  • transparent_background (bool) – If True, the background of the saved image will be transparent. Only used if image_savetype is ‘figure’.

  • colorbar_height (str) –

    The height of the colorbar. Default is ‘match_image’. Only used if image_savetype is ‘figure’. More options may be added in the future.

    If ‘match_image’, the colorbar will be the same height as the image. Otherwise the colorbar will be the default height determined by matplotlib.

  • pad_inches (float, None) – The amount of padding around the saved image in inches. Only used if image_savetype is ‘figure’. None will use the default padding determined by matplotlib. Default is 0.0, which means no padding.

MSIGen.visualization.get_fractional_abundance_imgs(pixels, metadata=None, idxs=[1, 2], normalize=None)[source]

Normalizes pixels before getting fractional abundance. Each image is divided by the sum of the selected images to obtain its fractional abundance. If the images vary in size, all images are resized to match the image corresponding to the first index given.

Returns:

The fractional abundance images.

Return type:

fract_imgs (list)
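The division step described above can be sketched in NumPy. This is a minimal illustration, not MSIGen's implementation: the helper name fractional_abundance is hypothetical, the images are assumed to already share one shape (no resizing is shown), and setting pixels whose sum is zero to 0 is an assumption made here to avoid division by zero.

```python
import numpy as np


def fractional_abundance(images):
    """Divide each image by the pixel-wise sum of all images (hypothetical helper)."""
    stack = np.asarray(images, dtype=float)   # shape: (n_images, height, width)
    total = stack.sum(axis=0)                 # pixel-wise sum over the images
    with np.errstate(divide="ignore", invalid="ignore"):
        fract = stack / total
    # Assumption: pixels where every image is 0 get a fractional abundance of 0.
    return np.where(total > 0, fract, 0.0)


imgs = [np.array([[1.0, 0.0], [3.0, 0.0]]),
        np.array([[3.0, 0.0], [1.0, 4.0]])]
fract_imgs = fractional_abundance(imgs)
```

Wherever the summed intensity is non-zero, the resulting images sum to 1 at each pixel.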

MSIGen.visualization.get_normalize_value(normalize, possible_entries=['None', 'TIC', 'intl_std', 'base_peak'])[source]

Parses the value of the normalize variable. Allows for error handling and for some leeway in mistyping the keywords.

Parameters:
  • normalize (str or None) – The normalization method. Options are ‘None’, ‘TIC’, ‘intl_std’, or ‘base_peak’.

  • possible_entries (list) – The allowed entries for the normalize variable. Only used to restrict the options for fractional abundance or ratio images.
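One way to tolerate mistyped keywords is fuzzy string matching. The sketch below uses difflib from the Python standard library. It is an assumption about how such leeway could be implemented, not a description of what get_normalize_value actually does, and the name parse_normalize and the 0.6 cutoff are hypothetical.

```python
import difflib


def parse_normalize(normalize,
                    possible_entries=("None", "TIC", "intl_std", "base_peak")):
    """Map user input to one of the allowed keywords (hypothetical helper)."""
    if normalize is None:
        return "None"
    # Compare case-insensitively, but return the canonical spelling.
    lookup = {entry.lower(): entry for entry in possible_entries}
    key = str(normalize).strip().lower()
    if key in lookup:
        return lookup[key]
    # Assumption: accept the closest keyword if it is reasonably similar.
    close = difflib.get_close_matches(key, list(lookup), n=1, cutoff=0.6)
    if close:
        return lookup[close[0]]
    raise ValueError(
        f"normalize={normalize!r} is not one of {list(possible_entries)}"
    )
```

Passing a shorter possible_entries sequence restricts the accepted keywords, mirroring how the parameter above is described.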

MSIGen.visualization.get_pixels_to_display(pixels, metadata=None, normalize=None, std_idx=None, std_precursor=None, std_mass=None, std_fragment=None, std_mobility=None, std_charge=None, handle_infinity='zero')[source]

Normalizes pixels to TIC or to an internal standard. If the images are of varying size, the standard image is reshaped to the size of the image to be normalized.

MSIGen.visualization.get_ratio_imgs(pixels, metadata=None, idxs=[1, 2], normalize=None, handle_infinity='maximum', titles=None)[source]

MSIGen.visualization.match_to_mass_list(mass_list, idx=None, precursor=None, mass=None, fragment=None, mobility=None, charge=None)[source]

Matches the given mass, mobility, or charge values to the mass list. Returns the index of the match in the mass list. Fails if there is more than one match or if there is no match. If the index is given, the function will return that index. If the index is not given, the function will search for an entry in the mass list that uniquely matches the given mass, mobility, and charge values.
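The unique-match rule can be illustrated as follows. This is a sketch under stated assumptions: the mass list is represented here as a list of dicts, which is not necessarily MSIGen's internal layout, and the name match_unique and the 1e-6 tolerance are hypothetical.

```python
import math


def match_unique(mass_list, idx=None, precursor=None, mass=None,
                 fragment=None, mobility=None, charge=None):
    """Return the index of the single entry matching all given values."""
    if idx is not None:
        return idx  # an explicit index takes priority
    # Only compare the fields the caller actually supplied.
    wanted = {
        key: value
        for key, value in {
            "precursor": precursor, "mass": mass, "fragment": fragment,
            "mobility": mobility, "charge": charge,
        }.items()
        if value is not None
    }
    matches = [
        i for i, entry in enumerate(mass_list)
        if all(
            entry.get(key) is not None
            and math.isclose(entry[key], value, abs_tol=1e-6)
            for key, value in wanted.items()
        )
    ]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one match, found {len(matches)}")
    return matches[0]


# Hypothetical mass list layout used only for this example.
mass_list = [
    {"mass": 760.5851, "charge": 1},
    {"mass": 782.5670, "charge": 1},
]
std_idx = match_unique(mass_list, mass=782.5670)
```

Searching by charge=1 alone would raise an error in this example, because both entries match.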

MSIGen.visualization.normalize_pixels(pixels, std_idx, handle_infinity='zero')[source]

Normalizes the pixels to the standard image. If the images are not all the same size, the standard image will be resized to match the size of the other images.

Parameters:
  • pixels (list or array) – The images to be normalized.

  • std_idx (int) – The index of the standard image. 0 indicates the TIC image.

Returns:

The normalized images.

Return type:

pixels_normed (list or array)
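The effect of the three handle_infinity options can be sketched in NumPy. This is an illustration rather than MSIGen's code: the name normalize_to_standard is hypothetical, all images are assumed to share one shape (resizing is not shown), and treating 0/0 as 0 is an assumption.

```python
import numpy as np


def normalize_to_standard(pixels, std_idx, handle_infinity="zero"):
    """Divide every image by the standard image (hypothetical helper)."""
    pixels = np.asarray(pixels, dtype=float)  # shape: (n_images, height, width)
    with np.errstate(divide="ignore", invalid="ignore"):
        normed = pixels / pixels[std_idx]
    normed[np.isnan(normed)] = 0.0            # assumption: 0/0 is treated as 0
    inf_mask = np.isinf(normed)
    if handle_infinity == "zero":
        normed[inf_mask] = 0.0
    elif handle_infinity == "maximum":
        # Replace infinities with the largest finite value of the same image.
        for img, mask in zip(normed, inf_mask):
            if mask.any():
                finite = img[~mask]
                img[mask] = finite.max() if finite.size else 0.0
    elif handle_infinity != "infinity":
        raise ValueError("handle_infinity must be 'zero', 'maximum', or 'infinity'")
    return normed
```

With handle_infinity='infinity' the infinite values are left in place, which matches the documented behavior of coloring those pixels as the maximum of the colormap.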

MSIGen.visualization.plot_image(img, img_output_folder, title, default_title, title_fontsize, cmap, aspect, save_imgs, thre, log_scale=False, image_savetype='figure', axis_tick_marks=False, interpolation='none', h=6, w=6, transparent_background=True, colorbar_height='match_image', pad_inches=0.0)[source]

The function that handles plotting the images for each display function.

Parameters:
  • img (array) – The image to be displayed.

  • img_output_folder (str) – The directory to save the images to.

  • title (str) – The title for the image.

  • default_title (str) – The default title for the image, used if the given title causes an error when saving.

  • title_fontsize (int) – The font size of the title.

  • cmap (str) – The colormap to use for the image.

  • aspect (float) – The aspect ratio of each pixel for display.

  • save_imgs (bool) – If True, the image will be saved to the img_output_folder directory.

  • thre (float) – The threshold for the image. Any pixel with an intensity greater than the threshold will be decreased to the threshold.

  • log_scale (bool) – If True, the image will be displayed on a log scale. If False, the image will be displayed on a linear scale.

  • image_savetype (str) – The type of image to save. Options are ‘figure’, ‘image’, or ‘array’. ‘figure’ will save the image as a figure with a colorbar and title. ‘image’ will save the image as an image without a colorbar or title. ‘array’ will save the image as an array in csv format.

  • axis_tick_marks (bool) – If True, the axis tick marks will be shown. Default is False.

  • interpolation (str) – The interpolation method to use for displaying the image. Default is ‘none’. Using ‘nearest’ or ‘none’ will make the image look pixelated, while ‘bilinear’ will make it look smoother/blurrier. See https://matplotlib.org/stable/gallery/images_contours_and_fields/interpolation_methods.html for more options.

  • h (int) – The height of the figure in inches. Only used if image_savetype is ‘figure’.

  • w (int) – The width of the figure in inches. Only used if image_savetype is ‘figure’.

  • transparent_background (bool) – If True, the background of the saved image will be transparent. Only used if image_savetype is ‘figure’.

  • colorbar_height (str) –

    The height of the colorbar. Default is ‘match_image’. Only used if image_savetype is ‘figure’. More options may be added in the future.

    If ‘match_image’, the colorbar will be the same height as the image. Otherwise the colorbar will be the default height determined by matplotlib.

  • pad_inches (float, None) – The amount of padding around the saved image in inches. Only used if image_savetype is ‘figure’. None will use the default padding determined by matplotlib. Default is 0.0, which means no padding.
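The three image_savetype options correspond to three different ways of writing an image to disk. The sketch below shows one plausible mapping onto Matplotlib and NumPy calls. It is not the body of plot_image: the helper name save_image, the file extensions, and the use of bbox_inches='tight' are assumptions.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def save_image(img, path_stem, image_savetype="figure", cmap="viridis",
               title="", transparent_background=True, pad_inches=0.0):
    """Save img as a figure, a bare image, or a csv array (hypothetical helper)."""
    if image_savetype == "figure":
        # Full figure with title and colorbar.
        fig, ax = plt.subplots(figsize=(6, 6))
        im = ax.imshow(img, cmap=cmap, interpolation="none")
        ax.set_title(title, fontsize=10)
        ax.set_xticks([])
        ax.set_yticks([])
        fig.colorbar(im, ax=ax)
        out = path_stem + ".png"
        fig.savefig(out, transparent=transparent_background,
                    bbox_inches="tight", pad_inches=pad_inches)
        plt.close(fig)
    elif image_savetype == "image":
        # Pixels only, with the colormap applied; no title or colorbar.
        out = path_stem + ".png"
        plt.imsave(out, img, cmap=cmap)
    elif image_savetype == "array":
        # Raw intensities as comma-separated text.
        out = path_stem + ".csv"
        np.savetxt(out, img, delimiter=",")
    else:
        raise ValueError("image_savetype must be 'figure', 'image', or 'array'")
    return out


demo_dir = tempfile.mkdtemp()
demo_img = np.arange(12, dtype=float).reshape(3, 4)
csv_path = save_image(demo_img, os.path.join(demo_dir, "demo"), image_savetype="array")
```

Only the ‘array’ option preserves the numeric intensities. The other two store rendered colors.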

MSIGen.visualization.ratio_images(pixels, metadata=None, idxs=[1, 2], normalize=None, handle_infinity='maximum', titles=None, aspect=None, scale=0.999, save_imgs=False, MSI_data_output=None, cmap='viridis', log_scale=False, threshold=None, title_fontsize=10, image_savetype='figure', axis_tick_marks=False, interpolation='none', h=6, w=6, transparent_background=True, colorbar_height='match_image', pad_inches=0.0)[source]

Generates ratio images from the given pixels, metadata, and pair of indices. Each image is divided by the other to get the ratio images.

Parameters:
  • pixels (list or array) – The images to be displayed.

  • metadata (dict) – The metadata for the images. This should include the mass list and the image dimensions.

  • idxs (list) – The indices of the images to be used. Must contain exactly two indices.

  • normalize (str) – The normalization method. Options are ‘None’ or ‘base_peak’. ‘base_peak’ will normalize the images to the base peak intensity before division.

  • handle_infinity (str) – The method to handle infinity values. Options are ‘maximum’, ‘infinity’, or ‘zero’. ‘maximum’ will set the infinity values to the maximum value in the image. ‘infinity’ will set the infinity values to infinity. ‘zero’ will set the infinity values to zero.

  • titles (list) – The titles for the images. If None, the titles will be determined based on the mass list.

  • aspect (float) – The aspect ratio of each pixel for display. If None, the aspect ratio will be calculated based on the image dimensions.

  • scale (float) – The quantile used to cap intensity values. Default is 0.999. Any pixel with an intensity greater than the intensity at this quantile will be decreased to that intensity. This is done to prevent saturation of the color scale. Ignored if threshold is given.

  • save_imgs (bool) – If True, the images will be saved to the MSI_data_output directory.

  • MSI_data_output (str) – The directory to save the images to. If None, the images will be saved to the current working directory.

  • cmap (str) – The colormap to use for the images. Default is ‘viridis’.

  • log_scale (bool) – If True, the images will be displayed on a log scale. If False, the images will be displayed on a linear scale.

  • threshold (float) – The threshold for the images. Any pixel with an intensity greater than the threshold will be decreased to the threshold. If None, the threshold will be determined based on the scale.

  • title_fontsize (int) – The font size of the titles. Default is 10.

  • image_savetype (str) – The type of image to save. Options are ‘figure’, ‘image’, or ‘array’. ‘figure’ will save the image as a figure with a colorbar and title. ‘image’ will save the image as an image without a colorbar or title. ‘array’ will save the image as an array in csv format.

  • axis_tick_marks (bool) – If True, the axis tick marks will be shown. Default is False.

  • interpolation (str) – The interpolation method to use for displaying the images. Default is ‘none’. Using ‘nearest’ or ‘none’ will make the images look pixelated, while ‘bilinear’ will make them look smoother/blurrier. See https://matplotlib.org/stable/gallery/images_contours_and_fields/interpolation_methods.html for more options.

  • h (int) – The height of the displayed figures in inches. Default is 6.

  • w (int) – The width of the displayed figures in inches. Default is 6.

  • transparent_background (bool) – If True, the background of the saved image will be transparent. Only used if image_savetype is ‘figure’.

  • colorbar_height (str) –

    The height of the colorbar. Default is ‘match_image’. Only used if image_savetype is ‘figure’. More options may be added in the future.

    If ‘match_image’, the colorbar will be the same height as the image. Otherwise the colorbar will be the default height determined by matplotlib.

  • pad_inches (float, None) – The amount of padding around the saved image in inches. Only used if image_savetype is ‘figure’. None will use the default padding determined by matplotlib. Default is 0.0, which means no padding.
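The interaction of the ratio computation with handle_infinity, scale, and threshold can be sketched as follows. This is a simplified illustration, not the library's implementation: the name ratio_pair is hypothetical, both images are assumed to have the same shape, only the default handle_infinity='maximum' behavior is shown, and treating 0/0 as 0 is an assumption.

```python
import numpy as np


def ratio_pair(img_a, img_b, scale=0.999, threshold=None):
    """Return [a / b, b / a], capped for display (hypothetical helper)."""
    a = np.asarray(img_a, dtype=float)
    b = np.asarray(img_b, dtype=float)
    results = []
    for num, den in ((a, b), (b, a)):
        with np.errstate(divide="ignore", invalid="ignore"):
            ratio = num / den
        ratio = np.where(np.isnan(ratio), 0.0, ratio)   # assumption: 0/0 -> 0
        # 'maximum': replace infinities with the largest finite ratio.
        inf_mask = np.isinf(ratio)
        finite = ratio[~inf_mask]
        ratio[inf_mask] = finite.max() if finite.size else 0.0
        # An explicit threshold takes priority over the quantile given by scale.
        cap = threshold if threshold is not None else np.quantile(ratio, scale)
        results.append(np.minimum(ratio, cap))
    return results


a = np.array([[1.0, 2.0], [0.0, 4.0]])
b = np.array([[2.0, 0.0], [0.0, 2.0]])
ratio_ab, ratio_ba = ratio_pair(a, b, threshold=1.0)
```

Capping at a high quantile rather than at the maximum keeps a few very large ratios from compressing the rest of the color scale.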