Tiltstack and mdoc modules: Functions for processing tilt-series#
Tilt-series (TS) represent the raw data available after a session of data collection. Those are accompanied by text files that store metadata, hereby referred as mdoc files, from their file extension. The set of operations that are performed to reconstruct the tomographic volumes from the tilt-series are collectively referred to as preprocessing steps. First, each projection image (which is collected as set of frames) should be corrected for beam-induced motion with a dedicated software. Each image file that belongs to the same tilt-series is then be combined into a stack that can be further processed to reconstruct the tomographic volume. CryoCAT offers a number of functions for processing the motion-corrected stacks until alignment of the projection images and tomogram reconstruction, for which a number of dedicated software already exist. The accepted format for tilt-stacks in cryoCAT is MRC. In this section of the user guide, we explain the basic concepts of the Tiltstack and Mdoc objects and then illustrate some of the operations that can be performed on tilt-stacks using cryoCAT and we provide some examples.
Pre-requisites#
Please, note that you need to load both the tiltstack and the mdoc module to be able to work with tilt-stacks.
[1]:
from cryocat import tiltstack
from cryocat import mdoc
Basics#
Tilt-series and mdoc files are internally stored in the Tiltstack and Mdoc classes, respectively. Please note that most of the functionalities are available outside of these objects, namely functions exist so that one can simply pass the path to the desired file to achieve the desired outcome without initializing the classes, whcih will be done internally. Nonetheless, for completeness we hereby illustrate the basic concepts of those objects and we provide examples for both the use cases, where applicable.
Tiltstack basics#
Internally, tilt-series are stored in the Tiltstack class, which can be initialized passing the path of a tilt-series or passing directly a numpy array. The attributes of a Tiltstack objects are:
data: 3D numpy array with the actual pixel values;data_type: type of the array elements;input_order: original order of the array dimensions (default xyz);current_order: order stored internally to operate on the stack;output_order: order of the desired output stack (default xyz);n_tilts: number of tilt images in the stack;heightandwidth: Dimension of the images in the stack (i.e. number of pixels in X and Y dimension).
Mdoc basics#
Internally, mdoc files are stored in the Mdoc class, which has 5 attributes designed to parse and store the information contained in mdocs, which can then be accessed programmatically. The attributes of the class are:
file_path: Path of the mdoc file;titles: List of general information about the session, for instance the microscope name, date of the session, the tilt-axis angle, spot size. These is stored in square brackets in the mdoc files before the metadata of each single image;project_info: This is a disctionary that contains again general information stored in the header of the mdoc files. Keys and values will be automatically assigned. Typical keys are PixelSpacing, Voltage, ImageFile, ImageSize, DataMode.imgs: This is dataframe with as many rows as the number of images in the stack and as many columns as the number of fields. The header of the dataframe corresponds with the names of the fields;section_id: The image identifier inside the mdoc (e.g. ZValue)
Common pre-processing operations#
The functions will be explained with the stack and mdoc paths defined below.
[ ]:
# Define the paths to the original files
original_stack = "path/to/original_stack.mrc"
original_mdoc = "path/to/original_mdoc.mdoc"
Crop the projection images: The
crop()function in thetiltstackmodule allow to crop the images of a tilt-stack and therefore redefinig the stack dimensions. Mind that the center of the stack projections will be retained, so an equal number of pixels will be removed from both edges of the axis. Example:
[ ]:
desired_new_width = 4000
desired_new_height = 4000
tiltstack.crop(original_stack, new_width=desired_new_width, new_height=desired_new_height, output_file="/path/to/my_cropped_stack.mrc") # this will crop original_stack.mrc to 4000 px in both X and Y axes and save the stack with new dimensions to my_cropped_stack.mrc
After cropping the images of a stack, the mdoc file should be updated accordingly. This can be accomplished as demonstrated below:
[ ]:
mdoc = mdoc.Mdoc(original_mdoc) # store the info in the mdoc file as Mdoc object
str_new_dimensions = " ".join([str(desired_new_width), str(desired_new_height)])
mdoc.project_info["ImageSize"] = str_new_dimensions
mdoc.add_field("CroppedSize", str_new_dimensions) # use this if you want to add a metadata field to store the cropped size for each image
mdoc.write(overwrite=True) # overwrite the original mdoc with the applied change
To check that the new field was properly added, you can try to call it as displayed below:
[ ]:
mdoc.get_image_feature("CroppedSize") # this should print the new dimensions for each image of the stack
Sort the stack images by their tilt angle: Depending on the acquisition software, the final stack might or might not be sorted by tilt angles. In case it is not, a useful first step is sorting the images by their tilt angle. This can be achived with the
sort_tilts_by_angle()function in thetiltstackmodule. This function requires both the original stack and the angles information as input. As for the information on the angles, this can be provided either as an array or as the path to a file containng angle information. In the latter case, accepted formats are mdoc files of Warp xlm files. Alternatively, you can pass a text file with the angle information. Please refer toioutils.tlt_load()for more information on the accepted formats. To work further with the sorted stack, it is a good idea to update the mdoc file as well. this can be accomplished with thesort_mdoc_by_tilt_angles()function of themdocmodule. If you wish to write the output of the two functions to a file, you need to pass the path to the desired output to theoutput_fileargument.
[ ]:
# Sort the stack and the mdoc file
sorted_stack = tiltstack.sort_tilts_by_angle(original_stack, input_tilts=original_mdoc, output_file="/path/to/sorted_stack.mrc")
sorted_mdoc = mdoc.sort_mdoc_by_tilt_angles(original_mdoc, reset_z_value=True, output_file="/path/to/sorted_mdoc.mdoc")
As alternative approach, it’s also possible to work on the Mdoc object as demonstrated in the following example:
[ ]:
my_mdoc = mdoc.Mdoc(original_mdoc)
my_mdoc.sort_by_tilt(reset_z_value=True) # this will sort the data of the images in the Mdoc object by their tilt angle similarly as the example above without saving to file
my_mdoc.write(overwrite=True) #overwrite original mdoc
Update the pixel size in the mdoc files: This can be done with the
update_pixel_size()function as illustrated below:
[ ]:
new_px_size = 2.414 # in Å
my_mdoc = mdoc.Mdoc(original_mdoc) # the function works only on Mdoc objects
my_mdoc.update_pixel_size(new_px_size)
my_mdoc.write(overwrite=True) # save changes to file
Remove tilt images from the stack#
Typically some images in the stack are characterized by poor quality and should be discarded during the pipeline to reconstruct the tomographic volume. CryoCAT offers fucntions to removed the desired images by passing their indices. The same images should be removed from the mdoc files as well for consistency. The indices can be passed as path to text file containing one index per line, as path to CSV file with a column named “ToBeRemoved”, or as array (for more information, please refer to the
documentation of `ioutils.load_indices() <https://cryocat.readthedocs.io/latest/generated/cryocat.ioutils.html#cryocat.ioutils.indices_load>`__ ) for more information on the accepted formats. Indices can sbe numbered from 0 or 1 (default).
[ ]:
import numpy as np
remove_idx = np.array[(1, 3, 35, 41)]
tiltstack.remove_tilts(tilt_stack=original_stack, idx_to_remove=remove_idx, numbered_from_1=False, output_file='/path/to/cleaned_tiltstack.mrc', input_order='zyx', output_order='zyx')
mdoc.remove_images(input_mdoc=original_mdoc, idx_to_remove=remove_idx, numbered_from_1=False, output_file='/path/to/cleaned_tiltstack.mdoc')
If you have performed CTF estimation with Gctf or CTFFIND4, it is a good idea to remove the same entries from the output .star or .txt file with the estimated defocus information. This can be accomplished as illustrated below using Gctf output file as an example:
[ ]:
from cryocat.ioutils import defocus_remove_file_entries
import os
# Get the filename without extension
base_stack_name = os.path.splitext(original_stack)[0]
defocus_file_suffix = "_gctf.star"
# Clean the defocus file
defocus_remove_file_entries(base_stack_name+defocus_file_suffix, remove_idx, file_type='gctf', numbered_from_1=False, output_file=base_stack_name+defocus_file_suffix) # this will overwrite the original file
Dose filtering#
To correct for dose exposure, the dose_filter() function can be applied to the stack. One of the required inputs is the dose information for each image. This can be passed in the form of the mdoc file, in which case the dose will be automatically computed based on the prior dose and the exposure dose for each image. Alternatively, it can be passed as a CSV file that needs to include a “CorrectedDose” column or as a numpy array with the cumulative dose information. Please refer to
`ioutils.total_dose_load() <https://cryocat.readthedocs.io/latest/generated/cryocat.ioutils.html#cryocat.ioutils.total_dose_load>`__ for more information.
[ ]:
# Fetch the pixel size from the mdoc
my_mdoc = mdoc.Mdoc(original_mdoc)
px_size = my_mdoc.get_image_feature("PixelSpacing")[0]
tiltstack.dose_filter(original_stack, px_size, original_mdoc, output_file="/path/to/dose_filtered_stack.mrc")
Other functions#
Merge and splitting tilt-series#
The following examples illustrate how to merge stacks and mdoc files and how to get the mdoc file of each tilt image of a stack.
[ ]:
# Combine all the stack files in the same folder that have a common "tomo01_" prefix
tiltstack.merge("/path/to/mrc_files/tomo01_", output_file="/path/to/tomo01_combined.mrc", output_order="xyz")
[ ]:
# Combine all the mdoc files contained in one folder
input_path = "/path/to/folder/with/mdocs_to_combine/"
combined_mdoc = mdoc.merge_mdoc_files(input_path, new_id="ZValue", reorder=False, output_file="/path/to/combined_file.mdoc")
# Combine all the mdoc files in the same folder that have a common "tomo01_" prefix
input_path_prefix = "/path/to/folder/with/mdocs_to_combine/tomo01_"
combined_mdoc = mdoc.merge_mdoc_files(input_path, new_id="ZValue", reorder=False, output_file="/path/to/combined_tomo01.mdoc")
[ ]:
# Split mdoc file of a tilt-series by images
from cryocat.mdoc import split_mdoc_file
split_mdoc_file(original_mdoc, output_folder="/path/to/folder/to/store/the_individual_mdocs/")
Functions to aid visualization of tilt-series#
To help visualizing the tilt series, the equalize_histogram() function in the tiltstack module can be used to stretch the contrast. Furthermore, it is often beneficial to have a binned version of the stack that occupies less memory and is consequently faster to be loaded in a visualization software as 3dmod or Napari. Binned stacks can be generated with the bin() function of the tiltstack module.
[ ]:
# Generate bin8 stack
tiltstack.bin(tilt_stack=original_stack, binning_factor='8', output_file="/path/to/stack_bin8.mrc")
# Stretch the contrast of the binned stack
tiltstack.equalize_histogram(tilt_stack="/path/to/stack_bin8.mrc", eh_method='constrat_stretching', output_file="/path/to/stack_bin8.mrc") # overwrite the original bin8 stack