Package `pyaurorax`

The PyAuroraX package provides a way to interact with the AuroraX Data Platform, facilitating programmatic usage of AuroraX's search engine and data analysis tools.

For an overview of usage and examples, visit the AuroraX Developer Zone website, or explore the examples contained in the Github repository here.

Installation:

pip install pyaurorax

Basic usage:

import pyaurorax
aurorax = pyaurorax.PyAuroraX()

Sub-modules

pyaurorax.data: Instrument data downloading and reading module. This module presently has support for data provided by the University of Calgary, such as THEMIS ASI, …
pyaurorax.exceptions: Unique exception classes utilized by PyAuroraX. These exceptions can be used to help trap specific errors raised by this library …
pyaurorax.models: Interact with various auroral models, such as the TREx Auroral Transport Model (ATM).
pyaurorax.search: Interact with the AuroraX search engine. This includes finding data sources, searching for conjunctions or ephemeris data, and uploading/managing …
pyaurorax.tools: Data analysis toolkit for working with all-sky imager data available within the AuroraX platform …

Classes

class PyAuroraX (download_output_root_path: str | None = None, read_tar_temp_path: str | None = None, api_base_url: str | None = None, api_timeout: int | None = None, api_key: str | None = None, progress_bar_backend: Literal['auto', 'standard', 'notebook'] = 'auto')

Expand source code

class PyAuroraX:
    """
    The `PyAuroraX` class is the primary entry point for utilizing
    this library. It is used to initialize a session, capturing details
    about API connectivity, environment, and more. All submodules are 
    encapsulated within this class, so any usage of the library starts 
    with creating this object.

    ```python
    import pyaurorax
    aurorax = pyaurorax.PyAuroraX()
    ```

    When working with this object, you can set configuration parameters, such 
    as the destination directory for downloaded data, or API special settings 
    (e.g., timeout, HTTP headers, API key). These parameters can be set when 
    instantiating the object, or after instantiating using the self-contained 
    accessible variables.
    """

    __DEFAULT_API_BASE_URL = "https://api.aurorax.space"
    __DEFAULT_API_TIMEOUT = 10
    __DEFAULT_API_HEADERS = {
        "content-type": "application/json",
        "user-agent": "python-pyaurorax/%s" % (__version__),
    }  # NOTE: these MUST be lowercase so that the decorator logic cannot be overridden

    def __init__(
        self,
        download_output_root_path: Optional[str] = None,
        read_tar_temp_path: Optional[str] = None,
        api_base_url: Optional[str] = None,
        api_timeout: Optional[int] = None,
        api_key: Optional[str] = None,
        progress_bar_backend: Literal["auto", "standard", "notebook"] = "auto",
    ):
        """
        Attributes:
            download_output_root_path (str): 
                Destination directory for downloaded data. The default for this path is a 
                subfolder in the user's home directory, such  as `/home/user/pyaurorax_data` 
                in Linux. In Windows and Mac, it is similar.

            read_tar_temp_path (str): 
                Temporary directory used for tar extraction phases during file reading (e.g., 
                reading TREx RGB Burst data). The default for this is `<download_output_root_path>/.tar_temp_working`. 
                For faster performance when reading tar-based data, one option on Linux is 
                to set this to use RAM directly at `/dev/shm/pyaurorax_tar_temp_working`.

            api_base_url (str): 
                URL prefix to use when interacting with the AuroraX API. By default this is set to 
                `https://api.aurorax.space`. This parameter is primarily used by the development 
                team to test and build new functions using the private staging API.

            api_timeout (int): 
                The timeout used when communicating with the Aurorax API. This value is represented in 
                seconds, and by default is `10 seconds`.
            
            api_key (str): 
                API key to use when interacting with the AuroraX API. The default value is None. Please note
                that an API key is only required for write operations to the AuroraX search API, such as
                creating data sources or uploading ephemeris data.
        
            progress_bar_backend (str): 
                The progress bar backend to use. Valid choices are 'auto', 'standard', or 'notebook'. 
                Default is 'auto'. This parameter is optional.

            srs_obj (pyucalgarysrs.PyUCalgarySRS): 
                A [PyUCalgarySRS](https://docs-pyucalgarysrs.phys.ucalgary.ca/#pyucalgarysrs.PyUCalgarySRS) object. 
                If not supplied, it will create the object with some settings carried over from the PyAuroraX 
                object. Note that specifying this is for advanced users and only necessary a few special use-cases.

        Raises:
            pyaurorax.exceptions.AuroraXInitializationError: an error was encountered during initialization 
                of the paths
        """
        # initialize path parameters
        self.__download_output_root_path = download_output_root_path
        if (self.__download_output_root_path is None):
            self.__download_output_root_path = Path("%s/pyaurorax_data" % (str(Path.home())))
        self.__read_tar_temp_path = read_tar_temp_path
        if (self.__read_tar_temp_path is None):
            self.__read_tar_temp_path = Path("%s/tar_temp_working" % (self.__download_output_root_path))

        # initialize api parameters
        self.__api_base_url = api_base_url
        if (api_base_url is None):
            self.__api_base_url = self.__DEFAULT_API_BASE_URL
        self.__api_timeout = api_timeout
        if (api_timeout is None):
            self.__api_timeout = self.__DEFAULT_API_TIMEOUT
        self.__api_key = api_key
        self.__api_headers = self.__DEFAULT_API_HEADERS

        # initialize progress bar parameters
        self.__progress_bar_backend = progress_bar_backend
        self._tqdm = None

        # initialize PyUCalgarySRS object
        self.__srs_obj = pyucalgarysrs.PyUCalgarySRS(
            api_headers=self.__api_headers,
            api_timeout=self.__api_timeout,
            download_output_root_path=self.download_output_root_path,
            read_tar_temp_path=self.read_tar_temp_path,
            progress_bar_backend=progress_bar_backend,
        )

        # initialize progress bar tqdm object (by pulling it from srs_obj)
        self._tqdm = self.__srs_obj._tqdm

        # initialize sub-modules
        self.__search = SearchManager(self)
        self.__data = DataManager(self)
        self.__models = ModelsManager(self)
        self.__tools = ToolsManager(self)

        # disable certain dependencies warnings
        #
        # pillow has deprecated a parameter that matplotlib uses, so matplotlib will eventually fix it
        warnings.filterwarnings("ignore", message="'mode' parameter is deprecated and will be removed in Pillow 13")

    # ------------------------------------------
    # properties for submodule managers
    # ------------------------------------------
    @property
    def search(self):
        """
        Access to the `search` submodule from within a PyAuroraX object.
        """
        return self.__search

    @property
    def data(self):
        """
        Access to the `data` submodule from within a PyAuroraX object.
        """
        return self.__data

    @property
    def models(self):
        """
        Access to the `models` submodule from within a PyAuroraX object.
        """
        return self.__models

    @property
    def tools(self):
        """
        Access to the `tools` submodule from within a PyAuroraX object.
        """
        return self.__tools

    # ------------------------------------------
    # properties for configuration parameters
    # ------------------------------------------
    @property
    def api_base_url(self):
        """
        Property for the API base URL. See above for details.
        """
        return self.__api_base_url

    @api_base_url.setter
    def api_base_url(self, value: Optional[str] = None):
        if (value is None):
            self.__api_base_url = self.__DEFAULT_API_BASE_URL
        else:
            # check if http:// or https:// is in the URL
            value = value.lower()
            if (len(value) <= 8 or ("https://" not in value and "http://" not in value)):
                raise AuroraXInitializationError("API base URL is an invalid URL")

            # remove trailing slash if there is one
            if (value[-1] == '/'):
                value = value[0:-1]
            self.__api_base_url = value

    @property
    def api_headers(self):
        """
        Property for the API headers. See above for details.
        """
        return self.__api_headers

    @property
    def api_timeout(self):
        """
        Property for the API timeout. See above for details.
        """
        return self.__api_timeout

    @api_timeout.setter
    def api_timeout(self, value: int):
        new_timeout = self.__DEFAULT_API_TIMEOUT
        if (value is not None):
            new_timeout = value
        self.__api_timeout = new_timeout
        self.__srs_obj.api_timeout = new_timeout

    @property
    def api_key(self):
        """
        Property for the API key. See above for details.
        """
        return self.__api_key

    @api_key.setter
    def api_key(self, value: Optional[str] = None):
        # set the private var
        self.__api_key = value

        # update the headers
        if (value is None or value == "") and "x-aurorax-api-key" in self.__api_headers:
            # remove it
            del self.__api_headers["x-aurorax-api-key"]
        else:
            # add/update it
            self.__api_headers["x-aurorax-api-key"] = self.__api_key  # type: ignore

    @property
    def download_output_root_path(self):
        """
        Property for the download output root path. See above for details.
        """
        return str(self.__download_output_root_path)

    @download_output_root_path.setter
    def download_output_root_path(self, value: str):
        self.__download_output_root_path = value
        self.initialize_paths()
        self.__srs_obj.download_output_root_path = self.__download_output_root_path

    @property
    def read_tar_temp_path(self):
        """
        Property for the read tar temp path. See above for details.
        """
        return str(self.__read_tar_temp_path)

    @read_tar_temp_path.setter
    def read_tar_temp_path(self, value: str):
        self.__read_tar_temp_path = value
        self.initialize_paths()
        self.__srs_obj.read_tar_temp_path = self.__read_tar_temp_path

    @property
    def progress_bar_backend(self):
        """
        Property for the progress bar backend. See above for details.
        """
        return self.__progress_bar_backend

    @progress_bar_backend.setter
    def progress_bar_backend(self, value: Optional[Literal["auto", "standard", "notebook"]] = None):
        if (value is None):
            value = "auto"
        else:
            value = value.lower()  # type: ignore
        if (value != "auto" and value != "standard" and value != "notebook"):
            raise AuroraXInitializationError("Invalid progress bar backend. Allowed values are 'auto', 'standard' or 'notebook'.")
        self.__progress_bar_backend = value
        self.__srs_obj.progress_bar_backend = value
        self._tqdm = self.__srs_obj._tqdm

    @property
    def srs_obj(self):
        """
        Property for the PyUCalgarySRS object. See above for details.
        """
        return self.__srs_obj

    # -----------------------------
    # special methods
    # -----------------------------
    def __str__(self) -> str:
        return self.__repr__()

    def __repr__(self) -> str:
        return ("PyAuroraX(download_output_root_path='%s', read_tar_temp_path='%s', api_base_url='%s', " +
                "api_headers=%s, api_timeout=%s, api_key=%s, progress_bar_backend='%s', srs_obj=PyUCalgarySRS(...))") % (
                    self.__download_output_root_path,
                    self.__read_tar_temp_path,
                    self.api_base_url,
                    self.api_headers,
                    self.api_timeout,
                    "None" if self.api_key is None else "'%s'" % (self.api_key),
                    self.progress_bar_backend,
                )

    def pretty_print(self):
        """
        A special print output for this class.
        """
        # set special strings

        # print
        print("PyAuroraX:")
        print("  %-27s: %s" % ("download_output_root_path", self.download_output_root_path))
        print("  %-27s: %s" % ("read_tar_temp_path", self.read_tar_temp_path))
        print("  %-27s: %s" % ("api_base_url", self.api_base_url))
        print("  %-27s: %s" % ("api_headers", self.api_headers))
        print("  %-27s: %s" % ("api_timeout", self.api_timeout))
        print("  %-27s: %s" % ("api_key", self.api_key))
        print("  %-27s: %s" % ("progress_bar_backend", self.progress_bar_backend))
        print("  %-27s: %s" % ("srs_obj", "PyUCalgarySRS(...)"))

    # -----------------------------
    # public methods
    # -----------------------------
    def initialize_paths(self):
        """
        Initialize the `download_output_root_path` and `read_tar_temp_path` directories.

        Raises:
            pyaurorax.exceptions.AuroraXInitializationError: an error was encountered during
                initialization of the paths
        """
        if (self.__download_output_root_path is None):  # pragma: nocover-ok
            self.__download_output_root_path = Path("%s/pyaurorax_data" % (str(Path.home())))
        if (self.__read_tar_temp_path is None):  # pragma: nocover-ok
            self.__read_tar_temp_path = Path("%s/tar_temp_working" % (self.__download_output_root_path))
        try:
            os.makedirs(self.download_output_root_path, exist_ok=True)
            os.makedirs(self.read_tar_temp_path, exist_ok=True)
        except IOError as e:  # pragma: nocover-ok
            raise AuroraXInitializationError("Error during output path creation: %s" % str(e)) from e

    def purge_download_output_root_path(self, dataset_name: Optional[str] = None):
        """
        Delete all files in the `download_output_root_path` directory. Since the
        library downloads data to this directory, over time it can grow too large
        and the user can risk running out of space. This method is here to assist
        with easily clearing out this directory.

        Note that it also deletes all files in the PyUCalgarySRS object's 
        download_output_root_path path as well. Normally, these two paths are the 
        same, but it can be different if the user specifically changes it. 

        Args:
            dataset_name (str): 
                Delete only files for a specific dataset name. This parameter is optional.

        Raises:
            pyaurorax.exceptions.AuroraXPurgeError: an error was encountered during the purge operation
        """
        try:
            # purge pyaurorax path
            for item in os.listdir(self.download_output_root_path):
                item = Path(self.download_output_root_path) / item

                # check if this is the dataset we want to delete
                if (dataset_name is None or item.name == dataset_name.upper()):
                    if (os.path.isdir(item) is True and self.read_tar_temp_path not in str(item)):
                        shutil.rmtree(item)
                    elif (os.path.isfile(item) is True):
                        os.remove(item)
        except Exception as e:  # pragma: nocover-ok
            raise AuroraXPurgeError("Error while purging download output root path: %s" % (str(e))) from e

    def purge_read_tar_temp_path(self):
        """
        Delete all files in the `read_tar_temp_path` directory. Since the library 
        extracts temporary data to this directory, sometime issues during reading 
        can cause this directory to contain residual files that aren't deleted during 
        the normal read routine. Though this is very rare, it is still possible. 
        Therefore, this method is here to assist with easily clearing out this 
        directory.

        Note that it also deletes all files in the PyUCalgarySRS object's 
        read_tar_temp_path path as well. Normally, these two paths are the 
        same, but it can be different if the user specifically changes it. 

        Raises:
            pyaurorax.exceptions.AuroraXPurgeError: an error was encountered during the purge operation
        """
        try:
            # purge pyaurorax path
            for item in os.listdir(self.read_tar_temp_path):
                item = Path(self.read_tar_temp_path) / item
                if (os.path.isdir(item) is True and self.download_output_root_path not in str(item)):
                    shutil.rmtree(item)  # pragma: nocover
                elif (os.path.isfile(item) is True):
                    os.remove(item)

            # purge pyucalgarysrs path
            self.__srs_obj.purge_read_tar_temp_path()
        except Exception as e:  # pragma: nocover-ok
            raise AuroraXPurgeError("Error while purging read tar temp path: %s" % (str(e))) from e

    def show_data_usage(self, order: Literal["name", "size"] = "size", return_dict: bool = False) -> Any:
        """
        Print the volume of data existing in the download_output_root_path, broken down
        by dataset. Alternatively return the information in a dictionary.
        
        This can be a helpful tool for managing your disk space.

        Args:
            order (bool): 
                Order results by either `size` or `name`. Default is `size`.

            return_dict (bool): 
                Instead of printing the data usage information, return the information as a dictionary.

        Returns:
            Printed output. If `return_dict` is True, then it will instead return a dictionary with the
            disk usage information.
        
        Notes:
            Note that size on disk may differ slightly from the values determined by this 
            routine. For example, the results here will be slightly different than the output
            of a 'du' command on *nix systems.
        """
        # init
        total_size = 0
        download_pathlib_path = Path(self.download_output_root_path)

        # get list of dataset paths
        dataset_paths = []
        if (download_pathlib_path.exists() is True):
            for f in os.listdir(download_pathlib_path):
                path_f = download_pathlib_path / f
                if (os.path.isdir(path_f) is True and str(path_f) != self.read_tar_temp_path):
                    dataset_paths.append(path_f)

        # get size of each dataset path
        dataset_dict = {}
        longest_path_len = 0
        for dataset_path in dataset_paths:
            # get size
            dataset_size = 0
            for dirpath, _, filenames in os.walk(dataset_path):
                for filename in filenames:
                    filepath = os.path.join(dirpath, filename)
                    if (os.path.isfile(filepath) is True):
                        dataset_size += os.path.getsize(filepath)

            # check if this is the longest path name
            path_basename = os.path.basename(dataset_path)
            if (longest_path_len == 0):
                longest_path_len = len(path_basename)
            elif (len(path_basename) > longest_path_len):
                longest_path_len = len(path_basename)

            # set dict
            dataset_dict[path_basename] = {
                "path_obj": dataset_path,
                "size_bytes": dataset_size,
                "size_str": humanize.naturalsize(dataset_size),
            }

            # add to total
            total_size += dataset_size

        # return dictionary
        if (return_dict is True):
            return dataset_dict

        # print table
        #
        # order into list
        order_key = "size_bytes" if order == "size" else order
        ordered_list = []
        for path, p_dict in dataset_dict.items():
            this_dict = p_dict
            this_dict["name"] = path
            ordered_list.append(this_dict)
        if (order == "size"):
            ordered_list = reversed(sorted(ordered_list, key=lambda x: x[order_key]))
        else:
            ordered_list = sorted(ordered_list, key=lambda x: x[order_key])

        # set column data
        table_names = []
        table_sizes = []
        for item in ordered_list:
            table_names.append(item["name"])
            table_sizes.append(item["size_str"])

        # set header values
        table_headers = ["Dataset name", "Size"]

        # print as table
        table = Texttable()
        table.set_deco(Texttable.HEADER)
        table.set_cols_dtype(["t"] * len(table_headers))
        table.set_header_align(["l"] * len(table_headers))
        table.set_cols_align(["l"] * len(table_headers))
        table.header(table_headers)
        for i in range(0, len(table_names)):
            table.add_row([table_names[i], table_sizes[i]])
        print(table.draw())

        print("\nTotal size: %s" % (humanize.naturalsize(total_size)))

The PyAuroraX class is the primary entry point for utilizing this library. It is used to initialize a session, capturing details about API connectivity, environment, and more. All submodules are encapsulated within this class, so any usage of the library starts with creating this object.

import pyaurorax
aurorax = pyaurorax.PyAuroraX()

When working with this object, you can set configuration parameters, such as the destination directory for downloaded data, or API special settings (e.g., timeout, HTTP headers, API key). These parameters can be set when instantiating the object, or after instantiating using the self-contained accessible variables.

Attributes

download_output_root_path : str: Destination directory for downloaded data. The default for this path is a subfolder in the user's home directory, such as /home/user/pyaurorax_data in Linux. In Windows and Mac, it is similar.
read_tar_temp_path : str: Temporary directory used for tar extraction phases during file reading (e.g., reading TREx RGB Burst data). The default for this is <download_output_root_path>/.tar_temp_working. For faster performance when reading tar-based data, one option on Linux is to set this to use RAM directly at /dev/shm/pyaurorax_tar_temp_working.
api_base_url : str: URL prefix to use when interacting with the AuroraX API. By default this is set to https://api.aurorax.space. This parameter is primarily used by the development team to test and build new functions using the private staging API.
api_timeout : int: The timeout used when communicating with the Aurorax API. This value is represented in seconds, and by default is 10 seconds.
api_key : str: API key to use when interacting with the AuroraX API. The default value is None. Please note that an API key is only required for write operations to the AuroraX search API, such as creating data sources or uploading ephemeris data.
progress_bar_backend : str: The progress bar backend to use. Valid choices are 'auto', 'standard', or 'notebook'. Default is 'auto'. This parameter is optional.
srs_obj : pyucalgarysrs.PyUCalgarySRS: A PyUCalgarySRS object. If not supplied, it will create the object with some settings carried over from the PyAuroraX object. Note that specifying this is for advanced users and only necessary a few special use-cases.

Raises

AuroraXInitializationError: an error was encountered during initialization of the paths

Instance variables

prop api_base_url

Expand source code

@property
def api_base_url(self):
    """
    Property for the API base URL. See above for details.
    """
    return self.__api_base_url

Property for the API base URL. See above for details.

prop api_headers

Expand source code

@property
def api_headers(self):
    """
    Property for the API headers. See above for details.
    """
    return self.__api_headers

Property for the API headers. See above for details.

prop api_key

Expand source code

@property
def api_key(self):
    """
    Property for the API key. See above for details.
    """
    return self.__api_key

Property for the API key. See above for details.

prop api_timeout

Expand source code

@property
def api_timeout(self):
    """
    Property for the API timeout. See above for details.
    """
    return self.__api_timeout

Property for the API timeout. See above for details.

prop data

Expand source code

@property
def data(self):
    """
    Access to the `data` submodule from within a PyAuroraX object.
    """
    return self.__data

Access to the pyaurorax.data submodule from within a PyAuroraX object.

prop download_output_root_path

Expand source code

@property
def download_output_root_path(self):
    """
    Property for the download output root path. See above for details.
    """
    return str(self.__download_output_root_path)

Property for the download output root path. See above for details.

prop models

Expand source code

@property
def models(self):
    """
    Access to the `models` submodule from within a PyAuroraX object.
    """
    return self.__models

Access to the pyaurorax.models submodule from within a PyAuroraX object.

prop progress_bar_backend

Expand source code

@property
def progress_bar_backend(self):
    """
    Property for the progress bar backend. See above for details.
    """
    return self.__progress_bar_backend

Property for the progress bar backend. See above for details.

prop read_tar_temp_path

Expand source code

@property
def read_tar_temp_path(self):
    """
    Property for the read tar temp path. See above for details.
    """
    return str(self.__read_tar_temp_path)

Property for the read tar temp path. See above for details.

prop search

Expand source code

@property
def search(self):
    """
    Access to the `search` submodule from within a PyAuroraX object.
    """
    return self.__search

Access to the pyaurorax.search submodule from within a PyAuroraX object.

prop srs_obj

Expand source code

@property
def srs_obj(self):
    """
    Property for the PyUCalgarySRS object. See above for details.
    """
    return self.__srs_obj

Property for the PyUCalgarySRS object. See above for details.

prop tools

Expand source code

@property
def tools(self):
    """
    Access to the `tools` submodule from within a PyAuroraX object.
    """
    return self.__tools

Access to the pyaurorax.tools submodule from within a PyAuroraX object.

Methods

def initialize_paths(self)

Expand source code

def initialize_paths(self):
    """
    Initialize the `download_output_root_path` and `read_tar_temp_path` directories.

    Raises:
        pyaurorax.exceptions.AuroraXInitializationError: an error was encountered during
            initialization of the paths
    """
    if (self.__download_output_root_path is None):  # pragma: nocover-ok
        self.__download_output_root_path = Path("%s/pyaurorax_data" % (str(Path.home())))
    if (self.__read_tar_temp_path is None):  # pragma: nocover-ok
        self.__read_tar_temp_path = Path("%s/tar_temp_working" % (self.__download_output_root_path))
    try:
        os.makedirs(self.download_output_root_path, exist_ok=True)
        os.makedirs(self.read_tar_temp_path, exist_ok=True)
    except IOError as e:  # pragma: nocover-ok
        raise AuroraXInitializationError("Error during output path creation: %s" % str(e)) from e

Initialize the download_output_root_path and read_tar_temp_path directories.

Raises

AuroraXInitializationError: an error was encountered during initialization of the paths

def pretty_print(self)

Expand source code

def pretty_print(self):
    """
    A special print output for this class.
    """
    # set special strings

    # print
    print("PyAuroraX:")
    print("  %-27s: %s" % ("download_output_root_path", self.download_output_root_path))
    print("  %-27s: %s" % ("read_tar_temp_path", self.read_tar_temp_path))
    print("  %-27s: %s" % ("api_base_url", self.api_base_url))
    print("  %-27s: %s" % ("api_headers", self.api_headers))
    print("  %-27s: %s" % ("api_timeout", self.api_timeout))
    print("  %-27s: %s" % ("api_key", self.api_key))
    print("  %-27s: %s" % ("progress_bar_backend", self.progress_bar_backend))
    print("  %-27s: %s" % ("srs_obj", "PyUCalgarySRS(...)"))

A special print output for this class.

def purge_download_output_root_path(self, dataset_name: str | None = None)

Expand source code

def purge_download_output_root_path(self, dataset_name: Optional[str] = None):
    """
    Delete all files in the `download_output_root_path` directory. Since the
    library downloads data to this directory, over time it can grow too large
    and the user can risk running out of space. This method is here to assist
    with easily clearing out this directory.

    Note that it also deletes all files in the PyUCalgarySRS object's 
    download_output_root_path path as well. Normally, these two paths are the 
    same, but it can be different if the user specifically changes it. 

    Args:
        dataset_name (str): 
            Delete only files for a specific dataset name. This parameter is optional.

    Raises:
        pyaurorax.exceptions.AuroraXPurgeError: an error was encountered during the purge operation
    """
    try:
        # purge pyaurorax path
        for item in os.listdir(self.download_output_root_path):
            item = Path(self.download_output_root_path) / item

            # check if this is the dataset we want to delete
            if (dataset_name is None or item.name == dataset_name.upper()):
                if (os.path.isdir(item) is True and self.read_tar_temp_path not in str(item)):
                    shutil.rmtree(item)
                elif (os.path.isfile(item) is True):
                    os.remove(item)
    except Exception as e:  # pragma: nocover-ok
        raise AuroraXPurgeError("Error while purging download output root path: %s" % (str(e))) from e

Delete all files in the download_output_root_path directory. Since the library downloads data to this directory, over time it can grow too large and the user can risk running out of space. This method is here to assist with easily clearing out this directory.

Note that it also deletes all files in the PyUCalgarySRS object's download_output_root_path path as well. Normally, these two paths are the same, but it can be different if the user specifically changes it.

Args

dataset_name : str: Delete only files for a specific dataset name. This parameter is optional.

Raises

AuroraXPurgeError: an error was encountered during the purge operation

def purge_read_tar_temp_path(self)

Expand source code

def purge_read_tar_temp_path(self):
    """
    Delete all files in the `read_tar_temp_path` directory. Since the library 
    extracts temporary data to this directory, sometime issues during reading 
    can cause this directory to contain residual files that aren't deleted during 
    the normal read routine. Though this is very rare, it is still possible. 
    Therefore, this method is here to assist with easily clearing out this 
    directory.

    Note that it also deletes all files in the PyUCalgarySRS object's 
    read_tar_temp_path path as well. Normally, these two paths are the 
    same, but it can be different if the user specifically changes it. 

    Raises:
        pyaurorax.exceptions.AuroraXPurgeError: an error was encountered during the purge operation
    """
    try:
        # purge pyaurorax path
        for item in os.listdir(self.read_tar_temp_path):
            item = Path(self.read_tar_temp_path) / item
            if (os.path.isdir(item) is True and self.download_output_root_path not in str(item)):
                shutil.rmtree(item)  # pragma: nocover
            elif (os.path.isfile(item) is True):
                os.remove(item)

        # purge pyucalgarysrs path
        self.__srs_obj.purge_read_tar_temp_path()
    except Exception as e:  # pragma: nocover-ok
        raise AuroraXPurgeError("Error while purging read tar temp path: %s" % (str(e))) from e

Delete all files in the read_tar_temp_path directory. Since the library extracts temporary data to this directory, sometime issues during reading can cause this directory to contain residual files that aren't deleted during the normal read routine. Though this is very rare, it is still possible. Therefore, this method is here to assist with easily clearing out this directory.

Note that it also deletes all files in the PyUCalgarySRS object's read_tar_temp_path path as well. Normally, these two paths are the same, but it can be different if the user specifically changes it.

Raises

AuroraXPurgeError: an error was encountered during the purge operation

def show_data_usage(self, order: Literal['name', 'size'] = 'size', return_dict: bool = False) ‑> Any

Expand source code

def show_data_usage(self, order: Literal["name", "size"] = "size", return_dict: bool = False) -> Any:
    """
    Print the volume of data existing in the download_output_root_path, broken down
    by dataset. Alternatively return the information in a dictionary.
    
    This can be a helpful tool for managing your disk space.

    Args:
        order (bool): 
            Order results by either `size` or `name`. Default is `size`.

        return_dict (bool): 
            Instead of printing the data usage information, return the information as a dictionary.

    Returns:
        Printed output. If `return_dict` is True, then it will instead return a dictionary with the
        disk usage information.
    
    Notes:
        Note that size on disk may differ slightly from the values determined by this 
        routine. For example, the results here will be slightly different than the output
        of a 'du' command on *nix systems.
    """
    # init
    total_size = 0
    download_pathlib_path = Path(self.download_output_root_path)

    # get list of dataset paths
    dataset_paths = []
    if (download_pathlib_path.exists() is True):
        for f in os.listdir(download_pathlib_path):
            path_f = download_pathlib_path / f
            if (os.path.isdir(path_f) is True and str(path_f) != self.read_tar_temp_path):
                dataset_paths.append(path_f)

    # get size of each dataset path
    dataset_dict = {}
    longest_path_len = 0
    for dataset_path in dataset_paths:
        # get size
        dataset_size = 0
        for dirpath, _, filenames in os.walk(dataset_path):
            for filename in filenames:
                filepath = os.path.join(dirpath, filename)
                if (os.path.isfile(filepath) is True):
                    dataset_size += os.path.getsize(filepath)

        # check if this is the longest path name
        path_basename = os.path.basename(dataset_path)
        if (longest_path_len == 0):
            longest_path_len = len(path_basename)
        elif (len(path_basename) > longest_path_len):
            longest_path_len = len(path_basename)

        # set dict
        dataset_dict[path_basename] = {
            "path_obj": dataset_path,
            "size_bytes": dataset_size,
            "size_str": humanize.naturalsize(dataset_size),
        }

        # add to total
        total_size += dataset_size

    # return dictionary
    if (return_dict is True):
        return dataset_dict

    # print table
    #
    # order into list
    order_key = "size_bytes" if order == "size" else order
    ordered_list = []
    for path, p_dict in dataset_dict.items():
        this_dict = p_dict
        this_dict["name"] = path
        ordered_list.append(this_dict)
    if (order == "size"):
        ordered_list = reversed(sorted(ordered_list, key=lambda x: x[order_key]))
    else:
        ordered_list = sorted(ordered_list, key=lambda x: x[order_key])

    # set column data
    table_names = []
    table_sizes = []
    for item in ordered_list:
        table_names.append(item["name"])
        table_sizes.append(item["size_str"])

    # set header values
    table_headers = ["Dataset name", "Size"]

    # print as table
    table = Texttable()
    table.set_deco(Texttable.HEADER)
    table.set_cols_dtype(["t"] * len(table_headers))
    table.set_header_align(["l"] * len(table_headers))
    table.set_cols_align(["l"] * len(table_headers))
    table.header(table_headers)
    for i in range(0, len(table_names)):
        table.add_row([table_names[i], table_sizes[i]])
    print(table.draw())

    print("\nTotal size: %s" % (humanize.naturalsize(total_size)))

Print the volume of data existing in the download_output_root_path, broken down by dataset. Alternatively return the information in a dictionary.

This can be a helpful tool for managing your disk space.

Args

order : bool: Order results by either size or name. Default is size.
return_dict : bool: Instead of printing the data usage information, return the information as a dictionary.

Returns

Printed output. If return_dict is True, then it will instead return a dictionary with the disk usage information.

Notes

Note that size on disk may differ slightly from the values determined by this routine. For example, the results here will be slightly different than the output of a 'du' command on *nix systems.