
7 Ways to Include Non-Python Resource Files into the Python Package in 2022


Imagine this: you’re using Python and happen to store a few non-Python files alongside the Python ones. These could be schema files, small lookup tables, and so on. Then, when you try to read them from a running Python script, you find that you can’t. It’s a common issue. To prevent it, here are seven ways to include non-Python resource files in a Python package and still be able to read them.

Best methods to include non-Python files into the Python Package

While there are several ways to add non-Python files, we will cover seven methods, ranging from setup.py and MANIFEST.in to the setuptools package_data directive and more.


1. Setup.py

Let’s assume that you already have a Python package. You can use setup.py to include non-Python files. Tip: You don’t need a finished package for this process. An empty package will work too. What’s necessary is that the Python package folder exists.

Note: Install the latest version of setuptools using the following command:


pip install --upgrade setuptools

Assume you have schema.json in your project. Place it at exampleproject/data/schema.json.

If you wish to include this non-Python file in your package, use the package_data argument of setup().

For instance:


setup(
    # ...,
    package_data={'exampleproject': ['data/schema.json']}
)


Another way to include all files of the same pattern into the package using setup.py is shown below:


setup(
    # ...,
    package_data={'': ['*.json']}
)

As given in the example, all *.json files that sit directly inside any package directory will be included in the package.
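To see which files a pattern would pick up, the sketch below imitates the matching with pathlib globs on a throwaway directory. It does not call setuptools, and the file names are hypothetical. The assumption is that package_data patterns are resolved relative to each package directory, so * does not descend into subdirectories.

```python
import tempfile
from pathlib import Path

# Build a throwaway package directory with a few hypothetical files.
pkg_dir = Path(tempfile.mkdtemp()) / "exampleproject"
(pkg_dir / "data").mkdir(parents=True)
for rel in ["__init__.py", "config.json", "data/schema.json", "notes.txt"]:
    (pkg_dir / rel).write_text("")


def matching(patterns):
    """Return the files under pkg_dir matched by any of the glob patterns."""
    found = set()
    for pattern in patterns:
        for path in pkg_dir.glob(pattern):
            found.add(path.relative_to(pkg_dir).as_posix())
    return sorted(found)


# "*.json" only sees files directly inside the package directory.
top_level = matching(["*.json"])
# A subdirectory needs to be named in the pattern.
nested = matching(["data/*.json"])
```

Under that assumption, a file such as data/schema.json needs a pattern like data/*.json, as in the first setup() example.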

Once the package is installed, the following pkg_resources functions give you access to the files:

- pkg_resources.resource_stream: It offers you a stream of the file.

- pkg_resources.resource_string: It offers you the contents of the file as a byte string.

- pkg_resources.resource_filename: It offers the name of the file in case the above two options don’t work (Tip: If the resource sits in a zipped package, this function extracts it to a cache directory and returns that path).

For basic use of setuptools, you will need a pyproject.toml file that declares you are using setuptools to package your project. Setuptools supports configuration from setup.cfg, setup.py, or pyproject.toml.

Here’s how the same setting looks in each of these files for, say, dependency management.

a. setup.cfg


[options]
install_requires =
    docutils
    requests <= 0.4

b. setup.py


setup(
    # ...
    install_requires=["docutils", "requests <= 0.4"],
    # ...
)

c. pyproject.toml


[project]
# ...
dependencies = [
    "docutils",
    "requests <= 0.4",
]
# ...

2. MANIFEST.in

Files such as data tables, images, and documentation are non-Python resource files that are often left out of a Python package unless you list them explicitly. MANIFEST.in is the mechanism that gets them included in the package.

MANIFEST.in is a list of relative file paths or globs specifying the files that must be included in the package. For those files to be installed with the package as well, you’ll need to supply include_package_data=True to the setup() function.

For instance:


include README.rst
include docs/*.txt
include funniest/data.json


3. Zip_safe flag

The zip_safe flag tells setuptools whether your project can be installed and run from a zip file or must be unpacked into a directory. Python packages can run directly from a zip file, and the project’s zip_safe flag determines the default choice.

You will want to set zip_safe=False if your package reads its data files through ordinary filesystem paths, since not all packages are capable of running in compressed form.
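To illustrate why compressed packages are a concern, here is a minimal sketch that builds a tiny zip archive, imports a package from it, and reads a data file. The names zipped_pkg and greeting.txt are hypothetical, and the example uses importlib.resources.files(), which assumes Python 3.9 or later.

```python
import importlib
import importlib.resources
import sys
import tempfile
import zipfile
from pathlib import Path

# Write a tiny package straight into a zip archive.
# "zipped_pkg" and "greeting.txt" are hypothetical names.
archive = Path(tempfile.mkdtemp()) / "bundle.zip"
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("zipped_pkg/__init__.py", "")
    zf.writestr("zipped_pkg/greeting.txt", "Hello from a zip!")

# Python can import packages directly from a zip file on sys.path.
sys.path.insert(0, str(archive))
importlib.invalidate_caches()
import zipped_pkg

# __file__ points inside the archive, so it is not a real file on disk.
exists_on_disk = Path(zipped_pkg.__file__).exists()

# importlib.resources reads the data through the import system instead.
resource = importlib.resources.files("zipped_pkg").joinpath("greeting.txt")
text = resource.read_text(encoding="utf-8")
```

Code that opens files relative to __file__ would fail here, which is the kind of package that is not safe to run from a zip.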

4. Importlib.resources

importlib.resources is a modern and reliable approach to loading data files. It replaces the older hack of locating a non-Python file relative to the __file__ variable, which points to the current Python module as a file in the filesystem.

This approach makes use of Python’s import system to provide access to resources residing within packages. They can be accessed only if you can import a package. Once you do, you can read them in text or binary mode.

Loaders that wish to support resource reading should implement a get_resource_reader(fullname) method, as specified by importlib.resources.abc.ResourceReader. The module also defines two types that are used throughout its functions:

a. importlib.resources.Package

It is defined as Union[str, ModuleType]. This means that where a function is described as accepting a package, you can pass in either a string or a module.

b. importlib.resources.Resource

It describes the resource names that are passed into the various functions in this package. It’s defined as Union[str, os.PathLike].

Here’s a list of the functions available:

1. importlib.resources.files(package)

It returns an importlib.resources.abc.Traversable object that represents the resource container for the package and its resources.

2. importlib.resources.as_file(traversable)

Given an importlib.resources.abc.Traversable object that represents a file, typically one obtained from importlib.resources.files(), it returns a context manager to be used in a “with” statement. The context manager provides a pathlib.Path object.

3. importlib.resources.open_binary(package, resource)

It opens the resource within the package for binary reading.

4. importlib.resources.open_text(package, resource, encoding='utf-8', errors='strict')

It opens the resource within the package for text reading. By default, the resource is opened for reading as UTF-8.

5. importlib.resources.read_binary(package, resource)

It reads and returns the resource contents within the package as bytes.

6. importlib.resources.contents(package)

It returns an iterable over the named items within the package. The iterable yields “str” names of both resources (for example, files) and non-resources (for example, directories), and it does not recurse into subdirectories.

7. importlib.resources.is_resource(package, name)

It returns True if there’s a resource named “name” in the package. Otherwise, it returns False.

8. importlib.resources.path(package, resource)

It returns the path to the resource as an actual file system path. More precisely, it returns a context manager to be used in a “with” statement, and that context manager provides a pathlib.Path object.

9. importlib.resources.read_text(package, resource, encoding='utf-8', errors='strict')

This function reads and returns the resource content within the package as “str”. The contents are read as strict UTF-8 by default.
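As a rough sketch of the first two functions, the example below creates a throwaway package on disk (demo_pkg and schema.json are hypothetical names), then reads a bundled JSON file with files() and obtains a real path with as_file(). It assumes Python 3.9 or later.

```python
import importlib
import importlib.resources
import json
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk so the example is self-contained.
# "demo_pkg" and "schema.json" are hypothetical names.
tmp_root = Path(tempfile.mkdtemp())
pkg_dir = tmp_root / "demo_pkg"
(pkg_dir / "data").mkdir(parents=True)
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "data" / "schema.json").write_text('{"version": 1}')

# Make the package importable.
sys.path.insert(0, str(tmp_root))
importlib.invalidate_caches()

# files() returns a Traversable; "/" joins path segments like pathlib.
resource = importlib.resources.files("demo_pkg") / "data" / "schema.json"
schema = json.loads(resource.read_text(encoding="utf-8"))

# as_file() provides a real filesystem path for APIs that require one.
with importlib.resources.as_file(resource) as path:
    on_disk = path.exists()
```

The package is addressed by its import name, so the code never needs to know where demo_pkg was installed.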

Here’s a brief example to understand it all:

Let’s assume that there is a plain text file alongside a Python module in a Python package. We begin by creating a simple project with Poetry:


poetry new --src hello
cd hello
poetry install


Create two files in src/hello: greeting.txt and greet.py.

For greeting.txt,

Hello, {recipient}!

For greet.py,


"""Tools for greeting others."""

import importlib.resources


def greet(recipient):
    """Greet a recipient."""
    template = importlib.resources.read_text("hello", "greeting.txt")
    return template.format(recipient=recipient)

Once this is done, launch the Python console with poetry run python. Then, run the following code:


>>> from hello import greet
>>> greet.greet("World")
'Hello, World!\n'
>>> greet.greet("Universe")
'Hello, Universe!\n'
>>>


Keep in mind that the call to importlib.resources.read_text() is what reads the contents of the file specified.

This method of including non-Python resource files into the Python package conveniently uses an import path. You don’t have to focus on the file paths while you load data from files located in other packages. Python will help you find the package and the file in question.

For instance, as long as import hello works, you can call importlib.resources.read_text("hello", ...), irrespective of what script or module you’re currently running.

5. Setuptools package_data directive

package_data is a dict that maps package names (an empty string means all packages) to a list of patterns, which can include globs.

Here’s an example:


from setuptools import setup, find_packages

setup(
    name='your_project_name',
    version='0.1',
    description='A description.',
    packages=find_packages(exclude=['ez_setup', 'tests', 'tests.*']),
    package_data={'': ['license.txt']},
    include_package_data=True,
    install_requires=[],
)

The approach used is highlighted in the following part of the code:


package_data={'': ['license.txt']},
include_package_data=True,


6. Recursive-include

First, create MANIFEST.in in the project root. In it, use recursive-include to pull in a whole directory, or include to add a single file by name.

Below is an example that explains this approach to include non-Python resource files in the Python package:


include LICENSE
include README.rst
recursive-include package/static *
recursive-include package/templates *


7. Setuptools_scm

setuptools_scm is a plugin for setuptools rather than a replacement for setuptools.setup. It adds any data files that are tracked by your VCS, be it git or another, to the package, so a “pip install” from the repository brings those files along.

All you have to do is add two lines to the setup call. No other imports or manual installs are required, because setup_requires tells setuptools to fetch the plugin at build time.

Here’s how:


setup(
    setup_requires=['setuptools_scm'],
    include_package_data=True,
)


The takeaway

The next time you’re working with Python and want to include non-Python resource files while being able to read them, try the methods discussed here. Go through each one to see which works best for you.

FAQs

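1. Where does python setup.py install put a package?

Ans: You run the command from the current directory, which should be the setup script location. The package itself is installed into the site-packages directory of the active Python environment.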

2. Does it matter where Python packages are installed?

Ans: By default, the Python installer for Windows places Python, and therefore its packages, in the user’s AppData directory so that it doesn’t require any administrative permissions. If you are the only user on the system, you may prefer a higher-level directory so that you can locate the packages quickly.

3. How do you give a path to a Python script?

Ans: In Windows, go to My Computer > Properties > Advanced > Environment Variables. Click PATH, add the path to the folder containing your desired Python script, and save your changes. Here are some more ways to set file paths in Python:

  • The pathlib.Path class

  • The os.path module, for example os.path.join()

  • Raw string literals, such as r"C:\data"

  • An escaped backslash (\\) instead of a single \
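As a short sketch of these options, the snippet below builds the same kind of path in each style. The directory and file names are hypothetical.

```python
import os.path
from pathlib import Path, PureWindowsPath

# pathlib joins segments with "/" and uses the right separator for the OS.
schema_path = Path("data") / "schema.json"

# os.path.join does the same job with plain strings.
joined = os.path.join("data", "schema.json")

# A raw string keeps backslashes literal; the escaped form is equivalent.
raw = r"C:\data\schema.json"
escaped = "C:\\data\\schema.json"

# PureWindowsPath parses Windows-style paths on any platform.
file_name = PureWindowsPath(raw).name
```

For files that ship inside a package, the importlib.resources approach from method 4 avoids hard-coded paths altogether.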

4. What is package_data in setup.py?

Ans: package_data lists the data files to ship with each package. Its companion option, include_package_data, by default considers all the non-.py files found inside the package directory as data files.
