Reading and writing files to Dropbox using Python

2019-09-14
Python Dropbox API Pandas




When developing with Python, you usually need to store some data somewhere. If that data is large or contains personal information, storing it on GitHub (or another git provider) is not advisable. Dropbox is one good alternative.

1. Using Dropbox with Python

Dropbox has a really nice package that you can install with

pip install dropbox

It is not really difficult to use, but I noticed that every time I needed it I had to dig through old code. So I decided to write a post that explains everything.

The first thing to do is create an app inside Dropbox, since you cannot get a token without one. To do so, go to Dropbox developers.

[Screenshot: dropbox_create_app]

Register a new app that uses the Dropbox API and can access only its own app folder. Once you have created the app, go to its settings page and generate a token.

[Screenshot: dropbox_get_token]

You can now store that secret in a safe way (for example as an environment variable or a hidden file).
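As a rough sketch of both options, the helper below looks for the token in an environment variable first and falls back to a hidden file. The names load_dropbox_token, DROPBOX_TOKEN and ~/.dropbox_token are assumptions made for this example, not something Dropbox prescribes.

```python
import os
from pathlib import Path


def load_dropbox_token(env_var="DROPBOX_TOKEN", token_file=Path.home() / ".dropbox_token"):
    """Return the token from the environment, falling back to a hidden file."""
    # Option 1: environment variable (hypothetical name DROPBOX_TOKEN)
    token = os.environ.get(env_var, "").strip()
    if token:
        return token

    # Option 2: hidden file in the home folder (hypothetical location)
    token_file = Path(token_file)
    if token_file.is_file():
        return token_file.read_text(encoding="utf-8").strip()

    raise RuntimeError(f"No token found in ${env_var} or in {token_file}")
```

Whichever option you pick, keep the file or variable out of version control.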

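2. Writing files to Dropbox

The first thing you need to do is initialize the Dropbox client:

import io
import os

import dropbox
import pandas as pd
import yaml

# Assumption: the token is stored in an environment variable named DROPBOX_TOKEN
token = os.environ["DROPBOX_TOKEN"]

DBX = dropbox.Dropbox(token)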

After creating the DBX instance you can upload files using DBX.files_upload.

2.1. Write a text file

The key to uploading files is to pass the content as bytes, here through an io.BytesIO object.

txt = "Hello World"

with io.BytesIO(txt.encode()) as stream:
    stream.seek(0)

    # Write a text file
    DBX.files_upload(stream.read(), "/test.txt", mode=dropbox.files.WriteMode.overwrite)

By default the upload does not replace an existing file. To allow overwriting you need to pass mode=dropbox.files.WriteMode.overwrite to DBX.files_upload.

Important: paths must start with /. The upload fails without it.
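Since a missing leading slash is an easy mistake to make, a small helper can normalize names before they reach the API. This is only a sketch, and to_dropbox_path is a hypothetical name, not part of the dropbox package.

```python
def to_dropbox_path(name: str) -> str:
    """Return name as a Dropbox path with exactly one leading slash."""
    cleaned = name.strip().lstrip("/")
    if not cleaned:
        raise ValueError("A file name is required")
    return "/" + cleaned


print(to_dropbox_path("test.txt"))        # /test.txt
print(to_dropbox_path("/data/test.txt"))  # /data/test.txt
```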

2.2. Write a YAML/JSON file

To write a dictionary-like file you can use the following:

data = {"a": 1, "b": "hey"}

with io.StringIO() as stream:
    yaml.dump(data, stream, default_flow_style=False)

    stream.seek(0)

    DBX.files_upload(stream.read().encode(), "/test.yaml", mode=dropbox.files.WriteMode.overwrite)

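It is important to run stream.seek(0) to go back to the beginning of the stream. After yaml.dump the position is at the end, so stream.read() would otherwise return an empty string.

This time we encode the contents of the stream to turn the string into bytes.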
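The heading mentions JSON, but the example only covers YAML. Here is a minimal sketch of the JSON case. json_to_bytes is a hypothetical helper name, and the upload call is left as a comment because it needs the DBX client and a real token.

```python
import json


def json_to_bytes(data) -> bytes:
    """Serialize data to pretty-printed JSON and encode it as UTF-8 bytes."""
    return json.dumps(data, indent=2, sort_keys=True).encode("utf-8")


payload = json_to_bytes({"a": 1, "b": "hey"})

# With the DBX client from this post, the upload would then be:
# DBX.files_upload(payload, "/test.json", mode=dropbox.files.WriteMode.overwrite)
```

No stream is needed here because json.dumps already returns the whole document as a string.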

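2.3. Write an Excel file

df = pd.DataFrame([range(5), list("ABCDE")])

with io.BytesIO() as stream:

    # The workbook is saved automatically when the inner with block exits
    with pd.ExcelWriter(stream) as writer:
        df.to_excel(writer)

    # getvalue() returns the whole buffer regardless of the current position
    DBX.files_upload(stream.getvalue(), "/test.xlsx", mode=dropbox.files.WriteMode.overwrite)

The key is to use the ExcelWriter from pandas, which can write to an in-memory stream instead of a file on disk.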

3. Reading files

To read a file you can use DBX.files_download. It returns a tuple: the file metadata first and the HTTP response second. The file content is available as bytes in the content attribute of the response.
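To make the two return values concrete, here is a small sketch of a helper that keeps only the bytes. download_bytes is a hypothetical name, and dbx is assumed to be a dropbox.Dropbox instance such as DBX. The stand-in classes at the bottom exist only so the snippet runs without a token.

```python
def download_bytes(dbx, path: str) -> bytes:
    """Download path and return the file content as bytes."""
    # files_download returns (metadata, response); the bytes live in response.content
    _metadata, response = dbx.files_download(path)
    return response.content


# Stand-in objects (hypothetical) that mimic the shape of the real return value
class FakeResponse:
    content = b"Hello World"


class FakeClient:
    def files_download(self, path):
        return {"path": path}, FakeResponse()


print(download_bytes(FakeClient(), "/test.txt").decode())  # Hello World
```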

3.1. Read a text file

_, res = DBX.files_download("/test.txt")

with io.BytesIO(res.content) as stream:
    txt = stream.read().decode()

Remember to decode the content to turn it from bytes into a string.

3.2. Read a YAML/JSON file

_, res = DBX.files_download("/test.yaml")

with io.BytesIO(res.content) as stream:
    data = yaml.safe_load(stream)

You should always use yaml.safe_load instead of yaml.load, since yaml.load can construct arbitrary Python objects when used with an unsafe loader.

3.3. Read an Excel file
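For a JSON file the reading side is even shorter. This is a sketch with a hypothetical helper name, json_from_bytes, and the content variable stands in for res.content from DBX.files_download.

```python
import json


def json_from_bytes(content: bytes):
    """Decode UTF-8 bytes and parse them as JSON."""
    return json.loads(content.decode("utf-8"))


# Stand-in for: _, res = DBX.files_download("/test.json"); content = res.content
content = b'{"a": 1, "b": "hey"}'

data = json_from_bytes(content)
print(data)  # {'a': 1, 'b': 'hey'}
```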

_, res = DBX.files_download("/test.xlsx")

with io.BytesIO(res.content) as stream:
    df = pd.read_excel(stream, index_col=0)

Pass index_col=0 so that the index written by to_excel is read back as the index, instead of pandas adding a new dummy one.

4. Deleting files

To delete a file simply call DBX.files_delete(filename). Newer versions of the package mark this method as deprecated in favor of DBX.files_delete_v2(filename).

With this post you should have enough to work with Dropbox using Python. There may be other file formats you want to work with, but they should work in a very similar way.
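As one example of another format, here is a sketch of the same pattern applied to CSV with pandas. The file name /test.csv and the column names are made up for the example, and the Dropbox calls are left as comments because they need a real token.

```python
import io

import pandas as pd

df = pd.DataFrame({"number": range(5), "letter": list("ABCDE")})

# Writing: serialize to CSV text, then encode it to bytes for the upload
payload = df.to_csv(index=False).encode("utf-8")
# DBX.files_upload(payload, "/test.csv", mode=dropbox.files.WriteMode.overwrite)

# Reading: wrap the downloaded bytes in a stream and let pandas parse it
# _, res = DBX.files_download("/test.csv")
content = payload  # stand-in for res.content
with io.BytesIO(content) as stream:
    df_read = pd.read_csv(stream)
```

The structure is the same as for the other formats: turn the object into bytes to upload it, and wrap the downloaded bytes in io.BytesIO to read it back.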