Recommendations for offline data analysis?

Hello, we have a production instance that is actively collecting participant data. Our team reports that querying sensor data from the production instance through the Python API is very slow.

Are there any recommendations on how to work with and query an offline / standalone copy of the database for faster analysis? I have replicated the entire CouchDB database to a researcher's workstation, so the full database is available to them locally and the network is removed as a potential bottleneck.

It’s true that an API call can be slow given the amount of data being collected. I imagine that moving the data offline, as you’ve done, will improve speeds. On our end we don’t move data offline permanently, as that would be expensive, but we do cache results using the LAMP-cortex package, which helps. Other than that, you could manually lower the data collection frequency, which reduces the amount of data and should improve query speeds.
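
If it helps to see the caching idea concretely, here is a minimal sketch of an on-disk cache around a slow query. This is not the LAMP-cortex cache or its API, just a generic illustration of the same idea. The name `cached_query`, the stand-in `slow_fetch`, and the parameter names (`participant_id`, `sensor`, `start`, `end`) are all made up for the example; you would pass in whatever function your team already uses to pull sensor data.

```python
import hashlib
import json
import os
import tempfile


def cached_query(fetch, params, cache_dir):
    """Return fetch(**params), reusing a saved JSON result when one exists.

    `fetch` is any callable that returns JSON-serializable data.
    `params` is a dict of keyword arguments for `fetch` (hypothetical names here).
    """
    os.makedirs(cache_dir, exist_ok=True)

    # Key the cache file on the query parameters so each distinct query
    # (participant, sensor, time window) gets its own file.
    key = hashlib.sha256(
        json.dumps(params, sort_keys=True).encode("utf-8")
    ).hexdigest()
    path = os.path.join(cache_dir, key + ".json")

    # Cache hit: read the saved result instead of calling the slow function.
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)

    # Cache miss: run the slow query once.
    result = fetch(**params)

    # Write to a temporary file first, then rename, so an interrupted run
    # does not leave a truncated cache entry behind.
    fd, tmp_path = tempfile.mkstemp(dir=cache_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(result, f)
    os.replace(tmp_path, path)
    return result


if __name__ == "__main__":
    # Stand-in for the real (slow) API call; replace with your own function.
    def slow_fetch(participant_id, sensor, start, end):
        print("querying the server...")
        return [{"participant": participant_id, "sensor": sensor, "timestamp": start}]

    query = {"participant_id": "U0000", "sensor": "example_sensor", "start": 0, "end": 1000}
    cache_dir = tempfile.mkdtemp()
    first = cached_query(slow_fetch, query, cache_dir)   # calls slow_fetch
    second = cached_query(slow_fetch, query, cache_dir)  # served from disk
```

One caveat with this kind of cache: since your production instance is still collecting data, a cached result is only safe to reuse for a time window that has already closed. For windows that extend to "now" you would want to skip the cache or expire entries.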