Jeff Huang · Extracting Data from Tracking Devices

Many consumer personal tracking devices seem to have a shelf life of only a couple of years. So if you're interested in keeping a long-term history of your progress, you have to figure out how to work around their apps to get your own data back from their servers. Otherwise, the day their app or servers stop working, your data will simply disappear.

I've used two tracking devices whose data was not easily exportable: the Microsoft Band (shut down May 2019) and the Hello Sense (shut down June 2017, without ever sending the data export instructions they said were forthcoming). So I'm documenting the process I went through to retrieve my own data, in the hope that it's useful for others trying to do the same with other devices.

Microsoft Band (documented in December 2014)

When the Microsoft Band was announced, I was thrilled to discover the first wrist-worn device to have both a heart-rate sensor and GPS, plus a slew of other sensors. My Ph.D. student Alexandra managed to snag a Band when they were hard to find, but I was disappointed to learn that it suffered from the same problem that plagued so many promising wearable devices: the inability to export my own minute-by-minute data.

The Band syncs to its own smartphone app called Microsoft Health, but it was clear after a bit of searching online that no one knew a way to get their data out. I asked someone who worked on the Band at Microsoft Research whether he knew of a creative way to get data out of it, and he challenged, "You're right that we don't expose raw data at this point, but looking forward to seeing what you come up with... :)"

Where is the Data?

I asked Alexandra to dump the data from the phone app to find out how it was stored. She managed to export a bunch of files, but after digging around we found only cached daily summaries, not the raw minute-by-minute values. I knew those were being stored somewhere, because the sleep chart in the Microsoft Health app showed finer-grained data (left screenshot below).

I decided to dig further and decompiled the app to understand where the data went by reading the app code. I used an app on my phone called ES File Explorer to get the application package (the apk file) from the phone to my computer (right screenshot above).

An apk is just a zip file, and here's what the Microsoft Health apk file looked like when unzipped.
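Because an apk is an ordinary zip archive, this step can also be scripted. The following is a minimal Python sketch rather than the procedure used in this article; the function names are made up for illustration.

```python
import zipfile


def list_apk_entries(apk_file):
    """Return the names of all files stored inside an apk (a zip archive)."""
    with zipfile.ZipFile(apk_file) as apk:
        return apk.namelist()


def extract_dex(apk_file, out_dir):
    """Extract classes.dex into out_dir and return the path of the extracted file."""
    with zipfile.ZipFile(apk_file) as apk:
        return apk.extract("classes.dex", path=out_dir)
```

Both functions accept either a path to the apk copied off the phone or an open binary file object.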

The main code for the app is in a file called classes.dex, which is a Dalvik Executable file: basically Java code compiled to Android's bytecode format. The file format is fairly well defined, and I was lucky to find an open source tool called jadx to decompile it back into source code.

After decompiling the classes.dex file, I browsed through a few folders and came across this in the "microsoft/" folder, which seemed like the root directory for the Microsoft Health app.

There were hundreds of files inside each of these folders, so I grepped for keywords like "sleep", and then "sleepEvents" when I noticed that was a frequently occurring term.
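The same keyword search can be done without grep. This is a rough Python equivalent under the assumption that the decompiled sources are .java files in a local folder; the function name and the default extension filter are illustrative choices.

```python
import os


def grep_tree(root, keyword, extensions=(".java",)):
    """Yield (path, line_number, line) for every line under root that
    contains keyword, ignoring case."""
    needle = keyword.lower()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            # Decompiled output can contain odd bytes, so never fail on decoding.
            with open(path, encoding="utf-8", errors="replace") as source:
                for number, line in enumerate(source, start=1):
                    if needle in line.lower():
                        yield path, number, line.rstrip("\n")
```

Calling list(grep_tree("microsoft", "sleepEvents")) would collect every match under the decompiled "microsoft/" folder.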

The "aha" moment was when I encountered this getSleepEvents function in /src/com/microsoft/krestsdk/services/KRestServiceV1.java

Clearly, to get sleep events, the app is constructing a REST call. So what must be happening is the Band syncs with the app on the phone, which syncs with some servers that Microsoft owns. This explained why we were not able to find the raw data in the file dump of the app, as only the cached data was stored locally.

Intercepting Messages

My next intuition was to try and intercept the data between the app on my phone and the Microsoft server to see what was being transmitted. This is usually done by using a proxy in an application, so I first tried enabling the proxy on my Android phone (left screenshot below).

After a bit of testing, it was clear that the proxy feature in Android only affects the web browser, and not the Microsoft Health app. So I tried a different trick: setting the gateway (i.e., router) for the phone's wifi to be my computer instead of using DHCP, so that all the network data would be sent to my computer. I edited this setting (right screenshot above) and enabled IP forwarding so network packets could still reach the Internet instead of hitting my computer and getting lost.

Next I checked that the browser was still working on my phone, and it was, which was a good sign. Then came the tricky part: I set up a packet filter to forward the incoming packets to a different port on my computer. In a new .conf file,

Then I installed a traffic inspector (an open source tool called mitmproxy) and fiddled with the flags until I figured out how to activate the transparent proxy mode.

So this basically simulates a man-in-the-middle attack to intercept the data. Note that this is only capturing the data sent and received by my phone, so it's not really an attack in the usual sense but just a way for me to view the data my phone is already dealing with.

I was reassured to see traffic being routed through my mitmproxy console when I visited websites. However, when I started up my Microsoft Health app, it wouldn't start at all (left screenshot below).

Eventually I figured out that the app was using HTTPS, which runs on a different port. So I made some changes. First, I installed an SSL certificate on my phone so that my phone would trust my computer, which was intercepting the messages (right screenshot above). Then I added a line to the .conf file so that the packet filter would also forward packets on the HTTPS port, and re-ran the pfctl commands.

Basically, instead of the Microsoft Health app communicating directly with the Microsoft server over HTTPS, all communication is routed through my computer. The mitmproxy tool presents its own certificate to the app in place of the server's (which the phone accepts because of the certificate I installed), so it can decrypt and re-encrypt the messages that go through it.

At last, I was able to see the traffic from the Microsoft Health app. Fortunately, the requests were easy to figure out, and data was simple to understand.

Notice that to request the data, the phone issues a REST GET request to a URL like https://prodphseus.dns-cargo.com/v1/Events(eventId='1234567890')?$expand=Sequences
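To make the URL pattern concrete, here is a small Python sketch that builds such a URL for an event ID and pulls event IDs back out of a list of intercepted URLs. The helper names are hypothetical; only the host and the URL shape come from the intercepted traffic.

```python
import re

BAND_HOST = "https://prodphseus.dns-cargo.com"  # host seen in the intercepted requests
_EVENT_ID = re.compile(r"Events\(eventId='([^']+)'\)")


def event_url(event_id):
    """Build the per-event URL in the form the app was observed to request."""
    return f"{BAND_HOST}/v1/Events(eventId='{event_id}')?$expand=Sequences"


def event_ids(urls):
    """Return the event IDs found in intercepted URLs, in order, without duplicates."""
    seen = []
    for url in urls:
        match = _EVENT_ID.search(url)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen
```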

If you are only interested in the data from one event (like last night's sleep), you can stop here: just save the response from the Microsoft prodphseus.dns-cargo.com server. To get a few more events, you can click through every sleep (left screenshot), exercise (right screenshot), or other type of event until your phone (and your computer intercepting the messages) has received all the data. Then save the responses to a file, which you can view in your favorite text editor and process with a script.
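Saving each response by hand gets tedious, so one option is to let mitmproxy do it. Below is a sketch of a mitmproxy script that writes every response from the Band server to its own file. It relies on mitmproxy's convention of calling a module-level response(flow) function for each completed response; the output folder name and the .txt extension are arbitrary choices of mine, and I did not do this at the time.

```python
import os
import re

TARGET_HOST = "prodphseus.dns-cargo.com"  # server seen in the intercepted traffic
OUTPUT_DIR = "band_responses"             # hypothetical output folder


def _safe_name(path):
    """Turn a request path into a string that is safe to use as a filename."""
    return re.sub(r"[^A-Za-z0-9._-]+", "_", path).strip("_") or "root"


def response(flow):
    """mitmproxy event hook, called once for every completed response."""
    if flow.request.pretty_host != TARGET_HOST:
        return  # ignore traffic to other servers
    os.makedirs(OUTPUT_DIR, exist_ok=True)
    filename = os.path.join(OUTPUT_DIR, _safe_name(flow.request.path) + ".txt")
    with open(filename, "wb") as out:
        out.write(flow.response.content or b"")
```

Saved as a file, the script can be loaded with mitmproxy's -s option alongside the transparent proxy mode.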

Getting the Full Export

But what if you don't want to manually go through every entry on your phone to have it transmit the data, and instead want the data for the entire time you had the Band rather than individual events? Then recall the decompiled Java code from earlier in this article, containing:

which provides a clue that to retrieve the full set of data without having to select each entry on the phone, you could edit the URL to:

Simply pasting the URL into a browser wouldn't work because you have to reuse the same authentication token the Microsoft Health app is using, but editing a prior request should let you retrieve your entire raw data stream without retrieving each event one by one.
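Editing a prior request amounts to sending a new URL with the headers the app already used. Here is a minimal Python sketch of that idea using only the standard library. The function names are made up, and the header names and values must be copied from a request captured in mitmproxy; nothing here knows how the app authenticates.

```python
import urllib.request


def replay_request(new_url, captured_headers):
    """Build a GET request for new_url that reuses the headers (including
    whatever carries the authentication token) copied from an intercepted request."""
    return urllib.request.Request(new_url, headers=dict(captured_headers), method="GET")


def fetch(request):
    """Send the request and return the raw response body as bytes."""
    with urllib.request.urlopen(request) as reply:
        return reply.read()
```

For example, fetch(replay_request(edited_url, headers_from_mitmproxy)) would send the edited URL as if it came from the app, where both arguments are values you supply from your own capture.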

To summarize how the Band works, some data is cached in the phone app while the rest is stored on Microsoft servers in the "cloud". By intercepting the phone app's requests to the server, you can download the raw data being sent, or retrieve your entire history of heart rate, GPS, step count, etc. down to the minute level. Happy tracking!

Hello Sense (documented in January 2017)

I was excited about the Hello Sense, a popular Kickstarter project whose mantra is "Know More. Sleep Better." It's a beautifully-designed globe packed with sensors to tell you about your sleep environment, and a movement-tracking clip for your pillow. I was working with my Ph.D. student Nedi on automatically generating sleep recommendations, so we wanted to learn how this device measured up.

We ordered two Hello Sense devices from their website to use for a few months, but alas it was again disappointing that we could not access the data from the sensors. Instead, we could only view the charts that it generated, and were limited by what the Hello Sense app allowed us to see. To tease us, the Kickstarter page promised, "We are building tools to allow you to export, use or delete your data. Press a button and your data will be exported or deleted. It is entirely up to you. These tools will be available on our website. [..] You will be able to download a complete archive of your data." (spoiler: this never happened).

The Hello Sense syncs to its own smartphone app called Sense, but it was clear that there wasn't a way to export this data. In fact, after searching for solutions, all I could find were other users lamenting the lack of data exportability.

Intercepting Messages

I took this as a challenge, and wanted to try the same procedure to intercept messages that I had used with the Microsoft Band. Many apps use a REST call to request a certain slice of the data, so knowing how to do that lets you talk to the server holding your data. Basically, I would intercept the messages between the Sense app and its servers to watch how they authenticated and transmitted my data to the app to make the charts. Then I could learn the "language" and mimic the app to ask for my own data from the servers.

As before, I first enabled the proxy on my Android phone (left screenshot below), pointing it to my MacBook, which was set as the gateway. I again set up a packet filter to forward the incoming packets to a different port on my computer. In a new .conf file,

Then I ran pfctl like before and started mitmproxy in transparent proxy mode to intercept the data.

Finally, I installed an SSL certificate on my phone from http://mitm.it, so that mitmproxy could intercept the messages sent over HTTPS (port 443).

Traffic was correctly routed through my mitmproxy console when visiting websites (see the screenshot above on the right, where I eavesdrop on the phone's Chrome browser navigation). However, when I started up my Hello Sense app, it didn't connect to the server properly and reported, "There was a problem securely connecting to the server."

Disassembling the App

I wanted to get a better sense of what was going on in the app that caused the error to come up. I extracted the apk file from my phone using the Android Debug Bridge this time. As with the Microsoft Band app, the main code is inside the apk in a file called classes.dex.

After poking around the code, it was clear that the app used a library called OkHttp that did some sort of certificate pinning. Basically, there was code in there that checked whether the SSL certificate was the right one, and threw an exception if it wasn't. At this point, one option was to disassemble the app and remove the certificate pinning checks (note that the decompiled Java can't simply be recompiled into an app, so instead it needs to be disassembled into smali and edited).
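To illustrate why pinning defeats the interception trick, here is a conceptual Python sketch. It is not the app's code or OkHttp's implementation: it only mimics the idea that the client ships with a hash of the key it expects and refuses anything else. OkHttp writes its pins as "sha256/" followed by a base64 hash of the certificate's public key info, and the sketch borrows that notation; the byte strings stand in for real keys.

```python
import base64
import hashlib


class PinMismatch(Exception):
    """Raised when the presented key does not match any pinned hash."""


def pin_for(public_key_bytes):
    """Compute a pin string: 'sha256/' plus the base64 SHA-256 of the key bytes."""
    digest = hashlib.sha256(public_key_bytes).digest()
    return "sha256/" + base64.b64encode(digest).decode("ascii")


def check_pin(public_key_bytes, pinned):
    """Accept the connection only if the presented key hashes to a pinned value."""
    presented = pin_for(public_key_bytes)
    if presented not in pinned:
        raise PinMismatch("unexpected certificate key: " + presented)
    return True
```

An interception proxy presents a certificate with its own key, so its hash is not in the pinned set and the check fails even though the phone trusts the proxy's certificate.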

But while I was looking into the source code, I came across a file called ApiService.java that showed the REST API queries that the app was making to retrieve data from the server. So theoretically, all we had to do was issue the same queries as if we were the app, and the server would send us the raw data back!

Notice that to request the data, the phone uses links like /v2/timeline/{date}/events/{type}/{timestamp}. The timeline is in fact exactly what we want to get all the events that happened during a night of sleep for a particular date.

But before we can send our own REST queries, the server asks for authentication using the OAuth protocol. OAuth authentication is done with a client ID and secret. So I searched the source tree for these, and luckily found them in a simple configuration file.

I highlighted the two lines that specify the client ID and secret. This file also tells us the base URL for the REST server, which is https://api.hello.is so we now have all the pieces we need: the hostname of the REST service, the format of the requests, and the client ID and secret. Note there are also a few lines blacked out that I think are secret keys that should probably not be made public.

Sending POST Requests

Sending a GET request is easy with any web browser: just type in the right URL. But crafting a POST request, which is what we needed to do, requires a command line tool like curl or an application that takes care of the annoying bits. I used Postman, which is free and simply designed (and I love application names that are puns).

So I entered the appropriate fields into the POST request, using a pretty standard OAuth format. The URL is https://api.hello.is/v1/oauth2/token based on what we found in the source code.

One thing that tripped me up for a bit was that the Content-Type header in the request needs to be set to "application/x-www-form-urlencoded" or the request will be rejected. Once that is set, the result of the POST request is the access_token, which gives us access to the rest of the data through other queries. In the screenshot below, I covered part of my access token to prevent people from snooping on my sleep data.
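For anyone who prefers a script to Postman, the token request can be sketched in Python with the standard library. The URL and the Content-Type header are as described above. The form field names are an assumption on my part: they follow the standard OAuth 2.0 password grant and may not match exactly what the Sense server expected. I also assume the response body is JSON containing the access_token.

```python
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://api.hello.is/v1/oauth2/token"


def build_token_request(client_id, client_secret, username, password):
    """Build the POST request that asks the server for an access token."""
    form = urllib.parse.urlencode({
        "grant_type": "password",        # assumed: standard OAuth 2.0 password grant
        "client_id": client_id,          # from the app's configuration file
        "client_secret": client_secret,  # from the app's configuration file
        "username": username,            # assumed field name for the account login
        "password": password,            # assumed field name
    }).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return urllib.request.Request(TOKEN_URL, data=form, headers=headers, method="POST")


def extract_access_token(response_body):
    """Pull access_token out of the response, assuming it is a JSON object."""
    return json.loads(response_body)["access_token"]
```

Passing the built request to urllib.request.urlopen would send it; building it separately makes it easy to inspect before anything goes over the network.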

Now the fun part. I use the access token in the header as the Authorization field (the word "Bearer" needs to be prepended to it to indicate that it's a Bearer token type). And now I can change the URL to what I want, in this case to http://api.hello.is/v1/room/current to see my current room conditions from anywhere.
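The same authenticated query can be built in a few lines of Python. This is a sketch with a made-up function name; it uses the https base URL from the configuration file and the Bearer header format described above.

```python
import urllib.request

API_BASE = "https://api.hello.is"  # base URL found in the app's configuration file


def build_api_request(path, access_token):
    """Build a GET request for an API path, authenticated with a Bearer token."""
    headers = {"Authorization": "Bearer " + access_token}
    return urllib.request.Request(API_BASE + path, headers=headers, method="GET")
```

For instance, build_api_request("/v1/room/current", token) produces the room conditions query, ready to pass to urllib.request.urlopen.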

If you really wanted to do something with your data, you could write your own script to automatically authenticate and then grab several days of data, maybe to create an online dashboard or send yourself sleep recommendations.
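As a starting point for such a script, here is a sketch that produces the timeline paths for several consecutive nights and fetches each one. Two things are assumptions: that a whole night is served at /v2/timeline/{date} (the prefix of the path pattern found in ApiService.java), and that {date} is written as YYYY-MM-DD. The fetch argument is any function you supply that takes a path and returns the server's response.

```python
from datetime import date, timedelta


def timeline_paths(last_night, nights):
    """Return timeline paths for the given number of nights, newest first."""
    paths = []
    for offset in range(nights):
        day = last_night - timedelta(days=offset)
        # Assumed date format: ISO 8601 (YYYY-MM-DD).
        paths.append(f"/v2/timeline/{day.isoformat()}")
    return paths


def collect_nights(fetch, last_night, nights):
    """Call fetch(path) for each night and return a {path: response} dictionary."""
    return {path: fetch(path) for path in timeline_paths(last_night, nights)}
```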

Probably the most detailed data the Hello Sense stores is how it classifies my time during the night, as awake, medium sleep, sound sleep, etc. This is the timeline that is shown as bar charts in the app, but now we have access to the actual data to generate our own visualizations, or do comparisons and analyses. There are quite a few events during one night, but here's what it looks like.

So the good news is that the Hello Sense already had an API of sorts for data export. It took a bit of sleuthing to figure out how to get access to the data, but I imagine they had not released it publicly because they wanted to provide a nicer interface to the API first.

Anyways, I hope this documentation is useful for someone trying to export their data from whatever wearable or tracking app is available now. If you want to know what we did with the data, my student Jina Yoon wrote an article comparing 10 different sleep tracking devices and apps. But for the time being, I'm going to stick with tracking apps that are more open.


How I Got My Data Back from the Microsoft Band and Hello Sense

Extracting Data from Tracking Devices

By Jeff Huang, updated May 15, 2021
