We've been working in collaboration with Which? to review the Furby Connect from Hasbro, which is currently priced at around £32.00, and comes with a smartphone app that offers to "connect you to a world of surprises."
The idea of Furbies being sold with companion apps is not a new one: the Furby Connect's predecessor, the Furby Boom, also featured an accompanying app, although communication between it and the Furby was accomplished by means of high-frequency audio. This time around, Hasbro have equipped the Furby Connect with a Bluetooth Low Energy (BLE) connection, allowing it to interface more reliably with its companion app - named "Furby Connect World" - which is available for both Android and iOS.
We powered up our Furby and installed the app on an Android device. Aside from offering a Tamagotchi-like experience raising "furblings" and providing several new ways to interact with your Furby Connect toy (the Food Cannon, for example, can be used to feed it snacks), the app also acts as an update client, querying Hasbro's AWS endpoints for new Furby downloadable content. This content is distributed in the form of proprietary DLC files, and seemed to contain new songs, dances, and actions for the Furby Connect to perform. If any new content is found, the associated DLC file is downloaded by the app, then pushed to the Furby Connect over its BLE connection. As DLC files can be as big as 2MB and BLE is characteristically slow, this can take a long time; however, the app hides this fact by uploading DLC files in the background while the user plays the game.
By sniffing the BLE connection during one such DLC update, we immediately discovered that the security situation was bad. Right off the bat, none of the standard Bluetooth LE security features (e.g. authenticated pairing or link encryption) were in use by either the app or the Furby Connect. This meant that anyone within range of the communication could intercept unencrypted packets, inject their own content, or establish their own connection with the toy - all without any physical interaction required on the part of the user or the attacker.
Furthermore, we also observed that the Furby Connect was exposing a number of services in addition to those involved in receiving DLC updates, including one whose UUID matched that normally associated with the Nordic Semiconductor Over-The-Air Direct Firmware Update service (DFU OTA.) This is typically used by Nordic Semiconductor devices to receive firmware updates over a BLE connection, and has a newer version (which supports signature-based verification of firmware updates) and an older version (which doesn't.) The UUID matched the older version, meaning that anyone within range could in theory connect to the device and push their own unsigned firmware updates to it.
A Chilli Conundrum
Finding these vulnerabilities didn't take very long, so we started having a look at what a mischievous attacker could do to a Furby once connected. Thankfully, we soon found the excellent bluefluff project on GitHub (credit to Florian Euchner aka Jeija.) The project documents much of the Furby Connect's BLE protocol, and has a tool for communicating with the Furby via its BLE connection. With a few modifications, the tool allowed us to flash the Furby Connect with a legitimate Hasbro DLC file and trigger the various animations contained in it by sending it 4-byte action commands.
However, our work was not done here. The bluefluff documentation identified the format used for audio data within the DLC files, and provided a simple script for writing audio samples into existing Hasbro DLCs. However, much of the DLC format remained a mystery, and a means by which to reliably add variable-length audio data to a DLC - or to modify or remove existing audio data - had yet to be found. Jeija had also identified the existence of a single Furby eye animation involving a chilli within the DLC, which can be seen being played part-way through one of the Furby Connect promotional videos, but had not yet found a way to play back the chilli animation once it was loaded onto the Furby.
We later found a second repository on GitHub which seemed to show a Furby playing modified eye animations, however it appeared that this had been achieved by modifying the Furby's internal storage via a hardware hack, rather than by uploading and triggering a DLC.
If we were able to find a way to convince our Furby to play back the chilli animation, we'd have a means to create and upload our own animations, as we could then simply overwrite the chilli animation with one of our own choosing. Frustratingly, the chilli animation appeared to be the only one of its kind - no other DLCs seemed to contain custom eye animations, which is possibly why all the GitHub projects we'd seen had also identified the chilli animation as being important.
We therefore set out our goals as follows:
- Reverse-engineer the Furby DLC file format
- Gain insight into how audio samples were packed into a DLC
- Gain insight into how commands sent over BLE could trigger these audio samples
- Understand the means by which a particular sequence of actions (involving eye animations, audio, and motion) was described within a DLC
- Find a way to modify one of these sequences to trigger the chilli animation
- Overwrite the chilli animation with an animation of our choosing
- Play back custom animations and audio on our Furby!
The tools we used to reverse engineer Hasbro's Furby DLC files were:
- A hex editor. Nothing fancy; this was just the one that came bundled with my Linux distribution.
- A Python interpreter, for writing up scripts to encapsulate understanding as and when we made progress.
Once we'd gathered a small set of DLC files from Hasbro's AWS endpoints, we were ready to get started. One technique I find quite helpful when kicking off a file format reversing task is asking myself "if I was a file parser, how would I approach this?" Parsers often start reading files from the beginning, so that's exactly what we did.
Here's the first few bytes of one of our DLCs, as seen in a hex editor. Just by eyeballing this bit of hex, a number of things spring out as interesting. The first 30 bytes here seem to form the word "FURBY" as a unicode string. This turned out to be true of all DLC files we observed, and appears to be the file signature for Furby Connect DLC files. Next, we see a couple of lines which clearly contain the word "DLC" followed by a string of zeroes, e.g. DLC_0000. It turned out that the spacing between each of these words was always a multiple of 38, which led us to believe that they were entries which could either be filled, or left empty (i.e. filled with null bytes.)
The entries each seemed to take the following form:
- A unicode string containing the characters DLC_0000.
- A three-character identifier e.g. PAL, SPR
- A little-endian dword, whose value appeared to increment by 26 for each successive entry
- A second dword, whose value varied from entry to entry
- Four null bytes.
In the DLC shown above, a total of 9 entries were identified, each carrying a different 3-character identifier (PAL, SPR, CEL, XLS, AMF, APL, LPS, SEQ, and MTR.)
It seemed that this initial section was something of a file header, which provided a limited description of the various sections within the DLC file. The second dword in each entry appeared to correspond to the length of the section it described, which we quickly confirmed by taking the length of this header, adding all of these values to it, and checking that the total matched the length of the file.
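The length check described above can be sketched in a few lines of Python. This is only an illustration: the function names are our own, the numbers are made up rather than taken from a real DLC, and the second helper assumes (without confirmation from the file format) that sections are stored back to back in header order.

```python
def sections_account_for_file(header_len, section_lengths, file_len):
    """Return True if the header plus every section length equals the file size."""
    return header_len + sum(section_lengths) == file_len


def section_ranges(header_len, section_lengths):
    """Return (start, end) byte ranges, ASSUMING sections follow the header back to back."""
    ranges = []
    start = header_len
    for length in section_lengths:
        ranges.append((start, start + length))
        start += length
    return ranges


# Hypothetical values, for illustration only.
header_len = 0x100
lengths = [0x40, 0x200, 0xC00]
file_len = 0xF40

print(sections_account_for_file(header_len, lengths, file_len))
print(section_ranges(header_len, lengths))
```

If the check fails for a given file, either the header length or the interpretation of the length field is wrong, which makes it a cheap sanity test to run before digging any deeper.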
But what did each section contain? At this stage we were unsure, but we could make some rough assumptions based on the 3-character identifiers.
- PAL - "Palettes" (colour maps?)
- SPR - "Sprites" (images?)
- CEL - "Cels" (animation frames?)
- XLS - "eXecution List" (command definitions?)
- AMF - "Audio Media Files" (audio samples?)
- APL - "Audio Player" (more audio samples?)
- LPS - "Lips" (mouth movement definitions?)
- SEQ - "Sequence" (?)
- MTR - "Motor" (servo movement definitions?)
This didn't seem entirely right. Why would the mouth motion definitions be separate from the rest of the servo motion definitions? And why would there be two separate audio sections? Nonetheless, we now had a basic understanding of how the file was structured: as a container, composed of a number of individual sub-sections, each describing a different part of the Furby's responses.
Nibbles and Bytes
Rather than just continuing sequentially through the file, we could now focus on specific sections of interest. The XLS ("eXecution List") section, though quite important-sounding, was probably also going to be quite complex. A good approach to reversing most things seems to be to start with the simplest parts, then build up to the more sophisticated parts from there, one piece at a time. Following this approach, we chose to leave the XLS section for later, and instead started out on the AMF ("Audio Media Files") section, which turned out to be arranged in the following rather straightforward format:
- A four-byte integer, giving the number of audio samples contained in the section.
- A sequence of four-byte integers, giving the offset to the start of each audio sample from the beginning of the section.
- A sequence of variable-length audio samples encoded with the GeneralPlus A1800 codec.
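A layout like this is straightforward to walk with Python's struct module. The sketch below is not the script we used: `parse_amf` and `offset_unit` are illustrative names, and it assumes little-endian fields, samples stored contiguously in offset order, and offsets counted in bytes unless `offset_unit` says otherwise.

```python
import struct


def parse_amf(section, offset_unit=1):
    """Split an AMF-style section into raw (still encoded) audio samples.

    Assumptions: 32-bit little-endian count and offsets, offsets measured
    from the start of the section in units of `offset_unit` bytes, and each
    sample running up to the start of the next one.
    """
    (count,) = struct.unpack_from("<I", section, 0)
    offsets = struct.unpack_from("<%dI" % count, section, 4)
    starts = [o * offset_unit for o in offsets]
    ends = starts[1:] + [len(section)]
    return [bytes(section[s:e]) for s, e in zip(starts, ends)]


# Synthetic section: two "samples" of 3 and 5 bytes after a 12-byte table.
demo = struct.pack("<III", 2, 12, 15) + b"abc" + b"defgh"
print(parse_amf(demo))
```

The samples come back as opaque byte strings; decoding the A1800 audio itself is a separate job and is not attempted here.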
This general format seemed to be replicated in the APL section, which turned out to contain sequences of ordered references to samples in the AMF section. We therefore started referring to this section as the "Audio Playlist" section. Its arrangement was as follows:
- A two-byte integer, giving the number of playlists contained in the section.
- Another two-byte integer, which seemed to be the number of entries plus the constant value 0x546, and likely referred to an area of memory into which audio samples were stored.
- A four-byte integer, which seemed to be a version number (normally 0x04)
- A sequence of four-byte integers, each giving the offset to the start of each audio playlist from the beginning of the section.
- The audio playlist data itself. Each playlist consisted of an array of 2-byte integers, terminated by the value 0xf000, with odd-numbered (1st, 3rd, 5th...) elements referring to audio sample numbers (from the AMF section), and even-numbered (2nd, 4th, 6th...) elements giving the pause length between samples.
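To make the playlist layout concrete, here is a small sketch that decodes a single playlist once its 16-bit words have been read out of the section. The function name is ours, the sample values are invented, and the unit of the pause values is not something we claim to know.

```python
PLAYLIST_END = 0xF000  # terminator value described above


def decode_playlist(words):
    """Turn one playlist's 16-bit words into (sample_number, pause) pairs."""
    body = []
    for word in words:
        if word == PLAYLIST_END:
            break
        body.append(word)
    else:
        raise ValueError("playlist is missing its 0xf000 terminator")
    # 1st, 3rd, 5th... words are sample numbers; 2nd, 4th, 6th... are pauses.
    samples = body[0::2]
    pauses = body[1::2]
    # A trailing sample with no pause word after it gets pause=None.
    pauses += [None] * (len(samples) - len(pauses))
    return list(zip(samples, pauses))


# Invented example: play sample 3, pause 10, play sample 7, pause 0.
print(decode_playlist([3, 10, 7, 0, 0xF000]))
```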
We later found that quite a number of sections give offsets in words, hinting that the Furby might possibly be using a 16-bit processor under the hood. In fact, it turned out that the SEQ section - which we came to call the "Sequence" section - followed almost exactly the same format as the APL section:
- A two-byte integer, giving the number of "sequences" contained in the section.
- A four-byte integer, which seemed to be a version number (normally 0x04)
- A list of four-byte integers, each giving the offset to the start of each sequence in words from the beginning of the section.
- A collection of variable-length sequences, each terminated by the value 0x0000.
Although at this stage we weren't entirely certain what each word in a "sequence" actually referred to, it was likely that, as in the APL section, each entry was a reference to another part of the file, given their markedly similar layouts and grammar.
To better understand what role the SEQ section played, we modified a DLC file such that all the four-byte offsets at the beginning of the SEQ section were set to the same value, then uploaded this DLC to the Furby. The result was a Furby Connect which, regardless of the action command received over its BLE connection, would only play one sequence of motions, eye animations, and audio samples - but played that one sequence with no other irregularities. We therefore determined that the SEQ section was a sort of "tie-it-all-together" section, with each "sequence" entry describing a set of motions, audio, and eye animations that formed the complete response to a particular action command. With this additional insight, we were able to determine that one of the words in each sequence was being used to refer to a corresponding entry in the APL section.
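The offset-flattening experiment can be sketched as below. The helper name and its parameters are hypothetical; in particular, where the offset table starts inside the SEQ section (`table_start`) is left for the caller to supply rather than asserted here, and little-endian four-byte offsets are assumed.

```python
import struct


def point_all_entries_at(section, table_start, count, target_index):
    """Return a copy of `section` in which every offset-table entry
    has been overwritten with the offset of entry `target_index`."""
    patched = bytearray(section)
    fmt = "<%dI" % count
    offsets = struct.unpack_from(fmt, patched, table_start)
    struct.pack_into(fmt, patched, table_start, *([offsets[target_index]] * count))
    return bytes(patched)


# Synthetic section: 6 header bytes, three offsets, then some payload.
demo = b"\x00" * 6 + struct.pack("<3I", 10, 20, 30) + b"tail"
patched = point_all_entries_at(demo, table_start=6, count=3, target_index=1)
print(struct.unpack_from("<3I", patched, 6))
```

Because only the table is rewritten, the sequence data itself is untouched, which is what makes the behaviour of the patched file easy to interpret.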
It seemed that SEQ referred to entries in APL, and APL referred to entries in AMF. But how did action commands translate into SEQ entries? The XLS ("eXecution List") section, as predicted, proved to be somewhat more complicated, but once its format had been determined (it was a four-deep tree of pointers,) it was found to provide a means of mapping action commands to SEQ entries. At this point, we had a working understanding of which action commands would trigger which audio tracks, and how the audio sample section was formatted, meaning that we could reliably alter a DLC to contain samples of our choosing, and predictably play them back by sending specific action commands via the BLE connection. Nice!
The Magic Word
Despite all this, we were still no closer to playing the chilli animation, which had evaded both our own efforts and those of our predecessors. We suspected that the SEQ section was somehow involved in selecting eye animations, but weren't entirely certain how. To try to bring clarity to the matter, we found a number of Furby actions that all had one infrequently-used eye animation in common - the pulsing exclamation-mark eye animation, normally displayed when a new DLC file had been downloaded by the app - then had a look at the SEQ entries that corresponded to them.
We started by looking at just two of these SEQ entries. So that you can follow along with our reasoning, the two entries we looked at were as follows:
- Sequence A (SEQ entry number 16): 3000 454f e003 804b 1064 8068 1172 8051 0000
- Sequence B (SEQ entry number 31): 2000 455e e007 10a0 8005 115e 8068 0000
We'd noticed earlier that all sequences seemed to start with either 0x3000 or, less commonly, 0x2000, and were terminated by the value 0x0000. We'd also identified that the second word was used to refer to the particular playlist in the APL section that'd be played as part of the action, and the third word was used to control what the Furby's servos would do during the action (either by pointing into the MTR section, or to one of the sets of actions pre-programmed into the Furby.) The fourth through to the second-from-last word remained a mystery.
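What we knew at this point can be written down as a tiny Python helper. The field names are our own labels rather than anything official, the words are treated as the 16-bit values shown in the listings above, and the playlist and motion words are kept raw because how they map to APL and MTR entries is not covered here.

```python
def split_sequence(words):
    """Split one SEQ entry into the fields identified so far."""
    if words[-1] != 0x0000:
        raise ValueError("sequence is not terminated by 0x0000")
    return {
        "header": words[0],      # first word of the entry
        "playlist": words[1],    # raw reference into the APL section
        "motion": words[2],      # raw servo/motion reference
        "unknown": words[3:-1],  # the part still to be understood
    }


SEQ_A = [int(w, 16) for w in "3000 454f e003 804b 1064 8068 1172 8051 0000".split()]
fields = split_sequence(SEQ_A)
print([hex(w) for w in fields["unknown"]])
```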
These two sequences corresponded to the following eye animations:
If we look carefully at these two eye animations, we'll notice that they seem to be made up of a number of shorter eye animations played one after the other. We can pick out three distinct eye animations in A, and two distinct eye animations in B:
[Images: the "Look About" and "Exclamation Mark" eye animations]
Interestingly, there are also exactly three words starting with 0x8 in sequence A, and exactly two such words in sequence B, as shown below.
- Sequence A: 3000 454f e003 804b 1064 8068 1172 8051 0000
- Sequence B: 2000 455e e007 10a0 8005 115e 8068 0000
We might conjecture that these 0x8XXX words are used to select eye animations, but we'll need more proof to confirm this. If we look at these words again, do any of them share the same value? Indeed they do - the second 0x8XXX word in sequence A is the same as the second 0x8XXX word in sequence B (they're both 0x8068).
- Sequence A: 3000 454f e003 804b 1064 8068 1172 8051 0000
- Sequence B: 2000 455e e007 10a0 8005 115e 8068 0000
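The same comparison can be done mechanically. The short sketch below treats "starts with 0x8" as "top hex digit equals 8", which is our reading of the pattern rather than a confirmed rule; the helper names are ours.

```python
def candidate_words(words):
    """Return the words whose top hex digit is 8, in order of appearance."""
    return [w for w in words if (w & 0xF000) == 0x8000]


def parse(text):
    """Convert a space-separated hex listing into a list of integers."""
    return [int(w, 16) for w in text.split()]


seq_a = parse("3000 454f e003 804b 1064 8068 1172 8051 0000")
seq_b = parse("2000 455e e007 10a0 8005 115e 8068 0000")

a_words = candidate_words(seq_a)
b_words = candidate_words(seq_b)
shared = sorted(set(a_words) & set(b_words))

print([hex(w) for w in a_words])  # ['0x804b', '0x8068', '0x8051']
print([hex(w) for w in b_words])  # ['0x8005', '0x8068']
print([hex(w) for w in shared])   # ['0x8068']
```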
If these 0x8XXX words do in fact correspond to eye animations, we'd expect to see the same eye animation duplicated in both sequences, appearing at roughly the same point as the 0x8068 value.
If we now look back at our eye animations, we'll see that the second eye animation in sequence A does indeed match the second eye animation in sequence B: both of them involve the pulsing yellow exclamation mark.
[Images: the matching eye animation as it appears in Sequence A and in Sequence B]
This seems like fairly convincing proof, but to confirm our theory, we should look at one last sequence:
3000 4547 e00a 8024 106e 8024 1064 8068 0000
This sequence is particularly interesting. Not only does this sequence feature the value 0x8068, which seemed to correspond to the pulsing yellow exclamation mark in the previous two sequences, but it also includes the value 0x8024 repeated twice. We would therefore expect this sequence to play an eye animation made up of a total of three shorter eye animations, with the first and second being identical, and the third being the pulsing yellow exclamation mark.
Triggering this sequence causes the following eye animation to be displayed:
As before, we can pick out the individual short eye animations that play in sequence:
This is exactly as we'd expect, and confirms that the 0x8XXX words in each sequence are likely used to select which eye animations are displayed on the Furby.
In order to get the chilli animation to play, we'd need to switch one of these 0x80XX words for the value corresponding to the chilli animation. But which value was that? All the other animations were stored within the Furby's on-board memory, and we hadn't yet been able to bring ourselves to take the (frankly adorable) toy apart.
To solve the problem, we wrote a short script to pull out the set of all 0x8XXX words in the SEQ section of our DLC file that appeared to reference eye animations. We knew that none of the action commands we'd sent the Furby would trigger the chilli animation, so we modified our script to also pull out all the 0x8XXX words that didn't feature in a triggerable sequence. This set contained the following five values:
This set is small enough that we could have just tried each value in turn to see what animation they'd trigger, but the second entry in the set, 0x8401, looked particularly interesting, being as it was the only value we'd seen so far that started 0x84 rather than 0x80. We found a sequence we knew we could trigger, overwrote the 0x80XX values in it with 0x8401, then held our breath and sent the corresponding action command.
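The two steps described above (finding animation words that no triggerable sequence uses, then patching a triggerable sequence) might look roughly like this. It is a sketch rather than our actual script: the function names are ours, the "untriggerable" sequence is made up for the example apart from the 0x8401 value itself, and the test for an animation word is again "top hex digit equals 8".

```python
def is_candidate(word):
    """True for words of the form 0x8XXX."""
    return (word & 0xF000) == 0x8000


def untriggered_words(all_sequences, triggerable_sequences):
    """Return the 0x8XXX words that never appear in a triggerable sequence."""
    everywhere = {w for seq in all_sequences for w in seq if is_candidate(w)}
    triggerable = {w for seq in triggerable_sequences for w in seq if is_candidate(w)}
    return sorted(everywhere - triggerable)


def patch_sequence(words, replacement):
    """Replace every 0x8XXX word in a sequence with `replacement`."""
    return [replacement if is_candidate(w) else w for w in words]


# A sequence quoted earlier, which we knew how to trigger.
known = [int(w, 16) for w in "3000 4547 e00a 8024 106e 8024 1064 8068 0000".split()]
# Hypothetical sequence that no action command reaches (illustration only).
orphan = [0x3000, 0x4500, 0xE000, 0x8401, 0x0000]

print([hex(w) for w in untriggered_words([known, orphan], [known])])
print([hex(w) for w in patch_sequence(known, 0x8401)])
```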
Success! We'd managed to trigger the chilli animation contained in our DLC file. All that was left to do was find a way to modify the animation to display graphics of our choosing.
We identified three sections that only appeared in DLC files with animations, named "PAL," "CEL," and "SPR." By eyeballing the CEL section in a hex editor, it was possible to see what looked like graphics, so we turned our attention there next.
The CEL section in the DLC we'd been looking at was exactly 242,688 bytes in length, or 0x3b400 in hex. This so happened to divide perfectly by 0xc00 into 79 distinct regions, each of which appeared to contain some, but not all, of one of the various graphics we'd seen in the chilli animation - the chilli itself, for example, seemed to be stored as four quarter-images (top-left, top-right, bottom-left, and bottom-right,) which when displayed together, would form the whole chilli sprite.
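The arithmetic is easy to confirm, and the same region size gives a simple way to slice the section up. In this sketch the function name is ours and a zero-filled buffer stands in for a real CEL section.

```python
CEL_REGION_SIZE = 0xC00  # region size observed above


def split_cels(cel_section, region_size=CEL_REGION_SIZE):
    """Cut a CEL section into equally sized regions."""
    if len(cel_section) % region_size:
        raise ValueError("section length is not a multiple of the region size")
    return [cel_section[i:i + region_size]
            for i in range(0, len(cel_section), region_size)]


print(divmod(0x3B400, 0xC00))     # (79, 0)
regions = split_cels(bytes(0x3B400))  # zero-filled stand-in for real data
print(len(regions))               # 79
```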
When we came to examine the SPR section, we found a number of sequential entries, each 8 words in length (plus a terminating 9th word.) The first, third, fifth, and seventh word in each sequence seemed to contain numbers from zero through to 79 - which seemed to hint that these were references to the quarter-images we'd seen in the CEL section. We concluded, therefore, that the SPR section essentially contained a description of "frames," with each individual "frame" composed of four quarter-images.
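Read this way, pulling the quarter-image references out of one SPR entry is a matter of picking every other word. The sketch below uses invented values throughout (including the terminating word), and makes no claim about what the second, fourth, sixth, and eighth words mean.

```python
def frame_cel_indices(entry_words):
    """Return the four CEL references from one 9-word SPR entry
    (8 data words followed by a terminating word)."""
    if len(entry_words) != 9:
        raise ValueError("expected 8 data words plus a terminating word")
    data = entry_words[:8]
    # 1st, 3rd, 5th and 7th words appear to reference quarter-images.
    return data[0::2]


# Made-up entry: quarter-images 12 to 15, with placeholder values elsewhere.
demo_entry = [12, 0, 13, 0, 14, 0, 15, 0, 0xFFFF]
print(frame_cel_indices(demo_entry))
```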
Rather than storing an RGB value for each pixel in these graphics, the image format used in the DLC files appeared to use a separate colour palette, stored in the PAL section - likely to help reduce the overall size of the file. Although initially we weren't able to determine exactly how entries in the CEL section referenced into palettes in the PAL section, we were able to build some "colour test" images to display on the Furby's eyes, and see what colours we were able to display. These gave us insights into how the palettes in the PAL section were arranged, which eventually allowed us to not only read and interpret the colours described in a palette, but also to create and write back our own.
[Images: Eye 1 to Eye 4; Chilli 1 to Chilli 12]
At this point, in addition to being able to write audio and graphics into a DLC file, we also had a far better understanding of how the various sections within a DLC file related to one another. A full hierarchy diagram can be seen below.
Now that we had control of both the Furby's audio and graphics, our work was done, and we were able to put together a quick-and-dirty Python script that would chop up an official Hasbro DLC, reliably add, remove, and modify audio content, as well as adding our own eye animations.
Our research partner Which? contacted Hasbro to let them know about the security problems we'd identified and demonstrated in the Furby Connect. Their response is quoted below.
At Hasbro, children's privacy is a top priority, and that is why we carefully designed the FURBY CONNECT toy and the FURBY CONNECT WORLD app to comply with children's privacy laws. In support of this, we also engaged a third party to perform security testing on the FURBY CONNECT toy and FURBY CONNECT WORLD app.
We carefully reviewed the report, and take this very seriously. While the researchers at Which? identified ways to manipulate the FURBY CONNECT toy, we believe that doing so would require close proximity to the toy, and that there are a number of very specific conditions that would all need to be satisfied in order to achieve the result described by the researchers at Which?, including reengineering the FURBY CONNECT toy, creating new firmware, and then updating the firmware, which requires being within Bluetooth range while the FURBY CONNECT toy is in a "woke" state. A tremendous amount of engineering would be required to reverse engineer the product as well as to create new firmware.
We feel confident in the way we have designed both the toy and the app to deliver a secure play experience. The FURBY CONNECT toy and FURBY CONNECT WORLD app were not designed to collect users' name, address, online contact information (e.g., user name, email address, etc.) or to permit users to create profiles to allow Hasbro to personally identify them, and the experience does not record your voice or otherwise use your device's microphone.
We've released some preliminary scripts that can decode and modify DLC files. If you're feeling adventurous, you can have a go at making and uploading your own Furby animations. Get the code on GitHub.