Extracting WhatsApp messages from an iOS backup

Hi everyone! I was recently exploring how to get a local backup of WhatsApp messages from my iPhone. I switched from Android to iOS in the past and lost all of my WhatsApp messages. I wanted to make sure that if I switched again from iOS to Android I wouldn’t lose any messages. I don’t really care if I can import the messages into WhatsApp. I just don’t want to lose all the important information I have in my chats. I don’t have any immediate plans for switching (if ever) but it seemed like a fun challenge, so I started surveying the available tools and how they work.

This was mostly a learning exercise for me in how Apple stores iOS backups and how I can selectively extract information and data from one. My goal was to have a local copy of WhatsApp messages that I can read and search through on my own machine. It would be doubly awesome if I could move the messages to an Android device but, as I mentioned before, that wasn’t my main goal.

Exploring iOS backup

By default, when you create an iOS backup on a Mac (Catalina in my case), it is stored under ~/Library/Application Support/MobileSync/Backup/. This folder contains sub-folders named after unique device identifiers. Each sub-folder is a backup and contains a bunch of additional subfolders along with the following four important files:

  • Info.plist
  • Manifest.db
  • Manifest.plist
  • Status.plist

We mainly care about the two Manifest files.
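To get a feel for this layout, here is a small sketch that lists the backups under a given folder and reports which of the four metadata files are missing from each. The function name find_backups is my own, and the path in the usage comment is the macOS default; adjust it if your backups live elsewhere.

```python
from pathlib import Path

# The four metadata files that sit at the top level of each backup folder.
EXPECTED_FILES = ("Info.plist", "Manifest.db", "Manifest.plist", "Status.plist")


def find_backups(root):
    """Map each backup folder name under `root` to the metadata files it lacks."""
    root = Path(root).expanduser()
    backups = {}
    if not root.is_dir():
        return backups
    for entry in sorted(root.iterdir()):
        if entry.is_dir():
            backups[entry.name] = [
                name for name in EXPECTED_FILES if not (entry / name).is_file()
            ]
    return backups


# Usage (macOS default location):
# for name, missing in find_backups("~/Library/Application Support/MobileSync/Backup").items():
#     print(name, "complete" if not missing else f"missing: {missing}")
```

An empty list for a folder means all four files are present.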

The Manifest.plist file is a binary Property List file that contains information about the backup. It contains:

  • Backup keybag: The Backup keybag contains a set of data protection class keys that are different from the keys in the System keybag, and backed-up data is re-encrypted with the new class keys. Keys in the Backup keybag facilitate the secure storage of backups. We will learn about protection classes later
  • Date: This is the timestamp of when the backup was created or last updated
  • ManifestKey: This is the key used to encrypt Manifest.db (wrapped with protection class 4)
  • WasPasscodeSet: This identifies whether a passcode was set on the device when it was last synced
  • And much more…

Source: O’Reilly + Richinfante

The Manifest.db file, on the other hand, contains all the juicy details about the files in the backup and their paths. The only problem is that the Manifest.db file is encrypted and we need to use the information from the Manifest.plist file to decrypt it. If the backup was not encrypted, we could probably have gotten away without using the Manifest.plist file at all.

We can verify that the db file is encrypted by opening it in any SQL db viewer. I used “DB Browser for SQLite” and it showed me this screen:

![SQLCipher Encryption](/photos/exploring-ios-backup/SQLCipher encryption.png)

This clearly shows that the db is encrypted. Later we will see that not only is the DB encrypted, but every file is also encrypted with its own random per-file encryption key.
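You can also check this without a GUI. A plain SQLite database starts with the 16-byte header "SQLite format 3" followed by a NUL byte, so a file that doesn’t start with those bytes is either encrypted or not a SQLite database at all. Here is a minimal sketch of that check; the helper name looks_like_plain_sqlite is my own.

```python
# Every unencrypted SQLite database file begins with this 16-byte magic string.
SQLITE_MAGIC = b"SQLite format 3\x00"


def looks_like_plain_sqlite(path):
    """Return True if the file at `path` starts with the SQLite header."""
    with open(path, "rb") as f:
        return f.read(len(SQLITE_MAGIC)) == SQLITE_MAGIC


# Usage: for an encrypted backup I would expect this to print False.
# print(looks_like_plain_sqlite("Manifest.db"))
```

The same check is handy later on to confirm that a decrypted file really came out as a SQLite database.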

Decrypting the Manifest.db file

The basic decryption process is as follows:

  1. Decode the keybag stored in the BackupKeyBag entry of Manifest.plist. A high-level overview of this structure is given in the iOS Security Whitepaper. The iPhone Wiki describes the binary format: a 4-byte string type field, a 4-byte big-endian length field, and then the value itself.

    The important values are the PBKDF2 ITERations and SALT, the double protection salt DPSL and iteration count DPIC, and then for each protection CLS, the WPKY wrapped key.

  2. Using the backup password, derive a 32-byte key using the correct PBKDF2 salt and number of iterations. First, use a SHA256 round with DPSL and DPIC, then a SHA1 round with ITER and SALT.

    Unwrap each wrapped key according to RFC 3394.

  3. Decrypt the manifest database by pulling the 4-byte protection class and longer key from the ManifestKey in Manifest.plist, and unwrapping it. You now have a SQLite database with all file metadata.

  4. For each file of interest, get the class-encrypted per-file encryption key and protection class code by looking in the Files.file database column for a binary plist containing EncryptionKey and ProtectionClass entries. Strip the initial four-byte length tag from EncryptionKey before using.

    Then, derive the final decryption key by unwrapping it with the class key that was unwrapped with the backup password. Then decrypt the file using AES in CBC mode with a zero IV.

Source: StackOverflow
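Step 2 is the easiest one to try in isolation, because Python’s standard library already ships PBKDF2. The sketch below is my reading of that step, not code from the StackOverflow answer: the function name derive_passcode_key is my own, and dpsl, dpic, salt and iterations stand for the DPSL, DPIC, SALT and ITER values you would pull out of the keybag.

```python
import hashlib


def derive_passcode_key(password, dpsl, dpic, salt, iterations):
    """Derive the 32-byte key that unwraps the class keys in the keybag.

    password, dpsl and salt are bytes; dpic and iterations are ints.
    """
    # Round 1: PBKDF2-HMAC-SHA256 with the "double protection" salt and count.
    round1 = hashlib.pbkdf2_hmac("sha256", password, dpsl, dpic, 32)
    # Round 2: PBKDF2-HMAC-SHA1 over the round 1 output with SALT and ITER.
    return hashlib.pbkdf2_hmac("sha1", round1, salt, iterations, 32)


# Usage (values would come from the parsed keybag):
# key = derive_passcode_key(b"my backup password", dpsl, dpic, salt, iterations)
```

The resulting key is what you would feed into the RFC 3394 unwrap for each WPKY entry. With real iteration counts the first round can take a noticeable amount of time.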

If protection classes and double protection don’t make much sense, I would highly recommend reading the iOS Security Whitepaper from page 12 onwards. It provides details about all of this and why iOS uses these protection classes.

If you don’t know what a Keybag is, Apple has decent documentation:

A data structure used to store a collection of class keys. Each type (user, device, system, backup, escrow, or iCloud Backup) has the same format.

A header containing: Version (set to four in iOS 12 or later), Type (system, backup, escrow, or iCloud Backup), Keybag UUID, an HMAC if the keybag is signed, and the method used for wrapping the class keys (tangling with the UID or PBKDF2, along with the salt and iteration count).

A list of class keys: Key UUID, Class (which file or Keychain Data Protection class), wrapping type (UID-derived key only; UID-derived key and passcode-derived key), wrapped class key, and a public key for asymmetric classes.

We can read the Manifest.plist file in Python using the biplist module. You can install it using pip:

pip install biplist

And then use it like this:

from biplist import readPlist
import os

backup_directory = os.path.expanduser("~/Library/Application Support/MobileSync/Backup/<unique_id>")
plist_path = os.path.join(backup_directory, "Manifest.plist")
plist = readPlist(plist_path)

Note: Don’t forget to replace <unique_id> with the name of your particular device backup folder.
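If you would rather not install anything, the standard library’s plistlib can read binary plists too. This is an alternative I’m suggesting, not what the rest of the article uses, and the helper name load_manifest_plist is my own.

```python
import plistlib


def load_manifest_plist(path):
    """Load a property list (binary or XML) into a plain Python dict."""
    with open(path, "rb") as f:
        # plistlib.load detects the plist format automatically.
        return plistlib.load(f)


# Usage:
# plist = load_manifest_plist(plist_path)
# print(sorted(plist.keys()))
```

Binary values such as ManifestKey come back as bytes, so the slicing shown in the rest of the article should work the same way.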

This is what the plist contents would look like:

Manifest.plist

From this dict, we need the BackupKeyBag and ManifestKey. They will help us decrypt the Manifest.db file. The BackupKeyBag is a binary string with the following format:

  • 4-byte block identifier
  • 4-byte block length (most significant byte first); a length of 4 means a total block length of 0xC bytes (4 + 4 + 4)
  • data

The first block is “VERS” with a version number of 3. There are a lot of block types: VERS, TYPE, UUID, HMCK, WRAP, SALT, ITER, UUID, CLAS, WRAP, KTYP, WPKY, etc.

Source: iPhone Wiki
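The block format described above is simple enough to walk by hand. Here is a minimal sketch that yields every (identifier, data) pair from a keybag blob; the function name iter_keybag_blocks is my own and it does no interpretation of the values, it only splits them.

```python
import struct


def iter_keybag_blocks(blob):
    """Yield (tag, value) pairs from a keybag: 4-byte tag, 4-byte big-endian length, data."""
    offset = 0
    while offset + 8 <= len(blob):
        tag = blob[offset:offset + 4]
        (length,) = struct.unpack(">L", blob[offset + 4:offset + 8])
        value = blob[offset + 8:offset + 8 + length]
        if len(value) != length:
            raise ValueError("truncated keybag block")
        yield tag, value
        offset += 8 + length


# Usage:
# for tag, value in iter_keybag_blocks(plist["BackupKeyBag"]):
#     print(tag, len(value))
```

Because tags such as UUID and WRAP repeat (once in the header and once per class key), a real parser has to track which class key it is currently reading rather than dumping everything into one dict.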

Decrypting the keybag

There are quite a few resources available online that show you how to decrypt the keybag. It uses PBKDF2 for key generation and AES for encryption. You can take a look at this StackOverflow answer for working Python code to decrypt the keybag. I will be using the code from that answer.

There are a bunch of different protection classes. The one used for the manifest database is class 3. We can find this by reading the first 4 bytes of the ManifestKey value in our Manifest.plist file:

import struct
manifest_class = struct.unpack('<l', plist['ManifestKey'][:4])[0]
# Output: 3

I encrypted my iOS backup. This is useful because Apple doesn’t back up sensitive data unless the backup is encrypted. Sensitive data includes stuff like WiFi passwords. Now we can use the code from StackOverflow, the backup encryption passphrase you chose when creating the backup, and the rest of the ManifestKey from the Manifest.plist to decrypt the Manifest.db file:

manifest_key = plist['ManifestKey'][4:]

kb = Keybag(plist['BackupKeyBag'])
kb.unlockWithPassphrase('passphrase')
key = kb.unwrapKeyForClass(manifest_class, manifest_key)

with open(os.path.join(backup_directory, 'Manifest.db'), 'rb') as f:
    encrypted_db = f.read()

decrypted_data = AESdecryptCBC(encrypted_db, key)

with open('decrypted_manifest.db', 'wb') as f:
    f.write(decrypted_data)

As you can see above, if you don’t remember the passphrase you used while backing up your iOS device, you can’t decrypt anything. You need it for the rest of the decryption process.

Now if we open decrypted_manifest.db in a SQL viewer we can see the actual data:

decrypted manifest.plist

We can find all the files associated with WhatsApp by doing a global search for WhatsApp. The chats are stored in a ChatStorage.sqlite file:

whatsapp-manifest-plist

We can get this record using Python:

import sqlite3

db_conn = sqlite3.connect('decrypted_manifest.db')
relative_path = "ChatStorage.sqlite"
query = """
    SELECT fileID, file
    FROM Files
    WHERE relativePath = ?
    ORDER BY domain, relativePath
    LIMIT 1;
"""
cur = db_conn.cursor()
cur.execute(query, (relative_path,))
result = cur.fetchone()
file_id, file_bplist = result

One thing to note is that the fileID is a hash of the domain + file name, so it will probably be the same for you. It is generated like this:

import hashlib

domain = "AppDomainGroup-group.net.whatsapp.WhatsApp.shared"
relative_path = "ChatStorage.sqlite"
hash = hashlib.sha1(f"{domain}-{relative_path}".encode()).hexdigest()

# hash = 7c7fba66680ef796b916b067077cc246adacf01d
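If you want to convince yourself that this rule holds for your own backup, you can recompute the hash for every row of the decrypted manifest and look for mismatches. This is a sketch under the assumption that the Files table exposes the fileID, domain and relativePath columns used in the query above; both function names are my own.

```python
import hashlib
import sqlite3


def compute_file_id(domain, relative_path):
    """SHA-1 of "<domain>-<relativePath>", hex encoded."""
    return hashlib.sha1(f"{domain}-{relative_path}".encode()).hexdigest()


def find_mismatched_file_ids(manifest_db_path):
    """Return the fileIDs in the manifest that do not match the recomputed hash."""
    conn = sqlite3.connect(manifest_db_path)
    try:
        rows = conn.execute(
            "SELECT fileID, domain, relativePath FROM Files"
        ).fetchall()
    finally:
        conn.close()
    return [fid for fid, dom, rel in rows if compute_file_id(dom, rel) != fid]


# Usage:
# print(find_mismatched_file_ids("decrypted_manifest.db"))
```

An empty list means every fileID in the manifest follows the rule.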

The record in the db contains the binary plist associated with the ChatStorage.sqlite file. We got hold of that by running the above query. We can take a look inside by using the readPlistFromString method of the biplist module and extract the required information:

from biplist import readPlistFromString
file_plist = readPlistFromString(file_bplist)

# print(file_plist)

# {'$archiver': 'NSKeyedArchiver',
#  '$objects': ['$null',
#               {'$class': Uid(5),
#                'Birth': 1617036196,
#                'EncryptionKey': Uid(3),
#                'Flags': 0,
#                'GroupID': 501,
#                'InodeNumber': 45839007,
#                'LastModified': 1650483880,
#                'LastStatusChange': 1650481761,
#                'Mode': 33188,
#                'ProtectionClass': 3,
#                'RelativePath': Uid(2),
#                'Size': 22056960,
#                'UserID': 501},
#               'ChatStorage.sqlite',
#               {'$class': Uid(4),
#                'NS.data': b'x03x00x00x00tE1xd1Hn"ex06xf7x1cl'
#                           b'x82xedx05xe7x1dx1cxd6x97x0exe9x8b"'
#                           b'xfax16x93x9c3x18xbenx14x1eR;fx98xe3v'},
#               {'$classes': ['NSMutableData', 'NSData', 'NSObject'],
#                '$classname': 'NSMutableData'},
#               {'$classes': ['MBFile', 'NSObject'], '$classname': 'MBFile'}],
#  '$top': {'root': Uid(1)},
#  '$version': 100000}
file_data = file_plist['$objects'][file_plist['$top']['root'].integer]
protection_class = file_data['ProtectionClass']

encryption_key = file_plist['$objects'][file_data['EncryptionKey'].integer]['NS.data'][4:]

# file_data
# {'$class': Uid(5),
#  'Birth': 1617036196,
#  'EncryptionKey': Uid(3),
#  'Flags': 0,
#  'GroupID': 501,
#  'InodeNumber': 45839007,
#  'LastModified': 1650483880,
#  'LastStatusChange': 1650481761,
#  'Mode': 33188,
#  'ProtectionClass': 3,
#  'RelativePath': Uid(2),
#  'Size': 22056960,
#  'UserID': 501}

# protection_class
# 3

# encryption_key
# ---truncated---
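The lookup above follows the NSKeyedArchiver layout: $top.root points into the $objects list, and EncryptionKey is itself a pointer to another entry in that list. The same walk can be sketched with the standard library’s plistlib (Python 3.8 or newer, where UID values expose their index as .data) instead of biplist. The function name resolve_file_record is my own.

```python
import plistlib


def resolve_file_record(blob):
    """Return (protection_class, wrapped_key) from a Files.file binary plist.

    wrapped_key is None when the record carries no EncryptionKey entry.
    """
    archive = plistlib.loads(blob)
    objects = archive["$objects"]
    # "$top" -> "root" is a UID pointing at the MBFile dict inside "$objects".
    root = objects[archive["$top"]["root"].data]
    key_uid = root.get("EncryptionKey")
    wrapped_key = None
    if key_uid is not None:
        # Drop the leading four bytes before the wrapped key, as in the steps above.
        wrapped_key = objects[key_uid.data]["NS.data"][4:]
    return root["ProtectionClass"], wrapped_key


# Usage:
# protection_class, encryption_key = resolve_file_record(file_bplist)
```

Returning None for a missing key is a choice I made so the caller can decide how to treat such records.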

Now we need to use the keybag object (kb) to unwrap the encryption key from above for the required protection class (3):

file_decryption_key = kb.unwrapKeyForClass(protection_class, encryption_key)

Decrypting ChatStorage.sqlite

Sweet! All that’s left is to decrypt the actual chat db. But where is it stored? Apple stores files in the backup folder in a predictable layout. It puts them in a subdirectory named after the first two characters of the fileID (e.g. 7c/7c7fba66680ef796b916b067077cc246adacf01d). We can get the full path to the chat db file like this:

filename_in_backup = os.path.join(backup_directory, file_id[:2], file_id)

This lets us open the encrypted file and decrypt it using the file_decryption_key we extracted above:

with open(filename_in_backup, 'rb') as encrypted_file:
    encrypted_data = encrypted_file.read()

decrypted_data = AESdecryptCBC(encrypted_data, file_decryption_key)

Note: This AESdecryptCBC function is part of the code we got from StackOverflow

Sometimes the encryption adds padding at the end of the data to make it a multiple of the block size. So we need to make sure we remove any padding from the end of the data as well:

def removePadding(data, blocksize=16):
    n = int(data[-1])  # RFC 1423: last byte contains number of padding bytes.
    if n > blocksize or n > len(data):
        raise Exception('Invalid CBC padding')
    return data[:-n]

decrypted_data = removePadding(decrypted_data)

We can save this decrypted data in a new SQLite file:

with open('decrypted_ChatStorage.sqlite', 'wb') as f:
    f.write(decrypted_data)

If we now open this new file in a SQLite browser, we can see all the tables:

WhatsApp Messages Table

The chats are stored in the ZWAMESSAGE table:

WhatsApp Messages
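Once the database is decrypted you can also read the messages straight from Python. The sketch below is only a starting point: the column names ZTEXT, ZFROMJID, ZISFROMME and ZMESSAGEDATE are assumptions on my part, so check them against the table in your own file. It also assumes the timestamps are Core Data style seconds counted from 2001-01-01 UTC. The function name read_messages is my own.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Core Data stores dates as seconds since 2001-01-01 00:00:00 UTC.
APPLE_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)


def read_messages(db_path, limit=20):
    """Return (sent_at, is_from_me, from_jid, text) tuples, oldest first."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT ZMESSAGEDATE, ZISFROMME, ZFROMJID, ZTEXT "
            "FROM ZWAMESSAGE "
            "WHERE ZTEXT IS NOT NULL AND ZMESSAGEDATE IS NOT NULL "
            "ORDER BY ZMESSAGEDATE LIMIT ?",
            (limit,),
        ).fetchall()
    finally:
        conn.close()
    return [
        (APPLE_EPOCH + timedelta(seconds=ts), bool(from_me), jid, text)
        for ts, from_me, jid, text in rows
    ]


# Usage:
# for sent_at, from_me, jid, text in read_messages("decrypted_ChatStorage.sqlite"):
#     print(sent_at.isoformat(), "me" if from_me else jid, text)
```

Rows without text (media-only messages, for example) are skipped by the WHERE clause.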

If you are looking for all the media files that were sent with messages, you will have to go back to the decrypted Manifest.db file and filter for media files stored under Message/Media:

media manifest.plist

You can use the following SQL query to get all of these media files:

"""
SELECT fileID,
       relativePath,
       flags,
       file
FROM Files
WHERE relativePath
    LIKE 'Message/Media/%'
"""

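To turn those rows into something usable, you can pair each relativePath with the location of its (still encrypted) blob in the backup folder, using the same two-character subdirectory layout as for the chat db. This is a minimal sketch and the function name list_media_files is my own.

```python
import os
import sqlite3


def list_media_files(manifest_db_path, backup_directory):
    """Return (relative_path, path_in_backup) pairs for entries under Message/Media."""
    conn = sqlite3.connect(manifest_db_path)
    try:
        rows = conn.execute(
            "SELECT fileID, relativePath FROM Files "
            "WHERE relativePath LIKE 'Message/Media/%' "
            "ORDER BY relativePath"
        ).fetchall()
    finally:
        conn.close()
    return [
        (rel, os.path.join(backup_directory, file_id[:2], file_id))
        for file_id, rel in rows
    ]


# Usage:
# for rel, path in list_media_files("decrypted_manifest.db", backup_directory):
#     print(rel, "->", path)
```

Each of these files still has to be decrypted with its own per-file key, exactly like ChatStorage.sqlite.
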
Now here comes the best part. You don’t have to do any of this yourself. There is already a Python program out there that can parse your iOS backup, extract all the media files, chats, and contact list, and convert them into HTML format. This way you can read your chats without importing the backup into a WhatsApp client.

Whatsapp-Chat-Exporter works with iOS and Android ✨

I used this tool to eventually convert all of my WhatsApp messages into HTML format for easy browsing on my laptop.

Useful Resources

I took some help from a bunch of different resources while writing this article. You can go through them to get a deeper understanding of some of the stuff mentioned in this article:

Conclusion

I hope you learned a thing or two from this article. I had a fun time diving into the weeds of iOS backups. I had no idea how Apple was storing the backup and how easy or hard it was going to be to get the particular file I wanted from it. Suffice it to say, it wasn’t too hard and it taught me a few fun things in the process.
