Email systems are built on a tightly integrated, well-defined architecture of software, hardware, and protocols. The need for forensic analysis of email emerged once criminals began using electronic messaging for illegal activities. Cybercriminals use email to send spam and phishing messages, to carry out cyberbullying, to share child pornography as images, and much more.
Email headers are among the most important resources an investigator has when examining an email. Within the header, the “Message-ID” plays a key role in tracing the required evidence. The rest of this article covers the structure and anatomy of message-IDs, along with a detailed analysis of how they are constructed.
MESSAGE–ID FORENSICS – Dig into the Hidden Artifacts
RFC 2822, the Internet Message Format standard, specifies that each email should carry a globally unique identifier. This identifier is called the Message-ID (sometimes referred to as a Client-ID) and is part of the email header. RFC 2822 also defines the syntax of the message-ID, which looks similar to an ordinary email address.
No two email messages may share the same message-ID. The FQDN (Fully Qualified Domain Name) identifies the globally unique MTA (Mail Transfer Agent) that assigns the message-ID to each email message. Combining the date and time with a process ID and some random numbers makes the ID unique within a particular MTA. The image below shows the structure of an email header along with the position of the Message-ID (blue font).
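The Message-ID can also be read from a raw header programmatically. The following is a minimal sketch using Python's standard email package; the sample header, its addresses, and the function name extract_message_id are hypothetical and serve only as illustration.

```python
from email import message_from_string

# Hypothetical raw message, invented for illustration only.
RAW_MESSAGE = """\
From: sender@example.com
To: recipient@example.com
Subject: Test
Message-ID: <1505080910.abcdefg@xyz.com>

Body text.
"""


def extract_message_id(raw_message):
    """Return the Message-ID header value, or None if the field is absent."""
    msg = message_from_string(raw_message)
    value = msg.get("Message-ID")  # header lookup is case-insensitive
    if value is None:
        return None
    return value.strip()


if __name__ == "__main__":
    print(extract_message_id(RAW_MESSAGE))
```

The function returns None when the field is missing, so a caller can tell an absent Message-ID apart from an empty one.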
Generation and Format of Email Message-ID
The structure of the message-ID is somewhat similar to that of an ordinary email address, say “firstname.lastname@example.org”. The message-ID is always enclosed in a pair of angle brackets, i.e., <email@example.com>.
A Message-ID consists of two parts:
Call the first part, the text before the “@” symbol (abcdefg), #X. Call the second part, the text after the “@” (xyz.com), #Y. The message-ID then has the form #X@#Y.
- The #X section of the message-ID contains timestamp information: the date and time at which the message was sent. In the format considered here, the timestamp occupies the leading digits of the message-ID and follows the pattern YYMMDDHHMM.
In the message-ID defined above, 1505080910 can be written as 15 05 08 0910, meaning 2015 (year) / May (month) / 08 (day) / 9 (hours) / 10 (minutes), Pacific Daylight Time.
- The #Y section of the message-ID consists of the FQDN, i.e. the Fully Qualified Domain Name. In the identifier mentioned above, the FQDN is gmail.com. The FQDN carries information such as:
- The local hostname
- The source of email
- The MTA (Mail Transfer Agent) of the source
In the example above, gmail.com is the dedicated local host that works with Gmail’s server, i.e. the MTA.
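The split into #X and #Y and the decoding of a leading YYMMDDHHMM timestamp can be sketched as follows. This is only an illustration: the function names are hypothetical, the sample identifier is assembled from the fragments mentioned in the text rather than taken from a real message, and not every mail system puts a timestamp at the start of its message-IDs.

```python
from datetime import datetime


def split_message_id(message_id):
    """Split '<#X@#Y>' into (#X, #Y), using the last '@' as the separator."""
    text = message_id.strip()
    if not (text.startswith("<") and text.endswith(">")):
        raise ValueError("message-ID must be enclosed in angle brackets")
    x_part, sep, y_part = text[1:-1].rpartition("@")
    if not sep or not x_part or not y_part:
        raise ValueError("message-ID must have the form <#X@#Y>")
    return x_part, y_part


def decode_leading_timestamp(x_part):
    """Decode a leading YYMMDDHHMM block; return None if it is not present.

    The result is a naive datetime: the time zone is not encoded in the
    identifier and has to come from other evidence.
    """
    digits = x_part[:10]
    if len(digits) < 10 or not (digits.isascii() and digits.isdigit()):
        return None
    try:
        return datetime.strptime(digits, "%y%m%d%H%M")
    except ValueError:  # e.g. month 13 or hour 25
        return None


if __name__ == "__main__":
    # Hypothetical identifier, for illustration only.
    x, y = split_message_id("<1505080910.abcdefg@xyz.com>")
    print(x, y, decode_leading_timestamp(x))
```

Returning None instead of raising keeps the decoder usable on identifiers that follow a different layout.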
Detect Forged Email Headers via Message-ID Analysis
Just as an attacker can spoof other artifacts of the email header, the message-ID can be spoofed too. We have observed many email headers in which the message-ID was fabricated to look as if a legitimate MTA had generated it.
For example: 200808131227.m7DCKVem009817@Here.US.EDU
The part to the left of the dot (LHS) simply shows the date and time, whereas the part to the right of the dot (RHS), up to the “@”, contains 14 characters: the first 8 are a mix of digits and English letters, and the remaining 6 are digits only.
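The layout of this example can be checked mechanically. The sketch below assumes that the left-hand side is a 12-digit YYYYMMDDHHMM timestamp, which is how the example reads; the pattern and the function name check_example_layout are hypothetical and cover only this one layout, not message-IDs in general.

```python
import re
from datetime import datetime

# 12 digits, a dot, 8 letters-or-digits, 6 digits, '@', then the host part.
EXAMPLE_LAYOUT = re.compile(
    r"^([0-9]{12})\.([A-Za-z0-9]{8})([0-9]{6})@([^@\s]+)$"
)


def check_example_layout(message_id):
    """Return the decoded fields, or None if the layout does not match."""
    match = EXAMPLE_LAYOUT.match(message_id.strip().strip("<>"))
    if match is None:
        return None
    stamp, mixed, digits, host = match.groups()
    try:
        # Assumption: the leading block is YYYYMMDDHHMM.
        when = datetime.strptime(stamp, "%Y%m%d%H%M")
    except ValueError:
        return None  # right shape, but not a real date or time
    return {"timestamp": when, "mixed": mixed, "digits": digits, "host": host}


if __name__ == "__main__":
    print(check_example_layout("200808131227.m7DCKVem009817@Here.US.EDU"))
```

A match only shows that the identifier has the expected shape; a forger who knows the layout can reproduce it, so the result should be weighed together with other header fields and server logs.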
So, before a message-ID is used for forensic analysis, its validity must first be verified.
Knowledge of the construction and proper format of the message-ID helps investigators verify it. The sequence number and process ID are generated dynamically as character strings, so verifying them manually is very difficult. Some forensic tools can help examine and verify message-IDs logically.
Spam identification relies on a mail filtering process, which checks for an empty, illegal, or oddly patterned message-ID in the email header. However, the message-ID field is optional, and it can also be spoofed. Checking the message-ID is therefore not a reliable spam-detection method on its own, because a skilled spammer will craft message-IDs that look legitimate to the user’s eyes.
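A filter check of this kind might look like the sketch below. The shape test is a deliberately simplified assumption (angle brackets around two non-empty parts joined by a single “@”), not the full RFC 2822 grammar, and the function name and result labels are hypothetical.

```python
import re

# Simplified shape: <left@right>, with no spaces, brackets, or extra '@'.
SIMPLE_SHAPE = re.compile(r"^<[^<>@\s]+@[^<>@\s]+>$")


def classify_message_id(value):
    """Label a Message-ID header value as missing, empty, malformed, or well-formed."""
    if value is None:
        return "missing"
    value = value.strip()
    if not value:
        return "empty"
    if not SIMPLE_SHAPE.match(value):
        return "malformed"
    # Well-formed only means the shape is plausible, not that it is genuine.
    return "well-formed"


if __name__ == "__main__":
    for candidate in (None, "   ", "no-brackets@host", "<abcdefg@xyz.com>"):
        print(repr(candidate), "->", classify_message_id(candidate))
```

Because a spoofed identifier can pass this test, the result is better treated as one signal among several than as a verdict.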
The discussion above shows that the message-ID in the email header plays an important role in email forensic investigation. Its global uniqueness distinguishes each email from every other, which aids forensic analysis. Knowing how a message-ID is constructed and formatted helps investigators identify spoofed emails and recover other details such as the source host, timestamp, and matching log file entries.