spamanalyzer.utils

analyze_subject(headers, wordlist)

Checks if the email has gappy words or forbidden words in the subject.

Parameters:

Name Type Description Default
headers dict

a dictionary containing parsed email headers

required
wordlist list[str]

a list of words to be used as a spam filter in the subject

required
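As a rough illustration of what a "gappy word" check combined with a forbidden-word check could look like, here is a minimal sketch. It is not the library's implementation: the function name `subject_looks_suspect`, the `"subject"` header key, and the regular expression are all assumptions made for this example.

```python
import re

# Hypothetical pattern: single letters separated by spaces or punctuation,
# e.g. "V.I.A.G.R.A" or "f r e e". Not taken from the library.
GAPPY_WORD = re.compile(r"\b(?:[A-Za-z][\s._\-*]){2,}[A-Za-z]\b")


def subject_looks_suspect(headers: dict, wordlist: list[str]) -> bool:
    """Return True if the subject has a gappy word or a forbidden word."""
    # Assumption: the parsed headers expose the subject under "subject".
    subject = str(headers.get("subject", "")).lower()
    if GAPPY_WORD.search(subject):
        return True
    words = re.findall(r"[a-z0-9']+", subject)
    forbidden = {word.lower() for word in wordlist}
    return any(word in forbidden for word in words)
```

A pattern like this also flags harmless strings such as "a b c", so a real filter would likely need a stricter rule.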

dkim_pass(headers)

Checks if the email has a DKIM record.

dmarc_pass(headers)

Checks if the email has a DMARC record.

get_domain(field) async

Extracts the domain from a field.

Parameters:

Name Type Description Default
field str

a string expected to contain a domain

required

Returns:

Name Type Description
Domain

a Domain object containing the domain name and the TLD
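The sketch below shows one way a field could be split into a domain name and a TLD. The `DomainSketch` class, the `extract_domain` function, and the regular expression are hypothetical stand-ins, not the library's `Domain` class or `get_domain`. The function is declared `async` only to mirror the documented signature.

```python
import asyncio
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DomainSketch:
    """Hypothetical stand-in for the documented Domain object."""
    name: str
    tld: str


# Dot-separated labels followed by an alphabetic TLD (an assumption).
_DOMAIN = re.compile(r"([a-z0-9-]+(?:\.[a-z0-9-]+)*)\.([a-z]{2,})", re.IGNORECASE)


async def extract_domain(field: str) -> Optional[DomainSketch]:
    """Return the first domain found in the field, or None."""
    match = _DOMAIN.search(field)
    if match is None:
        return None
    return DomainSketch(name=match.group(1).lower(), tld=match.group(2).lower())


result = asyncio.run(extract_domain("Alice <alice@mail.example.com>"))
```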

has_auth_warning(headers)

Checks if the email has an authentication warning, which usually means that the sender claimed to be someone else.

has_html(body)

Checks if the email contains html tags.

Parameters:

Name Type Description Default
body str

the body of the email

required

Returns:

Name Type Description
bool bool

True if the email contains html tags
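A check like this can be approximated with a regular expression. The following is a minimal sketch under that assumption. The name `looks_like_html` and the pattern are illustrative and may differ from what the library does.

```python
import re

# Matches opening, closing, and self-closing tags such as <p>, </p>, <br/>,
# while ignoring angle-bracketed email addresses like <alice@example.com>.
_TAG = re.compile(r"<\s*/?\s*[a-zA-Z][a-zA-Z0-9]*(?:\s[^>]*)?/?\s*>")


def looks_like_html(body: str) -> bool:
    """Return True if the body appears to contain at least one HTML tag."""
    return _TAG.search(body) is not None
```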

has_html_form(body)

Checks if the email has a form.

Parameters:

Name Type Description Default
body str

the body of the email

required

Returns:

Name Type Description
bool bool

True if the email has a form

has_images(body)

Checks if the email contains images.

Parameters:

Name Type Description Default
body str

the body of the email

required

Returns:

Name Type Description
bool bool

True if the email contains images
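Form and image checks both come down to asking whether a given tag occurs in the body. One possible approach, sketched here with the standard library's `html.parser`, is to collect every tag name once and test membership. `TagCollector` and `tags_in` are hypothetical helpers, not part of `spamanalyzer.utils`.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collects the lowercase name of every start tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.tags: set[str] = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag names.
        self.tags.add(tag)


def tags_in(body: str) -> set[str]:
    """Return the set of tag names found in the body."""
    collector = TagCollector()
    collector.feed(body)
    collector.close()
    return collector.tags


tags = tags_in('<form action="/x"><input name="q"></form><IMG src="a.png">')
```

With this helper, `"form" in tags` and `"img" in tags` play the roles of the two checks described above.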

Checks if the email has mailto links.

Parameters:

Name Type Description Default
body str

the body of the email

required

Returns:

Name Type Description
bool bool

True if the email has mailto links
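A mailto link is an `href` attribute whose value starts with the `mailto:` scheme. The sketch below detects that with a regular expression. The function name `contains_mailto` and the pattern are assumptions for illustration only.

```python
import re

# href attribute, optionally quoted, whose value begins with "mailto:".
_MAILTO = re.compile(r"""href\s*=\s*["']?\s*mailto:""", re.IGNORECASE)


def contains_mailto(body: str) -> bool:
    """Return True if the body has at least one mailto link."""
    return _MAILTO.search(body) is not None
```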

has_script_tag(body)

Checks if the email has script tags or javascript code.

Parameters:

Name Type Description Default
body str

the body of the email

required

Returns:

Name Type Description
bool bool

True if the email has script tags or javascript code
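As a hedged sketch, script detection could look for an opening `<script` tag or a `javascript:` URI. The name `contains_script` and the pattern below are illustrative, and the library may apply different or additional rules.

```python
import re

# Either an opening <script ...> tag or a "javascript:" URI scheme.
_SCRIPT = re.compile(r"<\s*script\b|javascript\s*:", re.IGNORECASE)


def contains_script(body: str) -> bool:
    """Return True if the body has a script tag or a javascript: URI."""
    return _SCRIPT.search(body) is not None
```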

inspect_attachments(attachments)

A detailed analysis of the email attachments.

Parameters:

Name Type Description Default
attachments List

a list of attachments

required

Returns:

Name Type Description
dict dict[str, bool]

a dictionary containing the following information:

```python
{
    "has_attachments": bool,  # True if the email has attachments
    "attachment_is_executable": bool,  # True if the email has an attachment in executable format
}
```
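To make the shape of the returned dictionary concrete, here is a minimal sketch that builds it from a list of file names. This is not the library's implementation: the documentation does not say what type the attachments are, so plain file names, the function name `summarize_attachments`, and the suffix list are all assumptions.

```python
from pathlib import PurePath

# Hypothetical list of suffixes treated as executable formats.
EXECUTABLE_SUFFIXES = {".exe", ".bat", ".cmd", ".com", ".js", ".msi", ".scr", ".vbs"}


def summarize_attachments(filenames: list[str]) -> dict[str, bool]:
    """Build a dictionary with the two documented keys from file names."""
    return {
        "has_attachments": len(filenames) > 0,
        "attachment_is_executable": any(
            PurePath(name).suffix.lower() in EXECUTABLE_SUFFIXES
            for name in filenames
        ),
    }
```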

inspect_body(body, wordlist, domain)

A detailed analysis of the email body.

Parameters:

Name Type Description Default
body str

the body of the email

required
wordlist list[str]

a list of words to be used as a spam filter in the body

required
domain Domain

the domain of the sender

required

Returns:

Name Type Description
dict dict[str, Any]

a dictionary containing the following information:

  • has_http_links (bool): True if the email has http links
  • has_script (bool): True if the email has script tags or javascript code
  • forbidden_words_percentage (float): the percentage of forbidden words in the body
  • has_form (bool): True if the email has a form
  • contains_html (bool): True if the email contains html tags

inspect_headers(email, wordlist) async

A detailed analysis of the email headers.

Parameters:

Name Type Description Default
headers dict

a dictionary containing parsed email headers

required
wordlist Iterable[str]

a list of words to be used as a spam filter

required

Returns:

Name Type Description
tuple

a tuple containing all the results of the analysis

  • has_spf (bool): True if the email has an SPF record
  • has_dkim (bool): True if the email has a DKIM record
  • has_dmarc (bool): True if the email has a DMARC record
  • domain_matches (bool): True if the domain of the sender matches the domain of the server
  • has_auth_warning (bool): True if the email has an authentication warning
  • has_suspect_words (bool): True if the email has gappy words or forbidden words in the subject
  • send_year (int): the year in which the email was sent (future versions may return a datetime object instead)
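A positional tuple with seven fields is easy to misread, so callers may prefer named access. The sketch below wraps such a tuple in a `NamedTuple`. It assumes the tuple elements arrive in the order listed above, which the documentation does not state explicitly. `HeaderReport` and the sample values are made up for illustration.

```python
from typing import NamedTuple


class HeaderReport(NamedTuple):
    """Hypothetical named view over the documented result tuple."""
    has_spf: bool
    has_dkim: bool
    has_dmarc: bool
    domain_matches: bool
    has_auth_warning: bool
    has_suspect_words: bool
    send_year: int


# Made-up values standing in for a real analysis result.
raw = (True, True, False, True, False, False, 2015)
report = HeaderReport(*raw)
```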

parse_date(headers, timezone)

The date format should follow RFC 2822. This function expects a date in the format "Wed, 21 Oct 2015 07:28:00 -0700" and returns a tuple where:

  1. the first element is the parsed date, or None if the date is not in the correct format
  2. the second element is a boolean indicating whether the date is valid

Future versions may also specify the kind of error that occurred, as SpamAssassin does (e.g. "invalid date", "absurd tz", "future date").
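The (date, is-valid) return shape can be reproduced with the standard library's `email.utils.parsedate_to_datetime`. This is a sketch of the idea, not the library's code: `parse_rfc2822_date` is a hypothetical helper that takes the raw date string rather than the documented `headers` and `timezone` arguments.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_rfc2822_date(value: str) -> tuple[Optional[datetime], bool]:
    """Return (parsed date, True), or (None, False) if parsing fails."""
    try:
        return parsedate_to_datetime(value), True
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, newer ones ValueError.
        return None, False


parsed, ok = parse_rfc2822_date("Wed, 21 Oct 2015 07:28:00 -0700")
```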

percentage_of_bad_words(body, wordlist)

Calculates the percentage of forbidden words in the body.

Parameters:

Name Type Description Default
body str

the body of the email

required
wordlist list[str]

a list of words to be used as a spam filter in the body

required

Returns:

Name Type Description
float float

the percentage of forbidden words in the body
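One plausible reading of this calculation is the number of forbidden words divided by the total number of words. The sketch below follows that reading and returns a value between 0 and 100. The tokenization, the scale, and the name `forbidden_word_percentage` are assumptions, and the library may compute the figure differently.

```python
import re


def forbidden_word_percentage(body: str, wordlist: list[str]) -> float:
    """Return the share of words in the body that appear in wordlist (0-100)."""
    words = re.findall(r"[a-z0-9']+", body.lower())
    if not words:
        return 0.0  # avoid dividing by zero on an empty body
    forbidden = {word.lower() for word in wordlist}
    hits = sum(1 for word in words if word in forbidden)
    return 100.0 * hits / len(words)
```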

spf_pass(headers)

Checks if the email has an SPF record.
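The documentation does not say which headers `spf_pass`, `dkim_pass`, and `dmarc_pass` read. One common source for such verdicts is the `Authentication-Results` header, which carries `method=result` pairs. The sketch below parses that header under the assumption that the headers are stored in a dict keyed by lowercase names. `auth_results` is a hypothetical helper, not a library function.

```python
import re

# method=result pairs for the three mechanisms of interest.
_RESULT = re.compile(r"\b(spf|dkim|dmarc)=(\w+)", re.IGNORECASE)


def auth_results(headers: dict) -> dict[str, str]:
    """Map each mechanism found in Authentication-Results to its result."""
    value = str(headers.get("authentication-results", ""))
    return {m.group(1).lower(): m.group(2).lower() for m in _RESULT.finditer(value)}


sample = {
    "authentication-results": (
        "mx.example.com; spf=pass smtp.mailfrom=example.org; "
        "dkim=pass header.d=example.org; dmarc=fail"
    )
}
results = auth_results(sample)
```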